Status
The first post of this thread is a WikiPost and can be edited by anyone with the appropriate permissions. Your edits will be public.

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
I found a few minutes to run through the BootROM code that handles NVRAM variables. Here are the highlights of what I found. (If I find more time (read: if a magical djinn appears and offers me infinite time), I'll post a detailed write-up with all the relevant protocol GUIDs, the flow, etc.) (Note that the following is based on a Mac Pro 5,1 144.0.0.0.0 BootROM; other Macs probably have similar NVRAM code, but the observations below may or may not apply. Your mileage may vary.)

First, some quick background so things will make a little more sense:
Flash memory's "erased" state sets all bits to 1 (so every erased byte reads as 0xFF). You can change one or more bits in any flash byte to 0, but you can't change a 0 to a 1 without an erase operation. While individual bytes can be read or written, erases must occur in 4 kB blocks - therefore, if you want to change one bit from 0 to 1, you have to erase the entire 4 kB block that contains it. Flash memory, like an SSD or the SPI BootROM in our Macs, physically wears out with repeated usage (the oxide layer in each memory cell degrades a tiny bit with each write, and degrades significantly with each erase - not unlike the way writing and erasing words on a piece of paper wears the paper out over time). Because of this, steps are always taken to minimize the number of erasures that occur, and ideally, to spread the erasures around so that the chip wears evenly (known as "wear leveling") and one tiny area doesn't wear out long before the rest of the chip does (known as "My Mac just became a brick.").
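
To make the program/erase asymmetry concrete, here is a minimal Python sketch of a toy flash array. The `Flash` class and its method names are made up for illustration; nothing here comes from the BootROM. Programming a byte is modeled as a bitwise AND (bits can only go from 1 to 0), and the only way back to 1 is erasing the whole 4 kB block.

```python
ERASED = 0xFF       # every bit set to 1
BLOCK_SIZE = 4096   # erase granularity described above


class Flash:
    """Toy model of a flash chip (hypothetical, for illustration only)."""

    def __init__(self, size):
        self.cells = bytearray([ERASED] * size)
        # One wear counter per erase block.
        self.erase_counts = [0] * (size // BLOCK_SIZE)

    def program(self, addr, value):
        # Programming can only clear bits: result is (old AND new).
        self.cells[addr] &= value

    def erase_block(self, addr):
        # Erasing resets the entire block containing addr to 0xFF.
        start = (addr // BLOCK_SIZE) * BLOCK_SIZE
        self.cells[start:start + BLOCK_SIZE] = bytes([ERASED]) * BLOCK_SIZE
        self.erase_counts[start // BLOCK_SIZE] += 1


flash = Flash(2 * BLOCK_SIZE)
flash.program(10, 0xF0)          # clears the low four bits
flash.program(10, 0x0F)          # clears the high four bits as well
flash.program(10, 0xFF)          # tries to set bits back to 1: no effect
after_program = flash.cells[10]  # still 0x00

flash.program(11, 0x55)          # a neighbouring byte in the same block
flash.erase_block(10)            # the only way to get the 1 bits back
after_erase = (flash.cells[10], flash.cells[11])
```

Note that erasing byte 10 also wiped byte 11, and that the per-block counter is the thing that eventually "runs out" on a real chip.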

The EFI NVRAM area minimizes erasures this way: when a variable is written, a flag bit is set indicating that the variable is valid. When that variable gets deleted, that bit gets set to 0, indicating the variable is no longer valid and should be ignored. Variables are stored back-to-back in a list, with new variables written to the end of the list. Only when the list gets nearly full (or something goes wrong, or you manually clear the NVRAM) does an erasure occur - then all the valid variables get copied out to a RAM buffer, the NVRAM gets erased, then those valid variables get written back out to the NVRAM (now condensed, without all the deleted ones). Replacing an existing variable (such as changing your boot-args) marks the existing variable as invalid and creates a new variable at the end of the list - so, repeatedly changing your boot-args (or any other variable) without rebooting will reduce your NVRAM free space. Both EFI and macOS create variables for various reasons; therefore, over time (and particularly when rebooting), the NVRAM variable space will repeatedly fill up and be condensed as part of normal operation. The free space in the variable store (VSS) will naturally go down and up with normal usage.
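
The append-and-invalidate scheme described above can be sketched in a few lines of Python. This is a toy model, not Apple's code: the `VariableStore` class, its method names, and the size accounting (name length plus value length, with no header) are assumptions made purely for illustration.

```python
class VariableStore:
    """Toy append-only variable list (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity   # bytes available in the store
        self.records = []          # each record is [name, value, valid]
        self.erase_count = 0

    def used(self):
        # Invalidated records still occupy space until reclaim() runs.
        return sum(len(name) + len(value) for name, value, _ in self.records)

    def free(self):
        return self.capacity - self.used()

    def _find(self, name):
        for rec in self.records:
            if rec[0] == name and rec[2]:
                return rec
        return None

    def set(self, name, value):
        old = self._find(name)
        if old is not None and old[1] == value:
            return                 # same name and value: nothing is written
        if len(name) + len(value) > self.free():
            raise MemoryError("store full; needs a reclaim")
        if old is not None:
            old[2] = False         # clear the valid flag on the old copy
        self.records.append([name, value, True])

    def delete(self, name):
        old = self._find(name)
        if old is not None:
            old[2] = False

    def reclaim(self):
        # Copy out the valid records, erase, and write them back condensed.
        self.records = [rec for rec in self.records if rec[2]]
        self.erase_count += 1


store = VariableStore(capacity=64)
store.set("boot-args", b"-v")
store.set("boot-args", b"-v -x")   # old copy invalidated, new one appended
store.set("boot-args", b"-v -x")   # identical value: no change at all
free_before = store.free()
store.reclaim()
free_after = store.free()
```

Changing `boot-args` once costs the space of both copies until the next reclaim, which is why free space drifts down between erasures and jumps back up afterwards.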

Highlights of my quick tour through the code:
  • Variables are limited to 2048 bytes, including the variable name (as stored in UTF-16) and a 32-byte header (which includes the variable's GUID). Normally, this isn't a problem; however, large certificates or panic reports may run into this limit.
  • There appears to be a limit of 122 variables in the VSS. This seems arbitrary; I can find no particular reason for this limitation to exist, or why the value 122 was chosen.
  • Setting an existing variable to the same value it already had does not change NVRAM at all. The system compares the name and the value, and if they're identical, it does nothing (again, to minimize NVRAM erasures).
  • Holding Cmd-Opt-P-R at boot does exactly three things: first, it invokes the AppleRtcRam protocol to clear the RTCRAM variables; second, it writes 4 bytes of 0 to the beginning of the variable list in VSS1 (effectively invalidating the variable list); then, it reboots the system. There is no counter or other means to see how many reboots you've held Cmd-Opt-P-R through.
  • The NVRAM Data FFV contains two VSS stores, one FTW store, one Fsys store, and one Gaid store. All of those are defined by TianoCore; however, it appears that of those, only VSS is actually used by Apple's EFI. (EDIT: per @tsialex, Fsys is used to store system identifiers. I have not verified this, but I trust that he's correct. I was so focused on the VSS areas that I might easily have overlooked some Fsys code.)
  • While there are two VSS stores, the second one is only used during garbage collection (GC) (also known as Reclamation). During normal operation, only VSS1 is used. During GC, the condensed variables are written first to VSS2, then to VSS1, with a flag bit indicating whether or not the write was completed successfully. If all went well, VSS2 will be ignored until the next GC. If there was a failure during GC and VSS2 looks valid at the next boot, VSS2 will be copied to VSS1. If both VSS1 and VSS2 look problematic at boot time, both will be erased. (This is a crude form of fault-tolerant writing (FTW); I note that EFI defines a distinct FTW protocol, and the BootROM contains a dedicated FTW area, but Apple does not seem to use "real" FTW, only what I described above.)
  • NVRAM access is an EFI Runtime service. This means that the EFI code that loads at boot time stays resident and handles NVRAM requests even after the OS (e.g. macOS) has loaded.
  • Normally, Garbage Collection occurs when the VSS has less than 2048 bytes free. However, Garbage Collection only occurs during boot time, not after the OS has loaded. If your VSS is nearly full and you run something (like an installer) that generates a lot of NVRAM variable activity, you're likely to encounter a failure.
  • It's hard to be certain, but it appears that variables larger than 2048 bytes might actually be able to cause problems, due to bugs/oversights in the EFI code. Various variable-handling routines check the size, then silently fail mid-execution if the variable is too large; if a huge variable slips past what should be the earliest size checks (which seems possible), it could potentially make a mess. I haven't experienced it firsthand, but the code doesn't look all that robust with respect to huge variables.
  • During system initialization, if VSS1 appears to contain no variables but is not completely erased (filled with 0xFF), the system will erase both VSS1 and VSS2 (erasure includes writing new headers as well). This is what occurs when Cmd-Opt-P-R is held at boot time. (Basically, if the variable area looks like it's invalid, the system just wipes out both VSS areas and starts with a clean slate.)
  • The RTCRAM contains variables used by the memory test, boot picker, encryption keys, etc. During normal operation, the RTCRAM is not cleared (except when holding Cmd-Opt-P-R at boot).
  • If you create an NVRAM variable named ResetNVRam (note: case-sensitive name) containing any value, at the next reboot the system will completely erase and rebuild both VSS1 and VSS2 and erase the RTCRAM variable area. Aside from removing the BR2032 battery or writing a specialized program, this is the only way I know of to clear RTCRAM. Note that the VSS1/VSS2 clearing is unequivocal. (sudo nvram ResetNVRam=1 is sufficient to trigger this.) This is the "deepest" (most thorough) variable clearing you can get, short of re-flashing a virgin BootROM image. (EDIT: I looked again, and it's even in my notes - a Cmd-Opt-P-R boot does clear RTCRAM. I overlooked that part of my notes when I first made this post. (Oops!) So, to be clear: Cmd-Opt-P-R, when it works, triggers a reboot to erase the RTCRAM variables, then VSS1 & VSS2. Setting the ResetNVRam variable causes the next boot to erase VSS1 & VSS2, then erase the RTCRAM variables. For reasons unclear, Cmd-Opt-P-R sometimes fails to produce the desired erasures; this may be the result of a timing issue with the RTCRAM clearing, or the initial VSS invalidation silently failing.)
  • Even though the EFI firmware uses a filesystem of sorts (FFVs), the NVRAM-handling code makes many assumptions about the size of the flash chip, and where the NVRAM data physically resides. This means that the NVRAM Data FFV cannot easily be moved (unlike, say, the code FFVs that follow it) or resized. Because of this, the NVRAM Data areas (VSS) get erased frequently, while the rest of the chip does not, which will eventually become the source of a chip failure.
  • The BootROM "knows" about a short list of SPI flash chips, and it adjusts its commands/timing based on the chip it identifies. If you replace your SPI flash chip with one that the BootROM doesn't recognize, it might not work at all. The list (which, surprisingly, includes 8, 2, and 1 MB chips as well as the standard 4MB - I've marked the 4MB chips with asterisks) is:
    Code:
    Winbond 25X64
    Winbond 25X32 *
    Eon M25P16
    Eon M25P32 *
    Atmel 45DB321 *
    ST Micro M25P32 *
    ST Micro M25P16
    Macronix 25L6436
    Macronix 25L3205 *
    Macronix 25L1605
    Spansion MBM29DL32TF *
    SST 25VF032 *
    SST 25VF016
    SST 25VF080
    If your replacement chip doesn't have a JEDEC ID that matches one of those, you might have a big problem. (Note that it's the JEDEC ID that's important; entire families of chips might share the same JEDEC ID, so the list of models above isn't exclusive. For example, the MX25L3206E (not listed above) reports the same JEDEC ID as the MX25L3205 (which is listed above), so those two should be interchangeable.)
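
Pulling together the boot-time behaviors from the list above (the ResetNVRam variable, recovery from VSS2 after a failed GC, wiping both stores when VSS1 looks invalid, and GC below 2048 bytes free), one possible reading of the decision flow is sketched below in Python. The field names, the `Action` labels, and especially the order of the checks are my assumptions for illustration; the post does not specify the order in which the firmware evaluates these conditions.

```python
from dataclasses import dataclass
from enum import Enum

GC_THRESHOLD = 2048  # bytes; GC runs at boot when less than this is free


class Action(Enum):
    FULL_RESET = "erase and rebuild VSS1 + VSS2, erase RTCRAM"
    ERASE_BOTH = "erase and re-header VSS1 + VSS2"
    RESTORE_FROM_VSS2 = "copy VSS2 over VSS1"
    GARBAGE_COLLECT = "condense into VSS2, then VSS1"
    USE_VSS1 = "use VSS1 as-is"


@dataclass
class BootState:
    """Hypothetical summary of what the firmware sees at boot."""
    reset_var_present: bool    # a ResetNVRam variable exists
    vss1_looks_valid: bool     # False after Cmd-Opt-P-R zeroed the list start
    vss2_looks_valid: bool
    gc_was_interrupted: bool   # GC completion flag was never written
    vss1_free_bytes: int


def boot_action(state):
    # Assumed ordering: explicit reset request first, then recovery,
    # then the "looks invalid, wipe it" path, then routine GC.
    if state.reset_var_present:
        return Action.FULL_RESET
    if state.gc_was_interrupted and state.vss2_looks_valid:
        return Action.RESTORE_FROM_VSS2
    if not state.vss1_looks_valid:
        return Action.ERASE_BOTH
    if state.vss1_free_bytes < GC_THRESHOLD:
        return Action.GARBAGE_COLLECT
    return Action.USE_VSS1


# Boot following a Cmd-Opt-P-R that successfully zeroed the VSS1 list start.
after_key_reset = boot_action(BootState(False, False, True, False, 30000))
# Boot following a GC that died partway through writing VSS1.
after_failed_gc = boot_action(BootState(False, False, True, True, 30000))
# Ordinary boot with a nearly full store.
nearly_full = boot_action(BootState(False, True, True, False, 1500))
```

In this model a Cmd-Opt-P-R whose 4-byte write silently failed would leave `vss1_looks_valid` true, so the next boot would fall through to ordinary GC or to no action at all.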

I was hoping to find a reasonably straightforward way to improve the "failing BootROM" situation by either enlarging the VSS areas or perhaps implementing some sort of wear-leveling. However, there's no handy table of addresses, offsets, and sizes that can be modified to fix the problem in one place - there are constants and assumptions scattered throughout the code, and overlooking any one of them would create a disastrous outcome. Enlarging the VSS is effectively out of the question. Adding dynamic wear leveling is also seriously problematic. However, I do hold out a slender thread of hope that it might be possible to relocate the VSS areas to a previously-unused area of the flash chip. That would effectively reset/extend the chip's "countdown to doom," since the frequently-erased area would be relocated to an area that's effectively brand-new.


Some fun facts:
  • If you had a Mac-compatible EFI video card, and your BootROM contained an EFI shell, holding the ESC key at boot time would boot into the EFI shell. (The key map is there; I suspect this was a testing/debugging hook that didn't get entirely removed.) If you're keen on the idea, you could potentially add an EFI shell to your BootROM and boot to it natively. (OpenCore or RefindPlus are a whole lot simpler, though.)
  • The Boot Chime sound is stored in the FFV 1F9CABF9-3F3C-4CFF-AED8-5FDF745A0DCC as a headless AIFF file (encoded with Apple IMA 4:1 (ADPCM)). If you wanted to change your boot chime to the first bars of "We Will Rock You" or "Baby Shark," you could just replace FFV 1F9CABF9-3F3C-4CFF-AED8-5FDF745A0DCC with your own choice of sounds. (I haven't tried this yet, but it should work.)

Finally, an interesting observation:
  • Based on examining the code, I can't see any reason why holding Cmd-Opt-P-R through 3+ boot chimes (aka a "Deep NVRAM Reset") would have any special effect. When holding Cmd-Opt-P-R, the first boot chime you hear is immediately after the RAM test, before the keyboard is initialized by EFI. When the keyboard is discovered and those keys are recognized, the RTCRAM variables are cleared, then the NVRAM variable space is invalidated and the system is rebooted. The second chime you hear is the boot that will reformat the VSS segments. Based on my analysis of the code, holding Cmd-Opt-P-R for more than two chimes simply performs redundant erasures of the VSS areas, increasing the chip wear without changing anything. (If repeated formatting does change something, that's probably indicative of a failing flash chip, or a silent failure of the Cmd-Opt-P-R code to actually invalidate the NVRAM variable space.)
    If you're able to boot your system to a desktop and open a Terminal window, typing
    sudo nvram ResetNVRam=1<enter> and then rebooting should clean your variable space just as well as Cmd-Opt-P-R can, and with only the one reboot and one VSS erasure.
EDIT: I certainly believe it when @tsialex and others say they see a difference with multiple Cmd-Opt-P-R reboots; I'm not suggesting that they don't. I just don't see a code path that supports it. The closest I can come is that the invalidation of the VSS - where it writes 4 bytes of 0 and immediately reboots - could silently fail. The write doesn't rely on the VSS being consistent, so it should work even if it's corrupt; however, it doesn't check any return codes at all, just throws those zeros at the VSS and reboots without checking to see that they got written. That means that the next reboot might not see an invalid VSS, so you'll either see no improvement, or possibly standard Garbage Collection (if it got triggered). If that's the case, then the longer you hold down Cmd-Opt-P-R, the more likely you are to (eventually) get a successful NVRAM reset. I really don't see any mechanism for a cascading effect, where subsequent resets build upon the previous resets and somehow make things "cleaner..." (And, IMHO, the ResetNVRam variable trick is still better, if you have a bootable system. Cmd-Opt-P-R basically fouls the NVRAM and relies on the next reboot's startup code to catch it and fix it. The ResetNVRam variable causes the VSS and RTCRAM to be completely erased and rebuilt immediately, no ifs, ands, or buts.)
 

tsialex

Contributor
Original poster
Jun 13, 2016
13,454
13,601
  • The NVRAM Data FFV contains two VSS stores, one FTW store, one Fsys store, and one Gaid store. All of those are defined by TianoCore; however, it appears that of those, only VSS is actually used by Apple's EFI.
The MP3,1 and MP4,1 log garbage collection runs in the FTW store. Apple used it at least until around MP41.0081.B03/B04; B07 dumps no longer have it. All Macs, even M1 ones, store hardware IDs in the Fsys area: it's where the hardware descriptor, SSN, HWC, SON and more, such as repair/refurbishment data, are stored.
  • The BootROM "knows" about a short list of SPI flash chips, and it adjusts its commands/timing based on the chip it identifies. If you replace your SPI flash chip with one that the BootROM doesn't recognize, it might not work at all. The list (which, surprisingly, includes 8, 2, and 1 MB chips as well as the standard 4MB - I've marked the 4MB chips with asterisks) is:
    Code:
    Winbond 25X64
    WinBond 25X32 *
    Eon M25P16
    Eon M25P32 *
    Atmel 45DB321 *
    ST Micro M25P32 *
    ST Micro M25P16
    Macronix 25L6436
    Macronix 25L3205 *
    Macronix 25L1605
    Spansion MBM29DL32TF *
    SST 25VF032 *
    SST 25VF016
    SST 25VF080
    If your replacement chip doesn't have a JEDEC ID that matches one of those, you might have a big problem.
The same efiflasher code is used for earlier and later Macs. Btw, Apple used the Macronix MX25L3206E in the mid-2012 models.
I was hoping to find a reasonably straightforward way to improve the "failing BootROM" situation by either enlarging the VSS areas or perhaps implementing some sort of wear-leveling. However, there's no handy table of addresses, offsets, and sizes that can be modified to fix the problem in one place - there are constants and assumptions scattered throughout the code, and overlooking any one of them would create a disastrous outcome. Enlarging the VSS is effectively out of the question. Adding dynamic wear leveling is also seriously problematic. However, I do hold out a slender thread of hope that it might be possible to relocate the VSS areas to a previously-unused area of the flash chip.
Apple did that with the MacPro6,1: there are multiple VSS stores used in rotation, clearly intended as a way to spread the wear across the SPI flash.
That would effectively reset/extend the chip's "countdown to doom," since the frequently-erased area would be relocated to an area that's effectively brand-new.


Finally, an interesting observation:
  • Based on examining the code, I can't see any reason why holding Cmd-Opt-P-R through 3+ boot chimes (aka a "Deep NVRAM Reset") would have any special effect. When holding Cmd-Opt-P-R, the first boot chime you hear is immediately after the RAM test, before the keyboard is initialized by EFI. When the keyboard is discovered and those keys are recognized, the NVRAM variable space is invalidated and the system is rebooted. The second chime you hear is the boot that will reformat the VSS segments. Based on my analysis of the code, holding Cmd-Opt-P-R for more than two chimes simply performs redundant erasures of the VSS areas, increasing the chip wear without changing anything. (If repeated formatting does change something, that's probably indicative of a failing flash chip.)
    If you're able to boot your system to a desktop and open a Terminal window, typing
    sudo nvram ResetNVRam=1<enter> and then rebooting should clean your variable space even more than Cmd-Opt-P-R can, and with only the one reboot and one VSS erasure.
While you haven't (maybe yet) found it in the code, it's clear from the dumps (when you start from the same source and do one or four consecutive NVRAM resets) that there are two different behaviors.

Btw, thx for looking at that!!!!
 

EvilMonk

macrumors 6502
Aug 28, 2006
330
64
Montreal, Canada
I found a few minutes to run through the BootROM code that handles NVRAM variables. Here are the highlights of what I found. (If I find more time (read: if a magical djinn appears and offers me infinite time), I'll post a detailed write-up with all the relevant protocol GUIDs, the flow, etc.) (Note that the following is based on a Mac Pro 5,1 144.0.0.0.0 BootROM; other Macs probably have similar NVRAM code, but the observations below may or may not apply. Your mileage may vary.)

First, some quick background so things will make a little more sense:
Flash memory's "erased" state sets all bits to 1 (e.g. hex 0xFF). You can change one or more bits in any flash byte to 0, but you can't change a 0 to a 1 without an erase operation. While individual bytes can be read or written, erases must occur in 4kB blocks - therefore, if you want to change one bit from 0 to 1, you have to erase the entire 4k block that contains it. Flash memory, like an SSD or the SPI BootROM in our Macs, physically wears out with repeated usage (the oxide layer between the transistors degenerates a tiny bit with each write, and degenerates significantly with each erase - not unlike the way writing and erasing words from a piece of paper wears the paper out over time). Because of this, steps are always taken to minimize the number of erasures that occur, and ideally, try to spread the erasures around so that the chip wears evenly (known as "wear leveling") and one tiny area doesn't wear out long before the rest of the chip does (known as "My Mac just became a brick.").

The EFI NVRAM area minimizes erasures this way: when a variable is written, a flag bit is set indicating that the variable is valid. When that variable gets deleted, that bit gets set to 0, indicating the variable is no longer valid and should be ignored. Variables are stored back-to-back in a list, with new variables written to the end of the list. Only when the list gets nearly full (or something goes wrong, or you manually clear the NVRAM) does an erasure occur - then all the valid variables get copied out to a RAM buffer, the NVRAM gets erased, then those valid variables get written back out to the NVRAM (now condensed, without all the deleted ones). Replacing an existing variable (such as changing your boot-args) marks the existing variable as invalid and creates a new variable at the end of the list - so, repeatedly changing your boot-args (or any other variable) without rebooting will reduce your NVRAM free space. Both EFI and MacOS create variables for various reasons; therefore, over time (and particularly when rebooting), the NVRAM variable space will repeatedly fill up and be condensed as part of normal operation. The VSS free space will naturally go down and up based on normal usage.

Highlights of my quick tour through the code:
  • Variables are limited to 2048 bytes, including the variable name (as stored in UTF-16) and a 32-byte header (which includes the variable's GUID). Normally, this isn't a problem; however, large certificates or panic reports may run into this limit.
  • There appears to be a limit of 122 variables in the VSS. This seems arbitrary; I can find no particular reason for this limitation to exist, or why the value 122 was chosen.
  • Setting an existing variable to the same value it already had does not change NVRAM at all. The system compares the name and the value, and if they're identical, it does nothing (again, to minimize NVRAM erasures).
  • Holding Cmd-Opt-P-R at boot does exactly two things: first, it writes 4 bytes of 0 to the beginning of the variable list in VSS1 (effectively invalidating the variable list); then, it reboots the system. There is no counter or other means to see how many reboots you've held Cmd-Opt-P-R through.
  • The NVRAM Data FFV contains two VSS stores, one FTW store, one Fsys store, and one Gaid store. All of those are defined by TianoCore; however, it appears that of those, only VSS is actually used by Apple's EFI.
  • While there are two VSS stores, the second one is only used during garbage collection (GC) (also known as Reclamation). During normal operation, only VSS1 is used. During GC, the condensed variables are written first to VSS2, then to VSS1, with a flag bit indicating whether or not the write was completed successfully. If all went well, VSS2 will be ignored until the next GC. If there was a failure during GC and VSS2 looks valid at the next boot, VSS2 will be copied to VSS1. If both VSS1 and VSS2 look problematic at boot time, both will be erased. (This is a crude form of fault-tolerant writing (FTW); I note that EFI defines a distinct FTW protocol, and the BootROM contains a dedicated FTW area, but Apple does not seem to use "real" FTW, only what I described above.)
  • NVRAM access is an EFI Runtime service. This means that the EFI code that loads at boot time stays resident and handles NVRAM requests even after the OS (e.g. MacOS) has loaded.
  • Normally, Garbage Collection occurs when the VSS has less than 2048 bytes free. However, Garbage Collection only occurs during boot time, not after the OS has loaded. If your VSS is nearly full and you run something (like an installer) that generates a lot of NVRAM variable activity, you're likely to encounter a failure.
  • It's hard to be certain, but it appears that variables larger than 2048 bytes might actually be able to cause problems, due to bugs/oversights in the EFI code. Various variable-handling routines check the size, then silently fail mid-execution if the variable is too large; if a huge variable slips past what should be the earliest size checks (which seems possible), it could potentially make a mess. I haven't experienced it firsthand, but the code doesn't look all that robust with respect to huge variables.
  • During system initialization, if VSS1 appears to contain no variables but is not completely erased (filled with 0xFF), the system will erase both VSS1 and VSS2 (erasure includes writing new headers as well). This is what occurs when Cmd-Opt-P-R is held at boot time. (Basically, if the variable area looks like it's invalid, the system just wipes out both VSS areas and starts with a clean slate.)
  • The RTCRAM contains variables used by the memory test, boot picker, encryption keys, etc. During normal operation, the RTCRAM is not cleared (including when holding Cmd-Opt-P-R at boot).
  • If you create an NVRAM variable named ResetNVRam (note: case-sensitive name) containing any value, at the next reboot the system will completely erase and rebuild both VSS1 and VSS2 and erase the RTCRAM variable area. Aside from removing the BR2032 battery or writing a specialized program, this is the only way I know of to clear RTCRAM. Note that the VSS1/VSS2 clearing is unequivocal. (sudo nvram ResetNVRam=1 is sufficient to trigger this.) This is the "deepest" (most thorough) variable clearing you can get, short of re-flashing a virgin BootROM image.
  • Even though the EFI firmware uses a filesystem of sorts (FFVs), the NVRAM-handling code makes many assumptions about the size of the flash chip, and where the NVRAM data physically resides. This means that the NVRAM Data FFV cannot easily be moved (unlike, say, the code FFVs that follow it) or resized. Because of this, the NVRAM Data areas (VSS) get erased frequently, while the rest of the chip does not, which will eventually become the source of a chip failure.
  • The BootROM "knows" about a short list of SPI flash chips, and it adjusts its commands/timing based on the chip it identifies. If you replace your SPI flash chip with one that the BootROM doesn't recognize, it might not work at all. The list (which, surprisingly, includes 8, 2, and 1 MB chips as well as the standard 4MB - I've marked the 4MB chips with asterisks) is:
    Code:
    Winbond 25X64
    Winbond 25X32 *
    Eon M25P16
    Eon M25P32 *
    Atmel 45DB321 *
    ST Micro M25P32 *
    ST Micro M25P16
    Macronix 25L6436
    Macronix 25L3205 *
    Macronix 25L1605
    Spansion MBM29DL32TF *
    SST 25VF032 *
    SST 25VF016
    SST 25VF080
    If your replacement chip doesn't have a JEDEC ID that matches one of those, you might have a big problem.
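The boot-time rules in the list above (the ResetNVRam variable, the VSS1/VSS2 fallback, and the 2048-byte GC threshold) can be sketched as a small decision function. This is only a toy model in Python, not the BootROM's code: the function name, the boolean inputs, and especially the order in which the checks run are assumptions made for illustration.

```python
from enum import Enum

# Free-space limit quoted in the post; below this, GC runs at boot time.
GC_THRESHOLD_BYTES = 2048


class BootAction(Enum):
    FULL_RESET = "erase VSS1, VSS2 and the RTCRAM variable area"
    ERASE_BOTH = "erase VSS1 and VSS2 and write new headers"
    RESTORE_FROM_VSS2 = "copy VSS2 to VSS1"
    GARBAGE_COLLECT = "condense valid variables (VSS2 first, then VSS1)"
    NONE = "leave NVRAM alone"


def boot_action(reset_var_set, vss1_valid, vss2_valid, gc_incomplete, free_bytes):
    """Pick one NVRAM action for a hypothetical boot.

    All parameters are illustrative flags, not real firmware fields.
    The ordering of the checks is an assumption.
    """
    if reset_var_set:
        # A ResetNVRam variable exists: the most thorough clearing.
        return BootAction.FULL_RESET
    if gc_incomplete and vss2_valid:
        # The previous GC failed part-way, but the VSS2 copy looks good.
        return BootAction.RESTORE_FROM_VSS2
    if not vss1_valid:
        # Variable area looks invalid: start with a clean slate.
        return BootAction.ERASE_BOTH
    if free_bytes < GC_THRESHOLD_BYTES:
        return BootAction.GARBAGE_COLLECT
    return BootAction.NONE
```

For example, `boot_action(False, True, True, False, 1500)` returns `BootAction.GARBAGE_COLLECT`, while the same call with 30000 free bytes returns `BootAction.NONE`. Note that nothing in this model runs after the OS has loaded, which matches the observation that a nearly full VSS can fail under heavy variable activity.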

I was hoping to find a reasonably straightforward way to improve the "failing BootROM" situation by either enlarging the VSS areas or perhaps implementing some sort of wear-leveling. However, there's no handy table of addresses, offsets, and sizes that can be modified to fix the problem in one place - there are constants and assumptions scattered throughout the code, and overlooking any one of them would create a disastrous outcome. Enlarging the VSS is effectively out of the question. Adding dynamic wear leveling is also seriously problematic. However, I do hold out a slender thread of hope that it might be possible to relocate the VSS areas to a previously-unused area of the flash chip. That would effectively reset/extend the chip's "countdown to doom," since the frequently-erased area would be relocated to an area that's effectively brand-new.


Some fun facts:
  • If you had a Mac-compatible EFI video card, and your BootROM contained an EFI shell, holding the ESC key at boot time would boot into the EFI shell. (The key map is there; I suspect this was a testing/debugging hook that didn't get entirely removed.) If you're keen on the idea, you could potentially add an EFI shell to your BootROM and boot to it natively. (OpenCore or RefindPlus are a whole lot simpler, though.)
  • The Boot Chime sound is stored in the FFV 1F9CABF9-3F3C-4CFF-AED8-5FDF745A0DCC as a headless AIFF file (encoded with Apple IMA 4:1 (ADPCM)). If you wanted to change your boot chime to the first bars of "We Will Rock You" or "Baby Shark," you could just replace FFV 1F9CABF9-3F3C-4CFF-AED8-5FDF745A0DCC with your own choice of sounds. (I haven't tried this yet, but it should work.)

Finally, an interesting observation:
  • Based on examining the code, I can't see any reason why holding Cmd-Opt-P-R through 3+ boot chimes (aka a "Deep NVRAM Reset") would have any special effect. When holding Cmd-Opt-P-R, the first boot chime you hear is immediately after the RAM test, before the keyboard is initialized by EFI. When the keyboard is discovered and those keys are recognized, the NVRAM variable space is invalidated and the system is rebooted. The second chime you hear is the boot that will reformat the VSS segments. Based on my analysis of the code, holding Cmd-Opt-P-R for more than two chimes simply performs redundant erasures of the VSS areas, increasing the chip wear without changing anything. (If repeated formatting does change something, that's probably indicative of a failing flash chip.)
    If you're able to boot your system to a desktop and open a Terminal window, typing
    sudo nvram ResetNVRam=1<enter> and then rebooting should clean your variable space even more than Cmd-Opt-P-R can, and with only the one reboot and one VSS erasure.
Very interesting read. I love getting a better understanding of the workings of my machine. Thanks!
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
Setting an existing variable to the same value it already had does not change NVRAM at all. The system compares the name and the value, and if they're identical, it does nothing (again, to minimize NVRAM erasures).
Very useful to know. I have updated the upcoming release of RefindPlus to do a comparison on NVRAM variable write calls and only write when these are different to avoid hammering the NVRAM with redundant writes. Good to know the firmware already does this but will leave it in place.
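The compare-before-write idea can be sketched roughly as follows. This is a minimal Python illustration, not RefindPlus code: `CountingStore` and `set_if_changed` are hypothetical names, and the dictionary stands in for the physical NVRAM.

```python
class CountingStore:
    """Hypothetical stand-in for physical NVRAM that counts real writes."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def get(self, name):
        return self.data.get(name)

    def set(self, name, value):
        self.data[name] = value
        self.writes += 1


def set_if_changed(store, name, value):
    """Write only when the stored value differs; return True if a write happened."""
    if store.get(name) == value:
        return False  # identical name and value: skip the redundant write
    store.set(name, value)
    return True
```

Calling `set_if_changed` twice with the same name and value leaves `store.writes` at 1, so repeated identical requests during boot would cost a single write in this model.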

As an aside, I found that Monterey can hit the NVRAM more than 30 times on a standard boot up ... with many of them seemingly the same item. Mojave gets up to 20 hits ... so not so much better actually.

I also added blocking of the panic dumps: while the search for an ultimate fix goes on, I put my pragmatist's hat on and decided half a loaf is better than none. The change went in just before a release, so it has been severely limited in scope because it has not been fully tested. I will look at porting it to OpenCore for their next v0.8.1 release.

If you're able to boot your system to a desktop and open a Terminal window, typing
sudo nvram ResetNVRam=1<enter> and then rebooting should clean your variable space even more than Cmd-Opt-P-R can, and with only the one reboot and one VSS erasure.
I suppose the equivalent can be done from a Terminal window in Recovery as well.
 

zzzippp

macrumors member
Jan 27, 2006
51
47
Portland, Oregon
As an aside, I found that Monterey can hit the NVRAM more than 30 times on a standard boot up ... with many of them seemingly the same item. Mojave gets up to 20 hits ... so not so much better actually.

What about macOS Big Sur?

I know it's somewhat off-topic, but my lack of in-depth knowledge about OpenCore leads me to wonder why MacPro3,1/4,1/5,1 users aren't taking advantage of OpenCore's emulated NVRAM feature to avoid any writes to physical NVRAM other than what's absolutely necessary. I assume there's a good reason preventing this since no popular OC on CMP solution enables it.
 

Macschrauber

macrumors 68030
Dec 27, 2015
2,980
1,487
Germany
I tested:
sudo nvram ResetNVRam=1

and it had no effect at all.

MemoryConfig was added (the circular log has gone 1 step further as with every regular reboot),

blessed EFI Bootloader was still blessed. Even Sound Volume was not changed.

Was a 4.1 with rebuilt 144.0.0.0 Firmware.

Booted natively into Mavericks (what I always use for firmware tests as it has no system integrity protection).
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
I assume there's a good reason preventing this since no popular OC on CMP solution enables it

The base config on the OpenCore thread has WriteFlash=false, but I suppose you mean LegacyEnable, which uses a plist for storing variables. This mode is indeed untested on the classic Mac Pro. The reason is that it is really just intended for machines without any NVRAM hardware and may lead to a very suboptimal experience.
 

tsialex

Contributor
Original poster
Jun 13, 2016
13,454
13,601
I tested:
sudo nvram ResetNVRam=1

and it had no effect at all.

MemoryConfig was added (the circular log has gone 1 step further as with every regular reboot),

blessed EFI Bootloader was still blessed. Even Sound Volume was not changed.

Was a 4.1 with rebuilt 144.0.0.0 Firmware.

Booted natively into Mavericks (what I always use for firmware tests as it has no system integrity protection).

Noticed the exact same behavior with my 2013 15" rMBP, I'll check with my Mac Pro later.


The base config on the OpenCore thread has WriteFlash=false, but I suppose you mean LegacyEnable, which uses a plist for storing variables. This mode is indeed untested on the classic Mac Pro. The reason is that it is really just intended for machines without any NVRAM hardware and may lead to a very suboptimal experience.

It is also meant for UEFI PCs with very tiny NVRAM volumes, a problem on a lot of PCs: most Lenovo desktops up to Intel Broadwell have a 64 KB NVRAM volume, 1/3 of the Mac Pro's.
 

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
I tested:
sudo nvram ResetNVRam=1

and it had no effect at all.

MemoryConfig was added (the circular log has gone 1 step further as with every regular reboot),

blessed EFI Bootloader was still blessed. Even Sound Volume was not changed.

Was a 4.1 with rebuilt 144.0.0.0 Firmware.

Booted natively into Mavericks (what I always use for firmware tests as it has no system integrity protection).

Noticed the exact same behavior with my 2013 15" rMBP, I'll check with my Mac Pro later.

Interesting. The code is definitely there, and it definitely works on my 4,1->5,1 with (reconstructed) 144.0.0.0.0 (EDIT: on my system, it works from MacOS proper; I've never tried it from Recovery). The value ("1") is ignored; it works just as well with sudo nvram ResetNVRam="InigoMontoya".

EDIT: And since this happens in the BootROM, the MacOS version should be irrelevant. The only obstacle here should be something that gets in between the OS setting the variable and the BootROM code (OpenCore being one possible example). On a vanilla system with SIP disabled, it should work.
 

Macschrauber

macrumors 68030
Dec 27, 2015
2,980
1,487
Germany
It seems to have worked in my case from Monterey recovery. After rebooting, my previously muted chime was restored and previous variables were cleared.

I tried with a running Mavericks system, as this was my open test rig where I can replace the flash IC easily.
 

Macschrauber

macrumors 68030
Dec 27, 2015
2,980
1,487
Germany
OK, it has to be run from Recovery.

I ran it from Mavericks Recovery; the Mac rebooted 2 times (I could hear the GPU fan howling 2 times).

Code:
before reset:

serial from firmware: CK2
Firmware MP51.007F.B03
old Bootblock of MP51.007F.B03
6 + 3 Memory Configs (ok)
0 + 0 xml (ok)
1 + 1 Kernel Panic Dumps Type A: Pointer Type
0 + 0 iCloud Tokkens (ok)
0 + 0 Microsoft Certificates (ok)
1 + 1 BluetoothActiveControllerInfos (ok)
1 + 1 BluetoothInternalControllerInfos (ok)
1 + 1 current-network (ok)
4 + 2 AAPL Path Properties (ok)
45184 Bytes free space of 65472
 


after ResetNVRam=1:

serial from firmware: CK2
Firmware MP51.007F.B03
old Bootblock of MP51.007F.B03
5 Memory Configs (ok)
0 xml (ok)
0 iCloud Tokkens (ok)
0 Microsoft Certificates (ok)
0 BluetoothActiveControllerInfos (ok)
0 BluetoothInternalControllerInfos (ok)
0 current-network (ok)
0 AAPL Path Properties (ok)
VSS2 is empty (ok after triple nvram reset or recent firmware rebuilt)
50816 Bytes free space of 65472

This is a dual-CPU 5,1 with 8 populated ECC RAM slots of 4 GB each that I just have on the bench.

Firmware untouched (yet).


at least it does what a forced garbage collection (4 times nvram reset) does.
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
can you run my dumper after it to see what effects it had.

This is two reboots after (needed to disable SIP):

Code:
Firmware 144.0.0.0 (latest)
Bootblock of 144.0.0.0 - rebuilt Firmware
Boot0001 is EFI\OC\OpenCore.efi
12 Memory Configs (ok)
0 xml (ok)
0 iCloud Tokkens (ok)
0 Microsoft Certificates (ok)
0 BluetoothActiveControllerInfos (ok)
0 BluetoothInternalControllerInfos (ok)
1 current-network (ok)
4 AAPL Path Properties (ok)
VSS2 is empty (ok after triple nvram reset or recent firmware rebuilt)
33024 Bytes free space of 65472
 

zzzippp

macrumors member
Jan 27, 2006
51
47
Portland, Oregon
The base config on the OpenCore thread has WriteFlash=false, but I suppose you mean LegacyEnable, which uses a plist for storing variables. This mode is indeed untested on the classic Mac Pro. The reason is that it is really just intended for machines without any NVRAM hardware and may lead to a very suboptimal experience.

Yes, I meant the LegacyEnable feature.

Strangely, just today when booted into macOS Mojave "natively", I discovered that the physical NVRAM of my "new" 6-core MacPro5,1 (listed in my sig) had a number of OC variables in it that I didn't expect to see (including efi-backup-boot-device & efi-backup-boot-device-data), despite the config.plist having WriteFlash=false.

From what I've read from @tsialex and elsewhere, WriteFlash=false only reduces the number of variables written to physical NVRAM; it does not prevent such writes completely. So I was just wondering aloud whether there's a way to avoid physical NVRAM writes altogether when using OC to boot macOS.
 

tsialex

Contributor
Original poster
Jun 13, 2016
13,454
13,601
Interesting, seems that nvram ResetNVRam=1 behaves exactly like a forced GC deep NVRAM reset. I'll check if it does anything when the circular log failed.

Edit: Not exactly and I used forced GC incorrectly here.
 

Macschrauber

macrumors 68030
Dec 27, 2015
2,980
1,487
Germany
I did the opposite:

I flashed back the firmware before I did ResetNVRam=1 and forced a GC by holding cmd-alt-p-r until 4 chimes.

Got exactly the same readings. A Binwalk run of the NVRAM module is the same to the letter.

So it seems that ResetNVRam=1 given in the Recovery (or createinstallmedia boot stick I guess) is a way to force the deep GC.

So it won't help, for example, with certificates if they have also found their way into the 2nd store.
 

tsialex

Contributor
Original poster
Jun 13, 2016
13,454
13,601
Did some tests and nvram ResetNVRam=1 behaves exactly the same as a forced GC, with a long running working NVRAM. I'll check with edge cases later this week.

From what I can see, it will be very useful when you have a still-bootable system and want to force GC with a wireless keyboard or a keyboard that doesn't do continuous NVRAM resets.

Edit: Not exactly and I used forced GC incorrectly here.
 

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
This may sound pedantic, but Cmd-Opt-P-R is not the same as Garbage Collection. As far as I can tell, GC only occurs when the VSS free space falls below 2kB, and isn't set up to be triggered manually (short of writing enough large variables to fill the VSS within 2k of its limit). With GC, valid variables are copied to RAM, the VSS is erased, and the valid variables are copied from RAM back to the VSS (without the invalid/deleted variables, which increases the free space). If you set MyVariable=12345 before GC occurs, you'll still see MyVariable=12345 after GC occurs - nothing is lost during GC, it just condenses the valid variables and increases the free space.

Cmd-Opt-P-R (indirectly) causes the entire VSS to be erased. After that, the "usual suspects" will re-create their various NVRAM variables (MemoryConfig, SystemAudioVolumeDB, bluetooth*, etc.). If you set MyVariable=12345 before a Cmd-Opt-P-R reboot, you will not see MyVariable=12345 afterward, because the entire VSS got erased. (So, GC and Cmd-Opt-P-R both increase free space, but GC is non-destructive, while Cmd-Opt-P-R is the "scorched earth" solution.)

As I noted in an earlier post, my theory is that because the Cmd-Opt-P-R mechanism ignores return/error codes, it's entirely possible for the invalidate/reboot/rebuild sequence to fail because the initial invalidation silently fails, meaning that multiple Cmd-Opt-P-R reboots increase the odds of a successful VSS erasure. The ResetNVRam variable trick has the advantage of directly erasing the VSS immediately. (Also, as I noted in an edit to an earlier post, I have never tried the ResetNVRam trick in Recovery - on my system, it works from MacOS proper.)
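To make the GC versus erase distinction concrete, here is a toy Python model of an append-only variable store. It is only a sketch: the `Vss` class and its methods are hypothetical, and the list of entries does not reflect the real on-flash layout or the VSS1/VSS2 split.

```python
class Vss:
    """Toy append-only variable store (not Apple's on-flash format)."""

    def __init__(self):
        self.entries = []  # each entry is [name, value, valid]

    def set(self, name, value):
        # Replacing a variable invalidates the old entry and appends a new one.
        for entry in self.entries:
            if entry[0] == name and entry[2]:
                entry[2] = False
        self.entries.append([name, value, True])

    def get(self, name):
        for entry_name, value, valid in self.entries:
            if entry_name == name and valid:
                return value
        return None

    def garbage_collect(self):
        # Keep only the valid entries; no visible variable is lost.
        self.entries = [e for e in self.entries if e[2]]

    def full_erase(self):
        # The end result of Cmd-Opt-P-R or ResetNVRam: everything goes.
        self.entries = []


vss = Vss()
vss.set("MyVariable", "1")
vss.set("MyVariable", "12345")   # old entry invalidated, new one appended
vss.garbage_collect()
print(vss.get("MyVariable"))     # 12345: GC keeps the valid value
vss.full_erase()
print(vss.get("MyVariable"))     # None: the erase wiped it
```

In this model GC only shrinks `entries` by dropping invalidated records, while `full_erase` discards the valid ones too, which is the "scorched earth" difference described above.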
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
I meant the LegacyEnable feature.
I was just wondering aloud whether there's a way to avoid physical NVRAM writes altogether when using OC to boot macOS.
Valid point. Considering where things stand, LegacyEnable does need a review to reconfirm that it is not the pragmatic position to take on cMP.
 

Macschrauber

macrumors 68030
Dec 27, 2015
2,980
1,487
Germany
This may sound pedantic, but Cmd-Opt-P-R is not the same as Garbage Collection. As far as I can tell, GC only occurs when the VSS free space falls below 2kB, and isn't set up to be triggered manually (short of writing enough large variables to fill the VSS within 2k of its limit). With GC, valid variables are copied to RAM, the VSS is erased, and the valid variables are copied from RAM back to the VSS (without the invalid/deleted variables, which increases the free space). If you set MyVariable=12345 before GC occurs, you'll still see MyVariable=12345 after GC occurs - nothing is lost during GC, it just condenses the valid variables and increases the free space.

Cmd-Opt-P-R (indirectly) causes the entire VSS to be erased. After that, the "usual suspects" will re-create their various NVRAM variables (MemoryConfig, SystemAudioVolumeDB, bluetooth*, etc.). If you set MyVariable=12345 before a Cmd-Opt-P-R reboot, you will not see MyVariable=12345 afterward, because the entire VSS got erased. (So, GC and Cmd-Opt-P-R both increase free space, but GC is non-destructive, while Cmd-Opt-P-R is the "scorched earth" solution.)

As I noted in an earlier post, my theory is that because the Cmd-Opt-P-R mechanism ignores return/error codes, it's entirely possible for the invalidate/reboot/rebuild sequence to fail because the initial invalidation silently fails, meaning that multiple Cmd-Opt-P-R reboots increase the odds of a successful VSS erasure. The ResetNVRam variable trick has the advantage of directly erasing the VSS immediately. (Also, as I noted in an edit to an earlier post, I have never tried the ResetNVRam trick in Recovery - on my system, it works from MacOS proper.)

Maybe I was not super clear.

I meant the deep NVRAM reset: the GC that is forced by multiple NVRAM resets with alt-cmd-p-r,

which leaves the $VSS2 store empty and is indicated by my dumper as a triple NVRAM reset.

It seems the NVRAM dumps after forcing a deep NVRAM reset are identical to the dumps after setting ResetNVRam=1.

This is a very useful finding.



The regular GC that runs when the 1st store is nearly full is a completely different thing, and it leaves VSS2 not empty.
 

Bmju

macrumors 6502a
Dec 16, 2013
702
767
1649758401878.png

Code:
VOID
OcLoadNvramSupport (
  IN OC_STORAGE_CONTEXT  *Storage,
  IN OC_GLOBAL_CONFIG    *Config
  )
{
  if (Config->Nvram.LegacyEnable && Storage->FileSystem != NULL) {
    OcLoadLegacyNvram (Storage->FileSystem, Config);
  }
  OcDeleteNvram (Config);
  OcAddNvram (Config);
  OcReportVersion (Config);
}
 
