MP 1,1-5,1 Mac OS 11.3 has broken support for older Mac Pros

Macschrauber · May 11, 2021

Dayo said:
Did you try with the RP supply_apfs flag instead of efi file as well?

Assume the efi was from 0.11.2

No and yes: I just added the APFS EFI driver file. It was loaded and working.

Dayo · May 11, 2021

Might be worth using that as it would load the driver version consistent with the BS version.

You can't rule out issues coming from using an older version otherwise.

tsialex · May 11, 2021

Dewdman42 said:
So Alex, when you built my boot rom you added the more up to date rom drivers from the 2012 era. Is it a possibility right now that those who are having these problems are all using the 2010 era? Just asking. I will not be trying any of this because I don’t have a spare machine that can be offline for this kind of experimentation.

It's not BootROM related in any way - you can make it worse with a corrupted/full NVRAM, but you can't make it better.

All early-2009 to mid-2012 have the same PCIe related crashes, BootROM being pristine or not. You can flash the generic firmware upgrade (MP51.fd) and the crashes still happen.

JohnD · May 11, 2021

Macschrauber said:
the bootrom chip itself:

every reboot will write to the nvram section of the Firmware in that chip.

I dont recommend to let the box boot the whole night thru cause of the wear.

The 25L3205D is rated for 100,000 P/E cycles - that's rewriting the entire chip 100,000 times. Assuming every reboot rewrites the entire chip (which it doesn't), a 2009 Mac Pro would have had to have been rebooted 23x every single day for the past 12 years to exhaust that chip.

You can also buy a blank for $10 if you know someone with a SPI programmer, or buy one already flashed off eBay (with someone else's, likely blacklisted, serial number), solder it on, then flash your backup onto it.

But still, not a good idea to leave it in a reboot cycle all night.
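For reference, the arithmetic behind the 23x figure can be sketched as below. It assumes, as the post does for the sake of argument, that each reboot costs exactly one program/erase cycle; the constant names are illustrative only.

```python
# Rated program/erase cycles quoted in the post for the 25L3205D.
RATED_CYCLES = 100_000
YEARS_IN_SERVICE = 12      # 2009 to 2021, as in the post
DAYS_PER_YEAR = 365


def reboots_per_day_to_exhaust(rated_cycles: int, years: int) -> float:
    """Daily reboots needed to reach the rating, if one reboot cost one cycle."""
    return rated_cycles / (years * DAYS_PER_YEAR)


rate = reboots_per_day_to_exhaust(RATED_CYCLES, YEARS_IN_SERVICE)
print(f"{rate:.1f} reboots per day")
```

The result rounds to the 23 reboots per day mentioned above.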

tsialex · May 11, 2021

JohnD said:
The 25L3205D is rated for 100,000 P/E cycles - that's rewriting the entire chip 100,000 times. Assuming every reboot rewrites the entire chip (which it doesn't), a 2009 Mac Pro would have had to have been rebooted 23x every single day for the past 12 years to exhaust that chip.

You can also buy a blank for $10 if you know someone with a SPI programmer, or buy one already flashed off eBay (with someone else's, likely blacklisted, serial number), solder it on, then flash your backup onto it.

But still, not a good idea to leave it in a reboot cycle all night.

You understood it wrong: it's not the whole chip that can be rewritten 100k times. Each NOR flash cell/sector (it's a sectored chip) is certified for an endurance of over 100K erase/rewrite cycles (JEDEC JESD22-A117).

Since there is no wear levelling with SPI flash memories of this era, the sectors never change, and the NVRAM volume area can easily reach 100K cycles after all those years. That's why so many early-2009/mid-2010 Mac Pros are ending up with dead backplanes.

This Infineon paper "Endurance and Data Retention Characterization of Infineon Flash Memory" is more comprehensible than the JEDEC A117 standard:

https://www.cypress.com/file/369306/download

The endurance specification of a flash device should be evaluated in terms of the projected in-system rate of erasure for any given sector. The sectors used for data logging may rapidly accumulate erase cycles depending on the frequency and size of the data being captured. Such use may ultimately lead to those sectors failing first. As such, the shorter the Program/Erase interval time between Program/Erase cycles, the worse the data retention. Longer interval times between Program/Erase cycles can de-trap the excess trapped electrons between Program/Erase cycles, resulting in better data retention. Figure 8 shows an example of the retention lifetime over a variety of interval times, assuming 20 years retention lifetime after 10k Program/Erase cycles at an average of 55 degree Celsius cycling, under JEDEC test conditions.

The quote above represents exactly the same use case of the Mac Pro NVRAM volume inside the SPI flash memory that stores the BootROM image.
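The point about missing wear levelling can be illustrated with a toy model. This is only a sketch under made-up assumptions: the sector counts and the number of erases are hypothetical and do not describe the real BootROM layout. It compares the worst-case erase count when every erase hits the same few sectors against an idealised scheme that rotates erases over all sectors.

```python
from collections import Counter

# All three numbers are hypothetical and chosen only for illustration.
SECTORS = 16          # sectors a wear-levelling scheme could rotate over
NVRAM_SECTORS = 2     # sectors that always back the NVRAM area
ERASES = 1_000        # total erase operations over the device's life


def fixed_placement(erases: int) -> Counter:
    """No wear levelling: every erase lands on the same NVRAM sectors."""
    wear = Counter()
    for i in range(erases):
        wear[i % NVRAM_SECTORS] += 1
    return wear


def rotated_placement(erases: int) -> Counter:
    """Idealised wear levelling: erases rotate over all sectors."""
    wear = Counter()
    for i in range(erases):
        wear[i % SECTORS] += 1
    return wear


fixed = fixed_placement(ERASES)
rotated = rotated_placement(ERASES)
print("worst sector, fixed:  ", max(fixed.values()))
print("worst sector, rotated:", max(rotated.values()))
```

With the same total number of erases, the fixed sectors accumulate wear several times faster, which is why a per-sector rating matters more than a whole-chip figure.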

Macschrauber · May 11, 2021

Read the threads from @tsialex, he explained all in detail.

The flash chips die, whether from wear, age, or the continuous rewriting of the same cells in the NVRAM area. They die.

I have repaired more than a dozen boards with dead firmware chips, some of them still readable at a lower frequency when pulled off the board and read out with an external programmer.

Just wanted to give a warning: don't overstress the firmware chip with too many reboots, say all night long.

Macschrauber · May 11, 2021

Dayo said:
Might be worth using that as it would load the driver version consistent with the BS version.

You can't rule out issues coming from using an older version otherwise.

Tried supply_apfs and the EFI driver alone; with both, Big Sur hangs with USB hardware and ancient firmware.

eVasilis · May 12, 2021

Hi all

Has anyone tried consecutive safe boots?

Just a thought

mikas · May 12, 2021

Some of the active ones would have noticed this obviously sooner or later, but could there be some kind of a correlation here:
Yesterday at 22:08

JohnD · May 12, 2021

mikas said:
Some of the active ones would have noticed this obviously sooner or later, but could there be some kind of a correlation

I wouldn't think so, as Syncretic discovered the issue to be Apple using entirely new bootstrap code in 11.3b3, which is the version that we started having issues with. Highly unlikely (but possible) that Apple would include that code in previous OS updates, and especially not Mojave, as our 5,1 Mac Pros are still supported under Mojave. That would be GREAT if they did, however.

Macschrauber · May 12, 2021

mikas said:
Some of the active ones would have noticed this obviously sooner or later, but could there be some kind of a correlation here:
Yesterday at 22:08

I read that and also thought: aha, another race condition.

If that bootstrap code is timing critical, it could affect other supported Macs in special, untested configurations.

why not? So the ball goes back to the mothership to fix their code.

at least, hope so.

JohnD · May 12, 2021

Macschrauber said:
I read that and also thought: aha, another race condition.

If that bootstrap code is timing critical, it could affect other supported Macs in special, untested configurations.

why not? So the ball goes back to the mothership to fix their code.

at least, hope so.

That post is for a Catalina security update - I highly doubt Apple would update the bootstrap code, in a security update, for the previous OS, but you never know, I suppose.

Syncretic · May 12, 2021

Another quick update - good news/bad news. Among the changes to the startup code, Apple reorganized some (not all) things from a linear procedure to a priority list. Basically, instead of code saying "do A, then do B, then do C...", there's an array of functions to call and arguments to pass, each one with a subsystem and a rank assignment. (I didn't catch this at first, because this list is stored in its own separate data segment (__KLDDATA,__init_entry_set), so my initial disassembly missed it). Anyway, that list gets sorted at runtime by subsystem and rank, then as each of the subsystems gets initialized, all of the associated functions in the list get called in order of their rank. From a programming perspective, it's much more elegant and flexible; from a reverse-engineering perspective, it's a pain in the ass - and the eventual source code release is unlikely to help, because 11.1 uses this same mechanism (to a much smaller extent) and the 11.1/11.2 source did not include this part.

I was initially intrigued by the runtime sorting of the list, because they use qsort. Some variants of qsort use random pivots, which would result in slightly different outcomes with every run (sound familiar?). Unfortunately, after tracing through the qsort they used, it's deterministic (meaning it should generate the same output every time).

My current working theory is that by reorganizing the code in this manner, one or more functions ended up being executed earlier or later than they did before this change, resulting in either creating or exposing a race condition. The good news is that because they used a priority list, if we could identify the function(s) that need to be re-ordered, it would only take a one-byte (or perhaps a few-byte) patch to fix it. The bad news is that in 11.3, there are 2315 of these function/argument pairs to analyze in order to make that determination.

I'm going to take a break from this and ponder how I might automate at least part of this process; there's no way I'm going to slog through 2315 functions just to find the needle in this haystack.
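The priority-list mechanism described above can be sketched roughly as follows. This is not Apple's code: the entry layout, the function and argument names, and the rank values are all hypothetical, and the sketch simply runs the whole sorted list in one pass instead of interleaving calls with real subsystem initialization.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class InitEntry:
    subsystem: int                 # startup phase the entry belongs to
    rank: int                      # order within that phase
    func: Callable[[str], None]    # function to call
    arg: str                       # argument to pass


def run_startup(table: List[InitEntry]) -> None:
    """Sort by (subsystem, rank), then call each entry in that order."""
    for entry in sorted(table, key=lambda e: (e.subsystem, e.rank)):
        entry.func(entry.arg)


log: List[str] = []

# Hypothetical entries, deliberately stored out of order.
table = [
    InitEntry(subsystem=2, rank=1, func=log.append, arg="start_pci"),
    InitEntry(subsystem=1, rank=2, func=log.append, arg="setup_timers"),
    InitEntry(subsystem=1, rank=1, func=log.append, arg="setup_locks"),
]

run_startup(table)
original_order = list(log)

# "Patching" a single rank value is enough to move one call earlier.
log.clear()
patched = [table[0], replace(table[1], rank=0), table[2]]
run_startup(patched)
patched_order = list(log)

print(original_order)
print(patched_order)
```

Changing one small field reorders the calls, which is the property that would make a few-byte patch possible once the right entry is identified. The sort here is deterministic, matching what was found for the qsort in question.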

RLTechs1 · May 12, 2021

Please see my post #2,516 as I do not want to be repeating posts in multiple threads.

For some reason unbeknownst to me, my flashed 2009 cMP at work is doing just fine with no new issues on 11.3

JohnD · May 12, 2021

Thinking backwards a bit, is there any way to make a supported machine exhibit this race condition? Then it would be an official bug report that Apple would need to fix. I don't have a supported machine, or I'd be hard at work trying every combination to make it fail right now.

vit9696 · May 12, 2021

@Syncretic, you can use macOS DEBUG or DEVELOPMENT kernels for easier disassembling. For me IDA extracts the dSYM and makes things quite readable in Hex-Rays out of the box. DEBUG builds are much better, but Apple stopped providing them after 11.0b1 or so at least for the time being.

I partly reverse-engineered the sysctl initialisation code in 11.3 for my own needs, and the only change I remember was sysctl init code being moved to kernel_startup_initialize_upto (https://github.com/apple/darwin-xnu/blob/xnu-7195.81.3/osfmk/kern/startup.c) in addition to all the other functions that were moved earlier in 11.x. I had not noticed anything unusual in it as Apple was restructuring their constructor code from the beginning of Big Sur. Also, the code is pretty much open; I am not sure what in particular you mean regarding closed source.

RLTechs1 · May 13, 2021

JohnD said:
Thinking backwards a bit, is there any way to make a supported machine exhibit this race condition? Then it would be an official bug report that Apple would need to fix. I don't have a supported machine, or I'd be hard at work trying every combination to make it fail right now.

I think I still have one 2018 Mac mini that hasn't been upgraded at the shop yet. If it's a slow day today I can check it out and give it a try, although the other two that are upgraded are not showing any signs of issues yet.

LordeOurMother · May 13, 2021

@VitaminK You mention NVRAM and bricking the Mac Pro in your post. Out of curiosity, this is the same brick that requires one to then purchase a MATT card correct? Does merely resetting the NVRAM too many times have the same effect?

Sorry it's a bit unrelated, but my biggest fear is accidentally bricking my Mac Pro.

tsialex · May 13, 2021

LordeOurMother said:
@VitaminK You mention NVRAM and bricking the Mac Pro in your post.

The warning on the first post was written by me, please follow the link, read the instructions and check it.

LordeOurMother said:
Out of curiosity, this is the same brick that requires one to then purchase a MATT card correct?

Yes.

LordeOurMother said:
Does merely resetting the NVRAM too many times have the same effect?

Sorry it's a bit unrelated, but my biggest fear is accidentally bricking my Mac Pro.

This is a lot more complicated than it appears. NVRAM with Intel Macs is not battery-backed SRAM as with PPCs, where you just remove the power and it's fresh from the factory.

What's called an "NVRAM reset" is a triggered/forced garbage collection that happens inside the NVRAM volume that is stored in the BootROM. The NVRAM is not really erased when you reset it.

A lot of info inside the NVRAM volume is permanent, while some is almost permanent and some is transient. The reset NVRAM procedure removes the transient (like default boot device/default sound volume/etc), the "deep NVRAM reset", when working, removes the transient and some of the almost permanent (like the MemoryConfig variables), but you never get a pristine NVRAM volume with an NVRAM reset - you never get it back factory fresh like it's possible with a PPC Mac (the exception is with BootROM reconstructions where a firmware engineer recreates the never booted image of your Mac Pro BootROM).

I've read that someone forced hundreds of NVRAM resets overnight using an Arduino. This is beyond crazy and doing it will kill the SPI flash memory (read post #755).
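The idea of a store that fills up and is then garbage-collected, rather than erased, can be shown with a toy model. This is a generic sketch of an append-only variable store and an assumption about the general technique only; it does not reproduce the actual Mac Pro NVRAM volume format, and the class, capacity and variable name are invented for illustration.

```python
class ToyVariableStore:
    """Append-only store: an update adds a record and invalidates the old one."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.records = []        # each record is [name, value, valid]
        self.erase_count = 0     # how often the area had to be erased

    def free_space(self) -> int:
        return self.capacity - len(self.records)

    def garbage_collect(self) -> None:
        """Keep only valid records; in real flash this needs an erase."""
        self.records = [r for r in self.records if r[2]]
        self.erase_count += 1

    def set(self, name: str, value: str) -> None:
        if self.free_space() == 0:
            self.garbage_collect()
        if self.free_space() == 0:
            raise RuntimeError("store is full of live variables")
        for rec in self.records:
            if rec[0] == name and rec[2]:
                rec[2] = False   # old copy stays in place, marked invalid
        self.records.append([name, value, True])

    def get(self, name: str) -> str:
        for rec in reversed(self.records):
            if rec[0] == name and rec[2]:
                return rec[1]
        raise KeyError(name)


store = ToyVariableStore(capacity=8)
for i in range(20):
    # Rewriting one transient variable still consumes space each time.
    store.set("boot-device", f"disk{i}")

print("erases:", store.erase_count, "free:", store.free_space())
```

In this model every garbage collection costs an erase of the same area, so forcing collections in a tight loop adds wear without ever producing a factory-fresh store.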

LordeOurMother · May 13, 2021

tsialex said:
The warning on the first post was written by me, please follow the link, read the instructions and check it.

Yes.

This is a lot more complicated than it appears. NVRAM with Intel Macs is not battery-backed SRAM as with PPCs, where you just remove the power and it's fresh from the factory.

What's called an "NVRAM reset" is a triggered/forced garbage collection that happens inside the NVRAM volume that is stored in the BootROM. The NVRAM is not really erased when you reset it.

A lot of info inside the NVRAM volume is permanent, while some is almost permanent and some is transient. The reset NVRAM procedure removes the transient (like default boot device/default sound volume/etc), the "deep NVRAM reset", when working, removes the transient and some of the almost permanent (like the MemoryConfig variables), but you never get a pristine NVRAM volume with an NVRAM reset - you never get it back factory fresh like it's possible with a PPC Mac (the exception is with BootROM reconstructions where a firmware engineer recreates the never booted image of your Mac Pro BootROM).

I've read that someone forced hundreds of NVRAM resets overnight using an Arduino. This is beyond crazy and doing it will kill the SPI flash memory (read post #755).

Like usual, you're a fount of information! Thanks.

I really ought to get the information pulled off of my NVRAM at some point, just to be safe in case it happens.

Macschrauber · May 13, 2021

In "Mac OS 11.3 has broken support for older Mac Pros" (forums.macrumors.com):

I ran the tests again, this time I took care to set Startup Disk from the prefpane to the Big Sur 11.3 Disk. I restarted more than 20 times, this time logging the nvram free space thru a complete nvram circle (from filled to full to garbage collection to the same filled situation) and it...

You will see that garbage collection has run in the 3rd to 4th screenshot; look at Free Space.

The script used for dumping the firmware and analysing the NVRAM a little can be downloaded here:

www.dropbox.com

If you want to start healthy, at least do a forced garbage collection (four NVRAM resets in a row).

Macschrauber · May 14, 2021

Maybe this helps a little to understand the early boot process for 11.3

Even though this is for M1 Macs, it should have a lot in common:

Booting an M1 Mac: external disks and local boot policy

How an M1 Mac can start up from an external bootable disk, and how that can fail. All about boot security policy, and how that’s applied.

eclecticlight.co

also the linked pdf is very informative

start reading at page 41

https://manuals.info.apple.com/MANUALS/1000/MA1902/en_US/apple-platform-security-guide.pdf

startergo · May 14, 2021

Macschrauber said:
Maybe this helps a little to understand the early boot process for 11.3

even this is for M1 Macs it should have a lot in common:

Booting an M1 Mac: external disks and local boot policy

How an M1 Mac can start up from an external bootable disk, and how that can fail. All about boot security policy, and how that’s applied.

eclecticlight.co

also the linked pdf is very informative

start reading at page 41

https://manuals.info.apple.com/MANUALS/1000/MA1902/en_US/apple-platform-security-guide.pdf

Intel-based Mac computers without a T2 chip

An Intel-based Mac without a T2 chip doesn’t support secure boot. Therefore the UEFI firmware loads the macOS booter (boot.efi) from the file system without verification, and the booter loads the kernel (prelinkedkernel) from the file system without verification. To protect the integrity of the boot chain, users should enable all of the following security mechanisms:

• System Integrity Protection (SIP): Enabled by default, this protects the booter and kernel against malicious writes from within a running macOS.

• FileVault: This can be enabled in two ways: by the user or by a mobile device management (MDM) administrator. This protects against a physically present attacker using Target Disk Mode to overwrite the booter.

• Firmware Password: This can be enabled in two ways: by the user or by an MDM administrator. This prevents a physically present attacker from launching alternate boot modes such as recoveryOS, Single User Mode, or Target Disk Mode from which the booter can be overwritten. This also prevents booting from alternate media, by which an attacker could run code to overwrite the booter.

cdf · May 14, 2021

FileVault. Has anyone tried booting ≥11.3 with it? I wonder if the booting issue would be any different.

startergo · May 15, 2021

cdf said:
FileVault. Has anyone tried booting ≥11.3 with it? I wonder if the booting issue would be any different.

It looks like the prohibitory sign is probably due to FileVault not being able to unlock the hard drive, so Apple's suggestion is actually to disable it:

ProfessorBong said:
So I was on the phone with Apple support today and my adviser said the error messages in my boot log are very similar to those you would see when FileVault is not able to unlock your hard drive while booting.
