Status
Not open for further replies.

Syncretic

Original poster
UPDATE 27sep21:
  • latebloom is now deprecated/unnecessary. Please see this thread for more information.

I have created a kext that may provide a workaround for the Big Sur 11.3+ race condition. Alpha testing with a small group has provided encouraging results, so I'm now posting it publicly to see if those results withstand broader scrutiny.


UPDATE 1sep21:
  • On 5aug21, I posted version 0.20 here. When I have a bit more time, I'll incorporate that information into Post #1, but for now, please see the linked post for details. The main difference in v0.20 is the addition of the lbloom= boot-arg, which allows latebloom parameters to be condensed into a single item. (The original boot-args also work.)
    I have updated the version attached to Post #1 to v0.20. The original usage instructions are still valid; if you're interested in using the lbloom= boot-arg, see the original post for details.
  • On 25aug21, I posted version 0.21 here. This is strictly an experimental version; if you're already using latebloom to good effect, there is no reason to "upgrade" to this version. Details are in the linked post. If you're interested in experimenting, please feel free to do so and report back here. Until there's more data available, I am reluctant to make v0.21 the "official" version.


UPDATE 4aug21:


Some important points before we start:
  • THIS IS STILL EXPERIMENTAL. It's not polished, and there are no guarantees - it may work extremely well for you, or you may see little or no benefit. It could conceivably even make things worse on your system, although I have yet to see any such case, and have no reason to believe it might do so. Be aware that at this stage, trying this makes you a guinea pig.
  • THIS IS A WORKAROUND, NOT A SOLUTION. As far as I know, no one has yet identified the true root cause of the race condition. Therefore, no one can offer an actual "fix," since we don't know what the real problem is. This kext helps mitigate the symptoms, but it does not address the root cause - therefore, any related problems may still be present. In particular, I've seen a few reports of disk corruption attributed to the race condition (although I have yet to see anything like that myself); be aware that using this kext probably does little or nothing to mitigate that risk.
  • THIS IS STILL IN BETA TESTING. As such, I'm currently looking for somewhat more experienced Mac users to give it a try and provide feedback. If you're unsure what to do if I simply say "inject it via OpenCore or just link it directly into the kernel" without further detail, then please don't try this just yet - if it proves to be useful, I'll clean up the code and I (or some kind soul) will write up step-by-step instructions for installation and use. (Anyone is free to try this, but my availability for helping folks with installation questions is extremely limited right now.)
  • IF IT AIN'T BROKE, DON'T FIX IT. If your system isn't experiencing the race condition bug on Big Sur 11.3+ or Monterey, you shouldn't bother with this kext. All it's going to do is increase your boot time.
The kext is called latebloom (see the bottom of this post if you're curious about the name). It hooks the PCI bus probe code, and inserts a pause for every iteration through the PCI bus. It does not touch the filesystem, it does not touch the network, it's only active during boot (before the login screen), and its only output is some print statements if you have verbose/debugging enabled. It won't do anything at all if it's loaded on MacOS older than Big Sur. (Let me reiterate here that this is EXPERIMENTAL - it's very much like having an old TV that flickers, so you bang on the side of the case until the picture clears up. We can't directly address the true cause, because we don't know what it is, but we can shake up the boot process and apparently improve the odds of a successful boot.)

This has been tested on various versions of Big Sur 11.3+, and on three betas of Monterey. It's primarily intended for Mac Pro 4,1 and 5,1 systems, but it should run on any Intel-based Mac (if you're so inclined). Note that while it's effective on either warm (restart) or cold (power-up) boots, testing shows it to be much more effective on cold boots, for reasons currently unknown. I welcome more test results to see if we can figure out why, and perhaps do something about it.

The kext can either be injected via OpenCore or linked directly into the kernel. (I have not tested direct kernel linking, so that's still an action item.) There's nothing special about the OpenCore injection; I have set MinKernel to 20.4.0 to ensure that it only gets loaded on 11.3+. The Kernel/Add values I used:
Code:
<dict>
    <key>Arch</key>
    <string>x86_64</string>
    <key>BundlePath</key>
    <string>latebloom.kext</string>
    <key>Comment</key>
    <string>PCI delay for BS/Monterey</string>
    <key>Enabled</key>
    <true/>
    <key>ExecutablePath</key>
    <string>Contents/MacOS/latebloom</string>
    <key>MaxKernel</key>
    <string></string>
    <key>MinKernel</key>
    <string>20.4.0</string>
    <key>PlistPath</key>
    <string>Contents/Info.plist</string>
</dict>
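As a rough illustration of what the MinKernel setting above does, here is a small Python sketch. It is a simplification, not OpenCore's actual matching code; the function names parse_darwin and kext_would_load are made up for this example. It relies on Big Sur 11.3 reporting Darwin kernel 20.4.0 (11.2.3 is 20.3.0, and Monterey is 21.x).

```python
def parse_darwin(version: str) -> tuple:
    """Turn '20.4.0' into (20, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def kext_would_load(kernel: str, min_kernel: str = "20.4.0",
                    max_kernel: str = "") -> bool:
    """Simplified MinKernel/MaxKernel check (hypothetical helper).

    An empty string means "no bound", mirroring the empty MaxKernel above.
    """
    k = parse_darwin(kernel)
    if min_kernel and k < parse_darwin(min_kernel):
        return False
    if max_kernel and k > parse_darwin(max_kernel):
        return False
    return True


print(kext_would_load("20.3.0"))  # False (Big Sur 11.2.3)
print(kext_would_load("20.4.0"))  # True  (Big Sur 11.3)
print(kext_would_load("21.0.0"))  # True  (Monterey)
```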
If you have kernel verbose/debug enabled, you'll see some startup messages that start with _____[ !!! *** latebloom *** !!! ]:, which might be informative. Notably, if you see HOOK NOT PLACED, the kext didn't actually do anything because it couldn't find the correct code to hook. PLEASE NOTIFY ME IF THIS HAPPENS. (This should not appear on any current Big Sur or Monterey releases, but new betas might trigger it.)

The kext uses three numeric boot-args for configuration. Note that unlike kernel boot-args, these values must be numeric, must be decimal, and must contain one to four digits. If a latebloom boot-arg is found but what follows the "=" is non-numeric, it will be interpreted as 0.
  • latebloom=N sets the delay, in milliseconds. If latebloom=N is not present, a "safe" default of 60ms is used. If latebloom=0 is set (or if N is non-numeric, effectively making it 0), the kext will not place its hook, and will do nothing at all. This is an easy way to disable the kext (another being to set the OpenCore Enabled tag to false).
  • lb_range=N sets a range for random delays. By default, every delay is exactly the same (either the latebloom=N value or the default 60ms). If lb_range is set, the delay for each PCI bus probe is random in the range (latebloom +/- lb_range) milliseconds. For example, latebloom=90 lb_range=20 results in random delays between 70..110 ms (90 +/- 20). If lb_range=N is not present, it defaults to 0 (no random changes, always use the same delay value). Using random delays may help avoid deadlocks.
  • lb_debug=N enables additional debugging messages. If N is 0, or if lb_debug is not present, only some progress messages are displayed during boot. If lb_debug is set to 1, additional debugging messages are printed, including one for each iteration of the PCI bus probe, along with the actual delay value for that loop. At present, the only valid values are 1 or 0 (or no lb_debug boot-arg present).
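To make the rules above concrete, here is a rough Python sketch of how the three boot-args could be interpreted. It only illustrates the behavior described in this post; it is not latebloom's actual code, and the names parse_value, latebloom_config and probe_delay_ms are made up for the example. Two things are assumptions, since the post doesn't specify them: any value that isn't exactly one to four decimal digits is treated as 0, and the random delay is drawn uniformly and never allowed below 0 ms.

```python
import random
import re

DEFAULT_DELAY_MS = 60  # the "safe" default when latebloom=N is absent


def parse_value(boot_args: str, name: str, default: int) -> int:
    """Return the numeric value of `name=N` from a boot-args string.

    Absent arg -> default. Non-numeric value -> 0 (as described in the post).
    Assumption: anything that is not exactly 1-4 decimal digits is also 0.
    """
    for token in boot_args.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return int(value) if re.fullmatch(r"[0-9]{1,4}", value) else 0
    return default


def latebloom_config(boot_args: str) -> dict:
    """Collect the three settings the way the post describes them."""
    delay = parse_value(boot_args, "latebloom", DEFAULT_DELAY_MS)
    return {
        "delay_ms": delay,
        "range_ms": parse_value(boot_args, "lb_range", 0),
        "debug": parse_value(boot_args, "lb_debug", 0) == 1,
        "hook_enabled": delay != 0,  # latebloom=0 means "place no hook"
    }


def probe_delay_ms(delay_ms: int, range_ms: int, rng=random) -> int:
    """Delay for one PCI bus probe iteration: delay +/- range, inclusive."""
    if range_ms == 0:
        return delay_ms
    low = max(0, delay_ms - range_ms)  # assumption: never go below 0 ms
    return rng.randint(low, delay_ms + range_ms)


cfg = latebloom_config("-v latebloom=90 lb_range=20")
print(cfg["delay_ms"], cfg["range_ms"])  # 90 20
```

With latebloom=90 lb_range=20, every call to probe_delay_ms(90, 20) returns a value somewhere in 70..110, matching the example in the list above.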
If the kext encounters a problem trying to hook the PCI code, it will display an error message and pause the boot process for 4 seconds (to give you a chance to see the message). From there, your system will boot as usual, and the kext will do nothing. If there are no errors hooking the PCI code, the boot will proceed at normal speed (subject to whatever PCI delays you have set). Depending on your setup, there will likely be 50-100 PCI bus probe iterations; multiply that by your delay value to get a rough estimate of the impact to boot time. For example, if you set latebloom=1500, you're adding 1.5 seconds per loop - if there are 60 loops on your system, you've thus added 90 seconds to your boot time, which can feel like forever - so be careful with large values. Also note that unless you have lb_debug=1 in your boot-args, you'll see "silent" delays where your boot might appear to hang - be patient, especially if you're using large values (anything over, say, 200).
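The boot-time arithmetic above is simple enough to sketch in a few lines of Python. This is only the rough estimate described in the paragraph (delay multiplied by the number of PCI bus probe iterations); the function name added_boot_seconds is made up for this example, and the real iteration count on your system has to be observed, e.g. by counting lines with lb_debug=1.

```python
def added_boot_seconds(delay_ms: int, probe_iterations: int) -> float:
    """Rough extra boot time: one delay per PCI bus probe iteration."""
    return delay_ms * probe_iterations / 1000


# The example from the text: latebloom=1500 with 60 loops.
print(added_boot_seconds(1500, 60))  # 90.0

# The default 60 ms delay over the typical 50-100 iterations.
print(added_boot_seconds(60, 50), added_boot_seconds(60, 100))  # 3.0 6.0
```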

Note that if the kext is injected via OpenCore or linked directly into the kernel, it will always show up in kextstat (or other lists of loaded kexts), even if it failed to place its hook or if you've disabled it with latebloom=0. MacOS sees it as loaded, but it's not doing anything; seeing it as loaded tells you nothing about whether or not it successfully hooked or added any delays. The only evidence for that is in the verbose boot messages and the duration of your boot process.

The best results seem to be in the 50-150ms range. However, depending on your CPU clock speed, the number of PCI devices you have, the number of USB devices you have, the number of disks you have, and other configuration items, your value will likely be unique. The only way to determine what works best for you is to experiment - start with some value, see how it works, try increasing or decreasing it, see how it works, and eventually home in on the best value for your system. I'm requesting that once you've decided on an optimal (or at least "good enough") value for your system, please post in this thread with your system configuration and your latebloom parameters; I'm hoping we can collect enough of these to make a table of "best guess starting values" so that later users won't have to go through this process (or at least will have fewer steps to go through). If those parameters are consistent, and paint a fairly clear picture of which values work best on which configurations, I can incorporate that into the kext itself so it can choose a better default based on the system it's running on.

This is an unsigned kext, so you'll almost certainly need SIP disabled to link it directly into the kernel; OpenCore will apparently allow you to inject the kext without disabling SIP. As with my other kexts posted on MacRumors, the attached file is a ZIP of a TGZ file, so you'll need to extract it twice.

Good luck, and please post any results you get, good or bad. This may still prove to be yet another dead end, but it might also provide at least a narrow path past 11.2.3 (depending upon your use-case).

(Regarding the name "latebloom" - This is the 18th kext-based mitigation process I have attempted. The first one held the system in single-CPU mode until boot was completed, then enabled all the processors, creating a "late bloomer" situation. That method didn't prove fruitful, but the name stuck for every successive process.)

Version history:
Code:
12jul21   0.17   Initial public beta release
14jul21   0.19   Added support for Monterey beta 3
5aug21   0.20   Added condensed "lbloom=" boot-arg
25aug21   0.21   Added "Phase 1/Phase 2" support
 

Attachments

  • latebloom-0.20-RELEASE.zip
    12.1 KB
User-reported results:
(I will probably change the format of this as time goes on)

The simplest way to produce a complete report is to use the script provided by @Macschrauber.
NOTE: I had been including the entirety of each report in this post; however, MacRumors apparently has a 100k limit per post, and we've managed to overflow that, so I'm now only including links to the original posts (as well as any updates that have been posted). If you need details about a particular user's report, click through to their original posts.

NOTE: Some users have observed that the best-case values for latebloom= and lb_range vary between MacOS versions (e.g. what works well for 11.4 may work poorly for 11.5, and may not work at all for 12.0b2 - and vice versa). Be aware that when you change MacOS versions, you may need to tweak latebloom as well.

NOTE: When installing or updating MacOS 11.3+, you may experience fewer hangs and other issues if you remove as many non-boot drives as is practical prior to the install, then replace them after installation is complete.

Mac Pro 3,1 Systems



Mac Pro 4,1 Systems



Mac Pro 4,1 -> 5,1 Systems

@Enricote - original report
(Updated with more details here)
(Another update here)
@KvR - original report
(Updated with more details here)


Mac Pro 5,1 Systems

@kkinto - original report (200/20/1)
(Update: 150/20/1 - updated report)
@mrpink1337 - original report (first system)
@mrpink1337 - original report (second system)
@MacRumors3590 - original report
(Updated with more system detail here)
(Another update here)
(And another here)
@Macschrauber - original report
(Update to 70/10/1 here)
(Another update here)


iMac 8,1 Systems



iMac 11,3 Systems



Hackintosh Systems

@lulujyc - original report
(Editor's note from @Syncretic: just for comparison, the 6c/12t Coffee Lake ES described here most likely falls into the i7-8086K or i7-8700* family. Unless @lulujyc says otherwise...)

Hi -- I know this forum isn't for hackintosh support but this kext definitely helped me with my Dell Precision 7530 laptop setup so just wanna post my test results here for others experiencing similar situations on 11.3+ even if they're not (1) using a pre-SNB platform and (2) using a real Mac.

latebloom=200 lb_range=20

Dell Precision 7530
OpenCore 0.7.1
CPU: Coffee Lake 6c12t, Engineering Sample, 2.0-3.6
Memory: 2x8GB DDR4 3200MHz
WiFi/BT: DW1820a

PCIe:
• AMD WX 4150 Graphics
• Three NVMe SSDs
• Realtek SD Card reader
• Intel Titan Ridge TB3 Controller
• BCM94356 Wireless

Native USB:
• USB mouse
• USB 3.0 Dock
• USB sound card
 
Many thanks for your work! ❤️ I will test today, hopefully the cheese grater lives on...
 
Great work, I will be testing shortly on my MacPro3,1. macOS Big Sur installation processes are a reliable way for me to test the race condition, plus an excuse to fix the installation from my random root volume tests ;p

Main question I have:
Using random delays may help avoid deadlocks.
Curious, what's the logic behind having a random delay each boot? Ofc I understand these boot issues are random for many, though having the delay randomized as well throws me off.

Besides that, if you're alright with it, I can create a new OCLP branch and have users easily test your kext. What would you recommend as the best default values? Ofc there will be a settings entry to let people change the values. Or do you feel the default 60ms provided by no args is the best default, with the override simply left to the user to set manually?

(Oh, and if you feel this kext is too early and shouldn't have mass testing yet, I can hold off on adding it until you feel it's ready.)
 
(I’m on the road right now, please forgive any typos.) Probably too early for true mass testing. It’s not “random delay each boot,” it’s “random delay each time through the PCI bus probe loop” - you’ll have 50-100 of those per boot, with delays like 66/73/54/97…(etc), or they will all be the same if no range is specified. I’m hoping the results from this thread will lead to better default(s), which can then be used in later versions and wider distribution.
 
For me it is 100ms and no randomization. But it may be different for others. I tried 60, 100, 180, 300
Agreed, 100ms on my MacPro3,1 works wonderfully, it seems. No hangs at all reinstalling 11.4, which always hung at least once before. Ofc it's best to test throughout the week as well before coming to conclusions.

Another note: one of the developers on our Discord, EduCovas, has an iMac8,1 machine that could almost never boot Monterey. With this kext and the delay set to 250ms, it seems to boot reliably now! 200ms and below were hit or miss, with most being a miss. Overall boot time is now 2 minutes, but that seems quite reasonable, especially for having a stable machine.
 
According to my preliminary testing, latebloom has made booting into macOS 11.4 recovery very reliable. No boot hangs for delays ≥ 50 ms. However, with a delay of 30 ms, the hanging returns.

I was able to boot into a macOS 11.4 installer on the first try with the default delay and got a fresh installation underway. The first reboot, however, experienced a hang, and two attempts were necessary to finish the installation.

Further testing revealed that booting into the OS requires a more careful selection of the delay. In my case, the sweet spot seems to be 100 ms. My parameters are as follows:

Code:
latebloom=100 lb_debug=1

I stopped the testing after 7 successful boots in a row. Without the kext, I was lucky to get 1 successful boot in 7 tries! I'll do more testing, but so far, the kext seems to work stupendously!
 
1 out of 5 or more successful boots on the Monterey Beta2.
Thanks. This is quite helpful to know the before and after.

That's great to hear! Wow.
 
Hi, just tried - on my Monterey Beta 2 volume - using your kext, with latebloom=60 lb_range=0 added to my boot-args (to facilitate experimenting) and then rebooted about 10 times with variations from 30ms-120ms and range 10 or 20 or 0. However, for my setup, every boot stalled at 'AMFIInitializelocalSigningPublicKey: failed to get local signing public key (e00002bc)'. I only got through to the desktop once I set the kext load to 'false'. Sorry that I can't offer more positive feedback. Hopefully this experimental kext will especially help on others' 11.4 or 11.5 loads. I'm not going to bother with any BS after 11.2.3 and just hope that the final release of 12.x loads as consistently fast and reliably as it does with the latest OCLP 0.2.3 release.
 
I hope some R&D guys will come around this thread and modify the kernel especially where they placed "possible race condition" notes.
 
Have you verified that the kext successfully set its hook? (I think I need to create a way to verify that after boot, since it’s hard to be sure when the messages whiz by so quickly.)
 
One can use:
Code:
log show --predicate  'sender == "latebloom"' --start $(date "+%Y-%m-%d") --debug


This is the important line:
Code:
2021-07-12 13:35:37.937624-0400 0x72       Default     0x0                  0      0    kernel: (latebloom) _____[ !!! *** latebloom *** !!! ]: Hook placed successfully.  Count = 0

Edit: Here is a one-liner to check whether the hook was placed successfully:
Code:
log show --predicate  'sender == "latebloom" and message contains "Hook placed successfully"' --start $(date "+%Y-%m-%d") --debug
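For anyone who would rather script the check, here is a small Python sketch that classifies saved log text. It is only an illustration: the function name hook_status is made up, and it simply searches for the two strings mentioned in this thread ("Hook placed successfully" from the log line above, and "HOOK NOT PLACED" from Post #1). The exact wording of the kext's other messages is not known here, so anything else is reported as "unknown".

```python
def hook_status(log_text: str) -> str:
    """Classify latebloom log output as 'placed', 'not placed', or 'unknown'."""
    lines = [line for line in log_text.splitlines() if "latebloom" in line]
    if any("HOOK NOT PLACED" in line for line in lines):
        return "not placed"
    if any("Hook placed successfully" in line for line in lines):
        return "placed"
    return "unknown"  # no matching latebloom message in the text


# Sample taken from the log line quoted earlier in this post.
sample = (
    "2021-07-12 13:35:37.937624-0400 0x72 Default 0x0 0 0 kernel: (latebloom) "
    "_____[ !!! *** latebloom *** !!! ]: Hook placed successfully.  Count = 0"
)
print(hook_status(sample))  # placed
```

On a Mac, the text could come from the log show command above, for example by running it with subprocess.run(..., capture_output=True, text=True) and passing the resulting stdout to hook_status.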
 
With lb_debug=1, I found that the delays were quite apparent with the lines ending in *_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_*_. I suppose that these wouldn't appear if the hook is not placed.
 
You are correct, those lines only appear if the kext is active. (Also, the reason for the long and oddly-formatted verbose boot lines is that I have a 4k display, so the words are tiny and they scroll by quickly; with oddball lines like the one you quoted, it's easier to spot them amid all the noise. To find "normally formatted" lines, I had to record the boot screen in slow motion on my iPhone and play it back frame-by-frame...)

Since the ultimate goal is to either turn lb_debug off or turn verbose booting off entirely, I'll be working on a way to easily check the hook status from a booted system. @startergo's method is good, but I'm looking for something simpler, especially for Terminal-averse users.
 
One question about this workaround: will it negatively impact DPC latency once the system has booted and the OS is operating normally? It would be nice if this workaround were only in effect during boot.

Maybe some of you can run the command-line "latency" command with and without this workaround for comparison.
 
You can run the log show command from AppleScript with the line below.

Note the backslashes (\) escaping the inner double quotes.

Code:
display dialog (do shell script "log show --predicate  'sender == \"latebloom\"' --start $(date \"+%Y-%m-%d\") --debug")

Edit: app attached.
 

Attachments

  • latebloom log.app.zip
    49.8 KB
This kext hooks the IOPCIFamily PCI Bus Probe routine. As far as I can tell, that's only ever called during boot, so latebloom is effectively only active during boot. If there are other instances where the PCI bus gets probed, the delays would still take place; however, I haven't found any cases where the PCI bus gets probed more than once (I would be extremely interested in hearing about any that exist). After the system is booted, latebloom just sits there quietly, taking up a little memory and humming Édith Piaf tunes to itself.
 
Some help please

I guess the latebloom delay is entered in the boot-args section of config.plist. As this is missing from my config, could somebody guide me or post their config.plist?

Thanks
 
Good to hear. It would be interesting to hear about before and after results using the "latency" command.
 