Status
The first post of this thread is a WikiPost and can be edited by anyone with the appropriate permissions. Your edits will be public.
Status
Not open for further replies.
I'm only testing SSDs; I'll take a look at an HDD. It wouldn't be too crazy if the relaxed timings of an HDD improved the success rate.
Tried with a USB-connected HDD. Still hangs.

Have you guys tried slowing the machine down, for example with the boot-arg cpus=1?
Tried cpus=1. Still hangs.

One thing I noticed: shortly after the hang, sometimes I see the line "!BSD" appear in verbose mode. Does anybody know what that indicates?
 
If you were running OpenCore, you will still have the boot picker. I don't get the issue?
That's what I think my issue is: for some reason the boot picker isn't coming up anymore. I haven't done anything with the drive that has OpenCore on it, but something must have messed that up.

Going to try putting OpenCore on a USB stick and booting up with the clone drive.
 
One thing I noticed: shortly after the hang, sometimes I see the line "!BSD" appear in verbose mode. Does anybody know what that indicates?
In .../iokit/bsddev/IOKitBSDInit.cpp, there's this:
Code:
IOFindBSDRoot( char * rootName, unsigned int rootNameSize, dev_t * root, u_int32_t * oflags )
{
        mach_timespec_t     t;
        IOService *         service;
        IORegistryEntry *   regEntry;
        OSDictionary *      matching = NULL;
        OSString *          iostr;
        OSNumber *          off;
        OSData *            data = NULL;

        UInt32              flags = 0;
        int                 mnr, mjr;
        const char *        mediaProperty = NULL;
        char *              rdBootVar;
        char *              str;
        const char *        look = NULL;
        int                 len;
        bool                debugInfoPrintedOnce = false;
        bool                needNetworkKexts = false;
        const char *        uuidStr = NULL;

        static int          mountAttempts = 0;

        int xchar, dchar;

        // stall here for anyone matching on the IOBSD resource to finish (filesystems)
        matching = IOService::serviceMatching(gIOResourcesKey);
        assert(matching);
        matching->setObject(gIOResourceMatchedKey, gIOBSDKey);

        if ((service = IOService::waitForMatchingService(matching, 30ULL * kSecondScale))) {
                service->release();
        } else {
                IOLog("!BSD\n");
        }
// <snip> - (there's much more to this function)
It appears that the "!BSD" message only means that IOFindBSDRoot() waited the full 30 seconds for the IOBSD resource to be published and the wait timed out, since the message is logged in the else branch, when waitForMatchingService() returns nothing. The lack of "!BSD" should indicate that the IOBSD resource was matched within the 30-second window, so IOFindBSDRoot() could carry on.
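To make the control flow of that excerpt easier to reason about, here is a minimal user-space sketch of the same if/else shape: wait for something to be published, with a timeout, and log "!BSD" only when the wait comes back empty. This is an analogy, not IOKit code. `Resource`, `waitFor` and `findRootModel` are hypothetical names invented for illustration, and the timeout is passed in milliseconds rather than the kernel's 30 seconds.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>

// Hypothetical stand-in for the IOBSD resource (not an IOKit type).
struct Resource {
    std::mutex m;
    std::condition_variable cv;
    bool published = false;

    // Mark the resource as available and wake any waiter.
    void publish() {
        {
            std::lock_guard<std::mutex> lock(m);
            published = true;
        }
        cv.notify_all();
    }

    // Returns true if the resource was published before the timeout
    // expired, false if the wait timed out.
    bool waitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        return cv.wait_for(lock, timeout, [this] { return published; });
    }
};

// Mirrors the shape of the excerpt: success path when the wait returns
// something, "!BSD" only in the else branch (timeout).
std::string findRootModel(Resource& r, std::chrono::milliseconds timeout) {
    if (r.waitFor(timeout)) {
        return "matched";
    }
    return "!BSD";
}
```

Calling `findRootModel` on a `Resource` that nobody publishes returns "!BSD" once the timeout elapses; calling it on one that has already been published returns "matched" straight away.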
 
I see the line "!BSD" appear in verbose mode.

Seems your hang is this wait period:

C++:
// stall here for anyone matching on the IOBSD resource to finish (filesystems)
matching = IOService::serviceMatching(gIOResourcesKey);
assert(matching);
matching->setObject(gIOResourceMatchedKey, gIOBSDKey);

Perhaps some don't pass the ASSERT?

@cdf ... you need to try to see whether there is a pattern to when the log item shows up.

Also, someone said he could not install it to an external NVMe drive on an MBP14,3. That is a real candidate for a report to Apple.
Hopefully the person did put a report in ... might need a further prompt.
 
Perhaps some don't pass the ASSERT?
The assert()s are no-ops in production kernels. I don't have 11.3 to verify absolutely, but in every other production kernel, assert() is a no-op, and can be ignored when analyzing the code (it's mostly useful to understand the programmer's intent). It's definitely that way in 11.1, which is the newest 11.x I have here.
 
The assert()s are no-ops in production kernels.
Indeed ... Only active in the debug version. Presumably safely locked up in Apple HQ.
But if the condition fails, a crash will follow at some point.
 
Indeed ... Only active in the debug version. Presumably safely locked up in Apple HQ.
But if the condition fails, a crash will follow at some point.
A crash could follow at some point. I'm not trying to be argumentative, but I've now spent months poring over the MacOS source code and disassemblies, working on my AVX integration, and I know that many of the assert()s in the code are non-fatal. Some even appear to be quick-and-dirty debugging statements that probably should have been removed but got overlooked during cleanup.

In this case, we know that assert(matching) passes muster, because matching->setObject(... would immediately panic if it didn't (and the "!BSD" message would not be displayed).
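For anyone who hasn't seen a compiled-out assert before, here is a small sketch using the standard C/C++ `assert` with `NDEBUG` defined. It is only an analogy: the kernel uses its own assert macro and build configuration, which this does not reproduce. `check` and `countAssertEvaluations` are hypothetical helpers made up for the example.

```cpp
#define NDEBUG      // must be defined before <cassert> is included
#include <cassert>
#include <cstdlib>

// Counts how many times the asserted expression is actually evaluated.
int evaluations = 0;

bool check(bool value) {
    ++evaluations;
    return value;
}

// With NDEBUG defined, assert(expr) expands to nothing, so the failing
// condition below is never evaluated and execution simply continues.
int countAssertEvaluations() {
    evaluations = 0;
    assert(check(false));   // compiled out in this build
    return evaluations;     // stays at 0 because check() never ran
}
```

The point is that in such a build a failing condition is not detected at the assert itself. Whether anything goes wrong afterwards depends entirely on what the following code does with the value.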
 
Right, but that message itself does not halt the boot process, nor does it cause any panics during boot. In fact, I have seen it on my fully BS-supported MBP11,3 without any previous panic or restart. Just a normal boot or restart.
@TECK 's research specifically makes me think that it may all be the same issue: https://forums.macrumors.com/thread...port-for-older-mac-pros.2289056/post-29821781

But I agree that's definitely not proven.

Also, until proven otherwise, it seems as if that message at startup must mean that something has (quietly) failed, crashed or panicked somewhere, either during the boot itself (less likely) or in the previous shutdown (more likely) - but I agree that is also not proven.
 
Was speaking in general terms on what to expect from ASSERTs. Didn't expect Apple code would be so sloppy, but we learn every day.

In this case, we know that assert(matching) passes muster, because matching->setObject(... would immediately panic if it didn't (and the "!BSD" message would not be displayed).
Point is that the message, from what I can make of @cdf's report, is not there all the time.
So, speculating that in some cases, the ASSERT condition is not met and it chokes thereafter.

It seems he sees the message only after using a slow connection (USB drive) or after reducing the CPU count.
I asked him earlier whether there is a pattern to when this message comes up, to be sure.
 
My current work has me running in circles, so I could use a little break. My only interest in Big Sur is as a test platform, since I'll need to support it in the future; I have no plans to make it my daily driver. As such, my newest (and only) BS installation is the aforementioned 11.1, and I'm not in a good position to help with hands-on installation testing.

However, I can definitely poke around in the kernel, so: if someone wants to create a ZIP or .tgz (or other compressed archive) containing a complete panic report that's representative of the problem so many folks seem to be having, along with the following files, and PM it to me, I'll have a look and see if I can find anything useful to offer. (No promises, of course, other than another set of eyes looking for causes and possible solutions...)

Big Sur 11.3 files to include in the ZIP/.tgz:
Code:
/System/Library/Kernels/kernel
/System/Library/Extensions/IOPCIFamily.kext/IOPCIFamily
/System/Library/Extensions/IONVMeFamily.kext/Contents/MacOS/IONVMeFamily
(and a complete panic report from a system running those exact kernel/kexts)
 
My current work has me running in circles, so I could use a little break. My only interest in Big Sur is as a test platform, since I'll need to support it in the future; I have no plans to make it my daily driver. As such, my newest (and only) BS installation is the aforementioned 11.1, and I'm not in a good position to help with hands-on installation testing.

However, I can definitely poke around in the kernel, so: if someone wants to create a ZIP or .tgz (or other compressed archive) containing a complete panic report that's representative of the problem so many folks seem to be having, along with the following files, and PM it to me, I'll have a look and see if I can find anything useful to offer. (No promises, of course, other than another set of eyes looking for causes and possible solutions...)

Big Sur 11.3 files to include:
Code:
/System/Library/Kernels/kernel
/System/Library/Extensions/IOPCIFamily.kext/IOPCIFamily
/System/Library/Extensions/IONVMeFamily.kext/Contents/MacOS/IONVMeFamily
(and a complete panic report from a system running those exact kernel/kexts)
The problem is that the NVMe panic occurs before initialization. So most likely the only way is to make a video of the screen or slow down the boot process. I remember there was an OC option to slow the boot process, but I forgot how to do it.
 
@cdf ... you need to try to see whether there is a pattern to when the log item shows up.
I can't identify a pattern. Sometimes it shows up, sometimes it doesn't. HDD or SSD, it doesn't matter. And that goes for the prohibitory symbol, too. More data: When it does show up, "!BSD" appears exactly 30 seconds after the hang. Similarly, when it does show up, the prohibitory symbol appears exactly 1 minute after "!BSD" if that message showed up, or 1 minute after the initial hang otherwise.
 
The problem is that the NVMe panic occurs before initialization. So most likely the only way is to make a video of the screen or slow down the boot process. I remember there was an OC option to slow the boot process, but I forgot how to do it.
Are you saying that the panic doesn't produce a panic report? (I know firsthand that there are certainly a lot of instances of that!) Or perhaps a panic report doesn't get written to the disk because the NVMe isn't yet initialized?

If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present? In that case, a panic report should (hopefully) get produced on the non-NVMe boot drive...
 
Are you saying that the panic doesn't produce a panic report? (I know firsthand that there are certainly a lot of instances of that!) Or perhaps a panic report doesn't get written to the disk because the NVMe isn't yet initialized?

If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present? In that case, a panic report should (hopefully) get produced on the non-NVMe boot drive...
Actually, I found one panic log. Sometimes it is written, but sometimes it is not. I think you are right; I may have gotten it after booting to Catalina.
Post in thread 'Mac OS 11.3 has broken support for older Mac Pros' https://forums.macrumors.com/thread...port-for-older-mac-pros.2289056/post-29720393
 
If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present?
The hanging at boot occurs even with no NVMe device at all. This is not an NVMe thing. IONVMeFamily panicking is a symptom not a cause of this unfortunate situation.
 
@tsialex @cdf @startergo What do we know so far, for sure? From what I read, even with a stripped-down machine, we get random failures. I have no idea how it is possible for me to have a functional machine, after reading what everyone reports here.
 
@tsialex @cdf @startergo What do we know so far, for sure? From what I read, even with a stripped-down machine, we get random failures. I have no idea how it is possible for me to have a functional machine, after reading what everyone reports here.
Unfortunately, I'm not sure that we know anything for sure yet!

Regarding the consistency that you're seeing, my theory is that there's something in your hardware configuration that's mitigating the problem. You could test this by seeing if you can still boot consistently after removing PCIe devices one by one.

Question (for another theory): do you have the SMC fan bug?
 
With debug Lilu one can slow down the process. One second may be too much but this can be reduced:
Add liludelay=1000 to enable 1 second delay after each print for troubleshooting
 
do you have the SMC fan bug?
No, not during the repeated reboots. But I once experienced the video card fans spinning fast for a few seconds on 11.3. This was happening at the same frequency in 11.2.3, so nothing is different.
 
You could test this by seeing if you can still boot consistently after removing PCIe devices one by one.
The only PCIe devices I have are the Syba and Sonnet cards. Obviously I cannot remove the Syba card; it has the Big Sur (CH0) and Windows (CH1) disks. I am positive the Sonnet card has no influence on the system stability.

Is there a way I can save the boot log so we can see what is happening in my system? Can you post the procedure I need to follow? Can you compare the Plistlib EFI I have with yours? Can you try the Plistlib EFI in your system, just to be 100% sure?
 
Similarly, when it does show up, the prohibitory symbol appears exactly 1 minute after
Do you get a mount(x) failed where 'x' is a number that iterates up?
Either with the sign or otherwise?
 
No, not during the repeated reboots. But I once experienced the video card fans spinning fast for a few seconds on 11.3. This was happening at the same frequency in 11.2.3, so nothing is different.
No high PS or PCI fan speed on cold boots (as described here)?
 
@tsialex Would this help?

IMG_0661.jpg


If not, what log do I need to look at?
 