Status
The first post of this thread is a WikiPost and can be edited by anyone with the appropriate permissions. Your edits will be public.
Status
Not open for further replies.

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
I'm only testing SSDs, I'll take a look at a HDD - it wouldn't be too crazy that with the relaxed timings for a HDD the success rate improves.
Tried with a USB-connected HDD. Still hangs.

Have you guys tried to slow the machine down like with the bootarg cpus=1 ?
Tried cpus=1. Still hangs.

One thing I noticed: shortly after the hang, sometimes I see the line "!BSD" appear in verbose mode. Does anybody know what that indicates?
 
  • Like
Reactions: Bmju

Jigga Beef

macrumors 6502
Jan 11, 2009
252
23
Philadelphia, Pa
If you were running OpenCore, you will still have the boot picker. I don't get the issue?
That's what I think my issue is for some reason the boot picker isn't coming up anymore, I haven't done anything with the drive that has opencore on it but something must have messed that up.

Going to try and put opencore on a USB stick and boot up with the clone drive.
 

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
One thing I noticed: shortly after the hang, sometimes I see the line "!BSD" appear in verbose mode. Does anybody know what that indicates?
In .../iokit/bsddev/IOKitBSDInit.cpp, there's this:
Code:
IOFindBSDRoot( char * rootName, unsigned int rootNameSize, dev_t * root, u_int32_t * oflags )
{
        mach_timespec_t     t;
        IOService *         service;
        IORegistryEntry *   regEntry;
        OSDictionary *      matching = NULL;
        OSString *          iostr;
        OSNumber *          off;
        OSData *            data = NULL;

        UInt32              flags = 0;
        int                 mnr, mjr;
        const char *        mediaProperty = NULL;
        char *              rdBootVar;
        char *              str;
        const char *        look = NULL;
        int                 len;
        bool                debugInfoPrintedOnce = false;
        bool                needNetworkKexts = false;
        const char *        uuidStr = NULL;

        static int          mountAttempts = 0;

        int xchar, dchar;

        // stall here for anyone matching on the IOBSD resource to finish (filesystems)
        matching = IOService::serviceMatching(gIOResourcesKey);
        assert(matching);
        matching->setObject(gIOResourceMatchedKey, gIOBSDKey);

        if ((service = IOService::waitForMatchingService(matching, 30ULL * kSecondScale))) {
                service->release();
        } else {
                IOLog("!BSD\n");
        }
// <snip> - (there's much more to this function)
It appears that the "!BSD" message only means that IOFindBSDRoot() waited the full 30 seconds for the IOBSD resource to be matched and the wait timed out: waitForMatchingService() returned NULL, so the else branch ran. The lack of "!BSD" should indicate that the wait completed within those 30 seconds.
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
I see the line "!BSD" appear in verbose mode.

Seems your hang is this wait period:

C++:
// stall here for anyone matching on the IOBSD resource to finish (filesystems)
matching = IOService::serviceMatching(gIOResourcesKey);
assert(matching);
matching->setObject(gIOResourceMatchedKey, gIOBSDKey);

Perhaps some don't pass the ASSERT?

@cdf ... you need to try to see whether there is a pattern to when the log item shows up.

Also, someone said he could not install it on an external NVMe drive with an MBP14,3. That is a real candidate for a report to Apple.
Hopefully the person did put a report in ... might need a further prompt.
 
  • Like
Reactions: Bmju

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
Perhaps some don't pass the ASSERT?
The assert()s are no-ops in production kernels. I don't have 11.3 to verify absolutely, but in every other production kernel, assert() is a no-op, and can be ignored when analyzing the code (it's mostly useful to understand the programmer's intent). It's definitely that way in 11.1, which is the newest 11.x I have here.
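As a generic illustration of what "no-op" means here, the following is a sketch using a hypothetical SKETCH_ASSERT macro and a hypothetical SKETCH_RELEASE switch, not the kernel's actual assert definition. When the check is compiled out, the condition is never even evaluated, and execution simply continues to whatever comes next.

```cpp
#include <cassert>

// Hypothetical switch standing in for a production (release) build.
#define SKETCH_RELEASE 1

#if SKETCH_RELEASE
#define SKETCH_ASSERT(expr) ((void)0)      // compiled out: expr is not evaluated
#else
#define SKETCH_ASSERT(expr) assert(expr)   // debug build: aborts if expr is false
#endif

int g_checksRun = 0;  // counts how often the asserted condition was evaluated

// Condition used inside the assert; records that it was called.
bool isValid(const int* p) {
    ++g_checksRun;
    return p != nullptr;
}

// Returns *p, or -1 when p is null. The assert documents intent only;
// with SKETCH_RELEASE set it has no effect on control flow.
int readValue(const int* p) {
    SKETCH_ASSERT(isValid(p));
    return p ? *p : -1;
}
```

With SKETCH_RELEASE set to 1, readValue(nullptr) returns -1 and isValid() is never called. Setting it to 0 would make the same call abort at the assert.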
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
The assert()s are no-ops in production kernels.
Indeed ... Only active in the debug version. Presumably safely locked up in Apple HQ.
But if the condition fails, a crash will follow at some point.
 

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
Indeed ... Only active in the debug version. Presumably safely locked up in Apple HQ.
But if the condition fails, a crash will follow at some point.
A crash could follow at some point. I'm not trying to be argumentative, but I've now spent months poring over the MacOS source code and disassemblies, working on my AVX integration, and I know that many of the assert()s in the code are non-fatal. Some even appear to be quick-and-dirty debugging statements that probably should have been removed but got overlooked during cleanup.

In this case, we know that assert(matching) passes muster, because matching->setObject(... would immediately panic if it didn't (and the "!BSD" message would not be displayed).
 
  • Like
Reactions: Bmju and Dayo

Bmju

macrumors 6502a
Dec 16, 2013
702
768
Right, but that message itself does not halt the boot process, nor does it cause any panics during boot. In fact, I have seen it on my fully BS-supported MBP11,3 without any previous panic or restart. Just a normal boot or restart.
@TECK 's research specifically makes me think that it may all be the same issue: https://forums.macrumors.com/thread...port-for-older-mac-pros.2289056/post-29821781

But I agree that's definitely not proven.

Also, until proven otherwise, it seems as if that message at startup must mean that something has (quietly) failed, crashed or panicked somewhere, either during the boot itself (less likely) or in the previous shutdown (more likely) - but I agree that is also not proven.
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
Was speaking in general terms about what to expect from ASSERTs. Didn't expect Apple code would be so sloppy, but we learn every day.

In this case, we know that assert(matching) passes muster, because matching->setObject(... would immediately panic if it didn't (and the "!BSD" message would not be displayed).
Point is that the message, from what I can make of @cdf's report, is not there all the time.
So, speculating that in some cases, the ASSERT condition is not met and it chokes thereafter.

It seems he sees the message only after using a slow connection (USB Drive) or after reducing CPU.
Asked him earlier whether there is a pattern to when this message comes up, to be sure.
 

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
My current work has me running in circles, so I could use a little break. My only interest in Big Sur is as a test platform, since I'll need to support it in the future; I have no plans to make it my daily driver. As such, my newest (and only) BS installation is the aforementioned 11.1, and I'm not in a good position to help with hands-on installation testing.

However, I can definitely poke around in the kernel, so: if someone wants to create a ZIP or .tgz (or other compressed archive) containing a complete panic report that's representative of the problem so many folks seem to be having, along with the following files, and PM it to me, I'll have a look and see if I can find anything useful to offer. (No promises, of course, other than another set of eyes looking for causes and possible solutions...)

Big Sur 11.3 files to include in the ZIP/.tgz:
Code:
/System/Library/Kernels/kernel
/System/Library/Extensions/IOPCIFamily.kext/IOPCIFamily
/System/Library/Extensions/IONVMeFamily.kext/Contents/MacOS/IONVMeFamily
(and a complete panic report from a system running those exact kernel/kexts)
 
  • Like
Reactions: LucMac

startergo

macrumors 603
Sep 20, 2018
5,022
2,283
My current work has me running in circles, so I could use a little break. My only interest in Big Sur is as a test platform, since I'll need to support it in the future; I have no plans to make it my daily driver. As such, my newest (and only) BS installation is the aforementioned 11.1, and I'm not in a good position to help with hands-on installation testing.

However, I can definitely poke around in the kernel, so: if someone wants to create a ZIP or .tgz (or other compressed archive) containing a complete panic report that's representative of the problem so many folks seem to be having, along with the following files, and PM it to me, I'll have a look and see if I can find anything useful to offer. (No promises, of course, other than another set of eyes looking for causes and possible solutions...)

Big Sur 11.3 files to include:
Code:
/System/Library/Kernels/kernel
/System/Library/Extensions/IOPCIFamily.kext/IOPCIFamily
/System/Library/Extensions/IONVMeFamily.kext/Contents/MacOS/IONVMeFamily
(and a complete panic report from a system running those exact kernel/kexts)
The problem is that the NVMe panic occurs before initialization. So most likely the only way is to make a video of the screen or slow down the boot process. I remember there was an OC option to slow the boot process, but I forgot how to do it.
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
@cdf ... you need to try to see whether there is a pattern to when the log item shows up.
I can't identify a pattern. Sometimes it shows up, sometimes it doesn't. HDD or SSD, it doesn't matter. And that goes for the prohibitory symbol, too. More data: When it does show up, "!BSD" appears exactly 30 seconds after the hang. Similarly, when it does show up, the prohibitory symbol appears exactly 1 minute after "!BSD" if that message did show up, or otherwise after the initial hang.
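The 30-second gap is consistent with the timeout passed to waitForMatchingService() in the IOFindBSDRoot() excerpt quoted earlier in the thread: the else branch that logs "!BSD" runs only when the wait gives up. A minimal user-space sketch of that control flow follows. It is not IOKit code: waitForResource() and logLineFor() are hypothetical names, and simple polling stands in for whatever notification mechanism the kernel actually uses.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical stand-in for IOService::waitForMatchingService(): polls
// isReady until it returns true or the timeout expires.
// Returns true if the resource appeared in time, false on timeout.
bool waitForResource(const std::function<bool()>& isReady,
                     std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!isReady()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // gave up: nothing matched within the timeout
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return true;
}

// Same if/else shape as the excerpt: returns the text that would be logged.
std::string logLineFor(const std::function<bool()>& isReady,
                       std::chrono::milliseconds timeout) {
    if (waitForResource(isReady, timeout)) {
        return "";    // matched in time: nothing is logged
    }
    return "!BSD";    // timed out: the else branch logs the message
}
```

With a probe that never reports ready, logLineFor(probe, std::chrono::seconds(30)) would return "!BSD" only after the full 30 seconds. Under this reading, the message marks the end of a stalled wait rather than the cause of the stall.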
 
  • Like
Reactions: Bmju and Dayo

Syncretic

macrumors 6502
Apr 22, 2019
311
1,533
The problem is that the NVMe panic occurs before initialization. So most likely the only way is to make a video of the screen or slow down the boot process. I remember there was an OC option to slow the boot process, but I forgot how to do it.
Are you saying that the panic doesn't produce a panic report? (I know firsthand that there are certainly a lot of instances of that!) Or perhaps a panic report doesn't get written to the disk because the NVMe isn't yet initialized?

If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present? In that case, a panic report should (hopefully) get produced on the non-NVMe boot drive...
 

startergo

macrumors 603
Sep 20, 2018
5,022
2,283
Are you saying that the panic doesn't produce a panic report? (I know firsthand that there are certainly a lot of instances of that!) Or perhaps a panic report doesn't get written to the disk because the NVMe isn't yet initialized?

If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present? In that case, a panic report should (hopefully) get produced on the non-NVMe boot drive...
Actually, I found one panic log. Sometimes it is written, but sometimes it is not. I think you are right, I may have gotten it after booting to Catalina.
Post in thread 'Mac OS 11.3 has broken support for older Mac Pros' https://forums.macrumors.com/thread...port-for-older-mac-pros.2289056/post-29720393
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
If it's the latter, didn't I see folks saying that it happens even when booting from a non-NVMe drive, if an NVMe drive is present?
The hanging at boot occurs even with no NVMe device at all. This is not an NVMe thing. IONVMeFamily panicking is a symptom not a cause of this unfortunate situation.
 
  • Like
Reactions: Bmju

TECK

macrumors 65816
Nov 18, 2011
1,129
478
@tsialex @cdf @startergo What do we know so far, for sure? From what I read, even with a stripped down machine, we got random failures. I have no idea how it is possible for me to have a functional machine, after reading what everyone reports here.
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
@tsialex @cdf @startergo What do we know so far, for sure? From what I read, even with a stripped down machine, we got random failures. I have no idea how it is possible for me to have a functional machine, after reading what everyone reports here.
Unfortunately, I'm not sure that we know anything for sure yet!

Regarding the consistency that you're seeing, my theory is that there's something in your hardware configuration that's mitigating the problem. You could test this by seeing if you can still boot consistently after removing PCIe devices one by one.

Question (for another theory): do you have the SMC fan bug?
 

startergo

macrumors 603
Sep 20, 2018
5,022
2,283
With debug Lilu one can slow down the process. One second may be too much but this can be reduced:
Add liludelay=1000 to enable 1 second delay after each print for troubleshooting
 

TECK

macrumors 65816
Nov 18, 2011
1,129
478
do you have the SMC fan bug?
No, not during the repeated reboots. But I once experienced the video card fans spinning fast for a few seconds on 11.3. This was happening at the same frequency in 11.2.3, so nothing is different.
 

TECK

macrumors 65816
Nov 18, 2011
1,129
478
You could test this by seeing if you can still boot consistently after removing PCIe devices one by one.
The only PCIe devices I have are the Syba and Sonnet cards. Obviously I cannot remove the Syba card, as it has the Big Sur (CH0) and Windows (CH1) disks. I am positive the Sonnet card has no influence on system stability.

Is there a way I can save the boot log so we can see what is happening in my system? Can you post the procedure I need to follow? Can you compare the Plistlib EFI I have with yours? Can you try the Plistlib EFI in your system, just to be 100% sure?
 

Dayo

macrumors 68020
Dec 21, 2018
2,257
1,279
Similarly, when it does show up, the prohibitory symbol appears exactly 1 minute after
Do you get a mount(x) failed where 'x' is a number that iterates up?
Either with the sign or otherwise?
 

cdf

macrumors 68020
Jul 27, 2012
2,256
2,583
No, not during the repeated reboots. But I once experienced the video card fans spinning fast for a few seconds on 11.3. This was happening at the same frequency in 11.2.3, so nothing is different.
No high PS or PCI fan speed on cold boots (as described here)?
 

w1z

macrumors 6502a
Aug 20, 2013
692
481
@tsialex Would this help?

IMG_0661.jpg


If not, what log do I need to look at?
 