
theorist9

macrumors 68040
May 28, 2015
3,880
3,059
Well, it's correct that I didn't make enough of an effort to teach you enough about this to have the conversation. I am unwilling to do so - it would be a significant effort and I didn't sign up for that. I'm sure (from seeing some of your other posts here) that you're more than capable of figuring it all out if you do enough reading. I invite you to do so, and in the meantime, I've offered some brief guidance.
Ah, I didn't mean to give the impression I wanted you to do more work! You were already generous with your time in what you wrote. I was instead saying it's possible that if it were explained differently (but not longer), it might have made sense to me.

I ran into this several months ago on another topic here on MR, where the poster concluded the same as you—that I didn't have the background to understand their explanation (and it was indeed true that I couldn't follow what they were saying). But then another poster chimed in with a single-sentence clarification, and it all became clear to me. So, in that case, the true limitation wasn't my background, it was the way it was being explained. In other cases, it may be the opposite.

And nor am I asking you to re-explain it! I'm just pointing out that, when someone doesn't understand an explanation, one should not be so quick to leap to the conclusion that it's because they lack the background. It may instead be because of your explanation (or it may not).
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
984
Ah, I didn't mean to give the impression I wanted you to do more work! You were already generous with your time in what you wrote. I was instead saying it's possible that if it were explained differently (but not longer), it might have made sense to me.

I ran into this several months ago on another topic here on MR, where the poster concluded the same as you—that I didn't have the background to understand their explanation (and it was indeed true that I couldn't follow what they were saying). But then another poster chimed in with a single-sentence clarification, and it all became clear to me. So, in that case, the true limitation wasn't my background, it was the way it was being explained. In other cases, it may be the opposite.

And nor am I asking you to re-explain it! I'm just pointing out that, when someone doesn't understand an explanation, one should not be so quick to leap to the conclusion that it's because they lack the background. It may instead be because of your explanation (or it may not).
In this case I think your issue is mostly not comprehension so much as a lack of facts. Though possibly I could have been clearer - I was surprised that you didn't seem to get the distinction I was making between basic memory resources and more specific resources.

It may simply not be worth your time to dig into this. There's a lot there.

But in brief, to try to help with that distinction: Memory is easy to understand. You have so much of it and no more. Add in VM, and it gets a little more complex, but not much - now instead of running out of memory, you may instead just get slower and slower (though you can in fact still run out when things aren't allowed to grow without bound, which is often the case; and inside the kernel, it can get hairier).

But "resources", as used in this discussion, aren't simply memory. They consume memory, but what makes them interesting is that they are specific data structures (or at least, appropriately sized memory for such), allocated out of a larger data structure that may not be able to grow, or can grow only with significant constraints. The kalloc zones you discovered are a simple example of that, but almost too simple.

As a more typical example, let's say you're writing a kernel driver for an ethernet interface. Perhaps that card allows you to hand it a bunch of buffers, and will transmit them in order. So your driver wants to allocate some buffers that can be filled with outbound packets. When the kernel gives it a packet, the packet gets moved into a free transmit buffer, and at some point soon after, that buffer is handed to the chip. Thing is, you may only have N buffers allocated. If your chip can't handle all the outbound traffic, those buffers fill up. And sooner or later, you run out of them, as unlimited buffering is VERY bad for networking even if the kernel allowed that much memory allocation (which it probably doesn't).

Now, when your ethernet driver needs another buffer and can't get one, it has graceful ways to fail that won't crash the OS. But historically that hasn't always been true. And other bits of code may not always fail gracefully. They might hang, or effectively self-DoS by making unending requests for resources that get denied. There are countless ways to screw this up. Note that in these cases, the problem (usually) isn't that the machine is low on memory. It's low on a constrained resource that takes memory, but it's not low because there isn't enough memory; it's low because of those other constraints.
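
A rough sketch of what that looks like in code (hypothetical driver, not modeled on any real API): a fixed set of transmit buffers and a transmit routine that fails gracefully - telling the caller to back off - rather than hanging or crashing when all of them are busy.

Code:
/* Hypothetical transmit path: N fixed buffers, graceful "queue full, try
 * again later" when none are free. Not any real driver's API. */
#include <stdbool.h>
#include <string.h>

#define TX_BUFS  32
#define BUF_SIZE 1536

struct tx_buf {
    unsigned char data[BUF_SIZE];
    bool in_use;          /* set when queued, cleared by the tx-done interrupt */
};

static struct tx_buf tx_ring[TX_BUFS];

/* Returns 0 on success, -1 to tell the network stack to back off. */
int drv_transmit(const void *pkt, size_t len)
{
    if (len > BUF_SIZE)
        return -1;                            /* oversized packet, reject */
    for (int i = 0; i < TX_BUFS; i++) {
        if (!tx_ring[i].in_use) {
            memcpy(tx_ring[i].data, pkt, len);
            tx_ring[i].in_use = true;
            /* ...hand tx_ring[i] to the hardware here... */
            return 0;
        }
    }
    return -1;   /* all N buffers busy: graceful failure, not a hang or a crash */
}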

There are lots of constrained resources like this in the kernel. Dozens, maybe hundreds. Fewer than there were 20 years ago, to be sure - as memory has gotten less constrained, kernel data structures have had many limitations lifted, and many can now grow enormously larger than used to be the case. For example, kernel memory space in 32-bit Linux was limited to 1GB; 64-bit kernels (which are all that's used, aside from some embedded gear and very antique servers that haven't been touched in years) don't have that limit. But still, there are many.
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
984
Come to think of it, there's another related class of failures. Some data structures in the kernel are chosen based on expected size, and if that expectation is badly missed, all sorts of terrible things can happen.

For example, your OS has to keep track of network routes. (Let's just talk v4 to keep things simple.) Most machines don't have that many routes to keep track of - typically default, loopback, local subnet, maybe a multicast route, maybe a VPN route or two, not much else. Servers with multiple interfaces ("multihomed") will have a few more. Those hosting VPNs, DMZs, or doing other networking tasks may have dozens.

So in the old days, most OS kernels had a "route cache". It was a data structure for keeping track of recently used networks, to speed up packet routing, and with slow CPUs and few routes used, it was very effective. But... it turns out, this produced very poor performance if you were trying to build a decent router that could handle tens of thousands of routes (or more!), and was extremely susceptible to DoS attacks, because the cache was constantly getting overwritten, as it couldn't expand indefinitely. In Linux, the route cache was removed entirely, after redoing the data structure for the actual route table completely so accessing it became much more efficient. I don't know what MacOS does these days.

So the problem was that the table and cache were designed based on the expected size of the table and the expected lookup volume (how frequently lookups happen). In situations where those expectations were badly off, you wound up with tons of packet loss at least, and usually kernel stalls (due to time spent doing GC against the cache) or worse.
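
A crude sketch of the old scheme (illustrative C only, not how Linux or macOS actually did it): a small direct-mapped cache that works fine for a handful of destinations, but turns into pure overhead - and an easy DoS target - once the number of distinct destinations dwarfs the table.

Code:
/* Toy route cache: a small fixed table keyed by destination address.
 * With a few dozen destinations, most lookups hit. With tens of thousands
 * (or attacker-chosen) destinations, almost every lookup misses, evicts a
 * still-useful entry, and falls through to the slow path anyway. */
#include <stdint.h>

#define CACHE_SLOTS 256

struct route_entry {
    uint32_t dst;          /* IPv4 destination address */
    uint32_t next_hop;
    int      valid;
};

static struct route_entry cache[CACHE_SLOTS];

static uint32_t full_table_lookup(uint32_t dst)
{
    /* placeholder for the real (slower) longest-prefix-match lookup */
    return dst & 0xFFFFFF00u;
}

uint32_t route_lookup(uint32_t dst)
{
    struct route_entry *e = &cache[dst % CACHE_SLOTS];
    if (e->valid && e->dst == dst)
        return e->next_hop;                /* cache hit: the fast path */
    /* miss: overwrite whatever was in this slot, useful or not */
    e->dst = dst;
    e->next_hop = full_table_lookup(dst);
    e->valid = 1;
    return e->next_hop;
}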

There's a decent article about this if you want more detail. It actually covers lots more than just that - the early part of it about the old cache is all you'd need to read.
 

Bungaree.Chubbins

macrumors regular
Jun 7, 2024
171
287
Come to think of it, there's another related class of failures. Some data structures in the kernel are chosen based on expected size, and if that expectation is badly missed, all sorts of terrible things can happen.

For example, your OS has to keep track of network routes. (Let's just talk v4 to keep things simple.) Most machines don't have that many routes to keep track of - typically default, loopback, local subnet, maybe a multicast route, maybe a VPN route or two, not much else. Servers with multiple interfaces ("multihomed") will have a few more. Those hosting VPNs, DMZs, or doing other networking tasks may have dozens.

So in the old days, most OS kernels had a "route cache". It was a data structure for keeping track of recently used networks, to speed up packet routing, and with slow CPUs and few routes used, it was very effective. But... it turns out, this produced very poor performance if you were trying to build a decent router that could handle tens of thousands of routes (or more!), and was extremely susceptible to DoS attacks, because the cache was constantly getting overwritten, as it couldn't expand indefinitely. In Linux, the route cache was removed entirely, after redoing the data structure for the actual route table completely so accessing it became much more efficient. I don't know what MacOS does these days.

So the problem was that the table and cache were designed based on the expected size of the table and the expected lookup volume (how frequently lookups happen). In situations where those expectations were badly off, you wound up with tons of packet loss at least, and usually kernel stalls (due to time spent doing GC against the cache) or worse.

There's a decent article about this if you want more detail. It actually covers lots more than just that - the early part of it about the old cache is all you'd need to read.
I've appreciated your posts in this thread; it was interesting reading. Thank you.
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
984
Come to think of it, there's another related class of failures. Some data structures in the kernel are chosen based on expected size, and if that expectation is badly missed, all sorts of terrible things can happen.
Argh, I should have reread this part before posting it. Not as clear as it should have been.

What I meant was not the size of the data structure in bytes, but the quantity of instances of that structure. As in my example about the route cache, where the issue wasn't the size of each entry (which is constant), but the number of entries.

Sorry about any confusion that may have caused.
 

fibercut

Suspended
Aug 1, 2024
29
9
All I know is that gamers are complaining about Logitech not focusing on gaming anymore; their old stuff was good, but the new stuff is a rehash of older designs! You should buy your third-party mice at other stores near you, or at online stores that sell Mac stuff. That way you can still find mice you might like!
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
Argh, I should have reread this part before posting it. Not as clear as it should have been.

What I meant was not the size of the data structure in bytes, but the quantity of instances of that structure. As in my example about the route cache, where the issue wasn't the size of each entry (which is constant), but the number of entries.

Sorry about any confusion that may have caused.
Thanks for the additional info.

What you wrote above (referring to your past few posts, not just this one) makes sense to me. Let me summarize my big-picture view, and you can let me know if it's essentially correct. If not, I'll just throw in the towel on our discussion for now ;).

A common reason Macs show issues with extended uptime is that one or more processes can run out of RAM resources. This is not an issue just for Macs, but for all OSes.

This is not about running out of total RAM. Rather, it’s about running out of the RAM resources allocated for those processes specifically.

This happens when the data the process sends to the resource eventually exceeds its capacity and more space can't be made, either because the existing data can't be sufficiently cleared, or because insufficient provision (or no provision) has been made to allow the resource to expand.

RAM resource allocations fall into two broad categories:

1) Allocated physical memory, as defined by the range of RAM memory addresses allocated to that process. As an example, imagine a process that needs to be able to store various temp files into its allocated memory space in order to keep running. These temp files can be as numerous as you please. They can also be of varying sizes, and contain different types of data. The only constraint is that these files need to be able to fit into the allocated memory space. Here you run out of the resource when you can no longer store all the needed files. [You might also have cases in which the allocated memory is not a specific range of physical addresses, but rather a certain quantity of memory, which is not restricted to a fixed physical location.]

2) Allocated data structures. Imagine that, within the memory space allocated to that process, there are three arrays into which that process needs to be able to store data and that, with extended uptime, one of those becomes filled and can’t be emptied, while plenty of space remains within the other two. In this case, you haven’t run out of the allocated memory space; there’s still room to store more data. Rather, you’ve run out of the space available within a specific data structure within the allocated memory space.

The analogy would be storing boxes in a closet. #1 would be running out of space to fit all the boxes; #2 would be running out of space within at least some of the boxes.
 

Sippincider

macrumors 6502
Apr 25, 2020
258
540
PC's kinda threw that out the window, and it was pretty common practice in a lot of co's to keep them powered down until you needed one, then you'd pull the plastic dust cover off, boot it up, use it, and shut it back down afterward.
OTOH we had multiple Ballmer-era workplace and personal PCs, where we didn’t shut down unless absolutely necessary. Boot time was literally several minutes before one got a usable desktop, whatever the # was loading up. Won’t mention the “Installing updates, don’t turn your computer off!” which would appear when you did try to shut it down… 🤦‍♂️🤦‍♂️
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
984
A common reason Macs show issues with extended uptime is that one or more processes can run out of RAM resources. This is not an issue just for Macs, but for all OSes.
Not so much. That could crash or lock an individual process, but the OS would continue just fine, and you could always restart that process.

HOWEVER - there are a few processes that are critical to OS function but are not the kernel, and it's possible that their failure would be problematic enough to effectively fail the machine. The Mac's WindowServer and Dock processes could fall into that category, though it is in fact possible to recover from such failures. It's just often faster and much easier to reboot.

Also, it's mostly not helpful to think of it as "RAM resources", though that's an underlying truth. It's better to just think of the "boxes" (as you put it below).

This is not about running out of total RAM. Rather, it’s about running out of the RAM resources allocated for those processes specifically.

This happens when the data the process sends to the resource eventually exceeds its capacity and more space can't be made, either because the existing data can't be sufficiently cleared, or because insufficient provision (or no provision) has been made to allow the resource to expand.
Sort of. Most often it's not a matter of "sending data to resources" but the previous step: allocating such resources in order to be able to make use of them. Also, once allocated, you don't usually think of it as "sending" data to the resources. They are simply where your data lives. I mean, if you write "a = b + c" in your program, you don't normally think of it as "sending" data to a.

RAM resource allocations fall into two broad categories:

1) Allocated physical memory, as defined by the range of RAM memory addresses allocated to that process. As an example, imagine a process that needs to be able to store various temp files into its allocated memory space in order to keep running. These temp files can be as numerous as you please. They can also be of varying sizes, and contain different types of data. The only constraint is that these files need to be able to fit into the allocated memory space. Here you run out of the resource when you can no longer store all the needed files. [You might also have cases in which the allocated memory is not a specific range of physical addresses, but rather a certain quantity of memory, which is not restricted to a fixed physical location.]
You'd rarely think of storing "files" in allocated memory. Also don't confuse yourself thinking about physical addresses. Nobody and nothing outside the MMU (and possibly some I/O drivers and a few other odd corner cases in the kernel) cares about that. Certainly no normal userspace process.

Most code doesn't care what addresses it gets to use, but once the addresses are assigned, they have to stay put (these are virtual addresses - again, phys addrs can/do change all the time and nothing notices). There are probably some systems not like that, but the only one I can think of offhand is the original Mac OS, where the OS allocated memory to you by giving you handles (pointers to pointers), and you mostly had to double-deref to access such data because the second pointer could be changed by the OS at (almost) any time with no warning.
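
For what it's worth, a tiny sketch of that handle idea (illustrative C, not the actual classic Mac OS Memory Manager API):

Code:
/* A handle is a pointer to a master pointer the OS owns. The OS may move the
 * underlying block and rewrite the master pointer; code that always
 * double-dereferences the handle keeps working. Illustration only. */
#include <string.h>

typedef char **Handle;

static char block_a[64];
static char block_b[64];
static char *master_ptr = block_a;   /* the "OS" owns this and may change it */

int main(void)
{
    Handle h = &master_ptr;          /* what the OS hands your program */

    (*h)[0] = 'x';                   /* double-deref: *h first, then index */

    /* heap compaction: the OS moves the data and fixes up the master pointer */
    memcpy(block_b, block_a, sizeof block_a);
    master_ptr = block_b;

    (*h)[0] = 'y';                   /* the same handle still works after the move */
    return 0;
}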

2) Allocated data structures. Imagine that, within the memory space allocated to that process, there are three arrays into which that process needs to be able to store data and that, with extended uptime, one of those becomes filled and can’t be emptied, while plenty of space remains within the other two. In this case, you haven’t run out of the allocated memory space; there’s still room to store more data. Rather, you’ve run out of the space available within a specific data structure within the allocated memory space.

The analogy would be storing boxes in a closet. #1 would be running out of space to fit all the boxes; #2 would be running out of space within at least some of the boxes.
Now *this* is the key point for OS stability issues. Your kernel probably has hundreds of these boxes, not three.
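
To put the closet/boxes picture in code terms, here's a hypothetical sketch of one such box: a fixed-capacity table that starts rejecting work even though malloc() would still happily hand out memory - the closet has room, but this particular box is full.

Code:
/* One "box": a fixed-capacity session table. It can fill up and start
 * rejecting work while the heap still has plenty of room. Illustration only. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_SESSIONS 4            /* deliberately tiny for the demo */

static int sessions[MAX_SESSIONS];
static int session_count;

int session_open(int id)
{
    if (session_count == MAX_SESSIONS)
        return -1;                /* this box is full */
    sessions[session_count++] = id;
    return 0;
}

int main(void)
{
    for (int id = 0; id < 6; id++) {
        if (session_open(id) != 0)
            printf("session %d rejected: table full\n", id);
    }
    /* plenty of closet space left: this still succeeds */
    void *spare = malloc(1024 * 1024);
    printf("spare 1 MB allocation %s\n", spare ? "succeeded" : "failed");
    free(spare);
    return 0;
}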
 