Do other filesystems support checksums? Which ones?
I have read that Btrfs supports checksums, but it has some issues. What are the differences between Btrfs and APFS?
ZFS
The big things in my mind are:
* Snapshots
* Copy on Write
* Containers instead of Partitions
Snapshots are probably the least useful day to day, but they back the sealed system volume functionality, since you are booting signed snapshots. A malicious actor wanting to modify the system volume has to recreate the seal, which is extra work. If I recall correctly, Time Machine has benefited from snapshots as well: it's a good way to capture the state you want to back up without requiring exclusive access to the files you are backing up.
Some quick things to add to this list: native sparse file support, fast file copy (via cloning), optimised for latency.
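To illustrate the sparse-file part: on APFS you can seek past the end of a file before writing, and only the bytes you actually write get disk blocks. A rough sketch (my own, untested; the path is made up):

```swift
import Darwin

// Rough sketch: create a sparse file on an APFS volume by seeking past
// EOF before writing. The 1 GB "hole" gets no disk blocks; only the
// trailing three bytes do.
let fd = open("/tmp/sparse.bin", O_WRONLY | O_CREAT | O_TRUNC, 0o644)
precondition(fd >= 0, "open failed")
lseek(fd, 1_000_000_000, SEEK_SET)          // jump ~1 GB forward, writing nothing
_ = "end".withCString { write(fd, $0, 3) }  // logical size ~1 GB, tiny on disk
close(fd)
```

ls -l will report the full logical size, while du shows only the few blocks actually allocated.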
Snapshots are the foundation of how Apple does backups these days. Instead of copying the files like in the good old days, Time Machine now only compares the metadata and copies over the differing pages. This enables much, much faster and more stable backups.

I figured they had to be leaning on it, I just wasn't sure, so didn't want to claim it was. Thanks for clarifying that point.
Fast file copy is part of copy on write.
If I understand it correctly, there are some details in regards to the fast-file-copy functionality. Since the usual file-copying algorithms rely on reads and writes, copying that way will actually create "real" copies (duplicated data) even on CoW filesystems. So if you use the usual copy APIs or commands, you will still incur a real copy. CoW filesystems offer additional APIs/flags to avoid full copies.
From Apple's documentation: the copyItem(at:to:) and copyItem(atPath:toPath:) methods of FileManager automatically create a clone for Apple File System volumes.
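For what it's worth, a minimal sketch of that in practice (paths are made up; assumes both locations are on the same APFS volume):

```swift
import Foundation

// Minimal sketch: on APFS, copyItem creates a copy-on-write clone,
// so no file data is duplicated until one side is modified.
let fm = FileManager.default
let original = URL(fileURLWithPath: "/Users/me/video.mov")
let duplicate = URL(fileURLWithPath: "/Users/me/video-copy.mov")

do {
    // Near-instant regardless of file size when both paths are on the
    // same APFS volume; behaves like a normal copy otherwise.
    try fm.copyItem(at: original, to: duplicate)
} catch {
    print("copy failed: \(error)")
}
```

On the command line, cp -c does the same thing via the clonefile(2) syscall.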
A lot of this benefits from SSDs. Containers are a form of thin provisioning of volumes. So I don't have to cut a 2TB drive into two 1TB partitions, but rather I can let them share the free space on the drive, and each container only uses what it needs. But this means a heavily fragmented drive in the end. Containers make the sealed system volume possible without the overhead of resizing partitions, etc.
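You can even watch the shared free space from code. A little sketch (the volume paths are made up): two volumes in the same container should report roughly the same available capacity, because they draw from the same pool.

```swift
import Foundation

// Sketch: APFS volumes in one container share the container's free space,
// so their reported available capacity is (roughly) the same number.
for path in ["/", "/Volumes/Data"] {
    let url = URL(fileURLWithPath: path)
    if let values = try? url.resourceValues(forKeys: [.volumeAvailableCapacityKey]),
       let free = values.volumeAvailableCapacity {
        print("\(path): \(free / 1_000_000_000) GB free")
    }
}
```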
How can we know if an application (e.g. SQLite) is well/poorly written?

I believe SQLite uses the correct flags. F_FULLFSYNC and F_BARRIERFSYNC are the fcntl commands recommended for flushing in Apple's guide; someone posted that earlier. If you can't inspect the code, you can try a program that measures SSD write speed: on the Mac it'll slow to a crawl whenever an F_FULLFSYNC or F_BARRIERFSYNC hits. If it's always fast, then it's never doing a true flush. Finally, you can perform a test similar to what Hector is proposing to find SSDs that cheat on benchmarks: cut power to the SSD/computer after the "flush" and see if you experience data loss.
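If you want to try the write-speed test yourself, here is a crude sketch along those lines (my own, untested; paths and counts are arbitrary):

```swift
import Darwin
import Foundation

// Crude probe: time a burst of small writes, either with plain fsync()
// (on macOS this only hands the data to the drive, which may keep it in
// its volatile cache) or with F_FULLFSYNC (which asks the drive to flush
// all the way to permanent storage).
func timedWrites(path: String, fullFlush: Bool) -> TimeInterval {
    let fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0o644)
    precondition(fd >= 0, "open failed")
    defer { close(fd) }

    let chunk = [UInt8](repeating: 0xAB, count: 4096)
    let start = Date()
    for _ in 0..<256 {
        chunk.withUnsafeBytes { _ = write(fd, $0.baseAddress, $0.count) }
        if fullFlush {
            _ = fcntl(fd, F_FULLFSYNC)  // true flush: this is where it crawls
        } else {
            fsync(fd)                   // fast, but a weaker guarantee on macOS
        }
    }
    return Date().timeIntervalSince(start)
}

print("fsync only:  \(timedWrites(path: "/tmp/bench-fsync.bin", fullFlush: false)) s")
print("F_FULLFSYNC: \(timedWrites(path: "/tmp/bench-full.bin", fullFlush: true)) s")
```

If the two numbers come out nearly identical, the "flush" probably isn't reaching permanent storage.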
I thought APFS Containers are in fact fixed size, like HFS+ partitions. APFS volumes can share the space within a Container, as you describe.
As I've stated before, I'm not intimately familiar with the terminology around APFS and what maps to what. You are correct though. Container != volume, and I used container in place of volume in this case.

Sorry to nit-pick, but while we are getting it right... container does not = volume. An APFS container contains APFS volumes. Container is more equivalent to an HFS+ partition.
That's what I just said.
OK. I saw "Container != volume", but realise now that was meant to be "Container ≠ volume". Peace!That’s what I just said.
Old habits die hard: "!=" is "not equal" in programming/ASCII.
Can someone ELI5: is this anything to be concerned about for average users?
Two different issues:
1) Basically, some UNIX software written with both Linux and macOS in mind might not send the correct command on macOS to ensure that your data is actually written safely to the SSD (the F_FULLFSYNC command; see the sketch after this list). Correctly written macOS software should be fine, but the default is not to be so secure most of the time (i.e. in the case of a power outage on a desktop with no battery backup, it's more likely to lose recent data, as that data is more likely to be sitting in the volatile SSD cache rather than on the SSD itself).
2) When performing the full "no really, this is important to save, write to the SSD now" command, the internal Apple SSD is much, much slower than it should be. Tests indicate this slowness is a problem with Apple's firmware rather than with the SSD hardware itself, so it is fixable. However, most of the time this doesn't matter, because the default is not to do this. In fact, that is probably why it was not caught (or, if caught, not cared about): this command is rare in practice, and the faster-but-less-safe default is often okay because most of Apple's most popular products come with batteries, so a sudden power outage doesn't really matter.
In short: a user is more likely to get data loss on desktops during power outages; poorly written multi-platform programs that want to save data safely might not know the correct macOS command to save it safely to the SSD; and programs that frequently invoke the safe save-to-SSD command will suffer from poor performance (things like database programs).
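For reference, here is roughly what the "correct command" looks like; F_FULLFSYNC is the fcntl Apple documents for a true flush (a sketch only, with a made-up path):

```swift
import Darwin

// Sketch: make a small write durable on macOS. fsync() only guarantees
// the data reached the drive; the F_FULLFSYNC fcntl also asks the drive
// to flush its own volatile cache to permanent storage.
let fd = open("/tmp/important.dat", O_WRONLY | O_CREAT | O_TRUNC, 0o644)
precondition(fd >= 0, "open failed")

let record = Array("critical record\n".utf8)
_ = record.withUnsafeBytes { write(fd, $0.baseAddress, $0.count) }

if fcntl(fd, F_FULLFSYNC) != 0 {
    // Not every filesystem supports F_FULLFSYNC; fall back to plain fsync().
    fsync(fd)
}
close(fd)
```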
I remember watching the WWDC session a few years ago where they talked about the first step, which was migrating Mobile Time Machine to use APFS atomic snapshots. The old design used some kind of crazy multi-daemon thing that they didn't describe in great detail but sounded like a pile of special-case hacks. Rebuilding on top of APFS snapshots let them delete tens of thousands of lines of code, and block level CoW meant it could back up anything (the old Mobile TM was limited to relatively small files).
I find the argument that Apple's own SSDs do not need checksums questionable. I lack the knowledge to settle it, but it is arrogant, because people also want to connect external SSDs, and those are not from Apple.
I once read an article (or comment) claiming that the statement that APFS is SSD-optimized is not true at all. Unfortunately, I can't find the source anymore, and I can't judge to what extent it is correct.
Its point was that APFS is not "SSD-optimized" in any meaningful sense. According to the article, Apple simply forgot (or deliberately ignored) HDDs; you can build a file system that is "optimized" for SSDs and still runs on HDDs without problems.
If that is true, I would consider that another arrogant view from Apple:
- iMacs with HDDs were still sold for a long time. Many people still use such Macs today.
- There are people who want to connect HDDs to their Macs.
As I said, I read that somewhere and lack the knowledge to judge it; maybe the article was simply technically wrong. I cannot tell.
2) is also hardly a problem, since full "no really, this is important to save, write to the SSD now" actions are extremely rare in practice, all things considered. This is not an API that a regular application will use or need. A computer crashing is an extreme, catastrophic situation, and no consumer software is designed to guarantee that your data is safe the moment you press that "save" button if your computer were to crash a few seconds later.

I believe the SSD is flushed even after a kernel panic. This should only be a problem in the case of sudden power failure.
Not an expert either, but I am fairly sure that there are such things as SSD-optimized file systems. SSDs and HDDs have very different access characteristics that justify different algorithmic approaches; e.g. SSDs have very fast random access and asymmetric read/write latency. Also, you can get the best performance and longevity out of an SSD by batching writes to minimize cell rewrites, which likewise requires a very different approach if you want it done properly.
ZFS is absolutely robust. I'll go so far as to say that anyone who says otherwise is a huge liar or a storyteller.