
RazorBackXX

macrumors member
Original poster
Oct 1, 2023
Lylat System
Hey My Friends,

I’m working on a Zsh-based script in the macOS terminal, and I’ve hit a roadblock while trying to obtain precise file sizes for folders. I’m seeing discrepancies between the size reported in Finder and the size calculated via terminal commands, and this is critical for my app development project, where accurate file duplication and size reporting are essential. I need rsync -a for backing up the files, not cp -a.


The Problem:

• Finder reports the size of a folder on my desktop as 1.21 GB, but all methods I’ve tried in the terminal report around 1.12 GB.

• I’ve already removed hidden files like .DS_Store, so the file count is correct. I need the size of the folder to be as accurate as possible, and the discrepancy is causing performance concerns in my app.


What I’ve Tried So Far:

1. Using rsync:

I’ve experimented with several variations of the rsync command:

• rsync -a

• rsync -av

• rsync -av --progress

Each time, the file size output in the terminal is still 1.12 GB, even though Finder consistently reports 1.21 GB.


2. Using stat & awk:

I’ve tried summing file sizes using stat and awk to manually calculate the total size. Here’s a snippet I used:

total_size_bytes=$(find "$src_folder" -type f -exec stat -f%z {} + | awk '{total += $1} END {print total}')


total_size_gb=$(echo "scale=3; $total_size_bytes / (1024 * 1024 * 1024)" | bc)


This approach still returns a smaller file size than Finder reports.


3. Binary vs Decimal Size Calculations:

I explored whether the discrepancy was due to differences in binary vs decimal size measurements (i.e., 1 GB = 1024 MB vs 1 GB = 1000 MB). Despite converting between these formats, the discrepancy remains unresolved.

4. Other Commands (du -sh):

I’ve used du -sh to display the human-readable size of the folder. However, this also reports the size as around 1.1 GB, which is still off compared to what Finder shows.

What I Need:

I’m looking for a method or terminal command that can accurately calculate the size of a folder and its contents on macOS, and match Finder’s output as closely as possible. Ideally, I would like to avoid relying on external tools like brew-installed packages and stick with native macOS solutions in Zsh.

This is critical for my project because I’m developing an app that handles file duplication, and accurate file sizes are necessary for the app’s performance and functionality.

Important:

• I’m using Zsh exclusively.

• I need a solution that works natively in macOS without external dependencies.

Any help or insights you could offer would be greatly appreciated!
 
I explored whether the discrepancy was due to differences in binary vs decimal size measurements (i.e., 1 GB = 1024 MB vs 1 GB = 1000 MB). Despite converting between these formats, the discrepancy remains unresolved.
Are you sure?

1.21 * 1000 * 1000 * 1000 = 1210000000, and then 1210000000 / 1024 / 1024 / 1024 ≈ 1.127, which is close to your figures.
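
To check that arithmetic in the terminal, here is a minimal sketch. The function name show_units is made up for illustration; it only divides a byte count by 10^9 and by 2^30 using awk.

```shell
# Print a byte count in decimal gigabytes (GB, 10^9 bytes)
# and in binary gibibytes (GiB, 2^30 bytes).
show_units() {
  awk -v b="$1" 'BEGIN {
    printf "%.3f GB (decimal)\n", b / 1000000000
    printf "%.3f GiB (binary)\n", b / 1073741824
  }'
}

show_units 1210000000
# 1.210 GB (decimal)
# 1.127 GiB (binary)
```

The same byte count reads about 7% smaller in binary units, which is roughly the size of the gap described in the first post.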
 

How I Fixed the Issue with File Sizes Matching Finder

The problem was that the script was calculating file sizes using binary gigabytes (1 GB = 1,073,741,824 bytes), while Finder uses decimal gigabytes (1 GB = 1,000,000,000 bytes). This caused the sizes reported by the script to differ from what Finder was showing.

Key Changes I Made:

  1. Switched to Decimal Gigabytes:
    • To match Finder’s format, I changed the formula to convert file sizes from bytes to decimal gigabytes. This ensures that the sizes reported by the script now align with Finder's calculation.
  2. Summed File Sizes Correctly:
    • I used find, stat, and awk to find all files in the folder and sum their sizes. This way, I got the total size in bytes for all files.
  3. Displayed a Summary:
    • After summing the total file size in bytes, I converted it to decimal gigabytes and displayed a clean summary showing the total number of files and the total size in both gigabytes and bytes.
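
The three changes above could be sketched roughly as follows. This is not the actual script from the app, just an illustration: folder_summary is a hypothetical name, and the fallback to GNU stat is an added assumption so the sketch also runs outside macOS. It sums the byte size of each regular file only, so extended attributes and on-disk block allocation are not counted.

```shell
# Sum the size in bytes of every regular file under a folder and print
# a summary in decimal gigabytes (1 GB = 1,000,000,000 bytes).
# folder_summary is a hypothetical helper name.
folder_summary() {
  # macOS/BSD stat uses -f%z for the size; GNU stat uses -c %s.
  # Probe once and pick whichever form this system accepts.
  if stat -f%z / >/dev/null 2>&1; then
    find "$1" -type f -exec stat -f%z {} +
  else
    find "$1" -type f -exec stat -c %s {} +
  fi | awk '{ total += $1; files++ }
    END {
      printf "Files: %d\n", files
      printf "Total: %.2f GB (%.0f bytes)\n", total / 1000000000, total
    }'
}
```

Calling it as folder_summary "$src_folder" prints the file count on one line and the total in GB and bytes on the next.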
 
 
What about extended attributes?

This file is 1914636 bytes and has extended attributes of 32 and 524729 bytes.
Code:
ls -l@ /Volumes/Classic/System\ Folder/Finder 
-rw-rw-r--@ 1 root  admin  1914636 29 May  2001 /Volumes/Classic/System Folder/Finder
    com.apple.FinderInfo         32 
    com.apple.ResourceFork     524729

Those are old extended attributes. New files probably don't have those attributes. They might have other attributes created by macOS which are temporary and small.
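
To add those attribute sizes up, a small awk filter over the ls -l@ output is one option. This is only a sketch: sum_xattr_bytes is a made-up name, and it assumes the attribute lines are indented and look like "name size", as in the listing above. I have not checked it against other ls output formats.

```shell
# Sum the extended-attribute sizes from `ls -l@` output read on stdin.
# Assumes attribute lines start with whitespace and have two fields:
# the attribute name and its size in bytes.
# sum_xattr_bytes is a hypothetical helper name.
sum_xattr_bytes() {
  awk '/^[ \t]/ && NF == 2 && $2 ~ /^[0-9]+$/ { total += $2 }
       END { printf "%d\n", total }'
}

# Usage on macOS (the path is a placeholder):
#   ls -l@ "/path/to/file" | sum_xattr_bytes
```

For the file above this gives 32 + 524729 = 524761 bytes of attributes on top of the 1914636 bytes of file data.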

Does rsync preserve extended attributes?
 

Attachments

  • Finder.zip
    1,022 KB · Views: 18
Does rsync preserve extended attributes?
My understanding is yes, with the -E option, which the OP does not appear to be using, so extended attributes are not being copied. But I don't know how selective rsync is. For a precise backup and restore, some xattrs should be preserved, but not all. @RazorBackXX has not said anything about the purpose.
 
Sorry brother, I was building an app and have finished it. It's meant to duplicate large amounts of data, like 100 GB, etc., but it works for smaller files too. This is why I went with rsync: I was trying to preserve as much metadata for the user as possible.

1. Smarter Copying: rsync only copies files that have changed or are new, so you’re not duplicating the same files over and over. Command-D, on the other hand, creates a complete copy every time, even if nothing has changed.


2. Preserves More Metadata: rsync keeps a lot more file details intact. It retains things like file permissions, ownership (the owning user and group, when run with sufficient privileges), and modification times. It can even copy symbolic links as they are, instead of copying the actual files they point to. Command-D doesn’t handle all of these as precisely.


3. Handles Mac-Specific Metadata: If you’re on macOS, rsync can also preserve those special extended attributes (like Finder tags or comments) with the right flags. Command-D doesn’t always get these right.


4. More Efficient for Repeated Syncs: If you need to update a copy of a folder, rsync will only transfer the changes, which is way faster. Command-D duplicates the entire folder each time, which is inefficient.


5. Can Work Over Networks: rsync allows you to copy folders across the network, even over SSH, making it great for remote backups. Command-D is local only.


6. Resumes Interrupted Transfers: If something goes wrong during a copy, rsync can resume from where it left off. With Command-D, you’d have to start over.


7. Customizable: rsync lets you exclude files or folders you don’t want to copy. With Command-D, you can’t pick and choose what gets copied.
 
rsync has this but idk if it stores everything.

 
rsync is definitely a powerhouse!

@RazorBackXX , if you're looking to get exact copies, you might like to know that (according to my testing) the version that Apple supplies with macOS Monterey (and I believe later macOS versions, too) does not preserve ACLs.

On Monterey (and I think even on later macOSes?), rsync is version 2.6.9. It's old! Later versions of rsync offer an option (-A) to preserve ACLs. I tested with rsync version 3.2.4. You can install even newer rsync 3.3.0 with Brew.

(Note that Apple's rsync uses -E to preserve extended attributes, but recent versions of rsync use -X for extended attributes.)

When I want a copy as exact as possible with rsync 3.2.4 (NOT Apple's rsync) I use these flags:

sudo rsync -aAXNH --fileflags --delete ...

-A #preserve ACLs (as seen with 'ls -le')
-X #preserve extended attributes (as seen with 'ls -l@' or 'xattr -l')
-N #preserve creation times (as seen with 'ls -lU')
-H #preserve hard links
--fileflags #preserve file-flags (aka chflags; as seen with 'ls -lO')
--delete #delete extraneous files from destination, if desired
-U #preserve access times (probably don't usually want this; seen with 'ls -lu')


If you want to preserve even the access times, add -U (I haven't tested that, though).
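
Since the extended-attribute flag differs between Apple's rsync and recent versions, a script could pick it from the version number. A minimal sketch: rsync_xattr_flag is a made-up name, and the cutoff at major version 3 is an assumption based on the 2.6.9 vs 3.x difference described above.

```shell
# Print the extended-attribute flag for a given rsync version string.
# Assumption: major version 3 and later use -X; older builds such as
# Apple's 2.6.9 use -E. rsync_xattr_flag is a hypothetical helper name.
rsync_xattr_flag() {
  rx_major=${1%%.*}              # "3.2.4" -> "3"
  case "$rx_major" in
    [0-9]|[0-9][0-9]) ;;         # accept a one- or two-digit major version
    *) return 1 ;;               # anything else is not a version number
  esac
  if [ "$rx_major" -ge 3 ]; then
    printf '%s\n' '-X'
  else
    printf '%s\n' '-E'
  fi
}

# Usage (the version string would come from `rsync --version`):
#   flag=$(rsync_xattr_flag "3.2.4")    # flag is now -X
```

printf is used instead of echo because zsh's echo would treat -E as one of its own options.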

Happy syncing!
 
Haven't pored over this thread, but something to look out for: size values, as touched on, can vary greatly (e.g. 1000 vs 1024) depending on where/how you are extracting that info.

For example, the du command reports sizes in 512-byte blocks by default, whereas the filesystem's actual block size may be larger (1-2K, IIRC).

And filesystem differences between source and destination might cause some shifting (e.g. APFS -> exFAT [or HFS+]; different block sizes [again], etc).
 
Oh wow, thank you my friend. I don't know why, but I assumed Apple updated everything to the latest version. I'm still shocked you have to use brew to install fd when it's so much better than find. Thank you for helping me make my app better, I appreciate you. P.S. You are correct, I am using an old version of rsync. A question, if you happen to know: when I distribute my app on GitHub, do I need to tell people to download rsync from brew, or is there a way to bundle it inside the app automatically so they don't have to? And one last thing: must I always use sudo? Is there a reason to use sudo?
 
You're welcome!

when I distribute my app on GitHub do I need to tell people to download rsync from brew? or is there a way to have it automatic inside the app so they don't have to?
I'm not very knowledgeable about developing, packaging and distributing apps. However, I will say that rsync appears to be open-source (see https://rsync.samba.org/). That page also has a link for the source code. So it seems you could compile rsync code directly into your application. Or just include an rsync binary in your .app bundle and execute that version from your application.

Oh -- just noticed you are creating a zsh script and would need to use the binary and somehow distribute it with your script. Not sure the best way to do that...

must I always use sudo? is there a reason to use sudo?
Well, it depends. If all of the files and directories are readable by, and owned by, the user invoking rsync, I think you wouldn't need sudo. However, if the files are owned by several different users, you need sudo to ensure that the copies get the correct ownership and group assignments.
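
One way a script could decide this for itself is to check whether anything in the source folder belongs to someone else. This is only a sketch, not something I have run against a real backup set, and has_foreign_owner is a made-up name.

```shell
# Succeed (exit status 0) if any file or directory under the given folder
# is NOT owned by the current user, which suggests sudo may be needed to
# preserve ownership. has_foreign_owner is a hypothetical helper name.
has_foreign_owner() {
  # find prints entries whose owner differs from our user ID;
  # head stops after the first one, since one match is enough.
  [ -n "$(find "$1" ! -user "$(id -u)" -print 2>/dev/null | head -n 1)" ]
}

# Usage (src is a placeholder for the folder being copied):
#   if has_foreign_owner "$src"; then echo "consider sudo"; fi
```

This only looks at ownership. Files that the current user cannot read at all would still need elevated privileges to copy.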
 