
Kaitlyn2004

macrumors regular
Original poster
Aug 17, 2008
124
21
I've just finished moving terabytes of data from my old Windows system, spread across multiple NTFS drives, to external APFS and HFS+ drives attached to my new MacBook Pro.

For a variety of reasons, I used several methods to copy the data across: a direct Ethernet connection with an SMB share, an intermediary exFAT SSD, and Apple's Finder copy. In addition, and again for multiple reasons, almost every method failed at some point... and had to be resumed or restarted. So I especially worry about random orphaned files or an incomplete transfer.

I used Carbon Copy Cloner for a bunch of it, but now that I'm finally validating things with the "Compare" function, it shows the same file sizes on both sides yet gives the exclamation point status, meaning source and destination are different. I'm ROUGHLY assuming it's because of the different OSes and filesystems, and I imagine some differences could be down to bytes?

But I'd really, REALLY like to be able to compare my source+destination (across SMB share ideally so I can "see" both?) - but it seems CCC isn't right for this.

Any suggestions on the best way to truly compare my source+destination across systems and truly say what is the same and not?
 
As a guess, the exclamation point could be due to extended attributes being different. I don’t recall exactly how NTFS file systems deal with them, if at all, but I seem to remember that they’re handled differently from macOS file systems. You may or may not care about them, though.

I think the gold standard would be the 'rsync' command. Enter 'man rsync' to get a feel for it. It will take some effort to select the appropriate options. For example, the '--itemize-changes' option gives output indicating exactly what differs between the source and destination file. There is also an option, '--checksum', that calculates a hash of each file and compares them, if you want to go so far as to compare the file contents. That will be quite slow though! Also, the '--dry-run' option is your friend!

I’m not at my computer right now, but if you’re seriously interested in using ‘rsync’ I could make some option suggestions when I get back.
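
For anyone who wants a starting point, here is a minimal sketch of the dry-run comparison described above. It is only an assumption about which options fit this case, not a tested recipe for NTFS over SMB: `compare_trees` is a hypothetical helper name and the example paths are placeholders. Because `-n` (`--dry-run`) is set, rsync should only report and never write to either side, and `-c` (`--checksum`) makes it compare file contents rather than size and modification time. The rsync bundled with macOS may behave differently from current releases, so check 'man rsync' first.

```shell
#!/bin/sh
# Sketch: list what rsync WOULD transfer, without copying anything.
# compare_trees is a hypothetical helper name, not part of rsync.
compare_trees() {
    src=$1
    dst=$2
    # -r  recurse into directories
    # -n  dry run: report only, write nothing
    # -i  itemize changes: one line per item that would be updated
    # -c  decide by checksum of the contents, not by size and mtime
    # The trailing slashes compare the contents of the two directories.
    rsync -rnic --exclude='.DS_Store' "$src/" "$dst/"
}

# Example call (placeholder paths):
#   compare_trees /Volumes/OldNTFS/Photos /Volumes/NewAPFS/Photos
```

Each output line is an item rsync would transfer: a run of `+` signs marks a file missing from the destination, and a `c` marks a file whose checksum differs. No output means the contents match.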
 
I recently went through this while transferring terabytes from multiple drives to a NAS and had frequent file copy failures. I ended up using two variations of the 'diff' command in Terminal to compare files & directories:

This command gave me a txt file that listed any differences in directory contents. It allowed me to quickly see if there were any files/directories missed during the copy process.
Code:
diff -qr /path-to-directory-1 /path-to-directory-2 | sort > /path-to-output-file.txt

This command gave me a txt file that listed files that had differences between them in each directory. This allowed me to dig deeper and investigate file integrity.
Code:
diff -bur /path-to-directory-1 /path-to-directory-2 > /path-to-output-file.txt
 
I'm not familiar with the details of using CCC for verification, but please try clicking on the exclamation point. It could be a "details" or "more info" icon, and clicking it may show the specific differences.

I also recommend running a simple test with CCC. Pick a small folder on the NTFS disk, maybe a half-dozen files with a few MB of total storage. If you don't have one, then you can make such a folder for testing. Then use CCC to copy it from the NTFS disk to a new folder on an APFS disk. After doing that, perform a verify pass with CCC, just like you did before. Next, post a screenshot of the CCC result, so we can see what you see.

There are also ways to get a full listing of extended attributes, access-control lists, etc. using the 'ls' command in a Terminal window. Those should work for both the NTFS disk and any APFS disk. If you tell us the pathname of the small test folders, then we can tell you what command-line to use, and you can then get an extensive listing of what differs.

EDIT

I read the CCC page on using the Verify feature, and it shows what each symbol means:

It looks like there's no "click for details" feature.
 
I initially tried rsync while the drive was mounted, but it kept failing and I wasn’t 100% sure why. It could have been the old rsync version and some of the options I was trying, or the fact that it’s an NTFS drive: despite only needing to read/copy from it, rsync might have been trying to write temporarily to it?

I worry about something like a command line output because it’s a deep folder tree with many thousands of files. The IDEA of CCC’s compare is sort of exactly what I wanted… but if it just shows me a somewhat generic exclamation point without any actual info on the difference, then it’s close to useless for me.

And yeah it absolutely could be the attributes are different… but as far as I’m aware I can’t compare the files and ignore that - and see if there’s still some differences?
 
Okay, this might be a good start (not at computer now), but given I have a big folder tree and many thousands of files (terabytes, just like you), won’t the initial diff, assuming there ARE discrepancies, be outright maddeningly long and practically impossible to sift through?
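
One way to keep a long diff manageable is to count the results before reading them. The sketch below is an assumption about how that could be done, not something from the posts above: `summarize_diff` is a hypothetical helper that relies on the two line formats `diff -qr` normally prints ("Only in ..." and "Files ... differ").

```shell
#!/bin/sh
# Sketch: condense `diff -qr` output into two counts.
# summarize_diff is a hypothetical helper name.
summarize_diff() {
    # diff exits with status 1 when it finds differences; that is
    # expected here, so the status is ignored.
    { diff -qr "$1" "$2" || true; } | awk '
        /^Only in /         { only++ }    # present on one side only
        /^Files .* differ$/ { differ++ }  # present on both, contents differ
        END {
            printf "missing on one side: %d\n", only
            printf "content differs: %d\n", differ
        }'
}

# Example call (placeholder paths):
#   summarize_diff /path-to-directory-1 /path-to-directory-2
```

Note that diff does not descend into a folder that exists on only one side, so a whole missing folder counts as a single "missing" entry. If both counts are zero, there is nothing to sift through.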
 
I noticed CCC had at some point created its SafetyNet thing, which it indicates is some rudimentary backup/versioning safety. But I noticed it’s basically full of empty files. Maybe that’s from a previous failed copy or an interrupted copy in another way… or who knows!

I suppose I could simply try to run CCC again for the whole thing, and in theory see how much data it actually syncs across?

But if Compare says/thinks there are differences, then I worry it may also just re-copy everything.

The compare does also indicate that a single video file in the root is different on both sides, despite listing it at 149.4MB on both sides… which again points me toward an attributes issue or whatever, but I also don’t see a way to ignore that aspect for comparison’s sake.
 
Again, I recommend that you make a test case with only a few files in it. If that shows differences using CCC, then we can more easily figure out what those differences are using Terminal and command-line tools, rather than trying to puzzle things out using terabytes of data.

One of the files in the test case should be that single video file in the root that you mentioned. The data in a file can be different despite the size being the same. The 'cmp' command can identify exactly what differs.
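
As a rough illustration of the 'cmp' check, here is a small sketch. `check_pair` is a hypothetical helper name and the example paths are placeholders. Since 'cmp' reads only the file data, differences in extended attributes or timestamps do not affect the result.

```shell
#!/bin/sh
# Sketch: byte-for-byte comparison of one file pair with cmp.
# check_pair is a hypothetical helper name.
check_pair() {
    # -s keeps cmp silent; only its exit status is used.
    if cmp -s "$1" "$2"; then
        echo "identical: $1"
    else
        echo "DIFFERENT: $1"
        # Without -s, cmp reports where the first difference occurs.
        cmp "$1" "$2" || true
    fi
}

# Example call (placeholder paths):
#   check_pair /Volumes/OldNTFS/video.mp4 /Volumes/NewAPFS/video.mp4
```

If this reports "identical" for the video file that CCC flags, the flagged difference is in metadata rather than in the file contents.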

EDIT

If you can no longer use your old Windows system to access the NTFS disks, then please state that. In that case, I suggest going under the deep tree of files and finding a folder with only a few files, that also shows the exclamation-point status symbol. In other words, find an existing folder to use as a small test case, rather than making one.
 
I had a similar issue years ago, and used a program called Tidy Up by Hyperbolic Software. It did exactly what you are asking about: it gives you two windows listing the duplicate files and where they are located, so you can handle them however you want.
 
Spotlight.

On your desktop or handheld (provided there is a wireless keyboard paired), utilize the spotlight search feature to find files by name, size, file type, where it is stored (and any copies). Maybe even any edits that have been made, but I haven’t figured this out yet for certain.
 
Well, I ended up sharing the drives connected to the Mac, then used FreeFileSync on Windows to compare, and didn't get any mismatches except for basically .DS_Store files and a couple of other system files.

It is based on "file time and size" - is there a possibility that the "placeholder" of byte data for a file might be there, but the actual data not? It doesn't seem FreeFileSync offers any form of hash comparison, but it does offer a "file content" comparison, though I have to imagine that would take literally forever!

I did also run CCC in "Preview" mode, which is described as a dry run. Even though Compare continues to show a mismatch, running the preview right afterwards shows that CCC indeed decides nothing needs to be copied across...

So I'm certainly feeling a LOT better about things. Not sure I'm at 100% just yet though...

Given that FreeFileSync's "file time and size" comparison and the CCC preview option both seem to confirm that everything matches... would others say everything is good, or are there any last steps you'd take as a final validation?
 

Attachment: Screenshot 2024-12-12 at 4.52.15 PM.png