
Johnny Jackhammer

macrumors regular
Original poster
May 5, 2011
I have an external mass storage 18TB volume for Plex. I formatted that as APFS and I am having issues. I have another 18TB HDD and want to format it as HFS and copy all files over.

What is the best way to ensure the files are written without fragmentation? I am pretty sure they are highly fragmented now.

I have Carbon Copy Cloner and SuperDuper, and of course there are `diskutil', `dd' and Disk Utility's Restore.

I can't find much info on which of these tools will write the files un-fragmented.
 
You can enable defragmentation on APFS disks but it probably won’t help your situation.

Your copy process should write sequentially to the target disk. I guess you could defrag after the copy completes, if you want.
 
Thanks... I'd rather do this in one pass because it's going to take 2-3 days to copy 18TB
 
I have always heard that copying from one drive to another (just a simple drag copy) is an effective means to defragment anything that is copied.
I think it is relevant to ask: How much free space is left on the HDD?
 
1.5TB free space remains on the Plex volume. Drag copy would be a disaster on a volume this large. Too much for Finder to handle while doing everything else it needs to be doing.

I am doing this because I think APFS was a mistake for a mass storage volume, or maybe 18TB is just too large for one volume. Finder takes forever to search it over the network. Plex movies constantly stutter even though my network is plenty fast (~940 Mbps to the Apple TV 4K). Infuse takes forever to add metadata. My smaller 6 TB drive is fine: Infuse adds metadata at a rate of 5 or 6 movies per second. On the larger drive it adds metadata at about one movie per second, and sometimes the progress wheel just spins forever like it can't find what it's looking for.
 
Hmmm... Looking into this, I find that APFS supports a volume size of 8 exabytes and over 9 quintillion files in one volume.
I don't think that size, per se, is really an issue. Your "tiny" 18TB is easily handled by the file system and Finder. However, searching might be the difference, as the metadata you are accessing is going to be slower on any HDD compared to an SSD. Also, fragmentation is not an issue on an SSD, but may be contributing to your situation on your single-volume 18TB HDD.
I think your fastest way to what you want (a disk that is fully defragmented) is to copy with dd. Fragmentation wouldn't matter on an SSD, but I guess the storage size you need would be prohibitively expensive.

I only have experience with much less data (maybe about 2 TB max, much more often 500-800 GB, which still takes some time).
I hope you don't run into any damaged file errors with that much to move.
 
A clone/block-level copy (via diskutil, dd) will copy over all structure from the source (including free space) which also means that fragmentation will be preserved. This is why it's actually recommended for forensic stuff (or if you have a failing drive).

Carbon Copy Cloner, SuperDuper, and Time Machine do file-level copies (rsync), so they write each file out fresh and do not carry the source's fragmentation over.
 
I am running this command right now:

Code:
ls | xargs -n1 -P4 -I% rsync -aP % /Volumes/Media_1

First cd into the directory to be copied.
ls lists the files and pipes them to xargs, which runs four rsync processes to copy them to the destination.

I chose 4 processes, but you could use -P0, which tells xargs to run as many rsync processes as possible.

I am seeing between 170 MB/s and 190 MB/s copy speed.... 1.54 TB copied in ~2 hours
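
For anyone wanting to estimate the total time from a rate like that, here is a rough sketch. The helper name eta_hours is made up for illustration, the 16.5 TB figure is an assumption (the 18TB volume minus the 1.5TB reported free), and it assumes decimal units.

```shell
#!/bin/sh
# eta_hours TB MBPS: hours needed to copy TB terabytes at MBPS megabytes/second.
# Assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes).
eta_hours() {
    awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", (tb * 1e12) / (mbps * 1e6) / 3600 }'
}

# 16.5 TB is an assumption: the 18 TB volume minus the 1.5 TB reported free.
for rate in 170 180 190; do
    echo "$rate MB/s: $(eta_hours 16.5 "$rate") hours"
done
```

At a sustained 170 to 190 MB/s that works out to roughly 24 to 27 hours, ignoring per-file overhead and any slowdown on the inner tracks of the disk.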
 
That command seems OK to me at first glance (I guess you're parallelizing over the subdirs of your root folder), but tbh piping the output of ls to anything always scares me. It's fairly well known that you should never parse the output of ls, as there are many edge cases (file names containing spaces, newlines, etc.) that can cause it to suddenly break. For a backup solution I would not really trust myself to get it right and would rather just use something like CCC, which is more battle-tested (already parallelizing the rsync, and giving you some additional nifty features like checksum verification to boot).
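
If you do want to stay on the command line, one common way around the ls problem is to pass names NUL-separated with find -print0 and xargs -0. This is only a sketch under assumptions: it runs on throwaway temp directories rather than the real volumes, and it falls back to cp -R when rsync is not installed so the demo runs anywhere.

```shell
#!/bin/sh
# Throwaway demo directories; on the real job these would be the Plex
# volume and /Volumes/Media_1.
src=$(mktemp -d)
dest=$(mktemp -d)

# One ordinary folder and one with a newline in its name, the kind of
# name that breaks "ls | xargs".
mkdir "$src/Movie A (2001)"
odd="$src/$(printf 'odd\nname')"
mkdir "$odd"
printf 'x' > "$src/Movie A (2001)/file.mkv"
printf 'y' > "$odd/file.mkv"

# Use rsync when present; fall back to cp -R so the demo runs anywhere.
if command -v rsync >/dev/null 2>&1; then
    copy="rsync -a"
else
    copy="cp -R"
fi

# find -print0 and xargs -0 pass names NUL-separated, so spaces and
# newlines survive. -I{} runs one copy per item; -P4 keeps four running.
cd "$src" || exit 1
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 -P4 -I{} $copy {} "$dest/"
```

Unlike plain ls, find also picks up hidden entries such as .DS_Store, so the set of copied items can differ slightly from the original command.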
 
would rather just use something like CCC which is more battle-tested (already parallelizing the rsync, and giving you some additional nifty features like checksum verification to boot).
At the time I started I didn't have an answer regarding CCC doing file or block level copy... hate to throw away 2.5 hours of copying but I suppose I should if it's just going to fail when it hits a weird filename. Although, I could let it continue and then use CCC after it fails which would only copy what hasn't already been copied.
 
1.5TB free space remains on the Plex volume
I have always understood 20% free was a minimum. You are less than 10%. So that will be part of your problem.

I am doing this because I think APFS was a mistake for a mass storage volume
I have a 12TB APFS HDD (25% free) for Plex and various backups - multiple volumes in a single APFS container. My Plex server runs in a Linux VM with a shared folder on the HDD - no obvious performance issues.

I do disable Spotlight indexing. You might want to try that as it will remove a whole lot of unnecessary reads.
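
In case it helps, Spotlight indexing can be checked and switched off per volume with mdutil. The sketch below only prints the commands unless APPLY=1 is set. The run helper and the /Volumes/Media_1 mount point are assumptions for illustration, so substitute your own volume.

```shell
#!/bin/sh
# VOLUME is a hypothetical mount point; substitute your own.
VOLUME="${VOLUME:-/Volumes/Media_1}"

# Print each command; only execute it when APPLY=1 is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

run mdutil -s "$VOLUME"            # show the current indexing status
run sudo mdutil -i off "$VOLUME"   # turn indexing off for this volume
```

Running it again with -i on instead of -i off turns indexing back on if you find you miss searching the drive.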
 
Using "dd" will keep the fragmentation. Just use Finder to copy (or a GUI for rsync); that will copy with no fragmentation.
 
I have always understood 20% free was a minimum. You are less than 10%. So that will be part of your problem.

I do disable Spotlight indexing. You might want to try that as it will remove a whole lot of unnecessary reads.
There is some debate on the 20% philosophy.

I keep Spotlight on because I want to search the HDD for movie titles.
 
The tool you want for this job is CCC.

You DO want to format the replacement HDD in HFS+ -- NOT in APFS.
APFS is particularly hard on platter-based drives, and fragmentation is one of the problems that happens with them.

Reason WHY you want CCC:
Nearly 18 TB is A LOT of files.
Try to copy even "chunks" of that with the Finder, and you can run into problems. If the Finder encounters a corrupt or otherwise "bad" file during the process, it will just abort the entire process and NOTHING will get copied.

HOWEVER
If you are using CCC, it will not choke and stop dead if it finds one or more corrupt files. Instead, it will "make note" of the bad file, "skip around" it, and go right on to complete the job. At the completion of the job, you can review a list telling you which files didn't get copied, and decide what to do with them.

Again, use HFS+ for platter-based HDDs -- not APFS -- on a non-booting drive that contains "only data".
(The only exceptions: boot drives, and Time Machine and CCC/SuperDuper cloned backups.)
 
It turned out OK. In this case my command worked perfectly.

Code:
ls | xargs -n1 -P4 -I% rsync -aP % /Volumes/Media_1

Finished in 26 hours
I am going to move the rest of my Plex drives back to HFS+
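
For anyone repeating this, a byte-for-byte comparison after the copy is one way to confirm nothing was skipped or damaged. This is a minimal sketch using diff -rq on throwaway temp directories. The verify_copy name is made up for illustration, and on a volume this size the comparison will take many hours because it reads every file on both disks.

```shell
#!/bin/sh
# verify_copy SRC DEST: byte-compare two directory trees.
# Returns 0 when identical, non-zero otherwise.
verify_copy() {
    diff -rq "$1" "$2"
}

# Demo on throwaway temp directories (stand-ins for the real volumes).
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/Movies"
printf 'data' > "$src/Movies/a.mkv"
cp -R "$src/Movies" "$dest/"

if verify_copy "$src" "$dest"; then
    echo "copies match"
else
    echo "differences found"
fi
```

diff -rq lists any file that differs or exists on only one side, which gives you a short list of things to recopy.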
 