Q: how to batch-verify image files

ScottR · Sep 21, 2014

I had to recover a trashed drive with hundreds of thousands of images. An unfortunately large number of the recovered images are unusable: when I try to open them, Preview reports "The file 'xxxxx.JPEG' could not be opened. It may be damaged or use a file format that Preview doesn't recognize." GraphicConverter gives a similar error.

Is there any utility that'll scan through all those images and help isolate those that are unusable?

superscape · Sep 22, 2014

Hi Scott,

You could probably use a relatively simple AppleScript to do the job. Here's one that checks a single image: it tries to open the file in Preview and, if that fails, moves it into a folder of your choice.

Code:

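set theFile to choose file with prompt "Please choose a file to verify"
set theErroredFolder to choose folder with prompt "Where would you like to put the failed images?"

-- Ask Preview to open the file, then close whatever it opened.
-- The check below relies on a failed open giving back missing value.
tell application "Preview"
	set theReturnValue to open theFile
	close every document
end tell

-- Move the file aside only if Preview could not open it.
tell application "Finder"
	if theReturnValue is missing value then
		move theFile to theErroredFolder
	end if
end tell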

Obviously, you'd want to scan more than one image. The best way to do that depends on where the images are on your drive. For example, are they all in a single folder? Are they all over the place on the drive? Or would you want to drop a file or files as required?

I guess you could probably adapt some of the code above to work as part of an Automator action too.

ScottR · Sep 25, 2014

Thanks for the reply. I think the problem with this approach is that I have literally hundreds of thousands of images to verify. I don't think scripting the opening and moving of them one at a time is a viable solution.

superscape · Sep 26, 2014

Do you have an example of one of the corrupted jpegs that you could share with me?

chown33 · Sep 26, 2014

There's a command-line tool that might be useful. Its name is 'sips':
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/sips.1.html

As noted in its man page, its functionality is also accessible by AppleScript's "Image Events". I can also say its functionality is accessible through Automator's image processing actions.

As a strategy, you could probably use any of the 'sips' query functions to test the validity of the image. If the query fails, then the image file is damaged. If the query succeeds, it might be wise to try a more time-consuming and comprehensive test, such as format conversion, rotation, flip, crop, resample, etc.
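As a rough illustration of the query test, here is a minimal Python sketch. It assumes that `sips -g pixelWidth <file>` prints a `pixelWidth: <number>` line for a readable image; that output shape and the helper names `parse_pixel_width` and `sips_can_read` are my assumptions, so check them against real output on your system. It inspects the printed output rather than the exit status, since I would not rely on the exit status being consistent across versions.

```python
import re
import subprocess

# Assumed output shape for a readable image:
#   /path/to/image.jpg
#     pixelWidth: 1024
_WIDTH_LINE = re.compile(r"^\s*pixelWidth:\s*(\d+)\s*$", re.MULTILINE)


def parse_pixel_width(sips_output):
    """Return the pixel width found in sips output, or None if absent."""
    match = _WIDTH_LINE.search(sips_output)
    return int(match.group(1)) if match else None


def sips_can_read(path):
    """Run a cheap sips property query; True if a positive width came back."""
    result = subprocess.run(
        ["sips", "-g", "pixelWidth", str(path)],
        capture_output=True,
        text=True,
        check=False,
    )
    width = parse_pixel_width(result.stdout)
    return width is not None and width > 0
```

A query like this probably reads little more than the file header, so a pass here says the file is recognisable, not that every pixel is intact.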

Another quick way to test an image file is the 'file' command:
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/file.1.html

If the output says it's an image file, then at least it's undamaged enough to determine it's an image file. There may well be damage unseen by 'file', so again, it's safest to run a 'sips' check on files that quick checks say are ok.
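A small Python sketch of that quick check, assuming your copy of 'file' accepts the `--brief` and `--mime-type` options (verify with `man file`). The function names `mime_type` and `looks_like_image` are just illustrative.

```python
import subprocess


def mime_type(path):
    """Ask `file` for the MIME type of path, e.g. 'image/jpeg'."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", str(path)],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def looks_like_image(mime):
    """True if a MIME type string names some kind of image."""
    return mime.strip().lower().startswith("image/")
```

Typical use would be `looks_like_image(mime_type(path))`. Anything that fails goes straight onto the damaged list without the slower check.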

The reason for a quick check and a separate slow one is simply speed. If a large portion of the files are found to be damaged using the quick check, then the entire process completes faster.
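The two-stage idea can be sketched in a few lines of Python. Here `quick_check` and `slow_check` are hypothetical callables that take a path and return True when the file passes; the point is only that the slow check never runs for a file the quick check has already rejected.

```python
def classify(path, quick_check, slow_check):
    """Return 'bad (quick)', 'bad (slow)' or 'ok' for one file.

    slow_check is only called when quick_check passes, so files that
    are obviously damaged cost almost nothing to reject.
    """
    if not quick_check(path):
        return "bad (quick)"
    if not slow_check(path):
        return "bad (slow)"
    return "ok"
```

Keeping the two failure labels separate also shows, after a pilot run, how many files each stage actually caught.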

As for processing that many images, a shell script using 'find' to traverse directories would work. An AppleScript can also traverse directories. Step one, however, is going to be "Back up all the files", in case anything goes wrong during the traversal.

Step two is to set up a separate disk or partition with only a small number of known-good and known-bad files, and only run the scripts on that data. A few dozen files of each should suffice. Only after the entire process has been tested and verified on limited data should it be run on the full set of images. That is, make sure it works on a pilot run before full deployment.
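For the traversal itself, here is a minimal Python sketch as an alternative to 'find'. It only reports failing files and never moves or deletes anything, which makes it reasonably safe for a pilot run. The name `find_bad_images`, the `is_ok` callable, and the extension list are all assumptions for illustration; recovered files with missing or wrong extensions would need a different filter.

```python
import os

# Assumed set of extensions to examine; adjust to match the recovered files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif"}


def find_bad_images(root, is_ok, extensions=IMAGE_EXTENSIONS):
    """Walk root recursively and return sorted paths of images failing is_ok.

    Nothing is moved or deleted; the caller decides what to do with the list.
    """
    bad = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in extensions:
                continue  # skip files that do not look like images by name
            path = os.path.join(folder, name)
            if not is_ok(path):
                bad.append(path)
    return sorted(bad)
```

On the pilot folder, the returned list should contain exactly the known-bad files. If it does not, the check passed in as `is_ok` needs work before it goes anywhere near the full set.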
