TLDR:
My current main goal is a photo library that contains only the highest-quality version of each photo I have, along with the correct metadata (capture time, plus GPS and camera if possible), and to hide or tag all non-photos (art, memes, etc.). I have already done a lot of manual work, but moving forward I will need some advice.
I have been consolidating over 100,000 images from the past 25 years into my Photos.app library, and it has been a trip. A combination of my convoluted requirements and how abstracted Photos.app can be has led me to a collection of questions to throw at the community.
Current situation:
- Down to about 75,000 photos (after de-duping with PhotoSweeper X and Apple's new Ventura duplicate detector/metadata merger)
- Images consist of:
  - Original RAW and JPEG files from the camera
  - Facebook downloads from various eras (some subtly more compressed than others due to Facebook’s different algorithms throughout the years)
  - Memes/image macros
  - Art
  - Videos
While I believe I have mostly de-duped to the highest quality image in photos, some of the highest quality versions did not contain the correct capture date. My current plan is:
- Take all the photos from my original sources (dozens of folders outside the current Photos library)
- Batch resize them down to 128 (or 256) pixels on their long edge
- Import them into the Photos library in a specific album and tag them
- De-dupe/merge metadata using Apple’s new de-dupe/metadata merge feature in Ventura/iOS 16
  - The merge should keep the correct, high-quality image
  - If the low-resolution image has an earlier capture date, the merge would use that date
Can anyone think of a better method here?
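To make the intended outcome of that plan concrete, here is a minimal sketch of the merge rule in Python: keep the largest copy in a duplicate group, but take the earliest capture date found on any copy. The `Copy` record and `merge_duplicates` function are hypothetical names for illustration, pixel count is only an assumed stand-in for "highest quality", and this does not describe how Apple's merge feature actually works.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Copy:
    """One copy of a photo. All fields are hypothetical stand-ins for real metadata."""
    path: str
    width: int
    height: int
    captured: Optional[datetime]  # None when the file carries no capture date


def merge_duplicates(copies: List[Copy]) -> Copy:
    """Keep the copy with the most pixels, stamped with the earliest known date."""
    if not copies:
        raise ValueError("need at least one copy")
    # Assumption: more pixels means the better copy. Compression is ignored here.
    best = max(copies, key=lambda c: c.width * c.height)
    dates = [c.captured for c in copies if c.captured is not None]
    earliest = min(dates) if dates else None
    return Copy(best.path, best.width, best.height, earliest)
```

With a full-size file dated by a later download and a 256-pixel thumbnail carrying the original date, `merge_duplicates` returns the full-size path with the thumbnail's date.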
Cleaning out the memes:
- Detect text-based images such as screenshots, memes, and documents.
  - So far, I have searched for text commonly found in non-photo images, such as “likes”, “retweet”, and “starter pack”. This has worked well for a large majority of memes, but it is limited. I have also used PhotoSweeper to find images that look like typically formatted tweets, but this is time-consuming, clunky, and not definitive.
  - I have also used Photos.app search to find images that:
    - Contain text
    - Don’t have camera model or lens metadata
  - The issue is that Photos.app doesn’t appear to have a way of distinguishing a real photo from a screen capture (non-iPhone screenshot) or art.
- Auto-save non-photos to a specific folder based on which app was open when the screenshot was taken.
  - Certain apps already do this (like Apollo), but it is not consistent across iOS/macOS. Is there a system extension that can tag which URL or application an image was taken from?
- Find NSFW photos
  - I have already searched for ‘bath’ and ‘brassiere’ (as Google suggested). I have seen an app called Orga Vault that may help here, along with finding other objects in general: https://apps.apple.com/us/app/find-text-in-pic-orga-vault/id1305287901
- Is there a way to see/expose which categories can be searched for via visual search?
  - I’m surprised I have not seen a master list anywhere. /Users/redacted/Dropbox/Pictures/Photos Library.photoslibrary/database/search/searchMetadata.plist doesn’t seem definitive.
- Can we see which machine learning tags have been applied to a photo? This has not worked: https://brattoo.com/propaganda/#photostagger
- Is there a way to explore photo library databases, similar to what exists for iPhone backups (e.g. iExplorer/iMazing)?
- Detect how compressed/artifact-y an image is, and compare it to see whether it is ‘better’ than the duplicate I am comparing it to?
  - As mentioned above, I have many copies of photos with subtle differences in compression. The difference is visible when looking side by side, and I would like to keep the copy with less compression. A smaller file size does not necessarily mean lower quality though, as it really depends on how the JPEG was saved.
- Losslessly rotate JPEG images
  - I have only done small tests here, but it does look like Photos.app re-encodes the file on rotate, which I don’t want, as it would mean a slight loss in quality.
- AI upscale/remove artifacts from older Facebook images
  - I am trying the Topaz Gigapixel AI trial to clean up the artifacts on old, low-res Facebook images, but was wondering if there is another option here?
- What are some of the best/most powerful Photos plugins?
  - The App Store is limited in its search for these. I am already using PhotoSweeper X and Duplicate File Finder.
  - All de-dupe apps appear to be limited to 128x128 bitmaps for comparing. Are there any that go beyond this? I am getting a lot of false positives from visually similar images.
- I have a few thousand Facebook photos (this was a challenge on its own; I used Album Downloader and Tagged Photo Exporter for Facebook) that unfortunately don't have their correct capture dates (they are stamped with the day I downloaded them). Thankfully, they all have their original names (formatted as 12539_159253124087251_2115393_n), so I am attempting to script a way of calling https://www.facebook.com/photo/?fbid=, downloading the page, scraping it for date info, then adding that metadata to the photos using exif.sh.
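On the question of comparing how heavily two JPEG copies were compressed: one rough signal that does not depend on file size is the quantization tables stored in the JPEG header, since larger quantization steps discard more detail. The sketch below reads those tables with only the Python standard library. The function names are my own, and the average step is an assumed heuristic, not a definitive quality score.

```python
import struct
from typing import Dict, List


def read_quant_tables(data: bytes) -> Dict[int, List[int]]:
    """Return {table_id: 64 quantization values} from a JPEG's header segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    tables: Dict[int, List[int]] = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image: header is over
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:  # DQT segment, which may hold several tables
            i = 0
            while i < len(segment):
                precision, table_id = segment[i] >> 4, segment[i] & 0x0F
                i += 1
                if precision == 0:  # 8-bit entries
                    values = list(segment[i:i + 64])
                    i += 64
                else:  # 16-bit entries
                    values = list(struct.unpack(">64H", segment[i:i + 128]))
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables


def mean_quant_step(data: bytes) -> float:
    """Average quantization step over all tables; lower suggests lighter compression."""
    tables = read_quant_tables(data)
    if not tables:
        raise ValueError("no quantization tables found")
    values = [v for table in tables.values() for v in table]
    return sum(values) / len(values)
```

Calling `mean_quant_step(open(path, "rb").read())` on two copies of the same photo gives numbers to compare. The tables only describe the most recent save, so a copy that was heavily compressed once and later re-saved at high quality will still look good by this measure; it is a first filter, not a replacement for looking side by side.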
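For the lossless rotation item, one option outside Photos.app is `jpegtran` (shipped with libjpeg and libjpeg-turbo), which rearranges the compressed blocks instead of re-encoding them. Below is a minimal sketch that wraps it from Python, assuming `jpegtran` is installed and on the PATH; the helper names are hypothetical. I have not checked how Photos.app treats files rotated this way on re-import.

```python
import subprocess
from typing import List


def jpegtran_rotate_cmd(src: str, dst: str, degrees: int) -> List[str]:
    """Build a jpegtran command that rotates src clockwise into dst without re-encoding."""
    if degrees not in (90, 180, 270):
        raise ValueError("degrees must be 90, 180 or 270")
    return [
        "jpegtran",
        "-rotate", str(degrees),
        "-copy", "all",   # carry EXIF and other metadata markers over
        "-perfect",       # fail rather than leave edge blocks untransformed
        "-outfile", dst,
        src,
    ]


def rotate_lossless(src: str, dst: str, degrees: int) -> None:
    """Run jpegtran; raises CalledProcessError if the transform cannot be done exactly."""
    subprocess.run(jpegtran_rotate_cmd(src, dst, degrees), check=True)
```

The `-perfect` switch makes the command fail when the image dimensions are not a multiple of the JPEG block size, which is a useful signal that an exact rotation is not possible for that file.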
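For the Facebook date script, the first step can be sketched as follows: pull the ID out of the original filename and build the page URL. This assumes the middle number in the filename is the `fbid`, which I have not confirmed for every era of Facebook downloads, and the function names are hypothetical. Scraping the page itself is left out because I don't know what the date markup looks like.

```python
import os
import re
from datetime import datetime
from typing import Optional

# Assumed pattern: <number>_<fbid>_<number>_<single letter>, e.g. 12539_159253124087251_2115393_n
FB_NAME = re.compile(r"^\d+_(\d+)_\d+_[a-z]$")


def fbid_from_filename(path: str) -> Optional[str]:
    """Return the middle number of a Facebook-style filename, or None if it doesn't match."""
    stem = os.path.splitext(os.path.basename(path))[0]
    match = FB_NAME.match(stem)
    return match.group(1) if match else None


def photo_url(fbid: str) -> str:
    """Build the photo page URL for an ID."""
    return "https://www.facebook.com/photo/?fbid=" + fbid


def exif_datetime(when: datetime) -> str:
    """Format a datetime the way EXIF date tags expect: YYYY:MM:DD HH:MM:SS."""
    return when.strftime("%Y:%m:%d %H:%M:%S")
```

Files whose names don't match the pattern return `None`, so they can be collected into a separate list for manual dating instead of being stamped with a wrong date.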