I'm not an artist like you are, but I work for an artist (cataloguing his works).
I had the same problem with tons of duplicates. A few years ago I introduced a rigorous scheme for organising image files:
For example, everything which comes in goes into a special folder 'inbox' and every new folder in there is locked after files have been added, so no changes can ever be made:
Code:
catalogue
> inbox
  > 2023
  > 2024
  > 2025
      2025-02-15 - What - from whom
      2025-03-05 - What - from whom
      ...
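If you want to script that step, here is a rough Python sketch of how it could look. It is only an illustration, not the tool I actually use: the function name `file_into_inbox` and its parameters are made up, and "locking" is assumed to mean removing the write permission bits. On Windows, `chmod` only toggles the read-only flag on files, so you would probably need something else for the folder itself.

```python
import shutil
import stat
from datetime import date
from pathlib import Path


def file_into_inbox(catalogue, files, what, from_whom, received=None):
    """Copy files into a new dated inbox folder, then make it read-only.

    Hypothetical helper: the folder name follows the
    'YYYY-MM-DD - What - from whom' pattern from the tree above.
    """
    received = received or date.today()
    folder = (
        Path(catalogue)
        / "inbox"
        / str(received.year)
        / f"{received.isoformat()} - {what} - {from_whom}"
    )
    # exist_ok=False: refuse to touch a delivery folder that already exists.
    folder.mkdir(parents=True, exist_ok=False)

    for src in files:
        shutil.copy2(src, folder / Path(src).name)

    # Assumption: "locked" means no write permission for anyone.
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    for path in [*folder.iterdir(), folder]:
        path.chmod(path.stat().st_mode & no_write)
    return folder
```

You would call it like `file_into_inbox("catalogue", ["scan_001.tif"], "Scans", "Studio X")`, where the file and sender names are placeholders.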
Then I have 'working' folders for when I process images in-house. And there are folders with the image files which went 'out of house', for a book production for example, and which will come back as color-corrected CMYK files. These folders are deleted as soon as the job is done.
And then there is the main RGB folder with the final images for the online catalogue and external requests (max large TIFF, large TIFF, large JPEG/PNG, medium JPEG/PNG, thumbnail JPEG/PNG – all RGB).
Then there is the same with the CMYK files for book productions, print magazines, etc. (max large TIFF and large TIFF only).
To de-duplicate the mess I had, I used dupeGuru (https://dupeguru.voltaicideas.net/). It is open source, works on macOS, Windows and Linux, and the source code is on GitHub.
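If you only need to find byte-identical copies and prefer a script, something like the following Python sketch would do. To be clear, this is not how dupeGuru works internally. It is just a plain content-hash comparison, and the function names `sha256_of` and `find_duplicates` are my own. It will not catch the same image saved in a different format or size.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large TIFFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of files under root whose contents are byte-identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Running `find_duplicates("catalogue")` returns a list of groups. Review them by hand before deleting anything.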