So, I’m self-hosting Immich. The issue is that we tend to take a lot of pictures of the same scene/thing to pick the best one later, so we can end up with 5~10 photos which are basically duplicates, but not quite.
Some duplicate-finding programs put those images at 95% or more similarity.
I’m wondering if there’s any way, probably at the file system level, for those near-identical images to be compressed together.
Maybe deduplication?
Have any of you guys handled a similar situation?


Note that Git doesn’t store deltas. It will reuse unchanged files, but stores a (compressed) version of every file that has existed in the whole history, under its SHA-1 hash.
Indeed! Interesting! I just ran an experiment with a non-compressible file (strings < /dev/urandom | head -n something) and it shows you’re right. The second commit, where I added a tiny line to that file, increased the repo size by almost the size of the whole file.
Thanks for this bit.
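
For anyone who wants to see why the second commit costs almost a full copy, here is a rough Python sketch of that experiment without Git itself. It assumes the usual loose-object layout (a "blob <size>" header plus the content, hashed with SHA-1 and stored zlib-compressed). The helper name git_blob and the 200,000-byte file size are made up for illustration, and random bytes stand in for the incompressible file. It only models how a freshly written object is stored, not anything Git may do later when it packs objects.

```python
import hashlib
import os
import zlib


def git_blob(content: bytes) -> tuple[str, bytes]:
    """Return (object id, stored bytes) for content, modelled on a loose Git blob."""
    # Header and content are hashed together; the same bytes are then deflated.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


# Incompressible stand-in for the file in the experiment (size is arbitrary).
original = os.urandom(200_000)
# "Second commit": the same file with one tiny line appended.
modified = original + b"one tiny line\n"

oid_a, stored_a = git_blob(original)
oid_b, stored_b = git_blob(modified)

print("same object id:", oid_a == oid_b)
print("stored sizes:", len(stored_a), len(stored_b))
print("total for both versions:", len(stored_a) + len(stored_b))
```

The two versions get different ids, and each one is compressed on its own, so the shared 200,000 bytes are paid for twice. The same thing would happen with a pile of near-duplicate photos: they are all different files as far as content hashing is concerned.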