There is an analogy here, as git does deduplication 'under the hood'.
It's kind of weird actually - at the level we interact with, we talk in terms of diffs, both in what gets displayed and in what we put in.
At the next level down (the content-addressable store) git is storing whole files, and the git tooling translates the diffs we communicate about into a full snapshot of each file for each commit.
Then at the next level down, git puts objects together in packfiles (when the repo is packed), which compress by storing many objects as deltas against similar objects, making use of the fact that most files are just tweaks of other files. So, once again it's diffs.
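To make the middle layer concrete, here's a rough sketch of how content addressing gives you deduplication for free. The blob ID calculation (SHA-1 over a `blob <size>\0` header plus the content) matches what git does in a default SHA-1 repo; the `put` helper and the dict standing in for the object store are just made up for illustration, and real git also zlib-compresses objects before writing them to disk.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID git gives a blob with this content (SHA-1 repos)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def put(store: dict, content: bytes) -> str:
    """Hypothetical toy object store: the key is the content's own hash."""
    oid = git_blob_id(content)
    store[oid] = content  # same content -> same key, so nothing is duplicated
    return oid


store = {}
a = put(store, b"hello\n")        # e.g. README in commit 1
b = put(store, b"hello\n")        # identical file in commit 2
c = put(store, b"hello world\n")  # a tweaked version is a whole new blob

print(a == b, len(store))  # True 2
```

Note the tweaked file gets stored whole here. It's only the packfile layer that would turn it back into a delta.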
Damn..
Then I guess even using this under the hood would essentially be a rewrite of how git fundamentally works internally.