Append-only logging and version histories are all good in theory. However, how do you keep a Dat archive from ballooning in size over time?
Say you’ve got a static website and change something in the main layout file. That leaves you with one new revision of every HTML file in your archive, and the total Dat archive has doubled in size. Revert that change a week later and the archive grows by another 50%, to three times its original size.
I’ve noticed that Dat will add revisions of identical files as new files. Say you’ve moved an identical copy of a file on top of a file in your Dat archive. Dat will see that as a modification and rewrite the entire file into the archive even though the file’s content is unchanged (only the filesystem metadata has changed). Static site generators will sometimes do this inadvertently.
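A possible workaround for the identical-file case, as a minimal sketch rather than anything Dat itself provides: let the site generator write to a separate build folder, then copy into the archive folder only those files whose content actually differs. This assumes that Dat adds no new revision for a file that is never rewritten, which I haven’t verified. The names `build_dir`, `archive_dir`, `copy_if_changed` and `sync_tree` are made up for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_if_changed(src: Path, dst: Path) -> bool:
    """Copy src over dst only when the content differs.

    Returns True if dst was written, False if it was left untouched.
    """
    if (
        dst.is_file()
        and dst.stat().st_size == src.stat().st_size
        and file_digest(src) == file_digest(dst)
    ):
        return False  # identical content: do not touch the archived file
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return True


def sync_tree(build_dir: Path, archive_dir: Path) -> list[Path]:
    """Copy changed files from build_dir (hypothetical generator output)
    into archive_dir (hypothetical Dat archive folder).

    Returns the relative paths that were actually written.
    """
    written = []
    for src in sorted(build_dir.rglob("*")):
        if src.is_file():
            rel = src.relative_to(build_dir)
            if copy_if_changed(src, archive_dir / rel):
                written.append(rel)
    return written
```

Calling `sync_tree(Path("build"), Path("site-archive"))` after each build would then rewrite only the files that really changed. The sketch does not delete files that have disappeared from the build folder, and it does nothing about the history an archive has already accumulated.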
Archives also start to have performance problems in Bunsen and Beaker once they pass 3000 revisions. The only workaround I’ve found is to delete the archive and start over with a new one.
So what to do to keep archives as small as possible without having to commit to never ever making any change to them?