squashing old git history
You may have an internal project that you wish to open source. When starting the project, you didn’t take that into account, so it’s likely to contain references to private data that you do not wish to share.
Step one would be to clean things up. That can be a slow process, and in the meantime the project keeps getting updates.
At some point you’re confident that, as of commit X1000, the project contains only non-private data. But since development didn’t stop, you may already be 200 commits ahead, at X1200.
Instead of creating a new repository starting at commit X1200, you can squash commits X1..X1000 into a single commit and keep the history of commits X1000..X1200.
You could do this with git rebase -i --root master and squash/fixup all commits from X1..X1000. But that’s a rather lengthy operation, squashing 1000 commits.
Instead, you can follow this recipe:
Check out a temp branch from commit X1000. With --orphan the branch starts without any history, but the files as of X1000 stay staged:
git checkout --orphan temp X1000
git add -A
git commit --date "$(date -Rd '2017-01-01')" \
-m 'squash: Initial commit up to begin 2017.'
Branch ‘temp’ now contains exactly one commit.
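One way to inspect that single commit is sketched below. It runs in a throwaway repository (the directory, user name and file are made up for the illustration) and uses an ISO date instead of date -Rd, so it does not depend on GNU date. Note that --date only sets the author date; the committer date stays at "now" unless you also set GIT_COMMITTER_DATE.

```shell
set -e
# Throwaway repository, only to illustrate the check (all names are hypothetical).
cd "$(mktemp -d)" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo data > file.txt && git add -A
git commit -q --date '2017-01-01T12:00:00' \
    -m 'squash: Initial commit up to begin 2017.'

# A freshly committed orphan branch should contain exactly one commit.
commit_count=$(git rev-list --count HEAD)
# Author date comes from --date; committer date is the time of committing.
author_date=$(git log -1 --format=%ad --date=short)
committer_date=$(git log -1 --format=%cd --date=short)

# Show message, author, and both dates in one go.
git log -1 --format=fuller
```

In the real repository, git log -1 --format=fuller temp shows the same fields for the squashed commit.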
Check that the log message, the date and the author are fine. Then rebase the newest commits from ‘master’ onto this new initial commit:
git rebase --onto temp X1000 master
At this point, ‘master’ has been rewritten. We can delete the temp branch and force-push over the original:
git branch -d temp
git push -f
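If you'd rather rehearse the recipe before force-pushing the real thing, here is a minimal sketch on a throwaway repository. It assumes a toy history of five commits where the third plays the role of X1000; the file name and identity are invented. It squashes the first three commits, replays the last two, and checks that the final file contents are unchanged.

```shell
set -e
# Build a toy repository with five commits on 'master'.
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master   # name the unborn branch 'master'
git config user.name "Demo User" && git config user.email "demo@example.com"
for i in 1 2 3 4 5; do
    echo "line $i" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

cutoff=$(git rev-parse master~2)            # stands in for X1000
old_tree=$(git rev-parse 'master^{tree}')   # content before the rewrite

# The recipe: orphan branch at the cutoff, one commit, rebase the rest onto it.
git checkout -q --orphan temp "$cutoff"
git add -A
git commit -q -m 'squash: Initial commit.'
git rebase -q --onto temp "$cutoff" master
git branch -q -d temp

new_count=$(git rev-list --count master)
new_tree=$(git rev-parse 'master^{tree}')
echo "commits: $new_count"                  # prints "commits: 3"
```

Comparing the tree of the old and new branch tip is a cheap sanity check: the history got shorter, but the files at the tip should be byte-for-byte the same.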
And then, if you’re like me, this is the moment you find out that there are still a few items that you didn’t want in there.
Quickly fix up a few problems and squash them into the root (first) commit:
# edit files, removing stuff we don't want
git commit . -m fixup
git rebase -i --root
# in the editor, move the fixup commits directly below the first "pick" line
# and change their "pick" to "fixup"
git push -f
Now, there’s a nice clean history.
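The manual reordering in the editor can also be left to git. The sketch below is an alternative to the steps above, not what I originally ran: it uses git commit --fixup together with git rebase --autosquash, again in a throwaway repository with made-up file names. GIT_SEQUENCE_EDITOR=: accepts the generated todo list without opening an editor.

```shell
set -e
# Toy repository: a root commit that still contains a private file,
# followed by one later commit.
cd "$(mktemp -d)" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "public" > file.txt && echo "secret" > private.txt
git add -A && git commit -q -m 'squash: Initial commit.'
echo "more" >> file.txt && git commit -q -am 'later work'

root=$(git rev-list --max-parents=0 HEAD)

# Remove the unwanted file and record the change as a fixup of the root commit.
git rm -q private.txt
git commit -q --fixup "$root"

# --autosquash moves the fixup commit below its target and marks it "fixup".
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash --root

commit_count=$(git rev-list --count HEAD)
root_files=$(git ls-tree --name-only "$(git rev-list --max-parents=0 HEAD)")
echo "root commit now contains: $root_files"
```

After the rebase the private file is gone from the root commit itself, not just from the tip, which is the point of squashing into the first commit instead of adding a cleanup commit on top.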
Now pull the rewritten history into your other checkouts, then expire the reflog and run git gc there to remove all traces of the old history:
git pull --rebase
git reflog expire --expire=now --all #--verbose
git gc --aggressive --prune=now
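To convince yourself that the old objects are really gone from a checkout, you can ask git for an old commit by its hash. Here is a small sketch in a throwaway repository, where an amended commit stands in for the old history (file name and identity are invented). Keep in mind this only says something about the local clone; what the hosting server keeps around is a separate matter.

```shell
set -e
# Toy repository: create a commit, then rewrite it so the original
# is only reachable through the reflog.
cd "$(mktemp -d)" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo secret > file.txt && git add -A && git commit -q -m 'old history'
old=$(git rev-parse HEAD)
echo clean > file.txt && git commit -q --amend -am 'new history'

# Drop all reflog entries, then prune every unreachable object right away.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

# cat-file -e exits non-zero when the object no longer exists.
if git cat-file -e "$old" 2>/dev/null; then
    old_present=yes
else
    old_present=no
fi
echo "old commit still present: $old_present"
```

In a real checkout, note down a commit hash from the old history before rewriting and run the same git cat-file -e check after the gc.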