Thursday, February 21, 2013

cleaning up a git repository ...

That's just bad luck: your git push is taking ages to complete. Checking afterwards, you realise you have pushed videos that do not belong there. git rm Videos/* merely hangs a curtain over a broken window: the videos are still in the repository's history, and they still make any clone über-huge.
Take your TARDIS and go erase any evidence of your mistake before someone else clones your repository! But before doing that, a git pull is required so that you're sure you're operating on a fresh space-time.

gitk

With this tool, find the revision where your Videos came in. Then grab the commit-id (sha1 hash) of the commit just before it. Let's say you messed up on commit deadbeef and that cafebabe was the commit just before.
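If you'd rather not click through gitk, git log can point at the same two commits. Here is a small sketch in a throwaway repository; the file names, commit messages and identity are made up for the demo. Only the last two git commands are what you would run in your real repository.

```shell
set -e
# Scratch repository, so the experiment cannot hurt real work.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "The Doctor"            # placeholder identity
git config user.email "doctor@example.com"

echo "notes" > README
git add README
git commit -q -m "good commit"               # plays the role of cafebabe
good=$(git rev-parse HEAD)

mkdir Videos
echo "pretend this is huge" > Videos/holiday.avi
git add Videos
git commit -q -m "oops, videos"              # plays the role of deadbeef
bad=$(git rev-parse HEAD)

echo "more notes" >> README
git commit -q -a -m "later work"

# Oldest commit that added a file under Videos/ ...
first_bad=$(git log --diff-filter=A --format=%H -- Videos | tail -n 1)
# ... and its parent, the last commit without videos.
last_good=$(git rev-parse "${first_bad}^")

echo "messed up on:  $first_bad"
echo "commit before: $last_good"
```

git log lists newest commits first, hence the tail -n 1 to keep the oldest match.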

The command we want to execute on every commit between cafebabe and now is

git rm --ignore-unmatch -f Videos/*

The way to travel through time over cafebabe..HEAD and enforce that on each commit is

git filter-branch --tree-filter "git rm --ignore-unmatch -f Videos/*" cafebabe..HEAD

I bet you want to update gitk's view and explore commits to check all signs of your videos are indeed gone. Good.
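The same check can be done without gitk. A sketch, again in a scratch repository with made-up content: after the rewrite, git log restricted to Videos should find nothing on HEAD, while --all still finds the old commit through the backup that filter-branch keeps under refs/original. FILTER_BRANCH_SQUELCH_WARNING is only there to silence the warning that recent git versions print before filter-branch starts.

```shell
set -e
# Scratch repository with one good commit, one bad one, one later one.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "The Doctor"            # placeholder identity
git config user.email "doctor@example.com"

echo "notes" > README
git add README
git commit -q -m "good commit"
good=$(git rev-parse HEAD)

mkdir Videos
echo "pretend this is huge" > Videos/holiday.avi
git add Videos
git commit -q -m "oops, videos"

echo "more notes" >> README
git commit -q -a -m "later work"

# The rewrite from the post, applied to the scratch history.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --tree-filter "git rm --ignore-unmatch -f Videos/*" "$good..HEAD" > /dev/null

# Commits that still touch Videos/, on the rewritten branch and in any ref.
on_head=$(git log --format=%H HEAD -- Videos | wc -l)
anywhere=$(git log --format=%H --all -- Videos | wc -l)

echo "commits touching Videos/ on HEAD:    $on_head"
echo "commits touching Videos/ in any ref: $anywhere"
```

A non-zero second number is expected at this point: the old history is still parked in refs/original, which is why more cleanup is needed before the space is really reclaimed.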

Now, let's really cover our tracks:

mv .git/refs/original /tmp/refs-original-git
git reflog expire --expire=now --all
git gc --prune=now
git gc --prune=now --aggressive
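To convince yourself the videos have really left the object store, you can ask git whether the video's blob still exists. This is a sketch in a scratch repository with made-up content; the backup refs are moved next to the sandbox rather than to /tmp/refs-original-git, simply so the demo can be run more than once.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "The Doctor"            # placeholder identity
git config user.email "doctor@example.com"

echo "notes" > README
git add README
git commit -q -m "good commit"
good=$(git rev-parse HEAD)

mkdir Videos
echo "pretend this is huge" > Videos/holiday.avi
git add Videos
git commit -q -m "oops, videos"
blob=$(git rev-parse HEAD:Videos/holiday.avi)   # object id of the video content

export FILTER_BRANCH_SQUELCH_WARNING=1          # silence the warning in recent git
git filter-branch --tree-filter "git rm --ignore-unmatch -f Videos/*" "$good..HEAD" > /dev/null

# History is rewritten, but the blob is still sitting in .git.
git cat-file -e "$blob" && echo "before cleanup: video still stored"

# The cleanup steps from the post.
mv .git/refs/original "$sandbox.refs-original"
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    still_there=yes
else
    still_there=no
fi
echo "after cleanup, video still stored? $still_there"
```

On a real repository, comparing git count-objects -v before and after gives a rough idea of how much space came back.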

If you were lucky and realised your mistake before the push, you're now done. If you had already pushed it to the master repository, you still need to push the rewritten history there as well:

git push -f

And if you have a second clone of the master (at home?) where the videos still exist, and you want it as clean as the master,

git pull --rebase --verbose

(Ooh, I wish so hard it had a "--dry-run" mode. Maybe you'd better clone your @home repository and give the command a shot on that clone first, so that you can check it has no undesired side effects. The man pages warn that pull --rebase is potentially dangerous because it rewrites history. This is precisely what I want, but if for some reason your control over the master repository is weaker (i.e. someone else could alter its state between push -f and pull --rebase), you may want to follow the manpage's advice and read the git-rebase manpage first.)
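It is not a real dry run, but one way to peek before leaping is to git fetch first: that only updates the remote-tracking branch, so you can compare both histories before pull --rebase touches anything. The sketch below rehearses the whole story in a sandbox. The repository names master.git, work and home are made up, and the behaviour shown assumes nobody else pushes in between.

```shell
set -e
# Placeholder identity and quiet filter-branch for every repository below.
export GIT_AUTHOR_NAME="The Doctor" GIT_AUTHOR_EMAIL="doctor@example.com"
export GIT_COMMITTER_NAME="The Doctor" GIT_COMMITTER_EMAIL="doctor@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare master.git                # stands in for the master repository
git clone -q master.git work 2>/dev/null     # clone where the mistake is made and fixed

cd work
echo "notes" > README
git add README
git commit -q -m "good commit"
good=$(git rev-parse HEAD)
mkdir Videos
echo "pretend this is huge" > Videos/holiday.avi
git add Videos
git commit -q -m "oops, videos"
git push -q -u origin HEAD
cd ..

git clone -q master.git home                 # second clone, videos included

# Clean up in the first clone and force the result onto the master.
cd work
git filter-branch --tree-filter "git rm --ignore-unmatch -f Videos/*" "$good..HEAD" > /dev/null
git push -q -f
cd ../home

# Peek first: fetch moves origin's branch only, not the local one.
git fetch -q
echo "incoming from the master:"
git log --oneline "HEAD..@{u}"
echo "local commits the master no longer has:"
git log --oneline "@{u}..HEAD"

# Happy with what you see? Then let the second clone follow the master.
git pull -q --rebase
echo "home is now at $(git rev-parse --short HEAD)"
```

If the second list shows more than the old, video-laden commits (say, genuine local work), those are the commits pull --rebase will try to replay on top of the cleaned history.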
