GIT, you have done it again. (history filtering)

My Friday post talked about my reasons to split my PhD repository into many sub-repositories.  At that moment I did not really know what I was supposed to do, or whether git even had a command that could help me.  After walking into my office today and doing a little web searching I stumbled upon the filter-branch command.  This command rewrites the history of a git repository according to the filters you give it.  That can mean a lot of things, but in my case it lets me take each sub-project I want to extract and filter everything else out of the history.

My PhD repository was organized so that each directory in the root directory contained a sub-project.  This was very convenient and allowed for a very simple command.  Let's say that in my root directory I had A, B and C, each representing a sub-project, and I want to separate A.  The command to do this is:

git filter-branch --subdirectory-filter directory/ -- --all

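This command appeared in the man page for git's filter-branch.  It rewrites the whole history of the project and keeps only what lives under the directory you pass to the --subdirectory-filter argument, making that directory the new project root.  The --all argument specifies that all the branches and tags are to be rewritten; the -- in front of it separates the filter-branch options from the revision options.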
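
To try this without risking a real repository, here is a minimal sketch that builds a throwaway repository and splits A out of a copy of it.  The A, B, C layout comes from the example above; the file names, the demo identity and the temporary directory are made up for illustration, and FILTER_BRANCH_SQUELCH_WARNING only matters on newer git versions that print a deprecation notice for filter-branch.

```shell
#!/bin/sh
# Sketch: split sub-project A out of a throwaway repository.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the notice newer git versions print
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/phd"
cd "$work/phd"

# One commit per sub-project, mirroring the A, B, C layout (file names are hypothetical).
for d in A B C; do
    mkdir "$d"
    echo "notes for $d" > "$d/notes.txt"
    git add "$d"
    git commit -q -m "Add $d"
done

# Work on a copy so the original repository stays untouched.
cp -R "$work/phd" "$work/A-only"
cd "$work/A-only"
git filter-branch --subdirectory-filter A -- --all

ls                   # the contents of A now sit in the root
git log --oneline    # only the history that touched A is left
```

Running the filter on a copy is a deliberate choice: filter-branch rewrites every commit, so the untouched original is the easiest backup to fall back on.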

When the command finishes I get all the contents of directory A in my root directory.  I also get only the history that pertains to the A directory.  This is awesome!!!  This is just one of the reasons why I keep faithful to git :)  Notice that all my history was already separated by projects.  That is, I did not have any commits that changed stuff in both directory A and directory B.  I am not sure how git acts in those cases.  From the description in the man page I would expect it to keep just the part of such a commit that touches A, but I won't find out, as I don't really have that problem.
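
The mixed-commit case can be checked on a throwaway repository.  The sketch below is only an experiment under made-up assumptions (the file names, commit messages and demo identity are hypothetical): it creates one commit that touches both A and B, one that touches only B, and then filters for A.  My understanding is that git does not stop to ask anything; the mixed commit should survive with only its changes to A, and the B-only commit should drop out of the history.

```shell
#!/bin/sh
# Sketch: what happens to a commit that touches both A and B?
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the notice newer git versions print
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)/mixed
git init -q "$repo"
cd "$repo"

mkdir A B
echo one > A/a.txt
echo one > B/b.txt
git add A B
git commit -q -m "Add A and B"

# A single commit that changes both sub-projects.
echo two >> A/a.txt
echo two >> B/b.txt
git commit -q -a -m "Change A and B together"

# A commit that has nothing to do with A.
echo three >> B/b.txt
git commit -q -a -m "Change B only"

git filter-branch --subdirectory-filter A -- --all

# List the surviving commits and the files each one touches.
git log --format=%s --name-only
```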

Now it's just a matter of organizing what is left in the root directory and voilà, I have successfully separated my git repository.

Note that I had to do some additional work to make a clean separation.  The git configuration file was left untouched and I had to modify it.  I used the following command to get rid of the remote references:

git remote rm origin

Then I “cleaned” the repository with the following command:

git gc --aggressive --prune=0

After these two additional commands I could continue using the “new” repository without any warnings.
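
Put together on a throwaway repository, the cleanup could look like the sketch below.  It goes a little further than the two commands above, which is an assumption on my part about what a thorough cleanup needs: filter-branch keeps backups of the old refs under refs/original/, and the reflog still points at the old commits, so both are cleared before git gc runs.  The remote URL, file names and demo identity are made up, and the sketch uses --prune=now, the spelling I have seen in git's own documentation.

```shell
#!/bin/sh
# Sketch: filter a throwaway repository, then clean up what the rewrite leaves behind.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the notice newer git versions print
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)/phd
git init -q "$repo"
cd "$repo"
git remote add origin https://example.invalid/phd.git   # hypothetical remote, never contacted

for d in A B; do
    mkdir "$d"
    echo "notes for $d" > "$d/notes.txt"
    git add "$d"
    git commit -q -m "Add $d"
done

git filter-branch --subdirectory-filter A -- --all

# 1. Drop the remote that still points at the old, unsplit repository.
git remote rm origin

# 2. Delete the backup refs that filter-branch stored under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d

# 3. Expire the reflog so the old commits are no longer referenced from it.
git reflog expire --expire=now --all

# 4. Repack and prune objects that nothing references any more.
git gc --quiet --aggressive --prune=now

git for-each-ref    # only the rewritten branch should be listed
```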

For this post I followed some information posted at http://blog.fealdia.org/2010/02/20/separating-history-of-a-git-repository-subtree/.  Thanks to that rambler for all the good info :)

About joelgranados

I'm fascinated with how technology and science impact our reality and am drawn to leverage them in order to increase the potential of human activity.