Monday, July 28, 2008

On rebasing

In a recent blog post, Andrew Bennetts criticises Git users for rebasing their work, thereby changing their commit history and perhaps losing valuable information.

While it is true that information is lost, and that it may therefore become difficult to sync up with other users, his suggestion also has some problems that lie at the core of how Bazaar users use Bazaar. This is basically the same issue I raised earlier. He suggests merging the experimental commits into the mainline and providing a useful commit message at that point. The Bazaar developer community follows basically the same practice: they use Bundle Buggy to track their development. When a patch series must be tweaked, fixes are uploaded to Bundle Buggy. Once the series is complete, the whole branch is merged into Bazaar’s mainline. This is also why it is possible in Bazaar to talk about a “mainline”: the approach to the repository is fundamentally linear.

This means that small fixes, as in this bundle, also get merged into the mainline.

This is exactly what Andrew means: you can still see all the little fixups and you still keep your history, but when you run “bzr log --short”, you only see the merges, that is, the real commit messages.

The problem with this, of course, is that your revision control system then suddenly becomes a linear system, unable to do real merges. If you merge a branch that has useful commit messages, those messages disappear from the short log unless you copy them into the merge message. And if you don’t use “bzr log --short”, your log will be full of useless commits like “fixing newlines”, “oops, a typo”, and so on.
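
To make the two views concrete, here is a minimal Python sketch of a toy history. It is not how Bazaar or Git store or print anything; the Commit class, the two log functions and the commit messages are all made up for illustration. The assumption is that the short log follows only the first parent of each commit (the mainline), while the full log visits everything that was merged in.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    # Hypothetical toy commit: a message plus its parent commits.
    # By convention here, parents[0] is the mainline parent.
    message: str
    parents: list = field(default_factory=list)


def mainline_log(tip):
    """Follow first parents only, roughly what a mainline-only log shows."""
    messages = []
    node = tip
    while node is not None:
        messages.append(node.message)
        node = node.parents[0] if node.parents else None
    return messages


def full_log(tip):
    """Visit every reachable commit once, in plain depth-first order.

    The ordering is an artifact of this sketch, not of any real tool.
    """
    seen = set()
    messages = []
    stack = [tip]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        messages.append(node.message)
        stack.extend(node.parents)
    return messages


# A feature branch with fixups, merged into the mainline with one message.
base = Commit("Initial import")
work = Commit("Add parser", [base])
typo = Commit("oops, a typo", [work])
newlines = Commit("fixing newlines", [typo])
merge = Commit("Merge: add parser", [base, newlines])

print(mainline_log(merge))  # only the merge and the initial import
print(full_log(merge))      # the fixups show up as well
```

In this toy model the mainline view hides “Add parser” along with the fixups, and the full view shows all of them; there is no view in between.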

The problem is perhaps more apparent when you bisect a bug. In Git, with the commits rebased and carrying a nice, clear commit message, you can understand in what context the bug happened and why the change was made. If you do the same bisect in Bazaar, it might land on a commit like “oops, forgot to add this”. Of course, you can look at the merge that brought that commit in to see if it has a clearer commit message, but you can never be sure that it does. Perhaps the commit itself did have a clear message. Or perhaps the merge is just a “Synced with mainline” merge, in which case you aren’t any further than you were before. Perhaps you should look at the merge above that one?
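
The bisect case can be sketched the same way. This is a simplified model, not the real bisect of either tool: the history is flattened to a list ordered oldest to newest, each hypothetical commit is a (message, is_bad) pair, and I assume that the newest commit is bad and that once a commit is bad, all later ones are bad too.

```python
def first_bad(commits, is_bad):
    """Binary search for the earliest bad commit.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return commits[lo]


def is_bad(commit):
    # In real life this would build and test the tree; here it is a flag.
    return commit[1]


# History with all the fixups kept (made-up messages).
unsquashed = [
    ("Initial import", False),
    ("Start cache support", False),
    ("oops, forgot to add this", True),
    ("fixing newlines", True),
    ("Synced with mainline", True),
]

# The same work rebased into one commit with a real message.
rebased = [
    ("Initial import", False),
    ("Add cache support to the parser", True),
    ("Later change", True),
]

print(first_bad(unsquashed, is_bad)[0])
print(first_bad(rebased, is_bad)[0])
```

In the first history the search ends on the fixup, whose message tells you nothing; in the second it ends on a commit that describes the whole change.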

The same is true when merging from non-feature branches. Let’s say that someone has made a branch with ten new features in it. Each of those features was developed as above: small fixes, and a merge for the real feature. How are you going to merge that branch? If you merge it all at once, with a commit message like “Merged ten new features from John”, then “bzr log --short” won’t display which features were actually merged. You could merge them all by hand, but that is a lot of work. Or you can expand the commit message to list all changes. In any case, you don’t want to view the full “bzr log” history, because that shows all the little fixups and errors.

As you can hopefully see, merging those commits in might seem easy, but it can cause a lot of problems afterwards. It certainly doesn’t make the history easier to understand. The way Bazaar is developed, you basically get the same merge power as Subversion: “bzr log --short” won’t show what was merged in, while “bzr log” itself shows too much information to be useful.

2 comments:

Anonymous said...

Great post as for me. It would be great to read a bit more concerning this theme. Thank you for posting this data.

Carlus Henry said...

I am a newbie to Git and I don't think that I quite understand something in your post.

It seems as though what you may be saying is that instead of merging into the "mainline", you would suggest rebasing your topic branch onto your "mainline" branch. In this way, you retain all of the commits that were on your topic branch... does that sound about accurate?

I like this idea a lot. Where I am getting confused is how it would cause problems with rewriting the history and sharing it. I am familiar with the problems of rewriting the history of a shared branch, but in this case, if you are rebasing your changes from a topic branch onto the master, you really aren't changing the history of the master branch at all (in this scenario the topic branch is local and not shared, while the master branch is)... perhaps I am missing something.

Thanks