The wonders of Git Rebase

One of the git tools I find most useful, and the one I love the most, is git-rebase(1). It builds on the premise that git originally brought to version control systems: that you can and should rework your repository history to make it more organized and readable.

Around the world, each project establishes its own convention for working with git: some use techniques like git-flow, others just a branch and tags, and others split work between master/main and a production/release branch. The convention at the company I work for is to always create a dedicated branch to work on a new feature, bug fix or whatever. Having done that, I adopt the discipline to (1) keep it up to date with the parent branch using git pull --rebase origin parent_branch and (2) keep it tidy as I work using git rebase -i HEAD~N.
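To make step (1) concrete, here is a minimal sketch that runs entirely inside a throwaway directory. Everything in it is an assumption made for illustration: the bare repository standing in for origin, the colleague clone, and the branch and file names. It only shows the shape of the workflow, not how any particular project is set up.

```shell
#!/bin/sh
# Sketch of the branch discipline against a throwaway local "origin".
# Hypothetical names: origin.git, colleague, mine, parent_branch, my_feature.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"

# A bare repository plays the role of the shared remote.
git init -q --bare origin.git

# A colleague's clone publishes the first commit of parent_branch.
git clone -q origin.git colleague
cd colleague
echo "base" > base.txt
git add base.txt
git commit -q -m "base: Initial commit"
git push -q origin HEAD:refs/heads/parent_branch
cd ..

# My clone: create a specific branch for the new feature.
git clone -q -b parent_branch origin.git mine
cd mine
git checkout -q -b my_feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature: Add my change"

# Meanwhile, the colleague pushes more work to the parent branch.
cd ../colleague
echo "newer" > newer.txt
git add newer.txt
git commit -q -m "base: Add newer upstream change"
git push -q origin HEAD:refs/heads/parent_branch

# (1) Keep my branch up to date. git pull takes the remote and the
# branch as two separate arguments.
cd ../mine
git pull -q --rebase origin parent_branch

# My commit now sits on top of the colleague's newer commit.
git log --format=%s
```

The final log should list my commit first, followed by the two commits from the parent branch, with no merge commit in between.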

Rebase

The rebase command has a very simple behavior. Essentially, it does not execute a merge (that is, it does not generate a commit that introduces the upstream changes into your local copy). Instead, rebase temporarily sets aside all your work, brings the newer commits into your branch and then sequentially reapplies your patches on top of this updated branch state. The benefit: the branch ends up updated and contains only your own newer commits on top, in a coherent and readable way.
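The behavior described above can be observed in a small local experiment. This is only a sketch: the topic branch and the file names are made up, and the parent branch is whatever name git init picks on your machine. It records the id of my commit before and after the rebase to show that the patch is applied again as a new commit.

```shell
#!/bin/sh
# Sketch: a rebase replays my commit as a new commit on top of the parent.
# Branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

cd "$(mktemp -d)"
git init -q
echo "base" > base.txt
git add base.txt
git commit -q -m "base: Initial commit"
parent=$(git symbolic-ref --short HEAD)  # whatever name git init chose

# My work lives on its own branch.
git checkout -q -b topic
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "topic: Add my change"
before=$(git rev-parse HEAD)

# The parent branch gains a newer commit in the meantime.
git checkout -q "$parent"
echo "newer" > newer.txt
git add newer.txt
git commit -q -m "base: Add newer change"

# Rebase: my commit is set aside, the branch moves to the parent's tip,
# and my patch is applied again on top of it.
git checkout -q topic
git rebase -q "$parent"
after=$(git rev-parse HEAD)

# Same patch, different commit id; the parent's tip is now my direct ancestor.
echo "before: $before"
echo "after:  $after"
git log --format=%s
```

Because the reapplied commit has a different parent, it also gets a different id, which is why rebasing rewrites history instead of adding a merge commit to it.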

The -i parameter is short for --interactive and tells git to start an interactive rebase, starting from the commit I have asked for 1. For example, git rebase -i HEAD~5 will start a rebase covering my last 5 commits. It will proceed to show you this:

pick 9fdd140 hooks: Flesh out the Hook::Destroy service test
pick ecb296e resource: Flesh out the resource test
pick bd115a0 hooks: Generate token when subscribing
pick fcaba9d hooks: Flesh out the handler test
pick 3926d4b doc: Add document about testing

# Rebase bed2ff9..3926d4b onto bed2ff9 (5 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#

That means it will show my last five commits and ask me what I want to do with them, presenting the options to keep (pick), remove (drop), edit (edit or reword) or join them (fixup, squash).
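Normally I edit that list by hand in my editor. To show the effect in a script that can run unattended, the sketch below replaces the editor with a sed command through the GIT_SEQUENCE_EDITOR environment variable, changing the second line of the list from pick to fixup. The repository, file names and the "fix typo" commit are invented for the example.

```shell
#!/bin/sh
# Sketch: join a "fix typo" commit into the commit it belongs to.
# Repository contents and commit messages are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_EDITOR=true  # never open an editor for commit messages

cd "$(mktemp -d)"
git init -q
echo "readme" > README
git add README
git commit -q -m "doc: Add README"

echo "tokne" > hooks.txt
git add hooks.txt
git commit -q -m "hooks: Generate token when subscribing"

echo "token" > hooks.txt
git commit -q -a -m "fix typo"

# Stand-in for the interactive editor: git appends the path of the todo
# list to this command, and sed turns the second "pick" into "fixup".
GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/fixup/"' git rebase -q -i HEAD~2

# Only two commits remain; the typo fix was melded into the hooks commit.
git log --format=%s
```

Using fixup instead of squash keeps the first commit's message as is, so the "fix typo" message disappears from the history while its change stays.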

Discipline

My work plan is usually the following: as I advance in my work, I commit locally, with very vague messages, just enough for me to understand what sort of milestone each commit represents. Whenever I feel I can present this work upstream, I use rebase to rewrite, split or join commits, forming a narrative of coherent and readable sequential steps.
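Splitting is the least obvious of those three operations, so here is a hedged sketch of one way to do it, following the approach described in git-rebase(1): mark the commit as edit, undo it while keeping its changes, and commit the pieces separately. The "wip" commit, the file names and the sed stand-in for the editor are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch: split one vague "wip" commit into two focused commits.
# File names and commit messages are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_EDITOR=true  # never open an editor for commit messages

cd "$(mktemp -d)"
git init -q
echo "readme" > README
git add README
git commit -q -m "doc: Add README"

# A vague local milestone that mixes two unrelated changes.
echo "handler" > handler.txt
echo "testing notes" > TESTING
git add handler.txt TESTING
git commit -q -m "wip"

# Mark the commit as "edit" so the rebase stops right after applying it.
GIT_SEQUENCE_EDITOR='sed -i.bak "1s/^pick/edit/"' git rebase -i HEAD~1

# Undo the commit but keep its changes in the working tree, then commit
# each change on its own.
git reset -q HEAD~1
git add handler.txt
git commit -q -m "hooks: Add the handler"
git add TESTING
git commit -q -m "doc: Add document about testing"

# Finish the rebase; the branch now points at the two new commits.
git rebase --continue

git log --format=%s
```

The working tree ends up identical to what the "wip" commit contained; only the way the history tells the story has changed.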

In writing this narrative, I strive to make atomic changes, each capable of answering the questions what has been done? and will this break the existing tests? 2

An experience that has helped me a lot with this was contributing to projects that use a public developer mailing list 3, sending patches instead of creating Pull Requests, especially because the work logic changes with that. The way of working presented by tools like GitHub, GitLab and similar is to create a PR and keep adding commits ad infinitum until the work is ready to merge. The logic behind git via email, by contrast, is to be as assertive as possible, avoiding giant patchsets 4 with changes all over the place. One advantage of patchsets is that it is possible to apply only a subset of the changes presented and ask for a revision of the specific patches that need more work.
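The idea of applying only part of a patchset can be tried locally without any mail server. In this sketch, git format-patch writes one file per commit (the same files git send-email would mail out) and git am applies just the first of them on the receiving side. The contributor and maintainer repositories, the patches directory and the commits are all hypothetical.

```shell
#!/bin/sh
# Sketch: export a two-patch patchset and apply only the first patch.
# Repository names, paths and commits are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"

# Contributor repository with a base commit.
git init -q contributor
cd contributor
echo "readme" > README
git add README
git commit -q -m "doc: Add README"

# The maintainer starts from the same base.
git clone -q . ../maintainer

# Two small, independent patches form the patchset.
echo "token" > hooks.txt
git add hooks.txt
git commit -q -m "hooks: Generate token when subscribing"
echo "testing notes" > TESTING
git add TESTING
git commit -q -m "doc: Add document about testing"

# Write one numbered patch file per commit made since HEAD~2.
git format-patch -q -o ../patches HEAD~2

# The maintainer applies only the first patch and can ask for a new
# revision of the second one.
cd ../maintainer
git am -q ../patches/0001-*.patch

git log --format=%s
```

This only works cleanly because the two patches do not depend on each other, which is one more reason to keep each patch atomic.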

Applying this logic, I strive to break a long patchset into several different patchsets 5, each made of patches that are as atomic as possible, with a good description, organized by modules or systems 6.

Concluding: git is fantastic. Having the discipline to use it and learn how it works is a must for any programmer. One of the main differences between git and the VCSs that came before it is that the history of the repository is not written in stone; it is not read-only. Besides that, git has extensive offline documentation in its manpages, available for you to study. I strongly recommend reading git-rebase(1), giteveryday(7) and gitrevisions(7).


  1. HEAD~N essentially means the commit N commits before my HEAD; the rebase covers everything after it, up to HEAD.

  2. That means no fix tests or fix typo commits.

  3. Which is the way git was designed to work. Notable examples of mailing-list-driven development are the Linux Kernel, git, cgit, dwm and other suckless tools, coreutils, musl-libc, freedesktop, vim, emacs, *BSD and many others.

  4. A patch is a commit, and a patchset is a collection of patches, somewhat akin to a PR.

  5. An example here could be organizing includes, removing white spaces or any other cosmetic change to the source code that has no direct connection to what I was originally writing.

  6. I generally like to use the following format for my commits: system: subsystem, description of what this commit does. Then, in the commit message, I describe why this change is necessary and my approach.