2016-07-09

GitFlow (and friends) with remotes: avoid the busywork

There are lots of places online where one can learn about GitFlow, but it's seemingly always discussed in a local way; the details, subtle or not, about using GitFlow when you are pushing and pulling your changes through the network are never mentioned.

Now, GitFlow is a bit long on the teeth, and some of the younger and simpler alternatives do take into consideration things as Pull (or Merge) Requests; lately some even want the original GitFlow to be considered harmful. But still, there's a common basic concept in GitFlow and the alternatives, which is local feature branches. How and when exactly to merge them back into a development branch is one of the big differences, but is rarely detailed.

The goal of this post is to gather some tips on how to keep developing in the local feature branch while staying involved on a lively repository; and how to make easier the final feature merging.

In particular, for GitFlow it looks like what we all do is just extrapolate and use push, pull, rebase, request merges, etc in an ad-hoc way, whenever we feel it makes sense. (Did you realize that the original GitFlow doesn't even mention rebasing? I would have sworn it did!)

The problem is that using GitFlow as if it was all-local and then simply adding remotes to it makes things cumbersome. It's one of those cases were we start accumulating long recipes of maybe-redundant-but-hopefully-safe steps, yet we aren't confident enough about it to just make a script. Result: busywork. And busywork that should be repeated frequently is a recipe for problems. And one place where you don't want problems is your source tracker, do you?

So. In this context, there's a simple git trick that I haven't seen mentioned anywhere: a local branch can be tracking a remote branch with a different name, and git will deal with it in a meaningful way. It will keep rebasing your local branch against the remote branch [1], and if you try to push, git will complain that the branch has a non-matching name (so you can't end up pushing somewhere unintended unless you are really explicit).

In short: 
  • your feature branch can be tracking origin/develop (not develop, not origin/feature),
  • rebase will work just as typically required by Pull Request / Merge Request workflows, 
  • push is still safe, 
  • the local develop branch head, and its related busywork, are unnecessary

This is pretty much perfect for a multi-developer feature-branching workflow (like GitFlow or one of its simplifications, like Atlassian's "Simple Git workflow" or "Feature Branch workflow"). Let's see what it changes from the "local-only" GitFlow.

A GitFlow refresher


All the instructions I have ever seen for GitFlow start by creating a feature branch from the branch develop, working and committing on it, and eventually merging the feature branch into develop.

The first difficulty in this simplistic model is that it assumes that we have total control over develop, and so we can merge into it and then push it to the server. But that only happens when you're developing alone. More commonly, while you work on a local feature branch, another developer will have pushed new commits into develop; so when your feature is ready for merging, in coarse terms you have to
  1. rebase it on develop, and then 
  2. merge it and push; or file a Merge Request
Now,  how to do this depends on taste / policy, and so it seems to be one of the big discussions about git workflows (bikeshedding anyone?). There are 3 possibilities:
  • Just merge (with no-fast-forward):  what was advocated by GitFlow. The feature branch's track of commits stays parallel to the develop branch, to which it merges. This parallel track happens always, even if the develop branch didn't advance while the feature branch was developed. But there can be commits in the develop branch, inside the bump!
  •  
To the left, the "no-fast-forward" bump advocated by GitFlow
  • Just merge (allowing fast-forward): some people don't like the bumps of short-lived feature branches on an otherwise idle develop branch, as shown at left; in that case, instead of the bump, they prefer the commits of the feature branches just going into the develop branch when a fast-forward merge is possible, so no branching remains in those merges; just as shown at right.
  • Rebase-and-merge: the goal here is to create the bumps that the fast-forwarders dislike. Develop always stays unchanged inside the bump, just as shown at left. For this, one has to explicitly rebase on top of the develop branch just before merging.
I'm partial for the rebase-and-merge option, because
  • any conflicts have been resolved by the actual developer before the merge itself, so the merge turns trivial - great for Pull Requests
  • the code review previous to the merge is cleanest: since there are no conflicts, the discussion can focus on the feature itself
  • the process leaves a well-visible development track of the feature in the story
Even the "considered harmful" guy seems to like it (search for the text "The best argument I've ever seen in favor of rebasing" ;P)

OK, so at this point we know what we want. The problem is implementing it, as in being able to deal with the branches multiple times daily without getting bogged down. If possible, we want to do things so simple that they won't require changing mental tracks nor elicit the typical reflex of "oh no, not now, I'm too deep into [whatever else] right now".

More concretely... 


For later comparison, let's go through a concrete step-by-step lifecycle of a feature branch, in the naïve way. Personally I use either naked git or a GUI (SourceTree mostly), not the git-flow scripts, but still I'll try to give the equivalences:

# we start on develop and create a feature branch
$ git checkout -b feature_bar # equivalent to git flow feature start feature_bar

# commit into the feature branch
$ touch myfile
$ git add myfile
$ git commit

# later I realize that origin/develop has advanced. I should check whether my changes still work
# CHORUS starts

# bring the changes from the server
$ git checkout develop

$ git pull # we didn't change development, so this will be a fast-forward merge; equivalent to fetch+merge or rebase

# now we have to rebase any local feature branches on top of the latest develop head
$ git checkout feature_bar
$ git rebase develop

# CHORUS ends

$ # continue commiting to your feature branch
$ touch newfile
$ git add newfile
$ git commit

# the feature is done, I want to merge it...
# but before that, again I need to bring the latest changes from the server 
# and rebase my feature branch on top of develop
# REPEAT CHORUS

# now I can actually merge feature into development

$ git checkout develop
$ git merge feature_bar # equivalent to git flow feature finish
$ git push # or file a merge request

# NOT PICTURED: pushing your feature branch to the server when it is not finished yet. [2]

Simpler rebasing


The point of my "trick" is that the whole CHORUS is simplified if we configure the feature branch to track origin/develop (for example, with git branch -u origin/develop feature_bar)


# no need to checkout develop, just stay on your feature branch!
$ git fetch # to bring in the changes from the server
$ git rebase # to rebase your current branch on top of its tracked branch


# but wait! that actually is equivalent to
$ git pull --rebase

(If you have more than one feature branches to be rebased, you still will need to rebase them one by one: )
$ git checkout feature_N
$ git rebase
It's important to note what is missing: since you don't have a local develop branch head, you don't have to manage it: no need to checkout in and out of it, and no need to rebase it. After all, it's logical; you are not going to change the develop branch locally, so why even create it?

And at this point, we have an one-liner that can be used without second thoughts. In fact, now we can even use our GitFlow(-like) workflow together with a tool like Android's repo: you can do "repo sync" whenever you want and repo will do the fetch+rebase for you in your current branch, in all the repositories you configured.

An alternative: throw-away merges


Since we are here, let's see what other options are there.

We have optimized the everyday task of rebasing our local feature branch on develop. But, the best optimization you can do to anything is... to remove that thing. And maybe we can do this here, too: we are wanting to do 1 fetch + N rebases (with N = the number of local feature branches), maybe multiple times a day, all to make an integration test: making sure that our changes keep working well on top of the last commits of the remote branch origin/develop. Is there another way?

Yes, there is. The way used in git's own development process, as explained in "man gitworkflows", is: instead of rebasing, make a "throw-away" integration merge where you merge everything at once (develop+feature branches) and do whatever integration test you want. Once you're satisfied, you get rid of the merge and continue developing the branches.

The advantage of this routine is that it is uninvolved and mechanical:
# stay on your feature branch!
$ git fetch
#create a new, throwaway branch
$ git checkout -b throwaway_merge
$ git merge origin/develop feature_bar feature_foo # all your branches!

# do your integration test

# go back to your feature branch
$ git checkout feature_bar
# delete the throwaway branch
$ git branch -df throwaway_merge 


# with the branch gone, the merge disappears and will be garbage-collected
# continue developing
(There are different ways to throw away the merge. One way is using git reset; but I feel more comfortable creating a new branch specifically with this purpose. The merge will only be pointed to by this branch; so, given the way git works, once the branch is deleted and nothing is left pointing to the merge, then merge is effectively inaccessible AKA deleted and will be eventually garbage-collected)

Note that there can be merge conflicts, which is an advance of what will happen when you finally merge the features to develop in the future. And the conflict resolution you do on the throw-away merge will be, well, thrown away! Fortunately, you can enable git rerere, with which git will record the resolution of conflicts and reapply them automatically in the future.

OK now, so just merging the branches in place is conceptually even simpler than rebasing those branches... in a way. Though, to me, having to do and then "undo" a merge still sounds too involved for something that I might want to do multiple times a day. Let's daydream. Couldn't we just merge everything whenever we want, and keep the results, in such a way that later we will still be able to rebase easily, once each feature is complete?

Yes, we can! But as before, you must make your local feature branch track origin/develop.

Lazy throw-away merges


# stay on your feature branch!
$ git fetch # get changes from server
$ git merge origin/develop # merge develop INTO your feature branch

# but wait! that is actually equivalent to

$ git pull

# test integration, continue developing, whatever

... you can continue developing in your feature branch, and from time to time merging into it whatever commits appear in origin/develop. You will end up with a tree like this (imagining only one feature branch and 2 merges, symbolized here by the M and N nodes):

A--B--E--F-----G  (develop)
    \     \     \
     C--D--M--H--N--I (feature)

    
At this point you have finished your feature and want to rebase it to clean up the history and merge it to develop. Meaning, you want something like this:

 A--B--E--F--G  (develop)
              \
               C--D--H--I (feature)


How to do it?

Turns out that git rebase has an option that works beautifully here:
$ git rebase --root --onto origin/development # optional -i

Note that you want both --root and --onto; if not, ... well, read the manual or have fun experimenting (did you know that you can have multiple disconnected history trees inside one git repository? AKA orphaned branches... )

In summary, what we are doing in this case is a more extreme/lazier version of the throw-away merges: we don't throw them away at the moment, but keep developing. When we finally rebase, the merges are implicitly unlinked and thrown away (so once more, git rerere will be useful in this case); we just don't care about them in the meantime. Once more we get rid of the busywork.

Finally, a disclaimer: in every git workflow I have seen, you are supposed to keep your feature branches short. You better keep that in mind! If you keep a long-lived feature branch and get used to merging into it the develop branch, as just explained here, you might end up confusing what was introduced by develop, what by you, and how everything interacts...

Conclusion


At the end of the day, the interesting thing here is that a lot of ceremony (even for the original, "almost-local" GitFlow!) can be avoided by just correctly configuring git for our use case. Which I'm afraid is long form to "I should have RTFM".

On the other hand, I'd say I did read it. And looking around, one sees that git questions in Stack Overflow can get >10k upvotes and >3M views; which is an order of magnitude more than, say, C questions, even though these are about 4 times more abundant.

I guess it does imply something bad about git's documentation and/or complexity.

Obligatory XKCD



[1] As simple as git pull --rebase

[2] There are possible reasons to do so: like filing a Merge Request and being told that you still need to change something. The problem is that, while you finish and push some new commit to that branch, develop might get some new commits too. And now you have a problem: to fully follow the rebase-and-merge philosophy, you should now rebase your already-published branch, and push it again; but that is a big no-no in the "standard" git world. In practice you can just use git push --force, but that needs some caveats, as explained elsewhere. Maybe also in another blog post?

No comments

Post a Comment