
Yes, you really can work from HEAD

Software developers often debate whether it is better to branch, do development on the branch, and then merge back into HEAD, or to just work from HEAD.

Jez Humble and others claim that the latter is better. If you make your changes in "small batches", this works. In fact, it works better than branching: the bigger the merge when you bring your branch back in, the more likely it is to introduce problems.

Jez recently tweeted:

which caused a bit of debate between various twitterers (tweeters? twits?)

Jez co-wrote the definitive book on the subject, so he has a lot of authority in this area. If you haven't read Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (by Jez Humble and David Farley), stop reading this blog post now and get it. Seriously. It is worth it.

Some things I'd like to point out about that slide from Google:

  • Yes, 10000 developers all working from HEAD actually works. People often say that this can't possibly scale, and yet here is an example of it working. There's a big difference between "it can't work", "I haven't gotten it to work", and "I'm conjecturing that it couldn't work".
  • Even though Google has one big monolithic code tree, each project can be checked out individually. That said, if your project is a library that other people use, compiling at timestamp T means getting the library as it is at timestamp T also.
  • Do some projects use a branch-and-merge methodology? When I started at Google some did use "branch and merge". However, those numbers were shrinking. I'm sure there were still some that did this, for special edge cases. Not that I had visibility into every project at Google, but it was generally accepted as true that nearly everyone worked from HEAD.
  • "50% of code changes every month": A big part of why that is possible is that Google is very aggressive about deleting inactive code. It's still in the VCS's history if you need it, so why not delete it if it isn't being used? Being aggressive about deleting inactive code greatly reduces the maintenance tax. Making a global change (like changing a library interface) is much easier when you only have to do it for active projects.

Of course, what's really amazing about that slide is that the entire company has one VCS for all projects. That requires discipline you don't see at most companies. I've worked at smaller companies that had different VCS software and different VCS repositories for every little thing. I'm surprised at how many companies have entire teams that don't even use VCS! (If there were a Unicode code point for a scream of agony, I'd insert it here.)

By having one repo for the entire company you get leverage that is so powerful it is difficult to even explain. You can write a tool exactly once and have it be usable for all projects. You have 100% accurate knowledge of who depends on a library; therefore you can refactor it and be 100% sure of who will be affected and who needs to test for breakage. I could probably list 100 more examples. I can't express in words how much of a competitive advantage this is for Google.
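To make the "who depends on a library" point concrete, here is a minimal sketch in Python. It assumes the dependency information is available as a plain dictionary mapping each project to the things it depends on directly. The project names and the affected_by function are hypothetical illustrations, not Google's actual tooling.

```python
from collections import defaultdict, deque

def affected_by(deps, changed):
    """Return every project that depends, directly or indirectly, on `changed`.

    `deps` maps a project name to the list of things it depends on directly.
    """
    # Invert the map: for each dependency, who uses it directly?
    reverse = defaultdict(set)
    for project, needs in deps.items():
        for needed in needs:
            reverse[needed].add(project)

    # Breadth-first walk over the reversed edges.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

# Hypothetical dependency data for illustration only.
deps = {
    "search": ["strings", "rpc"],
    "mail": ["rpc"],
    "rpc": ["strings"],
    "ads": ["search"],
}
print(sorted(affected_by(deps, "rpc")))
```

With everything in one tree, a map like this can be complete, which is what makes the answer trustworthy. With code scattered across many repos, the same walk would silently miss users you don't know about.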

In a literal sense, not all Google code is in that one tree. When I was there, Chrome, Android and other projects had their own trees. Chrome and Android were open source projects and had very different needs. That said, they also work from HEAD, so the earlier point still applies.

Tom

Disclaimer: This is all based on my recollection of how things were at Google. I have no reason to believe it has changed, but I have no verification of that either.

Posted by Tom Limoncelli

7 Comments

We just had this debate at work - it's our first time using a VCS to co-manage a project that had been maintained by one guy. We decided it would be easiest to have a "Dev" branch that we share, and roll that into production after testing it.

Does the model I just described match the way Google does things? (I know Chrome has a Canary/Beta branch.) Alternatively, is the implication that code commits at Google generally go right into production?

They must have some AMAZING testing tools.

Rather than "they must have some AMAZING testing tools", I'd say they must have some AMAZING developer discipline.

No, that would be the "branch and merge" strategy that Jez recommends against.

The "head" strategy is that every commit should result in a system that still builds, runs, and passes all tests. It *could* be released into production. It is a candidate. However it may not actually make it into production.

So a workflow in that system would look something like this:

Pull changes from HEAD
Perform your intended changes
Merge/commit those changes back into HEAD (after running testing tools)

Then the mitigator that stops those changes from hitting production is that the production system has not yet pulled a new release from the repo?

The migrator only pushes into production a release that gets blessed.

Each commit is tested fully so that you know which commit introduced an error. However not every commit ends up in production.

If blessing is manual, it could be daily, weekly, or whatever. Someone manually picks a build that passed all the tests, possibly does more manual tests, and decides whether or not to use that package in production.

If blessing is automatic, it could be whatever schedule is implemented. Let's say programmers are so productive there's a new commit every second; and therefore 3600 packages each hour. Let's say it takes an hour to roll out a release in production. There will be 3599 packages built that aren't put into production and 1 that is. Or, the system might rate-limit things so that one push is done every 2 hours, or Pi minutes. Or whatever. Or hourly but not on Fridays. Or every 2 hours but not on the CEO's birthday.
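The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: time is split into fixed rollout windows, and the newest build that passed its tests in each window is the one that gets blessed. The Build class and pick_releases function are hypothetical names, not part of any real release system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Build:
    commit_time: int  # seconds since the start of the period
    passed: bool      # did this build pass all of its tests?

def pick_releases(builds, rollout_seconds):
    """Pick at most one build per rollout window: the newest one that passed."""
    newest_per_window = {}
    for build in sorted(builds, key=lambda b: b.commit_time):
        if build.passed:
            # Later builds in the same window overwrite earlier ones.
            newest_per_window[build.commit_time // rollout_seconds] = build
    return [newest_per_window[w] for w in sorted(newest_per_window)]

# One commit per second for an hour, with a one-hour rollout window.
builds = [Build(commit_time=t, passed=True) for t in range(3600)]
released = pick_releases(builds, rollout_seconds=3600)
print(len(released), "released,", len(builds) - len(released), "not released")
```

Every build is still built and tested, so a failure can be traced to the commit that caused it. The rate limit only decides which of the passing builds reach production.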

Thanks for that, it also makes me think.

Code changes have been shown to follow a power-law distribution, i.e. a long tail: a small amount of code changes a lot, and the vast majority changes rarely. So if Google is aggressively removing code and 50% changes each month, does the power-law distribution still hold?

We use a pull-request model, like the one described here:

* http://css.dzone.com/articles/continuous-integration-and

We chunk the work into one-point stories, and peer-test and code review each one-point story using a JIRA Agile Board, before merging the pull request.

After setting up a pull request for our own work, we prioritize testing and reviewing someone else's work before starting a new task. In this way, a PR is rarely outstanding for more than a few business hours. Of course, at standup, as a failsafe, we make sure that every PR has an owner.

Technically, the one-point pull-request branches are feature branches. The benefit is that we increase transparency, knowledge transfer, stability, and quality, without creating long-lived branches with hard-to-resolve conflicts.

For anyone nervous about trying work-from-head, using git pull-requests and micro stories can increase collaboration and maintain the same velocity.

-Ted.
