Someone recently asked me how the flags in "flag flips" are implemented in actual software. There are a couple different techniques that I've seen.
Command-line flags: Normal command line flags. To change a flag, you have to kill and restart the process. These flags are usually implemented in libraries that come with the language, though I have a preference for gflags.
A/B flags: When doing A/B testing, a cookie is set with a random number (often a value 0-999). The code then uses the number to partition the tests: A=0-499, B=500-999. To enroll 1 percent of all users, one might use A=0-4 and B=5-9 and everyone else gets the default behavior. Often these ranges are set in command-line flags so they are easy to change.
Dynamic flags are more difficult: Some sites use a PAXOS-based system so that all processes get the same flag change at the same time. You have one thread call a blocking "return when the value changed" call so you can stash the new value and start using it right away. Google uses Chubby internally, the GIFEE equivalent is called ZooKeeper, and CoreOS has open sourced etcd.
More complex systems Facebook has an interesting system, which we talk about in The Practice of Cloud System Administration; Chapter 11 that permits per-user settings such as don't show a certain feature to people that work for TechCrunch.
The person that wrote to me also asked how would I change many machines at the same time. For example, on January 1st all your web pages must switch from Imperial to Metric for all dimensions displayed to the user. How would this be managed?
Well, first you'd have to decide how to deal with computations that are already in flight? I would redefine the problem to be: "all new API calls after midnight will have the new default". The API dispatcher should check the clock and read all flags, which are then passed to the function. That way the flags are invariant. It also makes it easier to write unit tests since the flag setting mechanism is abstracted out of the main code. Also, I say "default" because before midnight you'll want to be able to call the API both ways for testing purposes. By having the dispatcher calculate the default based on the date/time, and letting that trickle down to the derived value, you permit the caller to get the old behavior if needed.
If readers want to recommend libraries that implement these in various languages, please feel to post links in the comments.
Leave a comment