Someone on Quora recently asked, "Why did Google include the 'undo send' feature on Gmail?" They felt that adding the 30-second delay to email delivery was inefficient. However, rather than answering the direct question, I explained the deeper issue. My (slightly edited) answer is below. NOTE: While I previously worked at Google, I was never part of the Gmail team, nor do I know any of their developers or product manager(s). What I wrote here is true for any software company.


Why did Google include this feature? Because the "Gmail Labs" system permits developers to override the decisions of product managers. This is what makes the "Labs" system so brilliant.

A product manager has to decide which features to implement and which not to. This is very difficult. Each new feature takes time to design (how will it work from the user perspective), architect (how will the internals work), implement (write the code that makes it all happen), and support (documentation, and so on). There are only so many hours in the day, and only so many developers assigned to Gmail. The product manager has to say "no" to a lot of good ideas.

If you were the product manager, would you select features that are likely to attract millions of new users, or features that help a few existing users have a slightly nicer day? Obviously you'll select the first category. IMHO Google is typically concerned with growth, not retention. New users are more valuable than slight improvements that will help a few existing users. Many of these minor features are called "fit and finish"... little things that help make the product sparkle, but aren't things you can put in an advertisement because they have benefits that are intangible or would only be understood by a few. Many of the best features can't be appreciated or understood until they are available for use. When they are "on paper", it is difficult to judge their value.

Another reason a product manager may reject a proposed feature is politics. Maybe the idea came from someone that the product manager doesn't like or doesn't trust (possibly for good reason).

The "Labs" framework of Google products lets developers add features that have been rejected by the product manager. Google engineers can, in their own spare time or in the "20% time" they are allocated, implement features that the product manager hasn't approved. "Yes, Mr Product Manager, I understand that feature x-y-z seems stupid to you, but the few people that want it would love it, so I'm going to implement it anyway and don't worry, it won't be an official feature."

The Third Way of DevOps is about creating a culture that fosters two things: continual experimentation (taking risks and learning from failure) and understanding that repetition and practice is the prerequisite to mastery. Before the Labs framework, adding any experimental feature had a huge overhead. Now most of the overhead is factored out so that there is a lower bar to experimenting. Labs-like frameworks should be added to any software product where one wants to improve their Third Way culture.

Chapter 2 of The Practice of Cloud System Administration talks about many different software features that developers should consider to ensure that the system can be efficiently managed. Having a "Labs" framework enables features to be added and removed with less operational hassle because it keeps experiments isolated and easy to switch off if they cause an unexpected problem. It is much easier to temporarily disable a feature that is advertised as experimental.

What makes the "Labs" framework brilliant is that it not only gives a safe framework for experimental features to be added, but it also gathers usage statistics automatically. If the feature becomes widely adopted, the developer can present cold, hard data to the product manager that says the feature should be promoted to become an official feature.
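To make the idea concrete, here is a minimal sketch of a Labs-like registry. It is not how Gmail Labs is actually built (I don't know its internals); the class name `Labs`, its methods, and the feature name `undo_send` are all hypothetical. It only illustrates the three properties described above: users opt in, operators can switch an experiment off, and usage is counted automatically.

```python
from collections import Counter


class Labs:
    """Hypothetical registry of opt-in experimental features."""

    def __init__(self):
        self.enabled = {}       # feature name -> set of users who opted in
        self.disabled = set()   # features operators have switched off for everyone
        self.usage = Counter()  # feature name -> number of times it was used

    def opt_in(self, feature, user):
        self.enabled.setdefault(feature, set()).add(user)

    def kill(self, feature):
        """Switch an experiment off without removing its code."""
        self.disabled.add(feature)

    def use(self, feature, user):
        """Return True (and count the use) if the feature is active for this user."""
        if feature in self.disabled:
            return False
        if user not in self.enabled.get(feature, set()):
            return False
        self.usage[feature] += 1
        return True


labs = Labs()
labs.opt_in("undo_send", "alice")
labs.use("undo_send", "alice")  # counted
labs.use("undo_send", "bob")    # bob never opted in, so not counted
print(labs.usage["undo_send"], len(labs.enabled["undo_send"]))
```

The counters are the "data" a developer would bring to the product manager; the `kill` method is the operational escape hatch.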

Of course, the usage statistics might also show that the feature isn't well-received and prove the product manager correct.

A better way of looking at it is that the "labs" feature provides a way to democratize the feature selection process and provides a data-driven way to determine which features should be promoted to a more "official" status. The data eliminates politically-driven decision making and "I'm right because my business card lists an important title"-business as usual. This is one of the ways that Google's management is so brilliant.

I apologize for explaining this as an "us vs. them" paradigm, i.e., as if the product managers and developers are at odds with each other. However, the labs feature wouldn't be needed if there weren't some friction between the two groups. In a perfect world there would be infinite time to implement every feature requested, but we don't live in that world. (Or maybe the "Labs" feature was invented by a brilliant product manager that hated to say "no" and wanted to add an 'escape hatch' that encouraged developers to experiment. I don't know, but I'm pessimistic and believe that Labs started as an appeasement.)

So, in summary: Why did Google include the 'undo send' feature on Gmail? Because someone thought it was important, took the time to implement it under the "labs" framework, users loved the feature, and product management promoted it to be an official Gmail feature.

I wish more products had a "labs" system. The only way it could be better is if non-Googlers had a way to add features under the "labs" system too.

Hey Google, when do we get that?

Posted by Tom Limoncelli in DevOps

TOML vs. JSON

[This is still only draft quality but I think it is worth publishing at this point.]

Internally at Stack Exchange, Inc. we've been debating the value of certain file formats: YAML, JSON, INI and the new TOML format just to name a few.

[If you are unfamiliar with TOML, it is Tom's Obvious, Minimal Language. "Tom", in this case, is Tom Preston-Werner, founder and former CEO of GitHub. The file format has still not reached version 1.0 and is still changing. However, I do like it a lot. Also, the name of the format IS MY FREAKIN' NAME which is totally awesome. --Sincerely, Tom L.]

No one format is perfect for all situations. However, while debating the pros and cons of these formats something did dawn on me: one group is for humans and another is for machines. The reason there will never be a "winner" in this debate is that you can't have a single format that is both human-friendly and machine-friendly.

Maybe this is obvious to everyone else but I just realized:

  1. The group that is human-friendly is easy to add comments to, is tolerant of ambiguity, and is often weakly typed (only differentiating between ints and strings).

  2. The group that is machine-friendly is difficult (or impossible) to add comments to, is less forgiving about formatting, and is often strongly typed.

As an example of being unforgiving about formatting, JSON doesn't permit a comma on the last line of a list.

This is valid JSON:

{
   "a": "apple", 
   "alpha": "bits", 
   "j": "jax"
}

This is NOT valid JSON:

{
   "a": "apple", 
   "alpha": "bits", 
   "j": "jax",
}

Can you see the difference? Don't worry if you missed it because it just proves you are a human being. The difference is the "j" line has a comma at the end. This is forbidden in JSON. This catches me all the time because, well, I'm human.
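If you want to check this without trusting your eyes, a strict parser will do it for you. The following is a small sketch using Python's standard `json` module; the helper name `parses` is mine, and the two strings are the examples above collapsed onto one line.

```python
import json

valid = '{"a": "apple", "alpha": "bits", "j": "jax"}'
invalid = '{"a": "apple", "alpha": "bits", "j": "jax",}'  # trailing comma


def parses(text):
    """Return True if text is accepted by Python's strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(valid))    # True
print(parses(invalid))  # False
```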

It also distracts me because diffs are a lot longer as a result. If I add a new value, such as "p": "pebbles", the diff looks very different:

$ diff x.json  xd.json 
4c4,5
<    "j": "jax"
---
>    "j": "jax",
>    "p": "pebbles"

However, if JSON did permit a trailing comma (which it doesn't), the diffs would be shorter and more obvious.

$ diff y.json yd.json 
4a5
>    "p": "pebbles",

This is not just a personal preference. This has serious human-factors consequences in an operational environment. It is difficult to safely operate a large complex system and one of the ways we protect ourselves is by diff'ing versions of configuration files. We don't want to be visually distracted by little things like having to mentally de-dup the "j" line.
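The difference in diff size shown earlier can be reproduced with a short sketch using Python's standard `difflib`. The function name `changed_lines` and the four lists are my own; the "relaxed" pair assumes a hypothetical JSON-like format that tolerates a trailing comma, which real JSON does not.

```python
import difflib


def changed_lines(old, new):
    """Return only the added/removed lines from a zero-context unified diff."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    # Skip the '---' and '+++' file headers; keep the '+' and '-' content lines.
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Strict JSON: the last entry has no comma, so adding "p" also touches the "j" line.
strict_old = ['{', '   "a": "apple",', '   "alpha": "bits",', '   "j": "jax"', '}']
strict_new = ['{', '   "a": "apple",', '   "alpha": "bits",', '   "j": "jax",',
              '   "p": "pebbles"', '}']

# Hypothetical trailing-comma format: every entry ends with a comma.
relaxed_old = ['{', '   "a": "apple",', '   "alpha": "bits",', '   "j": "jax",', '}']
relaxed_new = ['{', '   "a": "apple",', '   "alpha": "bits",', '   "j": "jax",',
               '   "p": "pebbles",', '}']

print(len(changed_lines(strict_old, strict_new)))    # 3
print(len(changed_lines(relaxed_old, relaxed_new)))  # 1
```

Three changed lines versus one may not sound like much, but in a file with dozens of edits it is the difference between a diff you can skim and one you have to study.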

The other difference is around comments. One camp permits them and the other doesn't. In operations we often need to temporarily comment out a few lines, or include ad hoc messages. Operations people communicate by leaving breadcrumbs and to-do items in files. Rather than commenting out some lines I could delete them and use version control to bring them back, but that is much more work. Also, I often write code in comments for the future. For example, as part of preparation for a recent upgrade, we added the future configuration lines to a file but commented them out. Because they were in the file, coworkers could proofread them.

It was suggested that if we used JSON we would simply add a key to the data structure called "ignore" and update the code to ignore any hashes with that key. That's a lot of code to change. Another suggestion was that we add a key called "comment" with a value that is the comment. This is what a lot of JSON users end up doing. However, the comments we needed to add don't fit into that paradigm. For example, we wanted to add comments like, "Ask so-and-so to document the history of why this is set to false" and "Keep this list sorted alphabetically". Neither of those comments could be integrated into the JSON structures that existed.
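Here is a small sketch of the comment problem. It uses INI rather than TOML only because Python's standard library has shipped an INI parser (`configparser`) for a long time; the section name `settings` and the keys are made up for illustration. The same free-standing comment is accepted in the INI text and rejected in the JSON text.

```python
import configparser
import json

ini_text = """
[settings]
# Keep this list sorted alphabetically
alpha = bits
j = jax
"""

json_text = """
{
   # Keep this list sorted alphabetically
   "alpha": "bits",
   "j": "jax"
}
"""

# The INI parser skips the comment line and returns only the data.
parser = configparser.ConfigParser()
parser.read_string(ini_text)
print(dict(parser["settings"]))

# The JSON parser has no comment syntax, so the same note is a syntax error.
try:
    json.loads(json_text)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False
print(json_ok)
```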

On the other hand, strictly specified formats like JSON are, in theory, faster to parse. Supporting ambiguity slows things down and leads to other problems. JSON is also so widely supported that this alone is often reason enough to use it.

Some formats have typed data, others assume all data are strings, others distinguish between integer and string but go no further. YAML, if you implement the entire standard, has a complex way of representing specific types and even supports repetition with pointers. All of that turns YAML's beautifully simple format into a nightmare unsuitable for human editing.

I'm not going to say "format XYZ is the best and should be used in all cases"; however, I'd like to summarize the attributes of each format:

*  Format                                 JSON     YAML                TOML        INI
M  Formal standard                        YES      YES                 soon        no
M  Strongly typed                         YES      YES                 string/int  no
M  Easy to implement the entire standard  YES      no                  YES         YES
H  Awesome name!                          no       no                  YES         no
H  Permits comments                       no       start of line only  YES         usually
H  diffs neatly                           no       YES (I think)       YES         YES
H  Can be programmatically updated
   without losing format or comments      yes-ish  NO                  soon        NO

The * column indicates if this quality is important for machines (M) or humans (H). NOTE: This chart is by no means complete.

Personally I'm trying to narrow the file formats in our system down to two: one used for machine-to-machine communication (that is still human readable), and the other that is human-generated (or at least human-updated) for machine consumption (like configuration files). (Technically there's a 3rd need: Binary format for machine-to-machine communication, such as ProtoBufs or CapnProto.)

I'm very optimistic about TOML and look forward to seeing it get to a 1.0 standard. Of course, the fact that I am "Tom L." sure makes me favor this format. I mean, how could I not like that, eh?

Update: 2015-07-01: Updated table (TOML is typed), and added row for "Awesome name".

I literally never thought I'd see this day arrive.

In 1991/1992 I was involved in passing the LGB anti-discrimination law in New Jersey. When it passed in January 1992, I remember a reporter quoting one of our leaders that marriage was next. At the time I thought Marriage Equality would be an impossible dream, something that wouldn't happen in my lifetime. Well, less than a quarter-century later, it has finally happened.

In the last few years more than 50% of the states approved marriage equality and soon it became a foregone conclusion. States are the "laboratories of democracy" and with 26 states (IIRC) having marriage equality, it's about time to declare that the experiment is a success.

There were always predictions that marriage equality would somehow "ruin marriage" but in the last decade of individual states having marriage equality not a single example has come forward. What has come forward has been example after example of problems from not having marriage equality. The Oscar-winning documentary "Freeheld" is about one such example. Having different laws in different states doesn't just create confusion, it hurts families.

"Human progress is neither automatic nor inevitable", wrote Martin Luther King Jr. It is not automatic: it doesn't "just happen", it requires thousands of little steps.

This day only happened because of thousands of activists working for many years, plus hundreds of thousands of supporters, donors, and millions of "like" buttons clicked.

A lot of people make jokes about lawyers but I never do. No civil rights law or court decision ever happens without a lawyer writing legislation or arguing before a court. The legal presentations given in Obergefell v. Hodges were top notch. Implementing the decision requires operational changes that will require policy makers, legal experts, and community activists to work together.

This is really an amazing day.

Posted by Tom Limoncelli in Community, Politics

Recently we were having the most difficult time planning what should have been a simple upgrade. There is a service we use to collect monitoring information (scollector, part of Bosun). We were making a big change to the code, and the configuration file format was also changing.

The new configuration file format was incompatible with the old format.

We were concerned about a potential Catch-22 situation. Which do we upgrade first, the binary or the configuration file? If we put the new RPM in our Yum repo, machines that upgrade to this package will not be able to read their old-format configuration file, and that's bad. If we convert everyone's configuration file first, then on any machine that reboots (or whose daemon is restarted) the old binary will find a new-format configuration file it cannot read, and that would also be bad.

The configuration files (old and new) are generated by the same configuration management system that deploys the new RPMs (we use Puppet at Stack Exchange, Inc.). So, in theory we could specify particular RPM package versions and make sure that everything happens in a coordinated manner. Then the only problem would be newly installed machines, which would be fine because we could pause that for an hour or two.

But then I realized we were making a lot more work for ourselves by ignoring the old Unix adage: If you change the file format, change the file name. The old file was called scollector.conf; the new file would be scollector.toml. (Yes, we're using TOML).

Now that the new configuration file would have a different name, we simply had Puppet generate both the old and new file. Later we could tell it to upgrade the RPM on machines as we slowly roll out and test the software. By doing a gradual upgrade, we verify functionality before rolling out to all hosts. Later we would configure Puppet to remove the old file.

This reminds me of the fstab situation in Solaris many years ago. Solaris 1.x had an /etc/fstab file just like Linux does today. However, Solaris 2.x radically changed the file format (mostly for the better). They could have kept the filename the same, but they followed the adage (the new file is /etc/vfstab) and for good reason. Many utilities and home-grown scripts manipulate the /etc/fstab file. They would all have to be rewritten. It is better for them to fail with a "file not found" error right away than to carry on and modify the file incorrectly.
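The adage can be sketched in a few lines. This is not scollector's actual code; the function `find_config`, the `CONFIG_NAME` mapping, and the "old"/"new" labels are hypothetical, and the file contents are placeholders. The point is that each generation of the program looks only for its own file name, so both files can coexist during a rollout and a mismatch fails fast.

```python
import os
import tempfile

# Hypothetical mapping: each generation of the daemon looks only for its own file name.
CONFIG_NAME = {"old": "scollector.conf", "new": "scollector.toml"}


def find_config(generation, directory):
    """Return the config path for this generation, or fail fast if it is absent."""
    path = os.path.join(directory, CONFIG_NAME[generation])
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    return path


with tempfile.TemporaryDirectory() as etc:
    # During the rollout, configuration management writes both files.
    for name in CONFIG_NAME.values():
        with open(os.path.join(etc, name), "w") as f:
            f.write("placeholder\n")
    print(os.path.basename(find_config("old", etc)))
    print(os.path.basename(find_config("new", etc)))

    # After cleanup, a leftover old binary fails loudly instead of misreading a file.
    os.remove(os.path.join(etc, "scollector.conf"))
    try:
        find_config("old", etc)
    except FileNotFoundError as err:
        print("not found:", os.path.basename(str(err)))
```

Because neither generation ever opens the other's file, the order in which the binary and the configuration are upgraded stops mattering.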

This technique, of course, is not required if a file format changes in an upward-compatible way. In that case, the file name can stay the same.

I don't know why I hadn't thought of that much earlier. I've done this many times before. However, the fact that I didn't think of it made me think it would be worth blogging about.

Posted by Tom Limoncelli in Technical Tips

My talk and 2 tutorial proposals have been accepted at the Usenix LISA Conference!

  • Talk:
    • Transactional system administration is killing us and must be stopped
  • Tutorials:
    • How To Not Get Paged: Managing Oncall to Reduce Outages
    • Introduction to Time Management for busy Devs and Ops

The schedule isn't up yet at http://www.usenix.org/lisa15 but Usenix is encouraging speakers to post to social media early this year.

See you in Washington DC Nov 8-13, 2015!

Update: 2015-06-16 I changed the title to "some of my proposals" instead of "my proposals". To be clear, I had many rejections this year; I just don't blog about those. That said, I think LISA is a better conference when it increases its speaker diversity and you can't do that if the same few people give a lot of talks.

Posted by Tom Limoncelli in LISA, Usenix

Thanks, QCon New York!

I had a great time at QCon New York last week. It was my first time there and my first time speaking too. The audience was engaged and had great questions. I did a book-signing at the Pearson booth and it was fun meeting readers (and future readers) of our books.

Videos of all talks will be available soon. For now you can view the slides. You should also check out the talk by David Fullerton, VP of Engineering of Stack Exchange (my boss's boss), who gave a great talk titled "Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance".

Hope to see you next year!

Posted by Tom Limoncelli in Speaking

I'll be speaking at QCon in their "Architecting for Failure" track. My talk is titled "Fail Better: Radical Ideas from the Practice of Cloud Computing".

This conference has a vendor area. I'll be at the Pearson booth signing books on Thursday from 3:50-4:30. Stop by even if you just want to chat!

Registration is still open. More about the conference at qconnewyork.com.

Hope to see you there!

Posted by Tom Limoncelli

Recently a friend told me this story. She had given a presentation at a conference and soon after started receiving messages from a guy that wanted to talk more about the topic. He was very insistent that she was the only person that would understand his situation. Not wanting to be rude, she offered to continue by email but he wanted to meet in person. His requests became more and more demanding over time. It became obvious that he wasn't looking for mentoring or advice. He wanted a date.

She had no interest in that.

Unsure what to do, she asked a few other female attendees for advice. What a surprise to discover that the same guy had also contacted them and was playing the same game. In fact, she later found out 5 women that had attended the conference were receiving the same treatment.

Yes. Really.

Wait... it gets worse.

This is the third conference at which I've seen this happen. This isn't just a problem with one particular conference or even one kind of conference.

So, it isn't a coincidence; this is an M.O.

I call this pattern "playing the odds". You approach every woman that attends a conference, assuming the odds favor at least one of them saying "yes".

I'm not sure what is more insulting: the assumption that any female speaker is automatically available or interested in dating, or that the women wouldn't see right through him.

The good news in all three cases is that the conference organizers handled the situation really well once they were aware of it.

So, guys, if you ever think you are the first person to think of doing this, I have some sad news for you. First, you aren't the first. Second, it won't work.

To the women that speak at conferences, now that you know this is a thing, I hope it is easier to spot.

The problem is that there is no transparency in the system. It isn't obvious if the guy is doing this to a lot of women because sharing such information is difficult and uncomfortable. There are many privacy concerns; in particular, if the guy was contacting the women for legitimate reasons, a false-positive being publicly announced would be... bad.

If only there were a confidential service where people could register "so-and-so is contacting me saying x-y-z". If multiple people reported the same thing, it would let all parties know.

I was considering offering such a service. The implementation would be quite simple. I would set up an email inbox. Women would send messages with a subject line that contained the person's email address, name, and whether their approach was "maybe" or "very" creepy. I would check the inbox daily. For example my inbox might look like this:

Subject: [email protected] Joe Baker  maybe
Subject: [email protected] Mark Jones  maybe
Subject: [email protected] Ken Art  maybe
Subject: [email protected] Mark Jones  maybe
Subject: [email protected] Ryan Example  very
Subject: [email protected] Mark Jones  maybe
Subject: [email protected] Mark Jones maybe

If I saw those Subject lines, I would alert the parties involved that Mark Jones seems to be on the prowl. The service wouldn't be entirely confidential, but I would do the best I could.

Then I realized I could build a system that would require zero human interaction and only be slightly less accurate. It would be a website called "did-he-just-hit-on-me.com" and it would be a static web page that displays the word "yes". True, it wouldn't be 100 percent accurate, but exactly how much less accurate is difficult to determine. If you have to ask, the answer is probably "yes".

Jokes aside, maybe someone with better programming skills could come up with an automated system that protects everyone's privacy, is secure, and strikes the right balance between transparency, privacy, and accuracy.

I'm not one of the people who is directly affected by this sort of thing, so if my thinking on these solutions is off base, I'm eager to hear it.

In the meantime, I'm holding on to that domain.

Posted by Tom Limoncelli in Women in Computing

Tom will be giving a talk entitled "Safer Puppet in 4 quick demos" at Puppet Camp NYC on May 15, 2015.

Info about the event.

Info about the presentations and how to register.

Posted by Tom Limoncelli in Appearances, Archive

Tom will be giving a talk entitled "Safer Puppet in 4 quick demos" at the May meeting of LOPSA's NJ Chapter on May 7, 2015.

More info on their Meetup page.

NOTE: This is a dress rehearsal of the talk I'll be giving at Puppet Camp NYC the following week.

Posted by Tom Limoncelli in Appearances, Archive

 