Yes, I've started a video podcast that has a homework assignment built-in. Watch a famous talk from a past LISA conference (that's the homework) then watch Tom and Lee interview the speaker. What's new since the talk? Were their predictions validated? Come find out!

Watch it live or catch the recorded version later.

The first episode will be recorded live Tuesday July 28, 2015 at 1:30pm PDT. This month's guest will be Todd Underwood who will discuss his talk from LISA '13 titled, Post-Ops: A Non-Surgical tale of Software, Fragility, and Reliability

For links to the broadcast and other info, go here: https://www.usenix.org/conference/lisa15/lisa-conversations

Posted by Tom Limoncelli in LISA Conversations

Pearson / InformIT.com is running a promotion through August 11th on many open-source related books, including Volume 2, The Practice of Cloud System Administration. Use discount code OPEN2015 during checkout and received 35% off any one book, or 45% off 2 or more books.

See the website for details.

Posted by Tom Limoncelli in DevOps

If you are in NYC, there is a SysAdmin Appreciation day event at The Gingerman, 11 E 36th Ave, New York City, NY, on Friday, July 31, 2015, 6:00 PM. This event usually has a big turn-out and is a great way to meet and network with local admins.

RSVP here: http://www.meetup.com/Sysdrink/events/223896825/

Thanks to Digital Ocean for sponsoring this event, and Justin, Jay, Nathan and the other organizers for putting this together every year.

Hope to see you there!

Posted by Tom Limoncelli in Community

There are many ways to specify scheduled items. Cron has 10 8,20 * 8 1-5 and iCalendar has RRULE and Roaring Penguin brings us REMIND. There's a new cross-platform DSL called Schyntax, created by my Stack Overflow coworker Bret Copeland.

The goal of Schyntax is to be human readable, easy to pick up, and intuitive. For example, to specify every hour from 900 UTC until 1700 UTC, one writes hours(9..17)

What if you want to run every five minutes during the week, and every half hour on weekends? Group the sub-conditions in curly braces:

{ days(mon..fri) min(*%5) } { days(sat..sun) min(*%30) }

It is case-insensitive, whitespace-insensitive, and always UTC.

You can read more examples in his blog post: Schyntax Part 1: The Language. In Schyntax Part 2: The Task Runner (aka "Schtick") you'll see how to create a scheduled task runner using the Schyntax library.

You can easily use the library from JavaScript and C#. Bret is open-sourcing it in hopes that other implementations are created. Personally I'd love to see a Go implementation.

The Schynatx syntax and sample libraries are on Github.

I don't make it to the SF/Bay Area very often so I'm giving you all advanced notice about this event. I'll be giving my talk about our new book, The Practice of Cloud System Administration, at the San Francisco DevOps Meetup on Mon, July 13, 2015. Info about RSVP at http://www.meetup.com/San-Francisco-DevOps/events/223171890/

Posted by Tom Limoncelli in AppearancesArchive

UCMS '15 Call for PapersThe Call for Participation for the new 2015 USENIX Container Management Summit is now online.

UCMS '15 will take place November 9, 2015, during LISA15 in Washington, D.C.

Posted by Tom Limoncelli

Step 1: Watch a video from a past Usenix LISA conference.

Step 2: Join the Hangout On Air and watch Lee Damon and Tom Limoncelli interview the speaker. Send Q&A during the show.

Step 3: Watch and enjoy!

Our first 4 are scheduled for the last Tuesday of July/Aug/Sept/Oct.

The first one is July 28, 2015 at 1:30pm PDT. We'll be interviewing Todd Underwood about his LISA 2013 talk Post-Ops, A Non-Surgical tale of Software, Fragility, and Reliability. Watch the presentation head of time then join the the Google Hangout On Air.

(Want a reminder? RSVP for the event)

Posted by Tom Limoncelli in LISA Conversations

I don't make it to the SF/Bay Area very often so I'm giving you all advanced notice about this event.

I'll be talking about our new book, The Practice of Cloud System Administration, at the July meeting of the Bay Area Infracoders Meetup. This is a very "dev meets ops" book, and I hope you all enjoy hearing about it. RSVP early! Hope to see you there! http://www.meetup.com/Bay-Area-Infracoders/events/222938571/

Posted by Tom Limoncelli in AppearancesArchive

Someone on Quora recently asked, Why did Google include the 'undo send' feature on Gmail?. They felt that adding the 30-second delay to email delivery was inefficient. However rather than answering the direct question, I explained the deeper issue. My (slightly edited) answer is below. NOTE: While I previously worked at Google, I was never part of the Gmail team, nor do I even know any of their developers or the product manager(s). What I wrote here is true for any software company.


Why did Google include this feature? Because the "Gmail Labs" system permits developers to override the decisions of product managers. This is what makes the "Labs" system so brilliant.

A product manager has to decide which features to implement and which not to. This is very difficult. Each new feature takes time to design (how will it work from the user perspective), architect (how will the internals work), implement (write the code that makes it all happen), and support (documentation, and so on). There are only so many hours in the day, and only so many developers assigned to Gmail. The product manager has to say "no" to a lot of good ideas.

If you were the product manager, would you select features that are obviously going to possibly attract millions of new users, or features that help a few existing users have a slightly nicer day? Obviously you'll select the first category. IMHO Google is typically is concerned with growth, not retention. New users are more valuable than slight improvements that will help a few existing users. Many of these minor features are called "fit and finish"... little things that help make the product sparkle, but aren't things you can put in an advertisement because they have benefits that are intangible or would only be understood by a few. Many of the best features can't be appreciated or understood until they are available for use. When they are "on paper", it is difficult to judge their value.

Another reason a product manager may reject a proposed feature is politics. Maybe the idea came from someone that the product manager doesn't like, or doesn't trust. (possibly for good reason)

The "Labs" framework of Google products is a framework that let's developers add features that have been rejected by the product manager. Google engineers can, in their own spare time or in the "20% time" they are allocated, implement features that the product manager hasn't approved. "Yes, Mr Product Manager, I understand that feature x-y-z seems stupid to you, but the few people that want it would love it, so I'm going to implement it anyway and don't worry, it won't be an official feature."

The Third Way of DevOps is about creating a culture that fosters two things: continual experimentation (taking risks and learning from failure) and understanding that repetition and practice is the prerequisite to mastery. Before the Labs framework, adding any experimental feature had a huge overhead. Now most of the overhead is factored out so that there is a lower bar to experimenting. Labs-like frameworks should be added to any software product where one wants to improve their Third Way culture.

Chapter 2 of The Practice of Cloud System Administration talks about many different software features that developers should consider to assure that the system can be efficiently managed. Having a "Labs" framework enables features to be added and removed with less operational hassle because it keeps experiments isolated and easy to switch off if they cause an unexpected problem. It is much easier to temporarily disable a feature that is advertised as experimental.

What makes the "Labs" framework brilliant is that it not only gives a safe framework for experimental features to be added, but it gathers usage statistics automatically. If the feature becomes widely adopted, the developer can present hard cold data to the product manager that says the feature should be promoted to become an official feature.

Of course, the usage statistics might also show that the feature isn't well-received and prove the product manager correct.

A better way of looking at it is that the "labs" feature provides a way to democratize the feature selection process and provides a data-driven way to determine which features should be promoted to a more "official" status. The data eliminates politically-driven decision making and "I'm right because my business card lists an important title"-business as usual. This is one of the ways that Google's management is so brilliant.

I apologize for explaining this as an "us vs. them" paradigm i.e. as if the product managers and developers are at odds with each other. However, the labs feature wouldn't be needed if there wasn't some friction between the two groups. In a perfect world there would be infinite time to implement every feature requested, but we don't live in that world. (Or maybe the "Labs" feature was invented by a brilliant product manager that hated to say "no" and wanted to add an 'escape hatch' that encouraged developers to experiment. I don't know, but I'm pessimistic and believe that Labs started as an appeasement.)

So, in summary: Why did Google include the 'undo send' feature on Gmail? Because someone thought it was important, took the time to implement it under the "labs" framework, users loved the feature, and product management promoted it to be an official Gmail feature.

I wish more products had a "labs" system. The only way it could be better is if non-Googlers had a way to add features under the "labs" system too.

Hey Google, when do we get that?

Posted by Tom Limoncelli in DevOps

TOML vs. JSON

[This is still only draft quality but I think it is worth publishing at this point.]

Internally at Stack Exchange, Inc. we've been debating the value of certain file formats: YAML, JSON, INI and the new TOML format just to name a few.

[If you are unfamiliar with TOML, it is Tom's Obvious, Minimal Language. "Tom", in this case, is Tom Preston-Werner, founder and former CEO of GitHub. The file format is still not reached version 1.0 and is still changing. However I do like it a lot. Also, the name of the format IS MY FREAKIN' NAME which is totally awesome. --Sincerely, Tom L.]

No one format is perfect for all situations. However while debating the pros and cons of these formats something did dawn on me: one group is for humans and another is for machines. The reason there will never be a "winner" in this debate is that you can't have a single format that is both human-friendly and machine-friendly.

Maybe this is obvious to everyone else but I just realized:

  1. The group that is human-friendly is easy to add comments to, and tolerant of ambiguity, is often weakly typed (only differentiating between ints and strings).

  2. The group that is machine-friendly is difficult (or impossible) to add comments, is less forgiving about formatting, and use often strongly typed.

As an example of being unforgiving about formatting, JSON doesn't permit a comma on the last line of a list.

This is valid JSON:

{
   "a": "apple", 
   "alpha": "bits", 
   "j": "jax"
}

This is NOT valid JSON:

{
   "a": "apple", 
   "alpha": "bits", 
   "j": "jax",
}

Can you see the difference? Don't worry if you missed it because it just proves you are a human being. The difference is the "j" line has a comma at the end. This is forbidden in JSON. This catches me all the time because, well, I'm human.

It also distracts me because diffs are a lot longer as a result. If I add a new value, such as "p": "pebbles" the diff looks very different:

$ diff x.json  xd.json 
4c4,5
<    "j": "jax"
---
>    "j": "jax",
>    "p": "pebbles"

However if JSON did permit a trailing comma (which it doesn't), the diffs would look shorter and be more obvious.

$ diff y.json yd.json 
4a5
>    "p": "pebbles",

This is not just a personal preference. This has serious human-factors consequences in an operational environment. It is difficult to safely operate a large complex system and one of the ways we protect ourselves if by diff'ing versions of configuration files. We don't want to be visually distracted by little things like having to mentally de-dup the "j" line.

The other difference is around comments. One camp permits them and another camp doesn't. In operations often we need to be able to temporarily comment out a few lines, or include ad hoc messages. Operations people communicate by leaving breadcrumbs and todo items in files. Rather than commenting out some lines I could delete them and use version control to bring them back, but that is much more work. Also, often I write code in comments for the future. For example, as part of preparation for a recent upgrade, we added the future configuration lines to a file but commented them out. By including them, they could be proofread by coworkers. It was suggested that if we used JSON we would simply add a key to the data structure called "ignore" and update the code to ignore any hashes with that key. That's a lot of code to change to support that. Another suggestion was that we add a key called "comment" with a value that is the comment. This is what a lot of JSON users end up doing. However the comments we needed to add don't fit into that paradigm. For example we wanted to add comments like, "Ask so-and-so to document the history of why this is set to false" and "Keep this list sorted alphabetically". Neither of those comments could be integrated into the JSON structures that existed.

On the other hand, strictly formatted formats like JSON are, in theory, faster to parse. Supporting ambiguity slows things down and leads to other problems. In the case of JSON, it is just plain so widely supported there are many reasons to use it just for that reason.

Some formats have typed data, others assume all data are strings, others distinguish between integer and string but go no further. YAML, if you implement the entire standard, has a complex way of representing specific types and even supports repetition with pointers. All of that turns YAML's beautifully simple format into a nightmare unsuitable for human editing.

I'm not going to say "format XYZ is the best and should be used in all cases" however I'd like to summarize the attributes of each format:

* Format JSON YAML TOML INI
M Formal standard YES YES soon no
M Strongly typed YES YES string/int no
M Easy to implement
the entire standard
YES no YES YES
H Awesome name! no no YES no
H Permits comments no start of line only YES usually
H diffs neatly no YES (I think) YES YES
H Can be
programmatically
updated without losing
format or comments
yes-ish NO soon NO

The * column indicates if this quality is important for machines (M) or humans (H). NOTE: This chart is by no means complete.

Personally I'm trying to narrow the file formats in our system down to two: one used for machine-to-machine communication (that is still human readable), and the other that is human-generated (or at least human-updated) for machine consumption (like configuration files). (Technically there's a 3rd need: Binary format for machine-to-machine communication, such as ProtoBufs or CapnProto.)

I'm very optimistic about TOML and look forward to seeing it get to a 1.0 standard. Of course, the fact that I am "Tom L." sure makes me favor this format. I mean, how could I not like that, eh?

Update: 2015-07-01: Updated table (TOML is typed), and added row for "Awesome name".

 
  • LISA15