Demo Data as Code

My newest article for acmQueue magazine is called Demo Data as Code:

Posted by Tom Limoncelli in ACM Queue Column

Robert Ross (a.k.a. Bobby Tables) will be the speaker at the next nycdevops meetup on Wed, June 19, 2019.

Full details and RSVP info:

NOTE: Different day and location!

  • Title: Staying Informed with Kubernetes Informers
  • Speaker: Robert Ross (Bobby Tables) from FireHydrant
  • Date: Wed, June 19, 2019
  • Location: Compass, 90 Fifth Ave, New York, NY 10011

Kubernetes state is changing all the time. Pods are being created. Deployments are adding more replicas. Load balancers are being created from services. All of these things can happen without anyone noticing. Sometimes, however, we need to notice so that we can react to such events. What if we need to push the change to an audit log? What if we want to inform a Slack room about a new deployment? In Kubernetes, this is possible with the informers that are baked into the API and Go client. In this talk we'll learn how informers work, and how to receive updates when resources change using a simple Go application.


Bobby is the founder of FireHydrant. He previously worked as a staff software engineer at Namely, and also built things at DigitalOcean. He likes bleeding-edge tech and making software that helps teams build better systems. While deploying Spinnaker, Istio, and Kubernetes, he has cursed at a lack of docs and spelunked through the code, and he loves telling the war stories about them.

Full details and RSVP info:

Posted by Tom Limoncelli in NYCDevOps Meetup

The April nycdevops Meetup is Thursday, April 18. Doors open at 6:30pm!

NOTE: The meetings are now on THURSDAY.

  • Title: How to build a tamper-evident CI/CD system
  • Speaker: Trishank Karthik Kuppusamy, Datadog, Inc

TALK DESCRIPTION: CI/CD is critical to any DevOps operation today, but when attackers compromise it, they get to distribute malicious software to millions of unsuspecting users. We present how Datadog used TUF and in-toto to develop, to the best of our knowledge, the industry's first end-to-end verified pipeline that automatically builds integrations for the Datadog agent. That is, even if this pipeline is compromised, users should not be able to install malware. We will show a demonstration of our pipeline in production being used to protect users of the Datadog agent, and describe how you can use TUF + in-toto to secure your own pipeline.

SPEAKER BIO: Trishank Karthik Kuppusamy is a security engineer at Datadog, Inc. Previously, he led the research and development of The Update Framework (TUF) and Uptane at the NYU Tandon School of Engineering. He is also a member of the IEEE-ISTO Uptane standardization alliance, and an Editor of in-toto Enhancements.

Space is limited. Please RSVP soon!

Posted by Tom Limoncelli in NYCDevOps Meetup

DevOpsDays-NYC is Jan 24/25, 2019. Don't miss it!

Posted by Tom Limoncelli in DevOpsDays

Was the root cause of the O2 outage really an expired certificate?

Why wasn't the "root cause" any of these?

  • Certificate expiration not monitored
  • Certificate renewal process so complex that everyone hopes someone else fixes it
  • Certificate renewal is so rare, we aren't good at doing it
  • Deploying new certificates manual and error-prone
  • Vendor did not document all periodic maintenance requirements
  • Soon-to-expire certs not logged
  • Logging for each component an island unto itself

The reason, dear reader, is that there is no such thing as a single "root cause". There are only contributing factors.
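
To make one of those contributing factors concrete, here is a minimal sketch of what "certificate expiration monitored" could look like. It is not how O2 or its vendor did (or should have done) anything; it only shows the arithmetic. The function names, the 30-day threshold, and the hostname are assumptions made up for illustration. The only library call relied on is Python's standard `ssl.cert_time_to_seconds`, which parses the `notAfter` string format found in certificates.

```python
import ssl
import time
from typing import Optional

# Hypothetical threshold: warn when fewer than this many days remain.
WARN_DAYS = 30


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days left before the notAfter timestamp.

    not_after uses the certificate time format, e.g.
    "Jan 11 00:00:00 2019 GMT". A negative result means already expired.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400


def check_cert(name: str, not_after: str, now: Optional[float] = None,
               warn_days: int = WARN_DAYS) -> str:
    """Return a one-line status suitable for a log or an alert."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return f"CRITICAL {name}: expired {-remaining:.0f} days ago"
    if remaining <= warn_days:
        return f"WARNING {name}: expires in {remaining:.0f} days"
    return f"OK {name}: {remaining:.0f} days remaining"


if __name__ == "__main__":
    # Pin "now" so the example is reproducible.
    now = ssl.cert_time_to_seconds("Jan  1 00:00:00 2019 GMT")
    print(check_cert("api.example.com", "Jan 11 00:00:00 2019 GMT", now=now))
```

Logging the WARNING line somewhere people actually look would also cover the "soon-to-expire certs not logged" item. Fetching the `notAfter` value from a live service is left out on purpose, since that part depends on your environment.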

When will the industry learn?

Posted by Tom Limoncelli

Disclaimer: I haven't worked at Google for 5+ years so this kind of story is probably outdated. I mean, how could Google not have fixed this problem in the last 10 years?

In 2008 I was on a business trip to Seattle and I had dinner with an old college friend who now worked at Microsoft. I noticed that she had an iPhone. This was when Microsoft was heavily pushing their own phone product, and Android hadn't started shipping.

I thought it was odd that a Microsoftie would be using an iPhone and pointed it out.

"Oh, it's the opposite. We are encouraged to use the competition's products. The better we understand their products, the better we can compete with them."

I thought that was a very sound strategy.

When I got back to the office, I happened to have a meeting with one of the feature designers for Google Docs. I was meeting to suggest some improvements.

The designer was interested in one feature I was suggesting. He asked my opinion of how the UX flow should work. I responded, "Well, have you seen how Microsoft Word does it?"

"Oh no, I try not to look at competing products."

"Why not?" I asked.

"Oh, I don't want to be influenced by their design decisions."


Even as an ex-Googler, I use a lot of Google products, and I often see a feature whose user experience can only be described as embarrassingly broken. I use this phrase only when competing products get it right.

I wonder where that feature designer is today.

When was the last time you gave your competitor's product a test run? Used it for a week or two? Does your employer encourage this or discourage this? If you are a manager, do you encourage your employees to do this? Does your corporate culture encourage or discourage this?

Posted by Tom Limoncelli

Cheers to my coworker Taryn for her blog post about how she did an extremely complex series of 30 Microsoft SQL Server upgrades.

If you've seen the film "Apollo 13", there's a scene where they have to get something right in the simulator before they can do it in space. That's basically what she had to do.

Read the post here: How we upgraded Stack Overflow to SQL Server 2017

Here are some takeaways:

  • Set up a lab environment to test complex changes.
  • Communicate with your users.
  • Write a detailed playbook.
  • Don't do it alone.
  • Ask for help from all over.
  • Keep a lab notebook.
  • Record it for posterity!

I'm super proud to have people like Taryn on our SRE team at Stack Overflow!

(Would you like to work with awesome people like Taryn? We have many open positions, including a west-coast (US/Pacific or compatible) Cloud/Azure SRE, an Internal IT Support Engineer (remote or NYC), and a Junior Technology Concierge Help Desk (London).)

Posted by Tom Limoncelli in Stack Exchange, Inc.

Things you might not have known about Google Authenticator:

Copy and paste

If you press and hold the 6-digit number, it is copied to your clipboard.

Re-order the list

If you click the pencil to go into edit mode, you can change the order of the items.

I find this particularly important because I now have 12 different systems authenticating with this app, and only 4 fit on the screen of my tiny iPhone SE.

I've pushed the ones that I use the most to the top of the list. The Google-related services that generally authenticate via a notification asking "Is this you trying to log in?" are now all shifted to the end of the list, since I rarely need them.

As a result, I am able to authenticate in about half the time.

Posted by Tom Limoncelli in MiscSecurity

My team at Stack Overflow is looking to hire SREs with Windows experience, particularly administration of Microsoft SQL Server.

If you are a system administrator looking to move into more of an SRE position, this is an ideal opportunity.

Here's the job listing:

NOTE: While we are a remote-first team with team members all over the world, this position will have occasional datacenter work requirements, which means 1-hour travel time to our Jersey City, NJ datacenter is a requirement.

Posted by Tom Limoncelli

All Day DevOps is a global event held on the Internet on 17-Oct-2018: 24 hours of talks, over 100 speakers, all streaming online.

Registration is free!

I will be presenting my talk Stealing The Best Ideas From DevOps: Applying DevOps Outside Of SDLC

More info is at:

Posted by Tom Limoncelli in Speaking
