November 2010 Archives

O'Reilly "Cyber Monday" sale: 60% off all ebooks and videos!

Nov. 29 2010

Free to Choose Deal of the Day - Save 60% on ALL Ebooks & Videos! Use code: DDF2H http://oreil.ly/free2choose

Today is a great day to get Time Management For System Administrators for practically nothing!

Don't procrastinate! Use your time wisely and get the eBook version of TM4SA right now!

(This deal starts 12:01 am PT Monday, November 29 and runs for 24 hours)

Posted by Tom Limoncelli in Book News

ACM Queue magazine adds system administration articles!

Nov. 25 2010

If you read ACM Queue Magazine you'll notice a lot more articles relevant to system administrators over the next few months.

ACM Queue Magazine, known for being a great magazine for the computer science practitioner, has decided to reach out to system administrators by adding articles about sysadmin topics. (I've helped curate the first set of articles you'll see.) There is an increasing trend where software developers need to know more and more about the systems their code runs on. System administrators increasingly strive to automate as much as possible.

The Association for Computing Machinery (ACM) is widely recognized as the premier membership organization for computing professionals. If you are a computer scientist, you are probably a member already. ACM used to be very stuffy and academic. Over the last 10 years their magazine has changed dramatically to be readable by most practitioners. I find their coverage of cutting edge topics to be excellent. Their industry news is more useful to me since it is less influenced by vendors.

Also, you'll see some articles that are for software developers but explain to them, well, the things that sysadmins wish they knew. For example, this issue has an article called "Virtualization: Blessing or Curse?" by Evangelos Kotsovinos.

Check it out here: http://queue.acm.org

(excellent reading for the long weekend)

Posted by Tom Limoncelli in Community

An active technical social life

Nov. 15 2010

At LISA last week I heard the phrase "it is important to have an active technical social life". I'd never heard that phrase before but it really clicked with me.

I'm interested in hearing what it means to you.

Posted by Tom Limoncelli

LISA papers about robots that correct broken configuration files.

Nov. 10 2010

Ok, not actual robots but hear me out. I'm sitting in a session at LISA 2010 right now (Wednesday, 2pm Papers session) where all 3 papers are about systems that analyse some kind of configuration file and, given some tests, can find a problem and fix it. The three papers cover (in order): router configuration, Role Based Access Control database, and firewall rule set.

In the third paper (the one about firewall rules, which happens to have won "Best Student Paper"), the authors have a number of ways to manipulate rules in an effort to fix a broken configuration: swap two rules, delete a rule, change an IP addr in a rule, etc. They've invented a novel way to search all the possible combinations of these things so as to quickly find their way to a modified version of the ruleset that works. Pretty cool.
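To make the idea concrete, here is a minimal sketch of that kind of search in Python. It is not the paper's algorithm (theirs is far smarter about which combinations to try); it is a plain breadth-first search over single edits, assuming some check that says whether a candidate ruleset is acceptable. The rule format and the names `neighbours`, `repair`, `decide`, and `works` are all invented here for illustration.

```python
from itertools import combinations

# Hypothetical rule format: (action, source, port). First match wins.

def neighbours(rules):
    """Yield every ruleset reachable by one edit: swap two rules or delete one."""
    for i, j in combinations(range(len(rules)), 2):
        swapped = list(rules)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        yield swapped
    for i in range(len(rules)):
        yield rules[:i] + rules[i + 1:]

def repair(rules, works, max_edits=2):
    """Breadth-first search for the fewest edits that make works(rules) true."""
    frontier = [list(rules)]
    seen = {tuple(rules)}
    for _ in range(max_edits + 1):
        next_frontier = []
        for candidate in frontier:
            if works(candidate):
                return candidate
            for edited in neighbours(candidate):
                key = tuple(edited)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append(edited)
        frontier = next_frontier
    return None  # nothing acceptable within max_edits

def decide(rules, src, port):
    """Toy evaluator (an assumption): first matching rule wins, default deny."""
    for action, rule_src, rule_port in rules:
        if rule_src in ("any", src) and rule_port == port:
            return action
    return "deny"

# Broken on purpose: the broad allow shadows the specific deny.
broken = [("allow", "any", 80), ("deny", "1.2.3.4", 80)]

def works(rules):
    return (decide(rules, "1.2.3.4", 80) == "deny"
            and decide(rules, "5.6.7.8", 80) == "allow")

fixed = repair(broken, works)
print(fixed)
```

In this toy case a single swap is enough: the specific deny moves ahead of the broad allow. A brute-force search like this blows up quickly on real rulesets, which is exactly why a smarter search is interesting.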

But wait, how do you know if a ruleset "works"? Well, you generate a bunch of tests and run them through the system. If they all pass, the ruleset "works". A test might be "a packet with src address 1.2.3.4, port 80 should NOT get to 1.4.4.4 port ANY". You can have a human write them. But they went further. They invented an algorithm to generate a nearly-optimal number of test patterns (for example, if you have a /24, you can deduce cases where you don't need to test all 256 addresses in the /24). A human needs to say whether these tests should permit or reject the packet, but at least the generation is automated.

And how do they test this? They invented an algorithm to efficiently apply all the tests to a ruleset at the same time. The ruleset is turned into a decision tree, and the tests can now be processed very quickly.
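The /24 trick can be illustrated with a small Python sketch. This is not the paper's test-generation algorithm, just the simplest version of the underlying idea: if no rule boundary falls between two source addresses, every rule treats them the same way, so one representative per region is enough. The function name `representative_sources` and the sample networks are assumptions made for this example.

```python
import ipaddress

def representative_sources(networks):
    """Pick one source address per region that no rule boundary splits."""
    cuts = {0, 2**32}  # start and end of the IPv4 address space
    for text in networks:
        net = ipaddress.ip_network(text)
        cuts.add(int(net.network_address))
        cuts.add(int(net.broadcast_address) + 1)
    ordered = sorted(cuts)
    # Every rule treats all addresses in [start, next_start) identically,
    # so the first address of each interval stands in for the whole interval.
    return [str(ipaddress.ip_address(start)) for start in ordered[:-1]]

# Hypothetical networks mentioned by the rules in a ruleset.
reps = representative_sources(["10.0.0.0/24", "10.0.0.128/25"])
print(reps)
```

Here four source addresses cover the same ground as testing every address individually. A human still has to say whether each representative packet should be permitted or rejected.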

You'll have to read the paper for all the details.

...but...

But the point of this blog post is not to explain these papers in detail. What I really want to say is that I find it very interesting that people are applying AI techniques to finding and fixing configuration problems. 3 papers in one conference? It could be a trend.

And lastly, if we can write a lot of "unit tests" and automatically fix a firewall ruleset so that it comes into compliance with them, then my question is: at what point will we just write a heck of a lot of tests and let some AI robot write out firewall rules from scratch?

I, for one, welcome our new AI firewall-rule-writing overlords!

Posted by Tom Limoncelli in Conferences

TPOSANA Reading Group started!

Nov. 09 2010

The Mind Of Root podcast has announced that they are doing a study group for The Practice of System and Network Administration. Each week people will read one chapter and discuss it on the podcast.

They are starting with chapter 2. Start reading now and be ready to discuss it on Nov 11.

Click here to get started.

I'm really excited about this project. When they contacted me and asked if it was "ok", I was elated! I look forward to listening to the podcasts and wish them the best!

Tom

Posted by Tom Limoncelli in Book News

OpsCamp Silicon Valley

Nov. 05 2010

Tuesday, November 9, 9:00 a.m.-5:00 p.m., Room C3/4, San Jose Convention Center

OpsCamp is an event for the open exchange of ideas around next-gen technologies and strategies for IT Operations. With the rapid change occurring in our industry, we need a place we can meet to share our experiences, challenges, and solutions. OpsCamp is organized in an unconference format.

http://www.opscamp.org/siliconvalley

(this is part of Usenix LISA, but doesn't require registration to attend)

Posted by Tom Limoncelli in Conferences

LOPSA 5th birthday party!

Nov. 01 2010

I can't believe it has been 5 years already! LOPSA turns 5 years old this week. The New Jersey chapter (which is not yet 5 years old) is having a party on Thursday to celebrate.

Everyone is invited. There will be pizza and I may have some prizes to give away.

http://lopsanj.org/node/686

Location:

Lawrence Headquarters Branch of the Mercer County Library
2751 US Highway 1
Lawrenceville, NJ, 08648-4132
See map: Google Maps
Date: Thursday November 4th, 2010
Time: 7:00 PM - 7:20 PM - Social Time
7:20 PM - 7:30 PM - LOPSA-NJ Business and Announcements
7:30 PM - 9:00 PM - Main Presentation

PLEASE RSVP so we know how much food to bring. RSVP info is here: http://lopsanj.org/node/686

Posted by Tom Limoncelli in Community

An open letter to the LISA workshop on Teaching System Administration

Nov. 01 2010

Since I cannot attend the LISA Workshop on Teaching System Administration (I'll be teaching system administration that day!), I'd like to take a moment to say something to the attendees.

Often we are in the thick of things and we lose sight of how valuable our work is.

What you are doing is incredibly important; maybe more important than you realize. IT isn't just important, it is scary-important. The usual old sayings about how important IT is are now obsolete. It isn't that IT is a part of how food gets from the farm to our plate; we, as a society, no longer know how to provide food without IT. Medicine isn't just billed and administered with the assistance of IT; we can't provide medical services without IT anymore. Sysadmins are not just "important"; the existence of excellence in system administration is key to sustaining civilization as we know it.

Those teaching system administrators need to step up to the plate. Our world depends on you.

It is time for an organization to take a leadership role in defining a standard sysadmin curriculum and get it adopted at all 4-year and 2-year schools. The 2-year training is embarrassingly bad. The 4-year training is bad to mediocre.[1]

Students are graduating from 4-year programs without understanding the internals of systems or how they are used en masse in the real world. This would be like auto mechanics not being taught how an internal combustion engine works, or doctors somehow graduating medical school without knowing that patients are alive between office visits.

10% of us know the right way to do things. The other 90% don't. Why the uneven distribution of knowledge? The trouble this brings is far reaching. Sarbanes-Oxley essentially says, "If you are going to be so unbelievably stupid as to do backups without testing them, create accounts without having a mechanism to make sure they are disabled when the employee leaves, and let developers have unrestricted raw access to live databases, then we're going to legislate how you have to do your job." HIPAA essentially says that our industry has proven itself too incompetent to be trusted with securing databases or WiFi networks in hospitals. Therefore how to do our jobs is being written into legislation.[2]

What's next? What will be the next example of rampant incompetence that leads to more legislation that tells us how we have to do our jobs? What crap caused by the worst of us will ruin it for the rest of us? What other obvious best practice that sites somehow still successfully ignore will become required by law? "Have a helpdesk that doesn't suck"? "Track your customer requests with a 'ticket' system"? "Buy load balancers in pairs"? "Ping a machine after you've unplugged it to make sure you unplugged the right one"? "Lock your screen when you leave your desk"? Many of these were "rocket science" 10 years ago. Now it's just embarrassing to see IT teams that are blind to these ideas.

This is a problem that is bigger than any one person can solve. You and I know this. We've written books to try to educate, but how much can one person do? This is the greatest challenge our industry has ever faced. This is the kind of thing that requires group effort.

Creating such a curriculum would take a long time, and getting it widely adopted even longer. However, with the power of Usenix, the expertise of LOPSA, and the academic ubiquity of ACM, this could really happen.

I hope that the members of the workshop take the time to think big.

Things don't get better on their own.

Sincerely, Tom Limoncelli

[1] These are based on indirect experience. The truth is that we don't have a measure for how to quantify if a school is doing a good job. First we need a standard to measure institutions by, then we need to go around measuring institutions. Providing a self-evaluation kit would even be a major step forward.

[2] One might say that it is the executive management of hospitals that is to blame. I disagree. We are at fault for not being able to explain the issue in a way that gets executive attention. Worse, often we are at companies that are selling systems with known problems. Why do we even offer a known-bad solution? Is it our own ignorance, or is it like the consultant I once saw explain 3 options to a customer, one of which he pointed out that he recommended against? Of course the customer wanted the one he was recommending against. Why did he even mention that option? It wasn't an option. The customer wouldn't have thought of it on their own. It was a counter-example that you turned into an option. Knucklehead!

Posted by Tom Limoncelli in Checklists, Professionalism, Teaching System Administration
