Recently in Teaching System Administration Category

You do? Then you should attend SESA '16 in Boston, Dec 6, 2016. For more info go to (I'm particularly looking forward to Nicole's opening keynote.)

I've always felt that most geeks give examples (to beginners) that are too complex. I believe this is an attempt to be complete. However, beginner examples should be simple, even if that makes you feel like you are committing lies of omission.

A recent Slashdot article, Revisiting Why Johnny Can't Code: Have We "Made the Print Too Small"?, mentioned that the examples we give are often too complex for the beginners we intend them for. It compares the starting example from Mark Zuckerberg's what-is-coding video to a simple BASIC example. It also draws a comparison to the book How to Teach Your Baby to Read, whose authors explain, "It is safe to say that in particular very young children can read, provided that, in the beginning, you make the print very big."

In other words: Know your audience.

Many times I've seen people introduce a new system by boasting how it can solve sophisticated problems and start with the most bizarre, complex example. They instantly lose the audience. The first impression they've made is "this is too complex for me". Oops.

One of my favorite examples is the manual page for "find" in FreeBSD. The first example is:

find / \! -name "*.c" -print

What a shitty, mean example to put in front of a beginner. This example requires that the person understand globs, the need to quote "*", and the fact that many shells treat "!" as special, so it must be escaped. That's two different escaping methods in the same example! I imagine many people see \! and then are disappointed not to find \! mentioned anywhere else in the man page (to a new user \! is not !). Oh, and the example will get a user in trouble if they run it because it starts at "/" and, if they are on a machine with access to many NFS servers, will take days to run and may invoke the ire of their sysadmins. Good job, FreeBSD!

Here's a better first example of "find":

find /tmp -name foo.c -print
    Print out a list of all files named "foo.c"
    in /tmp or any subdirectory.

A good second example would introduce exactly one new concept, such as globs:

find . -name '*.c' -print
    Print out all files whose name ends with
    .c in this directory and any subdirectories.

I would then add the "not" concept:

find . \! -name '*.c' -print
    Print out all files whose name does not end
    with .c in this directory and any subdirectories.
    Note that "!" is escaped because many shells
    treat it as a special character.
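A beginner can try all three examples safely in a throwaway directory. The following is only a sketch: the file names (foo.c, bar.c, notes.txt, sub) are made up for the demonstration, and it assumes mktemp -d is available, as it is on most Linux and BSD systems. Each step adds exactly one concept to the one before it.

```shell
#!/bin/sh
# Build a small sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
mkdir "$sandbox/sub"
touch "$sandbox/foo.c" "$sandbox/notes.txt" "$sandbox/sub/foo.c" "$sandbox/sub/bar.c"
cd "$sandbox" || exit 1

# Example 1: match an exact name.
exact=$(find . -name foo.c -print | LC_ALL=C sort)

# Example 2: add one concept, a glob, single-quoted so the shell leaves it alone.
glob=$(find . -name '*.c' -print | LC_ALL=C sort)

# Example 3: add one more concept, negation; "!" is escaped for the shell.
negated=$(find . \! -name '*.c' -print | LC_ALL=C sort)

printf 'exact name:\n%s\n\n' "$exact"
printf 'glob:\n%s\n\n' "$glob"
printf 'negated glob:\n%s\n' "$negated"

# Clean up the sandbox.
cd / && rm -rf "$sandbox"
```

The negated list includes "." and "./sub" as well as notes.txt, because directories are also things whose names do not end in .c.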

Notice that I changed " to '. Don't start people using double quotes. That leads to security problems. Get them in the habit of using single quotes from the start.
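One way to see the difference between the two quoting styles is the minimal sketch below. The variable name demo_user is made up for illustration. Inside double quotes the shell still expands variables and command substitutions before the command ever sees the text; inside single quotes every character is passed through literally.

```shell
#!/bin/sh
# Hypothetical variable, used only for this illustration.
demo_user='alice'

# Double quotes: $demo_user and $(...) are expanded by the shell.
double="files for $demo_user: $(echo expanded)"

# Single quotes: nothing is expanded; the text is taken literally.
single='files for $demo_user: $(echo expanded)'

printf '%s\n' "$double"   # prints: files for alice: expanded
printf '%s\n' "$single"   # prints: files for $demo_user: $(echo expanded)
```

With single quotes there is nothing for the shell to reinterpret, so what you typed is exactly what find receives.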

The examples should cover the most common use-cases, not just show off how to use various features.

One of the most desired use-cases is to have find skip certain files or directories, especially if you use Git or Subversion. To do this one must use -prune, which doesn't work as most people would expect. So what is the first example to do such a thing?

find /usr/src -name CVS -prune -o -depth +6 -print
    Find files and directories that are at least
    seven levels deep in the working directory /usr/src.

Not only is that overly complex, but the description is useless to anyone looking for "skip directories".

The first example of -prune should be very simple and amazingly practical. Just skip one or more directories:

find . -name .git -prune -o -print
    List all files, but skip any subdirectories called .git

find . -type d \( -name .git -o -name .svn \) -prune -o -print
    List all files, but skip any
    subdirectories called .git or .svn.
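To watch -prune do its job without touching a real repository, here is a sketch that runs the single-directory form against a throwaway tree. The directory layout (project, src, README) is invented for the demonstration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Build a throwaway tree that looks a little like a Git checkout.
# All names are hypothetical.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project/.git/objects" "$sandbox/project/src"
touch "$sandbox/project/.git/config" \
      "$sandbox/project/src/main.c" \
      "$sandbox/project/README"
cd "$sandbox/project" || exit 1

# Prune anything named .git; print everything else.
listing=$(find . -name .git -prune -o -print | LC_ALL=C sort)
printf '%s\n' "$listing"

# Clean up the sandbox.
cd / && rm -rf "$sandbox"
```

Nothing under .git appears in the listing, and .git itself is absent too: the pruned entry satisfies the left side of -o, so the -print on the right side never runs for it.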

The other most common use case of find is to run a command on each file found. In this case the description is confusing to a new person:

find / -type f -exec echo {} \;
    Use the echo(1) command to print out a
    list of all the files.

Would it be so difficult to simply say:

    Run the echo(1) command on each file found.

Linux man pages are equally guilty. The man page for find on CentOS 7 starts with examples that delete files, and has a security hole in it:

find /tmp -name core -type f -print | xargs /bin/rm -f

Yes, the next example explains and fixes the security hole, but why start with an example that you wouldn't want users to blindly cut and paste?
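To make the problem concrete, here is a sketch of one way such a pipeline can go wrong, run entirely inside a throwaway directory. The names "my" and "my dir" are made up for the demonstration. It assumes a find and xargs that support -print0 and -0, which the GNU and BSD versions do. By default xargs splits its input on whitespace, so a path containing a space turns into two arguments.

```shell
#!/bin/sh
# Throwaway sandbox; every file name here is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir 'my dir'
touch 'my dir/core'   # the file we actually want to delete
touch my              # an unrelated file that must survive

# Unsafe: xargs splits "./my dir/core" at the space into "./my" and "dir/core".
find . -name core -type f -print | xargs rm -f
[ -e my ] && unsafe_result='my survived' || unsafe_result='my was deleted'

# Reset, then use NUL-terminated names so each path stays in one piece.
touch my
find . -name core -type f -print0 | xargs -0 rm -f
[ -e my ] && safe_result='my survived' || safe_result='my was deleted'
[ -e 'my dir/core' ] && core_result='core still there' || core_result='core removed'

printf '%s\n' "unsafe pipeline: $unsafe_result"
printf '%s\n' "safe pipeline: $safe_result, $core_result"

# Clean up the sandbox.
cd / && rm -rf "$sandbox"
```

The unsafe pipeline deletes a file that never matched -name core and leaves the real target in place. Anyone who can create oddly named files in /tmp can aim that mistake at a file of their choosing.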

The same man page lists this example for running a program on each file found:

find . -type f -exec file '{}' \;

Is "file" a command, a keyword, or are you supposed to replace it with the name of a file? Ugh. Why pick the one command whose name is so overloaded? What's wrong with echo, stat, or sha256sum?

find . -type f -exec sha256sum '{}' \;
    Run sha256sum(1) on each file found.

I've raised this issue with FreeBSD and Linux developers. One told me, "Man pages shouldn't be tutorials". That's a rationalization to cover up bad behavior. There is a big difference between a comprehensive tutorial, as would be appropriate for a book or video series, and having thoughtful examples.

Posted by Tom Limoncelli in Teaching System Administration

Since I cannot attend the LISA Workshop on Teaching System Administration (I'll be teaching system administration that day!), I'd like to take a moment to say something to the attendees.

Often we are in the thick of things and we lose sight of how valuable our work is.

What you are doing is incredibly important; maybe more important than you realize. IT isn't just important, it is scary-important. The usual old sayings about how important IT is are now obsolete. It isn't just that IT is a part of how food gets from the farm to our plate; we, as a society, no longer know how to provide food without IT. Medicine isn't just billed and administered with the assistance of IT; we can't provide medical services without IT anymore. Sysadmins are not just "important"; the existence of excellence in system administration is key to sustaining civilization as we know it.

Those teaching system administrators need to step up to the plate. Our world depends on you.

It is time for an organization to take a leadership role in defining a standard sysadmin curriculum and get it adopted at all 4-year and 2-year schools. The 2-year training is embarrassingly bad. The 4-year training is bad to mediocre.[1]

Students are graduating from 4-year programs without understanding the internals of systems or how they are used en masse in the real world. This would be like auto mechanics not being taught how an internal combustion engine works, or doctors somehow graduating from medical school without knowing that patients are alive between office visits.

10% of us know the right way to do things. The other 90% don't. Why the uneven distribution of knowledge? The trouble this brings is far-reaching. Sarbanes-Oxley essentially says, "If you are going to be so unbelievably stupid as to do backups without testing them, create accounts without having a mechanism to make sure they are disabled when the employee leaves, and let developers have unrestricted raw access to live databases, then we're going to legislate how you have to do your job." HIPAA essentially says that our industry has proven itself too incompetent to be trusted with securing databases or WiFi networks in hospitals. As a result, how to do our jobs is being written into legislation.[2]

What's next? What will be the next example of rampant incompetence that leads to more legislation that tells us how we have to do our jobs? What crap caused by the worst of us will ruin it for the rest of us? What other obvious best practice that sites somehow still successfully ignore will become required by law? "Have a helpdesk that doesn't suck"? "Track your customer requests with a 'ticket' system"? "Buy load balancers in pairs"? "Ping a machine after you've unplugged it to make sure you unplugged the right one"? "Lock your screen when you leave your desk"? Many of these were "rocket science" 10 years ago. Now it's just embarrassing to see IT teams that are blind to these ideas.

This is a problem that is bigger than any one person can solve. You and I know this. We've written books to try to educate, but how much can one person do? This is the greatest challenge our industry has ever faced. This is the kind of thing that requires group effort.

Creating such a curriculum would take a long time, and getting it widely adopted even longer. However, with the power of Usenix, the expertise of LOPSA, and the academic ubiquity of ACM, this could really happen.

I hope that the members of the workshop take the time to think big.

Things don't get better on their own.

Sincerely, Tom Limoncelli

[1] These assessments are based on indirect experience. The truth is that we don't have a way to quantify whether a school is doing a good job. First we need a standard to measure institutions by, then we need to go around measuring institutions. Even providing a self-evaluation kit would be a major step forward.

[2] One might say that it is the executive management of hospitals that is to blame. I disagree. We are at fault for not being able to explain the issue in a way that gets executive attention. Worse, often we are at companies that are selling systems with known problems. Why do we even offer a known-bad solution? Is it our own ignorance? Or is it like the consultant I once saw explain three options to a customer, one of which he pointed out he recommended against? Of course the customer wanted the one he was recommending against. Why did he even mention that option? It wasn't an option. The customer wouldn't have thought of it on their own. It was a counter-example that you turned into an option. Knucklehead!
