Learn your operating system's internals

"Dear Tom, I'm a junior sysadmin and want to be more knowledgeable about the operating systems I administer. I get the feeling that a lot of my co-workers run on myth, superstitions, and folklore when it comes to their job and I want to be better. Sincerely, The Truth Is In There"

Dear Truth,

I applaud your quest to avoid superstition in your role as system administrator. Every time I fix a problem by rebooting (rather than knowing the real cause and fixing it) I feel a little bit of me dies inside. It hurts our industry and our profession when we develop bad habits like guessing instead of knowing.

There are three topic areas that are complicated, misunderstood, and therefore prone to folklore: the memory subsystem, the file subsystem, and processes. If I had to add a fourth it would be the security subsystem, but often understanding the first three is a prerequisite to fully understanding security.

Memory is complicated: virtual memory, swapping, and so on all interact. Whether you tune a system with or without understanding how these really work (vs. what we were taught in school) is the difference between success and failure. Understanding how modern memory systems work can result in a 9x performance improvement.
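One piece of memory folklore is that allocating memory immediately consumes RAM. Here is a minimal sketch of demand paging, assuming a Linux system with /proc mounted. The function names are made up for this illustration, and the exact counts will vary from machine to machine.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")  # page size in bytes (often 4096)

def resident_pages():
    """Resident set size of this process, in pages (Linux /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])

def demand_paging_demo(n_pages=4096):
    """Map n_pages of anonymous memory, then touch every page.

    Returns resident page counts (before mapping, after mapping, after touching).
    """
    before = resident_pages()
    # Reserve virtual address space; no physical pages are needed yet.
    region = mmap.mmap(-1, n_pages * PAGE, flags=mmap.MAP_PRIVATE)
    after_map = resident_pages()
    # Writing one byte per page forces the kernel to back each page with RAM.
    for offset in range(0, n_pages * PAGE, PAGE):
        region[offset] = 1
    after_touch = resident_pages()
    region.close()
    return before, after_map, after_touch

if __name__ == "__main__":
    before, after_map, after_touch = demand_paging_demo()
    print("resident pages before mmap:", before)
    print("resident pages after mmap: ", after_map)
    print("resident pages after touch:", after_touch)
```

On a typical Linux box the resident count barely moves after the mapping is created and jumps only once the pages are written, which is why "allocated" and "in use" are not the same number.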

Knowing how the filesystem works is as important to a sysadmin as knowing anatomy is to a doctor. Knowing the filesystem begins with understanding how data is laid out on the disk (blocks and tracks), how files and directories are organized (what's stored in the directory structure, for example), and how the file system is buffered and how it interacts with the memory system. Ever since the OS concept of "unified memory and file systems" arrived, good performance has come from a tight integration of the memory and file systems. Also, the file system dictates the namespace of the OS, which affects everything else. Do you know what kind of access is slow in your operating system's namespace? You should.
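To make "what's stored in the directory structure" concrete: on Unix-style filesystems a directory entry is essentially a name paired with an inode number, and the file's metadata lives in the inode. The following is a small sketch, assuming a POSIX system whose temporary directory supports hard links. The helper names and file names are invented for this example.

```python
import os
import tempfile

def directory_entries(path):
    """Map each name in a directory to the inode number its entry points at."""
    return {entry.name: entry.inode() for entry in os.scandir(path)}

def hard_link_demo():
    """Create a file plus a hard link to it; return (entries, link count)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("hello\n")
        # A hard link adds a second directory entry for the same inode.
        os.link(original, alias)
        entries = directory_entries(d)
        # The inode, not the directory, keeps count of how many names refer to it.
        link_count = os.stat(original).st_nlink
        return entries, link_count

if __name__ == "__main__":
    entries, link_count = hard_link_demo()
    for name, inode in sorted(entries.items()):
        print(name, "-> inode", inode)
    print("link count:", link_count)
```

Both names should report the same inode number, which is why deleting one name does not free the data while another name still refers to it.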

A deep knowledge of how processes work is important because sysadmins are often required to debug problems that happen at the "edge cases" of processes: some weird scheduling mishap because there isn't enough memory for all processes and the "wrong" process gets swapped out, developers who come to you unsure why their new software release creates zombie processes, and so on.
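The zombie case is easy to reproduce once you know the cause: a child that has exited stays in the process table until its parent collects the exit status. Below is a minimal sketch, assuming Linux (it reads /proc and uses os.waitid). The function names and the exit code 7 are arbitrary choices for the illustration.

```python
import os

def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name sits in parentheses and may contain spaces or ')',
    # so split on the last ')' and take the field that follows it.
    return data.rsplit(")", 1)[1].split()[0]

def make_zombie():
    """Fork a child that exits at once; return (pid, state before reaping, exit code)."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state = proc_state(pid)  # "Z" (zombie) while nobody has collected the status
    # Reaping the child removes its process-table entry.
    _, status = os.waitpid(pid, 0)
    return pid, state, os.WEXITSTATUS(status)

if __name__ == "__main__":
    pid, state, code = make_zombie()
    print(f"child {pid} was in state {state} before being reaped, exit code {code}")
```

A release that "creates zombies" is usually a parent that forks children and never calls wait (or otherwise arranges for them to be reaped), so the fix belongs in the parent, not in the children.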

Here are my suggestions on the best books in this category:

While you may not be a FreeBSD user, that book is excellent to read no matter what operating system you use. It is used as a textbook in many schools because it teaches the fundamental underpinnings of operating system design. If you use a POSIX system, consider reading it.

I also suggest "TCP/IP Illustrated". While it is not about an operating system, it is my favorite book for learning how TCP/IP works: from ARP and ping, to telnet, to all those funny TCP sliding window issues. This book (and the 2 sequels) is an amazing tour of how the protocols you use every day work.

Hope that helps,
Tom

Posted by Tom Limoncelli in Career Advice

6 Comments

For Linux administrators, I can't recommend reading LWN's Weekly Edition enough. It's published every Thursday, and each edition becomes available for free a week after it's published. It's like getting a lecture from a group of experienced sysadmins and kernel developers every week.

Tom,

You should include your book "The Practice of System and Network Administration" on that list of resources. While it may not go into the details of memory or processor optimization, it does cover something I think is just as critical. Enterprise management is crucial for understanding interoperability, compatibility, timing and time management, and so much more. Don't give yourself short shrift; this stuff is just as important and treated with just as much superstition and folklore as the system internals are. And you have advanced the field as much as anyone has.

For Solaris, there's no way around Solaris Internals (2nd ed). Armed with that, DTrace, and the source, you've got a head start.

Any reading material for Mac OS?

Paul, try "Mac OS X Internals: A Systems Approach". It is very good, although I'm pretty sure it's very dated by now.

Since Linux Kernel Internals (2nd Edition) is a bit outdated and not highly rated on Amazon either, I was wondering whether The Linux Programming Interface: A Linux and UNIX System Programming Handbook would be a better book for understanding Linux internals.
Please advise.
