An Illustrated Guide to SSH Agent Forwarding

I don't think I really understood SSH "Agent Forwarding" until I read this in-depth description of what it is and how it works:

http://www.unixwiz.net/techtips/ssh-agent-forwarding.html

In fact, I admit I had been avoiding using this feature because it adds a security risk and it is best not to use something risky without knowing the internals of why it is risky.

Now that I understand it and can use it, I find it saves me a TON of time. Highly recommended (when it is safe to use, of course!)

Tom

Posted by Tom Limoncelli at April 26, 2012 12:00 PM | Comments (0) | TrackBack

ACM Queue Programming Challenge starts soon!

ACM Queue is hosting an online programming competition on its website from January 15 through February 12, 2012.

Using either Java, C++, C#, Python, or JavaScript, code an AI to compete against other participants' programs in a territory-capture game called "Coercion".

The competition is open to everyone.

Details at: http://queue.acm.org/icpc/

Posted by Tom Limoncelli at January 11, 2012 10:47 AM | Comments (0) | TrackBack

Config Management Rosetta Stone

Yesterday on the SysAdvent calendar Aleksey Tsalolikhin had an article about configuration management. It includes a comparison of how to do the same thing in various languages: bash, CFEngine, Chef and Puppet. Seeing how the languages differ is very interesting!

SysAdvent: December 19 - Configuration Management

Posted by Tom Limoncelli at December 20, 2011 3:54 PM | Comments (0) | TrackBack

Two interesting Python tutorials

A great explanation about "yield" followed by a discussion of coroutines and more:

In the sequel, he goes into even more detail and then uses all the information to write an operating system in Python.

Posted by Tom Limoncelli at December 15, 2011 10:53 AM | Comments (1) | TrackBack

SSH Fabric for ssh'ing to many hosts

Fabric is a new tool for ssh'ing to many hosts. It has some nice properties, such as lazy execution. You write the description of what is to be done in Python and Fabric takes care of executing it on all the machines you specify. Once you've used it a bunch of times you'll accumulate many "fab files" that you can re-use. You can use it to create large systems too. The API is simple but powerful.

The tutorial gives you a good idea of how it works: http://docs.fabfile.org/en/1.2.2/tutorial.html

It is written using the Paramiko module which is my favorite way to do SSH and SSH-like things from Python.

The Fabric homepage is: http://www.fabfile.org

Thanks to Joseph Kern for this tip!

Posted by Tom Limoncelli at November 8, 2011 12:30 PM | Comments (2) | TrackBack

python-gflags: version 1.6 released

The Google flags parser (available for Python and C++) is very powerful. I use it for all my projects at work (of course) and since it has been open sourced, I use it for personal projects too.

While I support open source 100% I rarely get to submit much code into other people's projects (I contribute to documentation more than code... go figure). So, even though it is only a few lines of new code, I do want to point out that the 1.6 release of the Python library has actual code from me.

One of the neat features of this flags library is that you can specify a file to read the flags from. That is, if your command line is too long, you can stick all or some of the flags in a file and specify "--flagfile path/to/file.flags" to have them treated as if you put them on the command line. Imagine having one flags file that you use in production and another one that points the server to a test database using a higher level of debug verbosity and enabling beta features. You can specify multiple files even with overlapping flags and it does the right thing, keeping the last value.

My patch was pretty simple. I discovered, through a painful incident, that flagfiles were silently skipped if they were not readable. No warning, no error message. (You can imagine that my discovery was during a frantic "why is this not working???" afternoon.) Anyway... now you get an error instead and the program stops (in Python terms... it raises an exception). I think the unit tests are bigger than the actual code but I'm glad the patch was accepted. I hope nobody was depending on this bug as a "feature". Seriously... nobody would turn off flags via "chmod 000 filename.flags", right? So far I haven't gotten any complaints.
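The flagfile behavior described above can be sketched in a few lines of Python. This is a simplified illustration using only the standard library, not the actual python-gflags code; the function name `load_flagfiles` and the one-flag-per-line `--name=value` file format are assumptions made for the example.

```python
def load_flagfiles(paths):
    """Merge flags from several files; later files override earlier ones.

    Hypothetical helper, not part of python-gflags. Each non-blank,
    non-comment line is assumed to look like --name=value.
    """
    flags = {}
    for path in paths:
        # open() raises an OSError subclass if the file is missing or
        # unreadable, so a bad flagfile stops the program instead of
        # being silently skipped.
        with open(path) as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                name, _, value = line.lstrip("-").partition("=")
                flags[name] = value  # the last value seen wins
    return flags
```

With a production file and a test file that both set the same flag, the file listed last decides the value, and a "chmod 000" flagfile produces an exception rather than a quiet surprise.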

Anyway... if you write code in C++ or Python I highly recommend you give gflags a try. Both are available under the New BSD License on Google Code:

Enjoy!

--Tom

Posted by Tom Limoncelli at September 16, 2011 10:00 AM | Comments (0) | TrackBack

Random technical tips, thoughts and rants

  1. On a Mac, if you SHIFT-CLICK the green dot on a window it opens it as wide and tall as possible (instead of the application-defined behavior)

  2. Even though "ls -l" displays a file's permissions as "-rw-r--r--", you can't use "-rw-r--r--" in a chmod command. This is probably one of the most obvious but overlooked UI inconsistencies in Unix that nobody has fixed after all these years. Instead we force people to learn octal and type 0644. Meanwhile every book on Unix/Linux spends pages explaining octal just for this purpose. Time would have been better spent contributing a patch to chmod.

  3. If a network problem always happens 300 seconds after an event (like a VPN coming up or a machine connecting to the network) the problem is ARP, which has to renew every 300 seconds. Similarly, if it times out after exactly 2 hours, the problem is your routing system which typically expires routes after 2 hours of not hearing them advertised.

  4. Git rocks. I should have converted from SubVersion to Git years ago. Sadly I like the name SubVersion better. I hear Hg / Mercurial is better than Git, but Git had better marketing.

  5. Keep all your Unix "dot files" in sync with http://wiki.eater.org/ocd (and I'm not just saying that because my boss wrote it).

  6. People that use advanced Python-isms should not complain when I use features that have been in bash forever and, in fact, were in /bin/sh before most of us knew how to read.

  7. Years ago IETF started telling protocol inventors to avoid using broadcasts and use "local multicast" instead because it will help LAN equipment vendors scale to larger and larger LANs. If your LAN network vendor makes equipment that goes south when there is a lot of multicast traffic because it is "slow path'ed" through the CPU, remind them that They're Doing It Wrong.

  8. The best debugging tool in the world is "diff". Save the output to /tmp/old. As you edit your code, write the output to /tmp/new then do "diff /tmp/old /tmp/new". When you see the change you want, you know you are done. Alternatively edit /tmp/old to look like the output you want. You've fixed the bug when diff has no output.

  9. Attend your local sysadmin conference. Regional conferences are your most cost effective career accelerator. You will learn technical stuff that will help you retain your job, do your job better, get promoted, or find a new job. Plus, you'll make local friends and contacts that will help you more than your average call to a vendor tech support line. There are some great ones in the Seattle and NJ/NY/Philly area all listed here.
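As an aside on tip 2, the translation from the "ls -l" notation to octal is mechanical. Here is a minimal Python sketch of it; the function name `mode_string_to_octal` is made up for this example, and it only handles the plain r, w and x bits.

```python
def mode_string_to_octal(mode_string):
    """Convert an ls-style string such as "-rw-r--r--" to octal like "0644".

    Only the plain r, w, and x bits are handled; setuid, setgid, and
    sticky markers (s, S, t, T) are rejected to keep the sketch short.
    """
    if len(mode_string) != 10:
        raise ValueError("expected 10 characters, e.g. -rw-r--r--")
    bits = mode_string[1:]  # drop the file-type character
    digits = []
    for start in range(0, 9, 3):  # user, group, other
        value = 0
        for char, expected, weight in zip(bits[start:start + 3], "rwx", (4, 2, 1)):
            if char == expected:
                value += weight
            elif char != "-":
                raise ValueError("unsupported permission character: " + char)
        digits.append(str(value))
    return "0" + "".join(digits)
```

The result can be handed to os.chmod() as int(mode_string_to_octal("-rw-r--r--"), 8).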

Posted by Tom Limoncelli at February 16, 2011 10:10 AM | Comments (8) | TrackBack

Don't make your own patch cables.

True story:

At my first job out of college we made our own patch cables. Usually we'd make them "on demand" as needed for a new server or workstation. My (then) boss didn't want to buy patch cables even though we knew that we weren't doing a perfect job (we were software people, eh?). Any time we had a flaky server problem it would turn out to be the cable... usually one made by my (then) boss. When he left the company the first policy change we made was to start buying pre-made cables.

That was during the days of Category 3 cables. You can make a Category 3 cable by hand without much skill. With Category 5 and 6 the tolerances are so tight that just unwinding a pair too far (for example, to make it easier to crimp) will result in enough interference that you'll see errors. It isn't just "having the right tools". An Ohm Meter isn't the right testing tool. You need to do a series of tests that are well beyond simple electrical connectivity.

That's why it is so important to make sure the cables are certified. It isn't enough to use the right parts, you need to test it to verify that it will really work. There are people that will install cable in your walls and not do certification. Some will tell you they certified it but they really just plug a computer at each end; that's not good enough. I found the best way to know the certification was really done is have them produce a book of printouts, one from each cable analysis. Put it in the contract: No book, no payment. (and as a fun trick... the next time you do have a flaky network connection, check the book and often you'll find it just barely passed. You might not know how to read the graph, but you'll see the line dip closer to the "pass" line than on the other graphs.)

If your boss isn't convinced, do the math. Calculate how much you are paid in 10 minutes and compare that to the price of the pre-made cable.
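The math is short enough to script. This is only a sketch: the function names are invented here, and the 2080 working hours per year (52 weeks of 40 hours) is an assumption you should replace with your own figure.

```python
def cost_of_time(annual_salary, minutes, work_hours_per_year=2080):
    """Return what the given number of minutes of salaried time costs."""
    hourly_rate = annual_salary / work_hours_per_year
    return hourly_rate * minutes / 60


def buying_is_cheaper(annual_salary, minutes_to_make, cable_price):
    """True if a pre-made cable costs less than the time spent making one."""
    return cable_price < cost_of_time(annual_salary, minutes_to_make)
```

Plug in your own salary and the catalog price of a cable. The comparison does not even count the time lost debugging a flaky hand-made cable later.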

Posted by Tom Limoncelli at January 12, 2011 10:00 AM | Comments (7) | TrackBack

Review: The Duplicity Backup System

I needed a way to back up a single server to a remote hard disk. There are many scripts around, and I certainly could have written one myself, but I found Duplicity and now I highly recommend it:

http://duplicity.nongnu.org

Duplicity uses librsync to generate incremental backups that are very small. It generates the backups, GPG encrypts them, and then sends them to another server by all the major methods: scp, ftp, sftp, rsync, etc. You can back up starting at any directory, not just at mountpoints, and there is a full language for specifying files you want to exclude.

Installation: The most difficult part is probably setting up your GPG keys if you've never set them up before. (Note: you really, really need to protect the private key. It is required for restores. If you lose your machine due to a fire, and don't have a copy of the private key somewhere, you won't be able to do a restore. Really. I burned mine on a few CDs and put them in various hidden places.)

The machine I'm backing up is a virtual machine in a colo. They don't offer backup services, so I had to take care of it myself. The machine runs FreeBSD 8.0-RELEASE-p4 and it works great. The code is very portable: Python, GPG, librsync, etc. Nothing digs into the kernel or raw devices or anything like that.

I wrote a simple script that loops through all the directories that I want backed up, and runs:

duplicity --full-if-older-than 5W --encrypt-key="${PGPKEYID}" $DIRECTORY scp://myarchives@mybackuphost/$BACKUPSET$dir

The "--full-if-older-than 5W" means that it does an incremental backup, but a full backup every 35 days. I do 5W instead of 4W because I want to make sure no more than 1 full backup happens every billing cycle. I'm charged for bandwidth and fear that two full dumps in the same month may put me over the limit.
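For readers who prefer Python to shell, the loop could look roughly like the sketch below. It is not my actual script: the function names and parameters are invented for illustration, and only the duplicity options shown in the command above are used.

```python
import subprocess


def duplicity_command(directory, key_id, backup_host, backup_set):
    """Build the argument list for backing up one directory."""
    target = "scp://{}/{}{}".format(backup_host, backup_set, directory)
    return [
        "duplicity",
        "--full-if-older-than", "5W",
        "--encrypt-key=" + key_id,
        directory,
        target,
    ]


def backup_all(directories, key_id, backup_host, backup_set,
               run=subprocess.check_call):
    """Back up each directory in turn, stopping at the first failure."""
    for directory in directories:
        run(duplicity_command(directory, key_id, backup_host, backup_set))
```

Passing the command as a list avoids shell quoting problems, and the `run` parameter makes it easy to do a dry run by passing `print` instead of really starting duplicity.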

My configuration: I'm scp'ing the files to another machine, which has a cheap USB 2.0 1 TB hard disk. I set it up so that I can ssh from the source machine to the destination machine without need of a password ("PubkeyAuthentication yes"). In the example above "myarchives" is the username that I'm doing the backup to, and "mybackuphost" is the host. Actually I just specify the hostname and use a .ssh/config entry to set the default username to be "myarchives". That way I can specify "mybackuphost" in other shell scripts, etc. SSH aliases FTW!

Restores: Of course, I don't actually care about backups. I only care about restores. When restoring a file, duplicity figures out which full and incremental backups need to be retrieved and decrypted. You just specify the date you want (default "the latest") and it does all the work. I was impressed at how little thinking I needed to do.

After running the system for a few days it was time to do a restore to make sure it all worked.

The restore syntax is a little confusing because the documentation didn't have a lot of examples. In particular, the most common restore situation is not restoring the full backupset, but "I messed up a file, or think I messed it up, so I want to restore an old version (from a particular date) to /tmp to see what it used to look like."

What confused me: 1) you specify the path to the file (or directory) but you don't list the path leading up to the mountpoint (or directory) that was backed up. In hindsight that is obvious but it caught me. What saved me was that when I listed the files, they were displayed without the mountpoint. 2) You have to be very careful to specify where you put the backup set. You specify that on the command line as the source, and you specify the file to be restored in the "--file-to-restore" option. You can't specify the entire thing on the command line and expect duplicity to guess where to split it.

So that I don't have to re-learn the commands at a time when I'm panicking because I just deleted a critical file, I've made notes about how to do a restore. With some changes to protect the innocent, they look like:

Step 1. List all the files that are backed up to the "home/tal" area:

duplicity list-current-files scp://mybackuphost/directoryname/home/tal

To list what they were like on a particular date, add: --restore-time "2002-01-25"

Step 2. Restore a file from that list (not to the original place):

duplicity restore --encrypt-key=XXXXXXXX --file-to-restore=path/you/saw/in/listing scp://mybackuphost/directoryname/home/tal /tmp/restore

Assuming the old file was in "/home/tal/path/to/file" and the backup was done on "/home/tal", you need to specify --file-to-restore as "path/to/file", not "/home/tal/path/to/file". You can specify a directory to get all of its files. The /tmp/restore should be a directory that already exists.

To restore the files as of a particular date, add: --restore-time "2002-01-25"
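The path rule from Step 2 is the part I get wrong under pressure, so here is a small Python sketch that computes the --file-to-restore value. The helper name `file_to_restore` is made up for this example; it only does the path arithmetic and does not call duplicity.

```python
import os.path


def file_to_restore(original_path, backup_root):
    """Return the path relative to the directory that was backed up."""
    relative = os.path.relpath(original_path, backup_root)
    # A result that climbs out with ".." means the file was never part
    # of this backup set.
    if relative == os.pardir or relative.startswith(os.pardir + os.sep):
        raise ValueError(original_path + " is not inside " + backup_root)
    return relative
```

For the example above, file_to_restore("/home/tal/path/to/file", "/home/tal") gives the "path/to/file" form that duplicity expects.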

Conclusion: Duplicity is a great piece of engineering. It is very fast, both because it makes good use of librsync to keep the backups small and because it stores indexes of which files were backed up, so the entire backup doesn't have to be read just to get a file list. The backup is split across many small files so that not a lot of temp space is required on the source machine. The tools are very easy to use: they do all the machinations about full and incremental sets, so you can focus on what to back up and what to restore.

Caveats: Like any backup system, you should do a "fire drill" now and then and test your restore procedure. I recommend you encapsulate your backup process in a shell script so that you do it the same way every time.

I highly recommend Duplicity.

http://duplicity.nongnu.org

Posted by Tom Limoncelli at October 11, 2010 10:00 AM | Comments (5) | TrackBack

Google Forms

Someone asked me how I did my survey in a way that the data went to a Google spreadsheet automatically. The forms capability is built into the spreadsheet system. You can even do multi-page forms with pages selected based on previous answers.

More info here

Posted by Tom Limoncelli at October 7, 2010 12:59 PM | Comments (0) | TrackBack

Debugging technique: Time the problem

A coworker debugged a problem last week that inspired me to relay this bit of advice:

Nothing happens at "random times". There's always a reason why it happens.

I once had an ISDN router that got incredibly slow now and then. People on the far side of the router lost service for 10-15 seconds at a time.

The key to finding the problem was timing how often the problem happened. I used a simple once-a-second "ping" and logged the times that the outages happened.

Visual inspection of the numbers turned up no clues. It looked random.

I graphed how far apart the outages happened. The graph looked pretty random, but there were runs that were always 10 minutes apart.

I graphed the outages on a timeline. That's where I saw something interesting. The outages were exactly 10 minutes apart PLUS at other times. I wouldn't have seen that without a graph.

What happens every 10 minutes and other times too? In this case, the router recalculated its routing table every time it got a route update. The route updates came from its peer router exactly every 10 minutes plus any time an ISDN link went up or down. The times I was seeing a 10-minute gap were when we went an entire 10 minutes with no ISDN links going up or down. With so many links, used intermittently by home users, it was pretty rare to go the full 10 minutes with no updates. However, by graphing it the periodic outages became visible.
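If you would rather let a script look for the pattern, one approach is to bucket the gaps between outages and see which bucket is hit most often. This is a sketch of that idea rather than what I did at the time; the function name and the 10-second bucket size are arbitrary choices.

```python
from collections import Counter


def gap_histogram(outage_times, bucket_seconds=10):
    """Count the gaps between consecutive outages, rounded to a bucket.

    outage_times is a list of timestamps in seconds (for example from a
    once-a-second ping log). A periodic cause shows up as one bucket
    that is hit far more often than its neighbours.
    """
    ordered = sorted(outage_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(bucket_seconds * round(gap / bucket_seconds) for gap in gaps)
```

Calling most_common(1) on the result names the dominant interval. A spike at 600 seconds among otherwise scattered gaps is the same clue the timeline graph gave me.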

I've seen other outages that happened 300 seconds after some other event: a machine connects to the network, etc. A lot of protocols do things in 300 second (5 minute) intervals. The most common is ARP: A router expires ARP entries every 300 seconds. Some vendors extend the time any time they receive a packet from the host, others expire the entry and send another ARP request.

What other timeouts have you found to be clues of particular bugs? Please post in the comments!

Posted by Tom Limoncelli at September 27, 2010 10:00 AM | Comments (5) | TrackBack

CSS Positioning

I admit it. I use tables for positioning in HTML. It is easy and works.

However, I just read this excellent tutorial on CSS positioning and I finally understand what the heck all that positioning stuff means.

http://www.barelyfitz.com/screencast/html-training/css/positioning/

I promise not to use tables any more.

I highly recommend this tutorial.

Posted by Tom Limoncelli at September 24, 2010 10:36 AM | Comments (3) | TrackBack

A powerful VIM command I never knew about until now.

Being a long-time "vi" user I find that I am constantly surprised by the little (and not-so-little) enhancements vim has added. One of them is the "inner" concept.

Any vi user knows that "c" starts a change and the next keystroke determines what will be changed. For example, "cw" changes from where the cursor is until the end of the word, and "c$" changes from where the cursor is to the end of the line. Think of a cursor movement command, type it after "c" and you can be pretty sure that you will change from where the cursor is to.... wherever you've directed.

"d" works the same way. "dw" deletes word. "d$" deletes to the end of the line. "d^" deletes to the beginning of the line ("^"? "$"? gosh, whoever invented this stuff must have known a lot about regular expressions).

VIM adds the concept of "inner text". Text is structured. We put things in quotes, in parentheses, between mustaches (that's "{" and "}") and so on. The text between those two things is the "inner text".

So suppose we have:

<span style="clean">Interesting quote here.</span>

but we want to change the style from "clean" to "unruly". Move the cursor anywhere between the quotes and type ci then a quote (read that as "change inner quote"). VIM will seek out the opening and closing quotes that surround the cursor and the next stuff you type will replace it.
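To make the idea concrete, here is a toy Python model of what "change inner quote" does to a line. It is a deliberate simplification, not how VIM implements text objects: the function name is invented, and it assumes the cursor index sits strictly between an opening and a closing quote on the same line.

```python
def change_inner_quote(line, cursor, replacement, quote='"'):
    """Replace the text between the quotes that surround the cursor."""
    opening = line.rfind(quote, 0, cursor)   # nearest quote to the left
    closing = line.find(quote, cursor)       # nearest quote to the right
    if opening == -1 or closing == -1:
        raise ValueError("cursor is not inside a quoted string")
    # Keep both quote characters; swap only what lies between them.
    return line[:opening + 1] + replacement + line[closing:]
```

Given the span example above with the cursor anywhere in the word "clean", passing "unruly" as the replacement yields the same line with style="unruly".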

It works for all three kinds of quotes (single, double, and backtick), it works for all the various braces: ( { and <. You can type the opening or the closing brace, they both do the same thing.

Therefore you can move the cursor to the word "style" in the above example and type "ci<" to change everything within that tag.

I find this particularly useful when editing python code. I'm often using ci' to change a single quoted string.

If there is an "inner", you'd expect there is an "outer" too, right? (How many of you tried typing co" to see if it worked?) Well, there is and there isn't.

In VIM the counterpart of "inner" (i) is "a" (as in "a block" or "a word"). It is kind of special. It doesn't just include the inner text: it also includes the opening and closing elements, plus sometimes the space or two that follow. Given this text:

  • The quick <span class="foo">brown</span> fox.

If the cursor is in the <span> element, "ca<" will replace the entire element from the < all the way to the >. The whitespace after the element is also replaced for text-related things like change word (caw) and change sentence (cas).

Not having to move the cursor to the beginning of an element to change the entire thing is a great time saver. It is these little enhancements that make using VIM so much more pleasant than using VI.

Give it a try!

More information about this is in the "Text Objects" section of Michael Jakl's excellent VIM tutorial.

--Tom

P.S. My second favorite thing about VIM? gVIM (The graphical version of VIM) preserves TABs when you use the windowing system to cut and paste.

Posted by Tom Limoncelli at September 20, 2010 10:00 AM | Comments (5) | TrackBack

The most simple SSH backdoor I could think of.

Remember when you were a little kid and had a clubhouse? Did you let someone in only if they knew "the secret knock"? Lately people have talked about various implementations for doing that with ssh. The technique, called "Port Knocking", permits SSH if someone has touched various ports recently. For example, someone has to ping the machine, then telnet to port 1234, then for the next 2 minutes they can ssh in.

This can be difficult to implement securely, as this video demonstrates: http://www.youtube.com/watch?v=9IrCgCKrv8U

IBM's Developerworks recently posted an article about tightening SSH security. The topic also came up on the mailing list for the New Jersey LOPSA chapter.

I had an idea that I haven't seen published before.

I have a Unix (FreeBSD 8.0) system that is live on the open internet and it is so locked down that I don't permit passwords. To SSH to the machine you have to have pre-arranged SSH keys for "passwordless" connections. However, it does not run a firewall because it is literally running with no ports open (except ssh). There is nothing to firewall.

Problem: What if I am stuck and need to log in remotely with a password?

Most of the portknocking techniques I've seen leverage the firewall running on the system. I didn't want to enable a firewall, so I came up with this.

Idea #1: A CGI script to grant access.

When you connect to a particular URL, it runs sshd on port 9999 with a special configuration file that permits passwords:

/etc/ssh/sshd_config:

PasswordAuthentication no
PermitEmptyPasswords no
PermitRootLogin no
UsePAM no

/etc/ssh/sshd_config-port9999:

Port 9999
AllowAgentForwarding no
AllowTcpForwarding no
GatewayPorts no
LoginGraceTime 30
MaxAuthTries 3
X11Forwarding no
PermitTunnel no
PasswordAuthentication no
PermitEmptyPasswords no
PermitRootLogin yes
UsePAM yes

Translation: If someone is going to get special access on port 9999, they can't use it to set up tunnels or gateways. It is just for quick access; enough to fix your SSH keys.

The CGI script essentially runs:

/usr/sbin/sshd -f /etc/ssh/sshd_config-port9999 -p 9999 -d

Which permits a single login session on port 9999.
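A CGI wrapper along these lines could be written in Python roughly as follows. Treat it as a sketch under assumptions, not the script I run: it assumes the special configuration file is passed with -f, that the CGI process has enough privilege to start sshd, and the function names are invented for the example.

```python
import subprocess

SSHD = "/usr/sbin/sshd"
CONFIG = "/etc/ssh/sshd_config-port9999"


def sshd_command(config_path=CONFIG, port=9999):
    """Build the one-shot sshd command line.

    -d keeps sshd in the foreground in debug mode, where it handles a
    single connection and then exits.
    """
    return [SSHD, "-f", config_path, "-p", str(port), "-d"]


def main():
    """Start the temporary sshd, then answer the web request."""
    subprocess.Popen(
        sshd_command(),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach so it outlives the CGI request
    )
    print("Content-Type: text/plain")
    print()
    print("sshd started on port 9999 for one login session.")
```

The CGI entry point would simply call main(). Keeping the command construction in its own function makes it easy to check the arguments without actually launching a daemon.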

Try #2:

FreeBSD defaults to an inetd that uses Tcpwrappers.

So, try #2 was similar to #1 but appends info to /etc/hosts.allow so that the person has to come from the same IP address as the recent web connection. The problem with that is sometimes people connect to the web via proxies, and adding the proxy to the hosts.allow list isn't going to help.

Try #3:

We all know that you can't run two daemons on the same port number, right? Wrong.

You can have multiple daemons listening on the same port number if they are listening on different interfaces. If two conflict, the connection goes to the "most specific" listening daemon.

What does that mean? That means you can have sshd with one configuration listening on port *.22 (any interface, port 22) and another listening on 10.10.10.10.22 (port 22 of the interface configured for address 10.10.10.10). But I only have one interface, you say? I disagree. You have 127.0.0.1 plus your primary IP address, plus any IPv6 addresses. Heck, even if you really only had one IP address, "*" and a specific address can both be listening to port 22 at the same time.

That's what the "*" on "netstat -l" means. "Any interface."

So, back to our port knocking configuration.

The normal port 22 sshd runs with a configuration that disables all passwords (only permits SSH keys).

/etc/ssh/sshd_config:

Port 22
ListenAddress 0.0.0.0
ListenAddress ::
PAMAuthenticationViaKBDInt no
PasswordAuthentication no
PermitEmptyPasswords no
PermitRootLogin no
UsePAM no

And the CGI script enables an sshd with this configuration:

/etc/ssh/sshd_config-permit-passwords:

Port 22
ListenAddress 64.23.178.12
ListenAddress fe80::5154:ff:fe25:1234
PAMAuthenticationViaKBDInt no
PasswordAuthentication no
PermitEmptyPasswords no
PermitRootLogin no
UsePAM yes

The wrapper simply runs:

/usr/sbin/sshd -d -f /etc/ssh/sshd_config-permit-passwords

That's all there is to it!

Posted by Tom Limoncelli at September 15, 2010 10:00 AM | Comments (2) | TrackBack

Google Chrome for Mac with multiple profiles

Google Chrome supports multiple profiles. The feature is just hidden until it is ready for prime-time. It is really easy to set up on the Linux and Windows versions of Chrome. On the Mac it takes some manual work.

I'm sure eventually the Mac version will have a nice GUI to set this up for you. In the meanwhile, I've written a script that does it:

chrome-with-profile-1.0.tar.gz

Tom

Posted by Tom Limoncelli at September 11, 2010 12:14 PM | Comments (2) | TrackBack

xed 2.0.2 released!

xed is a perl script that locks a file, runs $EDITOR on the file, then unlocks it.

It also checks to see if the file is kept under RCS control. If not, it offers to make it so. RCS is a system that retains a history of a file. It is the predecessor to GIT, SubVersion, CVS and such. It doesn't store the changes in a central repository; it comes from a long-gone era before servers and networks. It simply stores the changes in a subdirectory called "RCS" in the same directory as the file. (and if it can't find that directory, it puts the information in the same directory as the file: named the same as the file with ",v" at the end.)

The benefit of keeping the change history of a file cannot be overstated. Can't figure out why a bug suddenly appeared? Now you can use commands like "rlog file", "rcsdiff -r1.2 -r1.4 file" and "co -p -r1.2 file >oldfile" to examine old changes.

  • If you work on a team, xed's ability to lock files prevents you from stepping on each other's toes.
  • If you work solo, xed's file history is a great way to keep track of changes.
  • If you work on a team, xed's file history is... well, a great way to add accountability to the system.

xed even notices when the last change to the file didn't happen via xed, and offers to clean up for you.

The new version adds a few features that I contributed:

  • Better security and the ability to edit filenames with funny characters (spaces, quotes, etc.) thanks to changing from Perl's system(string) calls to system(list) calls.
  • The ability to set the RCS changelog message via the environment variable CIMSG.
  • The ability to force non-interactive mode by setting an environment variable (INTERACTIVE=0). In non-interactive mode xed assumes the default answer to any question it asks.
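The system(string) versus system(list) point in the first bullet is not specific to Perl. Here is the same idea sketched in Python with the standard subprocess module; the helper name and the demonstration filename are made up for the example.

```python
import subprocess
import sys


def run_with_file(command, filename):
    """Run command with filename as one argument, bypassing the shell.

    Passing a list means spaces and quotes in filename are never
    interpreted by a shell, which is the same idea as Perl's
    system(list) form.
    """
    return subprocess.run(command + [filename], capture_output=True, text=True)


# Demonstration: a child process that echoes back its first argument.
echo = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
funny_name = 'my "funny" file; echo oops.txt'
result = run_with_file(echo, funny_name)
print(result.stdout.strip())
```

The child receives the filename as a single argument, quotes and semicolon included. Built as one string and handed to a shell, the same name would have been split apart and partly executed.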

These new features combine to let me do something I've always wanted to do: Friction-free updates to files: no prompts, no 'are you sure?', no request for changelog statements.

There are certain files that I need to update a lot. One example is my "cribnotes.txt" file where I log little notes that I want to remember later. It is always very tempting for me to get lazy and not record these notes. The rationalization I use is generally "Oh, it's a small note not worth my bother to answer all those 'are you sure?' and other questions. I can surely remember it without writing it down." Oh, how many times I've said that and not been able to remember it later. By making the editing process "friction-free", I don't have that excuse any more.

The alias I've set up is:

bash/ksh/sh: alias note='INTERACTIVE=0 CILOG="updated" xed ~/.cribnotes.txt'

csh/tcsh: alias note '( setenv INTERACTIVE 0 ; setenv CILOG "updated" ; xed ~/.cribnotes.txt )'

Now I can simply type "note", record my thought, and save the file. No excuses.

xed is a godsend. It was originally written by the awesome programmer John Linderman at Bell Labs in ksh; later ported to Perl by Cliff Miller. The project page is here: http://www.nightcoder.com/code/xed/

I recommended it in TM4SA and I still recommend it today. No Unix system is complete without it.

Posted by Tom Limoncelli at September 1, 2010 11:00 AM | Comments (2) | TrackBack

Learning From Successful IPv6 Upgrade Projects

I wrote about upgrading to IPv6 in the past, but I have more to say.

The wrong way: I've heard a number of people say they are going to try to convert all of their desktops to IPv6. I think they are picking the wrong project. While this is a tempting project, I think it is a mistake (well-intentioned, but not a good starter project). Don't try to convert all your desktops and servers to IPv6 as your first experiment. There's rarely any immediate value to it (annoys management), it is potentially biting off more than you can chew (annoys you), and mistakes affect people that you have to see in the cafeteria every day (annoys coworkers).

Instead copy the success stories I've detailed below. Some use an "outside -> in" plan, others pick a "strategic value".

Story 1: Work from the outside -> in

The goal here is to get the path from your ISP to your load balancer to use IPv6; let the load balancer translate to IPv4 for you. The web servers themselves don't need to be upgraded; leave that for phase 2.

It is a small, bite-sized project that is achievable. It has a real tangible value that you can explain to management without being too technical: "the coming wave of IPv6-only users will have faster access to our web site. Without this, those users will have slower access to our site due to the IPv4/v6 translators that ISPs are setting up as a bandaid." That is an explanation that a non-technical executive will understand.

It also requires only modest changes: a few routers, some DNS records, and so on. It is also a safe place to make changes because your external web presence has a good dev -> qa -> production infrastructure that you can leverage to test things properly (it does, right?).

Technically this is what is happening:

At many companies web services are behind a load balancer or reverse proxy.

ISP -> load balancer -> web farm

If your load balancer can accept IPv6 connections but send out IPv4 connections to the web farm, you can offer IPv6 service to external users just by enabling IPv6 on the first few hops into your network: the path to your load balancer. As each web server becomes IPv6-ready, the load balancer no longer needs to translate for that host. Eventually your entire web farm is native IPv6. Doing this gives you a throttle to control the pace of change. You can make small changes; one at a time; testing along the way.
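To show how little magic is involved in the translation step, here is a toy TCP relay in Python that accepts connections on one address and forwards them to a backend on another. It is only an illustration of the idea, nowhere near a real load balancer: the function names are invented, and there is no health checking, balancing or SSL handling.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes one way until the sender closes, then close the receiver."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # the other side went away; just stop copying
    finally:
        writer.close()


async def start_proxy(listen_host, listen_port, backend_host, backend_port):
    """Accept connections on one address and relay them to a backend."""
    async def handle(client_reader, client_writer):
        try:
            backend_reader, backend_writer = await asyncio.open_connection(
                backend_host, backend_port)
        except OSError:
            client_writer.close()  # backend unreachable; drop the client
            return
        # Relay both directions at once until either side closes.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
        )
    return await asyncio.start_server(handle, listen_host, listen_port)
```

Listening on an IPv6 address such as "::" while pointing the backend at an IPv4 web server gives the front-door arrangement described above: clients speak IPv6 to the relay and the web farm keeps speaking IPv4.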

The value of doing it this way is that it gives customers IPv6 service early, and requires minimal changes on your site. We are about 280 days away from running out of IPv4 addresses. Around that time ISPs will start to offer home ISP service where IPv6 is "normal" and attempts to use IPv4 will result in packets being NATed at the carrier level. Customers in this situation will get worse performance for sites that aren't offering their services over IPv6. Speed is very important on the web. More specifically, latency is important.

[Note: Depending on where the SSL cert lives, that load balancer might need to do IPv6 all the way to the frontends. Consult your load balancer support folks.]

Sites that are offering their services over IPv6 will be faster for new customers. Most CEOs can understand simple, non-technical, value statements like, "new people coming onto the internet will have better access to our site" or "the web site will be faster for the new wave of IPv6-only users."

Of course, once you've completed that and shown that the world didn't end, developers will be more willing to test their code under IPv6. You might need to enable IPv6 on the path to the QA lab or some other place. That's another bite-sized project. Another path will be requested. Then another. Then the desktop LAN that the developers use. Then it makes sense to do it everywhere. Small incremental roll-outs FTW!

During Google's IPv6 efforts we learned that this strategy works really well. Most importantly we've learned that it turned out to be pretty easy and not expensive. Is IPv6 code in routers stable? Well, we're now sending YouTube traffic over IPv6. If you know of a better load test for the IPv6 code on a router, please let me know! (Footnote: "Google: IPv6 is easy, not expensive Engineers say upgrading to next-gen Internet is inexpensive, requires small team")

Story 2: Strategic Upgrades

In this story we are more "strategic".

Some people run into their boss's office and say, "OMG we have to convert everything to IPv6". They want to convert the routers, the DNS system, the DHCP system, the applications, the clients, the desktops, the servers.

These people sound like crazy people. They sound like Chicken Little claiming that the sky is falling.

These people are thrown out of their boss's office.

Other people (we'll call these people "the successful ones") go to their boss and say, "There's one specific thing I want to do with IPv6. Here's why it will help the company."

These people sound focused and determined. They usually get funding.

Little does the boss realize that this "one specific thing" requires touching many dependencies. That includes the routers, the DNS system, the DHCP system, and so on. Yes, the same list of things that the "crazy" person was spouting off about.

The difference is that these people got permission to do it.

According to a presentation I saw them give in 2008, Comcast found their "one thing" to be: Settop box management. Every settop box needs an IP address so they can manage it. That's more IP addresses than they could reasonably get from ARIN. So, they used IPv6. If you get internet service from Comcast, the settop box on your TV set is IPv6 even though the cable modem sitting next to it providing you internet service is IPv4. They had to get IPv6 working for anything that touches the management of their network: provisioning, testing, monitoring, billing. Wait, billing? Well, if you are touching the billing system, you are basically touching a lot of things. Ooh, shiny dependencies. (There used to be a paper about this at http://www.6journal.org/archive/00000265/01/alain-durand.pdf but the link isn't working. I found this interview with the author but not the original paper.)

Nokia found their "one thing" to be: power consumption. Power consumption, you say? Their phones waste battery power by sending out pings to "keep the NAT session alive". By switching to IPv6 they didn't need to send out pings. No NAT, no need to keep the NAT session alive. Their phones can turn off their antenna until they have data to send. That saves power. In an industry where battery life is everything, any CxO or VP can see the value. A video from Google's IPv6 summit details Nokia's success in upgrading to IPv6.

Speaking of phones, T-Mobile's next generation handset will be IPv6-only. Verizon's LTE handsets are required to do IPv6. If you have customers that access your services from their phone, you have a business case to start upgrading to IPv6 now.

In the long term we should be concerned with converting all our networks and equipment to IPv6. However the pattern we see is that successful projects have picked "one specific thing to convert", and let all the dependencies come along for the ride.

Summary:

In summary, IPv6 is real and real important. We are about a year away from running out of IPv4 addresses at which point ISPs will start offering IPv6 service with translation for access to IPv4-only web sites. Successful IPv6 deployment projects seem to be revealing two successful patterns and one unsuccessful pattern. The unsuccessful pattern is to scream that the sky is falling and ask for permission to upgrade "everything". The successful patterns tend to be one of:

  • Find one high-value (to your CEO) reason to use IPv6: There are no simple solutions but there are simple explanations. Convert just that one thing and keep repeating the value statement that got the project approved. There will be plenty of dependencies and you will end up touching many components of your network. This will lead the way to other projects.
  • Work "from the outside -> in": A load balancer that does IPv6<->IPv4 translation lets you offer IPv6 to external customers now, gives you a "fast win" that will bolster future projects, and gives you a throttle to control the speed at which services get native support.

I'd love to hear from readers about their experiences with convincing management to approve IPv6 projects. Please post to the comments section!

-Tom

P.S. Material from the last Google IPv6 conference is here: http://sites.google.com/site/ipv6implementors/2010/agenda

Posted by Tom Limoncelli at August 23, 2010 11:00 AM | Comments (3) | TrackBack

New to Mac OS X? Tips for Unix users

A friend of mine who is an old-time Unix/Linux user asked me for suggestions on how to get used to Mac OS X.

The first mistake that Unix users make when they come to OS X is that they try to use X Windows (X11) because it is what they are used to. My general advice: Don't use X windows. Switching between the two modes is more work for your hands. Stick with the non-X11 programs until you get used to them. Soon you'll find that things just "fit together" and you won't miss X11.

Terminal is really good (continued lines copy and paste correctly! resize and the text reformats perfectly!). I only use X windows when I absolutely have to. Oh, and if you do use X11 and find it to be buggy, install the third-party X replacement called XQuartz (sadly you'll have to re-install it after any security or other updates).

Now that I've convinced you to stick with the native apps, here's why:

  1. pbcopy <file

Stashes the contents of "file" into your paste buffer.

  2. pbpaste >file

Copies the paste buffer to stdout (redirected here into "file").

  3. pbpaste | sed -e 's/foo/bar/g' | pbcopy

Changes all occurrences of "foo" to "bar" in the paste buffer.

  4. "open" emulates double-clicking on an icon.

    open file.txt

If you had double-clicked on file.txt, it would have brought it up in TextEdit, right? That's what happens with "open file.txt". If you want to force another application, use "-a":

    open -a /Applications/Microsoft\ Office\ 2008/Microsoft\ Word.app file.txt

Wonder how to start an ".app" from Terminal? Double click it:

    open /Applications/Microsoft\ Office\ 2008/Microsoft\ Word.app

Want to find a directory via "cd"ing around on the Terminal, but once you get there you want to use the mouse?

    cd /foo/bar/baz
    open .

I use this so much I have an alias in my .bash_profile:

    alias here="open ."

Now after "find"ing and searching and poking around, once I get to the right place I can type "here" and be productive with the mouse.

  5. Want to use a Unix command on a file you see on the desktop? Drag the icon onto the terminal.

type: od (space) -c (space)

Then drag an icon onto that Terminal. The entire path appears on the command line. If the path has spaces or other funny things the text will be perfectly quoted.

  6. Dislike the File Open dialog? Type "/" and it will prompt you to type the path you are seeking. Name completion works in that prompt. Rock on, File Dialog!

  7. Word processors, spread sheets, video players and other applications that work with a file put an icon of that file in the title bar. That isn't just to be pretty. The icon is useful. CMD-click on it to see the path to the file. Select an item in that path and that directory is opened on the Desktop.

That icon in the title bar is draggable too! Want to move the file to a different directory? You don't have to poke around looking for the source directory so you can drag it to the destination directory. Just drag the icon from the title bar to the destination directory. The app is aware of the change too. Lastly, drag the icon from the title bar into a Terminal window. It pastes the entire path to the file just like in tip 5.

  8. If you want to script the things that Disk Utility does, use "hdiutil" and "diskutil". You can script ripping a DVD and burning it onto another one with "hdiutil create" then "diskutil eject" then "hdiutil burn".

  9. rsync for Mac OS has an "-E" option that copies all the weird Mac OS file attributes including resource forks. ("rsync -avPE src host:dest")

  10. "top" has stupid defaults. I always run "top -o cpu". In fact, put this in your profile:

    alias top='top -o cpu'

  11. For more interesting ideas, read the man pages for:

    screencapture mdutil say dscl dot_clean /usr/bin/util pbcopy pbpaste open diskutil hdiutil
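
Since pbcopy and pbpaste are ordinary stdin/stdout commands, they are also easy to drive from a script. Here is a rough Python sketch of the same foo-to-bar trick as the sed pipeline above. The function names are my own invention, and note that str.replace does literal matching while sed uses regular expressions (for a plain word like "foo" the result is the same).

```python
import subprocess


def replace_all(text, old, new):
    """Literal search and replace, standing in for sed's s/old/new/g."""
    return text.replace(old, new)


def transform_clipboard(transform):
    """Read the paste buffer, apply transform to it, and write it back.

    Only works on Mac OS X, where pbpaste and pbcopy exist.
    """
    current = subprocess.run(
        ["pbpaste"], capture_output=True, text=True, check=True
    ).stdout
    subprocess.run(["pbcopy"], input=transform(current), text=True, check=True)
```

On a Mac, transform_clipboard(lambda text: replace_all(text, "foo", "bar")) does the whole round trip in one call.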

Enjoy!

P.S. others have recommended this list: http://superuser.com/questions/52483/terminal-tips-and-tricks-for-mac-os-x

Posted by Tom Limoncelli at August 18, 2010 9:00 AM | Comments (3) | TrackBack

Google Command Line Tool

I try not to use this blog to flog my employer's products but I just used the open source "Google Command Line" program and I have to spread the word... this really rocked.

I wanted to upload a bunch of photos to Picasa. I didn't want to sit there clicking on the web interface to upload each one, and I didn't want to import them into iPhoto and then use the Picasa plug-in to upload them. I just wanted to get them uploaded.

Google CL to the rescue! It is a program that lets you access many Google properties from the command line. It works on Mac, Linux and Windows. After a (rather simple) install process I was ready to give it a try.

Here's the command line that I typed:

$ google picasa create --title "2010-08-09-Hobart-Tasmania-SAGE-AU" ~/Desktop/PHOTOS-AU/*

I was expecting it to ask me for a username and password but I was surprised when my web browser popped up and asked me to authorize this script to log in (just like third-party apps that authenticate against Google). When I was back at the command line I pressed "return" to continue. The upload began and finished a few minutes later.

In addition to picasa, the command can also access blogger, youtube, docs, contacts and calendar.

Posted by Tom Limoncelli at August 11, 2010 9:18 PM | Comments (2) | TrackBack

Google App Inventor

At the SAAD-NYC event last night I explained how Google App Inventor lets you make apps for Android phones without knowing how to program. It was beta tested "mainly in schools with groups that included sixth graders, high school girls, nursing students and university undergraduates who are not computer science majors."

Jim said, "Why haven't you written about this amazing thing on your blog?"

I dunno! So here. I'm mentioning it now.

(I think the NY Times article is the best overview.)

Happy, Jim?

Posted by Tom Limoncelli at July 31, 2010 9:13 AM | Comments (1) | TrackBack

So that's how they do it!

Oh that's how they get such amazing speed on a web server! http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf

In the future, all servers will work like this.

Well worth reading.

Posted by Tom Limoncelli at July 4, 2010 9:20 AM | Comments (1) | TrackBack

Configuration Management Summit, Boston, June 24, 2010

(Reposting this announcement from Dan)

Fellow SysAds etc.-

First, I'd like to make sure you are all aware of the Configuration Management Summit next week in Boston on June 24 (details are at http://www.usenix.org/events/config10/). The first Configuration Management Summit aims to bring together developers, power users, and new adopters of open source configuration management tools for automating system administration. Configuration management is a growth area in the IT industry, and open source solutions, with cost savings and an active user community, are presenting a serious challenge to today's "big vendor" products. Representatives from Bcfg2, Cfengine, Chef, and Puppet will all be participating in the summit - this will be a valuable opportunity if you have been contemplating a configuration management solution for your systems.

There is also a special one-day training on Cfengine being taught by Mark Burgess on June 25 (details are at http://www.usenix.org/events/config10/#tut_cfengine). This class might be a review session for anyone on this mailing list, but it will also offer useful insights for people who are not new to Cfengine. Additionally, if you have colleagues who need to come up to speed on Cfengine quickly, this class will be an excellent opportunity for them to learn Cfengine directly from the author.

If you are interested in either event, you can register at http://www.usenix.org/events/confweek10/registration/ (and if you have questions, you can email me directly). I hope to see you in Boston!

Daniel Klein
Education Director
USENIX

Posted by Tom Limoncelli at June 18, 2010 6:29 AM | Comments (0) | TrackBack

Another Ganeti success story

Lance Albertson wrote up a great description of how Ganeti Virtualization Manager performed under pressure during a power outage:

Nothing like a power outage gone wrong to test a new virtualization cluster. Last night we lost power in most of Corvallis and our UPS & Generator functioned properly in the machine room. However we had an unfortunate sequence of issues that caused some of our machines to go down, including all four of our ganeti nodes hosting 62 virtual machines went down hard. If this had happened with our old xen cluster with iSCSI, it would have taken us over an hour to get the infrastructure back in a normal state by manually restarting each VM.

But when I checked the ganeti cluster shortly after the outage, I noticed that all four nodes rebooted without any issues and the master node was already rebooting virtual machines automatically and fixing all of the DRBD block devices.

Ganeti is a management layer that makes it easy to set up large clusters of Xen or KVM (or other) virtualized machines. He has written a great explanation of what Ganeti is and its benefits too.

I use Ganeti for tons of projects at work.

Posted by Tom Limoncelli at May 21, 2010 10:30 AM | Comments (0) | TrackBack

Your computer room will overheat next weekend

Dear readers in the United States,

I'm sorry. I have some bad news.  That tiny computer closet that has no cooling will overheat next weekend.

Remember that you aren't cooling a computer room, you are extracting the heat.  The equipment generates heat and you have to send it somewhere. If it stays there, the room gets hotter and hotter.

For the past few months you've been lucky.  That room benefited from the fact that the rest of the building was relatively chilly. The heat was drawn out to the rest of the building. During the winter, each weekend the heat was turned off (or down) and your uninsulated computer room leaked heat to the rest of the building. Now it's springtime, nearly summer.  The building A/C is on during the week. When it shuts down for the weekend the building is hot; hotter than your computer room.  The leaking that you were depending on is not going to happen.

Last weekend the temperature of your computer room got warm on Saturday and hot on Sunday. However, it was ok.

This weekend it will get hot on Saturday and very hot on Sunday. It will be ok.

However, next weekend is Memorial Day weekend. The building's cooling will be off for three days. Saturday will be hot. Sunday will be very, very hot.  Monday will be hot enough to kill a server or two.

If you have some cooling, Monday you'll discover that it isn't enough.  Or the cooling system will be overloaded and any weak, should-have-been-replaced, fan belts will finally snap.

How do we get into this situation?

Telecom closets don't have any cooling because they have no active components. It's just a room where wires connect to wires. That changed in the 1990s when phone systems changed. Now that telecom closet has a PBX, and an equipment rack.  If there is an equipment rack, why not put some PC servers into it? If there is one rack, why not another rack? By adding one machine at a time you never realize how overloaded the system has gotten.

Even if you have proper cooling, I bet you have more computers in that room than you did last year.

So what can you do now to prevent this problem?
  • Ask your facilities person to check the belts on your cooling system.
  • Set up monitoring so you'll be alerted if the room gets above 33 degrees C. (You probably don't have time to buy an environmental monitor, but chances are your router and certain servers have a temperature gauge on or near the hottest part of the equipment. It is most likely hotter than 33 degrees C during normal operation, but you can detect if it goes up relative to a baseline.)
  • Clean (remove dust from) the air vent screens, the fans, and any drives. That dust makes every mechanical component work harder. More stress == more likely to break.
  • Inventory the equipment in the room and shut off the unused equipment. (I bet you find at least one server.)
  • Inventory the equipment and rank by priority what you can power off if the temperature gets too high.
If you do have a system that overheats, remember that you can buy or rent temporary cooling systems very easily.
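
Two items on that checklist are easy to turn into a script: alerting relative to a baseline, and ranking what to power off. Here is a minimal Python sketch of both. The 5 degree margin, the machine names and the priority numbers are all invented for illustration, and how you actually read a temperature from your router or server is up to your equipment.

```python
def temperature_alert(reading_c, baseline_c, margin_c=5.0):
    """Return True when a sensor reads more than margin_c above its baseline."""
    return reading_c - baseline_c > margin_c


def shutdown_order(equipment):
    """Return equipment names with the least important first.

    equipment is a list of (name, priority) pairs. This sketch assumes a
    higher priority number means a more important machine.
    """
    return [name for name, _priority in sorted(equipment, key=lambda item: item[1])]


# Hypothetical inventory for a small computer closet.
INVENTORY = [("mail", 3), ("test-box", 1), ("file-server", 2)]
```

A sensor that normally sits at 45 degrees C and now reads 52 would trip the alert, and shutdown_order(INVENTORY) says to power off test-box first and mail last.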

I don't generally make product endorsements, but at a previous company we had an overheating problem and it was cheaper and faster to buy a Sunpentown 9000 BTU unit at Walmart than to wait around for a rental. In fact, it was below my spending limit to purchase two and tell the CFO after the fact. I liked the fact that it self-evaporated the water that accumulated; I needed to exhaust hot air, not hot air and water.

Most importantly, be prepared. Have monitoring in place. Have a checklist of what to shut down in what order.

Good luck! Stay cool!

Tom

P.S. I wrote about this 2 years ago.
Posted by Tom Limoncelli at May 21, 2010 6:52 AM | Comments (3) | TrackBack

Motivation for sysadmins to write documentation

My new O'Reilly blogpost about getting the motivation to write docs.

Posted by Tom Limoncelli at March 15, 2010 8:02 AM | Comments (0) | TrackBack

End every helpdesk request on a good note

New blog post up on O'Reilly's Author Blogs.

Posted by Tom Limoncelli at March 11, 2010 9:11 PM | Comments (0) | TrackBack

Making enterprise ShortName service shorter

Previously I wrote about the Google Apps shortname service which lets you set up a tinyurl service for your enterprise.

The article implies that the service can be used without using the FQDN. This is not true. In other words, I had said that "go.example.com/lunch" could be shortened to "go/lunch".

There is a workaround that makes it work. It is difficult to configure, but I've set up a Community Wiki on ServerFault.com that explains all the steps. As a wiki, I hope people can fill in the items I left blank, particularly specific configuration snippets for ISC BIND, Windows DHCP server, Linux DHCP clients, and so on.

The new article is here: How to set up Google ShortName service for my domain, so that the FQDN isn't needed

Posted by Tom Limoncelli at January 26, 2010 4:27 PM | Comments (1) | TrackBack

A "tinyurl" service for your domain

Update 2010-01-26: There is a follow-up article to this here

Update 2009-12-20: Enabling the service wasn't working for a few days. It is now working again. It does not require Premier service. Any Google Apps customer should be able to use it.

Where I work we have a service called "go" which is a tinyURL service. The benefit of it being inside our domain is huge. Since "go" (the shortname) is found in our DNS "search path", you can specify "go" links without entering the FQDN.

That means you can enter "go/payroll" in your browser to get to the payroll system and "go/lunchmenu" to find out what's for lunch today. That crazy 70-char long URL that is needed to get to that third-party web-based system we use? I won't name the vendor, but let me just say that I now get there via "go/expense".

Creating a new shortlink is "self-service". You go to "http://go", fill out the form, and you are done. You don't have to open a ticket. You don't have to wait for your system administrator to create a link. A person with little or no web skills can create a "go" link, mention it in an email or on a sign in the hallway and everyone knows what to do. As a result it has transformed the corporate online culture.

Control-freaks may be appalled that we don't have an official librarian that dictates and enforces a naming standard. But just like Flickr "tags" it just works. People self-organize. "Best practices" organically evolve. When there is a name conflict it tends to get resolved by people talking directly to the owner and negotiating. I'm very happy about that.

What really impresses me is that where I work the "go" service was never officially announced. The people that invented it started using it and people noticed. The idea spread virally. Within a month everyone was using it. People thought to just type "go" (alone) in their web browser and were happy to find a web site that explained what to do.

Wish you had this at your company? If you are a Google Apps customer there is a "labs" app that gives you this kind of thing for your users. You can call it "go" or "t" or "shortlink" or whatever you wish. You, as the system administrator, have to enable it (Dashboard, Add more services, Short Links). Update DNS as the system directs you. Your DNS search path needs to be properly configured, which it should be already. However, people can specify the FQDN for links if they are traveling. For example http://go/tom-picture works if your DNS search path is correct, and http://go.whatexit.org/tom-picture can be used if it is not.
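
If the search path part is fuzzy, this small Python sketch shows roughly what happens when someone types "go". It is a simplified model with an invented function name and example domains; real resolvers have extra rules (such as the ndots option) that affect the order in which names are tried.

```python
def search_candidates(name, search_path):
    """List the fully qualified names a resolver would try for a short name."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [name]
    candidates = [f"{name}.{domain}." for domain in search_path]
    # Assumption: the bare name is tried last.
    candidates.append(name + ".")
    return candidates


# Hypothetical search path for a laptop on the corporate network.
SEARCH_PATH = ["corp.example.com", "example.com"]
```

With that search path, "go" is tried as go.corp.example.com, then go.example.com. A traveling laptop with a different search path never tries those names, which is why the FQDN form of the link is the fallback.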

What about security? Links may be "public" or "private". Private links don't work unless the user has logged into their Apps domain. "Public" links always work. I set up a private link on my personal domain you can see: http://go.whatexit.org/internal-secrets

There are a few more features. A user can edit their own links. The system can suggest a short link name (hash) if you are feeling uncreative. You can transfer ownership of a link to another user of your domain. There is a search capability. It keeps counts on how often a link is used and therefore can tell you which are the most "popular" links.

The administrative control panel has a large number of optional features: Enable the service for multiple domains. If a link is used more than x times, only an administrator can delete it. Restrict the feature to certain IP subnets. Enable API access (yes, it has an API). Restrict who can create new links. Restrict public shortlinks to a specific list (probably a good idea). And many more.

Some features that I wish it had: The search feature searches the shortlink name, not the destination URL. Therefore I can't search to find out if someone has created a shortlink already. It doesn't give a warning if I'm about to create a public link to a private (internal to my domain) URL.

There are two caveats: First, it is a "lab" product (did you know that Google Apps now has "labs" features just like Gmail?). It is implemented in Google App Engine (did you know that Google Apps can now have privately hosted apps?). Secondly, in the name of transparency I should point out that I'm a Google employee and therefore could be biased. However, I've been using this for eons and have been waiting for the day that I'd be able to talk about it publicly. It is so simple and "just works". If you don't use Google Apps, you might consider writing a simple redirector service for your users. (Or enable Google Apps and only use this feature!)
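
For the "write a simple redirector" route, here is a minimal sketch using only the Python standard library. It is not how the Google Apps service is implemented; the link table, URLs and names are hypothetical, and the handler treats every visitor as anonymous because real login checking is beyond a sketch.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical link table: short name -> (destination URL, is_public)
LINKS = {
    "payroll": ("https://payroll.example.com/login", False),
    "lunchmenu": ("https://intranet.example.com/cafeteria/menu", True),
}
HITS = {}  # short name -> number of times the link was followed


def resolve(path, logged_in):
    """Return the destination URL, or None if the link is unknown or private."""
    short_name = path.strip("/")
    entry = LINKS.get(short_name)
    if entry is None:
        return None
    url, is_public = entry
    if not is_public and not logged_in:
        return None
    HITS[short_name] = HITS.get(short_name, 0) + 1
    return url


class GoHandler(BaseHTTPRequestHandler):
    """Answer GET /name with a redirect to the stored destination."""

    def do_GET(self):
        # Assumption: nobody is logged in, so only public links work here.
        target = resolve(self.path, logged_in=False)
        if target is None:
            self.send_error(404, "No such short link")
            return
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()
```

To try it, import HTTPServer from http.server and run HTTPServer(("", 8080), GoHandler).serve_forever(). The HITS counter is what would let you report the most "popular" links.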

For more information or to add it to your Google Apps domain, here is the home page.

As with all hosted apps, check the privacy policy and terms of service before use.

If you use this feature please let me know (post a comment) and share your experiences with it!

Posted by Tom Limoncelli at December 9, 2009 10:48 AM | Comments (8) | TrackBack

SysAdvent has begun!

SysAdvent has started its second year.  SysAdvent is a project to count down the 24 days leading to Christmas with a sysadmin tip each day.  Last year Jordan Sissel wrote all 24 days (amazing job, dude!). This year he has enlisted guest bloggers to help out. You might see a post of mine one of these days.

While I don't celebrate the holiday that the event is named after, I'm glad to participate.

Check out this and last year's postings on the SysAdvent Blog: sysadvent.blogspot.com


Posted by Tom Limoncelli at December 3, 2009 7:41 AM | Comments (1) | TrackBack

Can my SLA rule work for networks? Yes.

Last week I mentioned that if you have a service that requires a certain SLA, it can't depend on things of lesser SLA.

My networking friends balked and said that this isn't a valid rule for networks. I think that violations of this rule are so rare they are hard to imagine. Or, better stated, networking people do this so naturally that it is hard to imagine violating this rule.

However, here are 3 from my experience:


  • Situation: A company whose internet connection is a DSL modem. The modem is in the hallway near the computer room, but not in the computer room. As a result, when someone knocks the modem over, the company's website is down. (web site depending on router). Improvement: move the router into the computer room.
  • A computer room with excellent UPS and power infrastructure... but the router isn't on the UPS for weird historical reasons (it is depending on external power). Improvement: move the router onto the UPS.
  • An excellent computer room with fine ethernet switches... but the router is in the lab one room over. Each VLAN has a physical cable connected to it with a cable that runs to that other room. I was told, "the researchers are doing some experiments on the router so they wanted it in their lab". Improvement: Move the router into the computer room.

3 true stories.
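
The rule is simple enough to check mechanically. Here is a minimal Python sketch that walks a dependency list and flags anything depending on a component with a lower SLA. The service names and percentages are invented, loosely modeled on the first story.

```python
def sla_violations(sla, depends_on):
    """Return (service, dependency) pairs where the dependency's SLA is lower."""
    violations = []
    for service, dependencies in depends_on.items():
        for dependency in dependencies:
            if sla[dependency] < sla[service]:
                violations.append((service, dependency))
    return violations


# Hypothetical availability targets, in percent.
SLA = {"website": 99.9, "router": 99.9, "dsl-modem-in-hallway": 95.0}
DEPENDS_ON = {"website": ["router", "dsl-modem-in-hallway"]}
```

Here sla_violations(SLA, DEPENDS_ON) flags the website's dependency on the hallway modem, which is exactly the thing you would move into the computer room.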

Posted by Tom Limoncelli at November 19, 2009 12:44 PM | Comments (2) | TrackBack

Interview: Design Patterns for System Administrators Training at LISA 2009

Matt Simmons interviews me about "Design Patterns for System Administrators".

This is a tutorial that I've never taught before. You can see it first at LISA 2009 in November.

In case you missed it, Matthew Sacks interviewed me about my other LISA tutorial. That tutorial also has a lot of new material.

Posted by Tom Limoncelli at October 17, 2009 10:41 AM | Comments (0) | TrackBack

Chrome Zygote solves major Shared Library issue

Sysadmins have a love-hate relationship with shared libraries. They save space, they make upgrades easier, and so on.  However, they also cause many problems.  Sometimes they cause versioning problems (Windows DLLs), security problems, and (at least when they were new) performance problems.  I won't go into detail, just mention them on a technical email list and you'll get an earful.

Here's one example that hits me a lot. On my Linux box, if I run an update of Firefox, my current Firefox browser keeps running. However, the next time it needs to load a shared library, it is now loading the upgraded version which is incompatible and my Firefox goes bonkers and/or crashes. On the Mac and Windows this doesn't happen because the installer waits for you to close any Firefox instances before continuing.

Google Chrome browser does its updates in the background while you use it. The user doesn't have to wait for any painful upgrade notification. Instead, the next time they run Chrome they are simply told that they are now running the newest release. I call this a "parent-friendly" feature because the last time I visited my mom much of her software had been asking to be upgraded for months.  I wish it could have just upgraded itself and kept my mom's computer more secure. ACM has an article by the Chrome authors about why automatic upgrades are a key security issue. (with graphs of security attacks vs. upgrade statistics)

However, if Google Chrome upgrades itself in place, how does it keep running without crashing? Well, it turns out, they use a technique called the LinuxZygote.  The libraries they need are loaded at startup into a process which then fork()s any time they need, for example, a renderer. The Zygote pattern is usually done for systems that have a slow startup time. However, they claim that in their testing there was no performance improvement. They do this to make the system more stable.
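
To illustrate the general pattern (this is not Chrome's code), here is a small Python sketch for Unix-like systems. A dictionary stands in for the libraries loaded at startup, and each worker is fork()ed from the already-initialized parent, so it uses what is in memory rather than re-reading files that may have been upgraded on disk. All names are made up for the example.

```python
import os

# Stands in for the shared libraries the zygote loads once at startup.
LOADED_LIBRARY = {"version": "1.0"}


def render():
    """Example work for a child: uses only what was loaded before the fork."""
    return "rendered with library " + LOADED_LIBRARY["version"]


def spawn_from_zygote(work):
    """Fork a child, run work() in it, and return the text it produced."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherits the parent's memory, including LOADED_LIBRARY.
        os.close(read_fd)
        try:
            os.write(write_fd, work().encode())
        finally:
            os._exit(0)
    # Parent: collect the child's output and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        output = reader.read()
    os.waitpid(pid, 0)
    return output.decode()
```

Calling spawn_from_zygote(render) returns the child's output. The point is that the child never loads anything itself, so replacing files on disk mid-session cannot hand it a mismatched library.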

Read the (brief) article for more info: LinuxZygote


Posted by Tom Limoncelli at August 22, 2009 2:59 PM | Comments (2) | TrackBack

Debian PPC on PowerBook G4: Touchpad and CD-ROM problems

(I'm setting up Debian PPC on an old PowerBook G4.)

The installation went really well.  I downloaded the stable 5.0.2 DVD image, burned it onto a DVD from my Mac (note: Safari warned that the file system might be corrupted, but I ran "md5" on the .iso and the output matched what the web site said it should be) and it booted without incident and I was able to go through the entire installation without fail.  I am cheating a little since I'm not doing a multi-boot.  I hear that is more difficult.

When the machine booted the first time I was able to log in!  Sadly, the touchpad wasn't working, and there was only so much I could do from the keyboard.

Using TAB and SPACEBAR I was able to navigate around a little.  Sometimes I would get into a corner where neither TAB nor SPACEBAR was really helpful.

Luckily you can always log out of an X11 session by pressing CTRL-OPTION-BACKSPACE. Warning: this zaps the entire X11 window session.  All your apps are instantly killed. You are logged out.  Don't press it unless you mean it.  (And, yes, the keyboard sequence is an homage to CTRL-ALT-DEL).  While this wasn't the best option, sometimes it was all I had.

To fix these problems I thought the best thing to do would be to SSH to it from another machine.  The default Debian configuration doesn't include openssh-server, just the -client.  This is wise from a security standpoint, but wasn't helping me fix the machine.

From the initial login screen I was able to set up a "Failsafe" xterm window.  From there I could become root.  "apt-get install ssh" tried to do the right thing, but it couldn't access the DVD drive.

"ls /dev" wasn't showing very much.  No /dev/sd* or hd* or sr0 (CD-ROM) at all.  This was distressing.  My touchpad wasn't working, my CD-ROM (well, DVD) wasn't showing up.

I couldn't load new packages if the DVD didn't work.  I couldn't fix the machine if I couldn't SSH in.  Ugh.

I searched a lot of web sites for information about how to fix this and nearly gave up.

Finally I remembered that in the old days zapping the "PRAM" fixed a heck of a lot of problems.  The PRAM is a battery-backed bit of RAM (or NVRAM) that stores a few critical settings like boot parameters and such.  To zap the PRAM, you boot while holding these four keys: Command, Option, P and R.  It takes some practice.

After zapping the PRAM Debian booted and the mouse and touchpad magically worked.  When I logged in, I could see that the DVD was working.  "apt-get install ssh" worked without a hitch.  The DVD had automatically been detected and mounted.  I was impressed!

"ls /dev" now showed many, many more devices.

Later I installed SSH ("apt-get install ssh"), configured my SSH keys so I could log in easily from my primary computer, and even added the Ethernet MAC address to my DHCP server so that it always gets the same IP address.

To be honest, I don't know if zapping the PRAM fixed it or it was the reboot.  udevd may not have started (I forgot to check).  Either way, I was very happy that things worked.  I started up a web browser, went to www.google.com and when it came up it felt like home.


Posted by Tom Limoncelli at July 19, 2009 4:19 PM | Comments (3) | TrackBack

Use Nagios to monitor for Dell systems warranty expirations

You know that here at E.S. we're big fans of monitoring.  Today I saw on a mailing list a post by Erinn Looney-Triggs who wrote a module for Nagios that uses dmidecode to gather a Dell's serial number, then uses their web API to determine if it is near the end of the warranty period.  I think that's an excellent way to prevent what can be a nasty surprise.

Link to the code is here: Nagios module for Dell systems warranty using dmidecode

What unique things do you monitor for on your systems?
Posted by Tom Limoncelli at May 26, 2009 8:06 PM | Comments (8) | TrackBack

Google enables IPv6 for most services (but there is a catch!)

Google has enabled IPv6 for most services but ISPs have to contact them and verify that their IPv6 is working properly before their users can take advantage of this.

I'm writing about this to spread the word.  Many readers of this blog work at ISPs and hopefully many of them have IPv6 rolled out, or are in the process of doing so.

Technically here's what happens:  Currently DNS lookups of www.google.com return A records (IPv4), and no AAAA records (IPv6).  If you run an ISP that has rolled out IPv6, Google will add you (your DNS servers, actually) to a white-list used to control Google's DNS servers.  After that, DNS queries of www.google.com will return both an A and AAAA record(s).
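To see which kind of answer your own resolver is getting, something like the following sketch can help. It asks the local resolver for every address of a hostname and sorts them into IPv4 (A) and IPv6 (AAAA) piles. The function names are made up for illustration, and what comes back depends entirely on the resolver you are sitting behind.

```python
import ipaddress
import socket


def split_by_family(addresses):
    """Split address strings into (ipv4, ipv6) lists, mirroring A vs AAAA answers."""
    v4, v6 = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        (v6 if ip.version == 6 else v4).append(text)
    return v4, v6


def resolve(hostname):
    """Ask the local resolver for every address it will return for hostname."""
    infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling split_by_family(resolve("www.google.com")) returns the two lists. An empty second list means your resolver was not handed any AAAA records.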

What's the catch?  The catch is that they are enabling it on a per-ISP basis. So, you need to badger your ISP about this.

Why not just enable it for all ISPs?  There are some OSs that have default configurations that get confused if they see an AAAA record yet don't have full IPv6 connectivity.  In particular, if you have IPv6 enabled at your house, but your ISP doesn't support IPv6, there is a good chance that your computer isn't smart enough to know that having local IPv6 isn't the same as IPv6 connectivity all the way across the internet.  Thus, it will send out requests over IPv6 which will stall as the packets get dropped by the first non-IPv6 router (your ISP).

Thus, it is safer to just send AAAA records if you are on an ISP that really supports IPv6.  Eventually this kind of thing won't be needed, but for now it is a "better safe than sorry" measure.  Hopefully if a few big sites do this then the internet will become "safe" for IPv6 and everyone else won't need to take such measures.

If none of this makes sense to you, don't worry. It is really more important that your ISP understands.  Though, as a system administrator it is a good idea to get up to speed on the issues.  I can recommend 2 great books:
The Google announcement and FAQ is here: Google announces "Google over IPv6". Slashdot has an article too.
Posted by Tom Limoncelli at January 8, 2009 2:05 PM | Comments (5) | TrackBack

SMS is nearly "free" for telecom carriers

Everyone from Slashdot to people I talk with on the street is shocked, shocked, shocked, by the report in the New York Times that TXTing costs carriers almost nothing, even though they've been raising the price dramatically.  (SMS is "Short Message Service", the technical name for what Americans call "TXTing" and what the rest of the world calls "SMS".)

People have asked me, "Is this true?" (it is) so I thought this would be a good time to explain how all of this works.

The phone system uses a separate network for "signaling" i.e. messages like "set up a phone call between +1-862-555-1234 and +353(1)555-1234".  The fact that it is a separate network is for security.  When signaling was "in band" it was possible for phone users to play the right tones and act just like an operator (see Phreaking).  It is also for speed reasons; one wants absolute priority for signaling data.

The protocol is called "SS7" (Signaling System 7).  Like most telco protocols it is difficult to parse and ill-defined.  This is how telcos keep new competition from starting.  They hype SS7 as something so complicated that only rocket scientists could ever understand it.  Of course, it is an ITU standard, so it isn't a secret how it works.  You just have to pay a lot of money to get a copy of the standard. In fact, once Cisco had a working SS7 software stack the downfall of Lucent/AT&T/others was only years away.  Heck, Cisco published a book demystifying SS7.  It turns out the emperor had no clothes and Cisco wanted everyone to know.  SS7 is big and scary, but only as bad as most protocols. I guess SMTP or SNMP would be scary too if you had never seen a protocol before. (Remember that non-audio networks are still "new" to the telecom world, or at least their executives.)

SS7 is all about setting up "connections".  When I dial a number, SS7 packets are sent out that query databases to translate the phone number I want to dial to a physical address to connect to, then an SS7 query goes out to request that all the phone switches from point A to point B allocate bandwidth and start letting audio through.  The nomenclature dates back to what was used when phone calls were set up by ladies sitting in front of switchboards.

What makes international dialing work is that there are SS7 gateways between all the carriers.  They don't charge each other for this bandwidth because it is just the cost of doing business.  The logs of what calls are actually made are used to create billing records, and the carriers do charge each other for the actual calls.  Thus, there is no charge for the SS7 packets between AT&T and O2 (O2 is a big cell provider in Europe), but O2 does back-bill AT&T for the phone call that was made. (This is called "Settlement" and my previous employer processed 80% of the world's settlement records on behalf of the phone companies.)

Setting up a connection for an SMS would be silly.  An entire connection for just a 160-character message?  No way.  That's more trouble than it is worth.  Therefore, SMS is the only service where the actual service is provided over SS7.  The 160-character limit (140 bytes of 7-bit characters) comes from a limit in SS7 packet size.
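The familiar 160 figure is a matter of packing. A single message carries 140 octets of user data, and the default GSM alphabet uses 7 bits per character. Here is the arithmetic as a tiny sketch; the names are invented for illustration.

```python
PAYLOAD_OCTETS = 140   # user-data octets available in one SMS
BITS_PER_OCTET = 8


def max_chars(bits_per_char):
    """How many whole characters fit in one message at a given character width."""
    return (PAYLOAD_OCTETS * BITS_PER_OCTET) // bits_per_char
```

max_chars(7) gives the 160 characters everyone knows, max_chars(8) gives 140 for 8-bit data, and max_chars(16) gives 70 for 16-bit encodings.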

However, the phone companies don't really do anything for free.  The SMS records are used to construct billing data and the companies certainly do back-bill each other for SMS carried by each other's networks.  If you SMS from AT&T to O2, there is settlement going on after the fact. However, SMS between two AT&T customers has no real cost.

"Multimedia SMS" (photos) are not sent over SS7, though SS7 is used to setup/teardown the connection just like a phone call.  If they were smart they'd use SS7 to just transmit an email address and then send the photo over the internet.  It would probably be cheaper.  (Though, when has a telco had a well-run email system?  Sigh.)

So, SMS is "free" because it rides on the back of pre-existing infrastructure.  The "cost" is due to the false economics created to "extract value" out of the system (i.e. "charge money").

If they were doing it all from scratch, they could probably run it all over the internet for "free" too.  Heck, it wouldn't be much bandwidth even if people learned to type 100x faster.

Why was SMS permitted to use SS7 unlike any other service? The real reason, I'm told, wasn't entirely technical.  It was due to the fact that the telcos thought that nobody would actually use the service. Little did they know that it would catch on among teens and then spread!

More info:
Posted by Tom Limoncelli at December 30, 2008 10:52 AM | Comments (3) | TrackBack

Amazon's Kindle

I got a demo of Amazon's Kindle the other day and was very impressed. I hadn't realized that it had a built-in cellphone-based data connection so you could always download more content. The speed was a little slow, but for reading a book I think it was perfect. I'm considering getting one.

Today I got email from Amazon reminding me that if I shill for them on my blog, readers can get a $100 discount. You just have to apply for an Amazon credit card and use this link.

Do I feel bad about shilling for Amazon? Well, not if it gets my readers a $100 discount. It is a product that friends of mine are happy with and I'm impressed by the demos I've seen.

Posted by Tom Limoncelli at August 25, 2008 10:06 AM | Comments (0) | TrackBack

April showers bring May Flowers... but May brings...

April Showers bring May Flowers. What does May bring? Three-day weekends that make A/C units fail!

This is a good time to call your A/C maintenance folks and have them do a check-up on your units. Check for loose or worn belts and other problems. If you've added more equipment since last summer your unit may now be underpowered. Remember that if your computers consume 50 kW of power, your A/C units should be using about the same (or more) to cool those computers. That's the laws of physics speaking; I didn't invent that rule. The energy it takes to create heat equals the energy required to remove that much heat.
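To put a number on the heat side of that rule, here is a small sketch that converts an electrical load into the units A/C vendors usually quote. It assumes essentially all the power the computers draw ends up as heat in the room, and it ignores people, lights and heat seeping through the walls. The function name is made up for illustration.

```python
BTU_PER_HOUR_PER_KW = 3412.14    # 1 kW of heat expressed in BTU per hour
BTU_PER_HOUR_PER_TON = 12000     # 1 "ton" of cooling capacity


def heat_load(kilowatts):
    """Return (BTU/hour, tons of cooling) for a given equipment load in kW."""
    btu_per_hour = kilowatts * BTU_PER_HOUR_PER_KW
    return btu_per_hour, btu_per_hour / BTU_PER_HOUR_PER_TON
```

For the 50 kW example, heat_load(50) works out to roughly 170,600 BTU/hour, or a bit over 14 tons of cooling.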

Why do A/C units often fail on a 3-day weekend? During the week the office building has its own A/C. The computer room's A/C only has to remove the heat generated by the equipment in the room. On the weekends the building's A/C is powered off and now the 6 sides (4 walls, floor and ceiling) of the computer room are getting hot. Heat seeps in. Now the computer room's A/C unit has more work to do.

A 3-day weekend is 84 hours (Friday 6pm until Tuesday 6am). That's a lot of time to be running continuously. Belts wear out. Underpowered units overheat and die. Unlike a home A/C unit which turns on for a few minutes out of every hour, a computer-room A/C unit ("industrial unit") runs 40-50 minutes out of every hour. Something running that much has to be specially engineered.
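The 84-hour figure is easy to check. This sketch uses one particular long weekend as an assumption (Friday, May 23, 2008 at 6pm until Tuesday, May 27 at 6am); any Friday-to-Tuesday pair with the same times gives the same answer.

```python
from datetime import datetime


def hours_unattended(everyone_leaves, everyone_returns):
    """Hours the computer room runs with nobody in the building."""
    return (everyone_returns - everyone_leaves).total_seconds() / 3600


# Assumed dates for illustration: a Friday evening to the following Tuesday morning.
long_weekend = hours_unattended(datetime(2008, 5, 23, 18), datetime(2008, 5, 27, 6))
```

Compare that with a normal weekend (Friday 6pm to Monday 6am), which is 60 hours.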

Most countries have a 3-day weekend in May. By the 2nd or 3rd day the A/C unit is working as much as a typical day during the summer. If your computer room doesn't survive that weekend, imagine a summer full of days just like it.

To prevent a cooling emergency make sure that your monitoring system is also watching the heat and humidity of your room. There are many SNMP-accessible units for less than $100. If you detect temperatures of 38 degrees C you should be alerted. Moreover, if that rises to 40 within 30 minutes it is unlikely that the temperature will go down on its own. You can reduce some of the heat in the room by simply shutting down some non-essential machines (The Practice of System and Network Administration has tips about creating a "shutdown list"). Having the ability to remotely power off machines can save you a trip to the office. Lacking that, shutting down a machine will make it generate less heat even if it is powered up. Sitting at a "press any key to boot" prompt often generates little heat compared to a machine that is actively processing. If powering off the non-critical machines isn't enough, shut down critical equipment but not the equipment involved in letting you access the monitoring systems (usually the network equipment). That way you can bring things back up remotely. Of course, as a last resort you'll need to power off those bits of equipment too.
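The two thresholds above can be sketched as a small check that a monitoring system might run against its recent readings. This is only an illustration: the names and the three result labels are made up, and a real setup would pull the readings from whatever SNMP sensor you bought.

```python
from datetime import datetime, timedelta

ALERT_C = 38.0                        # page somebody at this temperature
RUNAWAY_C = 40.0                      # reaching this quickly means it won't fix itself
RISE_WINDOW = timedelta(minutes=30)


def classify(readings):
    """readings is a list of (datetime, celsius), oldest first.

    Returns "ok", "alert", or "runaway" (hit 40 within 30 minutes of crossing 38).
    """
    if not readings:
        return "ok"
    now, current = readings[-1]
    if current < ALERT_C:
        return "ok"
    # Walk backwards to find when the current run of hot readings began.
    streak_start = now
    for when, temp in reversed(readings):
        if temp < ALERT_C:
            break
        streak_start = when
    if current >= RUNAWAY_C and now - streak_start <= RISE_WINDOW:
        return "runaway"
    return "alert"
```

A "runaway" result is the cue to start working down the shutdown list rather than waiting to see whether the room recovers.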

Having a cooling emergency? Cooling units can be rented on an emergency basis to help you through a failed cooling unit, or to supplement a cooling unit that is underpowered. There are many companies looking to help you out with a rental unit.

If you have a small room that needs to be cooled (a telecom closet that now has a rack of machines) I've had good luck with a $300 unit available at Walmart. For $300 it isn't great, but I can buy one in less than an hour without having to wait for management to approve the purchase. Heck, for $300 you can buy two and still be below the spending limit of a typical IT manager. The Sunpentown 1200 and the Amcor 12000E are models that one can purchase for about $600 that re-evaporate any water condensation and exhaust it with the hot air. Not having to empty a bucket of water every day is worth the extra cost. The unit is intended for home use, so don't try to use it as a permanent solution. (Not that I didn't use it for more than a year at one company.) It has one flaw... after a power outage it defaults to being off. I guess that is typical of a consumer unit. Be sure to put a big sign on it that explains exactly what to do to turn it back on after a power outage. (The sign I made says step by step what buttons to press, and what color each LED should be if it is running properly. I then had a non-system administrator test the process.)

In summary: test your A/C units now. Monitor them, especially on the weekends. Be ready with a backup plan if your A/C unit breaks. Do all this and you can prevent an expensive and painful meltdown.

Posted by Tom Limoncelli at May 9, 2008 1:07 PM | Comments (5) | TrackBack

HostDB 1.002 released!

A few years ago I released HostDB, my simple system for generating DNS domains. The LISA paper that announced it was called: HostDB: The Best Damn host2DNS/DHCP Script Ever Written.

I just released 1.002 which adds some new features that make it easier to generate MX records for domain names with no A records, and not generate NS records for DNS masters. Other bug fixes and improvements are included.

HostDB is released under the GPL, supported on the HostDB-fans mailing list, and supported by the community. This recent release includes patches contributed by Sebastian Heidl.

HostDB 1.002 is now available for download.

Posted by Tom Limoncelli at February 6, 2008 8:36 PM | Comments (2) | TrackBack

Easier Xen management with Google Ganeti

Managing Xen instances is a drag. So my buddies in the Google Zürich office built a system for managing them. Now life is great! The team I manage has put Xen clusters all over the world, all managed with Ganeti. It rocks. I'm proud to see it is available to everyone now under a GPLv2 license.

When I first heard the name, I thought it sounded like a new kind of Italian dessert. But what do you expect from a guy with a last name like "Limoncelli"?

Posted by Tom Limoncelli at August 31, 2007 11:07 AM | Comments (0) | TrackBack

Hardware password recovery

Hardware didn't used to have passwords. Your lawnmower didn't have a password, your car didn't have a password, and your waffle iron didn't have a password.

But now things are different. Hardware is much smarter and now often requires a password. Connecting to the console of a Cisco router asks for a password. A Toyota Prius has an all-software entry system.

My first experience with being "locked out" of my own hardware was a Cisco router in 1991. Luckily every Cisco device has a way to work around the password. In fact, Cisco maintains the Password Recovery Process web site that links to the procedure for every device they've ever made. Now one might say, "but if there is a way to work around a password, it isn't very secure, is it?" These procedures all require some kind of physical access. A Sun workstation requires you to press L1 and "A" at the same time, and these keys are only on a physical console. Some appliances require you to boot them while holding down a button. Requiring that the machine be powered up while holding down a certain button means you have access to the power switch, and if you have access to the power switch you can perform one heck of a denial of service attack. If you have physical access there are worse things you can do than reset the password, like smash the box with a sledge hammer.

In the 2nd edition of The Practice of System and Network Administration one of the updates we needed to include was our discussion of physical access to machines. The first edition was written mostly in 1999/2000 and at the time remote access to consoles was something that only Unix servers did, and Windows servers were just starting to get KVM switches that permitted over-the-network access (IP-KVMs). The 2nd edition tries to treat the issue more even-handedly since both Windows and Unix communities now recognize the benefit of remote consoles. We also re-emphasize the importance of security in KVM and other remote-console access systems. Hardware designers assume that physical access will be restricted. Adding a remote-console system means attackers no longer need physical access to attack the console.

I've always looked for the ability to reset hardware passwords on any new equipment I buy. I make sure there are three ways to access these instructions. On my "sysadmins wiki" I make a link to the instructions on the vendor's web site, and I copy those instructions into the wiki so that I have a recent copy just in case I need them when I don't have internet access. My third copy is a printed copy that I tape to the side of the device (be careful not to block the vents).

Not every company realizes the importance of this. I recently bought a used Sony tape library (LIB-162) and couldn't find a way to reset the password. Luckily I didn't need the password for the basic functionality required to do backups. However, to get access to the web-based administration system or do software upgrades one needs the password. The previous owner doesn't know the password, thus I am stuck.

The manual says that one can't reset the password and I should call my dealer. I figured they just didn't want the information spread around, so I contacted Sony and they gave me the right people to speak with. A very helpful person named Lucia informed me that I could send in the device and for only $899 they would reset the password. That seemed unreasonable, but escalating it brought me no joy. John Marshall, Customer Service/Support Manager at Sony (not to be confused with the Chief Justice nor the famous percussionist) was very polite and friendly, but was not able to tell me the secrets to doing the process myself. I even offered to sign a non-disclosure if the process was secret. No luck. He offered to reduce the rate to $699 but that was unsatisfactory.

More and more of the products that sysadmins deal with are sold as appliances. It's a relatively new industry (I'm being sarcastic) so companies are still figuring out the "norms" that customers expect. I'm not sure what the entire list should be, but I know that the ability to reset the configuration without spending $899 should be one of them.

So in the meanwhile I'll be using only the minimal features of the device which is ok because I had planned on using this purely for a hobby project. However, I can't recommend that anyone purchase products from Sony Storage until they stop designing their products this way. There is too much business risk in a product like the Sony LIB-162 AIT tape library at any price.

Posted by Tom Limoncelli at May 31, 2007 10:59 AM | Comments (6) | TrackBack

Anti-spam trick: Grey listing

There is an anti-spam technique called "Grey Listing" which has almost completely eliminated spam from my main server. What's left still goes through my SpamAssassin and Amavis-new filters, but they have considerably less work to do.

The technique is more than a year old but I've only installed a greylist plug in recently and I'm impressed at how well it works. I hope by writing this article other people that have procrastinated will decide to install a greylist system.

(for those that want technical specifics, I'm using Postfix plus Postgrey. If you use FreeBSD, just do "portinstall mail/postgrey" assuming you are already using Postfix. Sendmail users, please post some comments directing people to the Milter equivalent!)

So how does grey listing work?

Well, you know that a "black list" is a list of sites you block, and a "white list" is a list of sites that you always permit. A grey list is somewhere in between.

The basic principle is that spammers don't retry an email that couldn't be delivered. There are two kinds of "can't be delivered" (actually, more than that but two are important here). One is a "hard failure"... the email can't be delivered and nothing is going to fix it. For example, you are trying to send email to an account that doesn't exist. The second type is a "soft failure", which is a problem that is temporary. In other words, a disk is full, or there is some kind of system problem that will be fixed soon. If you get a "hard error" the email is bounced. If you get a "soft failure" the sending server is supposed to wait a bit of time and retry. That's why when you run out of disk space email stops flowing, but when you fix the problem (delete that out-of-control log file or whatever) you suddenly get a flood of backlogged email.

Spammers don't retry sending email whether it is a hard or soft failure. When you are sending email to tens of millions of addresses, it's too difficult to keep track of failure codes. Besides, even if they don't get their spam sent to 20% of their list, they're still sending it to millions of addresses. Good enough, eh?

So here's how grey listing works. The first time someone tries to send you email, send a "soft error" result code. If they retry more than 5 minutes later, then actually accept it. If they are a spammer you'll never get a retry. If they are legitimate then you'll get a retry.

Implementing this is extremely simple. When someone tries to send email, gather 3 items of information: the source IP address, the From:, the To:. Maintain a database of these 3-tuples. If you haven't seen that 3-tuple before, send the "soft failure" code. If you have seen that 3-tuple already and it was more than 5 minutes ago, accept the message.
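The 3-tuple logic can be sketched in a few lines. This is a toy illustration, not how Postgrey is actually written: the class and method names are invented, the "database" is an in-memory dict, and the clock is passed in so the behavior can be tried without waiting 5 minutes.

```python
import time

RETRY_DELAY = 5 * 60   # seconds a sender must wait before a retry is accepted


class Greylist:
    def __init__(self, clock=time.time):
        self._first_seen = {}    # (ip, from, to) -> time of first attempt
        self._clock = clock

    def check(self, client_ip, mail_from, rcpt_to):
        """Return "accept" or "tempfail" (the SMTP "soft failure")."""
        key = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self._clock()
        # Remember the first time this 3-tuple showed up; later calls reuse it.
        first = self._first_seen.setdefault(key, now)
        if now - first > RETRY_DELAY:
            return "accept"
        return "tempfail"
```

A brand-new 3-tuple always gets "tempfail". The same 3-tuple retried six minutes later gets "accept", while a retry after one minute is still told to come back later.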

It's amazingly simple yet it seems to be blocking about 80% of my spam right now.

Now, you may be thinking, "I can't have a 5-minute delay on all my email! That's crazy!" Well don't worry. Systems like Postgrey take this all one step further. For example, if 5 emails get through in the last month, Postgrey decides this IP address must be ok and adds it to a list that is "white listed".

Thus, the system tunes itself. Common senders immediately get into the whitelist (Yahoo, gmail, and so on). Sites that disappear eventually get expired from the list because you don't hear from them in 30 days. That makes the database self-cleaning. All maintenance is automatic.
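The self-tuning part can be sketched the same way. Again, this is an illustration of the idea as described above and not Postgrey's real code; the names, the in-memory dict and the injected clock are all assumptions.

```python
import time

WHITELIST_AFTER = 5                  # successful deliveries needed
EXPIRE_AFTER = 30 * 24 * 3600        # forget clients silent for 30 days


class AutoWhitelist:
    def __init__(self, clock=time.time):
        self._clients = {}           # ip -> (delivered_count, last_seen)
        self._clock = clock

    def record_delivery(self, client_ip):
        """Call this each time a message from client_ip is finally accepted."""
        count, _ = self._clients.get(client_ip, (0, 0))
        self._clients[client_ip] = (count + 1, self._clock())

    def is_whitelisted(self, client_ip):
        """True if this client may skip the greylist delay."""
        entry = self._clients.get(client_ip)
        if entry is None:
            return False
        count, last_seen = entry
        if self._clock() - last_seen > EXPIRE_AFTER:
            del self._clients[client_ip]     # self-cleaning: drop stale clients
            return False
        return count >= WHITELIST_AFTER
```

The mail server would consult is_whitelisted() first and only fall back to the 3-tuple check for clients that aren't on the list yet.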

I can't believe I didn't install this years ago!

--Tom

P.S. I've also added "reject_non_fqdn_hostname" to the Postfix variable "smtpd_helo_restrictions". That means that when an SMTP server issues a "HELO hostname" the email is rejected if "hostname" isn't a FQDN. This rejects about 80% of the spam I'm getting... most of which just sends "HELO friend". I haven't had any complaints from users about false-positives since I implemented this a month ago. This technique reduced spam by 80% and Postgrey reduced spam by a different-but-overlapping 80%. When both are enabled, I receive very little spam. Enough for Amavis-new and SpamAssassin to take care of easily.
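For a feel of what kind of HELO name gets turned away, here is a rough approximation of an "is this a fully-qualified name?" test. It is not Postfix's actual logic, just a sketch with an invented function name: it wants at least two dot-separated labels made of letters, digits and hyphens, and it lets bracketed address literals through.

```python
import re

# One DNS label: 1-63 letters, digits or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_like_fqdn(helo_name):
    """Rough check of a HELO argument; an approximation, not Postfix's rule."""
    if helo_name.startswith("[") and helo_name.endswith("]"):
        return True                      # address literal such as [192.0.2.1]
    name = helo_name[:-1] if helo_name.endswith(".") else helo_name
    labels = name.split(".")
    if len(labels) < 2:
        return False                     # "friend" has no domain part
    return all(_LABEL.match(label) is not None for label in labels)
```

"HELO friend" fails this test, while "HELO mail.example.com" passes.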

Posted by Tom Limoncelli at April 20, 2006 4:32 PM | Comments (3) | TrackBack

Today's Unix Security Trivia

If you write to a file that is SUID (or SGID) the SUID (and SGID) bits on the file are removed as a security precaution against tampering (unless uid 0 is doing the writing).
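You can watch this happen with a throwaway file. The sketch below (function name invented) sets the SUID bit on a temporary file, appends to it, and reports whether the bit was set before and after the write. Run it as a regular user; as uid 0 the bit is expected to survive, and exact behavior can vary by operating system and filesystem.

```python
import os
import stat
import tempfile


def suid_before_and_after_write():
    """Return (suid_set_before_write, suid_set_after_write) for a scratch file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o4755)                       # set the SUID bit
        before = bool(os.stat(path).st_mode & stat.S_ISUID)
        with open(path, "ab") as handle:             # now tamper with the file
            handle.write(b"tampered\n")
        after = bool(os.stat(path).st_mode & stat.S_ISUID)
        return before, after
    finally:
        os.remove(path)
```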

(See FreeBSD 5.4 source code, sys/ufs/ffs/ffs_vnops.c:739)

Posted by Tom Limoncelli at March 22, 2006 10:23 PM | Comments (0) | TrackBack

The Jifty buzz

Everyone that has seen me speak knows that I love RT for tracking user requests. I was IMing with the author of RT today and he said that for his next product he realized he should first write a good tool that lets him make AJAXy applications without having to do all the work manually. He's done that, and it's called Jifty. Now he's building apps based on that. The first one has as many features as RT but is 1/10th the code base. Awesome! Sounds like Jifty is going to be a big hit! (You can find Jifty in CPAN already.)

Oh, and what's the new app called? Hiveminder.

Let the rumors fly! :-)

Posted by Tom Limoncelli at March 2, 2006 7:33 PM | Comments (3) | TrackBack

Compressing logs is good

It's obvious but I didn't think of one particular reason why until the end of this journey.

[ This is a first draft. Feedback is greatly appreciated.]

It's obvious but I didn't think of it until the end. Compressed logs are good. Really good.

I just had a "disk full" situation on /var. No problem, a little "du -sk *" and I identify the problem. /var/log is huge, nearly the whole 1G disk allocated to /var. I do "du -sk /var/log/*" and discover that FreeBSD's default Apache installation puts all its logs in "/var/log/web/" and that is the real culprit.

No worries. I made /var too small and I'll solve this the way I always do: Move the directory to another place and make a symbolic link.

I'm documenting what I did because it might be educational to people new to such things.

First some background...

I don't have this problem on my other server because there I have a custom Apache config that puts everything in /home/apache (/home/apache/logs, /home/apache/conf, /home/apache/this and /home/apache/that). /home is huge, so I don't worry too much. I was caught off-guard on this server. As you see, for more than one reason.

One of my annoyances with Apache is that every operating system has a different layout of where the various Apache files are kept. I am a bear of little brain, so on most machines I create a directory called /home/web and then make symbolic links in it for conf, logs and htdocs that point to where ever that OS decided to put the configuration files, the logs, and the documents. It really saves me a lot of time.

On my Solaris server I custom-compile Apache and made my own "layout" file so that when I build a new release it will be configured for my particular "/home/apache" layout. I've done this for so long that I often forget that everyone doesn't do this.

When I set up my new server with FreeBSD I was so happy with the "ports" system for installing things like Apache that I forgot to note that it was putting the logs on /var/log/web, which is a small partition on that system. Actually, I didn't "forget to note" this. I noticed it enough that I made a symbolic link from /home/web/logs to /var/log/web. So obviously I noticed it, but I didn't stop to think, "Is that a good place for logs?" and now that that disk has filled I realize that I should have taken the time back then to store the logs someplace with more room.

Anyway...

I have three policies for how I store weblogs. First, I use a custom log so that I record referral data. Second, I never throw away logs. Third, I keep the logs for each virtual host in a different file. Thus, I have everythingsysadmin.com-access_log (and -error_log) as well as, for example, whatexit.org-access_log (and -error_log). The log for everythingsysadmin.com was extremely big, thus my disk space problem.

Moving the files with minimal downtime

I'm very particular about this kind of surgery. I want to move the data without corrupting it, I don't want to make mistakes, and I want my web server to be down for the least amount of time.

Therefore I rsync'ed the data "live", then shut down Apache, rsync'ed the data again, moved the log directory, created the symbolic link, and restarted Apache. This minimized the outage, which is especially important today since my book was mentioned on Slashdot today and I was expecting a decent number of hits.

Here's what I did in more detail:


mkdir -p /home/web/logs.new ; cd /var/log/web && rsync -avP . /home/web/logs.new/.

We're copying live data. That's bad. The moment we're done with the copy more will appear. However, we'll mitigate that. Read on.

Notice the caution in this statement. We make the new directory and use "-p" so that if it already exists it won't be an error. Then the "cd" is joined to the "rsync" with "&&". This means "don't do the second command if the first command failed." In other words, if I mistyped "cd /var/log/web" then the rsync won't be attempted. This is good because normally a failed "cd" might have left me in "/home/root" and I wouldn't want all those files copied to /home/web/logs. Also I don't use the "-R" (relative) option to rsync. Instead, I make sure that the source and destination directories already exist, and then specify them both as "." (or "blah/blah/blah/."). I do this because "rsync" has different behavior depending on whether it had to create the destination directory or not. That's bad. Bad and confusing. Bad, confusing and frustrating when I'm developing a command that I'm going to run more than once.

As another precaution, I did that command in a window I opened just for that purpose. I'm going to run that sequence of commands a few times, so now I can just use command history to run it over and over. Why is this important? Because I don't want to re-type this long sequence of commands every time I do the process. I want it 100% repeatable. I could make it into a shell script, but that's overkill.

Also note that I'm not copying the data to "/home/web/logs" but to "/home/web/logs.new". That's because /home/web/logs is a symbolic link to /var/log/web, and it would be silly to copy things to where they already are. Scripts and cronjobs might be accessing /home/web/logs so I don't want to muck with it until I'm ready.

While that's copying I used a different window to construct this command:


cd /home/web && mv logs logs.old && mv logs.new logs &&
cd /var/log && mv web web.old && ln -s ../../home/web/logs web

(I actually constructed this as one long line, but it is split here to be more readable.)

(To be clear, I typed this command but didn't execute it.)

The first part of this moves /home/web/logs out of the way and moves the newly copied log directory into place. The second part of this moves my current log directory to "web.old" and makes a symbolic link to the new location.

Now I move my mouse to that other window and repeat the rsync command. This time it should run a lot quicker because rsync is an incremental copy. If it sees that the data hasn't changed much, it only copies "what's new". (And if you want to know how it does that, read this amazing transcript of a lecture by the author.)

The second copy went very quickly just as I expected. That's a good sign. If it took a long time I'd start checking to see if I had mistyped the command. If it happened instantly, I'd be worried because it should be fast but not instant.

Now I'm ready to make "the big switch".

Here's what I did:

  1. Re-run the rsync
  2. Immediately do an apachectl stop (this shuts down the web server)
  3. Re-run the rsync again. This time it should be extremely fast, nearly a noop.
  4. Press ENTER on that command line that switches around the directories and symlinks.
  5. Test! "cd /home/web/logs" and "cd /var/log/web" and make sure you get the expected results.
  6. Restart Apache with apachectl start
  7. Test the web sites I host to make sure they're still working.

And during that 7-step process, don't forget to breathe. It has to happen quickly, but not if "fast" means "I'm going to make mistakes".

So what about compression?

Well, when I was setting up this FreeBSD server I was very impressed by the "ports" collection (which is like RPMs from Linux, except it doesn't suck). So impressed that I forgot that there was more work to be done.

I have a script that rotates the weblogs when they get too big. It's a tricky task because I want to rotate them when they get to a certain size, not every so-and-so days. However, if you rotate the -access_log you have to rotate the -error_log too. The files are then compressed, but only after being rotated. I wrote a script that I use on my Solaris server.
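The script itself isn't shown here, so the following is only a sketch of the same idea and not the real thing. The function names and the size threshold are invented. It rotates a site's -access_log once it passes a size limit, drags the matching -error_log along with it, gives both a timestamp suffix with a single rename each, and compresses afterwards.

```python
import gzip
import os
import shutil
import time

MAX_BYTES = 50 * 1024 * 1024   # hypothetical threshold; pick what suits your disk


def rotate_pair(log_dir, site, max_bytes=MAX_BYTES, now=None):
    """Rotate <site>-access_log and <site>-error_log together; return new paths."""
    access = os.path.join(log_dir, site + "-access_log")
    if not os.path.exists(access) or os.path.getsize(access) < max_bytes:
        return []                            # not big enough yet, leave both alone
    stamp = time.strftime("%Y%m%d:%H%M%S", time.localtime(now))
    rotated = []
    for kind in ("access_log", "error_log"):
        live = os.path.join(log_dir, f"{site}-{kind}")
        if not os.path.exists(live):
            continue
        target = f"{live}.{stamp}"
        os.rename(live, target)              # one rename per file
        rotated.append(target)
    return rotated


def compress(path):
    """Gzip a rotated log and remove the uncompressed copy."""
    with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return path + ".gz"
```

One step is deliberately left out: Apache keeps writing to the renamed files until it is told to reopen its logs, so a real script has to restart or signal Apache between the rename and the compression.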

I copied the script over to this server, checked it for portability issues, and ran it. Since the files had not been rotated or compressed in ages, it rotated nearly every file and then started compressing them. Web logs compress down to 1% or 2% of their original size. It's quite impressive.
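The script itself isn't shown here, so the following is only a rough Python sketch of the behavior described above: rotate by size, always move the -error_log along with the -access_log, and compress only after the rename. The function name `rotate_pair`, its parameters, and the use of gzip are assumptions for illustration. A real script would also have to get Apache to reopen its log files between the rename and the compression; that step is left out.

```python
import gzip
import os
import shutil


def rotate_pair(log_dir, site, suffix, max_bytes):
    """Rotate <site>-access_log and <site>-error_log together.

    Nothing happens unless the access log is larger than max_bytes.
    Returns the paths of the compressed files that were created.
    """
    access = os.path.join(log_dir, site + "-access_log")
    if not os.path.exists(access) or os.path.getsize(access) <= max_bytes:
        return []

    created = []
    for kind in ("-access_log", "-error_log"):
        live = os.path.join(log_dir, site + kind)
        if not os.path.exists(live):
            continue
        rotated = live + "." + suffix
        os.rename(live, rotated)  # rotate first...
        # (assumed omitted step: tell the web server to reopen its logs here)
        with open(rotated, "rb") as src, gzip.open(rotated + ".gz", "wb") as dst:
            shutil.copyfileobj(src, dst)  # ...then compress
        os.remove(rotated)
        created.append(rotated + ".gz")
    return created
```

The size check looks only at the access log, so a small error log still gets rotated whenever its partner does.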

The "disk full" problem was, fundamentally, that the script wasn't running. If the logs aren't compressed, they take 1Gig of space instead of 10Meg. In fact, at 10Meg they could have stayed in the original place. However, I didn't notice that until the entire process was done.

Oh well. Hindsight really is 20/20!


P.S. On the other hand, having them on /home is much better than /var for other reasons. I tend to be a little more careful about backing up /home.


Update:

Why not newsyslog.conf? An excellent question.

First, I already had a script that did exactly what I wanted. I want all my servers to have the same, repeatable process.

Secondly, the script is able to move the -error_log file if any -access_log is moved. I don't think newsyslog.conf can do parallel moves.

Lastly, I don't use a ".0", ".1", ".2" system. Instead, I use .YYYYMMDD:HHMMSS, which makes the logs easier to process. I don't think newsyslog.conf can do that (though I haven't done a lot of research). Since I'm keeping the logs forever, I don't want to rotate the files (doing n renames for n files); I just want to do 1 rename for each file.
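To illustrate the difference in renames, here is a small Python sketch. The helper names are hypothetical, and the strftime pattern is my reading of ".YYYYMMDD:HHMMSS"; the numbered version is a generic model of a .0/.1/.2 scheme, not a description of what newsyslog actually does.

```python
from datetime import datetime


def timestamp_suffix(now):
    """Build a .YYYYMMDD:HHMMSS suffix from a datetime."""
    return now.strftime(".%Y%m%d:%H%M%S")


def numbered_renames(name, kept):
    """Renames a .0/.1/.2 scheme needs when `kept` old copies already exist."""
    # Shift the oldest copy first so nothing gets overwritten.
    steps = [(f"{name}.{i}", f"{name}.{i + 1}") for i in range(kept - 1, -1, -1)]
    steps.append((name, f"{name}.0"))
    return steps


def timestamped_renames(name, now):
    """A timestamp scheme needs one rename, however many old copies exist."""
    return [(name, name + timestamp_suffix(now))]
```

With three old copies on disk, the numbered scheme does four renames per log file; the timestamp scheme always does one, and the old files never change names again.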

Posted by Tom Limoncelli at February 10, 2006 9:36 PM | Comments (5) | TrackBack

Raised Floors not sufficient for datacenters?

techtarget.com reports:
The problem is, directing cold air is like trying to herd cats. Air is unpredictable. Your cooling unit is sucking in air, cooling it and then throwing it up through a perforated floor. But you have little control over where that air is actually ending up.
Two different vendors are promoting more aggressive cooling systems for modern racks.

Posted by Tom Limoncelli at November 3, 2005 7:36 PM | Comments (0) | TrackBack

Monad, Microsoft's answer to Bash

Ars Technica has an excellent article about MSH.

If you love perl and/or bash, you'll be interested in reading this tutorial. It gives some excellent examples that explain the language.

Posted by Tom Limoncelli at October 24, 2005 2:09 PM | Comments (0) | TrackBack

Always the friendly sysadmin

"When I see a person I don't recognize in the office, I always smile, stop, introduce myself, and ask for the person's name. I then ask to read it off his ID badge 'to help me remember it. I'm a visual learner.' New people think I'm being friendly. I'm really checking for trespassers."
This and other great tips can be found here.

Posted by Tom Limoncelli at October 15, 2005 1:30 PM | Comments (0) | TrackBack

Destroy CDs and DVDs easily

Now you all know what I want for Christmas!

Posted by Tom Limoncelli at September 30, 2005 10:15 AM | Comments (1) | TrackBack

Solaris users: Blastwave.org needs our help!

A while back I recommended BlastWave as a great source of pre-built binaries for Solaris. Their service has saved me huge amounts of time.

Sadly, they are running low on funds. It's expensive to keep a high-profile web site like this up and running. Corporate donors are particularly needed.

I just donated $50. I hope you consider donating to them too. Otherwise, in less than 48 hours, they may have to shut down.

Posted by Tom Limoncelli at August 31, 2005 1:10 PM | Comments (0) | TrackBack

Solaris package tip

Since I'm more of an OS X/FreeBSD/Linux person lately, I've gotten a bit out of touch with Solaris administration. I was quite pleasantly surprised to find CSW - Community SoftWare for Solaris, which includes hundreds of pre-built packages for Solaris. More importantly, it provided the three I really needed and didn't have time to build. :-)

The system is really well constructed. I highly recommend it to everyone. Give this project your support!

Posted by Tom Limoncelli at May 22, 2005 10:26 AM | Comments (2) | TrackBack