It's obvious but I didn't think of one particular reason why until the end of this journey.
[ This is a first draft. Feedback is greatly appreciated.]
It's obvious but I didn't think of it until the end. Compressed logs are good. Really good.
I just had a "disk full" situation on /var. No problem: a little "du -sk *" and I identify the problem. /var/log is huge, taking nearly the whole 1G disk allocated to /var. I do "du -sk /var/log/*" and discover that FreeBSD's default Apache installation puts all its logs in "/var/log/web/", and that is the real culprit.
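For anyone new to this kind of hunt, here is a minimal sketch of it, run against a scratch directory instead of a real /var. The directory names and sizes are made up for illustration. Piping the "du -sk" output through "sort -n" puts the biggest offender on the last line.

```shell
# Build a throwaway tree so the example is safe to run anywhere.
scratch=$(mktemp -d)
mkdir -p "$scratch/log/web" "$scratch/log/misc"

# Hypothetical contents: a 2 MB "web" log directory and a tiny "misc" one.
dd if=/dev/urandom of="$scratch/log/web/access_log" bs=1024 count=2048 2>/dev/null
echo "hello" > "$scratch/log/misc/messages"

# du -sk reports kilobytes per entry; sort -n puts the largest last.
biggest=$(cd "$scratch/log" && du -sk * | sort -n | tail -1 | awk '{print $2}')
echo "biggest directory: $biggest"

rm -rf "$scratch"
```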
No worries. I made /var too small and I'll solve this the way I always do: Move the directory to another place and make a symbolic link.
I'm documenting what I did because it might be educational to people new to such things.
First some background...
I don't have this problem on my other server because there I have a custom Apache config that puts everything in /home/apache (/home/apache/logs, /home/apache/conf, /home/apache/this and /home/apache/that). /home is huge, so I don't worry too much. I was caught off-guard on this server. As you see, for more than one reason.
One of my annoyances with Apache is that every operating system has a different layout of where the various Apache files are kept. I am a bear of little brain, so on most machines I create a directory called /home/web and then make symbolic links in it for conf, logs and htdocs that point to wherever that OS decided to put the configuration files, the logs, and the documents. It really saves me a lot of time.
On my Solaris server I custom-compile Apache and made my own "layout" file so that when I build a new release it will be configured for my particular "/home/apache" layout. I've done this for so long that I often forget that not everyone does this.
When I set up my new server with FreeBSD I was so happy with the "ports" system for installing things like Apache that I forgot to note that it was putting the logs on /var/log/web, which is a small partition on that system. Actually, I didn't "forget to note" this. I noticed it enough that I made a symbolic link from /home/web/logs to /var/log/web. So obviously I noticed it, but I didn't stop to think, "Is that a good place for logs?" and now that that disk has filled I realize that I should have taken the time back then to store the logs someplace with more room.
I have three policies for how I store weblogs. First, I use a custom log so that I record referral data. Second, I never throw away logs. Third, I keep the logs for each virtual host in a different file. Thus, I have everythingsysadmin.com-access_log (and -error_log) as well as, for example, whatexit.org-access_log (and -error_log). The log for everythingsysadmin.com was extremely big, thus my disk space problem.
Moving the files with minimal downtime
I'm very particular about this kind of surgery. I want to move the data without corrupting it, I don't want to make mistakes, and I want my web server to be down for the least amount of time.
Therefore I rsync'ed the data "live", then shut down Apache, rsync'ed the data again, moved the log directory, created the symbolic link, and restarted Apache. This minimized the outage, which is especially important since my book was mentioned on Slashdot today and I was expecting a decent number of hits.
Here's what I did in more detail:
mkdir -p /home/web/logs.new ; cd /var/log/web && rsync -avP . /home/web/logs.new/.
We're copying live data. That's bad. The moment we're done with the copy more will appear. However, we'll mitigate that. Read on.
Notice the caution in this statement. We make the new directory and use "-p" so that if it already exists it won't be an error. Then the "cd" is joined to the "rsync" with "&&". This means "don't do the second command if the first command failed." In other words, if I mistyped "cd /var/log/web" then the rsync won't be attempted. This is good because normally a failed "cd" might have left me in "/home/root" and I wouldn't want all those files copied to /home/web/logs.new.
Also I don't use the "-R" (relative) option to rsync. Instead, I make sure that the source and destination directories already exist, and then specify them both as "." (or "blah/blah/blah/."). I do this because "rsync" has different behavior depending on whether it had to create the destination directory or not. That's bad. Bad and confusing. Bad, confusing and frustrating when I'm developing a command that I'm going to run more than once.
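The difference between "&&" and ";" is easy to see in a harmless experiment. This is only a sketch: the path is a made-up one that should not exist, standing in for a mistyped "cd", and an "echo" stands in for the rsync.

```shell
# Errexit off, so the deliberately failing "cd" cannot abort the demo.
set +e

# Hypothetical path that does not exist, standing in for a typo.
typo_dir="/no/such/directory/web"

# With "&&", the second command only runs if the "cd" succeeded.
guarded=$( (cd "$typo_dir" 2>/dev/null && echo "copy ran") )

# With ";", the second command runs anyway, from wherever we happen to be.
unguarded=$( (cd "$typo_dir" 2>/dev/null ; echo "copy ran") )

echo "guarded:   '${guarded}'"
echo "unguarded: '${unguarded}'"
```

The guarded version prints nothing between the quotes, while the unguarded one reports that the "copy" ran in the wrong place.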
As another precaution, I did that command in a window I opened just for that purpose. I'm going to run that sequence of commands a few times, so now I can just use command history to run it over and over. Why is this important? Because I don't want to re-type this long sequence of commands every time I do the process. I want it 100% repeatable. I could make it into a shell script, but that's overkill.
Also note that I'm not copying the data to "/home/web/logs" but to "/home/web/logs.new". That's because /home/web/logs is a symbolic link to /var/log/web, and it would be silly to copy things to where they already are. Scripts and cronjobs might be accessing /home/web/logs so I don't want to muck with it until I'm ready.
While that's copying I used a different window to construct this command:
cd /home/web && mv logs logs.old ; mv logs.new logs ;
cd /var/log && mv web web.old ; ln -s ../../home/web/logs web
(I actually constructed this as one long line, but it is split here to be more readable.)
(To be clear, I typed this command but didn't execute it.)
The first part of this moves /home/web/logs out of the way and moves the newly copied log directory into place. The second part of this moves my current log directory to "web.old" and makes a symbolic link to the new location.
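To rehearse the swap before doing it for real, the same moves can be tried under a scratch directory that stands in for "/". This is a sketch under a few assumptions: the scratch root and file contents are invented, "cp" stands in for the rsync copy, and it chains every step with "&&" rather than ";" so that a failed step stops the rest.

```shell
# Scratch root standing in for "/" so nothing real is touched.
root=$(mktemp -d)
mkdir -p "$root/var/log/web" "$root/home/web/logs.new"
echo "old entry" > "$root/var/log/web/access_log"

# Stand-in for the rsync copy into logs.new.
cp -p "$root/var/log/web/access_log" "$root/home/web/logs.new/"

# The pre-existing convenience link: /home/web/logs -> /var/log/web
ln -s ../../var/log/web "$root/home/web/logs"

# The swap: move the old link aside, move the copy into place,
# then replace the old log directory with a link to the new one.
(cd "$root/home/web" && mv logs logs.old && mv logs.new logs) &&
(cd "$root/var/log" && mv web web.old && ln -s ../../home/web/logs web)

# Both names should now reach the same directory.
link_target=$(readlink "$root/var/log/web")
via_var=$(cat "$root/var/log/web/access_log")
via_home=$(cat "$root/home/web/logs/access_log")
echo "/var/log/web -> $link_target"

rm -rf "$root"
```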
Now I move my mouse to that other window and repeat the rsync command. This time it should run a lot quicker because rsync is an incremental copy. If it sees that the data hasn't changed much, it only copies "what's new". (And if you want to know how it does that, read this amazing transcript of a lecture by the author.)
The second copy went very quickly just as I expected. That's a good sign. If it took a long time I'd start checking to see if I had mistyped the command. If it happened instantly, I'd be worried because it should be fast but not instant.
Now I'm ready to make "the big switch".
Here's what I did:
- Re-run the rsync
- Immediately do an "apachectl stop" (this shuts down the web server).
- Re-run the rsync again. This time it should be extremely fast, nearly a noop.
- Press ENTER on that command line that switches around the directories and symlinks.
- Test! "cd /home/web/logs" and "cd /var/log/web" and make sure you get the expected results.
- Restart Apache with "apachectl start".
- Test the web sites I host to make sure they're still working.
And during that 7-step process, don't forget to breathe. It has to happen quickly, but not if "fast" means "I'm going to make mistakes".
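I did all of this by hand from command history, but the sequence can also be written down as a dry run that only prints each step. What follows is a sketch, not the procedure I actually ran: the "run" helper and the DRY_RUN variable are hypothetical, "apachectl start" is assumed to be the counterpart of "apachectl stop", and an "ls -ld" stands in for the manual "cd" checks. The last step, testing the web sites, stays manual.

```shell
# Dry run by default: steps are printed, not executed.
# Set DRY_RUN=0 only after reading every command.
DRY_RUN=${DRY_RUN:-1}
steps=0

# run COMMAND-STRING: print the step, or execute it when DRY_RUN=0.
run() {
    steps=$((steps + 1))
    if [ "$DRY_RUN" = "1" ]; then
        echo "step $steps: $1"
    else
        # Subshell so a "cd" inside one step cannot affect the next.
        ( eval "$1" )
    fi
}

sync_cmd='cd /var/log/web && rsync -avP . /home/web/logs.new/.'
swap_cmd='cd /home/web && mv logs logs.old && mv logs.new logs && cd /var/log && mv web web.old && ln -s ../../home/web/logs web'

# "&&" between steps: if one fails, nothing after it is attempted.
run "$sync_cmd" &&
run "apachectl stop" &&
run "$sync_cmd" &&
run "$swap_cmd" &&
run "ls -ld /home/web/logs /var/log/web" &&
run "apachectl start"
```

One thing to keep in mind with a chain like this: if a step fails after the stop, Apache stays down until you sort it out by hand.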
So what about compression?
Well, when I was setting up this FreeBSD server I was very impressed by the "ports" collection (which is like RPMs from Linux, except it doesn't suck). So impressed that I forgot that there was more work to be done.
I have a script that rotates the weblogs when they get too big. It's a tricky task because I want to rotate them when they get to a certain size, not every so-and-so days. However, if you rotate the -access_log you have to rotate the -error_log too. The files are then compressed, but only after being rotated. I wrote a script that I use on my Solaris server.
I copied the script over to this server, checked it for portability issues, and ran it. Since the files had not been rotated or compressed in ages, it rotated nearly every file and then started compressing them. Web logs compress down to 1% or 2% of their original size. It's quite impressive.
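If you want to see how well this kind of data compresses, here is a small experiment. It is only a sketch: the log lines are invented, and the exact percentage will differ from what real logs give you, so treat the printed number as an illustration rather than a measurement.

```shell
scratch=$(mktemp -d)
log="$scratch/example-access_log"

# Generate 20,000 made-up log lines in a common Apache-style format.
i=0
while [ "$i" -lt 20000 ]; do
    echo "192.0.2.$((i % 250)) - - [01/Jan/2005:00:00:00 +0000] \"GET /page$((i % 40)).html HTTP/1.1\" 200 5120 \"http://example.com/\" \"Mozilla/5.0\""
    i=$((i + 1))
done > "$log"

# Compare the size before and after gzip (tr strips wc's padding on BSD).
before=$(wc -c < "$log" | tr -d ' ')
gzip -9 "$log"
after=$(wc -c < "$log.gz" | tr -d ' ')

echo "before: $before bytes, after: $after bytes"
echo "compressed to $((after * 100 / before))% of original"

rm -rf "$scratch"
```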
The "disk full" problem was, fundamentally, that the script wasn't running. If the logs aren't compressed, they take 1Gig of space instead of 10Meg. In fact, at 10Meg they could have stayed in the original place. However, I didn't notice that until the entire process was done.
Oh well. Hindsight really is 20/20!
P.S. On the other hand, having them on /home is much better than /var for other reasons. I tend to be a little more careful about backing up /home.
Why not newsyslog.conf? An excellent question.
First, I already had a script that did exactly what I wanted. I want all my servers to have the same, repeatable process.
Secondly, the script is able to move the -error_log file if any -access_log is moved. I don't think newsyslog.conf can do parallel moves.
Lastly, I don't use a ".0", ".1", ".2" system. Instead, I use .YYYYMMDD:HHMMSS, which makes the logs easier to process. Since I'm keeping them forever, I don't want to rotate the files (doing n renames for n files); I just want to do 1 rename for each file. I don't think newsyslog.conf can do that (though I haven't done a lot of research).
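To make the idea concrete, here is a minimal sketch of size-triggered, paired rotation with a timestamp suffix. It is not my actual script: the "rotate_pair" function, its arguments, and the example host names are hypothetical. It also leaves out telling Apache to reopen its log files, which a real script would need to handle.

```shell
# rotate_pair DIR VHOST LIMIT_BYTES
# Hypothetical helper: if VHOST-access_log is bigger than LIMIT_BYTES,
# rename both the access and error logs with one shared timestamp
# suffix (a single rename per file), then compress them.
rotate_pair() {
    dir=$1 vhost=$2 limit=$3
    access="$dir/$vhost-access_log"
    error="$dir/$vhost-error_log"

    [ -f "$access" ] || return 0
    size=$(wc -c < "$access" | tr -d ' ')
    [ "$size" -gt "$limit" ] || return 0

    stamp=$(date +%Y%m%d:%H%M%S)
    for f in "$access" "$error"; do
        [ -f "$f" ] || continue
        mv "$f" "$f.$stamp" && gzip "$f.$stamp"
    done
}

# Try it on made-up files in a scratch directory.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big.example-access_log" bs=1024 count=4 2>/dev/null
echo "an error" > "$demo/big.example-error_log"
echo "one hit" > "$demo/small.example-access_log"

rotate_pair "$demo" big.example 1024     # over the limit: both files rotate
rotate_pair "$demo" small.example 1024   # under the limit: left alone

rotated=$(ls "$demo" | grep -c '\.gz$')
if [ -f "$demo/small.example-access_log" ]; then untouched=yes; else untouched=no; fi
ls "$demo"

rm -rf "$demo"
```

Because the suffix is a timestamp, older files never need renaming again, and the names sort chronologically.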