
Compressing logs is good

It's obvious but I didn't think of one particular reason why until the end of this journey.


[ This is a first draft. Feedback is greatly appreciated.]

It's obvious but I didn't think of it until the end. Compressed logs are good. Really good.

I just had a "disk full" situation on /var. No problem, a little "du -sk *" and I identify the problem. /var/log is huge, taking nearly the whole 1G disk allocated to /var. I do "du -sk /var/log/*" and discover that FreeBSD's default Apache installation puts all its logs in "/var/log/web/" and that is the real culprit.
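
For anyone new to this kind of hunt, here is a minimal sketch of it, run against a throwaway directory so it is safe to paste. The directory names and sizes are made up for the demo. Piping through "sort -n" is an addition to the bare "du -sk *" above; it simply puts the biggest offender on the last line.

```shell
# Build a scratch tree with one big directory and one small one (made-up names).
scratch=$(mktemp -d)
mkdir -p "$scratch/log" "$scratch/mail"
dd if=/dev/urandom of="$scratch/log/big" bs=1024 count=512 2>/dev/null
echo "tiny" > "$scratch/mail/small"

# du -sk reports kilobytes per entry; sort -n lists the largest last.
(cd "$scratch" && du -sk * | sort -n)

# The last line names the directory to dig into next.
biggest=$(cd "$scratch" && du -sk * | sort -n | tail -1 | awk '{print $2}')
echo "biggest: $biggest"
```

Repeat the same command one level down (as in "du -sk /var/log/*") until the culprit is obvious.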

No worries. I made /var too small and I'll solve this the way I always do: Move the directory to another place and make a symbolic link.

I'm documenting what I did because it might be educational to people new to such things.

First some background...

I don't have this problem on my other server because there I have a custom Apache config that puts everything in /home/apache (/home/apache/logs, /home/apache/conf, /home/apache/this and /home/apache/that). /home is huge, so I don't worry too much. I was caught off-guard on this server. As you see, for more than one reason.

One of my annoyances with Apache is that every operating system has a different layout of where the various Apache files are kept. I am a bear of little brain, so on most machines I create a directory called /home/web and then make symbolic links in it for conf, logs and htdocs that point to wherever that OS decided to put the configuration files, the logs, and the documents. It really saves me a lot of time.
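
A small sketch of that /home/web convention, built inside a scratch directory that stands in for "/". Only /var/log/web comes from this post; the conf and htdocs locations below are invented placeholders, since every OS puts them somewhere different.

```shell
# Scratch root stands in for "/"; the conf and htdocs locations are invented.
root=$(mktemp -d)
mkdir -p "$root/os/apache/conf" "$root/os/apache/htdocs" "$root/var/log/web"
echo "sample" > "$root/var/log/web/access_log"

# One well-known directory, three links into wherever this OS put things.
mkdir -p "$root/home/web"
ln -s "$root/os/apache/conf"   "$root/home/web/conf"
ln -s "$root/var/log/web"      "$root/home/web/logs"
ln -s "$root/os/apache/htdocs" "$root/home/web/htdocs"

ls -l "$root/home/web"
```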

On my Solaris server I custom-compile Apache and made my own "layout" file so that when I build a new release it will be configured for my particular "/home/apache" layout. I've done this for so long that I often forget that not everyone does this.

When I set up my new server with FreeBSD I was so happy with the "ports" system for installing things like Apache that I forgot to note that it was putting the logs on /var/log/web, which is a small partition on that system. Actually, I didn't "forget to note" this. I noticed it enough that I made a symbolic link from /home/web/logs to /var/log/web. So obviously I noticed it, but I didn't stop to think, "Is that a good place for logs?" and now that that disk has filled I realize that I should have taken the time back then to store the logs someplace with more room.

Anyway...

I have three policies for how I store weblogs. First, I use a custom log so that I record referral data. Second, I never throw away logs. Third, I keep the logs for each virtual host in a different file. Thus, I have everythingsysadmin.com-access_log (and -error_log) as well as, for example, whatexit.org-access_log (and -error_log). The log for everythingsysadmin.com was extremely big, thus my disk space problem.

Moving the files with minimal downtime

I'm very particular about this kind of surgery. I want to move the data without corrupting it, I don't want to make mistakes, and I want my web server to be down for the least amount of time.

Therefore I rsync'ed the data "live", then shut down Apache, rsync'ed the data again, moved the log directory, created the symbolic link, and restarted Apache. This minimized the outage, which was especially important because my book was mentioned on Slashdot today and I was expecting a decent number of hits.

Here's what I did in more detail:


mkdir -p /home/web/logs.new ; cd /var/log/web && rsync -avP . /home/web/logs.new/.

We're copying live data. That's bad. The moment we're done with the copy more will appear. However, we'll mitigate that. Read on.

Notice the caution in this statement. We make the new directory and use "-p" so that if it already exists it won't be an error. Then the "cd" is joined to the "rsync" with "&&". This means "don't do the second command if the first command failed." In other words, if I mistyped "cd /var/log/web" then the rsync won't be attempted. This is good because normally a failed "cd" might have left me in "/home/root" and I wouldn't want all those files copied to /home/web/logs. Also I don't use the "-R" (relative) option to rsync. Instead, I make sure that the source and destination directories already exist, and then specify them both as "." (or "blah/blah/blah/."). I do this because "rsync" has different behavior depending on whether it had to create the destination directory or not. That's bad. Bad and confusing. Bad, confusing and frustrating when I'm developing a command that I'm going to run more than once.
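
The difference between "&&" and ";" is easy to see with a deliberately bad directory name. This is only an illustration: the path is made up, and the ";" case is run in a child shell so the failed "cd" cannot affect the shell you paste this into.

```shell
# A mistyped directory: cd fails, so the command after && never runs.
ran_after_and=no
cd /no/such/directory 2>/dev/null && ran_after_and=yes

# With ";" the second command runs regardless of whether cd worked.
ran_after_semicolon=$(sh -c 'cd /no/such/directory 2>/dev/null ; echo yes')

echo "after &&: $ran_after_and"
echo "after ; : $ran_after_semicolon"
```

The first line reports "no" and the second "yes", which is exactly why the rsync is chained to the "cd" with "&&".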

As another precaution, I did that command in a window I opened just for that purpose. I'm going to run that sequence of commands a few times, so now I can just use command history to run it over and over. Why is this important? Because I don't want to re-type this long sequence of commands every time I do the process. I want it 100% repeatable. I could make it into a shell script, but that's overkill.

Also note that I'm not copying the data to "/home/web/logs" but to "/home/web/logs.new". That's because /home/web/logs is a symbolic link to /var/log/web, and it would be silly to copy things to where they already are. Scripts and cronjobs might be accessing /home/web/logs so I don't want to muck with it until I'm ready.

While that's copying I used a different window to construct this command:


cd /home/web && mv logs logs.old ; mv logs.new logs ;
cd /var/log && mv web web.old ; ln -s ../../home/web/logs web

(I actually constructed this as one long line, but it is split here to be more readable.)

(To be clear, I typed this command but didn't execute it.)

The first part of this moves /home/web/logs out of the way and moves the newly copied log directory into place. The second part of this moves my current log directory to "web.old" and makes a symbolic link to the new location.
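
If you want to rehearse the swap before touching real logs, here is a sketch that performs both parts inside a scratch directory standing in for "/". It follows the description above (the /home/web/logs symlink is retired and logs.new takes its place), a plain "cp" stands in for the rsync, and it chains every step with "&&" rather than ";", which is a more cautious variation and not the command as typed above.

```shell
# Scratch root stands in for "/"; nothing outside it is touched.
root=$(mktemp -d)
mkdir -p "$root/home/web/logs.new" "$root/var/log/web"
echo "old entry" > "$root/var/log/web/access_log"
cp "$root/var/log/web/access_log" "$root/home/web/logs.new/"   # stands in for the rsync
ln -s ../../var/log/web "$root/home/web/logs"                  # the pre-existing symlink

# Part one: retire the old symlink, put the copied directory in its place.
cd "$root/home/web" && mv logs logs.old && mv logs.new logs
# Part two: retire the original directory and point its old name at the new home.
cd "$root/var/log" && mv web web.old && ln -s ../../home/web/logs web

ls -l "$root/home/web" "$root/var/log"
```

Afterwards /home/web/logs is a real directory, /var/log/web is a symlink to it, and the original files are still sitting in /var/log/web.old.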

Now I move my mouse to that other window and repeat the rsync command. This time it should run a lot quicker because rsync is an incremental copy. If it sees that the data hasn't changed much, it only copies "what's new". (And if you want to know how it does that, read this amazing transcript of a lecture by the author.)

The second copy went very quickly just as I expected. That's a good sign. If it took a long time I'd start checking to see if I had mistyped the command. If it happened instantly, I'd be worried because it should be fast but not instant.

Now I'm ready to make "the big switch".

Here's what I did:

  1. Re-run the rsync
  2. Immediately do an apachectl stop (this shuts down the web server)
  3. Re-run the rsync again. This time it should be extremely fast, nearly a no-op.
  4. Press ENTER on that command line that switches around the directories and symlinks.
  5. Test! "cd /home/web/logs" and "cd /var/log/web" and make sure you get the expected results.
  6. Restart Apache with apachectl start
  7. Test the web sites I host to make sure they're still working.

And during that 7-step process, don't forget to breathe. It has to happen quickly, but not if "fast" means "I'm going to make mistakes".
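
Command history is plenty for a one-off, but if you would rather rehearse the sequence first, the steps can be sketched as a dry run. Everything here is an assumption layered on the steps above: the "run" helper, the DRY_RUN variable and the function names are hypothetical, and by default the script only prints what it would do. Each step is chained with "&&", so in a real run a failure stops the sequence (leaving Apache down so you can look around).

```shell
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sync_logs() {
    # Same copy as before; both directories already exist, hence the "/." forms.
    run rsync -avP /var/log/web/. /home/web/logs.new/.
}

big_switch() {
    sync_logs &&                                      # 1. warm-up copy, Apache still up
    run apachectl stop &&                             # 2. outage begins
    sync_logs &&                                      # 3. final copy, nearly a no-op
    run mv /home/web/logs /home/web/logs.old &&       # 4. swap the directories...
    run mv /home/web/logs.new /home/web/logs &&
    run mv /var/log/web /var/log/web.old &&
    run ln -s ../../home/web/logs /var/log/web &&     #    ...and the symlink
    run ls -ld /home/web/logs /var/log/web &&         # 5. eyeball the result
    run apachectl start                               # 6. outage ends
}

big_switch   # step 7, checking the web sites, stays manual
```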

So what about compression?

Well, when I was setting up this FreeBSD server I was very impressed by the "ports" collection (which is like RPMs from Linux, except it doesn't suck). So impressed that I forgot that there was more work to be done.

I have a script that rotates the weblogs when they get too big. It's a tricky task because I want to rotate them when they get to a certain size, not every so-and-so days. However, if you rotate the -access_log you have to rotate the -error_log too. The files are then compressed, but only after being rotated. I wrote a script that I use on my Solaris server.
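
The real script isn't reproduced in this post, so what follows is only a rough sketch of the idea and not the script itself. The function name "rotate_pair", its arguments and the host name "example.org" are all hypothetical. It rotates a virtual host's access log once it passes a size limit, takes the error log along with it, uses the .YYYYMMDD:HHMMSS suffix described in the update further down, and compresses only after the rename.

```shell
# Hypothetical helper: rotate one virtual host's logs by size, as a pair.
rotate_pair() {
    dir=$1; host=$2; limit_bytes=$3
    access="$dir/$host-access_log"
    error="$dir/$host-error_log"

    size=$(wc -c < "$access" | tr -d ' ')
    [ "$size" -gt "$limit_bytes" ] || return 0     # still small: nothing to do

    stamp=$(date +%Y%m%d:%H%M%S)                   # one rename per file, ever
    mv "$access" "$access.$stamp"
    if [ -f "$error" ]; then mv "$error" "$error.$stamp"; fi

    # On a live server Apache must reopen its logs at this point (for example
    # with "apachectl graceful") before the old files are compressed.
    gzip "$access.$stamp"
    if [ -f "$error.$stamp" ]; then gzip "$error.$stamp"; fi
}

# Demo in a scratch directory with a made-up host name and a tiny limit.
logdir=$(mktemp -d)
i=0
while [ "$i" -lt 200 ]; do
    echo "GET /index.html 200"
    i=$((i + 1))
done > "$logdir/example.org-access_log"
echo "oops" > "$logdir/example.org-error_log"

rotate_pair "$logdir" example.org 1000
ls "$logdir"
```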

I copied the script over to this server, checked it for portability issues, and ran it. Since the files had not been rotated or compressed in ages, it rotated nearly every file and then started compressing them. Web logs compress down to 1% or 2% of their original size. It's quite impressive.
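
You can get a feel for how well this kind of data compresses with a quick experiment. The log below is synthetic and invented purely for illustration, so the ratio it prints says nothing about the 1% or 2% figure above; measure your own logs to see what you actually get.

```shell
# Generate a synthetic access log (made-up addresses, paths and sizes).
log=$(mktemp)
i=0
while [ "$i" -lt 20000 ]; do
    echo "192.0.2.$((i % 250)) - - [01/Jan/2006:00:00:00 +0000] \"GET /page$((i % 40)).html HTTP/1.0\" 200 $((1000 + i % 500))"
    i=$((i + 1))
done > "$log"

before=$(wc -c < "$log" | tr -d ' ')
gzip -9 "$log"                       # replaces $log with $log.gz
after=$(wc -c < "$log.gz" | tr -d ' ')

echo "before: $before bytes, after: $after bytes"
echo "compressed size: $((after * 100 / before))% of original"
```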

The "disk full" problem was, fundamentally, that the script wasn't running. If the logs aren't compressed, they take 1Gig of space instead of 10Meg. In fact, at 10Meg they could have stayed in the original place. However, I didn't notice that until the entire process was done.

Oh well. Hindsight really is 20/20!


P.S. On the other hand, having them on /home is much better than /var for other reasons. I tend to be a little more careful about backing up /home.


Update:

Why not newsyslog.conf? An excellent question.

First, I already had a script that did exactly what I wanted. I want all my servers to have the same, repeatable process.

Secondly, the script is able to move the -error_log file if any -access_log is moved. I don't think newsyslog.conf can do parallel moves.

Lastly, I don't use a ".0", ".1", ".2" system. Instead, I use .YYYYMMDD:HHMMSS, which makes the logs easier to process. I don't think newsyslog.conf can do that (though I haven't done a lot of research). Since I'm keeping the logs forever, I don't want to rotate the files (doing n renames for n files); I just want to do 1 rename for each file.

Posted by Tom Limoncelli in Technical Tips


5 Comments

Just a tip for your FreeBSD box: NEWSYSLOG.CONF(5). I mean, why re-invent the wheel?

Unless I'm missing a finer point for why you're doing it this way, it seems a tad over-elaborate. I find that it's possible to rename the logfiles while the Apache process is still running, without disrupting its ability to continue writing to those files. My theory is that this is because it already has those logfiles open, and so it isn't referring to them by their filenames. I'm guessing that at a sufficiently low level, the file access calls are instead referring to the inode for the file, which doesn't change when you rename the file. But I haven't verified this. Anyway, what I usually do is: rename the file; HUP the main Apache process; verify that the new logfiles have been created; copy the rotated/renamed logfiles at my leisure, without worrying about their contents changing underneath me. If the filesystem is totally full (preventing you from creating a file at all), then you need to get creative (but that's an orthogonal problem). Whether this technique is dependent on an implementation quirk of the OS, I'm not certain of. However I do know that it works fine under Linux and Solaris (haven't tried a BSD yet). I believe this is how logrotate does its thing. My apologies if you were already aware of this method.
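
The theory in this comment is easy to check in a scratch directory. In this sketch file descriptor 3 plays the part of the web server's open log handle; the file names are made up. Writes made after the rename still land in the renamed file, because the descriptor refers to the open file and not to its name.

```shell
# Demo in a scratch directory; fd 3 stands in for Apache's open log handle.
dir=$(mktemp -d)
exec 3>>"$dir/access_log"                    # "the server" opens its log for append
echo "before rename" >&3

mv "$dir/access_log" "$dir/access_log.old"   # rotate while the file is still open
echo "after rename" >&3                      # the write follows the open file, not the name

exec 3>&-                                    # a HUP makes the real server close and reopen
lines_in_old=$(wc -l < "$dir/access_log.old" | tr -d ' ')
echo "lines in renamed file: $lines_in_old"
[ -e "$dir/access_log" ] || echo "no new access_log until the writer reopens it"
```

Both lines end up in access_log.old, and no new access_log appears until the writer reopens its log, which is what the HUP is for.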

A couple other points: 1) I believe there's a typo in your post. At some point you mention "web", "web.new", and "web.old" in "/home/web". Shouldn't these be "logs", "logs.new", and "logs.old"? Otherwise it seems inconsistent and confusing. 2) Does your blog engine recognize a simplified markup, or subset of HTML? In my first attempt, I tried using basic HTML tags in my comments, but they got stripped-out, and my paragraphs got mashed together. (PS: Thanks for the interesting reading, and I look forward to reading your time-management book.)

I'm surprised you're not using an LVM. It's quite handy when you undersize the partitions where logs are kept.

I think I solve this fairly cleanly. We have a mix of Solaris & Linux webserver farms. I always compile PHP and Apache from scratch. When I set up Apache, I just do --prefix /etc/httpd, and let it install everything under there. I place ordinary, non-virtualhost logs under /home/httpd/logs/apache, and each virtualhost's logs in /home/httpd/logs/sitename.com.

Under every folder in the logs directory, there's an 'archive' folder. VirtualHost sites go under /home/httpd/sites/sitename.com, which are synced by rsync off a central fileserver.

Furthermore, the virtualhost setups are included from httpd.conf, and are each found in /etc/httpd/conf/entries/sitename.com. Every night, before the log rotations which occur at midnight, another script runs through /etc/httpd/conf/entries/*, egreps each for '[[:space:]]+[A-Z][a-z]+Log', grabs the file portion, dirname's them and then creates a file /etc/logrotate.d/httpd with entries that look like:

/home/httpd/logs/pepsiworld.com/* {
    daily
    mail [email protected]
    notifempty
    copytruncate
    olddir /home/httpd/logs/pepsiworld.com/archive
    postrotate
        cd /home/httpd/logs/pepsiworld.com/archive
        if ls *.1 >/dev/null 2>&1
        then
            for i in *.1
            do
                FILE=$(basename "$i" .1).$(date -d yesterday +%F)
                /bin/mv "$i" "$FILE"
                gzip "$FILE"
            done
        fi
    endscript
}

I use copytruncate because a few times (years ago admittedly), a graceful restart hung, and I hated being paged at midnight to get up and fix things.
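
For readers who haven't met copytruncate, here is roughly what it amounts to, done by hand in a scratch directory. This is a sketch of the idea and not logrotate's actual code; file descriptor 3 again stands in for the server's open log, and the file names are made up.

```shell
dir=$(mktemp -d)
exec 3>>"$dir/access_log"                  # writer holds the log open in append mode
echo "old line" >&3

cp "$dir/access_log" "$dir/access_log.1"   # copy the current contents...
: > "$dir/access_log"                      # ...then truncate the original in place

echo "new line" >&3                        # same descriptor, no restart needed
exec 3>&-
cat "$dir/access_log"
```

The live file now holds only the new line and the copy holds the old one. The trade-off is that anything written between the copy and the truncate is lost, which is the price of never having to restart or signal the server.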

I keep access_log's around 6 months, error_logs 2 weeks, and a special io_log that records bytes transferred in and out, plus response times and some stuff 1 day (a perl script generates stats for each site daily from that log). Also, nightly, all httpd access_logs on every webserver get rsync'd to a webtrends server and stored as /var/log/httpd/[short machine name]/[sitename]/archive/access_log.YYYY-MM-DD.gz. After the rsync, a script builds a directory tree structure /logs/[sitename]/[number of webserver]. Each webserver gets names like sh-web1, sh-web2, pepsi-web1 thru pepsi-web3, etc. So /logs/pepsiworld.com/2 is a symlink to /var/log/httpd/pepsi-web2/pepsiworld.com/archive. That way, our webtrends guys never have to bother me to find out where the logs for a website are, except to ask the total number of servers in the webfarm. This is the server where we keep logs for a guaranteed 2 years, and the server from which all tape backup of logs is done. After 2 years, they go on permanent tape archive. This box is Solaris, with Veritas's LVM. I've had to adjust the /var partition a couple times.
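
The symlink tree described in this comment could be built along these lines. This is a guess at the shape of such a script and not the commenter's own: it runs inside a scratch directory standing in for "/", and the loop over three servers is hard-coded for the demo.

```shell
# Scratch root stands in for "/"; machine and site names follow the comment.
root=$(mktemp -d)
site=pepsiworld.com

# The per-machine archive directories that the nightly rsync would fill.
for n in 1 2 3; do
    mkdir -p "$root/var/log/httpd/pepsi-web$n/$site/archive"
done

# /logs/[sitename]/[number of webserver] -> that machine's archive directory.
mkdir -p "$root/logs/$site"
for n in 1 2 3; do
    ln -s "$root/var/log/httpd/pepsi-web$n/$site/archive" "$root/logs/$site/$n"
done

ls -l "$root/logs/$site"
```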

Grrr! Your comment section doesn't observe whitespace or html tags.

Hello,

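I heard you can build an Extension that posts to Mixi Diary, so I'm working on one. With this feature, whatever is posted to BlogEngine.NET (links and so on) goes out at the same time. :D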
