Recently we were having the most difficult time planning what should have been a simple upgrade. There is a service we use to collect monitoring information (scollector, part of Bosun). We were making a big change to the code, and the configuration file format was also changing.
The new configuration file format was incompatible with the old format.
We were concerned about a potential Catch-22. Which do we upgrade first, the binary or the configuration file? If we put the new RPM in our Yum repo, machines that upgrade to this package will not be able to read their configuration file, and that's bad. If we convert everyone's configuration file first, the old binary on any machine that reboots (or whose daemon is restarted) will find a new-format configuration file it cannot read, and that would also be bad.
The configuration files (old and new) are generated by the same configuration management system that deploys the new RPMs (we use Puppet at Stack Exchange, Inc.). So, in theory, we could specify particular RPM package versions and make sure that everything happens in a coordinated manner. Then the only problem would be newly installed machines, which would be fine because we could pause new installs for an hour or two.
But then I realized we were making a lot more work for ourselves by ignoring the old Unix adage: If you change the file format, change the file name. The old file was called scollector.conf; the new file would be scollector.toml. (Yes, we're using TOML.)
Now that the new configuration file would have a different name, we simply had Puppet generate both the old and new files. Then we could tell Puppet to upgrade the RPM on a few machines at a time as we slowly rolled out and tested the software. A gradual upgrade lets us verify functionality before rolling out to all hosts. Once every host was upgraded, we would configure Puppet to remove the old file.
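The side-by-side arrangement can be sketched in a few lines of Python. This is only an illustration, not our Puppet manifest: the helpers `render_old`, `render_new`, and `write_both`, the `host` setting, and both file syntaxes are hypothetical stand-ins for whatever the two formats really look like.

```python
import os
import tempfile

def render_old(settings):
    # Hypothetical legacy syntax: bare "key = value" lines.
    return "".join(f"{k} = {v}\n" for k, v in sorted(settings.items()))

def render_new(settings):
    # Hypothetical new syntax: TOML-style quoted strings (string values only).
    return "".join(f'{k} = "{v}"\n' for k, v in sorted(settings.items()))

def write_both(directory, settings):
    """Generate the old and new files side by side from one set of settings."""
    paths = {
        "old": os.path.join(directory, "scollector.conf"),
        "new": os.path.join(directory, "scollector.toml"),
    }
    with open(paths["old"], "w") as f:
        f.write(render_old(settings))
    with open(paths["new"], "w") as f:
        f.write(render_new(settings))
    return paths

if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    for label, path in write_both(demo_dir, {"host": "bosun.example.com:8070"}).items():
        print(label, path)
```

Because each generation of the binary opens only the name it understands, the order in which a given host receives the new file and the new package stops mattering.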
This reminds me of the fstab situation in Solaris many years ago. Solaris 1.x had an /etc/fstab file just like Linux does today. However, Solaris 2.x radically changed the file format (mostly for the better). They could have kept the filename the same, but they followed the adage and named the new file /etc/vfstab, and for good reason. Many utilities and home-grown scripts manipulate the /etc/fstab file. They would all have to be rewritten. It is better for them to fail with a "file not found" error right away than to carry on and modify the file incorrectly.
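To make the fail-fast point concrete, here is a minimal sketch of the kind of home-grown script I have in mind. The function name `add_mount` is hypothetical, and the demo uses a temporary directory rather than the real /etc; the renamed file is called vfstab there, as on Solaris 2.x.

```python
import os
import tempfile

def add_mount(fstab_path, line):
    """Append one entry, the way a script written for the old format might."""
    # "r+" opens an existing file for update and raises FileNotFoundError
    # when the file is missing. Mode "a" would quietly create a new file
    # under the old name instead, which hides the problem.
    with open(fstab_path, "r+") as f:
        f.seek(0, os.SEEK_END)
        f.write(line + "\n")

if __name__ == "__main__":
    etc = tempfile.mkdtemp()
    # Only the renamed, new-format file exists in this directory.
    with open(os.path.join(etc, "vfstab"), "w") as f:
        f.write("# new-format entries live here\n")
    try:
        add_mount(os.path.join(etc, "fstab"), "server:/export /mnt nfs rw 0 0")
    except FileNotFoundError as err:
        print("old script stopped immediately:", err)
```

The old script never gets the chance to write an old-format line into the new-format file. Note that this only holds if the script does not create the file when it is missing.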
This technique, of course, is not required if a file format changes in an upward-compatible way. In that case, the file name can stay the same.
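As a rough sketch of what "upward-compatible" can mean in practice, consider a reader that supplies defaults for settings added in a later release, so files written before the change still load. The "key = value" syntax, the `parse` function, and the `host` and `interval` settings are all hypothetical.

```python
# Defaults for every known setting; "interval" is imagined to be the one
# added in the newer release.
DEFAULTS = {"host": "localhost:8070", "interval": "15"}

def parse(text):
    """Parse "key = value" lines, falling back to defaults for missing keys."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

if __name__ == "__main__":
    print(parse("host = bosun.example.com:8070\n"))
    print(parse("host = bosun.example.com:8070\ninterval = 30\n"))
```

An old file that never mentions the new setting parses exactly as it did before, which is why the name can stay the same.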
I don't know why I hadn't thought of that much earlier; I've done this many times before. However, the fact that I didn't think of it made me think it would be worth blogging about.