A co-worker of mine recently noticed that I tend to use
rsync in a way he hadn't seen before:
rsync -avP --inplace $FILE_LIST desthost:/path/to/dest/.
Why the "slash dot" at the end of the destination?
I do this because I want predictable behavior and the best way to achieve that is to make sure the destination is a directory that already exists. I can't be assured that
/path/to/dest/ exists, but I know that if it exists then "." will exist. If the destination path doesn't exist,
rsync makes a guess about what I intended, and I don't write code that relies on "guesses". I would rather the script fail in a way I can detect (shell variable
$?) than have it "guess what I meant", which is difficult to detect.
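As a minimal sketch of detecting that failure, a script can capture the exit status and stop on anything non-zero. The `run_checked` helper below is hypothetical (it is not part of rsync), and `desthost` and the path are just the placeholders from the command above.

```shell
#!/bin/sh
# run_checked is a hypothetical helper, not an rsync feature.
# It runs the given command and reports a non-zero exit status on stderr.
run_checked() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "failed (exit status $status): $*" >&2
    fi
    return "$status"
}

# Intended use; desthost and /path/to/dest are placeholders:
#   run_checked rsync -avP --inplace file1 desthost:/path/to/dest/. || exit 1
```

The point is only that a failed copy becomes something the script can see and act on, instead of a silently "reinterpreted" destination.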
rsync makes a guess? Yes.
rsync changes its behavior depending on a number of factors:
- is there one source file or multiple source files?
- is the destination a directory, a file, or doesn't exist?
There are many permutations there. You can eliminate most of them by having a destination directory end with "slash dot".
- Example A:
rsync -avP file1 host:/tmp/file
- Example B:
rsync -avP file1 file2 host:/tmp/file
In these examples assume that host:/tmp/file already exists as a regular file. In that case, Example A copies the file and renames it in the process. Example B will fail because
rsync's author (and I think this is the right decision) decided that it would be stupid to copy
file1 to /tmp/file and then copy
file2 over it. This is the same behavior as the Unix
cp command: if multiple files are being copied, the last name on the command line has to be a directory; otherwise it is an error. The behavior changes based on the destination.
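The cp half of that comparison is easy to try locally. This is only a sketch: the file names mirror the examples above, and everything happens in a scratch directory created by mktemp.

```shell
#!/bin/sh
# Sketch: cp refuses to copy several sources onto an existing regular file.
work=$(mktemp -d)
echo one > "$work/file1"
echo two > "$work/file2"
echo old > "$work/file"      # the destination exists and is a regular file

# One source: the copy succeeds and the data lands under the new name.
single_status=0
cp "$work/file1" "$work/file" || single_status=$?

# Two sources: cp exits non-zero because the last argument is not a directory.
multi_status=0
cp "$work/file1" "$work/file2" "$work/file" 2>/dev/null || multi_status=$?

copied=$(cat "$work/file")
echo "one source:  exit status $single_status, destination now contains '$copied'"
echo "two sources: exit status $multi_status"
rm -rf "$work"
```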
Let's look at those two examples if the destination name doesn't exist:
- Example C:
rsync -avP file1 host:/tmp/santa
- Example D:
rsync -avP file1 file2 host:/tmp/santa
In these examples assume that
/tmp/santa doesn't exist. Example C is similar to Example A:
rsync copies the file to
/tmp/santa, i.e. it renames it as it copies. In Example D, however,
rsync will assume you want it to create the directory so that both files have some place to go. The behavior changes due to the number of source files.
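Examples C and D can be replayed without a remote host. The sketch below assumes rsync is installed and uses local paths in a scratch directory; `demo_missing_dest`, `santa_c`, and `santa_d` are names made up for the illustration.

```shell
#!/bin/sh
# Sketch: Examples C and D against local paths. Assumes rsync is installed.
demo_missing_dest() {
    work=$(mktemp -d)
    echo one > "$work/file1"
    echo two > "$work/file2"

    # Like Example C: one source, destination name does not exist.
    rsync -a "$work/file1" "$work/santa_c"
    # Like Example D: two sources, destination name does not exist.
    rsync -a "$work/file1" "$work/file2" "$work/santa_d"

    if [ -f "$work/santa_c" ]; then c_kind=file; else c_kind=other; fi
    if [ -d "$work/santa_d" ]; then d_kind=directory; else d_kind=other; fi
    echo "one source  -> santa_c is a $c_kind"
    echo "two sources -> santa_d is a $d_kind"
    rm -rf "$work"
}

# Run it with: demo_missing_dest
```

The same destination spelling gives a file in one case and a directory in the other, which is exactly the kind of difference that is hard to spot from inside a script.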
Remember that debugging, by definition, is more difficult than writing code. Therefore, if you write code that relies on the maximum of your knowledge, you have, by definition, written code that is beyond your ability to debug.
Therefore, if you are a sneaky little programmer and use your expertise in the arcane semantics and heuristics of rsync, congrats. However, if one day you modify the script to copy multiple files instead of one, or if the destination directory doesn't exist (or unexpectedly does exist), you will have a hard time debugging the program.
How might a change like this happen?
- Your source file is a variable
$SOURCE_FILES and occasionally there is only one source file. Or the variable usually represents one file but suddenly represents several.
- The script you've been using for years gets updated to copy two files instead of one.
- Over time the list of files that need to be copied shrinks and shrinks, and suddenly there is just a single file that needs to be copied.
- Your destination directory goes away. In the example that my coworker noticed, the destination was
/tmp. Well, everyone knows that
/tmp always exists, right? I've seen it disappear due to typos, human errors, and broken install scripts. If
/tmp disappeared I would want my script to fail.
It is good rsync hygiene to end destinations with "/." if you intend it to be a directory that exists. That way it fails loudly if the destination doesn't exist since rsync doesn't create the intervening subdirectories. I do this in scripts and on the command line. It's just a good habit to get into.
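If destinations arrive in variables, the habit can be wrapped in a small helper. This is only a sketch: `as_existing_dir` is a hypothetical name, and it does nothing but string manipulation, so it works for both local and `host:` destinations.

```shell
#!/bin/sh
# as_existing_dir is a hypothetical helper: it normalizes a destination so
# that it ends in exactly one "/.", whatever trailing "/" or "/." it had.
as_existing_dir() {
    if [ -z "$1" ]; then
        echo "as_existing_dir: empty destination" >&2
        return 1
    fi
    dest=$1
    # Strip any trailing "/" or "/." so the helper is idempotent.
    while :; do
        case $dest in
            */.) dest=${dest%/.} ;;
            */)  dest=${dest%/} ;;
            *)   break ;;
        esac
    done
    printf '%s/.\n' "$dest"
}

# Intended use; desthost and /path/to/dest are placeholders:
#   rsync -avP --inplace file1 "$(as_existing_dir desthost:/path/to/dest)"
```

An empty argument is rejected on purpose: without that check it would turn into "/.", which is the root directory and almost certainly not what was meant.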
P.S. One last note. Many of the semantics described above change if you add the "-R" option. They don't get more consistent, they just become different. If you use this option, make sure you do a lot of testing to be sure you cover all these edge cases.