2013-02-25

A hacky fix for SuperDuper running out of space: "delete first" with rsync

Rsync can delete the files that are due for deletion on the destination before it starts copying the files that actually need syncing.

That feature is sorely missing in Shirt Pocket's SuperDuper!, which is otherwise a nice backup program and for some time was one of the few sane options for making fully reliable backups. SuperDuper! just starts copying blindly, and can then end up in a situation where the backup destination cannot hold both the new files and the old files that should have been deleted but have not been yet.

So that problem would be solved by rsync's "delete first" behaviour. People have been complaining about this in Shirt Pocket's forums for at least 5 years, and the developers only seem to say "yes, we will do something sometime".

But they still haven't. So, this is the command line to use:

sudo rsync --delete --existing --ignore-existing --recursive  /Volumes/ORIGINAL_VOLUME/ /Volumes/BACKUP_VOLUME

Beware:
  • ORIGINAL_VOLUME has a trailing slash, BACKUP_VOLUME hasn't
  • sudo is there so rsync can delete files not owned by the current user. Of course, that makes the command more dangerous. Adding the option --dry-run (together with --verbose, so that the deletions are listed) shows which files would be deleted, without deleting anything
  • Why not use rsync to make the full backup? That might be an option, but some years ago rsync was unable to copy all the metadata used by OS X, so the backup might not be "good enough". Not even the Apple-modified, Apple-provided rsync in OS X did it right. Again, that was at least 5 years ago, so things might have changed. And anyway, rsync is really designed to work through a "slow" link between the two volumes (say, disk-computer-network-computer-disk). It will work locally, of course, but you might finish faster by just making a full copy: rsync might not save any time, and might actually take longer.
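
To make the points above concrete, here is a minimal sketch that wraps the same rsync options in a small shell function and tries them on throwaway directories, so nothing real is at risk. The function name delete_first and the scratch file names are made up for this illustration; only the rsync options come from the command above, plus --dry-run and --verbose for the preview.

```shell
#!/bin/sh
# delete_first SRC DST [extra rsync options...]
# Hypothetical helper: removes from DST whatever no longer exists in SRC,
# without copying or updating anything.
delete_first() {
    src=$1; dst=$2; shift 2
    # --existing:        skip files missing on the destination (nothing new is created)
    # --ignore-existing: skip files already on the destination (nothing is updated)
    # --delete:          remove destination files that are absent from the source
    # The source gets a trailing slash, the destination does not.
    rsync --recursive --delete --existing --ignore-existing "$@" "${src%/}/" "${dst%/}"
}

# Demo on scratch directories (no sudo needed here).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo keep  > "$work/src/keep.txt"
echo new   > "$work/src/new.txt"     # only in the source: should NOT be copied
echo keep  > "$work/dst/keep.txt"
echo stale > "$work/dst/stale.txt"   # only in the destination: should be deleted

delete_first "$work/src" "$work/dst" --dry-run --verbose   # preview, changes nothing
delete_first "$work/src" "$work/dst"                       # the real deletion pass
ls "$work/dst"
```

On real volumes the rsync call would need sudo in front, as in the command above, and the preview run is worth reading carefully before the real one.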

3 comments

  1. There are two big problems with "delete first" behavior.

    First, it requires a two-pass approach. You need to scan both drives, determine the course of action, then execute. That can double the time it takes to perform all backups to cover the rather rare case (despite protestations) where you run out of space with the occasional "union" issue.

    Second, you're working on a single copy of the destination, rather than an incremental/historical set, so your data on the backup only exists in one place. With delete first, "move" operations (e.g. directory renames and the like) delete what may be the only "good" copy of the data before recopying it. Should the source be corrupt, that puts the data more at risk: you remove the known good data before copying data that may have gone bad, since it often hasn't been read for some time.

    Yes, this is sometimes a pain. And we do have some ideas, as I've said in the forums for some time, but they're complicated and the case is not terribly common, so while we've made progress on some interesting improvements, they haven't made their way into the released program yet.

    Hope that's of interest. Feel free to write me via email at Shirt Pocket to follow up.

    --Dave

  2. It is not always needed, but people have been complaining and asking for it for 5 years. So, why not make it a configuration option?

    Or why not make it a fallback? You could start the normal backup, and when the situation of not-enough-space is found, instead of failing hard, pause the backup and start the delete-first procedure. Most users would not even find themselves in the situation, so no time penalty.

    You could even ask the user before using such a fallback. Better to give an option to continue than to just stop cold, even more so when the canonical solution (making a simple call to rsync) would be immediate, since it is installed by default in OS X.

    Also, I would have reported this solution in your forums, in one of the threads complaining about the lack of delete-first. But I couldn't even register since registrations seem to be "disabled by the admin".

  3. Another thing: one of Shirt Pocket's official answers to this problem in the forums seems to be to manually look for no-longer-existing files in the backup and delete them, again manually.

    Which is nasty.

    Do you really expect that to be less error-prone than a simple use of rsync? Not to mention more time-efficient? Am I expected to do that every time?

    I'd prefer using that time just making a normal backup (like SuperDuper demo does) instead of the fragile smart backup done by the paid SuperDuper. Or, say, even just investigating if the latest rsync does correct backups!
