backing up with duply

Duplicity is one of a newer generation of backup tools for Linux and other Unix-like operating systems that follow a decentralized, client-side approach similar to distributed version control systems like Git. Duply is a wrapper script front end to duplicity that reminds me of tools like Ruby on Rails in the app dev space. I’ve just completed a revamp of my home backup regime that relies on duply to archive data on my Linux workstations and servers.

A duplicity-based solution wasn’t my first choice this time around. What I was really looking for was a cross-platform solution with an intuitive GUI that would be easy enough for my family to use without assistance. In fact, I came close to going with CrashPlan, a product I’d kicked the tires on some time ago. The only thing that stopped me was the bad reaction the GNOME session manager on my Linux laptop and workstation had to CrashPlan running in the background.

This past weekend started out with some careful study of the available documentation for Amanda Backup and Bacula.

It didn’t take long for me to determine that Amanda wasn’t going to work very well as a home backup solution: it’s too tape-centric and cumbersome for a small network like mine. Bacula looked promising at first, but after a few hours struggling with different instructions for configuring it on CentOS I wasn’t able to get all three of the services that comprise it to run. My suspicion is that the authors of the guides I used left out some important chunks of information.

I was already familiar with duplicity as the back end for the woefully inadequate personal backup GUI called deja-dup. From what I learned in working with deja-dup, duplicity itself was a proven backup solution. The problem with duplicity was that to make it useful you needed to script a wrapper for its configuration. Which is where duply comes in.

Getting duply to work required a couple of preliminary steps: generating ssh and gpg keys. The ssh keys would be used to perform passwordless logins to the backup server using ssh, while the gpg keys would be used to encrypt the backup files for privacy.
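
As a rough sketch of those two preliminary steps, the helper functions below wrap the stock OpenSSH and GnuPG commands. The function names are made up for illustration, the host is borrowed from the sample TARGET in the conf file shown in this post, and the empty ssh passphrase is an assumption that suits unattended runs but may not suit every setup.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; adjust host and key path.
BACKUP_HOST=${BACKUP_HOST:-root@backup.example.com}
SSH_KEY=${SSH_KEY:-$HOME/.ssh/id_rsa}

# Create an ssh key pair with an empty passphrase and install the public
# half on the backup server so later logins need no password.
make_ssh_key() {
    ssh-keygen -t rsa -b 4096 -N '' -f "$SSH_KEY" &&
        ssh-copy-id -i "$SSH_KEY.pub" "$BACKUP_HOST"
}

# Generate a gpg key pair interactively, then list the keys so the new
# key ID can be copied into the profile's GPG_KEY setting.
make_gpg_key() {
    gpg --gen-key &&
        gpg --list-keys
}
```

Calling make_ssh_key and then make_gpg_key once on each machine to be backed up should cover both steps. Running "ssh root@backup.example.com true" afterwards is a quick way to confirm the passwordless login works before involving duply.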

Besides backup and restore, duply has a create function for generating backup configurations and a status function for listing and checking up on the backups created.

Creating a backup profile is as easy as

duply {profile-name} create

Each profile’s two configuration files are kept in /etc/duply/{profile-name}.

The conf file contains the basic configuration, including definition of the source and target directories, public gpg key ID and passphrase, and items like how often full and incremental backups should be performed, files purged, the location to store metadata (other than the default “/tmp”) and the backup method to use (e.g. ssh, s3, file).

GPG_KEY='A233FAA'
GPG_PW='secret'
TARGET='ssh://root@backup.example.com//data/backup/newhost'
SOURCE='/'
MAX_AGE=1M
MAX_FULL_BACKUPS=1
MAX_FULLS_WITH_INCRS=1
MAX_FULLBKP_AGE=1M
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE " 
ARCH_DIR=/data/backup/newhost/.duply-cache

The exclude file contains a listing of the directories to exclude and include in the backup.

- /root/tmp
- /root/Downloads
- /root/.cache
- /root/.gvfs
- /home/*/tmp
- /home/*/Downloads
- /home/*/.cache
- /home/*/.gvfs
+ /root
+ /home
+ /etc
+ /boot
+ /usr/local/etc
+ /usr/local/bin
- **
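
The order of these lines matters: duplicity reads the rules top to bottom and applies the first one that matches a path, which is why the specific excludes come before the broader includes and the final "- **" drops everything else. The function below is a rough emulation of that first-match-wins behavior using shell case patterns on a shortened copy of the list. It is not duplicity's actual matcher (for one thing, a shell * also matches /), and the name decide and the user phil in the sample paths are made up for illustration.

```shell
#!/bin/sh
# Rough emulation of first-match-wins selection; NOT duplicity's matcher.
# decide PATH prints "include" or "exclude" for the given path.
decide() {
    path=$1
    result=include          # duplicity includes whatever no rule matches
    while read -r sign pattern; do
        case "$path" in
            # Match the pattern itself or anything beneath it.
            $pattern|$pattern/*)
                if [ "$sign" = "+" ]; then
                    result=include
                else
                    result=exclude
                fi
                break       # first matching rule decides; stop here
                ;;
        esac
    done <<'EOF'
- /home/*/tmp
- /home/*/.cache
+ /home
+ /etc
- **
EOF
    echo "$result"
}

decide /home/phil/tmp/scratch.txt   # prints "exclude"
decide /home/phil/notes.txt         # prints "include"
decide /var/log/messages            # prints "exclude"
```

Moving "+ /home" to the top of the list would make it win for every path under /home, so the tmp and .cache excludes would never take effect.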

Running a backup just requires a

duply {profile-name} backup

Listing available backup sets can be done with a

duply {profile-name} status

Restoring involves invoking

duply {profile-name} restore

Restoring to somewhere other than the original location is done like

duply {profile-name} restore [new-location]

In my environment a cron job has been set up on my servers and workstation to run a duply backup once a week.
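
A minimal sketch of what such a weekly job could look like follows. The script path, log location, schedule and the function name run_backup are all assumptions for illustration, not the exact job described above; the profile name newhost is taken from the sample TARGET.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. /usr/local/bin/duply-weekly, run from a
# crontab entry such as (Sundays at 03:00):
#   0 3 * * 0 /usr/local/bin/duply-weekly newhost
DUPLY=${DUPLY:-duply}                     # command to run (overridable)
LOG=${LOG:-/var/log/duply-weekly.log}     # assumed log location

# Print the current time in a log-friendly format.
now() {
    date '+%Y-%m-%d %H:%M:%S'
}

# Run one profile's backup, append duply's output to the log, and
# return duply's exit status to the caller.
run_backup() {
    profile=$1
    echo "$(now) backup of $profile starting" >>"$LOG"
    if "$DUPLY" "$profile" backup >>"$LOG" 2>&1; then
        echo "$(now) backup of $profile finished" >>"$LOG"
    else
        status=$?
        echo "$(now) backup of $profile FAILED (exit $status)" >>"$LOG"
        return "$status"
    fi
}

# Only act when a profile name is given on the command line.
if [ $# -gt 0 ]; then
    run_backup "$1"
fi
```

Keeping the duply call inside a small wrapper rather than directly in the crontab means the log records when each run started and whether it failed, which is handy when a weekly job silently stops working.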

Of course now that I have a working Linux backup solution, the next step is to find as compelling a choice for Windows.

NOTES:

I am still trying to determine the best way to back up a laptop with duply, at least one other than running the script every 15 minutes.
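
One way this could be approached, sketched under assumptions rather than taken from the setup described above: let cron call a small script often, but have the script skip the backup unless the last successful one is more than a week old. The stamp file path, the seven-day threshold and the function names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: names and paths below are hypothetical.
DUPLY=${DUPLY:-duply}                        # command to run (overridable)
STAMP=${STAMP:-/var/lib/duply-last-backup}   # touched after each success
MAX_AGE_DAYS=${MAX_AGE_DAYS:-7}

# Return 0 when no stamp exists yet or the stamp is old enough.
# find's -mtime +N matches files whose age, counted in whole 24-hour
# periods, is greater than N.
backup_due() {
    [ -e "$STAMP" ] || return 0
    [ -n "$(find "$STAMP" -mtime +"$MAX_AGE_DAYS")" ]
}

# Back up only when due. The stamp is refreshed only if duply succeeds,
# so a failed run (laptop offline, server unreachable) is retried on
# the next invocation.
maybe_backup() {
    profile=$1
    backup_due || return 0
    "$DUPLY" "$profile" backup && touch "$STAMP"
}

# Only act when a profile name is given on the command line.
if [ $# -gt 0 ]; then
    maybe_backup "$1"
fi
```

Run hourly from cron, a script like this would cost almost nothing on most invocations, since the check is a single find call, and would catch up the first time the laptop is awake and the backup server is reachable.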


About phil

My name is Phil Lembo. In my day job I’m an enterprise IT architect for a leading distribution and services company. The rest of my time I try to maintain a semi-normal family life in the suburbs of Raleigh, NC. E-mail me at philipATlembobrothersDOTcom. The opinions expressed here are entirely my own and not those of my employers, past, present or future (except where I quote others, who will need to accept responsibility for their own rants).