Software engineering blog of Clément Bouillier: Data backup solution based on RSync with a NAS

Monday, January 3, 2011

Data backup solution based on RSync with a NAS

I once lost some data to a hard disk crash, and recently my first external hard drive started to show problems… I can only repeat the usual advice: think seriously about backing up your data as soon as warning signs pile up, such as repeated disk scans at startup (or when you plug in an external drive) or suspicious behavior when reading data from the drive… That is what I did recently, and it saved me from losing plenty of personal photos and videos…
From that moment on, I decided to set up a permanent backup solution. After a look at web-hosted solutions (I was not completely convinced), I finally went for my own NAS, a DLink DNS-323, which is really easy to configure and extend (embedded Linux). It was also a chance to get my hands dirty with Linux toys again :) (it had been a long time…), so don’t be afraid to try! (Unless all you do with a computer is write documents and emails… in that case it could take you several long nights to get it running.)

Rsync over SSH as the main toys

Rsync is an incremental file synchronization tool for Unix systems. It is command-line based, but it becomes really powerful when combined with scripts. I will let you search the web for details on this tool; I will only show how I use it for backups. Note that there are several shared solutions built around RSync. I was particularly inspired by wiki.dns323.info and BackupNetClone. I created my own scripts since the first one is too minimalist (based on BAT scripts… ouch) and I found the second one too intrusive on client computers (it needs an SSH daemon and an RSync server on each).

SSH will be used to secure RSync file synchronization.

To use it with Windows clients, the first thing is to install Cygwin (or another Linux-like environment). It is really simple: click Next until the package selection, select the RSync and OpenSSH packages (just the main ones, dependencies will be pulled in automatically), and then click Next until the end.

I will come back to the client setup (don’t be afraid, it is just a script that has to be scheduled…) after a quick look at the server side, i.e. the NAS.

Set up NAS

My NAS, a DLink DNS-323, is Linux based. You have to use a fun_plug script that is loaded at NAS startup. You can use ffp, which includes some applications, in particular the SSH and RSync daemons. Follow the instructions at the following link to install it: wiki.dns323.info/howto:ffp.

Typically, you set up a backup account on the DNS-323 through the admin interface (http://[NAS IP]) by adding a "backup" account in the Advanced tab. Next, you can change its home and shell in /etc/passwd.

Set up clients

First, you have to configure the client once; after that, you will probably only change the configuration of which folders to back up.

First time set up

I explain here what you have to do once for each client computer (i.e. each one to back up):

1. generate SSH keys that will be used next:

ssh-keygen -t dsa -b 1024






You can keep the default key path. Do not provide a passphrase if you want to automate your backups (otherwise it would be asked for each time you back up).




2. copy SSH public key of client to the NAS with:




ssh-copy-id -i ~/.ssh/id_dsa.pub backup@[dns-323 IP]


I have packaged this in a script along with some simple configuration (IP, backup user name…).
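
As an illustration only, here is a minimal sketch of what such a first-time setup script could look like. It is not the actual script: the variable names (NAS_IP, BACKUP_USER, KEY_FILE), the placeholder IP address and the SETUP_RUN switch are all assumptions made for the example. The key type follows the commands above; recent OpenSSH releases no longer support DSA keys, so another type may be needed there.

```shell
#!/bin/bash
# Hypothetical first-time setup sketch; names and defaults are assumptions.
NAS_IP="${NAS_IP:-192.168.0.10}"          # placeholder, replace with your NAS IP
BACKUP_USER="${BACKUP_USER:-backup}"      # account created in the NAS admin interface
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_dsa}" # default path used by the commands above

# Print the destination in the user@host form expected by ssh-copy-id.
ssh_target() {
    printf '%s@%s\n' "$BACKUP_USER" "$NAS_IP"
}

# Generate the key pair only if it does not exist yet.
# -N "" sets an empty passphrase so that backups can run unattended.
generate_key() {
    if [ ! -f "$KEY_FILE" ]; then
        mkdir -p "$(dirname "$KEY_FILE")"
        ssh-keygen -t dsa -b 1024 -N "" -f "$KEY_FILE"
    fi
}

# Copy the public key to the NAS so that later logins need no password.
install_key() {
    ssh-copy-id -i "${KEY_FILE}.pub" "$(ssh_target)"
}

# Run both steps only when SETUP_RUN=1, so the file can also be sourced.
if [ "${SETUP_RUN:-0}" = "1" ]; then
    generate_key && install_key
fi
```

Saved under a name of your choice, it would be run once per client, for example with SETUP_RUN=1 NAS_IP=[dns-323 IP] in the environment.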





What to backup?


My scripts (explained below) search for configuration files, each giving one path to back up along with its destination path on the NAS:



# Local path to backup, use /cygdrive/[drive letter]/... syntax
LOCAL_PATH_TO_BACKUP="/cygdrive/c/testbackup"

# [Optional] Target Rsync module -> override global settings
#TARGET_MODULE="backup"

# Target path in module
TARGET_PATH="test"
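
Since these configuration files are plain shell variable assignments, a script can simply source them. The following is a minimal sketch of how the search and loading could work, not the actual script: the .conf extension, the function names and the default module name "backup" are assumptions made for the example.

```shell
#!/bin/bash
# Sketch: find every *.conf file in a directory and load each one in a subshell.
# The .conf extension and the function names are assumptions.

# Load one configuration file and print "local path -> module/target path".
# $1: configuration file, $2: global default module (assumed to be "backup").
describe_backup() {
    local conf_file="$1"
    local default_module="${2:-backup}"
    (
        # The subshell keeps variables of one file from leaking into the next.
        TARGET_MODULE="$default_module"
        LOCAL_PATH_TO_BACKUP=""
        TARGET_PATH=""
        . "$conf_file"
        if [ -z "$LOCAL_PATH_TO_BACKUP" ] || [ -z "$TARGET_PATH" ]; then
            echo "skipping $conf_file: missing LOCAL_PATH_TO_BACKUP or TARGET_PATH" >&2
            exit 1
        fi
        printf '%s -> %s/%s\n' "$LOCAL_PATH_TO_BACKUP" "$TARGET_MODULE" "$TARGET_PATH"
    )
}

# Walk over all configuration files found in a directory.
list_backups() {
    local config_dir="$1"
    local conf_file
    for conf_file in "$config_dir"/*.conf; do
        [ -f "$conf_file" ] || continue   # the glob matched nothing
        describe_backup "$conf_file"
    done
    return 0
}
```

Calling list_backups on a folder of such files prints one line per backup job, which is a convenient way to check the configuration before any data is transferred.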



Launch a backup


I have a launchBackup.bat script that launches the backup.sh script through Cygwin. In this script, I load the configuration from the setup, open an SSH tunnel, start rsync, and finally close the SSH tunnel.
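
The tunnel, rsync and cleanup sequence could be sketched as follows. This is an illustration under assumptions, not the actual backup.sh: the function names, the placeholder IP address, the local port 8873, the two-second wait and the use of the default rsync daemon port (873) on the NAS are all choices made for the example.

```shell
#!/bin/bash
# Sketch of the tunnel / rsync / cleanup sequence; values are assumptions.
NAS_IP="${NAS_IP:-192.168.0.10}"                       # placeholder NAS address
BACKUP_USER="${BACKUP_USER:-backup}"
SSH_TUNNEL_LOCAL_PORT="${SSH_TUNNEL_LOCAL_PORT:-8873}" # arbitrary free local port
RSYNC_DAEMON_PORT="${RSYNC_DAEMON_PORT:-873}"          # default rsync daemon port

# Build the -L forwarding spec: local port -> rsync daemon on the NAS itself.
tunnel_spec() {
    printf '%s:127.0.0.1:%s\n' "$SSH_TUNNEL_LOCAL_PORT" "$RSYNC_DAEMON_PORT"
}

# $1: local path, $2: rsync module, $3: target path inside the module.
run_backup() {
    local local_path="$1" module="$2" target_path="$3"
    local tunnel_pid status

    # -N: run no remote command, only forward the port; keep it in the background.
    ssh -N -L "$(tunnel_spec)" "${BACKUP_USER}@${NAS_IP}" &
    tunnel_pid=$!
    sleep 2   # crude wait until the tunnel is up

    rsync -aivx --port="$SSH_TUNNEL_LOCAL_PORT" --chmod=+rwx \
        "$local_path" "127.0.0.1::${module}/${target_path}"
    status=$?

    # Close the tunnel whatever rsync returned, then report rsync's status.
    kill "$tunnel_pid" 2>/dev/null
    return "$status"
}
```

A call such as run_backup "/cygdrive/c/testbackup" "backup" "test" would then back up one configured folder. Because rsync talks to 127.0.0.1, all traffic goes through the SSH connection.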



The RSync command is:



rsync -aivx --port="${SSH_TUNNEL_LOCAL_PORT}" --chmod=+rwx "${LOCAL_PATH_TO_BACKUP}" "127.0.0.1::${TARGET_MODULE}/${TARGET_PATH}"



The parameter names speak for themselves, and -aivx are some common rsync options (-a archive mode, -i itemized changes, -v verbose, -x stay on one file system). I have not yet set up incremental backups with --link-dest (the hard-linking option), and I am wondering about using --delete, which also removes on the server what has been removed from your client folders (you then have to make sure that a given server path is used by only one client to avoid massive deletions…).
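
For reference, a snapshot-style backup with --link-dest could be sketched as below. This is not something set up here: the dated folder layout, the function names and the default port are assumptions, and whether the NAS rsync daemon accepts a relative --link-dest path depends on its configuration, so treat it as a starting point to experiment with.

```shell
#!/bin/bash
# Sketch of dated snapshots with --link-dest; layout and names are assumptions.
SSH_TUNNEL_LOCAL_PORT="${SSH_TUNNEL_LOCAL_PORT:-8873}"

# Print the --link-dest option for the previous snapshot,
# or nothing on the very first run (no previous snapshot yet).
link_dest_option() {
    local previous="$1"
    if [ -n "$previous" ]; then
        # A relative path is resolved against the destination directory.
        printf '%s\n' "--link-dest=../${previous}"
    fi
    return 0
}

# $1: local path, $2: module, $3: target path, $4: previous snapshot name or "".
run_snapshot() {
    local local_path="$1" module="$2" target_path="$3" previous="$4"
    local today link
    local opts=(-aivx --port="$SSH_TUNNEL_LOCAL_PORT")

    today="$(date +%Y-%m-%d)"
    link="$(link_dest_option "$previous")"
    if [ -n "$link" ]; then
        opts+=("$link")
    fi

    # Trailing slash on the source: copy the folder content into today's snapshot.
    # Unchanged files are hard-linked to the previous snapshot instead of copied.
    rsync "${opts[@]}" "${local_path%/}/" \
        "127.0.0.1::${module}/${target_path}/${today}/"
}
```

Each run then produces a complete-looking dated folder on the NAS while only changed files take extra space, which also sidesteps the --delete question since older snapshots keep removed files.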



Don’t forget to check your firewall settings if you get “Connection refused”-like errors.



Scheduling


You can simply rely on the Windows Task Scheduler. And you are done!



Assessment…



Not so pricey: I got the NAS for 100€ plus 70€ for a 1.5 TB hard disk drive. It is quite easy to set up, open since you are the sole master of your backup, and therefore easily configurable and extendable, with unlimited possibilities.



About security, it covers hardware failure but does not protect against other, more serious domestic risks like burglary or fire… For that I have an idea: build a small network of such NAS boxes (two to start with…), with some relatives for example, which would give all of us a backup solution along the way :)



And a final word about environmental impact: I bought an energy meter, and the NAS consumes only 10 watts when idle (most of the time), which is quite good in the end.
