We’ve seen plenty of crazy ways to keep your precious data safe. Some people burn a few tons of DVDs, others make a monthly habit of swapping hard drives into a safe location. In today’s How-To we’ll show you how to automatically keep your computer’s data backed up with ssh and rsync. Feel that? That’s our warm comfy safe-data blankie. Check it out.
What about backup software? There are so many flavors of software, only Google could count them all. Since we like our choice of operating systems, we want something that’ll work from our Mac, Windows or Linux machines. We’ll cover some good software backup options next time; for now it’s the down and dirty, nitty gritty network backups.
First of all, we’re going to need somewhere to keep our data. For the tools we’ll be using, we’ll need a server that we can access via secure shell (ssh) from anywhere on the net. If you only want to back up your stuff at home, that’s fine.
For our first example, we’ll be using Ubuntu Linux on our laptop with our Linux web server. You can get an inexpensive shared host from a web host provider or roll your own like we did. Our particular backup solution is to update a copy of our data rather than take incremental snapshots over time. You can do it either way, but for our needs, we just want our current data set kept alive.
Once you’ve decided where to keep your data, and what you want to backup on your laptop or workstation, you’ll need the tools to keep things rolling.
The heart of our — and many others’ — cross platform backup is a combination of ssh and rsync. The secure shell is probably the most useful networking application ever. We’ll use it to transport our data securely to our backup location. To update the files, rsync will be used — it’s designed to copy and synchronize data from one location to another.
For our example, we’ll back up /home/willo/data to our server. The top directory ‘data’ should be created on the server. We’ll use an Ubuntu laptop; if rsync and ssh aren’t already installed, you can easily install them with this command:
sudo apt-get install rsync openssh-client
The following command will copy and update the data inside /home/willo/data to our server’s directory /home/willo/data. When it’s run by hand, it requires the ssh password for user willo on the server. Not a big deal, but when it’s automated, we won’t be there to enter the password.
rsync -avz -e ssh data [email protected]:/home/willo
To get around the password requirement, we need to create a pair of ssh keys. The keys will allow our ssh connections without user intervention. (This also means that someone else could connect if they get your key…)
Here’s the command for copy/paste ease:
ssh-keygen -t rsa -b 2048 -f /home/willo/backup-key
The command creates a pair of keys. The private key stays on the laptop and will open our connection. We’ll need to copy the public key to our destination/backup server. Once it’s there, append it to the authorized_keys file in ~/.ssh. If you don’t have one, just rename the public key to authorized_keys instead of appending it.
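The server-side dance looks something like this sketch (the hostname and filenames are this article’s examples; on newer systems ssh-copy-id does the same job in one step):

```shell
# Send the public half to the server...
scp /home/willo/backup-key.pub [email protected]:

# ...then append it to authorized_keys (created if it doesn't exist yet).
ssh [email protected] \
  'mkdir -p ~/.ssh && cat backup-key.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
```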
Now we can run our backup with a single command and no password prompt. Again, here it is for quick cut and paste:
rsync -avz -e "ssh -i /home/willo/backup-key" /home/willo/data [email protected]:/home/willo
As always, replace the paths at will. (Har har.)
Now we’ll put our backup command into a script so we don’t have to remember every detail. This is the quick and dirty version. We ran vi backup-data.sh, added a bash header and our rsync command, then ran chmod 500 backup-data.sh so we can execute it, but other users can’t look at it.
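For reference, here’s a minimal sketch of what backup-data.sh can contain, using the example paths from above; swap in your own key, paths, and hostname:

```shell
#!/bin/bash
# backup-data.sh -- push /home/willo/data to the backup server.
# Key, paths, and hostname are this article's examples.
rsync -avz -e "ssh -i /home/willo/backup-key" /home/willo/data [email protected]:/home/willo
```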
Cron is a scheduling program. We can schedule software to run as often or as rarely as we want. To regularly run our backup script, we’ll create a cron job. Use the command crontab -e to edit your crontab. The first five fields determine how often to run the job, and the command follows. In this case, we’ll sync our data every 30 minutes.
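The crontab entry for a 30-minute cycle looks like this (the script path follows the example above):

```shell
# min hour day month weekday  command
*/30  *    *   *     *        /home/willo/backup-data.sh
```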
Now you know how to do it from a linux box, but how about Windows? You can do the same thing by installing cygwin – it’s a set of unix tools built to run under the Windows environment. Download the installer here.
Run the installer and step through the process. When you get to the package selection window, you’ll need to select cron, ssh and rsync.
Once the installer finishes, open up the cygwin bash shell program. From here you’ll be able to perform the same steps we outlined for linux. The only real difference is in the directory names. We suggest putting your data in an easy location like C:\data. Then you can use /cygdrive/c/data in your commands instead of c:\data.
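Under cygwin the backup command keeps the same shape; only the local paths change. The key location here is just our guess at where you might keep it:

```shell
# C:\data becomes /cygdrive/c/data inside the cygwin shell.
rsync -avz -e "ssh -i /cygdrive/c/backup-key" /cygdrive/c/data [email protected]:/home/willo
```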
You can do the same thing under Mac OS X; all the tools are already installed. Just navigate to Applications/Utilities and open up the Terminal program. After that, you can follow the same instructions. A word of warning: rsync’s support of resource forks has been an issue. You’ll probably want to look into using RsyncX. If you’re dealing with simple data like image files, normal rsync should get the job done.
Now that you know how to keep your data backed up to a server, where should it go? Well, how about to an off the shelf NAS like the Buffalo Terastation? With a few modifications, we used the same solution with ours.
A visit to the Terastation wiki turned up a few hacks that opened up the box’s latent abilities. We installed firmware from this page to gain telnet and root access to the box. The updater is pretty large, but it worked just fine for us.
After the update, we opened a telnet session to our Terastation. (We gave ours a static IP and set the gateway and DNS settings.) To quickly and easily install ssh, we used the following command logged in as myroot:
tar zxvf /home/dropbear.tgz
With ssh running, we created a user using the normal control panel. All users are defaulted to home, but you can edit /etc/passwd and provide something like /home/willo if you want to keep things separate. Create /home/willo/.ssh, copy the public key to authorized_keys. After that, decide where to keep your backup. Don’t use the home directory — put it under one of the normally shared directories under /mnt/array1.
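Logged in as myroot on the Terastation, the setup boils down to something like this sketch (the backup directory name under /mnt/array1 is our assumption; the rest follows the article’s examples):

```shell
# Give willo a key and a place to put backups.
mkdir -p /home/willo/.ssh
cat backup-key.pub >> /home/willo/.ssh/authorized_keys  # the public key you copied over
chmod 700 /home/willo/.ssh
chmod 600 /home/willo/.ssh/authorized_keys

# Keep the actual data on the big RAID array, not in /home.
mkdir -p /mnt/array1/backup       # assumed share name
chown -R willo /home/willo/.ssh /mnt/array1/backup
```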
If the worst should happen, or you just need a copy of your data, you can snag it using a couple of tricks. To securely copy the whole shebang, scp is the easiest. (scp, secure copy, is built into OpenSSH.) You can use a key, but your password will work just as well.
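A full restore with scp looks like this; -r recurses through the directory, and -i is optional if you’d rather just type the password:

```shell
# Pull the whole backup down to the local machine.
scp -r -i /home/willo/backup-key [email protected]:/home/willo/data /home/willo/
```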
If you’ve got most of your data on another machine, but want to update it with the latest changes from your working copy, you can use rsync, but reverse the source and destination. Again, you can use the key, or not. It’s your choice. (Just don’t try to run this by hand while the crontab rsync is running as well.)
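Reversed, the same rsync command pulls from the server instead of pushing to it:

```shell
# Server copy is now the source; local /home/willo is the destination.
rsync -avz -e "ssh -i /home/willo/backup-key" [email protected]:/home/willo/data /home/willo
```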
It’s important to keep the limitations of this method in mind. If you’re working with huge files, then you’ll need some major bandwidth. You probably won’t get that 2GB file copied to the server from the local coffee shop’s DSL connection. Thanks to rsync, you can pretty easily add or update smaller data files. Even if the upload doesn’t complete, rsync will skip the files that already made it the next time it connects. Keep your eyes out for part two, when we’ll look at a few backup options that don’t require a terminal to use.