I have not lost much data, either my own or my clients'. I guess I have been lucky. But I have had some incidents:
- A personal hard disk crash (I lost a handful of files).
- One of my PHP hosting providers was hacked and sites were defaced.
- One of my budget VPS providers disappeared overnight.
In each case it was partially my own fault:
- I used a hard disk with bad sectors in its first gigabyte because I could not afford a new drive back then. When partitioning the disk I thought it was enough to "cut" that part out. Unfortunately it was not; the bad sectors were a sign that the whole disk was failing.
- I picked the cheapest hosting provider in Estonia at the time. I was in high school and had a really low budget. The provider was new and inexperienced and did not keep its own backups. Since then I neither recommend nor use a provider that has not been in business for at least two to three years.
- A year ago I needed a machine to run a Jenkins CI server, and I thought it was a good idea to pick one from http://lowendbox.com/. The machine ran well for three months and then disappeared, along with the provider's web site. No emails were answered, and I had no phone numbers (even their invoices lacked contact data and had a placeholder image in place of the logo). Because it was only a test machine, I did not lose much data.
My current solution
Some service/cloud providers keep their own backups. That is good, but it does not help when the provider itself disappears, as in the third incident above. To avoid that, I now use a machine that does not run in the cloud: it runs at my home, where a 60 Mbit/s connection is enough. The total backed-up dataset currently does not exceed 10 GB. The following outlines my backup process; it has both automated and manual steps:
- For files I use rsync over SSH. This saves bandwidth. I usually pull only the files that are needed to restore the machine (though it is always better to take too much than too little).
- (MySQL) databases are backed up using the `mysqldump` utility. This also works over SSH.
- For each machine I keep a separate bash script that pulls the data, and I try to keep these scripts as simple as possible (a sketch follows this list).
- There is a "runner" script that oversees the backup scripts, logs their output and sends it by mail (also sketched below).
- The "runner" script is run every night by cron.
- When backup mails stop arriving for a few days, I know the process is borked.
- Once every two weeks I take a snapshot of the pulled data and store it in a gzipped tar archive (sketched after this list).
- The snapshot gets copied to the external hard drive.
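To make the automated part concrete, here is a minimal sketch of what one of my per-machine scripts might look like. The host, paths and database name are hypothetical placeholders; the real scripts differ per machine but follow the same shape.

```bash
#!/bin/bash
# Sketch of a per-machine pull script. Host, paths and database name
# are hypothetical placeholders.
set -e

HOST="user@example-vps"
DEST="/backup/machines/example-vps"

mkdir -p "$DEST/files" "$DEST/db"

# Pull only the files needed to restore the machine. rsync over SSH
# transfers just the changed parts, which saves bandwidth.
rsync -az --delete "$HOST:/etc/nginx/" "$DEST/files/nginx/"
rsync -az --delete "$HOST:/var/www/"   "$DEST/files/www/"

# Dump the MySQL database over the same SSH channel and compress it
# locally. Credentials live in ~/.my.cnf on the remote machine.
ssh "$HOST" "mysqldump --single-transaction exampledb" \
    | gzip > "$DEST/db/exampledb.sql.gz"
```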
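The runner is similarly simple. A simplified sketch, assuming the per-machine scripts live in one directory and local mail delivery works (paths and the address are placeholders):

```bash
#!/bin/bash
# Sketch of the "runner" script: runs every per-machine backup script,
# logs the output and mails the log. Paths and address are placeholders.
set -e

LOG="/var/log/backup/$(date +%F).log"
mkdir -p "$(dirname "$LOG")"

for script in /opt/backup/machines/*.sh; do
    echo "=== $script ===" >> "$LOG"
    "$script"              >> "$LOG" 2>&1
done

mail -s "Backup report $(date +%F)" me@example.com < "$LOG"
```

Because of `set -e`, a failing script aborts the runner before the mail goes out, so a missing report is itself the failure signal. The nightly cron entry is just:

```
30 3 * * * /opt/backup/runner.sh
```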
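The biweekly snapshot and the copy to the external drive are essentially one-liners; something along these lines, again with placeholder paths:

```bash
# Snapshot the pulled data into a dated gzipped tar archive and copy it
# to the (temporarily connected) external drive. Paths are placeholders.
tar czf "/backup/snapshots/backup-$(date +%F).tar.gz" -C /backup machines
cp "/backup/snapshots/backup-$(date +%F).tar.gz" /mnt/external/
```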
I perform the last two steps manually and run all the scripts right before them. This ensures that if something is borked, I will notice it. I only connect the external drive for copying; otherwise it sits powered off.
I have also looked at more specialized backup solutions but have not found anything suitable yet (or maybe I have not searched hard enough). The machines I maintain are quite different and run different distros and versions. Some of the PHP sites I maintain have no SSH access at all, with FTP being the best they offer. Bash scripts (I run them all with `set -e`) give me the flexibility I need.
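For the FTP-only sites the same pattern applies, only with an FTP mirroring tool in place of rsync. As one illustration (not necessarily the exact tool in my scripts), lftp's mirror command can do the pull; the host, credentials and paths below are placeholders:

```bash
# Mirror an FTP-only site into the local backup directory using lftp.
# Host, credentials and paths are hypothetical placeholders.
lftp -u "user,password" ftp.example.com \
    -e "mirror --verbose /public_html /backup/machines/example-site; quit"
```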