3.6 Backup and Restore

Why Backup and Restore Matters

Data loss on Linux systems can come from hardware failure, user mistakes (rm -rf in the wrong place), software bugs, or malware. Backup and restore is about having:

A reliable way to capture data and configuration at certain points in time.
A tested process to restore that data quickly and correctly.

For administration work, backups are not optional: they’re part of normal operations and planning, not just emergencies.

Key questions you should be able to answer for any system:

What must be backed up? (data, configs, databases, etc.)
How often?
Where are backups stored?
How do I restore a single file vs an entire system?
When was the last time a restore was actually tested?

These questions drive your backup strategy and choice of tools (covered in later subsections).

What Needs to Be Backed Up on a Linux System

On a typical Linux system, not everything needs backing up. Many files can be reinstalled from packages. Focus on:

Critical data

User data: home directories, e.g. /home/*
Application data: databases, mail spools, application-specific directories (for example /var/lib/mysql, /var/lib/postgresql, /var/www, /srv)

System and service configuration

/etc (configuration for most services and the system itself)
Application-specific configs under /opt/app/.../config or similar
Custom scripts and tools in:

/usr/local/bin, /usr/local/sbin
/opt or other custom locations you use

Optional but useful

Cron jobs: /var/spool/cron, /etc/cron*
Systemd units you created or modified: /etc/systemd/system
Custom firewall rules if not already in /etc
Package lists (to help recreate software sets), for example:

Debian/Ubuntu: dpkg --get-selections
Fedora/RHEL: dnf list installed
Arch: pacman -Qqe

What usually does not need backup

On most systems, you can skip:

Temporary files: /tmp, /var/tmp
Cache directories: /var/cache, browser caches, etc.
Device and pseudo-filesystems: /dev, /proc, /sys, /run
Software from repositories (can be reinstalled), unless you need exact versions or offline restore

You can exclude these from backups to save space and time.
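
As a rough illustration of such exclusions, the sketch below uses the filter hook of Python's standard tarfile module. The EXCLUDED list and the function names are assumptions made for this example; the paths are relative to the tree being archived, and the archive file itself should live outside that tree.

```python
import tarfile
from pathlib import Path, PurePosixPath

# Directories from the list above, relative to the root being archived.
# This list is illustrative; adjust it for your own system.
EXCLUDED = [PurePosixPath(p) for p in
            ("tmp", "var/tmp", "var/cache", "dev", "proc", "sys", "run")]


def skip_excluded(tarinfo: tarfile.TarInfo):
    """tarfile filter: return None to drop a member (for a directory this
    also skips everything below it), or the TarInfo unchanged to keep it."""
    path = PurePosixPath(tarinfo.name)
    if any(path == ex or ex in path.parents for ex in EXCLUDED):
        return None
    return tarinfo


def archive_tree(root: Path, archive: Path) -> list[str]:
    """Archive the contents of root, skipping excluded directories,
    and return the normalized member names that were stored."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to root instead of absolute ones.
        tar.add(root, arcname=".", filter=skip_excluded)
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(str(PurePosixPath(name)) for name in tar.getnames())
```

Returning the member names makes it easy to confirm that nothing under an excluded directory slipped into the archive.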

Types of Backups

Backups are usually described by how much data they copy each time:

Full backups

Copy everything selected for backup.
Simplest to understand and restore.
Slowest and largest, so done less frequently (for example weekly).

Incremental backups

Copy only the changes since the last backup of any kind.
Example: after a full backup on Sunday, the incrementals on Monday, Tuesday, and Wednesday each contain only the changes since the previous day's backup.
Faster and smaller, but restoring requires the last full backup plus every subsequent incremental, applied in order.

Differential backups

Copy changes since the last full backup.
Example: Sunday full, Monday differential (changes since Sunday), Tuesday differential (still changes since Sunday), etc.
Larger than incrementals, but simpler to restore: just the last full + the latest differential.

Many real-world tools internally manage these differences, but the concepts still apply: you trade backup size/speed against restore complexity.
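
The restore-side difference can be made concrete with a small model. The sketch below is not how any particular tool works; the Backup class and restore_chain function are hypothetical names that encode the definitions above and report which backups a restore would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    label: str  # e.g. "Sun"
    kind: str   # "full", "incremental", or "differential"


def restore_chain(history: list[Backup]) -> list[str]:
    """Return the labels of the backups needed to restore the newest state,
    in the order they must be applied. history is ordered oldest first."""
    fulls = [i for i, b in enumerate(history) if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup in history")
    last_full = fulls[-1]
    chain = [history[last_full]]
    for backup in history[last_full + 1:]:
        if backup.kind == "differential":
            # A differential holds every change since the last full backup,
            # so it replaces anything collected after that full backup.
            chain = [history[last_full], backup]
        elif backup.kind == "incremental":
            # An incremental only holds changes since the previous backup,
            # so every one of them stays in the chain.
            chain.append(backup)
    return [b.label for b in chain]
```

With a Sunday full backup followed by three incrementals, the chain is all four backups; with three differentials instead, it is only Sunday plus Wednesday.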

Backup Destinations

Where you send your backups matters as much as how you create them.

Common destinations:

Local disk: another partition, another physical disk, or an external USB drive.
Network storage: NFS share, SMB share, NAS device.
Remote server: via ssh, rsync, or specialized tools.
Cloud storage: object stores (S3-compatible, etc.), or backup services.

Basic rule: DO NOT keep the only backup on the same physical disk as the data. Disk failure will take both.
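
A backup script can at least catch the most obvious version of this mistake. The sketch below compares the device IDs reported by os.stat; the function name is an assumption for this example. Note the limitation: st_dev identifies a mounted filesystem, not a physical disk, so two partitions on the same disk still look "different" to this check.

```python
import os


def on_same_filesystem(source: str, destination: str) -> bool:
    """True if both paths live on the same mounted filesystem.

    A True result means the backup certainly shares a disk with the data.
    A False result does NOT prove the paths are on separate physical disks.
    """
    return os.stat(source).st_dev == os.stat(destination).st_dev
```

A script might refuse to run, or at least print a warning, when on_same_filesystem(source, destination) returns True.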

Basic Backup Strategies and Policies

A backup strategy defines when and how you run backups and what you keep. Even simple environments should have a written plan.

The 3–2–1 rule

A widely used guideline:

Keep at least 3 copies of your data:

Original + 2 backups.

Store the copies on 2 different media types:

For example, internal disk + external disk, or disk + cloud.

Keep at least 1 copy offsite:

Different physical location or cloud.

This rule helps protect you against disk failure, local disasters, and mistakes.
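
The rule is simple enough to check mechanically against an inventory of your copies. The sketch below is only an illustration: the Copy class, its fields, and the medium labels are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "internal-disk", "external-disk", "cloud"
    offsite: bool


def check_321(copies: list[Copy]) -> list[str]:
    """Return a list of 3-2-1 violations; an empty list means the rule holds."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 media types")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems
```

The original data counts as one of the copies, so a live system plus one external disk still fails on two counts.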

Retention policies

Retention is about how long you keep each backup before deleting it.

Examples:

Keep daily backups for 7 days.
Keep weekly backups for 4 weeks.
Keep monthly backups for 6 or 12 months.

Even a small system benefits from a simple policy, so backup storage doesn’t grow without limit.
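
One way to express such a policy in code is sketched below. It assumes one interpretation of the example policy: keep every backup from the last 7 days, plus the newest backup in each of the 4 most recent calendar weeks and 6 most recent months that have one. The function name and parameters are hypothetical, and real backup tools define their retention rules in their own ways.

```python
from datetime import date


def backups_to_keep(dates, today, daily=7, weekly=4, monthly=6):
    """Return the set of backup dates to keep; everything else may be pruned."""
    newest_first = sorted(dates, reverse=True)
    keep = set()

    # Daily: every backup taken within the last `daily` days.
    keep.update(d for d in newest_first if (today - d).days < daily)

    # Weekly: the newest backup in each of the most recent ISO weeks.
    weeks = {}
    for d in newest_first:
        weeks.setdefault(tuple(d.isocalendar())[:2], d)  # key: (ISO year, week)
    keep.update(list(weeks.values())[:weekly])

    # Monthly: the newest backup in each of the most recent months.
    months = {}
    for d in newest_first:
        months.setdefault((d.year, d.month), d)
    keep.update(list(months.values())[:monthly])

    return keep
```

A pruning job would delete every backup whose date is not in the returned set. Because the categories overlap (the newest backup is daily, weekly, and monthly at once), the total kept is usually smaller than 7 + 4 + 6.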

Scheduling

Backups should run automatically:

Use cron or systemd timers to schedule:

Frequent incremental or differential backups (for example daily).
Less frequent full backups (for example weekly or monthly).

Choose times when:

Load is lower.
Files are likely to be in a consistent state (for example, outside business hours).

For databases and some services, coordinate the schedule with the application's own backup mechanisms (see Backup Consistency and Online Data below).

Backup Consistency and Online Data

Backing up files that are actively being written can create inconsistent backups.

Ways to handle this:

Use application-specific tools:

Databases often have their own backup commands (for example logical dumps or snapshot support).

Stop or quiesce services briefly while backing up critical data (where feasible).
Use snapshot-capable filesystems or storage:

LVM snapshots, Btrfs/ZFS snapshots, SAN snapshots.
Common pattern: create snapshot → back up the snapshot → delete snapshot.

The correct approach depends on the service (database, VM, application), but the principle is: aim for data that represents a consistent point in time.
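
As one small example of an application-specific tool, the sketch below uses SQLite, chosen here only because Python ships with it; other databases have their own dump or snapshot commands. Instead of copying the database file while it may be changing, it asks SQLite's online backup API for a consistent copy. The function name and paths are assumptions for this example.

```python
import sqlite3


def backup_sqlite(live_path: str, backup_path: str) -> None:
    """Copy a SQLite database using its online backup API.

    Unlike a plain file copy, this yields a consistent point-in-time copy
    even if other connections write to the database during the backup.
    """
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The resulting file is an ordinary SQLite database, so it can be picked up by the regular file-level backup afterwards.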

Testing Restores

A backup is only useful if you know you can restore it.

Basic practices:

Verify backup integrity:

Many tools support checksums or verification modes.

Do test restores regularly:

Restore a single file.
Restore a directory tree into a temporary location.
Periodically do a full system restore onto a VM or spare machine, if possible.

Document restore procedures:

Step-by-step notes: where backups are, commands to run, and where to find any passwords or keys needed.

Track how long restores take:

Helps you set a realistic recovery time objective (RTO).

Testing reveals missing files, bad assumptions, or broken scripts before a real emergency.
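
A test restore can be partly automated. The sketch below restores a tar archive into a scratch directory and compares SHA-256 checksums against the source tree. It assumes the archive stores paths relative to the source root and comes from a trusted location; the function names are hypothetical. Reading whole files into memory is acceptable for a sketch but not for very large files.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def checksums(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def verify_restore(source: Path, archive: Path) -> list[str]:
    """Restore archive into a scratch directory and return the relative paths
    that are missing or differ from source. An empty list means a match."""
    expected = checksums(source)
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions can reject unsafe archive members.
                tar.extractall(scratch, filter="data")
            else:
                tar.extractall(scratch)
        restored = checksums(Path(scratch))
    return sorted(p for p, digest in expected.items() if restored.get(p) != digest)
```

On a live system the source keeps changing after the backup runs, so some differences are expected; comparing against checksums recorded at backup time avoids that problem.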

Encryption and Access Control

Backups often contain everything: user data, secrets, configs, database dumps. They must be protected.

Encryption

Encrypt backups at rest, especially:

When stored offsite or in the cloud.
When physical access is not fully under your control.

Options:

Use filesystem or disk encryption (for example LUKS) for the backup storage.
Use tool-level encryption (for example tools that support GPG or built-in encryption).

Keep encryption keys and passphrases:

Secure but recoverable.
Separate from the backup location (or you lose both together).

Permissions and access

Restrict backup storage:

Limit who can read or write backups.
Use secure remote access (SSH, VPN).

Beware of storing root-owned backups where normal users can read them:

They may contain other users’ data, password hashes (for example from /etc/shadow), or secrets stored in configs.
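
A minimal sketch of both sides of this on a POSIX system: creating a backup file that only its owner can read, and checking an existing file for looser permissions. The function names are assumptions for this example.

```python
import os
import stat


def write_private(path: str, data: bytes) -> None:
    """Create path with mode 0600 (read/write for the owner only)."""
    # O_EXCL refuses to reuse an existing file, whose permissions
    # might already be looser than intended.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def readable_by_others(path: str) -> bool:
    """True if the group or other users have read permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

Setting the mode at creation time, rather than calling chmod afterwards, avoids a short window in which the file exists with default permissions. The permissions of the directory that holds the backups matter as well.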

Disaster Recovery vs. Everyday Restores

Backups serve two main scenarios:

Everyday restore

User deleted a file.
Configuration change broke a service.
Updates went wrong and you want to roll back.

Characteristics:

Small scope: a directory, a config file, or a small data set.
Usually done quickly on the same system.
Often from recent daily backups.

Disaster recovery

Disk failure, hardware failure.
System compromised or corrupted.
Site disaster (fire, flood, etc.).

Characteristics:

Larger scope: whole system or multiple systems.
Requires:

OS reinstall or bare-metal restore.
Recreating partitions and filesystems (see storage chapters for details).
Reinstalling packages and restoring data/configs.

Offsite backups and clear documentation are critical.

Your planning should consider both. Everyday restores are common; disasters are rare but more severe.

Versioning and Snapshots vs Traditional Backups

Backups usually keep archived copies in a separate location. Some systems also use:

Filesystem snapshots (for example Btrfs, ZFS, LVM):

Very fast, space-efficient copy-on-write snapshots.
Great for quick local rollbacks.
Usually live on the same storage pool as the data, so they are not sufficient on their own as real backups.

Versioned backups:

Backups that keep multiple older versions of files.
Useful when users want to go back to “how this directory looked last week”.

For robust protection, snapshots complement backups but do not replace them.

Planning a Simple Backup and Restore Workflow

For a simple Linux system, a beginner-friendly approach might look like this:

Decide what to back up:

/home, /etc, important app data in /var or /srv, and custom scripts in /usr/local or /opt.

Decide where:

External disk and/or remote server.

Decide how often:

Daily incremental, weekly full.

Implement with basic tools:

Use rsync or tar scripts (detailed later in this section).
Schedule with cron or systemd timers.

Secure it:

Use restricted permissions and, if needed, encryption.

Document and test:

Write restore steps.
Do a test restore at least monthly.

The later subsections in this part (rsync, tar archives, snapshot systems, automation) provide concrete commands and examples to fill in each step.

3.6.1 Backup strategies

3.6.2 Using rsync

3.6.3 Creating tar archives

3.6.4 Using snapshot systems

3.6.5 Automating backups
