
3.6 Backup and Restore

Introduction

Backup and restore form the safety net of any Linux system. At this stage in the course, you already know your way around the filesystem and basic administration tools. This chapter focuses on how to protect data, how to think about risk, and how to design practical backup approaches that you can later implement with tools such as rsync, tar, and snapshot systems, which will each be covered in their own sections.

The goal here is not to turn you into a backup engineer, but to give you enough understanding to design a backup plan that fits your needs and to restore data confidently when something goes wrong.

A backup that has never been tested by performing a restore is not a reliable backup.

Why Backups Matter

On a Linux system, problems rarely announce themselves in advance. A user might accidentally delete important files, a disk might fail without warning, configuration changes can break services, or a ransomware attack can encrypt data. In all these cases, backups are often the only realistic way to return the system to a known good state.

Linux servers are frequently used to host critical services such as web applications, databases, or file shares. On desktops and laptops, you may have personal documents, photos, and development projects. If this data is lost, it may be impossible or very costly to recreate. A backup and restore plan converts potential disasters into inconveniences.

It is also important to realize that backups are not only about hardware failure. Logical problems such as corrupted configuration files, buggy software updates, or user mistakes are often more common than physical disk failure. Good backup plans are designed to handle both physical and logical failures.

Core Backup Concepts

Before you choose specific tools, you need to understand some common backup ideas. These concepts appear in almost every backup system, regardless of the technology you use.

A backup is a separate copy of data stored in a different location from the original. The word “separate” is critical. If you copy /home to another directory on the same disk and that disk fails, both the original and the copy are gone.

A restore is the process of taking data from a backup and putting it back on a system so that it can be used again. A successful restore is the true purpose of a backup: the backup process itself is useful only if it leads to reliable restores.

There are three classic types of backups. A full backup copies all selected data every time the backup runs. An incremental backup copies only the data that has changed since the last backup of any type. A differential backup copies all data that has changed since the last full backup. Incremental and differential backups are used to reduce the amount of data that has to be copied each time, which can save time and storage space, especially on servers with large datasets.
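The three types can be illustrated with GNU tar's incremental mode, which a later section covers in detail. This is a minimal sketch in a temporary directory; all file names here are illustrative:

```shell
set -eu
workdir=$(mktemp -d) && cd "$workdir"
mkdir data
echo "monday report" > data/report.txt

# Full backup: the snapshot (.snar) file records what has been backed up.
tar -czf full.tar.gz --listed-incremental=backup.snar data

# A new file appears; the next run with the same snapshot file copies
# only the change (an incremental backup). Reusing an untouched copy of
# the original .snar for every later run would instead capture all
# changes since the full backup (a differential backup).
echo "tuesday notes" > data/notes.txt
tar -czf incr1.tar.gz --listed-incremental=backup.snar data
```

Listing the two archives shows that the full backup contains everything, while the incremental one contains only the new file.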

You will also meet the idea of a backup window, which is the period during which backups are allowed to run. On servers, this is usually scheduled during low-traffic periods so that backups do not compete heavily with production workloads. On desktops and laptops, this may simply be set to run at night or when the machine is idle.

Another key concept is the retention period, the length of time you keep old backups before deleting them. Longer retention gives more options for restoring historical data, but uses more storage. A common approach is to keep recent backups at a higher frequency, and older backups at a lower frequency, for example daily backups for a week, weekly backups for a month, and monthly backups for a year.
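A simple retention policy can be enforced with GNU `find`. The sketch below simulates a backup directory with one stale and one recent daily archive (the naming scheme is hypothetical) and then deletes everything older than seven days:

```shell
set -eu
backups=$(mktemp -d)   # stand-in for a real backup directory

# Simulate one stale and one recent daily archive (GNU touch -d).
touch -d '10 days ago' "$backups/backup-stale.tar.gz"
touch "$backups/backup-today.tar.gz"

# Enforce a 7-day retention: delete daily archives older than 7 days.
find "$backups" -name 'backup-*.tar.gz' -mtime +7 -delete
```

Real tiered schemes (daily for a week, weekly for a month, monthly for a year) need slightly more logic, but the principle of pruning by age is the same.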

The 3-2-1 Rule and Backup Levels

A widely used guideline for robust backup planning is the 3-2-1 rule. It is simple and can be applied to home and enterprise environments.

3-2-1 rule: Keep 3 copies of your data on 2 different types of storage, with at least 1 copy stored offsite.

The three copies are the original active data and two backup copies. Two different types of storage might mean internal disk and external disk, or local disk and cloud storage. Offsite storage can be a remote server, cloud backup, or a disk stored in a different physical location. The purpose is to protect against both local hardware failure and local disasters such as fire, theft, or flooding.

A second important idea is backup level. Full, incremental, and differential backups can be combined. For instance, you might perform a full backup every Sunday and incremental backups Monday through Saturday. To restore data from Wednesday, you would restore the full backup from Sunday, then the incremental backups from Monday, Tuesday, and Wednesday.
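The Sunday-plus-incrementals scheme above can be sketched with GNU tar. The key point is that a restore replays the full backup first, then every incremental in order:

```shell
set -eu
workdir=$(mktemp -d) && cd "$workdir"
mkdir data

# Sunday: full backup.
echo "sunday" > data/sun.txt
tar -czf full.tar.gz --listed-incremental=snap.snar data

# Monday and Tuesday: incrementals, updating the same snapshot file.
echo "monday" > data/mon.txt
tar -czf incr-mon.tar.gz --listed-incremental=snap.snar data
echo "tuesday" > data/tue.txt
tar -czf incr-tue.tar.gz --listed-incremental=snap.snar data

# Restore: replay the full backup, then each incremental in order.
# GNU tar's manual recommends -g /dev/null when extracting.
mkdir restore
for archive in full.tar.gz incr-mon.tar.gz incr-tue.tar.gz; do
    tar -xzf "$archive" --listed-incremental=/dev/null -C restore
done
```

Skipping or reordering an archive in the loop would produce an incomplete or inconsistent restore, which is exactly why longer incremental chains make restores more fragile.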

Understanding this structure is important because it affects how long restores will take and how complex they are. More small incremental backups may save space and backup time, but they can make the restore process more complicated.

What to Back Up

Not all data on a Linux system is equally valuable. Before designing a backup strategy, you should decide what you actually need to protect. This step reduces backup time and storage usage and makes restores simpler.

On most systems, the contents of /home are usually critical. This directory contains user documents, desktop settings, SSH keys, source code, and personal data. Losing /home often means losing irreplaceable work.

System configuration files in /etc are another common target. They include service configurations, network settings, and system-wide options. Restoring these files lets you rebuild a system more quickly after reinstalling the operating system.

Application data is often stored under /var, for example in /var/lib for databases or other state. However, some parts of /var, such as caches and log files, may not need to be backed up if they can be rebuilt automatically or are not critical for your purposes. You will learn more specific techniques for capturing and restoring such data in later sections, especially in relation to databases.

System binaries in directories like /usr/bin usually come from packages and can be reinstalled from repositories, so they are less critical to back up, but there are exceptions. Locally installed software under /usr/local or custom scripts in /opt might deserve backup, especially if you have built them from source or customized them.

Finally, you should consider user-specific hidden files, often called dotfiles, such as .bashrc or .config directories in home directories. These hold personal preferences and environment configurations and are usually cheap to back up compared to the inconvenience of recreating them by hand.
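The selection decisions above can be sketched as a tar invocation against a miniature fake filesystem (built here under a temporary directory so the example is self-contained): configuration and user data go in, rebuildable caches stay out.

```shell
set -eu
root=$(mktemp -d)   # a miniature fake filesystem for illustration

mkdir -p "$root/etc" "$root/home/alice" "$root/var/cache"
echo "server config" > "$root/etc/myservice.conf"
echo "irreplaceable" > "$root/home/alice/thesis.txt"
echo "rebuildable"   > "$root/var/cache/pkg.bin"

# Back up configuration and user data; deliberately skip caches,
# which can be regenerated and would only inflate the backup.
tar -czf selective.tar.gz -C "$root" etc home
```

On a real system the same idea applies to /etc, /home, /usr/local, and selected parts of /var, with excludes for caches and logs you do not need.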

Backup Frequency and Scheduling

Once you know what to back up, you must decide how often to do it. The key idea is the recovery point objective, often abbreviated as RPO. This is the maximum amount of data you are willing to lose when a problem occurs.

If your RPO is one day, you are saying that at worst you are willing to lose a day of new or changed data. That implies daily backups. You can think of RPO in simple terms.

If you back up once every $T$ hours, the worst case data loss is at most $T$ hours of changes. In symbolic form:

$$
\text{Max data loss} \le T \text{ hours}
$$

Reducing the RPO, for instance from 24 hours to 1 hour, usually means more frequent or more complex backups and more storage usage. There is always a trade-off between cost and risk.

Scheduling on Linux is usually implemented with tools such as cron or systemd timers, but this chapter focuses on the planning aspect. Your scheduling choices should reflect how often the protected data changes. Large media files that rarely change might only need weekly backups, while active databases or project directories may need hourly or continuous protection.
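As a sketch of the scheduling side, this is a crontab entry (a configuration fragment installed with `crontab -e`, not a script) that runs a hypothetical backup script at 02:30 every night, a typical low-traffic window:

```shell
# minute hour day-of-month month day-of-week  command
30 2 * * * /usr/local/bin/nightly-backup.sh
```

systemd timers achieve the same effect with a `.timer` unit and have the advantage of logging runs in the journal; both mechanisms are covered elsewhere in the course.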

Backup Storage and Media Choices

A backup is only as reliable as the storage that holds it. On Linux systems, you have several common storage options, each with its own advantages and disadvantages.

An external hard drive or SSD connected via USB is a simple way to store backups, especially on desktops and laptops. It gives fast restores and does not usually require an internet connection. However, if the external drive is always connected to the system, it can be affected by local disasters or malware that targets all mounted disks.

Network attached storage, often mounted from a dedicated NAS device, allows backups to be sent to another machine on the same network. This improves durability compared with local-only storage and can centralize backups from multiple systems. It is often used together with snapshot-capable filesystems that you will explore in a later section.

Cloud storage and object stores are increasingly used to hold offsite backups. Tools that integrate with S3-compatible stores or similar services can encrypt data locally before upload and maintain versioned backups. Cloud-based backups automatically give some level of geographic separation, which is useful for disaster recovery.

Optical media and tape are less common in small setups but still used in enterprises that require long-term archival storage. Tapes, for example, can store large amounts of data at relatively low cost per gigabyte and can be physically rotated offsite, but access is slower and management is more complex.

When choosing backup media, you must consider capacity, performance, cost, and how easy it will be to perform a restore. It is common to combine multiple media types to follow the 3-2-1 rule discussed earlier.

Encryption and Security of Backups

Backups can contain the most sensitive data on your system. Passwords, private keys, confidential documents, and database contents are often present in clear form somewhere in the backup. Without proper protection, an attacker may find it easier to compromise your data by stealing a backup than by breaking into the live system.

There are two related concerns. Backups must remain confidential and they must remain unmodified. If you store backups on removable media that you physically control, you might decide to rely on physical security. However, as soon as you send backups over a network or store them offsite or in the cloud, encryption becomes important.

Many backup tools can perform encryption directly during backup creation. In other cases, you may use filesystem level encryption or encrypted container formats. In all cases, the encryption keys or passphrases must be carefully managed. If you lose the keys, the backups become useless.
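One common stand-alone approach is to encrypt an archive symmetrically with GnuPG. This is a minimal sketch with illustrative file names; in practice the passphrase file must itself be stored safely, separate from the backups:

```shell
set -eu
workdir=$(mktemp -d) && cd "$workdir"

# A small archive standing in for a real backup.
echo "db_password=secret" > app.conf
tar -czf backup.tar.gz app.conf

# Symmetric AES256 encryption driven by a passphrase file.
echo "use-a-long-random-passphrase" > pass.txt
gpg --batch --yes --quiet --pinentry-mode loopback \
    --passphrase-file pass.txt --symmetric --cipher-algo AES256 \
    --output backup.tar.gz.gpg backup.tar.gz

# Decryption during a restore uses the same passphrase.
gpg --batch --yes --quiet --pinentry-mode loopback \
    --passphrase-file pass.txt --decrypt \
    --output restored.tar.gz backup.tar.gz.gpg
```

Only the `.gpg` file should ever leave the machine; losing `pass.txt` makes the backup permanently unreadable, which is exactly the key-management risk described above.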

Integrity is the second concern. You want to be sure that the data you are restoring has not been silently corrupted or tampered with. Some backup systems track checksums, which are cryptographic hashes of the backup data. When you verify a backup, the tool recalculates these hashes and compares them to the stored values. If they differ, you know something has gone wrong.
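Even without a dedicated backup tool, the same checksum idea can be applied by hand with `sha256sum` (the archive name here is a stand-in):

```shell
set -eu
workdir=$(mktemp -d) && cd "$workdir"
echo "archive payload" > backup.tar.gz   # stand-in for a real archive

# At backup time, record a checksum next to the archive.
sha256sum backup.tar.gz > SHA256SUMS

# At verification time, recompute and compare; a corrupted or
# tampered file makes this command exit with a non-zero status.
sha256sum -c SHA256SUMS
```

Storing the checksum file alongside (or, better, apart from) the archive lets you detect silent corruption before you depend on the backup in an emergency.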

Testing and Verifying Backups

Even if your backups run on schedule and appear to complete successfully, there is always the possibility of unnoticed problems. Filesystems can suffer from silent corruption, scripts can contain errors, and configuration changes may cause some directories to be skipped accidentally.

Verification reduces this risk. There are two main levels of verification. The first is to check that the backup is technically readable and consistent. Some tools provide built-in verification commands that scan the backup storage, check internal indexes, and confirm that stored checksums match the content. This type of verification is fast and can often be automated after each backup job.

The second level is the only one that proves that the backup is genuinely useful, namely performing real restore tests. A good practice is to periodically choose a subset of important files or directories and restore them to a safe temporary location, perhaps another machine or a test directory, then confirm that they are complete and usable. For servers, this can extend to building a full test recovery of a service from backups to verify that all necessary components have been captured.
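A restore test of this kind can be sketched in a few lines: restore into a scratch directory, never on top of the live data, and compare the result with the original (all paths here are illustrative):

```shell
set -eu
workdir=$(mktemp -d) && cd "$workdir"

# A "live" project and its backup.
mkdir -p live/project
echo "final draft" > live/project/report.txt
tar -czf backup.tar.gz -C live project

# Restore into a scratch directory and compare byte for byte;
# diff exits non-zero if anything is missing or altered.
mkdir restore-test
tar -xzf backup.tar.gz -C restore-test
diff -r live/project restore-test/project
```

The same pattern scales up: for a server, the "scratch directory" becomes a test machine or container, and the comparison becomes a functional check of the restored service.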

It is especially important to test restores after any major change in your backup configuration, such as switching tools, adding new filesystems, or changing storage locations. Regular testing ensures that when a real incident occurs, you already know the restore process and can perform it under pressure.

The Restore Process and Recovery Objectives

From an operational point of view, the restore process is the center of your backup strategy. When something fails, you need a clear plan for how to go from a broken state to a working state using the data in your backups.

Two standard concepts guide this planning. You have already met the recovery point objective, which describes acceptable data loss. The second is the recovery time objective, abbreviated as RTO. This is the maximum acceptable downtime before a system or service must be back online.

If your RTO is short, for example minutes instead of hours, you must design both your backups and your restores accordingly. That might mean using snapshots that can be rolled back quickly, keeping spare hardware ready for rapid replacement, or predefining detailed recovery procedures that administrators can follow without improvisation.

The restore process itself usually involves several steps. You decide what point in time you want to restore to, locate the relevant backups, and then bring the data back to the target system. This might require reinstallation of the base operating system first, then restoring configuration, application data, and finally user data. For more complex services, such as databases or multi server applications, ordering matters, and you must ensure consistency between components.

Good documentation is part of a backup strategy. Recording the exact steps required to restore a given service or machine makes it more likely that someone can perform the restore quickly and accurately when needed, even if the original person who designed the system is not available.

Backup Strategies for Different Environments

Different Linux environments require different backup approaches. A personal laptop with photographs and documents does not have the same requirements as a production web server that serves thousands of users.

On personal systems, simplicity is usually more valuable than complex optimization. A common approach is to use a full backup to an external disk on a regular schedule, perhaps with versioning so older copies of files can be restored if needed. Cloud-based backups can provide an extra layer of offsite protection without much manual effort.

On single-server setups, such as small web servers or internal services, backups need to include the operating system configuration, any databases, application data, and user data. In practice, this often results in a combination of file-level backups and application-specific database dumps or snapshot mechanisms. The restore plan should cover how to reinstall the system, how to restore configurations, and how to bring application data to a consistent state.

In environments with multiple servers or containers, backups must be coordinated. Different components may rely on each other, for example a web application and its database. To restore correctly, backups should represent consistent points in time across these components. This can require careful scheduling or application support for taking consistent snapshots.

In all cases, design your strategy by working backwards from the question: “If this system failed completely, what exact steps would I follow to restore it, and what data would I need at each step?” Your backup choices should then support that restore path.

Common Pitfalls and How to Avoid Them

There are several mistakes that administrators repeatedly make with backups, regardless of experience level. Knowing about them ahead of time helps you avoid them.

One frequent problem is relying on a single backup location. If your only backup is on a disk mounted inside the same server chassis, and the server is stolen or physically damaged, the backup disappears with it. This is precisely what the 3-2-1 rule is designed to prevent.

Another recurring issue is forgetting to include new data locations. As systems evolve, applications may start storing data in new directories or on additional disks. If the backup configuration is not updated, those new items are not protected. Periodic review of what is actually being backed up is necessary.

Insufficient retention is another subtle pitfall. If you only keep a few days of backups and an unnoticed corruption creeps in, you may discover the problem only after all healthy backups have been rotated out and deleted. Balancing retention against storage cost is delicate, but for important systems, conservative retention policies are safer.

Finally, overcomplicating the backup strategy can be its own failure mode. If restores require many manual steps, if the sequence is hard to remember, or if only one person knows how everything works, the system is fragile. Whenever you design a backup plan, consider how it will be used during a stressful incident. Simple, documented, and practiced procedures are usually better than extremely optimized but obscure workflows.

Summary

Backup and restore are central responsibilities in Linux system administration. This chapter has concentrated on the general principles that make backups reliable and restores possible, without focusing on specific tools. You learned about full, incremental, and differential backups, the 3-2-1 rule, decisions about what to back up, and how frequency, storage, and encryption shape a backup strategy.

You also saw why testing restores is essential, how recovery objectives guide design choices, and how backup strategies must adapt to different environments. The following sections in this part of the course will introduce concrete tools such as rsync, tar, and snapshot systems, and will show how to apply these backup principles in practice on Linux systems.
