3.6.4 Using snapshot systems

Introduction

Snapshot systems capture the exact state of data at a specific point in time. Unlike a traditional file copy, a snapshot is typically fast to create, uses little extra space at first, and can be used later to restore files or entire filesystems to a previous state. In Linux, snapshots are usually provided by the underlying storage technology, such as LVM, Btrfs, or ZFS. This chapter focuses on how snapshots behave, how they relate to backups, and how to work with them in practice, without going deep into the configuration of each filesystem.

What a Snapshot Really Is

A snapshot is not a full copy of your data at the moment it is created. Instead, it is a logical view of the data as it existed at that time. Internally, snapshot systems use copy on write mechanisms. When a snapshot is created, the system remembers the current layout of data blocks. When you later change a file, only the changed blocks are written to new locations. The snapshot still references the original blocks, so it represents the old version. Because of this, snapshot creation is usually very fast and initially consumes little extra space.

A snapshot is a point-in-time, space-efficient, logical view of existing data, not an independent full copy. It protects against recent mistakes, but not against complete disk failure if it lives on the same storage.

This definition is important for understanding both the advantages and the limits of snapshot systems in a backup strategy.

Copy on Write and Space Usage

The key idea behind most snapshot systems is copy on write, often abbreviated as COW. Once you create a snapshot, the original filesystem and the snapshot share the same blocks. From that moment on, a modification in the live filesystem must not destroy the old contents. Btrfs and ZFS write the modified data into new blocks and leave the old blocks to the snapshot, while a classic LVM snapshot first copies the old block into its own reserved area and then lets the original be overwritten. Either way, unchanged data is shared between snapshots and the live filesystem, and only changes consume extra space.
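The block sharing described above can be illustrated with a small toy model. This is only a sketch: the class `ToyVolume` and its methods are hypothetical names invented for this example, and a Python dictionary stands in for the block map that a real storage layer maintains.

```python
class ToyVolume:
    """Toy model of block sharing between a live volume and its snapshots."""

    def __init__(self, blocks):
        # Live view: block index -> data. Python strings are immutable,
        # so sharing a reference stands in for sharing a disk block.
        self.live = dict(enumerate(blocks))
        self.snapshots = {}

    def snapshot(self, name):
        # Only the block map is copied, never the data it points to.
        self.snapshots[name] = dict(self.live)

    def write(self, index, data):
        # The live view points at the new data; snapshots keep the old block.
        self.live[index] = data

    def changed_blocks(self, name):
        # Number of blocks whose live content differs from the snapshot.
        snap = self.snapshots[name]
        return sum(1 for i, old in snap.items() if self.live.get(i) != old)
```

After `vol = ToyVolume(["a", "b", "c", "d"])`, `vol.snapshot("s1")`, and `vol.write(1, "B")`, the snapshot still holds `"b"` at index 1, the live view holds `"B"`, and `vol.changed_blocks("s1")` is 1. Only that one block costs extra space.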

You can express the approximate space used by snapshots as:

$$
\text{Snapshot space} \approx \text{Sum of all changed blocks after snapshot creation}
$$

The more changes occur after you create snapshots, the more space is consumed. If you have many snapshots and a high write rate, you must monitor free space carefully. If space runs out, the result depends on the technology and configuration: a classic LVM snapshot whose reserved area fills up becomes invalid, on Btrfs and ZFS new writes to the filesystem start to fail, and some management tools remove old snapshots automatically.

Snapshots vs Traditional Backups

Snapshot systems are often described as backups, but they are not a replacement for independent copies stored elsewhere. Snapshots depend on the same underlying storage. If the disk fails, both live data and snapshots are lost. They are also usually not encrypted separately from the main storage and they are not resistant to physical damage, theft, or catastrophic failures affecting the whole system.

Traditional backups, such as copies to an external disk or a remote server, still have a central role. However, snapshots are very useful as a first line of recovery. They allow you to roll back quickly after a mistake, such as accidental deletion or a system update that breaks configuration.

Never rely on snapshots as your only backup strategy. You always need at least one independent copy stored on separate hardware or in a separate location.

In practice, a solid approach combines frequent local snapshots for fast rollback with periodic off site or external backups for disaster recovery.

Local Volume Snapshots (LVM Focus)

Logical Volume Manager (LVM) provides snapshot capabilities at the block level. The snapshot is created for a logical volume, not for individual directories or files. The snapshot captures the volume as it was at the moment of creation, so the filesystem on it is internally consistent, although data that applications had not yet written to disk is not included. You can then mount this snapshot separately, inspect data, or use it as a source for backup tools.

A typical workflow for LVM snapshots is as follows. You have a logical volume that contains a filesystem. Before running a backup, you create a snapshot of that volume. Then you mount the snapshot in a temporary location, run your backup tool against this snapshot, and finally remove the snapshot. This approach avoids inconsistencies that could result from files changing while the backup is running. Since the snapshot is read only from the perspective of the backup process, the backup is taken from a stable point in time.
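The workflow above could be sketched as an ordered list of commands. This is a minimal illustration, not a finished backup script: the function name `lvm_backup_plan`, the `_snap` suffix, and all example names and paths are assumptions made for this sketch. The function only builds the commands and does not run anything.

```python
def lvm_backup_plan(vg, lv, snap_size, mount_point, backup_target):
    """Return the commands for a snapshot-based backup, in execution order.

    Nothing is executed here; the caller decides how to run the plan.
    """
    snap = f"{lv}_snap"  # hypothetical naming convention for the snapshot
    return [
        # 1. Create a snapshot of the logical volume with a fixed size.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, f"{vg}/{lv}"],
        # 2. Mount the snapshot read only in a temporary location.
        ["mount", "-o", "ro", f"/dev/{vg}/{snap}", mount_point],
        # 3. Run the backup tool against the mounted snapshot.
        ["rsync", "-a", f"{mount_point}/", backup_target],
        # 4. Unmount and remove the snapshot again.
        ["umount", mount_point],
        ["lvremove", "--yes", f"{vg}/{snap}"],
    ]
```

A call such as `lvm_backup_plan("vg0", "data", "5G", "/mnt/snap", "backup:/srv/data/")` returns five commands. A real script would also need to unmount and remove the snapshot when the backup step fails, and an XFS filesystem typically needs the additional `nouuid` mount option when its snapshot is mounted next to the original.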

You must choose a size for an LVM snapshot. This size limits how many changes you can accommodate after the snapshot creation. If too many blocks are changed, the snapshot space will fill, and the snapshot may become invalid. Because of this, it is important to keep LVM snapshots short lived and sized appropriately for your expected write activity during their lifetime.
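A rough way to reason about the size is to multiply the expected write rate by the planned lifetime of the snapshot and add a safety margin. The helper below is a hypothetical sketch under a pessimistic assumption, namely that every write hits a block that has not been changed since the snapshot was taken. The default `safety_factor` is an arbitrary example value, not a recommendation.

```python
def estimate_snapshot_size_mib(write_rate_mib_per_s, lifetime_s, safety_factor=2.0):
    """Pessimistic size estimate for a short-lived snapshot, in MiB.

    Assumes every written MiB lands on a block that was still unchanged,
    so each write consumes new snapshot space.
    """
    return write_rate_mib_per_s * lifetime_s * safety_factor
```

For example, an average write rate of 2 MiB/s during a 30 minute backup (1800 seconds) gives `estimate_snapshot_size_mib(2, 1800)`, which is 7200 MiB, or about 7 GiB.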

Filesystem Level Snapshots (Btrfs and ZFS Concepts)

Some filesystems include snapshot functionality directly. Btrfs and ZFS are the most common examples on Linux. Instead of working with logical volumes, these filesystems allow you to snapshot a subvolume or dataset. The snapshot is typically created instantly and can be mounted or browsed just like a normal directory tree.

With filesystem level snapshots, you can define snapshot policies more precisely. For example, you might have separate subvolumes for your root filesystem, home directories, and application data. You can then create snapshots on different schedules for each subvolume. A Btrfs snapshot of the root filesystem before a system update allows you to roll back if the update fails, while snapshots of the home subvolume can provide quick recovery for user files.
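As a sketch of the pre-update case, the helper below builds a `btrfs subvolume snapshot` command with a timestamped name. The function name, the label, and the example paths are assumptions for illustration. The `-r` flag creates a read-only snapshot, and the command is only constructed, not executed.

```python
from datetime import datetime


def btrfs_snapshot_command(subvolume, snapshot_dir, label, now=None):
    """Build the command for a read-only, timestamped Btrfs snapshot."""
    now = now or datetime.now()
    # Hypothetical naming scheme: <label>-YYYYMMDD-HHMMSS
    name = f"{label}-{now:%Y%m%d-%H%M%S}"
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, f"{snapshot_dir}/{name}"]
```

Calling `btrfs_snapshot_command("/", "/.snapshots", "pre-update")` before a system update yields a command whose destination name records when the snapshot was taken, which makes it easier to pick the right one later.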

ZFS uses a similar model with datasets and snapshots. Both filesystems support snapshot naming conventions and time based snapshots, such as hourly, daily, and weekly. Tools exist that automate the creation and rotation of these snapshots, which makes snapshot management more practical in daily administration.

Snapshot Scheduling and Retention

Frequent snapshots provide finer recovery points, but also increase space usage and management overhead. A typical approach is to use a retention strategy that keeps many recent snapshots and fewer older ones. For example, you can keep hourly snapshots for one day, daily snapshots for one week, and weekly snapshots for a longer period.
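One possible way to express such a retention strategy is shown below. It is a simplified sketch, not the algorithm of any particular tool: for each period it keeps the newest snapshot per hour, day, or ISO week until the configured number of periods is reached. The function name `select_to_keep` and its default limits are assumptions.

```python
from datetime import datetime, timedelta


def select_to_keep(timestamps, hourly=24, daily=7, weekly=4):
    """Return the set of snapshot timestamps a tiered policy would keep."""
    rules = [
        (hourly, lambda t: (t.year, t.month, t.day, t.hour)),
        (daily, lambda t: (t.year, t.month, t.day)),
        (weekly, lambda t: tuple(t.isocalendar())[:2]),  # (ISO year, ISO week)
    ]
    newest_first = sorted(timestamps, reverse=True)
    keep = set()
    for limit, bucket_of in rules:
        seen = []
        for ts in newest_first:
            bucket = bucket_of(ts)
            if bucket in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) == limit:
                break  # enough periods covered for this rule
            seen.append(bucket)
            keep.add(ts)
    return keep


# Example: hourly snapshots over three days, with a very small policy.
base = datetime(2024, 1, 10, 12, 0)
snapshots = [base - timedelta(hours=h) for h in range(72)]
kept = select_to_keep(snapshots, hourly=2, daily=2, weekly=1)
```

In this example `kept` contains three timestamps: the two most recent hourly snapshots and the last snapshot of the previous day. Everything not in the returned set is a candidate for removal.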

Since snapshots often exist alongside other backup methods, you should align their schedule with your overall backup plan. Local snapshots can be more frequent, since they are lightweight, while full external backups can be less frequent. When designing a retention policy, remember how copy on write affects space usage. Old snapshots can keep old blocks alive, so removing snapshots can actually free a significant amount of space.

Automated snapshot tools for Btrfs and ZFS often support policies where the number of snapshots retained per period is specified in configuration. For LVM, you might use external scheduling tools to create and remove snapshots at defined intervals, such as around your backup jobs.

Using Snapshots for Quick Recovery

One of the most valuable uses of snapshot systems is quick recovery of recently changed or deleted files. Instead of restoring everything from a traditional backup, you can examine an old snapshot, copy only the needed files back, and leave the rest of the system untouched.

The general process to restore from a snapshot is similar regardless of the technology. First, you identify the appropriate snapshot by its timestamp or name. Then you mount or otherwise access that snapshot as a separate location. You browse within it to find the file or directory you want, and you copy it back into the live filesystem. This does not usually require unmounting or stopping services, which means lower downtime.
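Once the snapshot is accessible as a directory, the last step of this process is an ordinary file copy. The following sketch assumes the snapshot is already mounted or browsable at `snapshot_root`. The function name and parameters are hypothetical.

```python
import shutil
from pathlib import Path


def restore_from_snapshot(snapshot_root, live_root, relative_path):
    """Copy one file from a mounted snapshot back into the live filesystem."""
    source = Path(snapshot_root) / relative_path
    target = Path(live_root) / relative_path
    if not source.is_file():
        raise FileNotFoundError(f"{relative_path} is not a file in the snapshot")
    # Recreate missing parent directories, e.g. after a deleted directory.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also tries to preserve timestamps and permission bits.
    shutil.copy2(source, target)
    return target
```

Note that this overwrites the current version of the file. If the live copy might still be needed, copying the snapshot version to a different name first is the safer choice.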

Snapshot systems are also helpful for testing configuration changes or software updates. Before making a change, you create a snapshot. If something goes wrong, you either roll back the entire filesystem to that snapshot or selectively restore configuration files. Filesystem level snapshots can sometimes be promoted or rolled back at the dataset or subvolume level, which makes this process particularly straightforward.

Integration with Backup Workflows

Snapshots fit well into structured backup workflows. One common pattern is to use snapshots as the stable source of data for a backup program that writes to remote storage. This way, backup tools never see half written or changing files, and you do not need to stop services during the backup.

A typical workflow might follow these steps. At scheduled times, a snapshot is created of the relevant volume or subvolume. The snapshot is mounted read only in a special directory. Then a backup tool such as rsync or another backup system copies data from this mounted snapshot to a remote backup target. When the backup completes, the snapshot is removed. The snapshot exists only for the duration of the backup, which minimizes its space impact.

Use snapshots as consistent, read only sources for backups, but always ensure that the final backup target resides on separate storage. The snapshot itself does not provide resilience against hardware failure.

Some advanced setups also use remote snapshot replication. With ZFS, for example, snapshots can be sent to another system that stores them as its own snapshots. This combines snapshot convenience with off host resilience, but the details belong to more advanced topics.
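To give an idea of what such replication looks like, the sketch below assembles an incremental `zfs send` piped into `zfs receive` over SSH. It only builds the command line as a string. The function name and the dataset, snapshot, and host names are placeholders, and a real setup needs an initial full send and suitable permissions on both sides.

```python
import shlex


def zfs_replication_pipeline(dataset, previous, current, remote_host, remote_dataset):
    """Build a shell pipeline that sends the changes between two snapshots."""
    # -i sends only the difference between the previous and current snapshot.
    send = ["zfs", "send", "-i", f"{dataset}@{previous}", f"{dataset}@{current}"]
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return shlex.join(send) + " | " + shlex.join(receive)
```

For example, `zfs_replication_pipeline("tank/data", "monday", "tuesday", "backuphost", "backup/data")` returns `zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data`.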

Limitations and Risks of Snapshot Systems

Despite their advantages, snapshot systems have several limitations that you must keep in mind. The most important one is the shared fate of snapshots and their underlying storage. If the disk or storage pool is lost, all snapshots on that storage are lost as well. No snapshot system on the same device can protect against total device failure.

Another limitation is performance impact when there are many snapshots or heavy write workloads. Because copy on write adds extra work for each changed block, write performance can degrade if snapshots are kept for too long or are too numerous. Also, extensive fragmentation can develop over time, which affects both read and write performance.

Space usage can also be difficult to predict. A system with low write activity might support many snapshots with little overhead, while a busy database server might quickly fill the snapshot space. You must monitor free space and have policies for automatically pruning old snapshots before the system runs out of capacity.
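A very simple form of such monitoring is a free space check that triggers pruning below a threshold. The sketch below is an assumption-laden illustration: the function names and the 20 percent default are made up for this example, and `shutil.disk_usage` reports filesystem-level numbers only. It does not show how full an LVM snapshot area is, which LVM's own tools such as `lvs` report.

```python
import shutil


def below_threshold(total_bytes, free_bytes, min_free_fraction=0.2):
    """Return True when the free fraction is below the configured minimum."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def needs_pruning(path, min_free_fraction=0.2):
    """Check the filesystem that contains `path` against the threshold."""
    usage = shutil.disk_usage(path)
    return below_threshold(usage.total, usage.free, min_free_fraction)
```

A scheduled job could call `needs_pruning("/")` and, when it returns `True`, remove the oldest snapshots according to the retention policy until enough space is free again.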

Security is another aspect to consider. Snapshots preserve old versions of data, including files you might think you have deleted. This can be useful for recovery, but it also means sensitive information can remain present in older snapshots. When you need to permanently remove such data, you may need to delete or expire related snapshots and, depending on the filesystem, perform additional steps.

When to Use Snapshot Systems

Snapshot systems are most appropriate in scenarios where you need fast, frequent checkpoints of data with minimal disruption. This is typical for servers with important configuration, application data, or virtual machine images. Desktop systems can also benefit, particularly for system upgrade safety and easy rollback of personal files.

You should consider snapshot usage in combination with the type of workload and the filesystem technology in use. Filesystem level snapshots are especially helpful for systems that already use Btrfs or ZFS as their main filesystem. LVM snapshots are suitable when your system uses LVM and you need consistent backups of logical volumes. In each case, the design of your snapshot schedule and retention must support your recovery goals without exhausting resources.

Ultimately, snapshot systems are one piece of a broader backup and restore strategy. They offer rapid, fine grained recovery points and help protect against recent mistakes and misconfigurations. Combined with traditional backups to independent storage, they significantly improve the resilience and manageability of Linux systems.
