Understanding Snapshots and Rollbacks
In the context of filesystems and storage, a snapshot is a point‑in‑time view of data that can be used to inspect, back up, or restore the state of a filesystem or volume. A rollback is the act of reverting live data back to a state captured in a snapshot.
This chapter focuses on how snapshots and rollbacks are implemented and used on Linux, particularly with common technologies:
- LVM (Logical Volume Manager)
- Btrfs
- ZFS
You’ll see conceptual differences, typical workflows, and how these tools behave in real systems.
Snapshot Design Concepts
Although implementations differ, most snapshot systems revolve around two core ideas:
Copy-on-Write (CoW) vs Full Copies
Two major strategies exist:
- Copy-on-Write (CoW) snapshots
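- Initial snapshot creation is nearly instantaneous and consumes almost no space.
- The snapshot initially shares blocks with the source.
- When a block changes, the snapshot’s version of it is preserved. Legacy LVM snapshots copy the old data to snapshot storage before the source block is overwritten; Btrfs, ZFS, and LVM thin snapshots write the new data to a fresh location and leave the old block in place for the snapshot (often called redirect-on-write).
- The redirect-on-write variant stays efficient with many snapshots and frequent changes.
Both variants are commonly described as CoW snapshots.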
- Full copy (clone) snapshots
- Snapshot is a complete copy of data at snapshot time.
- Takes time and space proportional to data size.
- Simple to reason about, but less efficient for frequent snapshots.
- Often used when migrating or backing up data offline.
Sometimes tools call full copies “clones” or “replicas” to distinguish them from CoW snapshots.
Crash-Consistent vs Application-Consistent
Snapshots capture blocks, not necessarily a fully flushed application state.
- Crash-consistent: Equivalent to pulling the power plug. Filesystem will replay its journal or log on mount, but in-flight application writes may be partially completed.
- Application-consistent: Databases or services are told to flush buffers, pause writes, or enter a quiesced state before the snapshot, then resume afterward.
For databases or VMs, use hooks (e.g., pre/post snapshot scripts, systemd units) to quiesce applications if you need application-consistent snapshots.
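The quiesce, snapshot, resume sequence can be sketched as a small wrapper. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name snapshot_with_hooks and its three arguments are hypothetical, and the command strings passed in stand for whatever your application and snapshot tool actually provide.

```shell
# Hypothetical helper: run a pre-hook, take the snapshot, and always run the
# post-hook, even when the snapshot command fails.
# Assumption: the three arguments are trusted command strings.
snapshot_with_hooks() (
  pre_cmd=$1
  snap_cmd=$2
  post_cmd=$3

  # If quiescing fails, do not take a snapshot at all.
  sh -c "$pre_cmd" || { echo "pre-hook failed, no snapshot taken" >&2; exit 1; }

  # Take the snapshot while the application is quiesced; keep its exit status.
  sh -c "$snap_cmd"
  status=$?

  # Resume the application regardless of the snapshot result.
  sh -c "$post_cmd" || echo "warning: post-hook failed" >&2

  exit "$status"
)
```

A possible call, with a hypothetical service name, would be snapshot_with_hooks 'systemctl stop myapp.service' 'zfs snapshot pool/data@nightly' 'systemctl start myapp.service'. Stopping the service is the bluntest form of quiescing; a database-specific flush or lock command keeps downtime shorter.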
LVM Snapshots
LVM snapshots operate on logical volumes rather than individual filesystems. You snapshot a logical volume, not the filesystem inside it (although the two are tightly related).
There are two relevant modes:
- Legacy LVM snapshots (not thin provisioning): CoW in a dedicated snapshot LV.
- Thin-provisioned LVM snapshots: Modern, scalable snapshots with a thin pool.
Legacy LVM Snapshots (thick volumes)
Conceptually:
- You have an origin LV, e.g. vg0/root.
- You create a snapshot LV, e.g. vg0/root_snap, with its own capacity.
- As blocks of the origin change, old versions are copied to the snapshot LV.
Key characteristics:
- The snapshot LV must be large enough to hold all original blocks that change after snapshot creation.
- When the snapshot fills, it becomes invalid and is dropped.
- Multiple snapshots significantly increase write I/O due to CoW overhead.
Typical creation flow:
- Ensure filesystem on the origin LV is in a consistent state:
- For ext4 or xfs, flush with sync and possibly remount read-only if you want a very “clean” snapshot.
- For live systems, this is often skipped in favor of crash-consistent snapshots.
- Create a snapshot:
- Example (conceptual only): lvcreate -s -L 5G -n root_snap /dev/vg0/root
- Use the snapshot LV for:
- Backup (mount it read-only and archive).
- Testing changes.
- Remove snapshot when no longer needed to free space.
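The creation flow above can be sketched as a function that prints the commands for review instead of running them. The function name, the snapshot naming scheme (origin name plus _snap), the mount point, and the archive path are all assumptions made for illustration.

```shell
# Print (do not run) the commands for a snapshot-based backup of one LV.
# Hypothetical arguments: volume group, origin LV, snapshot size,
# mount point for the snapshot, and archive path.
lvm_snapshot_backup_plan() (
  vg=$1; lv=$2; size=$3; mnt=$4; archive=$5
  snap="${lv}_snap"   # assumed naming scheme
  cat <<EOF
lvcreate -s -L $size -n $snap /dev/$vg/$lv
mkdir -p $mnt
mount -o ro /dev/$vg/$snap $mnt
tar -C $mnt -czf $archive .
umount $mnt
lvremove -y /dev/$vg/$snap
EOF
)
```

For example, lvm_snapshot_backup_plan vg0 root 5G /mnt/root_snap /backup/root.tar.gz prints six commands that can be inspected before being run as root. If the origin holds an XFS filesystem, the mount step usually needs -o ro,nouuid, because the snapshot carries the same filesystem UUID as the mounted origin.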
Rollback strategy with legacy snapshots
LVM’s native “rollback” is not as direct as on Btrfs/ZFS:
- Normally you:
- Boot from rescue media or a different root.
- Optionally make sure the origin is not mounted.
- Restore the origin by merging snapshot content back into it (or by overwriting the origin from backup/clone).
Using the LVM merge feature (where supported):
- Concept: “rollback” by merging snapshot LV into origin LV.
- Typical flow:
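- Unmount origin. If it cannot be unmounted (e.g., it is the root filesystem), the merge is deferred until the origin is next activated, typically at the next boot.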
- Request merge of snapshot into origin.
- Reboot or remount the restored volume.
This is more disruptive than filesystems designed around snapshots.
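A merge-based rollback might look like the following sketch. The function name and argument order are hypothetical; the function only prints the commands so they can be checked first.

```shell
# Print (do not run) the commands for rolling an origin LV back to a snapshot.
# Hypothetical arguments: volume group, origin LV, snapshot LV.
lvm_merge_plan() (
  vg=$1; lv=$2; snap=$3
  cat <<EOF
umount /dev/$vg/$lv
lvconvert --merge /dev/$vg/$snap
lvs -a $vg
EOF
)
```

Calling lvm_merge_plan vg0 root root_snap prints the three steps. Once the merge completes, the snapshot LV no longer exists, so take a fresh snapshot afterwards if you want to keep a rollback point.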
Thin-Provisioned LVM Snapshots
Thin provisioning adds a thin pool that stores data blocks for many thin volumes and snapshots.
Properties:
- Snapshots are cheap to create (metadata-only changes).
- Thin pool space is shared across all thin LVs and their snapshots.
- More scalable than legacy snapshots.
Typical structure:
- Thin pool LV: vg0/thinpool
- Thin volumes: vg0/root, vg0/home
- Snapshots: vg0/root_snap1, vg0/root_snap2, …
Workflow:
- Create a thin pool and thin LV (covered elsewhere in the storage chapter).
- Create snapshots of thin LVs inside the pool.
- Rollback:
- Use lvconvert --merge (thin-snapshot aware) or create a new LV from a snapshot, then switch to it.
You must monitor thin pool usage; if the pool fills, all thin volumes become read-only or fail, depending on configuration.
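Pool monitoring can be sketched as a threshold check on the data_percent field reported by lvs. The function names and the default limit of 80 percent are assumptions for illustration, not recommended values.

```shell
# Succeed (exit status 0) when a usage percentage exceeds the limit.
# awk is used because lvs reports fractional values such as "82.45".
pool_over_threshold() {
  awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 > limit + 0) }'
}

# Hypothetical wrapper: read data_percent for a thin pool and warn when high.
# Arguments: pool as vg/lv, optional limit in percent (assumed default: 80).
check_thin_pool() (
  pool=$1; limit=${2:-80}
  used=$(LC_ALL=C lvs --noheadings -o data_percent "$pool" | tr -d ' ')
  if pool_over_threshold "$used" "$limit"; then
    echo "WARNING: $pool data usage ${used}% exceeds ${limit}%"
  fi
)
```

Run from cron or a systemd timer, check_thin_pool vg0/thinpool 80 would print a warning once the pool passes the limit. Metadata usage (the metadata_percent field) deserves the same treatment. A thin snapshot itself is created without a size, e.g. lvcreate -s -n root_snap1 vg0/root, and by default needs lvchange -ay -K to activate because thin snapshots are created with the activation-skip flag set.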
Btrfs Snapshots and Rollbacks
Btrfs implements snapshots at the filesystem level on subvolumes. A Btrfs subvolume is a separately mountable, CoW-managed tree inside a Btrfs filesystem.
Key properties:
- Snapshots are instantaneous and CoW-based.
- Snapshots can be read-only or read-write.
- Snapshots are space-efficient, growing only as data diverges.
- Rollbacks can be done by changing which subvolume is mounted as root or by replacing data with a snapshot.
Subvolumes vs Filesystems
Btrfs typically uses one block device (e.g. /dev/sda2) with many subvolumes inside:
- Example subvolumes: @ (root filesystem), @home, @snapshots
The same block device can be mounted at several mount points, and the subvol/subvolid mount options choose which subvolume to expose at each one.
Snapshots are always of subvolumes, not of individual files.
Creating and Managing Snapshots
Typical operations (conceptual):
- Create a read-only snapshot:
- Use btrfs subvolume snapshot -r source target.
- Create a read-write snapshot:
- Use btrfs subvolume snapshot source target.
- List subvolumes/snapshots:
- Use btrfs subvolume list path, where path is any location inside the mounted filesystem.
Read-only snapshots are ideal for backup and rollback operations, because they cannot be accidentally modified.
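A timestamped read-only snapshot could be built as in the following sketch. The function name, the label argument, and the label-timestamp naming scheme are assumptions; the function prints the command rather than running it.

```shell
# Print (do not run) the command for a timestamped read-only snapshot.
# Hypothetical arguments: source subvolume path, destination directory,
# a label for the snapshot name, and an optional timestamp (defaults to now).
btrfs_snapshot_cmd() (
  source=$1; dest_dir=$2; label=$3
  stamp=${4:-$(date +%Y-%m-%d_%H%M%S)}
  echo "btrfs subvolume snapshot -r $source $dest_dir/${label}-${stamp}"
)
```

For example, btrfs_snapshot_cmd / /.snapshots root 2025-01-01 prints btrfs subvolume snapshot -r / /.snapshots/root-2025-01-01. Sortable timestamps in the name make it easy to find the nearest snapshot before a given change.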
Rollbacks with Btrfs
You don’t “rewind in place” as with some LVM merges; instead, you pick which subvolume the system should use as the active root.
Common distro strategy:
- Your root filesystem is a subvolume @.
- Snapshots of root live in @/.snapshots/<id>/snapshot or under a separate @snapshots subvolume.
- Bootloader (e.g., GRUB with a Btrfs-aware plugin) can boot directly into a selected snapshot.
Manual rollback workflow (conceptually):
- Boot into a rescue system or another Linux environment that can access the Btrfs volume.
- Mount the Btrfs device at a generic mountpoint, e.g. /mnt, and locate your subvolumes and snapshots.
- Decide which snapshot to roll back to.
- Either:
- Rename current @ to something like @.broken and rename the snapshot to @ (for a read-only snapshot, create a writable snapshot of it named @ instead), or
- Adjust /etc/fstab and bootloader entries to use the snapshot subvolume.
- Reboot into the rolled-back system.
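The rename variant of this workflow can be sketched as follows. The function name and arguments are hypothetical, and the layout (a root subvolume named @ at the top level of the filesystem) is the one assumed in the text above. The function prints the commands so they can be reviewed from the rescue environment.

```shell
# Print (do not run) the commands for a manual Btrfs rollback by renaming.
# Hypothetical arguments: block device, snapshot path relative to the top-level
# subvolume, and an optional mount point (assumed default: /mnt).
btrfs_rollback_plan() (
  dev=$1; snap=$2; mnt=${3:-/mnt}
  cat <<EOF
mount -o subvolid=5 $dev $mnt
mv $mnt/@ $mnt/@.broken
btrfs subvolume snapshot $mnt/$snap $mnt/@
umount $mnt
EOF
)
```

Mounting with subvolid=5 exposes the top-level subvolume, where @ and its siblings are visible as directories. The third command creates a writable snapshot of the chosen snapshot, so a read-only snapshot stays untouched as a rollback point. This only takes effect if /etc/fstab and the bootloader refer to the root subvolume by name (subvol=@) rather than by ID.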
Advantages:
- Rollback can be nearly instantaneous.
- Multiple snapshots can be kept for incremental rollback options.
- Very common in snapshot‑based update systems (openSUSE, some Debian derivatives).
Snapshots and System Updates
Btrfs snapshots pair well with transactional or snapshot-aware update tools:
- Before applying a large update:
- Take a snapshot of the root subvolume.
- Apply updates to a new snapshot or to the current root.
- If the system fails to boot or misbehaves:
- Boot into the pre-update snapshot and roll back, or
- Switch the default boot entry to a known-good snapshot.
This pattern dramatically reduces the risk of system updates.
ZFS Snapshots and Rollbacks
ZFS integrates volume management and filesystem functionality. Snapshots operate at the dataset level.
Key properties:
- Snapshots are instant, CoW-based, and immutable.
- Datasets are mounted filesystems; each can have its own snapshots.
- ZFS has powerful send/receive features to replicate snapshots.
ZFS Datasets and Snapshots
A dataset is a ZFS filesystem or volume:
- Example datasets: pool/root, pool/home, pool/var/log
Snapshots are named with an @ suffix:
- pool/root@before_upgrade
- pool/home@2025-01-01
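Snapshots do not appear as normal directories. Their contents are reachable read-only through the hidden .zfs/snapshot directory at the dataset’s mountpoint (its visibility is controlled by the snapdir property); otherwise they are managed via ZFS commands.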
Typical operations:
- Create snapshot: zfs snapshot pool/root@before_upgrade
- List snapshots: zfs list -t snapshot
- Destroy snapshot: zfs destroy pool/root@before_upgrade
Rollbacks with ZFS
ZFS provides a dedicated rollback mechanism:
- zfs rollback dataset@snapshot replaces the dataset’s current contents with that snapshot.
- Any changes made after the snapshot are discarded.
Workflow:
- Identify the snapshot to roll back to.
- Unmount or stop services using the dataset if needed (some operations require this).
- Run rollback command.
- Remount or restart services.
Caveats:
- You cannot roll back if the dataset has snapshots newer than the target snapshot, unless you destroy those newer snapshots first or pass -r to zfs rollback, which destroys them for you.
- Rolling back is a destructive operation for changes made after the snapshot.
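Because rollback is destructive, it can help to build the command explicitly and make the "discard newer snapshots" choice visible. The following is a sketch with a hypothetical function name and a hypothetical yes/no argument; it prints the command instead of running it.

```shell
# Print (do not run) a zfs rollback command.
# Hypothetical arguments: target as dataset@snapshot, and "yes" to also
# destroy snapshots newer than the target (maps to the -r flag).
zfs_rollback_cmd() (
  target=$1; discard_newer=${2:-no}
  case $target in
    *@*) ;;
    *) echo "expected dataset@snapshot, got: $target" >&2; exit 1 ;;
  esac
  if [ "$discard_newer" = "yes" ]; then
    echo "zfs rollback -r $target"
  else
    echo "zfs rollback $target"
  fi
)
```

With the default, zfs_rollback_cmd pool/root@before_upgrade prints the plain command, which ZFS refuses to run when newer snapshots exist. Passing yes as the second argument adds -r and should be a deliberate decision.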
Clone vs Snapshot
ZFS supports both:
- Snapshot: read-only point-in-time reference.
- Clone: a writable dataset created from a snapshot.
Workflow:
- Take a snapshot of a dataset.
- Create a clone from that snapshot.
- Use the clone to test changes, upgrades, or experiments.
- Either promote the clone to be the main dataset, or discard it.
This is useful for safe testing environments or quickly spawning development sandboxes.
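The clone workflow can be sketched as a function that prints the commands for either outcome. The function name, the argument order, and the promote/discard keyword are assumptions made for illustration.

```shell
# Print (do not run) the commands for a snapshot-and-clone experiment.
# Hypothetical arguments: dataset, snapshot name, clone dataset name,
# and the outcome: "promote" or "discard" (assumed default: discard).
zfs_clone_plan() (
  dataset=$1; snap=$2; clone=$3; outcome=${4:-discard}
  echo "zfs snapshot ${dataset}@${snap}"
  echo "zfs clone ${dataset}@${snap} ${clone}"
  case $outcome in
    promote) echo "zfs promote ${clone}" ;;
    discard) echo "zfs destroy ${clone}" ;;
    *) echo "unknown outcome: $outcome" >&2; exit 1 ;;
  esac
)
```

For example, zfs_clone_plan pool/root pre_test pool/root_test promote prints the snapshot, clone, and promote commands. Note that zfs promote only reverses the dependency between the clone and its origin; it does not rename anything, so taking over the original dataset's name is a separate zfs rename step.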
Snapshots for Replication
A major strength of ZFS is snapshot-based replication:
- zfs send streams data corresponding to a snapshot (or differences between snapshots).
- zfs receive applies that stream on another system or pool.
Conceptual flow:
- zfs snapshot pool/data@backup1
- zfs send pool/data@backup1 | ssh backuphost zfs receive backup/data
- Next time:
- zfs snapshot pool/data@backup2
- zfs send -i pool/data@backup1 pool/data@backup2 | ssh backuphost zfs receive backup/data
This creates incremental backups based on snapshots with minimal data transfer.
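The choice between a full and an incremental stream can be sketched as a single function. Its name and argument order are hypothetical; an empty previous-snapshot argument selects a full send. The function prints the pipeline instead of running it.

```shell
# Print (do not run) a full or incremental replication pipeline.
# Hypothetical arguments: dataset, new snapshot name, previous snapshot name
# (empty string for a full send), backup host, and target dataset.
zfs_send_cmd() (
  dataset=$1; new=$2; prev=$3; host=$4; target=$5
  if [ -n "$prev" ]; then
    send="zfs send -i ${dataset}@${prev} ${dataset}@${new}"
  else
    send="zfs send ${dataset}@${new}"
  fi
  echo "$send | ssh $host zfs receive $target"
)
```

Calling zfs_send_cmd pool/data backup2 backup1 backuphost backup/data reproduces the incremental command from the flow above. An incremental send only works if the previous snapshot still exists on both sides, so do not destroy it on the source until the next transfer has succeeded.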
Comparing Snapshot Approaches
Granularity and Scope
- LVM:
- Operates at the block device/logical volume level.
- Filesystem-agnostic.
- Good for whole-volume rollback or consistent backups across multiple filesystems (if coordinated).
- Btrfs:
- Operates at subvolume level.
- Filesystem-integrated; aware of file metadata.
- Very convenient for OS-level snapshot/rollback mechanisms.
- ZFS:
- Operates at dataset level.
- Deep integration with storage, quotas, compression, and replication.
Performance Considerations
- Snapshots introduce extra metadata and CoW overhead:
- Writes may slow as more snapshots accumulate.
- On heavily written systems, too many snapshots can degrade performance.
- Cleanup is crucial:
- Deleting unneeded snapshots frees space and can reduce CoW complexity.
- For high‑churn workloads (e.g., databases):
- Prefer fewer, well-timed snapshots.
- Use application-level backup tools in combination with lower-level snapshots.
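Snapshot cleanup is easier to keep up when it is automated. The following sketch selects which snapshots to delete under a simple "keep the newest N" policy; the function name and the policy itself are assumptions, and the function only prints names, leaving the actual deletion to a separate, reviewed step.

```shell
# Read snapshot names on stdin (one per line, oldest first) and print the
# names to delete so that only the newest N remain.
# Hypothetical argument: N, the number of snapshots to keep.
snapshots_to_prune() (
  keep=$1
  awk -v keep="$keep" '
    { names[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print names[i] }
  '
)
```

The input can come from any tool that lists snapshots sorted by creation time, for example zfs list -H -t snapshot -o name -s creation. If fewer than N snapshots exist, nothing is printed.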
Space Management
- Snapshots share blocks with their origin until data diverges.
- Space used by snapshots grows with the size of changed data since snapshot creation.
- You must monitor:
- LVM thin pool usage.
- Btrfs filesystem free space.
- ZFS pool capacity.
If the underlying pool or filesystem runs out of space, all operations, including snapshots, can fail or destabilize the system.
Safe Rollback Practices
Rollbacks are powerful but disruptive; they rewrite history.
Guidelines:
- Always have backups separate from snapshots
- Snapshots protect against accidental modification or short-term issues.
- They do not protect against:
- Disk failure.
- Pool corruption.
- Catastrophic hardware loss.
- Use off-host or off-site backups (e.g. rsync, ZFS send, Btrfs send) in addition to snapshots.
- Plan for configuration and data separation
- Put system and user data on separate logical units (LVM volumes, Btrfs subvolumes, ZFS datasets).
- Roll back system snapshots without losing user data.
- Or use different snapshot policies for system vs data volumes.
- Use read-only snapshots for rollback
- Take read-only snapshots (Btrfs/ZFS) as “golden” points.
- For experiments, create writable snapshots/clones derived from read-only snapshots.
- Roll back from known-good read-only snapshots instead of mutable ones.
- Integrate with the bootloader and package manager
- Some distros integrate:
- snapshots with grub-btrfs or similar tools,
- transactional upgrades with automatic snapshots (e.g., openSUSE).
- This makes rollback from failed updates routine and safer.
- Test rollback procedures
- Practice in a virtual machine:
- Take snapshots, break the system, then roll back.
- Confirm:
- Services start correctly after rollback.
- Databases and applications behave as expected.
- Document the procedure for your environment.
Example Use Cases
Desktop OS Protection
- Before large updates or configuration changes:
- Take snapshots of the root filesystem (Btrfs or ZFS).
- If the system fails to boot or becomes unstable:
- Select an older snapshot from the boot menu.
- Validate it, then make it the new default.
Developer Sandboxes
- Base image stored as a read-only snapshot/dataset/subvolume.
- Each developer:
- Creates a writable snapshot (clone) to experiment.
- Discards and recreates snapshots when environment breaks.
- Very fast compared to rebuilding environments from scratch.
Quick “Oh No” Recovery
- Accidentally deleted or modified files:
- Find the nearest snapshot preceding the mistake.
- Copy files from snapshot back into live filesystem.
- No need for full rollback; fine-grained restoration from snapshots is possible (Btrfs/ZFS, or by mounting LVM snapshots read-only and copying data).
Backup Integration
- Use snapshots to freeze a consistent view of data:
- Create snapshot.
- Run backup from the snapshot (not from the live filesystem).
- Destroy snapshot when backup is complete.
- This avoids issues where files change during backup and can reduce downtime for backup windows.
When to Use Which Snapshot Tool
Given a choice:
- LVM snapshots:
- You already use LVM for partition management.
- Your filesystem doesn’t provide snapshots (e.g. ext4, xfs).
- You need volume-level snapshots, possibly across multiple filesystems.
- Btrfs snapshots:
- Your distribution supports Btrfs-based root filesystems.
- You want easy OS rollback; openSUSE, Fedora variants, or some Ubuntu flavors often use this.
- You want per-subvolume snapshots, especially for configuration vs data separation.
- ZFS snapshots:
- You are using ZFS pools for data or root.
- You need strong replication, checksumming, and advanced data integrity.
- You are comfortable with ZFS licensing implications on your platform.
Each technology can implement snapshots and rollbacks; pick the one that fits your filesystem choice and operational needs.