Snapshots and rollbacks

Understanding Snapshots and Rollbacks

In the context of filesystems and storage, a snapshot is a point‑in‑time view of data that can be used to inspect, back up, or restore the state of a filesystem or volume. A rollback is the act of reverting live data back to a state captured in a snapshot.

This chapter focuses on how snapshots and rollbacks are implemented and used on Linux, particularly with three common technologies: LVM, Btrfs, and ZFS.

You’ll see conceptual differences, typical workflows, and how these tools behave in real systems.

Snapshot Design Concepts

Although implementations differ, most snapshot systems revolve around two core ideas:

Copy-on-Write (CoW) vs Full Copies

Two major strategies exist:

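  1. Copy-on-Write (CoW) snapshots
    • Snapshot creation is nearly instantaneous and initially consumes almost no extra space.
    • The snapshot initially “shares” blocks with the source.
    • When the source changes a block, the old contents are preserved for the snapshot. Legacy LVM snapshots copy the old data to snapshot storage first and then modify the source in place; Btrfs, ZFS, and LVM thin snapshots write the new data to a fresh location and leave the old block for the snapshot.
    • Efficient for many snapshots and frequent changes.

This is how Btrfs, ZFS, and LVM snapshots work; the thin-provisioned LVM variant scales much better to many snapshots than the legacy one.

  2. Full copy (clone) snapshots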
    • Snapshot is a complete copy of data at snapshot time.
    • Takes time and space proportional to data size.
    • Simple to reason about, but less efficient for frequent snapshots.
    • Often used when migrating or backing up data offline.

Sometimes tools call full copies “clones” or “replicas” to distinguish them from CoW snapshots.
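
To make the space trade-off concrete, here is a rough back-of-the-envelope sketch. The function name snapshot_space and all numbers are illustrative assumptions; it also assumes, pessimistically, that a fixed percentage of the data changes between snapshots and that no two snapshots share changed blocks.

```shell
#!/bin/sh
# Rough space estimate for keeping COUNT snapshots of a volume.
# Assumptions (illustrative only): sizes in GiB, CHANGED_PCT percent of the
# data changes between snapshots, and snapshots never share changed blocks.
snapshot_space() {
    data_gib=$1 changed_pct=$2 count=$3
    full=$((data_gib * count))                      # every snapshot is a whole copy
    cow=$((data_gib * changed_pct * count / 100))   # only changed blocks are kept
    echo "full copies: ${full} GiB, CoW snapshots: ~${cow} GiB"
}

# 100 GiB volume, 5% churn between snapshots, 7 snapshots retained
snapshot_space 100 5 7
```

With these inputs the full-copy approach needs 700 GiB while the CoW estimate is about 35 GiB. Real usage depends on the actual write pattern.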

Crash-Consistent vs Application-Consistent

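Snapshots capture blocks, not necessarily a fully flushed application state. A crash-consistent snapshot looks like the disk after a sudden power loss: the filesystem can recover, but applications may have to replay or discard in-flight work. An application-consistent snapshot is taken after the application has flushed its state and paused writes.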

For databases or VMs, use hooks (e.g., pre/post snapshot scripts, systemd units) to quiesce applications if you need application-consistent snapshots.
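
A pre/post hook can be as simple as a wrapper that quiesces, snapshots, and always resumes. The following is a minimal sketch, not a standard tool: with_quiesce is a hypothetical helper name, and the service and volume names in the commented example are assumptions.

```shell
#!/bin/sh
# with_quiesce PRE SNAP POST
# Runs PRE (quiesce), then SNAP (take the snapshot), then always runs POST
# (resume), even if SNAP failed. Returns SNAP's exit status.
with_quiesce() {
    pre=$1 snap=$2 post=$3
    sh -c "$pre" || { echo "quiesce failed; no snapshot taken" >&2; return 1; }
    status=0
    sh -c "$snap" || status=$?
    sh -c "$post" || echo "warning: resume hook failed" >&2
    return "$status"
}

# Harmless demonstration with echo standing in for real commands:
with_quiesce 'echo "pause writes"' 'echo "take snapshot"' 'echo "resume writes"'

# A real call might look like this (service and LV names are hypothetical):
# with_quiesce 'systemctl stop myapp' \
#              'lvcreate -s -L 5G -n data_snap /dev/vg0/data' \
#              'systemctl start myapp'
```

Running the resume step unconditionally matters more than the snapshot itself: a failed snapshot is an inconvenience, while an application left paused is an outage.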


LVM Snapshots

LVM snapshots operate on logical volumes rather than individual filesystems. You snapshot a logical volume, not the filesystem inside it (although the two are tightly related).

There are two relevant modes:

Legacy LVM Snapshots (thick volumes)

Conceptually:

Key characteristics:

Typical creation flow:

  1. Ensure filesystem on the origin LV is in a consistent state:
    • For ext4 or xfs, flush with sync and possibly remount read-only if you want a very “clean” snapshot.
    • For live systems, this is often skipped in favor of crash-consistent snapshots.
  2. Create a snapshot:
    • Example (conceptual only): lvcreate -s -L 5G -n root_snap /dev/vg0/root
  3. Use the snapshot LV for:
    • Backup (mount it read-only and archive).
    • Testing changes.
  4. Remove snapshot when no longer needed to free space.
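
The flow above can be sketched as a small script. It runs in dry-run mode by default and only prints the commands. The volume group, LV, mount point, and archive path are hypothetical names chosen to match the example above; adjust them before setting DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them; the first failure aborts.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# Hypothetical names, for illustration only.
VG=vg0 LV=root SNAP=root_snap
MNT=/mnt/snap ARCHIVE=/backup/root.tar.gz

lvm_snapshot_backup() {
    run sync                                         # flush dirty pages
    run lvcreate -s -L 5G -n "$SNAP" "/dev/$VG/$LV"  # 5G area for changed blocks
    run mkdir -p "$MNT"
    run mount -o ro "/dev/$VG/$SNAP" "$MNT"          # XFS needs -o ro,nouuid
    run tar -C "$MNT" -czf "$ARCHIVE" .
    run umount "$MNT"
    run lvremove -y "/dev/$VG/$SNAP"                 # free the snapshot space
}

lvm_snapshot_backup
```

If a step fails after the mount in real mode, the script exits with the snapshot still mounted, so clean up with umount and lvremove by hand.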

Rollback strategy with legacy snapshots

LVM’s native “rollback” is not as direct as on Btrfs/ZFS:

Using the LVM merge feature (where supported):

This is more disruptive than filesystems designed around snapshots.
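
As a minimal sketch of the merge-based rollback, assuming a snapshot named vg0/root_snap (a hypothetical name) and again printing commands rather than running them by default:

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# Merge a snapshot back into its origin LV. This reverts the origin to the
# snapshot's contents and removes the snapshot once the merge completes.
lvm_rollback() {
    snap=$1
    run lvconvert --merge "$snap"
    # If the origin or snapshot is open (e.g. a mounted root filesystem),
    # LVM defers the merge until the LV is next activated.
    echo "If the origin is in use, reboot or reactivate the LV to start the merge."
}

lvm_rollback vg0/root_snap
```

Because the snapshot is consumed by the merge, take a fresh snapshot afterwards if you want to keep a restore point.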

Thin-Provisioned LVM Snapshots

Thin provisioning adds a thin pool that stores data blocks for many thin volumes and snapshots.

Properties:

Typical structure:

Workflow:

  1. Create a thin pool and thin LV (covered elsewhere in the storage chapter).
  2. Create snapshots of thin LVs inside the pool.
  3. Rollback:
    • Use lvconvert --merge (thin-snapshot aware) or create a new LV from a snapshot, then switch to it.

You must monitor thin pool usage; if the pool fills, all thin volumes become read-only or fail, depending on configuration.
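
One way to watch pool usage is to read the data_percent field from lvs and compare it against a threshold. This is a sketch under assumptions: the pool name vg0/pool0 and the 80% limit are hypothetical, and pool_over_threshold is a helper defined here, not an LVM command.

```shell
#!/bin/sh
# pool_over_threshold USED LIMIT
# Succeeds (exit 0) when USED percent is at or above LIMIT percent.
# awk handles the fractional values that lvs prints (e.g. "83.45").
pool_over_threshold() {
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 >= limit + 0) }'
}

# Hypothetical pool name; querying it needs the LVM tools and usually root.
POOL=${POOL:-vg0/pool0}
if command -v lvs >/dev/null 2>&1; then
    used=$(lvs --noheadings -o data_percent "$POOL" 2>/dev/null | tr -d ' ')
    if [ -n "$used" ] && pool_over_threshold "$used" 80; then
        echo "WARNING: thin pool $POOL is ${used}% full" >&2
    fi
fi
```

Run from cron or a systemd timer, a check like this gives you time to extend the pool or delete old snapshots before writes start failing.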


Btrfs Snapshots and Rollbacks

Btrfs implements snapshots at the filesystem level on subvolumes. A Btrfs subvolume is a separately mountable, CoW-managed tree inside a Btrfs filesystem.

Key properties:

Subvolumes vs Filesystems

Btrfs typically uses one block device (e.g. /dev/sda2) with many subvolumes inside:

The block device is mounted once, and subvol/subvolid mount options choose which subvolume to expose at a given mount point.

Snapshots are always of subvolumes, not of individual files.

Creating and Managing Snapshots

Typical operations (conceptual):
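
A minimal sketch of the common subvolume commands follows. The layout is an assumption: the top-level volume is mounted at /mnt/btrfs, the root subvolume is named @, and snapshots live under @snapshots. Commands are printed, not executed, unless DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# Hypothetical mount point of the top-level Btrfs volume.
TOP=/mnt/btrfs

btrfs_snapshot_demo() {
    name=$1
    # Read-only snapshot of the root subvolume
    run btrfs subvolume snapshot -r "$TOP/@" "$TOP/@snapshots/$name"
    # Writable snapshot derived from it, e.g. for experiments
    run btrfs subvolume snapshot "$TOP/@snapshots/$name" "$TOP/@test"
    # List subvolumes and snapshots, then drop the experiment
    run btrfs subvolume list "$TOP"
    run btrfs subvolume delete "$TOP/@test"
}

btrfs_snapshot_demo "root-$(date +%Y%m%d)"
```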

Read-only snapshots are ideal for backup and rollback operations, because they cannot be accidentally modified.

Rollbacks with Btrfs

You don’t “rewind in place” as with some LVM merges; instead, you pick which subvolume the system should use as the active root.

Common distro strategy:

Manual rollback workflow (conceptually):

  1. Boot into a rescue system or another Linux environment that can access the Btrfs volume.
  2. Mount the top-level subvolume of the Btrfs device at a generic mount point, e.g. /mnt, and locate your subvolumes and snapshots.
  3. Decide which snapshot to roll back to.
  4. Either:
    • Rename current @ to something like @.broken and rename the snapshot to @ (for a read-only snapshot, create a writable snapshot of it named @ instead), or
    • Adjust /etc/fstab and bootloader entries to use the snapshot subvolume.
  5. Reboot into the rolled-back system.
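
The rename-based variant of this workflow might look like the sketch below, run from the rescue environment. It assumes the device /dev/sda2, a root subvolume named @, and a snapshot at @snapshots/pre-update (the snapshot name is hypothetical). It creates a writable snapshot as the new @ rather than renaming, so a read-only snapshot stays intact.

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

btrfs_rollback() {
    dev=$1 snap=$2
    run mount -o subvolid=5 "$dev" /mnt               # 5 = top-level subvolume
    run mv /mnt/@ /mnt/@.broken                       # keep the broken root around
    run btrfs subvolume snapshot "/mnt/$snap" /mnt/@  # writable copy becomes new @
    run umount /mnt
    echo "Reboot to start from the restored @ subvolume."
}

btrfs_rollback /dev/sda2 @snapshots/pre-update
```

This only works unchanged if /etc/fstab and the bootloader select the root by name (subvol=@). If they use subvolid= or a default subvolume, those references need updating as well.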

Advantages:

Snapshots and System Updates

Btrfs snapshots pair well with transactional or snapshot-aware update tools:

This pattern reduces the risk of system updates, because a failed update can be undone by booting the snapshot taken before it.


ZFS Snapshots and Rollbacks

ZFS integrates volume management and filesystem functionality. Snapshots operate at the dataset level.

Key properties:

ZFS Datasets and Snapshots

A dataset is a ZFS filesystem or volume:

Snapshots are named by appending @ and a snapshot name to the dataset name, for example pool/data@backup1.

Snapshots do not appear as normal directories; they are managed via ZFS commands, and each filesystem also exposes them read-only under its hidden .zfs/snapshot directory.

Typical operations:
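
A minimal sketch of creating, listing, and destroying a snapshot, using the pool/data dataset name that appears later in this chapter. Commands are printed, not executed, unless DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

zfs_snapshot_demo() {
    ds=$1 name=$2
    run zfs snapshot "$ds@$name"         # create dataset@name
    run zfs list -t snapshot -r "$ds"    # list snapshots of the dataset
    run zfs destroy "$ds@$name"          # delete the snapshot again
}

zfs_snapshot_demo pool/data backup1
```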

Rollbacks with ZFS

ZFS provides a dedicated rollback mechanism:

Workflow:

  1. Identify the snapshot to roll back to.
  2. Unmount or stop services using the dataset if needed (some operations require this).
  3. Run rollback command.
  4. Remount or restart services.
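
The rollback step itself can be sketched as follows. zfs_rollback is a hypothetical wrapper name; the underlying zfs rollback command only goes back to the most recent snapshot unless -r is given, in which case any snapshots newer than the target are destroyed.

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# zfs_rollback DATASET@SNAP [-r]
zfs_rollback() {
    target=$1 force=${2:-}
    # Show the dataset's snapshots, oldest first, to confirm the target
    run zfs list -t snapshot -o name,creation -s creation -r "${target%@*}"
    if [ "$force" = "-r" ]; then
        run zfs rollback -r "$target"   # also destroys newer snapshots
    else
        run zfs rollback "$target"      # works only for the latest snapshot
    fi
}

zfs_rollback pool/data@backup1
```

Either form discards every change written to the dataset after the target snapshot.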

Caveats:

Clone vs Snapshot

ZFS supports both:

Workflow:
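
A minimal sketch of a throwaway sandbox built from a snapshot. The clone name pool/sandbox is a hypothetical example, and commands are printed rather than executed unless DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

zfs_sandbox() {
    snap=$1 clone=$2
    run zfs clone "$snap" "$clone"   # writable dataset sharing the snapshot's blocks
    # ...experiment inside the clone here...
    run zfs destroy "$clone"         # throw the sandbox away
    # To keep the clone instead and make the origin depend on it:
    #   zfs promote "$clone"
}

zfs_sandbox pool/data@backup1 pool/sandbox
```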

This is useful for safe testing environments or quickly spawning development sandboxes.

Snapshots for Replication

A major strength of ZFS is snapshot-based replication:

Conceptual flow:

  1. zfs snapshot pool/data@backup1
  2. zfs send pool/data@backup1 | ssh backuphost zfs receive backup/data
  3. Next time:
    • zfs snapshot pool/data@backup2
    • zfs send -i pool/data@backup1 pool/data@backup2 | ssh backuphost zfs receive backup/data

This creates incremental backups based on snapshots with minimal data transfer.
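
A backup script has to decide between a full and an incremental stream. The helper below is a sketch of that decision only; build_send is a hypothetical function name, and it prints the zfs send command instead of running it.

```shell
#!/bin/sh
# build_send DATASET NEW [PREV]
# Prints a full "zfs send" when PREV is empty, otherwise an incremental one
# from PREV to NEW.
build_send() {
    ds=$1 new=$2 prev=${3:-}
    if [ -n "$prev" ]; then
        echo "zfs send -i $ds@$prev $ds@$new"
    else
        echo "zfs send $ds@$new"
    fi
}

build_send pool/data backup1            # first run: full stream
build_send pool/data backup2 backup1    # later runs: incremental
```

In practice the printed command is piped into ssh backuphost zfs receive backup/data, as in the flow above. The receiving side must still hold the previous snapshot for an incremental stream to apply.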


Comparing Snapshot Approaches

Granularity and Scope

Performance Considerations

Space Management

If the underlying pool or filesystem runs out of space, all operations, including snapshots, can fail or destabilize the system.


Safe Rollback Practices

Rollbacks are powerful but disruptive: they discard every change made after the snapshot you roll back to.

Guidelines:

  1. Always have backups separate from snapshots
    • Snapshots protect against accidental modification or short-term issues.
    • They do not protect against:
      • Disk failure.
      • Pool corruption.
      • Catastrophic hardware loss.
    • Use off-host or off-site backups (e.g. rsync, ZFS send, Btrfs send) in addition to snapshots.
  2. Plan for configuration and data separation
    • Put system and user data on separate logical units (LVM volumes, Btrfs subvolumes, ZFS datasets).
    • This lets you roll back system snapshots without losing user data.
    • It also lets you use different snapshot policies for system and data volumes.
  3. Use read-only snapshots for rollback
    • Take read-only snapshots (Btrfs/ZFS) as “golden” points.
    • For experiments, create writable snapshots/clones derived from read-only snapshots.
    • Roll back from known-good read-only snapshots instead of mutable ones.
  4. Integrate with the bootloader and package manager
    • Some distros integrate:
      • snapshots with grub-btrfs or similar tools,
      • transactional upgrades with automatic snapshots (e.g., openSUSE).
    • This makes rollback from failed updates routine and safer.
  5. Test rollback procedures
    • Practice in a virtual machine:
      • Take snapshots, break the system, then roll back.
    • Confirm:
      • Services start correctly after rollback.
      • Databases and applications behave as expected.
    • Document the procedure for your environment.

Example Use Cases

Desktop OS Protection

Developer Sandboxes

Quick “Oh No” Recovery

Backup Integration

When to Use Which Snapshot Tool

Given a choice:

Each technology can implement snapshots and rollbacks; pick the one that fits your filesystem choice and operational needs.
