
3.6.3 Creating tar archives

Understanding `tar` in the Context of Backups

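tar (short for "tape archiver") is the standard tool on Linux for creating and restoring archive files. In backup workflows it is most often used to bundle a directory tree, together with its metadata, into a single archive file that can be compressed, stored or transferred, and later restored.

Many Linux backup tools and scripts either invoke tar directly or mimic its behavior.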

Basic `tar` Syntax for Archives

The general tar structure:

tar [OPTIONS] -f ARCHIVE [PATH...]

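Commonly used short options (can be combined, e.g., czf): -c creates an archive, -x extracts one, -t lists its contents, -f names the archive file, -v prints each file as it is processed, -z filters the archive through gzip, and -p preserves permissions when extracting.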

Examples:

# Create an uncompressed archive of /etc
tar -cf etc-backup.tar /etc
# List contents
tar -tf etc-backup.tar
# Extract into current directory
tar -xf etc-backup.tar
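
The same create, list, and extract cycle can be tried without touching /etc. The following is a minimal sketch, assuming GNU tar; the scratch directory and the names work, sample, and restore are illustrative and not part of any real system layout.

```shell
# Scratch directory so nothing outside it is touched (names are illustrative)
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/sample" "$work/restore"
echo "hello" > "$work/sample/a.conf"

# Create an uncompressed archive; -C changes directory first, so the
# member names are stored relative to "$work"
tar -cf "$work/sample.tar" -C "$work" sample

# List contents
listing=$(tar -tf "$work/sample.tar")
echo "$listing"

# Extract into a separate directory instead of the current one
tar -xf "$work/sample.tar" -C "$work/restore"
```

Extracting into a separate directory keeps the round trip easy to check with cmp or diff before trusting the same commands on real data.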

Preserving Metadata for System Backups

For system or configuration backups, preserving metadata is crucial.

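Key options: -p (--preserve-permissions) keeps the recorded file modes on extraction, --acls and --xattrs store POSIX ACLs and extended attributes, and --numeric-owner uses numeric UID/GID values instead of user and group names, which matters when restoring on a system whose account names map to different IDs.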

When creating an archive for system-level restore:

sudo tar -cpf rootfs-backup.tar \
  --acls --xattrs --numeric-owner \
  /

In practice you will exclude runtime directories (covered below), but this shows the pattern.
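
To see the effect of -p on a small scale, the sketch below archives a group-writable file and restores it under a umask that would normally strip the group write bit. It assumes GNU tar and GNU stat; the file names are made up for the example, and --acls and --xattrs are left out because they depend on how tar was built and on filesystem support.

```shell
umask 022
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src" "$work/restore"
echo "secret=1" > "$work/src/app.conf"
chmod 664 "$work/src/app.conf"   # group-writable; umask 022 would mask this on a plain extract

# The mode is always recorded in the archive; -p matters at extraction time,
# where it tells tar to apply the recorded mode instead of the umask
tar -cpf "$work/src.tar" --numeric-owner -C "$work" src
tar -xpf "$work/src.tar" --numeric-owner -C "$work/restore"

mode=$(stat -c '%a' "$work/restore/src/app.conf")
echo "restored mode: $mode"

# With --numeric-owner, the verbose listing shows UID/GID numbers rather than names
tar --numeric-owner -tvf "$work/src.tar"
```

When extracting as root, GNU tar preserves permissions by default, so -p mostly makes the intent explicit there.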

Using Compression with `tar`

tar itself does not compress by default; it can delegate compression to other tools.

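Common options: -z for gzip, -j for bzip2, -J for xz, and --zstd for zstd.

Trade-offs (roughly): gzip is fast and available almost everywhere, with moderate compression; xz produces smaller archives but is considerably slower to compress; zstd is typically faster than gzip at a similar or better ratio, but needs a reasonably recent tar and the zstd tool installed.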

Examples:

# gzip-compressed archive
tar -czf home-backup.tar.gz /home
# xz-compressed archive
tar -cJf config-backup.tar.xz /etc
# zstd-compressed archive (if supported)
tar --zstd -cf data-backup.tar.zst /var/www

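To extract from a file, you only need -xf; GNU tar detects the compressor from the file contents (when reading from a pipe, name the compressor explicitly, e.g. -xzf -):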

tar -xf home-backup.tar.gz
tar -xf config-backup.tar.xz
tar -xf data-backup.tar.zst
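
How much compression helps depends heavily on the data. As a rough illustration, the sketch below archives a deliberately repetitive sample file with and without gzip and prints both sizes. It assumes GNU tar with gzip installed; the directory and file names are invented for the example.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/logs"
# Highly repetitive sample data, so it compresses well
yes "sample log line" | head -n 20000 > "$work/logs/app.log"

tar -cf  "$work/logs.tar"    -C "$work" logs   # no compression
tar -czf "$work/logs.tar.gz" -C "$work" logs   # gzip

plain_size=$(wc -c < "$work/logs.tar")
gzip_size=$(wc -c < "$work/logs.tar.gz")
echo "uncompressed: $plain_size bytes, gzip: $gzip_size bytes"

# No -z needed when reading from a file: tar detects gzip from the contents
members=$(tar -tf "$work/logs.tar.gz")
echo "$members"
```

Already-compressed data such as images or video will show little or no gain, so measuring on a sample of the real data is more reliable than any rule of thumb.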

Relative vs Absolute Paths in Archives

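How you specify paths when creating an archive affects where files extract later. GNU tar strips the leading / from member names by default (unless -P is given), so /etc is stored as etc and extracts relative to the current directory; giving relative paths yourself makes that behavior explicit.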

Safer approach for portable backups:

# From root, but store as relative paths
cd /
sudo tar -czf root-backup.tar.gz \
  --acls --xattrs --numeric-owner \
  etc var home

Extraction example into /restore-root:

sudo mkdir -p /restore-root
cd /restore-root
sudo tar -xzf /path/to/root-backup.tar.gz
# etc, var, home now appear under /restore-root, not overwriting your live system
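
An alternative to cd is tar's -C option, which changes directory before adding or extracting members. The sketch below uses a stand-in tree (fakeroot, a hypothetical name) in a scratch directory rather than the real /, and assumes GNU tar.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/fakeroot/etc" "$work/restore-root"
echo "127.0.0.1 localhost" > "$work/fakeroot/etc/hosts"

# -C changes into the directory before adding "etc", so the member
# name is stored as etc/hosts with no leading slash
tar -czf "$work/root-backup.tar.gz" -C "$work/fakeroot" etc

names=$(tar -tf "$work/root-backup.tar.gz")
echo "$names"

# Extract under a different prefix; nothing outside restore-root is written
tar -xzf "$work/root-backup.tar.gz" -C "$work/restore-root"
```

Using -C keeps the script's working directory unchanged, which is convenient when the archive file itself is written to a relative path.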

Excluding Files and Directories

For practical backups, you rarely want to archive everything.

--exclude allows you to skip paths:

# Exclude a directory
tar -czf etc-backup.tar.gz \
  --exclude='/etc/ssl/private' \
  /etc
# Multiple excludes
tar -czf root-backup.tar.gz \
  --exclude='/proc' \
  --exclude='/sys' \
  --exclude='/dev' \
  --exclude='/run' \
  /

You can also exclude by pattern:

# Exclude all .cache directories inside /home
tar -czf home-backup.tar.gz \
  --exclude='*/.cache' \
  /home

Exclusion file (for many patterns):

# Contents of exclude.txt (one pattern per line)
/proc
/sys
/dev
/run
/tmp
/var/tmp
/var/cache
/home/*/.cache

# Use it
sudo tar -czf root-backup.tar.gz \
  --exclude-from=exclude.txt \
  /
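
Exclusion patterns are easy to get subtly wrong, so it helps to test them on a small tree and inspect the listing. The sketch below assumes GNU tar; the sample tree under home/alice and the patterns are illustrative only.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/home/alice/.cache" "$work/home/alice/docs" "$work/home/alice/tmp"
echo "keep" > "$work/home/alice/docs/notes.txt"
echo "junk" > "$work/home/alice/.cache/blob"
echo "junk" > "$work/home/alice/tmp/scratch"

# One pattern per line
printf '%s\n' '*/.cache' '*/tmp' > "$work/exclude.txt"

# Put the exclude options before the paths they should apply to
tar -czf "$work/home-backup.tar.gz" \
  --exclude-from="$work/exclude.txt" \
  -C "$work" home

# Inspect what actually went into the archive
kept=$(tar -tf "$work/home-backup.tar.gz")
echo "$kept"
```

Excluding a directory also excludes everything beneath it, which is why neither blob nor scratch shows up in the listing.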

Incremental and Differential Archives with `tar`

GNU tar supports incremental-style backups using snapshot files.

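Core idea: the file named by --listed-incremental (here snapshot.snar) records what was archived and when. If it does not exist yet, tar creates it and writes a full (level 0) archive; on later runs with the same file, tar archives only what has changed since the previous run and updates the snapshot. For a differential backup, work from a fresh copy of the level 0 snapshot each time, so that every run is relative to the full backup.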

Example workflow:

# 1. Full backup
sudo tar -czf full-backup.tar.gz \
  --listed-incremental=snapshot.snar \
  /
# 2. Incremental backup next day
sudo tar -czf incr-2025-12-13.tar.gz \
  --listed-incremental=snapshot.snar \
  /
# 3. Another incremental backup later
sudo tar -czf incr-2025-12-14.tar.gz \
  --listed-incremental=snapshot.snar \
  /

To inspect what an incremental archive contains:

tar -tvf incr-2025-12-13.tar.gz

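Restoring incremental backups is more involved than restoring a simple archive: extract the full archive first, then each incremental in the order it was created, passing --listed-incremental=/dev/null each time so that tar handles the incremental format without needing the snapshot file. For more complex strategies, many admins prefer external backup tools that manage this layering for you, but this demonstrates tar’s built-in capability.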
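
The full cycle, including the restore order, can be rehearsed on a small directory. This is a sketch under the assumption of GNU tar; the data directory, archive names, and the one-second pauses (which only keep the timestamps clearly separated in a fast script) are choices made for the example.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/data" "$work/backups" "$work/restore"
echo "v1" > "$work/data/a.txt"
sleep 1   # keep file timestamps clearly older than the first snapshot

# Level 0: snapshot.snar does not exist yet, so everything is archived
tar -czf "$work/backups/full.tar.gz" \
  --listed-incremental="$work/backups/snapshot.snar" \
  -C "$work" data

sleep 1
echo "new" > "$work/data/b.txt"

# Level 1: same snapshot file, so only changes since level 0 are archived
tar -czf "$work/backups/incr-1.tar.gz" \
  --listed-incremental="$work/backups/snapshot.snar" \
  -C "$work" data

incr_members=$(tar -tzf "$work/backups/incr-1.tar.gz")
echo "$incr_members"

# Restore in creation order; /dev/null stands in for the snapshot file
for archive in full.tar.gz incr-1.tar.gz; do
  tar -xzf "$work/backups/$archive" --listed-incremental=/dev/null -C "$work/restore"
done
```

The incremental archive lists the directory and the new file but not the unchanged a.txt, which is why the full archive has to be restored first.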

Managing Large Archives and Splitting

When archiving large datasets, you may want to split archives into manageable pieces (for example, to fit on media or upload limits).

Typical approach: combine tar with split:

# Create and split into 1G chunks
tar -czf - /home | split -b 1G - home-backup.tar.gz.part-
# This produces files like:
# home-backup.tar.gz.part-aa, home-backup.tar.gz.part-ab, ...

To restore:

# Reassemble and extract
cat home-backup.tar.gz.part-* | tar -xzf -

You can also use --multi-volume with tar itself (GNU tar does not allow it together with compression), but combining with split is simpler and more common.
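
A split-and-reassemble round trip is worth testing before relying on it. The sketch below does so with a small random file and 100 KiB chunks, assuming GNU tar and GNU split; the sizes and names are arbitrary choices for the example.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/data" "$work/parts" "$work/restore"
# Random bytes barely compress, so the archive stays larger than one chunk
head -c 300000 /dev/urandom > "$work/data/blob.bin"

# Stream the archive to stdout and cut it into 100 KiB pieces
tar -czf - -C "$work" data | split -b 100K - "$work/parts/data.tar.gz.part-"

part_count=$(ls "$work/parts" | wc -l)
echo "parts written: $part_count"

# The shell expands the glob in sorted order (aa, ab, ...),
# which is the order in which split wrote the pieces
cat "$work/parts"/data.tar.gz.part-* | tar -xzf - -C "$work/restore"

cmp "$work/data/blob.bin" "$work/restore/data/blob.bin" && echo "round trip OK"
```

Every part is needed for a restore, so it is sensible to store a checksum for each piece alongside it.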

Using `tar` with Pipes and Remote Backups

Because tar reads/writes to standard input/output, it integrates well with other tools and remote transfers.
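
The simplest illustration of this is a purely local pipe, where one tar writes an archive to standard output and a second tar reads it from standard input. The sketch below assumes GNU tar and uses made-up directory names; remote transfers follow the same shape with ssh sitting in the middle of the pipe.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/sub" "$work/dst"
echo "payload" > "$work/src/sub/file.txt"

# "-f -" means stdout for the writing tar and stdin for the reading tar
tar -cf - -C "$work/src" . | tar -xpf - -C "$work/dst"

cmp "$work/src/sub/file.txt" "$work/dst/sub/file.txt" && echo "copied via pipe"
```

No archive file is ever written to disk here; the stream exists only inside the pipe.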

Using `tar` over `ssh`

Create a backup on a remote host:

# From local machine, backing up /var/www on remote host to local file
ssh user@remote 'tar -czf - /var/www' > remote-www-backup.tar.gz

Or send local backup to remote host:

# From local machine, store/archive on remote host
tar -czf - /var/www | ssh user@backup-host 'cat > www-backup.tar.gz'

Combining with `rsync` or other tools

You can also compress or transform a stream:

# tar + xz via pipe, then encrypt with gpg (example)
tar -cf - /home | xz | gpg --symmetric -o home-backup.tar.xz.gpg

Verifying Archive Integrity

tar itself does not embed checksums of the entire archive, but it will report I/O errors and format issues. To add stronger verification, combine tar with checksum tools.

Basic verification approach:

  1. After creating an archive, compute a checksum:
   sha256sum home-backup.tar.gz > home-backup.tar.gz.sha256
  2. Later, verify:
   sha256sum -c home-backup.tar.gz.sha256
   # OK if it prints: 'home-backup.tar.gz: OK'
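
To see what a failed check looks like, the two steps can be run on a throwaway archive that is then deliberately damaged. This is a sketch assuming GNU tar and sha256sum from coreutils; the file names and the status variables are illustrative.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/data"
echo "important" > "$work/data/file.txt"
tar -czf "$work/data-backup.tar.gz" -C "$work" data

# Record the checksum next to the archive (run from its directory so the
# stored file name is relative)
( cd "$work" && sha256sum data-backup.tar.gz > data-backup.tar.gz.sha256 )

# Verify the untouched archive
if ( cd "$work" && sha256sum -c data-backup.tar.gz.sha256 ); then
  intact_status=ok
else
  intact_status=failed
fi

# Simulate corruption by appending one byte, then verify again
printf 'x' >> "$work/data-backup.tar.gz"
if ( cd "$work" && sha256sum -c data-backup.tar.gz.sha256 2>/dev/null ); then
  damaged_status=ok
else
  damaged_status=failed
fi

echo "before: $intact_status, after tampering: $damaged_status"
```

Because sha256sum -c reports failure through its exit status, the same check can gate a restore script.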

You can also use tar --compare (-d) to compare archive contents with the filesystem:

# Compare archive vs current filesystem
# (members are stored as etc/..., so compare relative to /)
sudo tar -df etc-backup.tar -C /

Differences may appear if files have changed since the backup; this is more useful immediately after creation to confirm consistency.
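
The behavior of --compare is easiest to see on a scratch directory: run it once right after creating the archive, then again after changing a file. The sketch assumes GNU tar; etc-sample and web.conf are invented names.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/etc-sample"
echo "port=80" > "$work/etc-sample/web.conf"
tar -cf "$work/etc-sample.tar" -C "$work" etc-sample

# Right after creation: archive and filesystem agree
tar -df "$work/etc-sample.tar" -C "$work"
fresh_status=$?

# Change a file, then compare again; tar reports what differs
echo "port=8080" >> "$work/etc-sample/web.conf"
tar -df "$work/etc-sample.tar" -C "$work"
changed_status=$?

echo "fresh: $fresh_status, after change: $changed_status"
```

A non-zero exit status from the second run signals that at least one member no longer matches the filesystem.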

Practical Backup Examples with `tar`

Example: Backing Up `/etc` Configuration

sudo tar -czf etc-$(date +%F).tar.gz \
  --acls --xattrs --numeric-owner \
  /etc

Example: Home Directory Backup with Exclusions

tar -czf home-$(date +%F).tar.gz \
  --exclude='*/.cache' \
  --exclude='*/Downloads' \
  /home

Example: Root Filesystem Backup (Non-Live Restore)

Using an exclusion file:

# Contents of exclude-root.txt (one pattern per line)
/proc
/sys
/dev
/run
/tmp
/var/tmp
/var/cache
/home/*/.cache

# Create backup
sudo tar -czf root-$(date +%F).tar.gz \
  --acls --xattrs --numeric-owner \
  --exclude-from=exclude-root.txt \
  /

This archive is suitable for restoring into a non-live environment (e.g., rescue system, chroot, or another disk) as part of a broader restore process.

Restoring from `tar` Archives Safely

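Restoring is the other half of using tar for backups. Some practical points: list an archive's contents before extracting it, extract into a dedicated directory with -C rather than over live files, and run the extraction as root when ownership and permissions need to be restored.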

Examples:

# Inspect archive structure
tar -tf root-2025-12-12.tar.gz | head
# Extract into a custom directory (e.g., new root)
sudo mkdir -p /mnt/restore
sudo tar -xzf root-2025-12-12.tar.gz -C /mnt/restore
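
The inspect-then-extract habit can be wrapped in a small function. The sketch below is one possible shape, not an established tool: safe_extract is a hypothetical helper name, and the check for absolute or ".." member names is a pre-flight precaution on top of GNU tar's own defaults, which already strip leading slashes and skip members containing ".." on extraction.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/etc" "$work/restore"
echo "demo" > "$work/src/etc/demo.conf"
tar -czf "$work/backup.tar.gz" -C "$work/src" etc

# Hypothetical helper: list first, refuse member names that are absolute
# or contain "..", and only then extract under the target directory
safe_extract() {
  archive=$1
  target=$2
  if tar -tf "$archive" | grep -Eq '^/|(^|/)\.\.(/|$)'; then
    echo "refusing to extract: suspicious member names in $archive" >&2
    return 1
  fi
  mkdir -p "$target" && tar -xpf "$archive" -C "$target"
}

safe_extract "$work/backup.tar.gz" "$work/restore"
```

Running the listing step separately also gives a chance to confirm that the archive's top-level directories are the ones expected before anything is written.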

For a system-wide restore that overwrites existing paths, extract with care, and usually from outside the running system (e.g., from a live USB or recovery mode).

Integrating `tar` into Backup Strategies

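Within a broader backup and restore strategy, tar is typically one component among several: it creates the archives, while other tools handle compression, splitting, transfer, encryption, and checksum verification.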

Understanding tar’s options and behavior lets you build reliable backup routines and reason about how to reconstruct systems from those archives when needed.
