Why Automate Backups
Automating backups means arranging your system so that backups run by themselves at regular times without you having to remember or type commands. For a single user on a laptop this can be simple, but on servers it is essential. Human memory is unreliable, and manual backups tend to be forgotten or postponed. Automation gives you consistency, predictability, and faster recovery when something goes wrong.
Automation does not replace backup planning. You still need a backup strategy, you still need to decide what to back up and where to store it, and you still need to test that restores work. Automation simply executes the plan for you, according to a schedule and with the same commands every time.
The rest of this chapter focuses on the mechanics that make backups automatic on Linux, mainly cron, systemd timers, scripts that wrap backup tools like rsync and tar, and practical concerns such as logging and rotation.
Automated backups are useful only if restores are tested. Always verify that you can restore from your automated backups before you rely on them.
Scheduling with Cron
Cron is a time-based job scheduler that runs commands at specific times or intervals. For backups this is often the first and simplest tool to use. You define jobs in crontab files, and cron runs them in the background at the specified times.
Each user, including root, can have a personal crontab. System wide schedules are stored in /etc/crontab and in directories such as /etc/cron.daily. For automated backups that affect the entire system, it is common to use root’s crontab so the job has permission to read all necessary files.
You edit a user’s crontab with:
crontab -e

A cron line has the general form:

MIN HOUR DOM MON DOW COMMAND

where:
MIN is the minute (0 to 59),
HOUR is the hour (0 to 23),
DOM is the day of month (1 to 31),
MON is the month (1 to 12),
DOW is the day of week (0 to 7, where both 0 and 7 are Sunday),
COMMAND is what will be executed.
For example, to run a backup script every day at 2:30 in the morning:
30 2 * * * /usr/local/sbin/backup.sh

Cron runs jobs with a minimal environment and does not load your usual interactive shell configuration. For automated backups this means:
Specify full paths for commands, such as /usr/bin/rsync or /usr/bin/tar.
Set any needed environment variables inside the script rather than relying on your interactive environment.
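As a sketch, a backup script can set everything it needs at the top so that it behaves the same under cron as in a terminal; the PATH value and the variable below are examples, not required settings:

```shell
#!/bin/sh
# Sketch of a cron-safe script header. The values are examples; the
# point is that the script relies only on what it sets here, not on
# anything inherited from a login shell.
PATH=/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# Variables the backup needs are defined explicitly, not inherited.
BACKUP_DEST=/backup
export BACKUP_DEST
```

With this header in place, the cron entry itself can stay as short as a single absolute path to the script.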
You can also use special time directives instead of numeric schedules. On many systems you can write:
@daily /usr/local/sbin/backup.sh
@weekly /usr/local/sbin/backup-full.sh

These are convenient when you want “once a day” without worrying about the exact hour.
danger
Do not depend on interactive environment settings in cron jobs. Always use absolute paths and set any required variables explicitly in your script.
Scheduling with Systemd Timers
On systems that use systemd, you can schedule tasks using systemd timers instead of cron. Timers integrate with systemd units, logging, and service management. This makes them attractive for more complex or critical backup workflows.
A systemd timer works together with a service. The service unit describes what to run, and the timer unit defines when to run it. For example, you might have a service:
/etc/systemd/system/backup.service

with contents similar to:
[Unit]
Description=Run backup script
[Service]
Type=oneshot
ExecStart=/usr/local/sbin/backup.sh

and a timer:

/etc/systemd/system/backup.timer

with contents like:
[Unit]
Description=Daily backup timer
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
The OnCalendar directive specifies when the service should run. The value daily is a shortcut that means “once per day”. You can also use more specific expressions, for example:
OnCalendar=03:00 for 3:00 every day.
OnCalendar=Mon *-*-01..07 02:00:00 for the first Monday of each month at 2:00.
The Persistent=true setting tells systemd to run missed jobs immediately when the system comes back from downtime. For example, if your system was off at the scheduled time, the backup will run once it boots again.
To enable and start the timer you run:
sudo systemctl enable --now backup.timer

You can check the status with:
systemctl list-timers
systemctl status backup.timer
Timers are particularly useful for server backups because they integrate with journalctl logging and with dependency management, such as ordering backups relative to other services.
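For example, ordering relative to another service can be expressed with standard unit directives in the service file; the database service name below is only an illustration:

```ini
[Unit]
Description=Run backup script
# Start only after the database service has started, so the backup
# does not race against its startup. The service name is an example.
After=postgresql.service
```

systemd then schedules the backup service with this ordering taken into account whenever the timer fires.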
Writing Backup Scripts for Automation
Automation typically calls a script that wraps the actual backup commands. This script can use tools such as rsync and tar and can implement your backup strategy. The script acts as a single entry point for cron or a systemd timer.
A basic backup script often includes the following elements:
Definition of source directories to back up.
Definition of a destination, often an external disk or remote server.
Time-stamped filenames or directories to separate backup runs.
Logging of progress and errors.
Return codes that indicate success or failure.
A simple example might look like this:
#!/bin/bash
set -e
DATE=$(date +%F)
SRC="/home"
DEST="/backup/$DATE"
LOG="/var/log/backup-$DATE.log"
mkdir -p "$DEST"
/usr/bin/rsync -a --delete "$SRC"/ "$DEST"/ >> "$LOG" 2>&1

This type of script can then be executed automatically by cron or a systemd timer. It is important that the script not require any user interaction. All choices should be made beforehand in the script or in configuration files.
To support your backup strategy, the script might create daily, weekly, and monthly backups in different locations, rotate older backups by deleting them after a time limit, or maintain incremental backups that rely on previous runs. These details belong to the strategy itself, and automation simply executes them according to plan.
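As one illustration of the incremental case, rsync's --link-dest option hard-links files that are unchanged since the previous run, so each dated directory looks like a full backup while sharing disk space. The function form and directory layout here are a sketch, not a required convention:

```shell
#!/bin/sh
# Sketch: incremental snapshots with rsync --link-dest.
# Each run creates a dated directory; unchanged files are hard-linked
# against the most recent previous snapshot.
backup_incremental() {
    src=$1; dest=$2
    today=$(date +%F)
    # Most recent previous snapshot, if any (dates sort correctly).
    last=$(ls -1 "$dest" 2>/dev/null | tail -n 1)
    if [ -n "$last" ] && [ "$last" != "$today" ]; then
        rsync -a --delete --link-dest="$dest/$last" "$src"/ "$dest/$today"/
    else
        rsync -a --delete "$src"/ "$dest/$today"/
    fi
}
```

A daily timer calling, say, backup_incremental /home /backup would then produce one dated directory per day, each restorable on its own.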
danger
Automated backup scripts must run noninteractively. Do not use commands that prompt for input, open text editors, or require confirmation unless you provide automatic answers inside the script.
Noninteractive Authentication
When automation involves remote storage, such as backing up to another server via SSH, you need a way for the backup job to authenticate without manual password entry. The typical approach is to use SSH keys with restricted permissions.
You can create a key pair with:
ssh-keygen -t ed25519 -f ~/.ssh/backup_key
Then you copy the public key to the remote backup server and configure that server so this key is allowed to connect, usually by adding the public key to ~/.ssh/authorized_keys on the remote account used for backups.
For additional safety you can restrict the key on the remote side to running only specific backup commands. This is done by prefixing the key line in authorized_keys with options, which limit its capabilities and reduce the impact if the key is compromised.
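For illustration, a restricted entry in authorized_keys might look like the following; command= and restrict are standard OpenSSH options, while the forced-command script path is hypothetical:

```
command="/usr/local/sbin/receive-backup.sh",restrict ssh-ed25519 AAAAC3NzaC1lZDI1... backup-key
```

With restrict, OpenSSH disables PTY allocation and port and agent forwarding for this key, and the forced command runs no matter what command the client asks for, so the key is useful only for the backup transfer itself.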
Similar noninteractive mechanisms exist for other protocols and tools. For example, some cloud storage tools use configuration files with tokens or access keys that the automated job can read. In all cases the backup automation must have a way to access its target without stopping for a password.
Logging and Monitoring Automated Backups
Automated jobs run in the background. If they fail silently for weeks, you might not notice until you try to restore data. To prevent this, your automation should always produce logs, and you should have a way to monitor them.
For cron jobs, a simple pattern is to redirect both standard output and standard error to a log file, for example:
30 2 * * * /usr/local/sbin/backup.sh >> /var/log/backup.log 2>&1

For systemd services, output usually goes to the journal. You can view it with:
journalctl -u backup.service

It is also useful to make your script exit with a nonzero status if something important fails. Cron and systemd can react to these exit codes, and monitoring tools can alert you when a job fails.
Some administrators configure cron to send email when a job produces output or fails. On systems with mail delivery configured, cron can send the script’s output to a designated address. For systemd timers, services can be combined with monitoring units or external monitoring software to trigger alerts.
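For example, cron's MAILTO variable directs any output a job produces to a mailbox; the address below is a placeholder:

```
MAILTO=admin@example.com
30 2 * * * /usr/local/sbin/backup.sh
```

With mail delivery configured, anything the job prints, including error messages, arrives at that address; a job that produces no output sends no mail.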
danger
Every automated backup job must be observable. Store logs, check exit codes, and set up alerts so that failures are noticed quickly.
Controlling Backup Retention Automatically
Automating backups also means automating cleanup. Without retention control, an automated job can quickly fill disks. Retention defines how many backups you keep and for how long.
Your backup script is often the right place to implement retention rules. A simple approach is to delete backups older than a certain age. For example:
find /backup -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} \;

would remove backup directories within /backup that are older than 30 days, while -mindepth 1 keeps /backup itself out of the matches. This maintains a rolling window of recent backups at any time.
More advanced retention schemes keep a mix of daily, weekly, and monthly backups. The automation then decides which backups to delete based on name patterns or timestamps. For example, it might keep all daily backups for the last week, weekly backups for the last month, and monthly backups for the last year.
In all cases you should make sure that deletion rules are correct before putting them into unattended use. It is sensible to start with commands that only list what would be deleted, then review them manually, and only then activate the deletion step.
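One way to sketch that review step is a listing function, with the deletion kept disabled until the output has been checked by hand:

```shell
#!/bin/sh
# Sketch: list expired backup directories before enabling deletion.
list_expired() {
    # Print directories under $1 older than $2 days; -mindepth 1
    # keeps the backup root itself out of the results.
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2" -print
}
# Review with:  list_expired /backup 30
# Only after the listing looks right, switch -print to a deletion step:
# find /backup -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
```

Running the listing a few times over several days before uncommenting the deletion gives confidence that the rule removes exactly what you expect.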
Testing Automated Backups and Restores
Automation introduces its own risks. A small mistake in a script can propagate quietly through many runs. Because of this, regular testing is crucial.
You can test your automation in several ways. First, run the backup script manually and verify that it creates data in the expected location. Next, run a small restore test. For example, pick a file from the backup and restore it to a temporary directory, then compare it to the original. This confirms that your automated job is producing usable backups.
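A spot check of a single file can be sketched as follows; the example paths in the comment are illustrative:

```shell
#!/bin/sh
# Sketch: restore one file from a backup into a temporary directory
# and compare it byte for byte with the live original.
restore_test() {
    original=$1; backup_copy=$2
    tmp=$(mktemp -d)
    cp "$backup_copy" "$tmp/" || return 1
    cmp -s "$original" "$tmp/$(basename "$backup_copy")"
}
# Example: restore_test /home/user/notes.txt /backup/2024-05-01/user/notes.txt
```

A zero exit status means the restored copy matches the original; a nonzero status is a signal to investigate the backup chain before trusting it further.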
Once the job is scheduled, check that it runs at the expected times and produces logs that show success. Occasionally simulate a full restore in a safe environment such as a virtual machine. Use only the automated backups as the source. This exercise confirms that your entire automated chain, from scheduling to storage, is working as intended.
Automated backups are powerful, and once you have them in place they can run reliably for long periods. However, they should never be left completely unattended. Periodic review of logs, retention, storage capacity, and restore procedures keeps the automation trustworthy and ready for the moment you need it.