Introduction
In Linux, rsync is one of the most important tools for efficient file copying and backup. It can synchronize files between directories on the same machine or across a network, and it is designed to transfer only the minimum amount of data necessary. In this chapter you will focus on what makes rsync special for backup and restore tasks and how to use its most important options safely.
Basic rsync usage
At its core, rsync copies data from a source to a destination. The simplest form is:
rsync [options] SOURCE DESTINATION
The source and destination can be local paths or remote paths that use SSH. A local copy might look like this:
rsync -av /home/alice/Documents/ /backup/alice/Documents/
The same style works over SSH using a remote host:
rsync -av /home/alice/Documents/ backupuser@backupserver:/backup/alice/Documents/
Here backupuser@backupserver: tells rsync to connect to the remote system using SSH. For the purpose of backup you will nearly always use the archive option and normally also want some form of progress or statistics, which will be covered later.
The archive mode and key options
For backup tasks the single most important option is -a, called archive mode. It is a shorthand that enables many options at once to preserve file attributes.
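In practical terms, -a tells rsync to copy files recursively and to preserve permissions, modification times, symbolic links, ownership, group, device files, and special files that ordinary copy operations might lose. It is equivalent to -rlptgoD. It does not preserve hard links, ACLs, or extended attributes; those need the separate options -H, -A, and -X.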
Use -a for backup tasks when you want to preserve file attributes.
Typical backup command pattern:
rsync -a SOURCE/ DESTINATION/
You will usually add other options to adapt the behavior. Some of the most common combinations for backup are:
rsync -av SOURCE/ DESTINATION/
Verbose mode -v lists each file as it is transferred and prints a short summary at the end.
rsync -aP SOURCE/ DESTINATION/
The -P option combines --partial and --progress. It is useful when copying large files, since it shows progress and keeps partially transferred files if the transfer is interrupted.
Trailing slashes and what they mean
One of the most confusing aspects of rsync for new users is the meaning of a trailing slash / on the source path. This affects how the destination directory will look.
If you run:
rsync -a /home/alice/Documents /backup/alice/
you are telling rsync to copy the directory Documents itself into /backup/alice/, resulting in /backup/alice/Documents/.
If you run:
rsync -a /home/alice/Documents/ /backup/alice/
you are telling rsync to copy the contents of Documents into /backup/alice/, so the files that were in /home/alice/Documents/ end up directly under /backup/alice/.
Rule:
No trailing slash on the source path copies the directory itself.
A trailing slash copies only the contents of that directory.
This distinction is critical when designing backup scripts and when planning how to restore data back into place.
Using rsync for local backups
Local backups are useful for quick copies to a second disk, an external drive, or another directory on the same system. A simple home directory backup could be:
rsync -aP /home/alice/ /mnt/backupdisk/home/alice/
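This copies all files from /home/alice/ to the backup disk while preserving permissions, timestamps, and symbolic links. If you run the same command again later, rsync skips every file whose size and modification time are unchanged, which makes incremental style backups very fast. For local copies rsync rewrites a changed file in full by default; transferring only the changed parts of a file applies when the destination is remote.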
If your backup destination is on a filesystem that does not support Unix permissions or ownership, some preserved attributes might not be stored correctly. The command still works, but the result is less faithful than on a native Linux filesystem.
Using rsync over SSH
A common pattern for backup is to send data to a remote backup server. rsync integrates smoothly with SSH and uses it as a transport by default when a remote host is involved.
A typical remote backup command is:
rsync -aP /home/alice/ backupuser@backupserver:/backup/alice/
You can also set a custom SSH port or options using -e. For example, if the remote SSH server listens on port 2222, you can use:
rsync -aP -e "ssh -p 2222" /home/alice/ backupuser@backupserver:/backup/alice/
SSH keys are often used to avoid typing passwords each time the backup runs. That part belongs to authentication and system security, so here it is enough to know that rsync can work non-interactively if SSH key authentication is configured.
Dry runs and safety checks
Since rsync can overwrite or delete data, it is vital to test your commands before you run them for real. The option --dry-run lets you see what would happen without changing anything.
A dry run looks like this:
rsync -av --dry-run /home/alice/ /backup/alice/
The output shows which files would be copied, created, or deleted, but no actual modification is done.
Always use --dry-run when you test a new or complex rsync command, especially when using delete options.
You can add --progress or -v to get more detail during the dry run. Once you are satisfied that the operation is safe, remove --dry-run and run the command for real.
Synchronizing and deleting files
One of the strengths of rsync is that it can keep the destination synchronized with the source, not only by copying new or changed files, but also by removing files from the destination that have been removed from the source.
This is controlled by the --delete option. For example:
rsync -a --delete /home/alice/ /backup/alice/
This command ensures that /backup/alice/ is a nearly exact mirror of /home/alice/. Any file that exists in the backup but no longer exists in the source will be deleted from the backup.
This behavior is powerful and dangerous at the same time. If you accidentally reverse source and destination, or if you point to the wrong directory, --delete can remove many files.
When using --delete, always double check source and destination and first run with --dry-run to confirm what would be removed.
For some backup strategies you may prefer not to delete anything at all, or you might combine rsync with snapshot systems so that deleted files are preserved in older snapshots. Which strategy to choose belongs to overall backup design, but here you can see that --delete is the tool that aligns destination with source.
Excluding files and directories
In many backups you do not want to copy every single file. Temporary files, cache directories, or very large directories can be skipped with exclude patterns.
The main option is --exclude. For example:
rsync -a --exclude=".cache/" /home/alice/ /backup/alice/
Because the pattern ends with a slash, this excludes any directory named .cache, at any depth, together with everything inside it. You can use wildcards to match patterns:
rsync -a --exclude="*.iso" /home/alice/ /backup/alice/
You can specify multiple --exclude options if needed:
rsync -a --exclude=".cache/" --exclude="Downloads/" /home/alice/ /backup/alice/
For more complex sets of rules you can store patterns in a file and use --exclude-from=FILE. A simple exclude file might contain lines like:
.cache/
Downloads/
*.iso
*.tmp
The command then becomes:
rsync -a --exclude-from=/home/alice/.rsync-excludes /home/alice/ /backup/alice/
The exact patterns you use depend on your backup strategy. The key idea is that --exclude gives you control over what to leave out.
Compression and bandwidth control
If you transfer backups over the network and especially across slow or congested links, you may want to reduce the amount of data sent and control how quickly it is sent.
The -z option enables compression during transfer. For example:
rsync -azP /home/alice/ backupuser@backupserver:/backup/alice/
Compression is helpful when files are compressible. It is less useful for already compressed files, such as many media files or archives, and may even slow down the transfer if CPU time is limited.
You can also limit the bandwidth used by rsync with --bwlimit=RATE, where RATE is in kilobytes per second (units of 1024 bytes). For instance, to limit to 2 megabytes per second:
rsync -az --bwlimit=2048 /home/alice/ backupuser@backupserver:/backup/alice/
This prevents backup traffic from saturating a connection and disturbing other network use.
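The value 2048 comes from a simple conversion: 2 megabytes per second times 1024 kilobytes per megabyte. In a script the arithmetic can be left to the shell. The helper name bwlimit_from_mib below is hypothetical and only illustrates the calculation.

```shell
# Hypothetical helper: convert a limit given in megabytes per second
# into the kilobytes-per-second value passed to --bwlimit.
bwlimit_from_mib() {
    echo $(( $1 * 1024 ))
}

limit=$(bwlimit_from_mib 2)
echo "--bwlimit=$limit"
```

The resulting string can be placed directly on the rsync command line.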
Partial transfers and resuming
For large files or unreliable connections, it is important that you can resume a transfer without starting from the beginning. With the -P option, rsync keeps partially transferred files and shows progress.
The progress output shows how much of each file has been transferred and the overall speed. If the transfer stops, you can simply run the same command again and rsync will continue from where it left off, instead of copying everything again.
The --partial option is what keeps the incomplete files. The related --partial-dir=DIR option lets you store partial files in a separate directory during transfer, which can help keep the destination directory clean if you want to avoid seeing incomplete files in the main location.
Verifying integrity
For backup tasks it is not enough to copy files. You also want to be confident that the copied data matches the source. rsync has several options related to verification.
By default, rsync decides whether to copy a file based on size and modification time. This is efficient but does not confirm content equality in every possible case. To compare file contents, you can use -c, which tells rsync to use checksums for deciding what to transfer.
For example:
rsync -aPc /home/alice/ /backup/alice/
With -c, rsync computes a checksum for each source and destination file of matching size and compares them. If they differ, the file is transferred. This is slower, because every such file must be read in full on both sides, but more thorough.
-c is the short form of --checksum, so the two spellings are interchangeable. The checksums behave like hash functions. At a conceptual level, you can think of a checksum as a function:
$$
h: \text{file data} \rightarrow \text{fixed-length value}
$$
Two matching checksum values are strong evidence that the files have identical contents.
Use checksum based comparison (-c or --checksum) when content integrity is more important than speed.
In some workflows you may first create a backup without checksums for speed and occasionally run an extra rsync with -c to verify that everything still matches.
Logging and monitoring rsync runs
For regular backup jobs you will want a record of what happened. rsync can produce structured logs with the --log-file option.
A typical use in a script might be:
rsync -a --delete --log-file=/var/log/rsync-home-backup.log /home/alice/ /backup/alice/
The log file then contains a list of transferred files and summary information. For a more informative summary, add --stats. For example:
rsync -a --stats --log-file=/var/log/rsync-home-backup.log /home/alice/ /backup/alice/
The --stats option adds information about the total number of files, the total transferred bytes, and the transfer efficiency.
System wide log rotation can be used to manage growing log files, but that subject belongs to general logging and is covered elsewhere. For rsync itself, the important part is that --log-file gives you a clear record for later troubleshooting and auditing.
Typical backup and restore patterns with rsync
Although full backup strategies are described separately, it is useful here to look at what rsync commands look like when used for backup and restore.
For a regular home directory backup to a local disk, a common pattern is:
rsync -aP --delete --exclude-from=/home/alice/.rsync-excludes /home/alice/ /mnt/backupdisk/home/alice/
This mirrors the home directory, excluding certain patterns, and keeps the backup synchronized by deleting files that were removed from the source.
To push the same data to a remote server:
rsync -aP --delete --exclude-from=/home/alice/.rsync-excludes /home/alice/ backupuser@backupserver:/backup/alice/
Restoring data is conceptually the reverse operation. If you want to restore the entire backed up home directory from the remote server, you might run:
rsync -aP backupuser@backupserver:/backup/alice/ /home/alice/
Before restoring, using --dry-run is again highly recommended, especially if you are overwriting existing files on the system.
Conclusion
rsync is one of the central tools for backup and restore operations on Linux. It preserves file attributes with archive mode, transfers only necessary data, and works efficiently both locally and over SSH. By understanding the meaning of trailing slashes, using --dry-run for safety, controlling deletes with --delete, excluding files, enabling compression and bandwidth limits, and optionally using checksum verification and logging, you can build reliable and controllable backup routines around rsync.