Introduction
Linux can use many different filesystems. In practice, three native Linux filesystems appear most often on modern systems: EXT4, XFS, and Btrfs. Each has a different design philosophy, feature set, and set of trade‑offs. As an administrator, you do not need to become a filesystem developer, but you do need to understand what makes these three special, how they behave, and where each one is typically used.
This chapter focuses specifically on these three filesystems. Device and partition concepts, mounting, and generic disk tools are discussed in other chapters and are not repeated here.
Common Concepts Across Filesystems
All three filesystems discussed here serve the same basic purpose: they organize data on a block device so that Linux can store files and directories. They all support standard Unix permissions, ownership, and timestamps. They all work with the usual Linux tools such as mount, fsck, and df.
They also all implement some form of journaling or copy‑on‑write to protect metadata against corruption during crashes. However, how they do it and how far the protection goes is different, and this difference is one of the main reasons to choose one filesystem over another.
The kernel exposes each filesystem through the Virtual Filesystem (VFS) layer, so from the point of view of most tools and applications, a directory tree on EXT4 looks similar to one on XFS or Btrfs. The differences matter when you create or tune filesystems, select mount options, and plan for growth, backup, and recovery.
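This uniformity at the VFS level is easy to see from the shell. A small sketch using the standard findmnt and df tools, querying the root mount:

```shell
# The VFS hides filesystem differences: the same queries work no matter
# whether the mounted filesystem is EXT4, XFS, or Btrfs.
findmnt -no FSTYPE /    # prints the filesystem type of the root mount
df -hT /                # same information via df, plus space usage
```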
EXT4: The General Purpose Workhorse
EXT4 is the fourth generation of the extended filesystem family. It is the default choice in many distributions and is considered stable, mature, and predictable. For a long time, EXT4 has been the safe default for almost every use case that does not require specific advanced features.
Design Characteristics
EXT4 is a journaling filesystem. It uses a journal to record metadata updates before they are committed to their final locations. This reduces the risk of corruption after an unexpected power loss and makes recovery faster.
EXT4 introduced extents, which represent ranges of contiguous blocks with a single descriptor. This change from older block lists improves performance and reduces fragmentation for large files. EXT4 also supports large files and large filesystem sizes, with limits far above what most small and medium servers will ever need.
Internally, EXT4 still follows a traditional layout with block groups. This structure can be useful when you examine the filesystem with lower level tools, but for everyday administration you usually only interact with high level tools that hide these details.
Typical Use Cases
EXT4 is often used for:
System root partitions on desktops and servers.
General purpose data partitions.
Workloads where predictability and simplicity matter more than special features.
Because of its long history, many tools, installers, and recovery utilities have very robust support for EXT4. If you are unsure what to use and you do not need snapshots, integrated checksums for data, or advanced allocation behavior, EXT4 is usually the safest starting point.
Journaling Modes
EXT4 supports multiple journaling modes that trade off between safety and performance. The most frequently encountered modes are:
data=ordered, which is usually the default. File data is written to disk before the corresponding metadata is committed to the journal. This ensures that after a crash, you do not see new metadata pointing to old garbage data.
data=writeback, which does not order data writes with respect to journal commits. This can improve performance, but after a crash, recently written files may contain stale or garbage data, because committed metadata can point to blocks whose contents were never written.
data=journal, which writes both metadata and file data through the journal. This is safer but slower for heavy write workloads because the data is written twice.
These modes are configured as mount options, usually in /etc/fstab or a direct mount command.
Journal modes affect data safety in the event of a crash. data=ordered is the common balance between performance and integrity. Avoid data=writeback for critical data unless you fully understand the risks.
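As a sketch, journaling modes appear as ordinary mount options in /etc/fstab; the device names and mount points below are hypothetical. data=ordered is the default and does not need to be listed explicitly:

```
# /etc/fstab (fragment) — hypothetical devices and mount points
/dev/sdb1  /srv/data      ext4  defaults                0 2
/dev/sdb2  /srv/critical  ext4  defaults,data=journal   0 2
```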
Administration Notes
You typically create an EXT4 filesystem with mkfs.ext4 or mke2fs. The standard consistency checker is fsck.ext4. Because of its journaling design and maturity, full filesystem checks are not needed as often as with older non‑journaling filesystems, but they are still sometimes necessary, for example after serious hardware issues.
EXT4 also supports features such as quotas, directory indexing, and online resizing: a mounted EXT4 filesystem can be grown, while shrinking requires it to be unmounted first. Most distributions use conservative defaults that work well enough for typical installations, so you do not often need to fine tune these features for basic use.
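The basic EXT4 life cycle can be practiced non‑destructively on a disk image instead of a real block device; on a real system you would target something like /dev/sdb1 and run resize2fs against the mounted device for online growing. A minimal sketch:

```shell
truncate -s 64M ext4.img     # sparse file standing in for a disk
mkfs.ext4 -q -F ext4.img     # -F: accept a regular file as target
fsck.ext4 -f -p ext4.img     # force a full consistency check
truncate -s 128M ext4.img    # enlarge the backing "device"
resize2fs ext4.img           # grow the filesystem to fill it (offline)
```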
XFS: High Performance and Large Scale
XFS is a high performance journaling filesystem originally developed by SGI for IRIX and later ported to Linux. Modern Linux distributions increasingly choose XFS as the default for server installations, especially where large filesystems or heavy parallel I/O are expected.
Design Characteristics
XFS is designed with scalability and parallelism in mind. It uses extent based allocation and several forms of internal parallel structures. These design choices let it handle very large filesystems and high levels of concurrency efficiently.
XFS also journals metadata to provide crash consistency. It is particularly good at streaming large files and handling workloads where many threads write concurrently to different files or different parts of the filesystem.
There are some trade‑offs. An XFS filesystem cannot be shrunk. It is optimized for growing and handling large volumes rather than frequently changing size in both directions.
Typical Use Cases
Common scenarios for XFS include:
Large servers that host big data sets and large files.
High throughput workloads such as media production, backup targets, or scientific data.
Systems where filesystem size grows to many terabytes and must remain responsive under load.
Because XFS scales well with multiple CPUs and high I/O concurrency, it is often a better choice than EXT4 for heavy write intensive applications or very large volumes.
Allocation and Fragmentation Behavior
XFS tries to allocate large contiguous extents to minimize fragmentation. It is also sensitive to how the underlying storage is laid out. Aligning the filesystem to the underlying RAID or SSD characteristics can help performance. When created with modern tools and defaults, XFS usually takes care of reasonable alignment automatically.
XFS has mechanisms for delayed allocation. Instead of allocating blocks at the moment data is written into page cache, it can wait until it has more information about how much data will actually reach disk, which allows it to choose better extents. This improves performance and reduces fragmentation but makes some behaviors during crashes slightly different from a filesystem that allocates immediately.
Administration Notes
You typically create an XFS filesystem with mkfs.xfs. The main maintenance tool is xfs_repair. Unlike EXT4, XFS has no working traditional fsck: the fsck.xfs helper is a deliberate no‑op, and xfs_repair is intended for actual corruption rather than routine checks.
Resizing XFS filesystems is limited to growing: xfs_growfs operates on a mounted filesystem after the underlying block device has been extended, for example with LVM. Shrinking is not supported, so plan capacity ahead and consider logical volume management if you might need to adjust sizes.
XFS supports quotas, project quotas, and various tuning options through mount parameters. Distributions that use XFS by default usually configure it with a profile that matches common server workloads.
Btrfs: Copy‑on‑Write and Advanced Features
Btrfs is a modern copy‑on‑write filesystem that aims to integrate many advanced features into a single coherent design. It can act as both a filesystem and a volume manager. Although it has been in development for years, some distributions still treat it as more specialized than EXT4 or XFS and focus it on specific roles where its features shine.
Copy‑on‑Write Basics
The most distinctive trait of Btrfs is that it uses copy‑on‑write, often abbreviated as CoW. Instead of overwriting data blocks in place, Btrfs writes new versions of blocks to free locations and then updates metadata to point to the new blocks. The old blocks remain untouched until they are no longer referenced and can be reclaimed.
This design enables features such as inexpensive snapshots and efficient cloning of files because the filesystem can share blocks between versions as long as their contents remain identical.
It also means that writes can be more scattered over the device, and special attention is needed to avoid unnecessary overhead in workloads that repeatedly rewrite the same files.
Snapshots and Subvolumes
Btrfs organizes its tree of files and directories into subvolumes. A subvolume looks like a directory when you traverse the filesystem, but from the filesystem point of view it is a separate logical tree that can be configured and managed independently.
You can create snapshots of subvolumes. A snapshot is a copy of the subvolume metadata that initially shares all data blocks with the original. As changes occur in the original or in the snapshot, CoW semantics cause new blocks to be allocated for the changed portions, while unchanged blocks remain shared. This keeps snapshots lightweight when only a fraction of the data changes.
Snapshots can be created as read only or read write. Read only snapshots are useful for backup and rollback. Read write snapshots can be used as the basis for experiments or alternative environments.
These capabilities are often integrated into distribution level tools. For example, some distributions use Btrfs snapshots to allow rollback of system upgrades.
Because snapshots share blocks with their source, deleting files inside a snapshot does not immediately free space that is still referenced elsewhere. Space usage on Btrfs depends on how many snapshots exist and how much of their data remains unique.
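In command form, subvolume and snapshot management might look like the following sketch; /mnt/pool stands for a hypothetical mounted Btrfs filesystem, and the commands require root:

```shell
btrfs subvolume create /mnt/pool/data              # new logical tree
btrfs subvolume snapshot -r /mnt/pool/data \
                            /mnt/pool/data-snap    # read-only snapshot
btrfs subvolume list /mnt/pool                     # enumerate subvolumes
btrfs subvolume delete /mnt/pool/data-snap         # drop the snapshot
```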
Checksumming and Data Integrity
Btrfs stores checksums for both data and metadata. For each block, it computes a checksum and stores it separately. When the block is read, the checksum can be validated. If the filesystem is on top of redundant devices such as a Btrfs RAID configuration, it can sometimes detect corruption and recover from another copy automatically.
This end‑to‑end integrity model is different from traditional journaling filesystems, which typically protect only metadata against sudden interruptions but do not verify data blocks over the long term. With Btrfs, you can run scrub operations to systematically read blocks, verify checksums, and fix errors if possible.
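A scrub is started and monitored with the btrfs tool; /mnt/pool is again a hypothetical mount point and the operation requires root:

```shell
btrfs scrub start /mnt/pool    # read all blocks, verify checksums
btrfs scrub status /mnt/pool   # progress plus corrected/uncorrectable counts
```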
Integrity protection is especially valuable on large storage pools or long lived data where silent bit rot is a concern.
Storage Pooling and RAID Capabilities
Btrfs can manage multiple underlying devices as a single filesystem. Within one Btrfs filesystem you can add devices and configure how data and metadata are replicated or striped across them. This can provide some of the functionality that in traditional setups would require a separate volume manager and a RAID layer.
You can configure different RAID profiles for data and metadata. For example, you might choose a mirrored profile for metadata and a striped or mirrored profile for data. Btrfs supports several redundancy schemes, but details of each RAID level are discussed in another chapter.
This integrated pooling allows online addition of new devices and migration between profiles in some cases. It also means that managing Btrfs at scale requires awareness of both filesystem and storage layout concepts in a combined way.
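As a sketch with hypothetical devices and mount point, pooling and profile changes look like this:

```shell
# Two-disk pool, mirrored metadata and mirrored data:
mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/pool            # either member device mounts the pool
# Grow the pool later and spread existing data onto the new disk:
btrfs device add /dev/sdd /mnt/pool
btrfs balance start /mnt/pool
# Profiles can be converted online, e.g. change the data profile:
btrfs balance start -dconvert=raid10 /mnt/pool
```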
Performance and Workload Considerations
Copy‑on‑write has benefits but also overhead. For workloads that frequently rewrite existing files, such as database files with in‑place updates, pure CoW behavior can cause additional fragmentation and write amplification. Btrfs provides a nodatacow option for specific files or subvolumes to mitigate this, but then those regions behave more like a traditional non‑CoW filesystem and lose some benefits such as checksumming for data.
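In practice, CoW is often disabled per directory with the chattr tool rather than through the nodatacow mount option, so that only the write‑heavy data is affected. A sketch with a hypothetical path:

```shell
mkdir /mnt/pool/db        # future home of database files
chattr +C /mnt/pool/db    # new files created inside inherit No_COW
lsattr -d /mnt/pool/db    # the 'C' attribute should now be listed
# Note: +C only takes effect on empty files, and it disables
# data checksumming for the affected files.
```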
Btrfs performs well for workloads that make use of snapshots, cloning, and copy‑on‑write friendly access patterns, such as virtual machine images, container storage, and backup targets where many versions of similar data coexist.
It also performs adequately for general desktop usage on many distributions that have chosen it as their default, as long as tuning and maintenance practices recommended by those distributions are followed, including regular scrub and balance operations for larger pools.
Administration Notes
You typically create a Btrfs filesystem with mkfs.btrfs. The main tools for administration are grouped under btrfs subcommands, such as btrfs filesystem, btrfs subvolume, btrfs balance, and btrfs scrub. The fsck.btrfs helper is a no‑op; offline repair is handled by btrfs check, which is seldom needed for routine operations because integrity is usually maintained by checksumming and scrub.
Btrfs has many mount options and operational commands. Because of its integrated volume management, tasks such as adding a new disk, changing RAID profiles, or defragmenting the filesystem are often handled with Btrfs native tools rather than external layers.
Comparing EXT4, XFS, and Btrfs
From an administrator’s perspective, the main differences between EXT4, XFS, and Btrfs can be summarized across a few key dimensions.
In terms of maturity and simplicity, EXT4 is the most conservative choice. It has fewer advanced features but a very long track record, stable defaults, and wide tooling support.
In terms of scalability and high performance on large filesystems, XFS is usually superior, especially for very large volumes and parallel I/O workloads. It lacks integrated snapshots and copy‑on‑write but excels at streaming and managing big data sets.
In terms of features, Btrfs is the most ambitious. It combines copy‑on‑write, snapshots, checksums for data and metadata, integrated pooling, and flexible subvolumes. These features are powerful but require more understanding to administer effectively.
You should also consider how your distribution integrates each filesystem. Some distributions use EXT4 almost everywhere. Others offer XFS as the default for server installations. Some integrate Btrfs deeply with tools for system snapshots and rollback. In practice, the distribution’s tested combinations and tooling can be as important as theoretical capabilities.
Choose the filesystem according to workload and operational needs. For generic systems without special requirements, EXT4 is a safe default. For large, high throughput servers, XFS is often preferred. For systems that need built in snapshots, checksums, and pooling, Btrfs is the natural candidate.
Practical Selection Guidelines
When you plan a new system and must choose between these filesystems, think about specific requirements:
If you need something straightforward for a general desktop or small server and want maximum compatibility with guides and tools, choose EXT4.
If you manage large storage for media, backups, or heavy concurrent write workloads and you will not rely on integrated snapshots or copy‑on‑write, consider XFS.
If you specifically want filesystem snapshots for rollback, end‑to‑end checksums, lightweight cloning, or integrated multi device management, and especially if your distribution documents Btrfs as a supported default, Btrfs can be the most appropriate choice.
You can mix filesystems on different partitions of the same system. For example, a server can have an EXT4 root filesystem, an XFS partition for large data, and a Btrfs filesystem for virtual machine images. The kernel and tools can handle multiple filesystems at once as long as each is mounted on its own mount point.
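Such a mixed layout is expressed in /etc/fstab like any other; the devices and mount points below are hypothetical. Note the last field: XFS and Btrfs do not use a boot‑time fsck, so their pass number is 0:

```
# /etc/fstab (fragment) — hypothetical devices
/dev/sda2  /              ext4   defaults          0 1
/dev/sdb1  /srv/data      xfs    defaults,noatime  0 0
/dev/sdc1  /var/lib/vms   btrfs  defaults          0 0
```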
Conclusion
EXT4, XFS, and Btrfs all provide solid foundations for Linux storage, but they are optimized for different priorities. EXT4 offers stability and simplicity, XFS offers scalability and performance on large workloads, and Btrfs offers advanced features such as snapshots, checksumming, and integrated storage pooling.
Understanding these characteristics helps you make informed decisions when provisioning new systems, planning data layouts, and responding to performance or reliability requirements.