Why Storage Matters in Linux Administration
In day‑to‑day Linux administration, storage is one of the most critical areas you’ll manage. Everything from user data and databases to logs and backups ultimately lives on some kind of storage. Mismanaging it can lead to:
- Full disks that crash services
- Data loss or corruption
- Slow systems and I/O bottlenecks
- Failed backups and restores
This chapter gives you the big picture of storage and filesystems on a Linux system, so the later chapters in this section (devices/partitions, mounting, disk tools, archiving) make sense in context.
You won’t learn specific commands in depth here—that’s for the child chapters—but you’ll understand the concepts they operate on.
Basic Storage Building Blocks
Think of Linux storage as a stack of layers:
- Physical storage devices
- Partitions and volume managers
- Filesystems
- Mount points and directory layout
- Data and applications
Physical Storage Devices
At the lowest level are devices you plug into servers or virtual machines:
- HDDs (Hard Disk Drives)
- Spinning platters, mechanical head
- Cheaper, larger capacities
- Higher latency, slower random access
- Good for archives, logs, backups
- SSDs (Solid State Drives)
- Flash memory, no moving parts
- Much faster, especially random I/O
- Limited write endurance (but good enough for most use)
- Ideal for OS, databases, VMs
- NVMe drives
- SSDs using PCIe instead of SATA
- Much higher bandwidth and lower latency
- Common in newer servers and laptops
- Removable media
- USB sticks, external drives, SD cards
- Used for installation, offline backups, or transferring data
In Linux, these appear as device files, typically under /dev (details are covered in “Devices and partitions”).
Partitions, LVM, and RAID (Conceptual Overview)
Between the raw device and your filesystem, you often have some abstraction:
- Partitions
- Divide a physical disk into logical sections
- Each partition can hold a filesystem or be part of a volume/RAID
- Useful to separate system data from user data or logs
- LVM (Logical Volume Manager) (covered in depth later)
- Lets you group physical storage into flexible logical volumes
- You can resize volumes, create snapshots, and move data between disks
- RAID
- Combines multiple physical disks for redundancy and/or performance
- Levels like RAID 1, 5, 10 have different trade‑offs
- Implemented in hardware (RAID controllers) or software (e.g. mdadm)
You don’t need to be an expert yet—just recognize that a filesystem might sit on top of a plain partition, an LVM logical volume, or a RAID array.
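To make the RAID trade-offs concrete, here is a minimal sketch of how usable capacity differs between levels. It assumes equal-sized disks and the textbook definitions (RAID 1 as a full mirror, RAID 5 with one disk's worth of parity, RAID 10 as striped two-way mirrors). The function name usable_capacity and the disk sizes are made up for illustration; real arrays lose a little more space to metadata.

```python
def usable_capacity(level: str, disk_count: int, disk_size_gb: float) -> float:
    """Approximate usable capacity in GB for an array of equal-sized disks."""
    if level == "raid1":
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds a full copy, so capacity equals a single disk.
        return disk_size_gb
    if level == "raid5":
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space is used for parity.
        return (disk_count - 1) * disk_size_gb
    if level == "raid10":
        if disk_count < 4 or disk_count % 2:
            raise ValueError("this sketch assumes an even count of 4+ disks")
        # Striped two-way mirrors: half of the raw space is usable.
        return disk_count / 2 * disk_size_gb
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level, count in [("raid1", 2), ("raid5", 4), ("raid10", 4)]:
        print(level, count, "disks:", usable_capacity(level, count, 1000.0), "GB")
```

With four 1000 GB disks, RAID 5 keeps three disks' worth of space while RAID 10 keeps two, which is the capacity side of the redundancy/performance trade-off.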
What Is a Filesystem (in Practice)?
A filesystem is the structure that organizes how data is stored and retrieved on a storage device or volume.
Conceptually, a filesystem provides:
- A directory tree: hierarchical structure (/, /home, /var/log, …)
- Metadata: ownership, permissions, timestamps, file size, etc.
- Data storage: where the actual content of files lives
- Allocation strategies: how to place data on disk for performance and reliability
- Consistency mechanisms: journaling, checksums, copy‑on‑write, etc.
Different filesystems make different trade‑offs in performance, reliability, features, and complexity. Later in this section you’ll see specific Linux filesystems (like EXT4, XFS, Btrfs), but here we’ll focus on shared concepts.
Key Filesystem Concepts
You’ll see these terms often when dealing with filesystems:
- Blocks
- Basic unit of storage managed by a filesystem
- Typical size: 4 KiB (but can vary)
- Files bigger than one block are stored in multiple blocks
- Inodes
- Data structures storing metadata:
- Owner, group
- Permissions (rwx)
- Timestamps (modified/accessed/status changed; some filesystems also record creation time)
- File size
- Pointers to data blocks
- Directories map filenames to inode numbers
- Journaling
- A log of filesystem changes, used to recover after crashes
- Reduces risk of corruption and long “fsck” times
- Many modern Linux filesystems are journaling filesystems (e.g. EXT4, XFS)
- Mounting
- Attaching a filesystem to a directory (mount point) in the single unified tree
- For example, /dev/sdb1 mounted on /data
The chapter “Mounting and unmounting” will handle the practical side; here, you just need to know that a filesystem is “usable” only when it is mounted somewhere.
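The block and inode concepts above can be observed from user space. The following sketch uses Python's os.stat() to read a few inode fields for a temporary file and estimates how many blocks its content needs. The 4 KiB block size is an assumption taken from the typical value mentioned above, and the helper names blocks_needed and describe are illustrative only.

```python
import os
import stat
import tempfile

BLOCK_SIZE = 4096  # assumed block size (4 KiB); real filesystems may differ


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks required to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size


def describe(path: str) -> dict:
    """Collect a few fields that the filesystem keeps in the file's inode."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,
        "size": info.st_size,
        "permissions": stat.filemode(info.st_mode),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "modified": info.st_mtime,
        "blocks_needed": blocks_needed(info.st_size),
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 10000)  # larger than two 4 KiB blocks
        path = tmp.name
    try:
        for key, value in describe(path).items():
            print(f"{key}: {value}")
    finally:
        os.remove(path)
```

Note that the filename is not among the fields returned by os.stat(): as described above, names live in directories, which map them to inode numbers.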
The Storage Stack in Linux
It helps to visualize the full path from hardware to files:
$$
\text{Application} \rightarrow \text{File} \rightarrow \text{VFS} \rightarrow \text{Filesystem} \rightarrow \text{Block Layer} \rightarrow \text{Device}
$$
Breaking this down:
- Applications
- Use system calls like open(), read(), write() via libraries and shells
- They work with paths like /var/log/syslog, not devices
- Virtual Filesystem (VFS)
- Kernel layer that provides a common interface to all filesystems
- Makes different filesystem types and devices appear uniform
- Filesystem driver
- Code in the kernel that knows how to read/write a specific filesystem type (e.g. EXT4 driver, XFS driver)
- Block layer
- Generic layer for reading/writing fixed‑size blocks from/to block devices
- Block device
- Physical disk, partition, LVM volume, RAID device, etc.
Understanding this hierarchy is important when troubleshooting: a problem could be at any layer—application, filesystem, device, or hardware.
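From the application's point of view, the whole stack is hidden behind a path and a handful of system calls. The sketch below uses Python's thin wrappers os.open(), os.write(), and os.read() to show that the program never names a device. The function write_then_read is a hypothetical helper, and it assumes a small payload so that a single write and a single read are enough.

```python
import os
import tempfile


def write_then_read(path: str, payload: bytes) -> bytes:
    """Write payload to path and read it back using low-level calls."""
    # The application only supplies a path; VFS, the filesystem driver,
    # and the block layer decide where the bytes end up.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # small payload: assumed to be written in one call
        os.fsync(fd)           # ask the kernel to flush the data to the device
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "example.log")
        print(write_then_read(target, b"hello storage stack\n"))
        # st_dev identifies the filesystem (device) that holds the file.
        print("device id:", os.stat(target).st_dev)
```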
Common Storage Use Cases and Layouts
On a real system, you rarely have “one big disk with one filesystem”. Instead, you design a layout that fits your needs.
Typical Server Layout (Conceptual)
A common simple scheme might look like:
- / on one filesystem (OS and tools)
- /home on another filesystem (user data)
- /var or /var/log on its own filesystem (logs, variable data)
- /srv or /data on a dedicated data filesystem
Why separate?
- Prevent logs filling up the whole disk
- Allow different mount options (e.g. noexec on some directories)
- Place heavy‑I/O data on faster or dedicated storage
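A layout like this shows up in the kernel's mount table, where each line lists a device, mount point, filesystem type, and comma-separated options. The sketch below parses text in that format (the same column order used by /proc/mounts) and finds the mount points that carry a given option. The SAMPLE data is invented for illustration, and the parser is simplified: it does not decode escaped characters in paths.

```python
from typing import List, NamedTuple, Tuple


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: Tuple[str, ...]


def parse_mounts(text: str) -> List[Mount]:
    """Parse lines of 'device mount_point fs_type options ...'."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append(Mount(device, mount_point, fs_type, tuple(options.split(","))))
    return mounts


def mounts_with_option(mounts: List[Mount], option: str) -> List[Mount]:
    """Return the mounts whose option list contains the given option."""
    return [m for m in mounts if option in m.options]


# Hypothetical layout, invented for this example.
SAMPLE = """\
/dev/sda2 / ext4 rw,relatime 0 0
/dev/sda3 /home ext4 rw,nosuid,nodev 0 0
/dev/sdb1 /var/log xfs rw,noexec,nosuid 0 0
"""

if __name__ == "__main__":
    for mount in mounts_with_option(parse_mounts(SAMPLE), "noexec"):
        print(mount.mount_point, "is mounted noexec on", mount.device)
```

On a live Linux system, the same function could be fed the contents of /proc/mounts instead of the sample string.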
Storage Types by Use Case
Different data has different requirements:
- OS and system binaries
- Need reliability and read speed
- Usually on SSD/NVMe
- Often standard filesystem like EXT4 or XFS
- Databases
- Very sensitive to latency and I/O patterns
- Often use fast SSD/NVMe
- Filesystem tuned for small random writes and fsync performance
- Backups and archives
- Capacity more important than speed
- Often HDDs, possibly large RAID arrays
- Sometimes deduplication or compression (via filesystem or tools)
- Logs
- Constantly appended to
- Can fill disks quickly
- Usually placed on partitions that won’t impact system stability if full
As an administrator, you design both physical placement (which disk or RAID) and logical layout (which mount points, filesystem options) based on these needs.
Performance, Reliability, and Trade‑offs
Storage decisions always involve trade‑offs between:
- Performance
- Throughput (MB/s)
- IOPS (I/O operations per second)
- Latency (time per operation)
- Capacity
- How much data you can store
- Reliability
- How well data is preserved despite crashes or failures
- Consistency guarantees after unexpected power loss
- Complexity and manageability
- How hard it is to configure, monitor, and troubleshoot
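The three performance metrics above are related by simple arithmetic: throughput is roughly IOPS multiplied by the size of each I/O, and with only one request in flight at a time, IOPS is roughly the inverse of latency. The sketch below works through that, with function names and example figures that are illustrative assumptions rather than measurements of any real device.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS rate and I/O size in KiB."""
    return iops * io_size_kib / 1024


def iops_at_queue_depth_one(latency_ms: float) -> float:
    """Upper bound on IOPS when each request waits for the previous one."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Assumed figures: about 10 ms per random access for a spinning disk,
    # about 0.1 ms for a flash device.
    for name, latency_ms in [("slow device", 10.0), ("fast device", 0.1)]:
        iops = iops_at_queue_depth_one(latency_ms)
        mib = throughput_mib_per_s(iops, io_size_kib=4)
        print(f"{name}: {iops:.0f} IOPS, {mib:.2f} MiB/s at 4 KiB per request")
```

This is why a disk with respectable sequential throughput can still feel slow under small random I/O: at 4 KiB per request, a low IOPS figure translates into very little data per second.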
Performance Factors (High‑Level)
Performance depends on:
- Type of media (HDD vs SSD vs NVMe)
- Filesystem design and options
- Access pattern:
- Sequential vs random
- Large files vs many small files
- Kernel tunables and I/O scheduler
- Underlying abstraction:
- RAID level
- LVM layering
- Network storage
Later chapters on disk usage tools and monitoring will help you measure these in practice.
Reliability and Data Integrity
Filesystems and storage stacks use several mechanisms to protect data:
- Journaling
- Logs metadata (and sometimes data) changes before committing them
- Helps recover after crashes with minimal corruption
- Checksumming and copy‑on‑write (COW)
- Some filesystems verify data with checksums
- Copy‑on‑write ensures that old data is preserved until new data is safely written
- RAID and redundancy
- Protect against disk failure (but not user errors, bugs, or malware)
- Still need backups
- Snapshots
- Point‑in‑time views of data
- Useful for quick rollbacks and backups
Long‑term safety requires backups and sometimes offsite copies, which are addressed in the “Backup and Restore” section.
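The journaling idea can be illustrated with a toy model. The class below is not how EXT4 or XFS implement their journals; it is a deliberately simplified in-memory sketch in which a change is first logged, then marked committed, and only then applied. The class name ToyJournaledStore and the crash_* parameters are invented for this example to simulate a crash at different points.

```python
class ToyJournaledStore:
    """In-memory stand-in for a store protected by a redo journal."""

    def __init__(self):
        self.data = {}     # stands in for the main on-disk structures
        self.journal = []  # stands in for the on-disk journal

    def write(self, updates, crash_before_commit=False, crash_before_apply=False):
        txn = {"updates": dict(updates), "committed": False}
        self.journal.append(txn)      # 1. log the intended change
        if crash_before_commit:
            return                    # simulated crash: change never committed
        txn["committed"] = True       # 2. write the commit record
        if crash_before_apply:
            return                    # simulated crash: committed but not applied
        self.data.update(txn["updates"])  # 3. apply to the main structures
        self.journal.remove(txn)          # 4. the journal entry is no longer needed

    def recover(self):
        """Replay committed transactions and discard uncommitted ones."""
        for txn in self.journal:
            if txn["committed"]:
                self.data.update(txn["updates"])
        self.journal.clear()


if __name__ == "__main__":
    store = ToyJournaledStore()
    store.write({"a": 1})
    store.write({"a": 2, "b": 3}, crash_before_commit=True)
    store.recover()
    print(store.data)  # the uncommitted change is discarded as a whole
    store.write({"a": 2, "b": 3}, crash_before_apply=True)
    store.recover()
    print(store.data)  # the committed change is replayed as a whole
```

The point of the design is that after recovery a multi-part change is either fully present or fully absent, never half-applied. It does not protect against a failed disk or a deleted file, which is why the other mechanisms in the list still matter.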
Local vs Network Storage
Not all storage is physically attached to your machine.
Local Storage
- Directly connected devices:
- SATA/NVMe disks
- Local RAID controllers
- Lowest latency
- Generally simplest to manage
- Used for OS, local data, and high‑performance workloads
Network Storage (Overview Only)
Linux can use network‑hosted storage as if it were local:
- File‑level protocols
- Export directories over the network (NFS, Samba)
- Mounted on clients as normal directories
- Permissions and performance depend on server and network
- Block‑level over network
- iSCSI, Fibre Channel, etc.
- Appear as block devices to the OS; you create filesystems on them as usual
These are covered in much more detail in later “Network Services” chapters (e.g., NFS, Samba). At this level, you just need to know that a “disk” might be a network device, not a local one.
Managing Storage Over Time
Storage management is not a one‑time activity during installation; it’s an ongoing responsibility.
Key recurring tasks include:
- Monitoring free space
- Directories like /var, /tmp, and user home directories can grow unexpectedly
- Logs and databases are common culprits
- Extending capacity
- Adding new disks or enlarging LVM volumes
- Creating new filesystems and mount points when needed
- Resizing filesystems
- Many filesystems can grow while mounted; shrinking is less widely supported (XFS, for example, cannot be shrunk)
- Often involves adjusting underlying partitions or logical volumes first
- Cleaning up
- Rotating logs
- Removing old backups or temporary files
- Archiving old data to cheaper storage
- Checking and repairing filesystems
- Periodic integrity checks
- Running filesystem repair tools after failures or improper shutdowns
Those practical tools and workflows will be covered in the subsequent chapters in this section.
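As a small preview of the first task, monitoring free space, here is a minimal sketch that uses Python's shutil.disk_usage() to flag filesystems above a usage threshold. The 90% threshold, the list of paths, and the function names are assumptions chosen for illustration; a real setup would normally rely on a dedicated monitoring tool.

```python
import shutil
from typing import Iterable, List, Tuple


def percent_used(total: int, used: int) -> float:
    """Used space as a percentage of total (0.0 if total is zero)."""
    return used / total * 100 if total else 0.0


def check_paths(paths: Iterable[str], threshold_percent: float = 90.0) -> List[Tuple[str, float]]:
    """Return (path, percent used) for each path at or above the threshold."""
    warnings = []
    for path in paths:
        usage = shutil.disk_usage(path)  # reports the filesystem containing path
        pct = percent_used(usage.total, usage.used)
        if pct >= threshold_percent:
            warnings.append((path, pct))
    return warnings


if __name__ == "__main__":
    # Assumed paths; adjust to the mount points that exist on your system.
    for path, pct in check_paths(["/"], threshold_percent=90.0):
        print(f"WARNING: filesystem holding {path} is {pct:.1f}% full")
```

Because the check is per filesystem, two paths on the same filesystem report the same figure, which is one more reason the separate mount points discussed earlier make monitoring more meaningful.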
How This Chapter Fits with the Rest
This chapter gave you the “map” of Linux storage:
- How physical devices, partitions, LVM, RAID, and filesystems stack together
- What a filesystem does conceptually
- Why layouts and mount points matter
- The main trade‑offs around performance and reliability
Next, the child chapters in “Storage and Filesystems” will drill into specific topics:
- Devices and partitions: how Linux represents disks, and how to slice them
- Filesystems (EXT4, XFS, Btrfs): concrete types and their strengths
- Mounting and unmounting: making storage visible in the directory tree
- Disk usage tools: inspecting where your space goes and how disks perform
- Archiving and compression: how to store and move data efficiently
Keep this mental model of the storage stack in mind as you learn the individual components—it will make the commands and tools much easier to understand.