Understanding Boot Failures
When a Linux system fails to boot correctly, the problem usually lies in one of these stages (covered conceptually in the parent chapter):
- Firmware / Bootloader stage (BIOS/UEFI, GRUB)
- Kernel loading and initramfs
- Hand‑off to PID 1 (typically systemd) and user space
For troubleshooting, your job is to:
- Identify at which stage the failure happens.
- Collect error messages.
- Apply stage‑appropriate recovery steps.
This chapter focuses on practical diagnostic and recovery techniques for each stage, not on re‑explaining how the boot process works.
General Strategy for Troubleshooting Boot
- Observe what you see:
  - Nothing on screen / firmware only → firmware or disk/bootloader problem.
  - GRUB menu or GRUB error → bootloader stage.
  - Kernel messages then panic / stuck → kernel or initramfs.
  - Systemd messages then hang / login not available → user space / services.
- Get more information:
  - Remove quiet and splash from the kernel command line to show verbose output.
  - Use recovery / rescue / live media when the system can’t boot at all.
- Make minimal changes:
  - Prefer temporary changes from GRUB command line first.
  - Use chroot from a live system for permanent repair.
Common Symptoms and Where to Look
- “No bootable device” / straight to firmware: disk not seen or no bootloader.
- “GRUB rescue>” / “error: file not found” (GRUB): GRUB misconfigured or missing.
- “Kernel panic – not syncing” early: kernel or initramfs issue.
- “cannot mount root fs” / “VFS: Unable to mount root fs”: wrong root device, missing drivers.
- Boot hangs with last message about a service: user‑space / systemd issue.
- Black screen after graphical splash: graphics, display manager, or desktop login issue.
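The symptom list above can be sketched as a small lookup. This is only an illustration: the function name classify_boot_symptom and the stage labels are hypothetical, and the patterns cover just the messages quoted in this section.

```shell
# Hypothetical helper: map a boot error message to the stage to investigate.
# Patterns are checked in order, so the more specific ones come first.
classify_boot_symptom() {
    case "$1" in
        *"No bootable device"*)
            echo "firmware" ;;
        *"grub rescue"*|*"GRUB rescue"*|*"error: file"*|*"error: no such device"*)
            echo "bootloader" ;;
        *"Unable to mount root fs"*|*"cannot mount root fs"*|*"Kernel panic"*)
            echo "kernel" ;;
        *"start job is running"*)
            echo "userspace" ;;
        *)
            echo "unknown" ;;
    esac
}
```

Typing in the last message you saw on screen, for example classify_boot_symptom "Kernel panic - not syncing", gives a first guess at which of the following sections to read.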
Using GRUB for On‑the‑Fly Troubleshooting
Editing Kernel Parameters Temporarily
- At the GRUB menu, highlight your Linux entry.
- Press e to edit.
- Find the line starting with linux or linuxefi.
- At the end of this line, you can:
  - Remove: quiet splash
  - Add useful options:
    - systemd.unit=multi-user.target (boot to text console)
    - systemd.unit=rescue.target (single‑user mode)
    - systemd.unit=emergency.target (minimal environment)
    - nomodeset (basic video, avoid proprietary/buggy GPU drivers)
    - init=/bin/bash (very low‑level shell; for advanced rescue)
- Press Ctrl+x or F10 to boot with modified options.
These changes are temporary and will be lost on the next reboot. This is ideal for testing whether a parameter fixes the issue.
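To preview what an edited kernel line would look like before typing it at the GRUB prompt, a sketch like the following may help. The function name preview_cmdline is hypothetical; it only manipulates a string and does not change any boot configuration.

```shell
# Hypothetical helper: drop "quiet" and "splash" from a kernel command line
# and append any extra options given as further arguments.
# The body runs in a subshell so that "set -f" does not leak to the caller.
preview_cmdline() (
    set -f                      # no filename expansion while word-splitting
    cmdline=$1
    shift
    out=""
    for word in $cmdline; do
        case "$word" in
            quiet|splash) ;;    # skip the options that hide boot messages
            *) out="${out:+$out }$word" ;;
        esac
    done
    for extra in "$@"; do
        out="${out:+$out }$extra"
    done
    printf '%s\n' "$out"
)
```

On a running system, preview_cmdline "$(cat /proc/cmdline)" systemd.unit=rescue.target prints the line you would end up with after the edits described above.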
Recovering from GRUB Problems
Typical GRUB Errors
- error: no such device <UUID>
- error: file '/boot/grub/x86_64-efi/normal.mod' not found
- Dropped to grub rescue> prompt.
Basic Steps from GRUB Prompt
If you see a grub> prompt (normal mode):
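- List available disks and partitions:
grub> ls
- Inspect the files on a partition (the trailing slash lists its contents; without it, GRUB only prints partition information):
grub> ls (hd0,1)/
- Look for your /boot:
  - When you find a partition whose ls shows /boot or /vmlinuz, note its (hdX,Y).
- Set root and prefix (if /boot is a separate partition, the prefix is (hdX,Y)/grub instead):
grub> set root=(hd0,1)
grub> set prefix=(hd0,1)/boot/grub
grub> insmod normal
grub> normal
If the configuration is mostly intact, this may bring up the GRUB menu.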
Reinstalling GRUB (from a Live System)
When GRUB is broken or missing, you typically:
- Boot from a live USB of the same (or compatible) distribution.
- Identify your root partition:
$ lsblk
- Mount it (assuming root is /dev/sda2):
$ sudo mount /dev/sda2 /mnt
- If /boot is separate (e.g. /dev/sda1):
$ sudo mount /dev/sda1 /mnt/boot
- For UEFI systems, mount the EFI System Partition (e.g. /dev/sda1 or /dev/nvme0n1p1):
$ sudo mount /dev/sda1 /mnt/boot/efi
- Bind mount system directories and chroot:
$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /proc /mnt/proc
$ sudo mount --bind /sys /mnt/sys
$ sudo chroot /mnt
- Reinstall GRUB (examples; on RHEL/Fedora the install command is named grub2-install):
  - BIOS systems (MBR); install to the disk, then regenerate the config with whichever one of the two commands your distribution provides:
# grub-install /dev/sda
# update-grub # Debian/Ubuntu-based
# grub2-mkconfig -o /boot/grub2/grub.cfg # many RHEL/Fedora systems
  - UEFI systems:
# grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB
# update-grub
- Exit the chroot, unmount, reboot:
# exit
$ sudo umount /mnt/dev /mnt/proc /mnt/sys
$ sudo umount /mnt/boot/efi /mnt/boot /mnt # as applicable
$ sudo reboot
Exact commands vary slightly by distribution, but the workflow—mount → chroot → grub-install → regenerate config—is the same.
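Which grub-install variant applies depends on whether the machine uses BIOS or UEFI. A minimal sketch for checking this from the live system follows. It relies on the kernel exposing /sys/firmware/efi only when booted via UEFI; the function name firmware_mode and its optional argument (an alternative sysfs root, used here only to make the check testable) are hypothetical.

```shell
# Hypothetical helper: report how the *currently running* system was booted.
# $1 (optional): sysfs root to inspect, defaults to /sys.
firmware_mode() {
    sysfs_root=${1:-/sys}
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

Note that this reflects how the live USB itself was started. If the installed system uses UEFI, boot the live media in UEFI mode too, otherwise the result (and a UEFI grub-install from the chroot) may not match the installed system.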
Dealing with Kernel and Initramfs Issues
Common Symptoms
- Early kernel panic with messages like:
  - VFS: Unable to mount root fs on unknown-block(0,0)
  - Kernel panic - not syncing: Attempted to kill init!
- Errors about missing root device or drivers.
Quick Tests via GRUB
From the GRUB menu or edit screen:
- Try an older kernel (another menu entry).
- Remove quiet and splash to see full messages.
- Check for obviously wrong options on the linux line:
  - root=UUID=... pointing to a device that doesn’t exist
  - Wrong root=/dev/sdXY or root=/dev/mapper/...
You can temporarily adjust the root= parameter to match the correct device (found with ls in GRUB).
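Once some kernel does boot (an older entry or a rescue boot), it can help to see which root= value was actually used. The sketch below extracts it from a command line string; the function name cmdline_root is hypothetical, and as a simplification it reports only the first root= it finds.

```shell
# Hypothetical helper: print the value of the first root= parameter in a
# kernel command line; exit status 1 if there is none.
# The body runs in a subshell so "set -f" and "exit" stay local.
cmdline_root() (
    set -f                      # no filename expansion while word-splitting
    for word in $1; do
        case "$word" in
            root=*)
                printf '%s\n' "${word#root=}"
                exit 0
                ;;
        esac
    done
    exit 1
)
```

For example, cmdline_root "$(cat /proc/cmdline)" prints something like UUID=... which you can compare against the output of lsblk -f.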
Recreating Initramfs
If the initramfs is corrupted or missing drivers for your root filesystem:
- Boot from:
  - An older working kernel, or
  - A rescue mode, or
  - Live media + chroot as shown for GRUB repair.
- Inside the real system environment:
  - On Debian/Ubuntu:
# update-initramfs -u -k all
  - On Fedora/RHEL (dracut):
# dracut --force # rebuilds for the running kernel; in a chroot, add --kver <version> or use --regenerate-all
- Reboot and test.
Fixing Incorrect Root Device in Config
If /etc/fstab or bootloader configuration points to the wrong root:
- Boot into rescue mode (e.g. systemd.unit=emergency.target) or from live-chroot.
- Check device names and UUIDs:
# lsblk -f
- Compare with /etc/fstab and GRUB config (e.g. /boot/grub/grub.cfg or /boot/grub2/grub.cfg).
- Update entries to match the current devices:
  - Prefer using UUID= rather than /dev/sdX when possible.
- Regenerate GRUB config if necessary:
# update-grub # Debian/Ubuntu-based
# grub2-mkconfig -o /boot/grub2/grub.cfg # many RHEL/Fedora systems
Systemd and User‑Space Boot Problems
Once the kernel starts systemd (or another init), failures often manifest as hanging on certain services, or as dropping into emergency or rescue mode.
Booting into Rescue or Emergency Mode
Use GRUB to add one of these to the kernel line:
- systemd.unit=rescue.target (single‑user, most filesystems mounted)
- systemd.unit=emergency.target (very minimal, usually root only)
In these modes, troubleshoot:
- Broken /etc/fstab
- Services that block boot
- File system problems
Fixing Broken /etc/fstab
A common cause of hanging at boot is an invalid or missing mount device in /etc/fstab.
- In emergency shell, root is often read‑only. Remount it read‑write:
# mount -o remount,rw /
- Inspect problematic entries in /etc/fstab:
# nano /etc/fstab
- Typical issues:
  - Referencing a non‑existent partition.
  - Wrong filesystem type.
  - / (root) line incorrect.
- For non‑critical mounts, temporarily:
  - Comment them with #, or
  - Add nofail or noauto options.
- Save, then reboot:
# reboot
If the error message mentions “A start job is running for …” pointing to a mount, that’s your clue to check /etc/fstab.
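Finding the fstab entry that references a non‑existent partition can be partly automated. The following is a rough sketch, not a complete fstab checker: the function name fstab_missing_uuids is hypothetical, and it only looks at UUID= entries (LABEL=, PARTUUID= and plain device paths are ignored).

```shell
# Hypothetical helper: list UUID= entries in an fstab-format file whose UUID
# is not in a list of known UUIDs.
# $1: fstab-format file, $2: file with one known UUID per line.
# Prints "<uuid> <mount point>" for every entry that has no matching device.
fstab_missing_uuids() {
    awk '
        FILENAME == ARGV[1] { if ($1 != "") known[$1] = 1; next }
        /^[ \t]*#/          { next }    # skip comment lines in fstab
        $1 ~ /^UUID=/       {
            uuid = substr($1, 6)        # text after "UUID="
            gsub(/"/, "", uuid)         # tolerate UUID="..." quoting
            if (!(uuid in known)) print uuid, $2
        }
    ' "$2" "$1"
}
```

One way to use it is to save the UUIDs that actually exist with lsblk -no UUID > /tmp/known-uuids and then run fstab_missing_uuids /etc/fstab /tmp/known-uuids. Any line it prints is a candidate for commenting out or marking nofail.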
Identifying Problematic Services
If boot seems to hang on a service:
- From a working or rescue boot, run:
# systemctl --failed
# systemctl status <service-name>
- Disable a problematic service:
# systemctl disable <service-name>
# systemctl mask <service-name> # prevent it from starting at all
- After masking/disabling, reboot and see if boot completes.
For systems that almost boot but drop to console, journalctl -b is invaluable for reviewing the current boot’s logs; journalctl -b -1 shows the previous boot if the journal is stored persistently.
Using Live Media and Chroot for Recovery
When the system won’t boot at all, a live USB is your main tool.
Standard chroot Procedure
- Boot live environment.
- Identify your Linux partitions:
$ lsblk -f
- Mount them under /mnt:
$ sudo mount /dev/sdXn /mnt # root partition
$ sudo mount /dev/sdYm /mnt/boot # if separate
$ sudo mount /dev/sdZp /mnt/boot/efi # if UEFI
- Bind mount system directories:
$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /proc /mnt/proc
$ sudo mount --bind /sys /mnt/sys
- Enter chroot:
$ sudo chroot /mnt
- Now you can:
  - Reinstall GRUB
  - Regenerate initramfs
  - Edit /etc/fstab
  - Remove or reconfigure problematic packages/services
- When done:
# exit
$ sudo umount /mnt/dev /mnt/proc /mnt/sys
$ sudo umount /mnt/boot/efi /mnt/boot /mnt # as applicable
$ sudo reboot
File System Corruption and fsck
File system problems often cause boot failures or drops to emergency mode with messages suggesting fsck.
Running fsck Safely
- Never run fsck on a mounted read‑write filesystem.
- Best practice:
  - Boot from a live USB, or
  - Boot into emergency mode where affected filesystem is not in use.
Example (for /dev/sda2):
# fsck -f /dev/sda2
Options:
- -f: force a full check
- Many tools (e.g. e2fsck for ext4) will ask before fixing; use -y to auto‑answer yes (be cautious).
If the root filesystem is dirty and the system suggests fsck, you can also:
- Add fsck.mode=force fsck.repair=yes to the kernel command line temporarily.
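The rule about never checking a mounted filesystem can be enforced with a small guard. The sketch below is a dry run that only prints the fsck command it would run. The names is_mounted and safe_fsck are hypothetical, and the check is a simple exact match on the device column of /proc/mounts, so it may miss devices that are mounted under a different name (for example via /dev/mapper or a symlink).

```shell
# Hypothetical helper: succeed if the device appears in the mounts table.
# $1: device path, $2 (optional): mounts table, defaults to /proc/mounts.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical helper: refuse to check a mounted device; otherwise print the
# command that would be run (dry run, nothing is executed).
safe_fsck() {
    if is_mounted "$1" "$2"; then
        echo "refusing: $1 is mounted" >&2
        return 1
    fi
    echo "fsck -f $1"
}
```

From a live USB, safe_fsck /dev/sda2 would print fsck -f /dev/sda2 as long as the partition has not been mounted. Run the printed command yourself once you are satisfied it is the right device.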
Recovering from Misconfigured Graphics / Display Manager
If the system appears to boot (you see kernel / systemd messages) but ends with a black screen:
- Try text mode boot:
  - Edit GRUB kernel line to add systemd.unit=multi-user.target.
  - Or add nomodeset to disable kernel mode setting and fall back to basic video.
Once at a text login:
- Check display manager/service:
# systemctl status display-manager
# journalctl -u display-manager -b
- Disable problematic display manager and use another (e.g., switch from GDM to LightDM).
- Remove or reconfigure problematic proprietary drivers (NVIDIA, etc.).
Dealing with Boot After Hardware Changes
Common scenarios:
- Disk reordering (new disk added).
- Moving disk to another machine.
- Changing storage controllers (IDE/AHCI/RAID mode).
Checklist:
- Use lsblk -f to confirm device names and UUIDs.
- Align:
  - /etc/fstab entries (prefer UUID= or LABEL=).
  - GRUB root configuration (regenerate config).
- If moved to new hardware and missing drivers:
  - Rebuild initramfs (update-initramfs or dracut).
  - Ensure modules for new controller/filesystem are included.
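To check whether a given driver made it into the rebuilt initramfs, you can filter the image's file listing. This is a sketch under assumptions: the function name initramfs_has_module is hypothetical, the listing is expected on standard input (for example from lsinitramfs on Debian/Ubuntu or lsinitrd on dracut-based systems), and the module file is assumed to be named exactly <name>.ko with an optional compression suffix. Drivers built into the kernel will not show up this way.

```shell
# Hypothetical helper: read an initramfs file listing on stdin and succeed
# if it contains a kernel module file named "$1.ko" (optionally compressed,
# e.g. .ko.xz or .ko.zst).
initramfs_has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}
```

For example, on Debian/Ubuntu: lsinitramfs /boot/initrd.img-$(uname -r) | initramfs_has_module ahci && echo present. If the module name uses underscores, the file name may use hyphens instead, so try both spellings.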
Using Logs to Understand Past Boot Failures
When the system eventually boots (even with issues), you can review previous failed boots via journalctl:
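- Current boot:
$ journalctl -b
- Previous boot:
$ journalctl -b -1
- Filter by service:
$ journalctl -u <service-name> -b
- Show only errors and higher:
$ journalctl -p err -b
Logs from earlier boots are only available when the journal is stored persistently (for example, when /var/log/journal exists). This helps trace what failed and when, especially for intermittent boot issues.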
Safe Practices and Recovery Mindset
- Keep a live USB handy for your main distribution.
- Avoid random edits to bootloader or /etc/fstab without notes; document what you change.
- Make regular backups of:
  - /etc
  - /boot
  - Important user data
- When stuck:
  - Identify the stage (firmware, GRUB, kernel, systemd).
  - Collect exact messages.
  - Test minimal, reversible fixes (temporary kernel params, disabling one service).
  - Only then commit to permanent configuration changes.
With these techniques, you’ll be able to systematically approach most Linux boot issues and bring systems back without resorting to full reinstallation.