4.2.5 Troubleshooting boot issues

Understanding Boot Failures

When a Linux system fails to boot correctly, the problem usually lies in one of these stages (covered conceptually in the parent chapter):

Firmware / Bootloader stage (BIOS/UEFI, GRUB)
Kernel loading and initramfs
Hand‑off to PID 1 (typically systemd) and user space

For troubleshooting, your job is to:

Identify at which stage the failure happens.
Collect error messages.
Apply stage‑appropriate recovery steps.

This chapter focuses on practical diagnostic and recovery techniques for each stage, not on re‑explaining how the boot process works.

General Strategy for Troubleshooting Boot

Observe what you see:

Nothing on screen / firmware only → firmware or disk/bootloader problem.
GRUB menu or GRUB error → bootloader stage.
Kernel messages then panic / stuck → kernel or initramfs.
Systemd messages then hang / login not available → user space / services.

Get more information:

Remove quiet and splash from the kernel command line to show verbose output.
Use recovery / rescue / live media when the system can’t boot at all.

Make minimal changes:

Prefer temporary changes made from the GRUB edit screen first.
Use chroot from a live system for permanent repair.

Common Symptoms and Where to Look

“No bootable device” / straight to firmware: disk not seen or no bootloader.
“GRUB rescue>” / “error: file not found” (GRUB): GRUB misconfigured or missing.
“Kernel panic – not syncing” early: kernel or initramfs issue.
“cannot mount root fs” / “VFS: Unable to mount root fs”: wrong root device, missing drivers.
Boot hangs with last message about a service: user‑space / systemd issue.
Black screen after graphical splash: graphics, display manager, or desktop login issue.
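The symptom-to-stage mapping above can be sketched as a small lookup. This is a minimal illustration, not an exhaustive classifier: boot_stage_hint is a hypothetical helper name, and the patterns are only the example messages from this section plus the common systemd "Failed to start" prefix.

```shell
# Map a boot-time message to the stage most likely at fault.
# boot_stage_hint is a hypothetical helper; extend the patterns as needed.
boot_stage_hint() {
    case "$1" in
        *"No bootable device"*)
            echo "firmware/disk" ;;
        *"grub rescue"*|*"error: file"*|*"error: no such device"*)
            echo "bootloader" ;;
        # Checked before the generic panic pattern, because a root-mount
        # failure usually appears inside a kernel panic message.
        *"Unable to mount root fs"*|*"cannot mount root"*)
            echo "kernel/initramfs (root device)" ;;
        *"Kernel panic"*)
            echo "kernel/initramfs" ;;
        *"A start job is running"*|*"Failed to start"*)
            echo "systemd/user space" ;;
        *)
            echo "unknown" ;;
    esac
}
```

For example, boot_stage_hint 'grub rescue>' prints bootloader.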

Using GRUB for On‑the‑Fly Troubleshooting

Editing Kernel Parameters Temporarily

At the GRUB menu, highlight your Linux entry.
Press e to edit.
Find the line starting with linux or linuxefi.
At the end of this line, you can:

Remove: quiet splash
Add useful options:

systemd.unit=multi-user.target (boot to text console)
systemd.unit=rescue.target (single‑user mode)
systemd.unit=emergency.target (minimal environment)
nomodeset (disable kernel mode setting and fall back to basic video; works around buggy GPU drivers)
init=/bin/bash (very low‑level shell; for advanced rescue)

Press Ctrl+x or F10 to boot with modified options.

These changes are temporary and will be lost on the next reboot. This is ideal for testing whether a parameter fixes the issue.
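To preview what an edited kernel line would look like before typing it at the GRUB edit screen, the transformation can be sketched in shell. verbose_cmdline is a hypothetical helper, and the sample command line in the usage note is made up for illustration.

```shell
# Print a kernel command line with quiet/splash removed and extra options appended.
# Usage: verbose_cmdline "<current command line>" [extra option ...]
# The body runs in a subshell so that disabling globbing does not leak out.
verbose_cmdline() (
    set -f                      # keep words such as "foo=*" from being glob-expanded
    cmdline=$1
    shift
    out=""
    for word in $cmdline; do
        case "$word" in
            quiet|splash) ;;    # drop the options that hide boot messages
            *) out="${out:+$out }$word" ;;
        esac
    done
    for extra in "$@"; do
        out="${out:+$out }$extra"
    done
    printf '%s\n' "$out"
)
```

On a running system, verbose_cmdline "$(cat /proc/cmdline)" systemd.unit=rescue.target shows the current parameters with the quieting options stripped and the rescue target added. Reading /proc/cmdline after a test boot also confirms which parameters were actually applied.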

Recovering from GRUB Problems

Typical GRUB Errors

error: no such device <UUID>
error: file '/boot/grub/x86_64-efi/normal.mod' not found
Dropped to grub rescue> prompt.

Basic Steps from GRUB Prompt

If you land at a grub> or grub rescue> prompt instead of a menu:

List available disks and partitions:

   grub> ls

Inspect a partition (the trailing slash lists its files rather than just the partition information):

   grub> ls (hd0,1)/

Look for your /boot:

When you find a partition whose listing shows boot/ or a vmlinuz file, note its (hdX,Y).

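Set root and prefix (if /boot is a separate partition, the prefix is (hd0,1)/grub; RHEL/Fedora systems use grub2 instead of grub):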

   grub> set root=(hd0,1)
   grub> set prefix=(hd0,1)/boot/grub
   grub> insmod normal
   grub> normal

If the configuration is mostly intact, this may bring up the GRUB menu.

Reinstalling GRUB (from a Live System)

When GRUB is broken or missing, you typically:

Boot from a live USB of the same (or compatible) distribution.
Identify your root partition:

   $ lsblk

Mount it (assuming root is /dev/sda2):

   $ sudo mount /dev/sda2 /mnt

If /boot is separate (e.g. /dev/sda1):

   $ sudo mount /dev/sda1 /mnt/boot

For UEFI systems, mount the EFI System Partition (e.g. /dev/sda1 or /dev/nvme0n1p1):

   $ sudo mount /dev/sda1 /mnt/boot/efi

Bind mount system directories and chroot:

   $ sudo mount --bind /dev  /mnt/dev
   $ sudo mount --bind /proc /mnt/proc
   $ sudo mount --bind /sys  /mnt/sys
   $ sudo chroot /mnt

Reinstall GRUB (examples):

BIOS systems (MBR):

     # grub-install /dev/sda       # the whole disk, not a partition; grub2-install on RHEL/Fedora
     # update-grub                 # Debian/Ubuntu-based
     # grub2-mkconfig -o /boot/grub2/grub.cfg  # many RHEL/Fedora systems, instead of update-grub

UEFI systems:

     # grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB
     # update-grub       # Debian/Ubuntu-based; use grub2-mkconfig on RHEL/Fedora

Exit the chroot, unmount, reboot:

   # exit
   $ sudo umount /mnt/dev /mnt/proc /mnt/sys
   $ sudo umount /mnt/boot/efi /mnt/boot /mnt  # as applicable
   $ sudo reboot

Exact commands vary slightly by distribution, but the workflow—mount → chroot → grub-install → regenerate config—is the same.
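Which grub-install variant applies depends on how the machine was booted. A minimal check, assuming the live media was started in the same firmware mode as the installed system: the kernel exposes /sys/firmware/efi only on UEFI boots. firmware_mode is a hypothetical helper name, and its optional argument exists only so the check can be pointed at a different root.

```shell
# Print "uefi" if the running system was booted via UEFI, otherwise "bios".
# $1: optional alternate root directory (defaults to the real /).
firmware_mode() {
    if [ -d "${1:-}/sys/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

If this prints bios on a machine whose installed system uses UEFI, reboot the live media in UEFI mode before running grub-install with --target=x86_64-efi.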

Dealing with Kernel and Initramfs Issues

Common Symptoms

Early kernel panic with messages like:

VFS: Unable to mount root fs on unknown-block(0,0)
Kernel panic - not syncing: Attempted to kill init!

Errors about missing root device or drivers.

Quick Tests via GRUB

From the GRUB menu or edit screen:

Try an older kernel (another menu entry).
Remove quiet and splash to see full messages.
Check for obviously wrong options on the linux line:

root=UUID=... pointing to a device that doesn’t exist
Wrong root=/dev/sdXY or root=/dev/mapper/...

You can temporarily adjust the root= parameter to match the correct device (ls -l at the GRUB prompt prints each partition’s filesystem type and UUID).

Recreating Initramfs

If the initramfs is corrupted or missing drivers for your root filesystem:

Boot from:

An older working kernel, or
A rescue mode, or
Live media + chroot as shown for GRUB repair.

Inside the real system environment:

On Debian/Ubuntu:

     # update-initramfs -u -k all

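On Fedora/RHEL (dracut):

     # dracut --force                    # rebuilds the image for the running kernel only
     # dracut --regenerate-all --force   # in a chroot: rebuilds images for all installed kernels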

Reboot and test.
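Before or after rebuilding, it can help to confirm that every installed kernel actually has an initramfs image next to it. The sketch below assumes the usual naming (initrd.img-<version> on Debian/Ubuntu, initramfs-<version>.img on Fedora/RHEL); other distributions may name files differently, and missing_initramfs is a hypothetical helper.

```shell
# List kernel versions under a boot directory that have no matching initramfs.
# $1: boot directory (defaults to /boot).
missing_initramfs() {
    bootdir=${1:-/boot}
    for kernel in "$bootdir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue      # no kernels found: the glob stayed literal
        version=${kernel##*/vmlinuz-}
        # Debian/Ubuntu: initrd.img-<version>; Fedora/RHEL: initramfs-<version>.img
        if [ ! -e "$bootdir/initrd.img-$version" ] &&
           [ ! -e "$bootdir/initramfs-$version.img" ]; then
            echo "$version"
        fi
    done
}
```

From a live system, pass the mounted path, for example missing_initramfs /mnt/boot. Empty output means every kernel has an image, although it says nothing about whether the image contains the right drivers.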

Fixing Incorrect Root Device in Config

If /etc/fstab or bootloader configuration points to the wrong root:

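Boot into emergency mode (systemd.unit=emergency.target), or use a live system and chroot. If the root= parameter itself is wrong, correct it temporarily from the GRUB edit screen first.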
Check device names and UUIDs:

   # lsblk -f

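Compare with /etc/fstab and the GRUB config (e.g. /boot/grub/grub.cfg or /boot/grub2/grub.cfg).
Update entries to match the current layout:

Prefer using UUID= rather than /dev/sdX when possible.
Edit /etc/fstab directly, but treat grub.cfg as generated output: change /etc/default/grub and regenerate instead.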

Regenerate GRUB config if necessary:

   # update-grub                              # Debian/Ubuntu-based
   # grub2-mkconfig -o /boot/grub2/grub.cfg   # many RHEL/Fedora systems
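Comparing UUIDs by eye is error-prone, so the check can be sketched as a script. It assumes fstab entries written as plain UUID=... (no quotes) and relies on udev's /dev/disk/by-uuid directory; stale_fstab_uuids is a hypothetical helper name.

```shell
# Print every UUID referenced in an fstab file that has no entry under by-uuid.
# $1: fstab path (default /etc/fstab); $2: by-uuid directory (default /dev/disk/by-uuid).
stale_fstab_uuids() {
    fstab=${1:-/etc/fstab}
    byuuid=${2:-/dev/disk/by-uuid}
    # Commented lines never match because their first field starts with "#".
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        [ -e "$byuuid/$uuid" ] || echo "$uuid"
    done
}
```

From a live system, check the installed system's file with stale_fstab_uuids /mnt/etc/fstab. Any UUID it prints refers to a filesystem the running kernel cannot currently see.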

Systemd and User‑Space Boot Problems

Once the kernel starts systemd (or another init), failures often manifest as hanging on certain services, or as dropping into emergency or rescue mode.

Booting into Rescue or Emergency Mode

Use GRUB to add on the kernel line:

systemd.unit=rescue.target (single‑user, most filesystems mounted)
systemd.unit=emergency.target (very minimal, usually root only)

In these modes, troubleshoot:

Broken /etc/fstab
Services that block boot
File system problems

Fixing Broken /etc/fstab

A common cause of hanging at boot is an invalid or missing mount device in /etc/fstab.

In the emergency shell, root is often mounted read‑only. Remount it read‑write:

   # mount -o remount,rw /

Inspect problematic entries in /etc/fstab:

   # nano /etc/fstab

Typical issues:

Referencing a non‑existent partition.
Wrong filesystem type.
An incorrect entry for the root filesystem (/).

For non‑critical mounts, temporarily:

Comment them with #, or
Add nofail or noauto options.

Save, then reboot:

   # reboot

If the console shows “A start job is running for …” naming a device or mount point, that’s your clue to check /etc/fstab.
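Adding nofail by hand is easy to get wrong when the options field is long. A minimal sketch using awk, which prints the modified table to standard output rather than editing in place. fstab_add_nofail is a hypothetical helper; note that the one line it changes is re-joined with single spaces.

```shell
# Print an fstab with "nofail" appended to the options of one mount point.
# $1: mount point (second fstab field); $2: fstab path.
fstab_add_nofail() {
    awk -v mp="$1" '
        /^[ \t]*#/ || NF < 4 { print; next }   # keep comments and short lines as-is
        $2 == mp && $4 !~ /(^|,)nofail(,|$)/ { $4 = $4 ",nofail" }
        { print }
    ' "$2"
}
```

A cautious workflow is to write the result to a scratch file, compare it with diff, and copy it over /etc/fstab only once the change looks right, for example: fstab_add_nofail /data /etc/fstab > /tmp/fstab.new (the /data mount point is illustrative).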

Identifying Problematic Services

If boot seems to hang on a service:

From a working or rescue boot, run:

  # systemctl --failed
  # systemctl status <service-name>

Disable a problematic service:

  # systemctl disable <service-name>
  # systemctl mask <service-name>   # prevent it from starting at all

After masking/disabling, reboot and see if boot completes.

For systems that almost boot but drop to a console, journalctl -b is invaluable for reviewing the current boot’s logs.

Using Live Media and Chroot for Recovery

When the system won’t boot at all, a live USB is your main tool.

Standard chroot Procedure

Boot live environment.
Identify your Linux partitions:

   $ lsblk -f

Mount them under /mnt:

   $ sudo mount /dev/sdXn /mnt          # root partition
   $ sudo mount /dev/sdYm /mnt/boot     # if separate
   $ sudo mount /dev/sdZp /mnt/boot/efi # if UEFI

Bind mount system directories:

   $ sudo mount --bind /dev  /mnt/dev
   $ sudo mount --bind /proc /mnt/proc
   $ sudo mount --bind /sys  /mnt/sys

Enter chroot:

   $ sudo chroot /mnt

Now you can:

Reinstall GRUB
Regenerate initramfs
Edit /etc/fstab
Remove or reconfigure problematic packages/services

When done:

   # exit
   $ sudo umount /mnt/dev /mnt/proc /mnt/sys
   $ sudo umount /mnt/boot/efi /mnt/boot /mnt  # as applicable
   $ sudo reboot
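Because the mount order matters (root first, then /boot, then the EFI partition, then the bind mounts), it can be useful to print the plan and review it before running anything. The sketch below only echoes commands and does not mount anything; chroot_plan is a hypothetical helper and the device names in the usage note are placeholders.

```shell
# Print, in order, the commands needed to enter a chroot. Nothing is executed.
# $1: root partition; $2: /boot partition or ""; $3: EFI partition or "";
# $4: mount target (default /mnt).
chroot_plan() {
    root=$1
    boot=${2:-}
    efi=${3:-}
    target=${4:-/mnt}
    echo "mount $root $target"
    if [ -n "$boot" ]; then echo "mount $boot $target/boot"; fi
    if [ -n "$efi" ]; then echo "mount $efi $target/boot/efi"; fi
    for dir in dev proc sys; do
        echo "mount --bind /$dir $target/$dir"
    done
    echo "chroot $target"
}
```

For a UEFI system without a separate /boot, chroot_plan /dev/sda2 "" /dev/sda1 prints six lines. Prefix each with sudo when running them from the live session.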

File System Corruption and fsck

File system problems often cause boot failures or drops to emergency mode with messages suggesting fsck.

Running fsck Safely

Never run fsck on a mounted read‑write filesystem.
Best practice:

Boot from a live USB, or
Boot into emergency mode, where the affected filesystem is unmounted or mounted read‑only.

Example (for /dev/sda2):

   # fsck -f /dev/sda2

Options:

-f: force a full check
Many tools (e.g. e2fsck for ext4) will ask before fixing; use -y to auto‑answer yes (be cautious).

If the root filesystem is dirty and the system suggests fsck, you can also:

Add fsck.mode=force fsck.repair=yes to the kernel command line temporarily.
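The "never on a mounted filesystem" rule can be enforced with a small guard around fsck. This sketch is stricter than the rule above, since it refuses any mounted device, including read-only mounts. It compares device paths literally against /proc/mounts, so a filesystem mounted under a different name (for example a /dev/mapper alias) would not be caught. is_mounted and safe_fsck are hypothetical helpers.

```shell
# Succeed if a device appears as a mount source in a mounts table.
# $1: device path; $2: mounts table (default /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run a forced check only when the device is not mounted.
safe_fsck() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "refusing: $1 is mounted" >&2
        return 1
    fi
    fsck -f "$1"
}
```

Called as safe_fsck /dev/sda2 from a live system, it runs the same command as the example above but stops first if the partition is still mounted.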

Recovering from Misconfigured Graphics / Display Manager

If the system appears to boot (you see kernel / systemd messages) but ends with a black screen:

Try text mode boot:

Edit GRUB kernel line to add systemd.unit=multi-user.target.
Or add nomodeset to disable kernel mode setting, which sidesteps many GPU driver problems.

Once at a text login:

Check display manager/service:

   # systemctl status display-manager
   # journalctl -u display-manager -b

Disable problematic display manager and use another (e.g., switch from GDM to LightDM).
Remove or reconfigure problematic proprietary drivers (NVIDIA, etc.).

Dealing with Boot After Hardware Changes

Common scenarios:

Disk reordering (new disk added).
Moving disk to another machine.
Changing storage controllers (IDE/AHCI/RAID mode).

Checklist:

Use lsblk -f to confirm device names and UUIDs.
Align:

/etc/fstab entries (prefer UUID= or LABEL=).
GRUB root configuration (regenerate config).

If moved to new hardware and missing drivers:

Rebuild initramfs (update-initramfs or dracut).
Ensure modules for new controller/filesystem are included.

Using Logs to Understand Past Boot Failures

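When the system eventually boots (even with issues), you can review earlier failed boots via journalctl, provided the journal is stored persistently (for example, /var/log/journal exists); otherwise only the current boot is available:

Current boot: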

  $ journalctl -b

Previous boot:

  $ journalctl -b -1

Filter by service:

  $ journalctl -u <service-name> -b

Show only errors and higher:

  $ journalctl -p err -b

This helps trace what failed and when, especially for intermittent boot issues.
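Logs from earlier boots exist only if journald keeps them on disk. Under journald's default Storage=auto setting, that is the case when /var/log/journal exists. The check below assumes that default and does not read journald.conf, so an explicit Storage=volatile would override it. journal_storage is a hypothetical helper.

```shell
# Report where journald keeps logs under the default Storage=auto behaviour.
# $1: optional alternate root directory (defaults to the real /).
journal_storage() {
    if [ -d "${1:-}/var/log/journal" ]; then
        echo persistent     # logs survive reboots; journalctl -b -1 can work
    else
        echo volatile       # logs live under /run and are lost at reboot
    fi
}
```

If it prints volatile, creating /var/log/journal and restarting systemd-journald is the usual way to start keeping logs across boots.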

Safe Practices and Recovery Mindset

Keep a live USB handy for your main distribution.
Avoid random edits to bootloader or /etc/fstab without notes; document what you change.
Make regular backups of:

/etc
/boot
Important user data

When stuck:

Identify the stage (firmware, GRUB, kernel, systemd).
Collect exact messages.
Test minimal, reversible fixes (temporary kernel params, disabling one service).
Only then commit to permanent configuration changes.

With these techniques, you’ll be able to systematically approach most Linux boot issues and bring systems back without resorting to full reinstallation.
