
Troubleshooting boot issues

Understanding Boot Failures

When a Linux system fails to boot correctly, the problem usually lies in one of these stages (covered conceptually in the parent chapter):

  1. Firmware / Bootloader stage (BIOS/UEFI, GRUB)
  2. Kernel loading and initramfs
  3. Hand‑off to PID 1 (typically systemd) and user space

For troubleshooting, your job is to:

This chapter focuses on practical diagnostic and recovery techniques for each stage, not on re‑explaining how the boot process works.


General Strategy for Troubleshooting Boot

  1. Observe what you see:
    • Nothing on screen / firmware only → firmware or disk/bootloader problem.
    • GRUB menu or GRUB error → bootloader stage.
    • Kernel messages then panic / stuck → kernel or initramfs.
    • Systemd messages then hang / login not available → user space / services.
  2. Get more information:
    • Remove quiet and splash from the kernel command line to show verbose output.
    • Use recovery / rescue / live media when the system can’t boot at all.
  3. Make minimal changes:
    • Prefer temporary changes from the GRUB edit screen first.
    • Use chroot from a live system for permanent repair.

Common Symptoms and Where to Look

Using GRUB for On‑the‑Fly Troubleshooting

Editing Kernel Parameters Temporarily

  1. At the GRUB menu, highlight your Linux entry.
  2. Press e to edit.
  3. Find the line starting with linux or linuxefi.
  4. At the end of this line, you can:
    • Remove: quiet splash
    • Add useful options:
      • systemd.unit=multi-user.target (boot to text console)
      • systemd.unit=rescue.target (single‑user mode)
      • systemd.unit=emergency.target (minimal environment)
      • nomodeset (basic video, avoid proprietary/buggy GPU drivers)
      • init=/bin/bash (very low‑level shell; for advanced rescue)
  5. Press Ctrl+x or F10 to boot with modified options.

These changes are temporary and will be lost on the next reboot. This is ideal for testing whether a parameter fixes the issue.
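
To make the edit in step 4 concrete, here is a minimal sketch that takes a linux line as text and prints what it would look like after removing quiet and splash and appending one option. The helper name verbose_cmdline and the sample UUID are made up for illustration; the function only transforms a string and does not touch GRUB.

```shell
# Sketch: show what a GRUB "linux" line looks like after the temporary edit.
# verbose_cmdline is a hypothetical helper name, not part of GRUB.
# $1: the original "linux ..." line; $2 (optional): one option to append
verbose_cmdline() (
    set -f                                   # no glob expansion while splitting words
    out=""
    for word in $1; do
        case "$word" in
            quiet|splash) ;;                 # drop the options that hide boot messages
            *) out="${out:+$out }$word" ;;   # keep everything else, in order
        esac
    done
    if [ -n "${2:-}" ]; then out="${out:+$out }$2"; fi
    printf '%s\n' "$out"
)

# The UUID below is a placeholder, not a real device.
verbose_cmdline 'linux /vmlinuz root=UUID=1234 ro quiet splash' 'systemd.unit=rescue.target'
# prints: linux /vmlinuz root=UUID=1234 ro systemd.unit=rescue.target
```

Comparing the printed line with what you typed in the GRUB editor is a quick way to check that nothing else on the line was changed by accident.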


Recovering from GRUB Problems

Typical GRUB Errors

Basic Steps from GRUB Prompt

If you see a grub> prompt (normal mode):

  1. List available disks and partitions:
   grub> ls
  2. Inspect a partition (the trailing slash lists its files; without it, GRUB prints filesystem information):
   grub> ls (hd0,1)/
  3. Look for your /boot:
    • When you find a partition whose ls shows /boot or /vmlinuz, note its (hdX,Y).
  4. Set root and prefix:
   grub> set root=(hd0,1)
   grub> set prefix=(hd0,1)/boot/grub
   grub> insmod normal
   grub> normal

If the configuration is mostly intact, this may bring up the GRUB menu.

Reinstalling GRUB (from a Live System)

When GRUB is broken or missing, you typically:

  1. Boot from a live USB of the same (or compatible) distribution.
  2. Identify your root partition:
   $ lsblk
  3. Mount it (assuming root is /dev/sda2):
   $ sudo mount /dev/sda2 /mnt
  4. If /boot is separate (e.g. /dev/sda1):
   $ sudo mount /dev/sda1 /mnt/boot
  5. For UEFI systems, mount the EFI System Partition (e.g. /dev/sda1 or /dev/nvme0n1p1):
   $ sudo mount /dev/sda1 /mnt/boot/efi
  6. Bind mount system directories and chroot:
   $ sudo mount --bind /dev  /mnt/dev
   $ sudo mount --bind /proc /mnt/proc
   $ sudo mount --bind /sys  /mnt/sys
   $ sudo chroot /mnt
  7. Reinstall GRUB (examples):
    • BIOS systems (MBR):
     # grub-install /dev/sda       # grub2-install on many RHEL/Fedora systems
     # update-grub       # Debian/Ubuntu-based
     # grub2-mkconfig -o /boot/grub2/grub.cfg  # many RHEL/Fedora systems
    • UEFI systems:
     # grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB
     # update-grub
  8. Exit the chroot, unmount, reboot:
   # exit
   $ sudo umount /mnt/dev /mnt/proc /mnt/sys
   $ sudo umount /mnt/boot/efi /mnt/boot /mnt  # as applicable
   $ sudo reboot

Exact commands vary slightly by distribution, but the workflow—mount → chroot → grub-install → regenerate config—is the same.
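
The choice between the BIOS and UEFI variants can be sketched as a small dry-run helper. This is an illustration under assumptions: detect_firmware and grub_install_cmd are hypothetical names, /dev/sda is only an example disk, and the script prints the command instead of running it. Note that detection reflects how the currently running system (for example the live USB) was booted, so the live medium should be started in the same mode as the installed system.

```shell
# Sketch: choose the grub-install invocation by firmware type (prints only).
# detect_firmware and grub_install_cmd are hypothetical helper names.
detect_firmware() {
    # The kernel exposes /sys/firmware/efi only when it was booted via UEFI.
    if [ -d /sys/firmware/efi ]; then echo uefi; else echo bios; fi
}

# $1: "uefi" or "bios"; $2: whole disk (BIOS only), e.g. /dev/sda
grub_install_cmd() {
    case "$1" in
        uefi) echo "grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB" ;;
        bios) echo "grub-install ${2:?disk required for BIOS installs}" ;;
        *)    echo "unknown firmware type: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command for the machine this is executed on.
grub_install_cmd "$(detect_firmware)" /dev/sda
```

Review the printed command before running it inside the chroot; on distributions that ship GRUB 2 under the grub2- prefix, the binary name differs.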


Dealing with Kernel and Initramfs Issues

Common Symptoms

Quick Tests via GRUB

From the GRUB edit screen:

You can temporarily adjust the root= parameter to match the correct device (found with ls in GRUB).

Recreating Initramfs

If the initramfs is corrupted or missing drivers for your root filesystem:

  1. Boot from:
    • An older working kernel, or
    • A rescue mode, or
    • Live media + chroot as shown for GRUB repair.
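  2. Inside the real system environment:
    • On Debian/Ubuntu:
     # update-initramfs -u -k all
    • On RHEL/Fedora and other dracut-based systems:
     # dracut --force    # rebuilds for the running kernel; in a chroot, name the installed kernel with --kver <version>
  3. Reboot and test.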
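
Before rebuilding, it can help to see which installed kernels have no initramfs at all. The following is a minimal sketch, assuming the usual file names under /boot (initrd.img-VERSION on Debian/Ubuntu, initramfs-VERSION.img on RHEL/Fedora); missing_initramfs is a hypothetical helper name, and other distributions may use different names.

```shell
# Sketch: list installed kernel versions that have no matching initramfs image.
# missing_initramfs is a hypothetical helper; the file-name patterns are assumptions
# based on common Debian/Ubuntu and RHEL/Fedora layouts.
# $1 (optional): boot directory, default /boot (use /mnt/boot from live media)
missing_initramfs() (
    bootdir="${1:-/boot}"
    for kernel in "$bootdir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # no kernels found: the glob stayed literal
        ver="${kernel##*/vmlinuz-}"           # strip the path and the vmlinuz- prefix
        if [ ! -e "$bootdir/initrd.img-$ver" ] && [ ! -e "$bootdir/initramfs-$ver.img" ]; then
            echo "$ver"
        fi
    done
)

missing_initramfs /boot
```

An empty result means every kernel has an image file; it does not prove that the image contains the right drivers.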

Fixing Incorrect Root Device in Config

If /etc/fstab or bootloader configuration points to the wrong root:

  1. Boot into rescue or emergency mode (e.g. systemd.unit=emergency.target) or into a chroot from live media.
  2. Check device names and UUIDs:
   # lsblk -f
  3. Compare with /etc/fstab and GRUB config (e.g. /boot/grub/grub.cfg or /boot/grub2/grub.cfg).
  4. Update entries to match the current devices:
    • Prefer using UUID= rather than /dev/sdX when possible, because /dev/sdX names can change between boots.
  5. Regenerate GRUB config if necessary:
   # update-grub                              # Debian/Ubuntu-based
   # grub2-mkconfig -o /boot/grub2/grub.cfg   # many RHEL/Fedora systems
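
The comparison in step 3 can be partly automated. Here is a minimal sketch, assuming you have saved the existing filesystem UUIDs one per line (for example with lsblk -no UUID); stale_fstab_uuids is a hypothetical helper name. It only reads its two input files and prints UUID= entries that match nothing.

```shell
# Sketch: report UUID= entries in an fstab that match no known filesystem UUID.
# stale_fstab_uuids is a hypothetical helper name.
# $1: fstab path; $2: file with one existing UUID per line
stale_fstab_uuids() {
    awk '
        FILENAME == ARGV[1] { if ($1 != "") known[$1] = 1; next }  # first file: existing UUIDs
        /^[ \t]*#/ || NF < 2 { next }                              # skip comments and blank lines
        $1 ~ /^UUID=/ {
            uuid = substr($1, 6)                                   # text after "UUID="
            gsub(/"/, "", uuid)                                    # tolerate quoted values
            if (!(uuid in known)) print uuid, $2                   # stale UUID and its mount point
        }
    ' "$2" "$1"
}

# Typical use on the affected system or inside the chroot:
#   lsblk -no UUID > /tmp/uuids
#   stale_fstab_uuids /etc/fstab /tmp/uuids
```

Each output line is a UUID from /etc/fstab that no current filesystem carries, followed by the mount point that refers to it.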

Systemd and User‑Space Boot Problems

Once the kernel starts systemd (or another init), failures often manifest as hanging on certain services, or as dropping into emergency or rescue mode.

Booting into Rescue or Emergency Mode

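Use GRUB to add one of these to the kernel line:

    • systemd.unit=rescue.target (single-user mode)
    • systemd.unit=emergency.target (minimal environment)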

In these modes, troubleshoot:

Fixing Broken /etc/fstab

A common cause of hanging at boot is an invalid or missing mount device in /etc/fstab.

  1. In emergency shell, root is often read‑only. Remount it read‑write:
   # mount -o remount,rw /
  2. Inspect problematic entries in /etc/fstab:
   # nano /etc/fstab
  3. Typical issues:
    • Referencing a non-existent partition.
    • Wrong filesystem type.
    • The entry for the root filesystem (/) is incorrect.
  4. For non-critical mounts, temporarily:
    • Comment them out with #, or
    • Add the nofail or noauto option.
  5. Save, then reboot:
   # reboot

If the error message mentions “A start job is running for …” pointing to a mount, that’s your clue to check /etc/fstab.
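
Adding nofail by hand is easy to get wrong in an emergency shell, so a preview can help. The sketch below prints a modified copy of an fstab with nofail appended to the options of one mount point; add_nofail is a hypothetical helper name, /data is only an example, and nothing is written back to the original file.

```shell
# Sketch: print a copy of an fstab with "nofail" added to one mount point.
# add_nofail is a hypothetical helper; it writes to stdout and never edits in place.
# $1: fstab path; $2: mount point of a non-critical filesystem, e.g. /data
add_nofail() {
    awk -v mp="$2" '
        /^[ \t]*#/ { print; next }                        # keep comments untouched
        NF >= 4 && $2 == mp && $4 !~ /(^|,)nofail(,|$)/ {
            $4 = $4 ",nofail"                             # note: awk re-joins fields with single spaces
        }
        { print }
    ' "$1"
}

# Typical use: review the result before replacing the real file.
#   add_nofail /etc/fstab /data > /tmp/fstab.new
```

Writing to a separate file first keeps the original /etc/fstab intact until you have checked the new version.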

Identifying Problematic Services

If boot seems to hang on a service:

  # systemctl --failed                # list units that failed
  # systemctl status <service-name>   # show its state and recent log lines
  # systemctl disable <service-name>  # stop it from starting at boot
  # systemctl mask <service-name>     # prevent it from starting at all

For systems that almost boot but drop to a console, journalctl -b is invaluable: it shows the log of the current boot, and journalctl -b -1 shows the previous one.


Using Live Media and Chroot for Recovery

When the system won’t boot at all, a live USB is your main tool.

Standard chroot Procedure

  1. Boot live environment.
  2. Identify your Linux partitions:
   $ lsblk -f
  3. Mount them under /mnt:
   $ sudo mount /dev/sdXn /mnt          # root partition
   $ sudo mount /dev/sdYm /mnt/boot     # if separate
   $ sudo mount /dev/sdZp /mnt/boot/efi # if UEFI
  4. Bind mount system directories:
   $ sudo mount --bind /dev  /mnt/dev
   $ sudo mount --bind /proc /mnt/proc
   $ sudo mount --bind /sys  /mnt/sys
  5. Enter chroot:
   $ sudo chroot /mnt
  6. Now you can:
    • Reinstall GRUB
    • Regenerate initramfs
    • Edit /etc/fstab
    • Remove or reconfigure problematic packages/services
  7. When done:
   # exit
   $ sudo umount /mnt/dev /mnt/proc /mnt/sys
   $ sudo umount /mnt/boot/efi /mnt/boot /mnt  # as applicable
   $ sudo reboot
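
Because the optional mounts depend on your partition layout, it can be useful to print the exact command sequence first and review it. This is a minimal dry-run sketch; chroot_plan is a hypothetical helper name and the device names are examples. It prints commands in the order they must run (parent mount points before the ones nested inside them) and executes nothing.

```shell
# Sketch: print the mount/chroot commands for a given layout without running them.
# chroot_plan is a hypothetical helper; pass "" for a partition that does not exist.
# $1: root partition; $2: separate /boot partition; $3: EFI System Partition
chroot_plan() {
    echo "mount $1 /mnt"
    if [ -n "${2:-}" ]; then echo "mount $2 /mnt/boot"; fi
    if [ -n "${3:-}" ]; then echo "mount $3 /mnt/boot/efi"; fi
    for dir in dev proc sys; do
        echo "mount --bind /$dir /mnt/$dir"
    done
    echo "chroot /mnt"
}

# Example layout: root on /dev/sda2, no separate /boot, ESP on /dev/sda1.
chroot_plan /dev/sda2 "" /dev/sda1
```

Run the printed lines with sudo one at a time so that a failed mount is noticed before you enter the chroot.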

File System Corruption and fsck

File system problems often cause boot failures or drops to emergency mode with messages suggesting fsck.

Running fsck Safely

Example (for /dev/sda2):

# fsck -f /dev/sda2
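
Repairing a filesystem while it is mounted read-write can make the damage worse, so a pre-check is worthwhile. The sketch below looks the device up in /proc/mounts before suggesting the command; is_mounted is a hypothetical helper name and /dev/sda2 is just the example device from above. It is a simple first-field comparison, so a device that appears under a different name (for example a mapper path) would not be detected.

```shell
# Sketch: refuse to suggest fsck on a device that is currently mounted.
# is_mounted is a hypothetical helper; $2 lets a different mounts table be checked.
is_mounted() {
    # /proc/mounts lists one mount per line; the first field is the source device.
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

dev=/dev/sda2   # example device, adjust to your system
if is_mounted "$dev"; then
    echo "$dev is mounted: unmount it or boot rescue media before running fsck" >&2
else
    echo "fsck -f $dev"   # printed only; run it yourself as root
fi
```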

Options:

If root filesystem is dirty and system suggests fsck, you can also:

Recovering from Misconfigured Graphics / Display Manager

If the system appears to boot (you see kernel / systemd messages) but ends with a black screen:

Once at a text login:

  1. Check display manager/service:
   # systemctl status display-manager
   # journalctl -u display-manager -b
  2. Disable problematic display manager and use another (e.g., switch from GDM to LightDM).
  3. Remove or reconfigure problematic proprietary drivers (NVIDIA, etc.).

Dealing with Boot After Hardware Changes

Common scenarios:

Checklist:

  1. Use lsblk -f to confirm device names and UUIDs.
  2. Align:
    • /etc/fstab entries (prefer UUID= or LABEL=).
    • GRUB root configuration (regenerate config).
  3. If moved to new hardware and missing drivers:
    • Rebuild initramfs (update-initramfs or dracut).
    • Ensure modules for new controller/filesystem are included.

Using Logs to Understand Past Boot Failures

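When the system eventually boots (even with issues), you can review the current and earlier boots with journalctl. Earlier boots are only available if the journal is stored persistently:

  $ journalctl -b                      # current boot
  $ journalctl -b -1                   # previous boot
  $ journalctl -u <service-name> -b    # one unit, current boot
  $ journalctl -p err -b               # priority err and more severe, current boot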

This helps trace what failed and when, especially for intermittent boot issues.
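
Whether logs from earlier boots exist depends on how journald stores them. A minimal sketch of a check, assuming the default Storage=auto behaviour where journald writes to /var/log/journal only if that directory exists; journal_is_persistent is a hypothetical helper name.

```shell
# Sketch: check whether journald is likely keeping logs across reboots.
# journal_is_persistent is a hypothetical helper; it assumes the default
# Storage=auto setting, where /var/log/journal must exist for persistence.
# $1 (optional): directory to test, default /var/log/journal
journal_is_persistent() {
    [ -d "${1:-/var/log/journal}" ]
}

if journal_is_persistent; then
    echo "persistent journal found: 'journalctl -b -1' can show the previous boot"
else
    echo "no /var/log/journal: only the current boot is likely available"
fi
```

If the directory exists, journalctl --list-boots shows which boots are recorded and the offsets to pass to -b.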


Safe Practices and Recovery Mindset

With these techniques, you’ll be able to systematically approach most Linux boot issues and bring systems back without resorting to full reinstallation.
