6.1.3 Debugging tools (gdb, strace)

Overview

In software development on Linux, debugging tools let you observe what a program is doing internally and how it interacts with the operating system. Two core tools for this are gdb, the GNU Debugger, and strace, a system call tracer.

In this chapter you will see how these tools are used in practice, what kinds of problems they help you find, and how to interpret their basic output. The goal is not to teach all of C or low-level programming, but to make you comfortable running these tools and understanding their role in a developer workflow on Linux.

Preparing Programs for Debugging

gdb is most useful when your program is compiled with debugging information. For compiled languages such as C or C++, this usually means compiling with the -g flag and avoiding heavy optimization during debugging.

For example, a simple C program might be compiled like this:

gcc -g -O0 -o myprog main.c

The flag -g tells the compiler to include debug symbols such as line numbers and variable names. The option -O0 disables optimizations that would reorder or remove code and make debugging more confusing.

Without debug symbols, gdb still works, but it shows far less: function names at best, raw addresses where the binary has been stripped, and no source lines or local variables. You can still attach gdb to a running process, but you will see fewer details.

strace does not need any special compilation flags. It works by intercepting system calls at runtime, so you can usually use it on any executable on the system, even binaries you did not compile yourself.

Always compile debug builds with -g when you expect to use gdb. Debug symbols are essential for readable backtraces and stepping through source code.

Basic Concepts: Debugger vs System Call Tracer

gdb and strace solve different parts of the debugging puzzle.

gdb inspects and controls the program itself. It lets you pause execution, view and change variables, step through code line by line, examine the stack, and investigate crashes. It understands symbols if they are available and can map from machine instructions back to your source files.

strace focuses on the boundary between your program and the Linux kernel. It shows which system calls your program makes, which file paths it opens, which network connections it attempts, how long these calls take, and which errors it receives from the kernel. It does not know about your local variables or source lines. Instead, it reveals how the process interacts with the operating system.

In practice, gdb helps you answer questions like “What is the value of this variable when the bug happens?” while strace helps you answer questions like “Why does this program fail to open this file or connect to that address?” Both tools are complementary.

Getting Started with GDB

gdb is usually installed as a package, often named gdb. Once installed, you can run it by passing the program you want to debug:

gdb ./myprog

This starts an interactive session that prints some startup messages and then displays a prompt that looks like:

(gdb)

Most gdb commands are entered at this prompt. You can start the program from inside gdb with:

(gdb) run

If your program expects arguments, add them after run:

(gdb) run arg1 arg2

When the program exits normally, gdb will tell you the exit status and return to the prompt.

You can also attach gdb to a process that is already running. For that, you need the process ID, often found with tools like ps or pgrep. Once you know it, attach like this:

gdb -p 12345

or from inside gdb:

(gdb) attach 12345

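Attaching pauses the process so that you can inspect its state. You need permission to trace the process: it normally has to run under your own user account, and some distributions restrict attaching further, so you may need root privileges.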

Running to and from Breakpoints

One of the central ideas in gdb is the breakpoint. A breakpoint is a location in the program where execution will pause so you can inspect state. Once a breakpoint is hit, you can continue execution, step through code, or modify things.

You can set a breakpoint by function name:

(gdb) break main

or by file and line, for example:

(gdb) break main.c:42

When the program reaches that line or function, it stops and returns control to the gdb prompt. From there, you can run commands before letting it continue.

To continue execution until the next breakpoint or until the program finishes, use:

(gdb) continue

If you want to run only a single line of source code at a time, use:

(gdb) next

This steps over function calls. It executes them as a single unit and then returns to the next line in the current function.

If you want to step into functions, use:

(gdb) step

This takes you inside the function calls so you can follow execution into deeper levels of the call stack.

You can list active breakpoints with:

(gdb) info breakpoints

and delete one by its number, for example:

(gdb) delete 1

Inspecting Variables and State in GDB

Once your program is paused at a breakpoint or after a crash, you can examine variables and state. The simplest inspection command is:

(gdb) print x

which prints the value of a variable x in the current scope. You can also print expressions, such as:

(gdb) print x + y

or dereference pointers where appropriate. If the program is compiled with debug symbols, gdb can show structured variables and their fields.

You can change variables at runtime:

(gdb) set variable x = 10

After this, if you continue the program, it will run with the new value. This can be useful for testing conditions or bypassing certain code paths while investigating.

You can also see where you are in the program and which functions were called. The command:

(gdb) backtrace

shows a stack trace with frames, which correspond to function calls. Each frame has a number. You can select a specific frame with:

(gdb) frame 2

and then inspect local variables within that frame.

If you want to see the source code around the current line, use:

(gdb) list

This shows several lines near the current position. To view a different part of the code, pass list a function name or a file and line, such as list main or list main.c:42.
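The inspection commands above are easier to try against a concrete target. The following is a small sketch, not taken from any particular project: the struct reading type and the functions scale_value and sum_scaled are hypothetical names chosen for illustration, and EXAMPLE_MAIN is an assumed macro name that switches on a demo main.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record type, used only to give gdb a struct to print. */
struct reading {
    int id;
    int value;
};

/* Innermost frame: a convenient target for `break scale_value`. */
int scale_value(int value, int factor)
{
    int scaled = value * factor;   /* try: print scaled */
    return scaled;
}

/* Middle frame: `step` enters scale_value, `next` steps over it. */
int sum_scaled(const struct reading *items, size_t count, int factor)
{
    assert(items != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += scale_value(items[i].value, factor);  /* try: print items[i] */
    }
    return total;
}

#ifdef EXAMPLE_MAIN
/* Demo entry point, compiled only when -DEXAMPLE_MAIN is passed. */
int main(void)
{
    struct reading items[] = { {1, 10}, {2, 20}, {3, 30} };
    printf("total = %d\n", sum_scaled(items, 3, 2));
    return 0;
}
#endif
```

With this saved as main.c and built with gcc -g -O0 -DEXAMPLE_MAIN -o myprog main.c, setting break scale_value and running the program should stop inside the innermost function. From there, backtrace should list scale_value, sum_scaled, and main as frames, and frame 1 followed by print items[i] shows a struct together with its fields.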

When using gdb on production systems, be careful about changing variable values. Modifying state inside a debugger can alter program behavior and may hide or introduce bugs.

Debugging Crashes and Core Dumps with GDB

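On Linux, if a program crashes due to a serious error such as a segmentation fault, the system can save a “core dump”, which is a snapshot of the program's memory at the time of the crash. If core dumps are enabled, you can analyze them later with gdb.

First, you need to allow core dumps. A simple way, which affects the current shell and the programs started from it, is:

ulimit -c unlimited

Now, when the program crashes, the kernel may write a file named core or something similar. The name and location are controlled by /proc/sys/kernel/core_pattern, and on some distributions crashes are piped to a handler such as systemd-coredump instead of being written to the current directory. Once you have the core file, you can open it with gdb: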

gdb ./myprog core

Inside gdb, you can run backtrace, inspect variables, and see where the crash occurred, even though the process is no longer running. This is helpful for intermittent crashes that are hard to reproduce interactively.

For programs without debug symbols, backtrace shows function names at best, with no source lines or local variables, and for a fully stripped binary it shows only addresses. If the binary is stripped but a separate debug information package is installed (common on some distributions), gdb can use that package to reconstruct readable traces.
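The ulimit command shown earlier changes the core size limit for a shell session. A program can also raise its own limit at startup through the standard getrlimit and setrlimit calls. The sketch below is one way to do that, under stated assumptions: enable_core_dumps is a hypothetical helper name, EXAMPLE_MAIN is an assumed macro that enables the demo main, and the helper raises the soft limit only as far as the existing hard limit, which an unprivileged process cannot exceed.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_CORE limit to the hard limit for this process.
 * On success returns 0 and stores the new limits in *result.
 * On failure returns -1 with errno set by getrlimit or setrlimit. */
int enable_core_dumps(struct rlimit *result)
{
    assert(result != NULL);
    if (getrlimit(RLIMIT_CORE, result) != 0)
        return -1;
    result->rlim_cur = result->rlim_max;  /* soft limit may not exceed hard limit */
    return setrlimit(RLIMIT_CORE, result);
}

#ifdef EXAMPLE_MAIN
/* Demo entry point, compiled only when -DEXAMPLE_MAIN is passed. */
int main(void)
{
    struct rlimit lim;
    if (enable_core_dumps(&lim) != 0) {
        perror("enable_core_dumps");
        return 1;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        puts("core size limit: unlimited");
    else
        printf("core size limit: %llu bytes\n", (unsigned long long)lim.rlim_cur);
    return 0;
}
#endif
```

If the hard limit is 0, the call still succeeds but core dumps stay disabled, so the printed value is worth checking before relying on a core file being written.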

Introducing strace

While gdb works with internal program state, strace records the system calls that a process makes. A system call is a request from user space to the kernel, such as opening a file, reading data, creating a socket, or allocating memory. Many bugs manifest as failures of system calls: a file is missing, permissions are wrong, a network host is unreachable, or a resource limit is exceeded.

You can run a program under strace like this:

strace ./myprog

This prints a line for nearly every system call the process makes. A typical line might look like:

openat(AT_FDCWD, "config.txt", O_RDONLY) = 3

This means the program called openat with AT_FDCWD, a special value meaning that a relative path is resolved against the current working directory, the path "config.txt", and the flag O_RDONLY. The return value is 3, which is a file descriptor number. If the call had failed, you might see:

openat(AT_FDCWD, "config.txt", O_RDONLY) = -1 ENOENT (No such file or directory)

Here -1 indicates an error and ENOENT is the error code. The human-readable part in parentheses explains the error.
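The same failure is visible from inside the program: the -1 return value and the error code that strace prints are what the C library hands back through the return value and errno. A minimal sketch, assuming a hypothetical helper named try_open and an assumed EXAMPLE_MAIN macro that enables the demo main:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open path read-only.
 * Returns 0 if the open succeeded, otherwise the errno value from open(). */
int try_open(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int err = errno;  /* save it before later calls can overwrite errno */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(err));
        return err;
    }
    close(fd);
    return 0;
}

#ifdef EXAMPLE_MAIN
/* Demo entry point, compiled only when -DEXAMPLE_MAIN is passed. */
int main(void)
{
    return try_open("config.txt") == 0 ? 0 : 1;
}
#endif
```

Running the built program under strace in a directory that has no config.txt should show the failing call next to the message the program prints itself. On current glibc, open() is typically issued as the openat system call, which is why the trace reports openat rather than open.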

If you want to attach strace to a running process instead of starting a new one, you can pass the process ID:

strace -p 12345

This prints system calls as they happen. To stop tracing, press Ctrl+C; strace detaches and the traced process keeps running.

Reducing strace Output and Focusing on Issues

By default, strace prints every system call, which can be overwhelming. It is often more useful to filter or summarize.

To focus on file-related calls, you can use:

strace -e trace=file ./myprog

This will show only calls like openat, stat, access, and others that take a file path as an argument. A similar filter for network-related calls is:

strace -e trace=network ./myprog

Another useful filter is:

strace -e trace=process ./myprog

which focuses on process management calls such as fork, execve, and clone.

You can also trace only a few specific calls:

strace -e trace=openat,read,write ./myprog

If you want to see how long each system call takes, use:

strace -T ./myprog

This adds timing information in seconds to each line, such as:

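read(3, "data"..., 4096) = 512 <0.000012>

which shows the result and, in angle brackets, the time spent in the system call. The trailing ... after the string means that strace shortened the displayed data.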

Sometimes you want to save all output for later analysis. You can redirect it like this:

strace -o trace.log ./myprog

The option -o sends the trace to a file instead of standard error.

strace can reveal sensitive information, including file paths, environment variables, and data being read or written. Be careful when sharing trace logs and avoid running strace on processes that handle confidential data unless necessary.

Using strace to Debug Common Problems

Certain categories of problems are particularly suited to strace. For instance, if a program complains that it cannot find a configuration file, but you believe it is present, run the program with strace -e trace=file and inspect which paths it actually tries to open. You might discover that it looks in a different directory or under a different filename than you expected.

For permission issues, strace shows the return codes and error names. If openat returns EACCES, the process does not have permission to open the file, possibly due to file mode, ownership, or a protective mechanism like a mandatory access control system.

For networking, tracing with -e trace=network lets you see socket creation, connection attempts, and errors like ECONNREFUSED or ETIMEDOUT. This can confirm whether the program is trying to connect where you think it is, and how the kernel responds.
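For experimenting with the network filter, a small client that attempts a single TCP connection is enough. The sketch below is illustrative only: try_connect is a hypothetical helper, the address and port used in main are arbitrary choices, and EXAMPLE_MAIN is an assumed macro that enables the demo main.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Attempt one TCP connection to an IPv4 address given in dotted-quad form.
 * Returns 0 on success, EINVAL for an unparsable address, or the errno
 * value reported by socket() or connect(). */
int try_connect(const char *ipv4, unsigned short port)
{
    assert(ipv4 != NULL);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1)
        return EINVAL;  /* not a valid IPv4 address string */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return errno;

    int result = 0;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == -1)
        result = errno;  /* e.g. ECONNREFUSED or ETIMEDOUT */
    close(fd);
    return result;
}

#ifdef EXAMPLE_MAIN
/* Demo entry point, compiled only when -DEXAMPLE_MAIN is passed. */
int main(void)
{
    int err = try_connect("127.0.0.1", 9);  /* arbitrary local port */
    if (err != 0)
        fprintf(stderr, "connect failed: %s\n", strerror(err));
    return err == 0 ? 0 : 1;
}
#endif
```

Traced with strace -e trace=network, the program should show the socket and connect calls together with whatever error the kernel returns, commonly ECONNREFUSED when nothing is listening on the chosen port.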

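If a program appears to hang, you can attach strace to it to see whether it is blocked in a system call, perhaps waiting for input on a file descriptor. A blocked call appears as an unfinished line, such as a read with no return value, and nothing further is printed until the call completes, which suggests the process is waiting on that resource. If you instead see the same call repeated rapidly, the process is polling or looping rather than blocked.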

Combining GDB and strace in a Debugging Workflow

In practice, you may use both gdb and strace in the same investigation. A common pattern is to start with strace to see if the failure is due to an obvious system-level issue such as missing files or denied access. If the system calls succeed but the program logic still behaves incorrectly, move to gdb to inspect variables and control flow.

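You cannot attach both tools to the same process at the same time. gdb and strace both rely on the ptrace mechanism, and Linux allows a process only one tracer at once, so running strace -p against a process that gdb is already debugging fails with an "Operation not permitted" error. Use the tools one after the other instead: detach or quit gdb before attaching strace, or stay inside gdb and use its catch syscall command to stop whenever the program makes a system call. This is most helpful when you need both internal state and external system call information.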
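Both gdb and strace attach through ptrace, and the kernel reports the current tracer of a process in the TracerPid field of /proc/<pid>/status. Before attaching a second tool, it can help to check that field. A minimal sketch, assuming a hypothetical helper named tracer_pid_of and an assumed EXAMPLE_MAIN macro that enables the demo main:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Look up the TracerPid field for the given process.
 * Returns the tracer's PID, 0 if the process is not being traced,
 * or -1 if the status file cannot be read or lacks the field. */
long tracer_pid_of(long pid)
{
    assert(pid > 0);

    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/status", pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[256];
    long tracer = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Non-matching lines leave tracer untouched and return 0. */
        if (sscanf(line, "TracerPid: %ld", &tracer) == 1)
            break;
    }
    fclose(f);
    return tracer;
}

#ifdef EXAMPLE_MAIN
/* Demo entry point, compiled only when -DEXAMPLE_MAIN is passed. */
int main(void)
{
    printf("TracerPid of this process: %ld\n", tracer_pid_of((long)getpid()));
    return 0;
}
#endif
```

Run directly, the demo should report 0. Run under gdb or strace, it should report the PID of that tool instead.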

In some situations, you might use strace first on a misbehaving process in a test or production environment, then reproduce the behavior locally with the same inputs and use gdb in a development environment where you have debug symbols and full source access.

Summary

gdb and strace are foundational debugging tools for developers on Linux. gdb lets you pause execution, step through code, examine variables, view stack traces, and analyze crashes, especially when programs are compiled with -g. strace records system calls, return values, errors, and timings so you can see how a process interacts with the kernel and external resources. Used one after the other, they give you a detailed view of both internal logic and external behavior, which is essential when diagnosing real-world bugs on Linux.
