4.1.3 Debugging scripts

Why Debugging Matters in Shell Scripts

Shell scripts often start small and simple but quickly grow in complexity. As they grow, mistakes become harder to spot by just reading the code. Debugging is the practice of finding and fixing those mistakes in a controlled and systematic way.

In shell scripting, debugging is especially important because the shell is very permissive. By default, many errors do not stop the script, variables are created implicitly, and unexpected input can silently change how your script behaves. Debugging tools and techniques help you observe what the script is actually doing, so you can compare that with what you intended.

Shell debugging focuses on three main goals. First, reveal what commands are being executed and in what order. Second, show the values of important variables as they change. Third, stop the script early when something goes wrong so that errors do not cascade and hide the real cause.

Always distinguish between the symptom of a bug and its root cause. Debugging is the process of moving from symptom to cause using evidence, not guesswork.

Built‑in Shell Debugging Options

Bourne-compatible shells, such as bash, dash, and sh, provide options that let you see what is happening inside a script. The two most important for debugging are -x and -v. You can enable them on the command line, inside the script, or in the shebang.

The -x option prints each command as it is executed, after variable expansion and substitutions. This is excellent for seeing the actual values that your script is using. For example, if your script line is echo "Hello $USER", the debug output with -x will show something like + echo 'Hello alice' just before the line runs. The trace is written to standard error, so it stays separate from the script's standard output.

The -v option prints each line of the script as it is read, before expansion. This shows the raw script code, which can be useful when the script modifies itself or reads commands from other files. It is less commonly used than -x for ordinary debugging but can still be helpful when you want to verify the exact syntax being interpreted.

You can combine options when invoking the shell. For instance, if you have a script script.sh, you can debug it with:

bash -x script.sh

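This runs the script with tracing enabled, even if the script itself does not enable debugging. You can also put the option in the shebang, provided the shebang names the interpreter directly (here assuming bash is installed at /bin/bash):

#!/bin/bash -x

Avoid writing #!/usr/bin/env bash -x. On Linux, everything after the interpreter path is passed as a single argument, so env searches for a program literally named "bash -x" and fails. Shebang options are also ignored whenever the script is started as bash script.sh. A more portable pattern is to enable debugging inside the script, as described in the next section.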
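To make the difference between the two options concrete, here is a minimal sketch that runs the same two-line script once with -x and once with -v. The script content and the variable names (demo, trace_x, trace_v) are made up for illustration, and the sketch assumes bash and mktemp are available.

```shell
# Two-line demo script (hypothetical content) written to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
who="world"
echo "Hello $who"
EOF

# -x: commands are shown after expansion, each prefixed with "+ ".
trace_x=$(bash -x "$demo" 2>&1)

# -v: lines are shown exactly as read, before expansion.
trace_v=$(bash -v "$demo" 2>&1)

printf '%s\n' "--- bash -x ---" "$trace_x" "--- bash -v ---" "$trace_v"
rm -f "$demo"
```

In the -x section you should see the expanded command + echo 'Hello world', while the -v section shows the raw source line echo "Hello $who".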

Enabling and Disabling Debugging Within a Script

You do not always want to debug an entire script. Often you only need to inspect one problematic function or section. The set built‑in lets you enable and disable debugging locally without changing how the rest of the script runs.

To turn tracing on within a script, use:

set -x

From this point, the shell prints each command and its expanded arguments before running it. To turn tracing off again, use:

set +x

The + means disable the option, while - means enable it. You can wrap suspicious code with these commands to focus your debugging output.

For example:

#!/usr/bin/env bash
echo "Starting script"
set -x
process_data "$input_file"
generate_report "$output_dir"
set +x
echo "Script finished"

Here, only the part that calls process_data and generate_report is traced. This limits the amount of debug output and makes it easier to read.

You can also selectively enable other useful options for debugging, such as -u to treat unset variables as errors and -e to exit when a command fails, but these belong to error handling and script structure rather than pure tracing. For targeted debugging, focus on set -x and set +x around suspect sections.

Use set -x only around the minimum code you need to inspect. Tracing large scripts that handle secrets or huge loops can flood your terminal and expose sensitive data.
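As a rough illustration of scoped tracing, the sketch below traces a single assignment and then switches tracing off before a placeholder secret is assigned. The variable names and the token value are invented for the example. Wrapping set +x in a group with standard error redirected is a common idiom for keeping the set +x line itself out of the trace.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
echo "Starting script"
set -x                              # tracing on
count=$((2 + 3))
{ set +x; } 2>/dev/null             # tracing off, without tracing "set +x" itself
api_token="not-a-real-secret"       # placeholder value; this line runs untraced
echo "count=$count"
EOF

# Capture stdout and the trace (stderr) together so the order is visible.
output=$(bash "$demo" 2>&1)
printf '%s\n' "$output"
rm -f "$demo"
```

Only the assignment to count appears in the trace. The placeholder token never shows up in the output because tracing was already off when it was assigned.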

Reading and Interpreting Trace Output

When -x tracing is active, the shell prints a line for each command it executes. These lines usually start with a prefix character, often +, followed by the command with all substitutions applied. Understanding this output is essential for effective debugging.

Consider the following code:

name="Alice"
greeting="Hello, $name"
echo "$greeting"

With set -x enabled, you might see output such as:

+ name=Alice
+ greeting='Hello, Alice'
+ echo 'Hello, Alice'
Hello, Alice

Everything beginning with + is trace output from the shell, not your script’s normal output. The last line, without +, is what your script prints to standard output.

The trace output shows the command after parameter expansion. This makes it clear what values variables have at each step. If a variable is empty or contains spaces, you will see this directly in the trace. If you expect file.txt but see an empty string or a completely different value, you have found a clue.

In more complex cases, such as loops or conditions, the trace output helps you confirm which branches are taken and how many times loops run. For example, a loop over files will show a traced line for each echo or other command inside the loop, with the file name expanded.

Tracing does not print parentheses or braces to mark subshells and command groups; the commands inside them are simply traced as they run. In bash, nested execution such as command substitution is shown by repeating the first character of PS4, so a command that runs inside $(...) appears with ++ rather than +. When variable values seem to change unexpectedly or fail to persist, check whether the assignment ran in a subshell or in one stage of a pipeline, because changes made there are not passed back to the parent shell.
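The following sketch shows what a traced loop with a command substitution looks like in bash. The file names a.txt and b.txt are placeholders and no such files need to exist, since basename only manipulates the strings.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
set -x
for f in a.txt b.txt; do
    base=$(basename "$f" .txt)      # runs one level deeper, traced with "++"
    echo "$base"
done
EOF

trace=$(bash "$demo" 2>&1)
printf '%s\n' "$trace"
rm -f "$demo"
```

Each iteration produces its own group of trace lines, so counting the + echo lines tells you how many times the loop body ran. The basename calls carry a doubled prefix because they execute inside a command substitution.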

Using `PS4` to Improve Debug Output

The traces produced by set -x use a prompt string stored in the PS4 variable. By default, PS4 is a plus sign followed by a space. You can customize this variable to include more information, such as line numbers or function names, which makes debugging larger scripts much easier.

Before enabling tracing, you can set PS4 to a formatted string. For example:

PS4='+ ${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]}: '
set -x

Now each traced line will include the script file name from ${BASH_SOURCE}, the line number from ${LINENO}, and the current function name from ${FUNCNAME[0]}. An example trace might look like:

+ ./script.sh:42:process_data: input_file=data.csv

This tells you that the traced command comes from line 42 inside the process_data function in script.sh. With this information, you can jump directly to the relevant place in your editor.

The exact variables available in PS4 depend on the shell. The example above uses bash-specific variables. In a more minimal shell, you may have only $LINENO and no way to show function names, but even a simple line number helps a great deal.

Always set PS4 before set -x. The shell expands PS4 each time it prints a trace line, so a change made after tracing starts affects only the lines that follow; everything traced earlier keeps the old format.
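Here is a small sketch of a customized PS4 in action, assuming bash. The greet function is hypothetical, and the :-main fallback is an addition for this example so that commands outside any function still get a readable label.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
PS4='+ ${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]:-main}: '
greet() {
    echo "hi $1"
}
set -x
greet alice
EOF

trace=$(bash "$demo" 2>&1)
printf '%s\n' "$trace"
rm -f "$demo"
```

The call to greet is labelled main, and the echo inside the function is labelled greet, each with the temporary file's path and a line number in front.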

Trapping Errors and Inspecting the Environment

When a script fails, it is often useful to inspect the state of the environment right at the point of failure. Shell traps allow you to run a diagnostic command when an error or a signal occurs. This lets you capture variable values and call stacks automatically.

A common debugging pattern in bash is to use trap with the ERR pseudo signal. For example:

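debug_on_error() {
    status=$?    # save immediately; every later command overwrites $?
    echo "Error at line $1" >&2
    echo "Exit status: $status" >&2
}
trap 'debug_on_error "$LINENO"' ERR

With this in your script, whenever a command returns a nonzero exit status and is not part of a conditional test that ignores it, debug_on_error runs. The handler saves $? on its first line, because every later command, including echo, replaces it with its own status. The line number is passed in as an argument: $LINENO in the trap string is expanded at the point of failure, whereas inside the function it would refer to the function's own lines. In bash, shell functions do not inherit the ERR trap unless you also enable set -E.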

You can extend this function to print extra diagnostic information, such as key variable values, the current working directory from pwd, or the command history if available. The goal is to automatically capture enough context to understand what went wrong without having to rerun the script many times.
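A minimal sketch of an extended handler, assuming bash: it saves the exit status first, takes the line number as an argument, and adds the working directory as extra context. The failing command is a stand-in (sh -c 'exit 3') chosen only because its exit status is predictable.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
debug_on_error() {
    status=$?                       # save first; any later command resets $?
    echo "ERR: line $1, status $status, cwd $(pwd)" >&2
}
trap 'debug_on_error "$LINENO"' ERR

echo "before"
sh -c 'exit 3'                      # stand-in for a failing command
echo "after"
EOF

report=$(bash "$demo" 2>&1)
printf '%s\n' "$report"
rm -f "$demo"
```

Because this script does not use set -e, execution continues after the handler returns, so the report shows the diagnostic line between before and after.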

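Another useful trap for debugging is on EXIT. This runs whenever the shell is about to exit, whether the script succeeded, failed, or called exit explicitly. Bash also runs it when the script is terminated by a catchable signal such as SIGINT or SIGTERM, although no trap can run after SIGKILL. For example: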

cleanup_and_report() {
    status=$?
    echo "Script exiting with status $status"
}
trap cleanup_and_report EXIT

Here, status captures the overall exit code of the script. You can use this to print a final summary, write to a log file, or perform final checks about what the script accomplished.

Remember that trap handlers themselves can contain bugs. If a trap function fails, it may obscure the original problem. During debugging, keep trap handlers simple, focused, and well tested.
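The sketch below shows an EXIT handler reporting the final status while leaving that status intact for the caller. It assumes bash, and the exit code 4 is an arbitrary value picked for the demonstration.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
cleanup_and_report() {
    status=$?                       # exit status the script is ending with
    echo "Script exiting with status $status" >&2
}
trap cleanup_and_report EXIT

echo "working"
exit 4
EOF

report=$(bash "$demo" 2>&1)
code=$?                             # status of the child script, via the substitution
printf '%s\n' "$report" "caller saw status $code"
rm -f "$demo"
```

The handler only prints, so the caller still receives the script's original exit status. A handler that calls exit itself would replace it.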

Debugging Logic with Echo and Logging

Not all bugs are visible as syntax errors or failing commands. Often the script runs without formal errors, but it produces the wrong results. In those cases, you need to understand how data flows through the script and how decisions are made.

A straightforward technique is to insert temporary echo statements that print variable values at key points. For example:

echo "DEBUG: user_name='$user_name'"

Using a consistent prefix like DEBUG: makes it easy to filter these lines later. This helps you see whether variables are being set correctly, and whether conditional branches receive the expected input.

For more structured debugging, you can define a simple logging function that writes messages with timestamps or log levels. For example:

log_debug() {
    echo "[$(date +%T)] DEBUG: $*" >&2
}

Then use log_debug instead of scattered echo statements. Directing debug output to standard error with >&2 keeps it separate from the script’s normal output, which is important if the script output is processed by other tools.

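You can conditionally enable debugging with environment variables. For instance, check whether DEBUG is set:

log_debug() {
    if [ -n "${DEBUG:-}" ]; then
        echo "DEBUG: $*" >&2
    fi
}

The if form is deliberate. A one-line [ -n "$DEBUG" ] && echo ... returns a nonzero status whenever DEBUG is empty, which can abort a script running under set -e or fire an ERR trap. Writing ${DEBUG:-} also keeps the check safe under set -u. Now you can turn debug messages on and off without editing the script by setting DEBUG=1 in the environment before running it. This keeps debug scaffolding in your script but lets you run in a quiet mode during normal use.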
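As a quick check of the switch, this sketch calls a DEBUG-gated logger once with the variable unset and once with it set. It combines the timestamp from the earlier log_debug with the DEBUG test; the message text and the variable names quiet, loud, and quiet_status are illustrative.

```shell
log_debug() {
    # Print only when DEBUG is non-empty; always return 0 when quiet.
    if [ -n "${DEBUG:-}" ]; then
        printf '[%s] DEBUG: %s\n' "$(date +%T)" "$*" >&2
    fi
}

# Each call runs in its own subshell so the DEBUG setting does not leak.
quiet=$(unset DEBUG; log_debug "loading config" 2>&1)
loud=$(DEBUG=1; log_debug "loading config" 2>&1)

# The quiet call must still succeed, or set -e scripts would abort on it.
(unset DEBUG; log_debug "loading config")
quiet_status=$?

printf 'quiet: [%s] (status %s)\n' "$quiet" "$quiet_status"
printf 'loud:  [%s]\n' "$loud"
```

In everyday use the same switch is flipped from the command line, for example DEBUG=1 ./script.sh, and the consistent DEBUG: prefix lets you filter the messages with grep.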

Never leave permanent debug messages that expose passwords, tokens, or other secrets. Debugging output should not reveal sensitive data in logs or terminals.

Isolating Problems and Minimizing Test Cases

Large scripts can make it difficult to see where a bug originates. A key debugging skill is to isolate the problem. This means taking the part of the script that behaves incorrectly and extracting it into a smaller sample that still shows the bug.

Start by copying the suspected section of code into a new temporary script. Reduce it step by step. Remove unrelated functions, input, and logic while running the sample frequently. If the bug disappears, undo the last removal and try reducing in another area.

As you shrink the sample, use the debugging tools described earlier. Enable set -x to see every step. Add minimal echo or log_debug statements to show the values of critical variables. Pay attention to environmental differences, such as the current directory and input files, and try to replicate them as closely as possible.

This process helps in two ways. First, it narrows your search to a smaller amount of code that you can reason about more easily. Second, it often reveals hidden assumptions. For example, your script might rely on a variable already being defined by the caller, or on a file that accidentally exists in your normal environment but not in a clean test.

Once you understand the cause, you can correct the original script with confidence and remove the temporary debug script. Keeping these small samples around can also form the basis of future tests that verify the bug does not return.
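One way to surface a hidden dependency on the caller's environment is to run the reduced sample with env -i, which starts it with an almost empty environment. The sketch below is a hypothetical case: TARGET_DIR and its value /srv/data are invented, and PATH is passed through so that bash can still be found.

```shell
repro=$(mktemp)
cat > "$repro" <<'EOF'
# Reduced sample: silently depends on a variable set by the caller.
echo "target=${TARGET_DIR:-<unset>}"
EOF

TARGET_DIR=/srv/data                # hypothetical value from your normal session
export TARGET_DIR

normal=$(bash "$repro")                        # inherits TARGET_DIR
clean=$(env -i PATH="$PATH" bash "$repro")     # nearly empty environment

printf '%s\n' "normal: $normal" "clean:  $clean"
rm -f "$repro"
```

If the two runs differ, the script depends on something in your usual environment, and the clean run is the one that will match a cron job or another user's session more closely.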

Debugging Subshells, Pipelines, and External Commands

Certain shell features can make bugs harder to track because they hide how data and variables move between parts of the script. Subshells, pipelines, and external commands are especially important to understand when debugging.

A subshell, created by parentheses, runs in its own copy of the environment. Variable assignments inside a subshell do not affect the parent shell. If you expect a variable to change and remain changed after a group of commands, using a subshell by accident can lead to confusing behavior. When you see unexpected variable values, look for parentheses that might be creating a separate shell.
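A short sketch of the difference, using an invented counter variable: the same assignment is made once inside parentheses and once inside braces, which group commands without creating a subshell.

```shell
counter=0

( counter=5 )                       # parentheses: runs in a subshell
after_subshell=$counter             # the parent still sees the old value

{ counter=5; }                      # braces: runs in the current shell
after_group=$counter                # the change persists

echo "after subshell: $after_subshell"
echo "after group:    $after_group"
```

When a subshell is not actually needed, switching from parentheses to braces is often the smallest fix.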

Pipelines connect commands with |. In many shells, each part of the pipeline runs in a separate process. As a result, a variable set in one stage is lost when that stage ends, and $? after the pipeline reports only the exit status of the last command, so a failure in an earlier stage goes unnoticed. When debugging, consider breaking a pipeline into separate steps to inspect intermediate results.
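Bash offers two aids for inspecting pipeline exit codes without restructuring the pipeline: the PIPESTATUS array and the pipefail option. Both are bash features rather than portable sh, so treat this sketch as bash-only. The pipeline false | true is a stand-in for a real pipeline whose first stage fails.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
false | true
echo "plain: $?"                    # only the last stage's status

false | true
echo "stages: ${PIPESTATUS[*]}"     # one status per stage (bash only)

set -o pipefail
false | true
echo "pipefail: $?"                 # nonzero because a stage failed
EOF

result=$(bash "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

PIPESTATUS is overwritten by the next command, so copy it into another array straight away if you need it later.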

External commands behave differently from built‑ins. They run as separate processes, so shell options such as set -x do not reach inside them: the trace shows the command line that was launched, not what the program does internally. However, you can still check their exit codes using $? and you can sometimes run those commands with their own debug or verbose options. For example, many tools accept -v or --debug to show more internal information.

If a script calls another script, you can combine debugging techniques. Use set -x in both the parent and the child, or run the child with bash -x. Make sure you can distinguish their outputs, for instance by using different PS4 values that identify which script is currently tracing.
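As a sketch of telling two traces apart, the example below gives a parent and a child script their own PS4 labels. The file names parent.sh and child.sh and the labels are hypothetical. Each script sets PS4 itself rather than relying on the environment, since bash does not always import PS4 from the environment (for example when running as root).

```shell
workdir=$(mktemp -d)

cat > "$workdir/child.sh" <<'EOF'
PS4='+ child: '
set -x
echo "child work"
EOF

cat > "$workdir/parent.sh" <<'EOF'
PS4='+ parent: '
set -x
bash "${0%/*}/child.sh"             # child.sh sits next to parent.sh
EOF

trace=$(bash "$workdir/parent.sh" 2>&1)
printf '%s\n' "$trace"
rm -rf "$workdir"
```

Every trace line now names the script it came from, which matters most when the combined output is written to a single log file.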

Whenever you encounter difficult behavior in these areas, slow the script down conceptually. Replace pipelines with temporary files, remove subshells while preserving logic, and wrap external commands with logging so you always know what inputs and outputs they are seeing. Over time, these simplifications often reveal unexpected side effects or misunderstandings about how the shell executes complex constructs.
