
4.1 Advanced Shell Scripting

Overview

By this point you already know how to write basic shell scripts: variables, conditionals, loops, and making scripts executable. This chapter takes you to the next level:

  * structuring larger scripts for readability and maintenance
  * using functions systematically, with local state and clean interfaces
  * handling errors predictably with exit codes, shell options, and trap
  * debugging with tracing and logging
  * scheduling scripts with cron

Examples focus on bash, but most concepts apply to other POSIX shells unless noted.


Script Structure

As scripts grow, structure matters as much as the individual commands. A good layout makes scripts easier to read, debug, and maintain.

Recommended Layout

A common, readable structure:

  1. Shebang and options
  2. Metadata and usage/help
  3. Global configuration and constants
  4. Functions
  5. Argument parsing
  6. Main logic (often in a main function)

1. Shebang and Shell Options

Use an explicit shebang and safe defaults:

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

You may relax or locally override some options for specific parts of the script if needed.
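For example, a short sketch of locally relaxing `set -e` around a command whose failure you want to inspect rather than abort on:

```shell
set -euo pipefail

# Temporarily allow failure so we can capture the status ourselves.
set +e
grep -q needle /dev/null   # no match, so grep exits non-zero
status=$?
set -e

echo "grep exit status: $status"   # grep exit status: 1
```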

2. Metadata and Usage

Provide a short description and usage output:

script_name="${0##*/}"
usage() {
    cat <<EOF
Usage: $script_name [-h] [-v] INPUT_FILE
Process INPUT_FILE and print a summary.
Options:
  -h   Show this help and exit
  -v   Enable verbose output
EOF
}

3. Configuration and Constants

Centralize configurable values:

readonly DEFAULT_OUTPUT_DIR="/var/tmp/report"
readonly LOG_FILE="/var/log/mytool.log"
VERBOSE=0

Using readonly helps prevent accidental modification.
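For instance, a later assignment to a readonly variable is rejected. Here the attempt is made in a subshell so the resulting error does not kill the current shell:

```shell
readonly MAX_RETRIES=3

# The subshell inherits the readonly attribute; the assignment fails
# there, and the subshell exits non-zero without affecting us.
if ! (MAX_RETRIES=5) 2>/dev/null; then
    echo "reassignment rejected"
fi
```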

4. Functions First, Then Flow

Define helper functions near the top, then have a clear entry point:

main() {
    parse_args "$@"
    validate_env
    process_file "$INPUT_FILE"
}
main "$@"

This pattern:

  * keeps the top-level flow in one place
  * lets you source the file and test functions individually
  * makes argument passing explicit via "$@"

Functions

You’ve seen simple functions; here we use them more systematically and safely.

Defining Functions

Preferred style:

my_function() {
    local arg1="$1"
    local arg2="$2"
    # ...
}

Key points:

  * Prefer the portable name() { ... } form over the function keyword
  * Copy positional parameters into named local variables early
  * Quote expansions such as "$1" so spaces and globs survive intact

Return Values vs Output

Bash functions return an exit status (0–255). Use this for success/failure, not data:

is_readable() {
    local path="$1"
    [[ -r "$path" ]]
}
if is_readable "/etc/passwd"; then
    echo "OK"
else
    echo "Not readable"
fi

To return data, write to stdout (or a variable by indirection):

get_username() {
    printf '%s\n' "$USER"
}
name="$(get_username)"
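Bash 4.3+ also supports namerefs, which let a function write a result directly into a caller-supplied variable, avoiding the subshell of command substitution. A small sketch; `to_upper` is a name invented for this example:

```shell
to_upper() {
    local -n out_ref="$2"   # nameref: an alias for the caller's variable
    out_ref="${1^^}"        # uppercase via parameter expansion
}

to_upper "hello" result
echo "$result"   # HELLO
```

One caveat: the nameref's own name must differ from the caller's variable name, or bash reports a circular reference.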

Argument Handling and Shifting

For functions (and scripts) that take options, use a while/case loop over the arguments:

parse_args() {
    while [[ $# -gt 0 ]]; do
        case "$1" in
            -v|--verbose) VERBOSE=1; shift ;;
            -o|--output)  OUTPUT_FILE="$2"; shift 2 ;;
            -h|--help)    usage; exit 0 ;;
            --)           shift; break ;;
            -*)
                echo "Unknown option: $1" >&2
                usage
                exit 1
                ;;
            *)
                # Positional arguments
                INPUT_FILE="$1"
                shift
                ;;
        esac
    done
}

shift discards $1 and renumbers the remaining positional parameters, so the loop can process $@ one argument at a time; shift 2 drops an option together with its value.

Using `local` and Avoiding Globals

Globals make large scripts fragile. Prefer:

process_file() {
    local file="$1"
    local line
    while IFS= read -r line; do
        handle_line "$line"
    done < "$file"
}

Only keep true configuration or shared state as globals.

Recursion and Higher‑Level Patterns

Bash is not ideal for heavy recursion, but simple recursive patterns are okay:

walk_dir() {
    local dir="$1"
    local path
    for path in "$dir"/*; do
        [[ -e "$path" ]] || continue
        if [[ -d "$path" ]]; then
            walk_dir "$path"
        else
            echo "File: $path"
        fi
    done
}

Be careful with very deep directory trees; recursion depth is limited.
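If depth is a concern, you can sidestep recursion entirely by letting find do the traversal. A sketch equivalent to walk_dir for regular files, demonstrated on a throwaway directory tree:

```shell
list_files() {
    find "$1" -type f -print
}

# Demo on a temporary tree created just for this example:
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/a/b"
touch "$demo_dir/a/b/file.txt"
files="$(list_files "$demo_dir")"
echo "$files"
rm -rf "$demo_dir"
```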


Error Handling

Production scripts must handle failures predictably.

Exit Codes

Commands set $? to an exit status:

  * 0 means success
  * any non-zero value (1–255) means failure, and the specific value can carry meaning

You can set your own with exit N or return N.
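A quick illustration of reading and setting statuses:

```shell
false; echo "false exits with $?"   # 1

is_positive() {
    (( $1 > 0 ))   # the arithmetic test's status becomes the return value
}

is_positive 5; echo "status: $?"    # 0
is_positive 0; echo "status: $?"    # 1
```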

Custom Exit Codes

Define meaningful codes:

readonly E_BADARGS=2
readonly E_IO=3
if [[ $# -lt 1 ]]; then
    echo "Missing argument" >&2
    exit "$E_BADARGS"
fi

`set -euo pipefail` and Its Traps

set -e is helpful but can be surprising: it is suppressed when a command runs as part of an if or while condition, or on the left-hand side of || or &&. Example:

set -e
false || echo "will still run"

Here false does not abort the script because it appears on the left of ||; that suppression is often exactly what you want for handling expected failures inline.

When you expect failures, handle them explicitly:

if ! some_command; then
    echo "some_command failed" >&2
    cleanup
    exit 1
fi

Or use a wrapper function that always checks status.
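A sketch of such a wrapper; `try` is a name chosen for this example:

```shell
try() {
    "$@" && return 0
    local status=$?
    echo "Command failed (status $status): $*" >&2
    return "$status"
}

try true && echo "first ok"
try false || echo "failure handled"
```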

`trap` for Cleanup

Use trap to run code on exit or signals:

cleanup() {
    [[ -n "${TMPFILE:-}" && -f "$TMPFILE" ]] && rm -f "$TMPFILE"
}
trap cleanup EXIT INT TERM
TMPFILE="$(mktemp)"
# use TMPFILE

Common signals:

  * EXIT: a bash pseudo-signal that fires whenever the script exits, for any reason
  * INT: interrupt, usually Ctrl+C
  * TERM: the default signal sent by kill
  * HUP: hangup, e.g. when the controlling terminal closes

trap is crucial for scripts that create temp files, lock files, or modify state.
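For example, a sketch of a simple lock file released by trap even on early exit (the lock path is invented for this example):

```shell
lock_file="/tmp/mytool.$$.lock"

# noclobber makes the redirection fail if the lock file already exists.
if ! ( set -o noclobber; : > "$lock_file" ) 2>/dev/null; then
    echo "Another instance is running" >&2
    exit 1
fi
trap 'rm -f "$lock_file"' EXIT

echo "lock held: $lock_file"
```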

Validating Input and Environment

Check preconditions upfront:

require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "Required command not found: $1" >&2
        exit 1
    }
}
require_cmd "curl"
require_cmd "jq"
[[ -r "$CONFIG_FILE" ]] || {
    echo "Cannot read config: $CONFIG_FILE" >&2
    exit 1
}

Debugging Scripts

When scripts misbehave, use built‑in tracing and debugging tools.

`set -x` and `PS4`

set -x prints each command before executing it:

set -x
do_something
set +x

To make traces more readable, customize PS4:

export PS4='+ ${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]}: '
set -x

Trace output will then include file, line, and function info.

Tracing Parts of a Script

You rarely want full‑script tracing; enable it only around suspicious sections:

debug_section() {
    set -x
    critical_function "$@"
    set +x
}

`bash -x`, `bash -n`, `bash -u`

You can also enable these options from the command line when invoking a script:

  * bash -n script.sh: syntax-check the script without running it
  * bash -x script.sh: run it with tracing enabled
  * bash -u script.sh: treat references to unset variables as errors

These are useful during development even if you don’t want those options enabled permanently in the script.
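A quick demonstration on a throwaway script file (created just for this example):

```shell
tmp_script="$(mktemp)"
printf 'echo "hello $name"\n' > "$tmp_script"

# -n: parse only; nothing is executed
bash -n "$tmp_script" && echo "syntax ok"

# -u: the unset $name inside the script becomes a fatal error
bash -u "$tmp_script" 2>/dev/null || echo "unset variable caught"

rm -f "$tmp_script"
```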

Logging

Use a simple logging function with levels:

log() {
    local level="$1"; shift
    local ts
    ts="$(date '+%Y-%m-%d %H:%M:%S')"
    printf '%s [%s] %s\n' "$ts" "$level" "$*" >&2
}
log info "Starting import"
log warn "Using default configuration"
log error "Failed to connect to database"

You can control verbosity with a global LOG_LEVEL if desired.
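A sketch of that idea; the level names and numeric ordering here are choices for this example, not a standard:

```shell
declare -A LEVELS=([error]=0 [warn]=1 [info]=2 [debug]=3)
LOG_LEVEL=info

log() {
    local level="$1"; shift
    # Skip messages more verbose than the configured threshold.
    (( LEVELS[$level] <= LEVELS[$LOG_LEVEL] )) || return 0
    printf '[%s] %s\n' "$level" "$*" >&2
}

log error "always shown"
log debug "suppressed at LOG_LEVEL=info"
```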


Scheduled Tasks (cron)

cron runs commands or scripts on a schedule. It’s a core way to automate your shell scripts.

Basic `crontab` Usage

Each user can have a crontab:

  * crontab -e: edit your crontab
  * crontab -l: list its contents
  * crontab -r: remove it entirely

A crontab entry has this format:

MIN HOUR DOM MON DOW COMMAND

Where:

  * MIN: minute (0–59)
  * HOUR: hour (0–23)
  * DOM: day of month (1–31)
  * MON: month (1–12)
  * DOW: day of week (0–7; both 0 and 7 mean Sunday)
  * COMMAND: the command line to run

Examples

Run a backup script every day at 02:30:

30 2 * * * /usr/local/bin/backup.sh

Run a cleanup script every 15 minutes:

*/15 * * * * /usr/local/bin/cleanup-temp.sh

Run a report on weekdays at 06:00:

0 6 * * 1-5 /usr/local/bin/report.sh

Environment Differences in `cron`

Cron jobs run in a minimal environment:

  * PATH is short (often just /usr/bin:/bin)
  * Shell startup files such as .bashrc are not sourced
  * There is no terminal attached
  * The working directory is usually the user's home directory

To avoid surprises:

  1. Use absolute paths to commands and files:
   /usr/bin/find /var/log -type f -mtime +7 -delete
  2. Or set PATH explicitly at the top of the crontab:
   PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  3. Set any required environment variables:
   MAILTO=admin@example.com
   LANG=en_US.UTF-8

Redirecting Output

By default, cron emails any output to the user (if mail is configured). Explicitly handle logs:

0 3 * * * /usr/local/bin/backup.sh >>/var/log/backup.log 2>&1

This appends stdout and stderr to /var/log/backup.log.

Testing Cron‑Friendly Scripts

When writing scripts that will run under cron:

  * Use absolute paths to commands and files
  * Log to a file instead of relying on cron's mail
  * Test them in a minimal environment, e.g. env -i /bin/bash /path/to/script.sh
  * Support a --dry-run mode so you can verify behavior safely:

DRY_RUN=0
while [[ $# -gt 0 ]]; do
    case "$1" in
        --dry-run) DRY_RUN=1; shift ;;
        *) shift ;;
    esac
done
run_cmd() {
    if (( DRY_RUN )); then
        echo "[DRY-RUN] $*"
    else
        "$@"
    fi
}
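Usage then looks like this; with dry-run enabled, the destructive command is only printed:

```shell
DRY_RUN=1   # as if --dry-run had been passed

run_cmd() {
    if (( DRY_RUN )); then
        echo "[DRY-RUN] $*"
    else
        "$@"
    fi
}

run_cmd rm -rf /var/tmp/report   # prints: [DRY-RUN] rm -rf /var/tmp/report
```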

Putting It Together: Example Pattern

A moderately advanced skeleton script:

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
export PS4='+ ${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]}: '
script_name="${0##*/}"
readonly DEFAULT_LOG="/var/log/${script_name%.sh}.log"
LOG_FILE="$DEFAULT_LOG"
VERBOSE=0
usage() {
    cat <<EOF
Usage: $script_name [OPTIONS] INPUT
Process INPUT and generate a report.
Options:
  -o, --output FILE   Write output to FILE
  -l, --log FILE      Log to FILE (default: $DEFAULT_LOG)
  -v, --verbose       Verbose logging
  -h, --help          Show this help
EOF
}
log() {
    local level="$1"; shift
    local ts
    ts="$(date '+%Y-%m-%d %H:%M:%S')"
    if (( VERBOSE )) || [[ "$level" != "DEBUG" ]]; then
        printf '%s [%s] %s\n' "$ts" "$level" "$*" | tee -a "$LOG_FILE" >&2
    fi
}
cleanup() {
    [[ -n "${WORK_DIR:-}" && -d "$WORK_DIR" ]] && rm -rf "$WORK_DIR"
}
trap cleanup EXIT INT TERM
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        log "ERROR" "Missing required command: $1"
        exit 1
    }
}
parse_args() {
    OUTPUT=""
    INPUT=""
    while [[ $# -gt 0 ]]; do
        case "$1" in
            -o|--output)
                OUTPUT="$2"; shift 2 ;;
            -l|--log)
                LOG_FILE="$2"; shift 2 ;;
            -v|--verbose)
                VERBOSE=1; shift ;;
            -h|--help)
                usage; exit 0 ;;
            --)
                shift; break ;;
            -*)
                log "ERROR" "Unknown option: $1"
                usage
                exit 2
                ;;
            *)
                INPUT="$1"; shift ;;
        esac
    done
    if [[ -z "$INPUT" ]]; then
        log "ERROR" "Missing INPUT"
        usage
        exit 2
    fi
}
process_input() {
    local input="$1"
    log "INFO" "Processing $input"
    # ... do work here ...
}
main() {
    require_cmd "awk"
    require_cmd "sed"
    # Use a private name, not TMPDIR: mktemp and other tools consult the
    # TMPDIR environment variable, and cleanup must never rm -rf it.
    WORK_DIR="$(mktemp -d)"
    parse_args "$@"
    process_input "$INPUT"
    log "INFO" "Done"
}
main "$@"

This script demonstrates:

  * safe shell options and a custom PS4 for tracing
  * usage/help text and option parsing
  * leveled logging to stderr and a log file
  * trap-based cleanup of a temporary working directory
  * upfront dependency checks
  * a main function as the single entry point

Use this as a starting point for your own advanced scripts, expanding it with patterns from this chapter as your needs grow.
