Why Logging and Auditing Matter
Linux gives you enormous power—but with power comes responsibility. Logging and auditing are how you answer questions like:
- What happened?
- When did it happen?
- Who did it?
- From where?
- Was it allowed or blocked?
In this chapter, you’ll learn how Linux records system activity, how to read those records efficiently, and how to design a logging/auditing strategy that is useful in real-world operations, troubleshooting, and security.
Systemd’s journalctl and traditional syslog files (e.g. in /var/log) are covered in other chapters; here we look at how to think about logging holistically and how to tie logs and auditing together as an advanced skill.
Types of Events You Should Log
Before focusing on tools, decide what kinds of events you care about. Typical categories:
- Authentication & authorization
  - SSH logins and failures
  - `sudo` and privilege escalations
  - Account lockouts, password changes
- System changes
  - Service restarts/reloads
  - Package installs/removals
  - Kernel messages, hardware events
- Application and web access
  - HTTP access logs, error logs
  - API calls and failures
  - Database connections and slow queries
- Security-relevant events
  - Firewall blocks
  - SELinux/AppArmor denials
  - IDS/IPS alerts
  - File integrity alerts
- Compliance/audit events
  - Changes to critical configs
  - Access to sensitive files
  - Administrative operations (user/group changes, `mount`, `chmod`, `chown`)
You won’t log everything forever; instead, decide:
- What is must-keep (for security, compliance, or forensic reasons)
- What is useful (for troubleshooting)
- What is noise (events that can be sampled or discarded)
Core Logging Architecture Concepts
Linux logging is usually built from a few core building blocks:
- Producers
  Programs that generate log messages:
  - kernel
  - daemons and services
  - applications (web servers, DB servers, etc.)
  - security components (firewalls, SELinux/AppArmor, IDS)
  - audit subsystems (e.g. `auditd`)
- Collectors / routers
  - systemd-journald
  - syslog daemons (e.g. `rsyslog`, `syslog-ng`)
  - audit daemons (`auditd`)
  - agents for external platforms (e.g. `filebeat`, `fluentd`, `vector`)
- Storage
  - Local log files under `/var/log`
  - Binary journals (systemd)
  - Remote log servers via syslog or custom protocols
  - Central log platforms (ELK/OpenSearch, Graylog, Splunk, Loki, etc.)
- Consumers
  - Humans (admins, security teams)
  - Alerting systems (email, PagerDuty, Slack, etc.)
  - Dashboards and reports
  - Automated responses (scripts, SOAR tools)
Important design principle:
Logging and auditing become powerful when you centralize and correlate logs from many sources.
Designing a Logging Strategy
Instead of simply “turning on everything”, design your logging with intent.
1. Define Goals
Common goals:
- Operations: detect failures and performance issues quickly
- Security: detect suspicious behavior, policy violations, intrusions
- Compliance: provide evidence for audits (e.g. who did what, when)
- Forensics: reconstruct timelines after an incident
Each goal implies different retention, detail level, and alerting.
2. Decide on Log Levels
Log messages typically use levels such as:
- `emerg`/`alert`/`crit` – system unusable or critical failures
- `err` – errors that require attention
- `warning` – potential issues
- `notice` – normal but significant
- `info` – routine information
- `debug` – very detailed, for troubleshooting
Guidelines:
- On production systems:
  - Default to `info` or `notice` for services.
  - Enable `debug` only for a limited time when debugging.
- On security-sensitive components:
  - Use more detailed logging (but watch out for size and privacy).
3. Retention and Rotation
Logs grow quickly. You need a policy:
- Rotation (e.g. via `logrotate`):
  - Rotate by size or time (daily/weekly).
  - Compress older logs (`gzip`, `xz`).
  - Limit the number of rotated archives.
- Retention:
  - Short (hours–days) for debug noise.
  - Medium (weeks–months) for operational logs.
  - Long (months–years) for security/compliance, but often on central servers or cold storage, not on each host.
Balance cost vs. usefulness: old logs that nobody can query are not helpful.
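The rotation bullets above map directly onto a `logrotate` policy. A minimal sketch, assuming a hypothetical application that writes to `/var/log/myapp/` (the paths, counts, and service name are all illustrative):

```
/var/log/myapp/*.log {
    weekly
    rotate 8
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        # Ask the (hypothetical) service to reopen its log files.
        systemctl reload myapp >/dev/null 2>&1 || true
    endscript
}
```

`delaycompress` keeps the most recent rotated file uncompressed, which avoids truncating messages from a daemon that briefly keeps the old file handle open.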
4. Centralization
For anything beyond a few servers, centralize logs:
- Use syslog over the network (the `@server` or `@@server` conventions) or log agents.
- Ensure reliable transport (TCP, TLS) for critical logs.
- Tag logs with:
  - hostname
  - application
  - environment (prod/test/dev)
  - service name or role
Centralization enables:
- Cross-host correlation
- Single place for searching
- Simple integration with alerting and dashboards
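With `rsyslog`, the `@server`/`@@server` conventions above look like this. A sketch, assuming a hypothetical central server `logs.example.com` (the hostname and queue file name are placeholders):

```
# /etc/rsyslog.d/50-forward.conf
# A single @ forwards over UDP; a double @@ forwards over TCP.
*.info @@logs.example.com:514

# Spool to disk so messages survive short outages of the central server.
$ActionQueueType LinkedList
$ActionQueueFileName fwdqueue
$ActionResumeRetryCount -1
```

For critical logs, layering TLS on top of the TCP transport (rsyslog's gtls driver, or a log agent with TLS support) is the usual next step.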
Auditing vs Logging
Logging and auditing overlap but are not identical:
- Logging:
  - Broad, mostly for troubleshooting and visibility.
  - May be incomplete or high-level (“service X restarted”, “HTTP 500 error”).
- Auditing:
  - Focused on events relevant to security, accountability, or compliance.
  - Aimed at answering “who did what, where, and when?”.
  - More structured and controlled.
On Linux, auditing typically means:
- Kernel-level auditing via the audit subsystem (handled by `auditd`).
- High-fidelity records about:
  - syscalls (`open`, `execve`, `chown`, `mount`, etc.)
  - access to specific files or directories
  - changes to identities (users, groups, capabilities)
  - security policy decisions (e.g. MAC denials)
You’ll usually use:
- Logs for: routine operations, debugging, and basic security monitoring.
- Audit trails for: investigations, compliance, and fine-grained traces of actions.
Common Events to Include in an Audit Policy
A typical Linux audit configuration focuses on events like:
- Identity & access
  - Logins and logouts
  - SSH key usage
  - `sudo` invocations and their results
  - Changes to users and groups
- Privilege boundaries
  - Use of `setuid` binaries
  - Raising capabilities
  - `su` and `doas` usage
- Critical file accesses
  - `/etc/passwd`, `/etc/shadow`
  - `/etc/sudoers` and `sudoers.d`
  - Security configs (e.g. SSHD configs, PAM configs, firewall configs)
  - Application secrets (keys, credentials) directories
- System configuration changes
  - Changes to network configuration
  - Mount/umount operations
  - Kernel parameter tuning (e.g. via `/proc/sys`)
- Audit system integrity
  - Modifications of the audit rules themselves
  - Attempts to disable logging or auditing
Not all systems require the same level. For a personal laptop, heavy audit rules may be overkill; for a payment processing server, they may be mandatory.
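Several of the events above can be captured with a handful of audit rules. A sketch (the `-k` key names are illustrative; rules like these typically live in a file under `/etc/audit/rules.d/`):

```
# Watch identity files for writes and attribute changes (-p wa),
# tagging matching events with a key (-k) for later searching.
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k sudoers
-w /etc/sudoers.d/ -p wa -k sudoers

# Record mount syscalls made by real (non-system) users.
-a always,exit -F arch=b64 -S mount -F auid>=1000 -F auid!=unset -k mounts
```

The keys pay off at search time: `ausearch -k identity` pulls back only the tagged events instead of the whole audit trail.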
Building an End‑to‑End Logging & Auditing Workflow
This is about how logs and audits flow through your environment.
1. Ingestion
- System and app logs go to:
  - journald
  - syslog daemons
  - application-specific log files
- Audit events go to:
  - `auditd` logs
2. Normalization
Different sources use different formats. Common normalization steps:
- Parse timestamps into a consistent format (e.g. UTC).
- Extract fields such as:
  - source IP
  - username
  - process name / PID
  - event type (login, file access, command execution)
- Assign categories (auth, system, network, application, audit, etc.).
Central platforms and agents often provide parsers for common log formats.
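As a tiny illustration of field extraction, here is one way to pull a source IP, path, and status code out of an Apache-style access log line with standard shell tools (the sample line is made up):

```shell
# A sample line in the common/combined access log format.
line='203.0.113.9 - - [10/Jan/2024:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 5321'

# Strip the quotes, then pick fields by position.
echo "$line" | awk '{ gsub(/"/, ""); print "ip=" $1, "path=" $7, "status=" $9 }'
# → ip=203.0.113.9 path=/index.html status=200
```

Real pipelines use proper parsers (grok patterns, structured JSON logging), but the principle is the same: turn free-text lines into named fields.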
3. Storage and Indexing
Design decisions:
- Hot storage (fast search, recent days/weeks).
- Warm storage (slower, longer retention).
- Cold/archive (cheap storage, rarely queried, maybe offline).
Ensure:
- Clocks are synchronized across servers (NTP is critical).
- Index by at least:
  - time
  - host
  - program/service
  - severity
4. Detection and Alerting
You can’t manually read all logs. Rely on:
- Static rules:
  - “Alert if more than N failed SSH logins from the same IP in 5 minutes”
  - “Alert if `sudo` fails more than N times”
  - “Alert on any modification of `/etc/sudoers`”
- Thresholds:
  - High 5xx rate in HTTP logs
  - Sudden spike in `auth` failures
  - Unusual rate of `rm` or `chmod` calls (via audit)
- Baselines & anomaly detection (in advanced systems):
  - Detect deviations from normal access patterns.
Set alerts for actionable conditions; too many alerts cause people to ignore them.
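The first static rule above (“N failed SSH logins from the same IP”) can be approximated with nothing more than standard text tools. A sketch using a fabricated auth-log excerpt (real `/var/log/auth.log` formats vary by distribution):

```shell
# Create a small sample in the style of OpenSSH auth-log entries.
cat > /tmp/auth-sample.log <<'EOF'
Jan 10 10:01:02 host1 sshd[1234]: Failed password for root from 203.0.113.5 port 22 ssh2
Jan 10 10:01:09 host1 sshd[1235]: Failed password for admin from 203.0.113.5 port 22 ssh2
Jan 10 10:02:30 host1 sshd[1236]: Failed password for bob from 198.51.100.7 port 22 ssh2
EOF

# Grab the IP that follows "from" on each failed-password line,
# then count occurrences per IP, busiest first.
awk '/Failed password/ { for (i = 1; i <= NF; i++) if ($i == "from") print $(i+1) }' \
    /tmp/auth-sample.log | sort | uniq -c | sort -rn
```

A real alerting system does the same aggregation continuously with a sliding time window; tools like fail2ban automate both the detection and the response.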
5. Response and Enrichment
Responses range from manual to automatic:
- Manual:
  - Investigate the logs, correlate events, maybe escalate.
- Semi-automatic:
  - Pre-written runbooks and scripts help respond quickly.
- Automatic:
  - Block IPs based on thresholds.
  - Disable accounts after suspicious events.
  - Revert configuration via configuration management.
Enrichment adds value:
- Map IPs to geographic locations or internal roles.
- Link logs to asset inventory (which app/team owns this server?).
- Attach context (for example, which change/commit was deployed just before an error spike).
Logging and Auditing for Different Environments
The “right” setup depends on scale and importance.
Single Server / Small Homelab
Goals:
- Basic forensic and troubleshooting ability.
- Some security visibility.
Typical approach:
- Use default system logs and rotate with `logrotate`.
- Keep authentication logs and key service logs for at least a few weeks.
- Optionally enable lightweight audit rules for critical files.
- Manually inspect logs when troubleshooting or after suspicious events.
Small–Medium Organization
Goals:
- Centralize logs from multiple servers.
- Provide basic alerting and dashboards.
Typical approach:
- Choose a central log server or log platform.
- Standardize:
  - log formats (where possible)
  - timezone (UTC)
  - host naming conventions
- Deploy minimal, standardized audit rules on all servers, with more detailed rules on critical systems.
Enterprise / Compliance-Driven
Goals:
- Detailed, reliable, and tamper-resistant audit trails.
- Strong incident response and compliance reporting.
Typical approach:
- Dedicated logging infrastructure with redundancy.
- Syslog or agents sending to SIEM or log platforms.
- Advanced audit rules on sensitive systems, carefully tuned to avoid overload.
- Immutable or write-once storage for certain logs.
- Strict access controls and monitoring for the logging system itself.
Best Practices and Common Pitfalls
Best Practices
- Time synchronization is non-negotiable
  - Always run NTP/chrony.
  - Use UTC in logs and dashboards.
- Protect log integrity
  - Restrict access to logs (`/var/log`, audit logs, the central platform).
  - Consider:
    - checksums/hashes for critical logs
    - append-only flags
    - shipping logs off-host as soon as possible
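The checksum idea above is cheap to implement with `sha256sum`. A sketch (paths are illustrative; for append-only flags you would additionally run `chattr +a` on the file as root):

```shell
# Archive a log copy and record its checksum at archive time.
mkdir -p /tmp/log-archive
printf 'event A\nevent B\n' > /tmp/log-archive/app.log
sha256sum /tmp/log-archive/app.log > /tmp/log-archive/app.log.sha256

# Later: verify integrity; a non-zero exit status means the file changed.
sha256sum -c /tmp/log-archive/app.log.sha256
```

Storing the `.sha256` files on a different host (or write-once storage) is what makes the check meaningful: an attacker who can edit the log should not also be able to edit its recorded hash.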
- Avoid logging sensitive data unnecessarily
  - Do not log:
    - plain-text passwords
    - full credit card numbers
    - secret keys and tokens
  - If necessary, mask or tokenize sensitive fields.
- Test your logging and alerting regularly
  - Simulate:
    - failed login bursts
    - service failures
    - audit events (e.g. reading a critical file)
  - Confirm:
    - events are generated
    - they reach the central system
    - alerts fire as expected
- Document your log schema and rules
  - Where are logs stored?
  - How long are they kept?
  - Which events are monitored?
  - What alerts exist, and who owns them?
Common Pitfalls
- Overlogging
  - Turning on verbose debug logging everywhere indefinitely.
  - Consequences: high disk use, unreadable noise, and cost.
- Underlogging
  - Missing key events, such as:
    - failed login attempts
    - changes to security-related configs
    - service crashes/restarts
- Not watching the watchers
  - Ignoring logs from:
    - the logging infrastructure
    - the audit subsystem
  - If those fail or are tampered with, you may lose visibility entirely.
- Treating logs as an afterthought
  - Trying to “retrofit” logging/auditing only after an incident.
  - Set up at least minimal logging & auditing when deploying new services.
Using Logs and Audit Trails During an Incident
When something goes wrong—security breach, data loss, or major outage—logs and audits are your primary evidence.
A typical workflow:
- Define the timeframe
  - When was the issue first noticed?
  - Use alerts, user reports, or monitoring data as boundaries.
- Identify relevant sources
  - Authentication logs
  - Service logs
  - System logs (kernel, hardware)
  - Audit logs (for file and process activity)
- Build a timeline
  - List events by time:
    - user logins
    - commands run
    - configuration changes
    - crashes, restarts
  - Look for cause-and-effect relationships.
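Building a timeline often starts with merging logs from several sources into one time-ordered stream. When timestamps are ISO 8601 in UTC, they sort correctly as plain strings, so a plain `sort` is enough. A sketch with two fabricated log files:

```shell
cat > /tmp/web.log <<'EOF'
2024-01-10T10:00:05Z web GET /login 200
2024-01-10T10:00:20Z web POST /admin 500
EOF
cat > /tmp/db.log <<'EOF'
2024-01-10T10:00:12Z db connection opened from app1
EOF

# Merge both files, ordered by the leading timestamp field.
sort -k1,1 /tmp/web.log /tmp/db.log
```

This is one reason the earlier sections insist on UTC and consistent timestamp formats: mixed timezones or ambiguous formats make even this simple merge unreliable.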
- Correlate across systems
  - Use hostnames, IPs, user IDs, and session IDs.
  - Tie together:
    - web requests
    - app logs
    - DB queries
    - OS-level audit events
- Preserve evidence
  - Copy relevant logs to a safe place.
  - Ensure they cannot be modified.
  - Document the steps you take.
- Feed back into improvements
  - After the incident, update:
    - logging rules (to capture what was missing)
    - alert rules (to detect earlier next time)
    - documentation and training
Integrating Logging and Auditing with Other Tools
Logging and auditing become much more powerful when combined with other systems:
- Configuration management (Ansible, Puppet, etc.)
  - Use it to deploy standardized logging/audit configurations.
- Monitoring + alerting (Prometheus, Nagios, etc.)
  - Combine metrics (CPU, errors/sec) with logs for full context.
- Security tools (IDS/IPS, vulnerability scanners)
  - Feed their alerts into the central log system.
- Orchestration / containers
  - Ensure containers send logs to the host or directly to your log platform.
  - Understand how audit rules behave in containerized environments (namespaces and cgroups can change how events are seen).
Summary
In advanced Linux administration, logging and auditing are not optional—they are how you observe, understand, and defend your systems.
Key ideas:
- Treat logs and audits as part of system design, not an afterthought.
- Decide what to log, how long to keep it, and how to centralize it.
- Use audit trails for fine-grained accountability and compliance.
- Normalize, index, and monitor logs to support rapid troubleshooting and incident response.
- Continuously refine your logging and auditing based on real-world incidents and evolving requirements.
Subsequent sections will dive into the specifics of systemd logging, traditional /var/log files, auditd, log rotation, and creating custom logs.