Overview
Digital forensics and incident response (DFIR) on Linux is about answering four questions in a structured way:
- What happened?
- Where did it happen?
- When did it happen?
- How bad is it (impact and scope)?
This chapter gives you a high‑level, Linux‑focused view of how these activities fit together so you can later dive into the more detailed sub‑chapters: collecting evidence, log and file recovery, analyzing suspicious processes, and incident response workflow.
You won’t become a professional forensics analyst from this one chapter, but you’ll understand:
- What makes Linux DFIR different from “generic” incident handling
- What kinds of questions and artifacts matter on Linux systems
- How the sub‑topics in this part of the course connect into a coherent practice
Typical Linux Incident Scenarios
Understanding common scenarios helps you know what to look for:
- Compromised SSH access
- Stolen credentials, key misuse, brute‑force, or exposed root login.
- Often leaves traces in auth.log/secure, lastlog, wtmp, and the systemd journal (journalctl).
- Web application compromise
- Vulnerable PHP/CGI apps running on Apache/Nginx.
- Webshells, unexpected files under /var/www, modified configs, unusual processes running as www-data/apache/nginx.
- Local privilege escalation
- Attacker gets from an unprivileged user to root.
- Kernel exploits, misconfigurations, setuid binaries, or abusing sudo.
- Malicious persistence
- Attacker ensures access after reboot.
- Systemd units, cron jobs, modified shells, backdoored binaries, unusual scripts in startup locations.
- Data exfiltration or ransomware
- Data being copied off host or encrypted.
- Sudden spikes in network or disk usage, mass file changes, strange archives appearing.
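As a first impression of what these scenarios leave behind, a few quick checks can surface obvious traces (a minimal sketch; the log path and the /var/www web root are assumptions that vary by distribution and setup):

```bash
# Failed and accepted SSH logins (Debian/Ubuntu path; RHEL/CentOS uses /var/log/secure)
grep -E 'Failed password|Accepted' /var/log/auth.log | tail -n 50

# Recent logins and reboots recorded in wtmp
last -n 20

# Files changed under the web root in the last 7 days
find /var/www -type f -mtime -7 -ls
```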
In later sub‑chapters you’ll see the concrete artifacts (logs, files, memory, processes) you can collect to investigate these.
Core Principles of Linux Forensics
Linux forensics is guided by some fundamental ideas that shape how you work:
1. Minimize Changes to the System
Every command you run changes the system (timestamps, logs, caches). For forensics you:
- Prefer read‑only access (mounts, copies, external boot media).
- Capture data systematically before doing heavy “fixing.”
- Understand the trade‑off: an active incident may require immediate containment even if it disturbs evidence.
Common practical strategies:
- Use statically linked tools from trusted media (USB, read-only NFS) instead of possibly compromised binaries in /usr/bin.
- Avoid editing or deleting anything until after an initial evidence-collection pass.
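For example, a common approach is to carry a prepared USB stick with statically linked tools, mount it read-only, and run those binaries instead of the host's own (a sketch; the device name /dev/sdb1, the mount point /mnt/ir, and the presence of a static busybox are assumptions):

```bash
# Mount the trusted toolkit read-only so the investigation adds nothing to it
mkdir -p /mnt/ir
mount -o ro /dev/sdb1 /mnt/ir

# Prefer the known-good, statically linked binaries over the host's /usr/bin
/mnt/ir/busybox ps
/mnt/ir/busybox netstat -tulpn
```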
2. Order of Volatility
Some evidence disappears very quickly. The typical order (from most volatile to least):
- CPU and memory state: processes, in‑memory keys, network connections.
- Network state: connections, routing tables, ARP caches.
- Temporary files in /tmp, /run, and other in-memory filesystems.
- Log files and application data.
- Static files and disk images.
Priority: capture the most volatile data first when feasible (memory, current connections, running processes), then work your way down.
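In practice that ordering often translates into a short collection pass run before anything else (a minimal sketch; /mnt/evidence is an assumed external evidence mount, and memory acquisition itself is covered in a later chapter):

```bash
# Write everything to external storage, never to the suspect disk
out=/mnt/evidence/$(hostname)-$(date -u +%Y%m%dT%H%M%SZ)
mkdir -p "$out"

ps auxww                         > "$out/processes.txt"      # running processes
ss -tunap                        > "$out/sockets.txt"        # connections and owning processes
{ ip addr; ip route; ip neigh; } > "$out/network.txt"        # interfaces, routes, ARP cache
ls -laR /tmp /run                > "$out/tmpfs-listing.txt" 2>&1   # volatile filesystems
```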
3. Chain of Custody and Integrity
If you might ever have to prove what happened (internal investigation, legal case, audit), you need:
- A record of who collected data, when, how, and where it was stored.
- Integrity checks (hashes like SHA‑256) for collected artifacts such as disk images, logs, and memory dumps:
$$ \text{hash} = \text{SHA256}(\text{file}) $$
Linux makes this relatively easy:
- Use sha256sum to compute and verify hashes.
- Store hashes in a separate, protected location (e.g., another system or a secure notebook).
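For instance (the artifact paths are placeholders):

```bash
# Record hashes for everything collected so far
sha256sum /mnt/evidence/server01/*.img /mnt/evidence/server01/*.txt > server01-hashes.sha256

# Later, anyone holding the hash list can verify the artifacts are unchanged
sha256sum -c server01-hashes.sha256
```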
4. Prefer Artifacts over Assumptions
Linux is flexible; there are many ways to hide or persist malicious activity. Rely on artifacts, not assumptions:
- Don’t assume “nobody can log in as root” because your team policy says so; verify sshd_config, sudoers, and the logs.
- Don’t assume a file is legitimate because “we always have something in that directory”; compare checksums, timestamps, and package metadata.
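A quick way to turn those assumptions into checked facts might look like this (a sketch; run the package-verification variant that matches your distribution):

```bash
# Is root SSH login really disabled? Check the effective and on-disk config
sshd -T 2>/dev/null | grep -i permitrootlogin
grep -Ei '^\s*PermitRootLogin' /etc/ssh/sshd_config

# Who can actually use sudo?
grep -vE '^\s*(#|$)' /etc/sudoers /etc/sudoers.d/* 2>/dev/null

# Do packaged files still match the package database?
debsums -c      # Debian/Ubuntu
rpm -Va         # RHEL/Fedora
```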
You will learn specific artifact locations and formats in the sub‑chapters.
Key Linux Forensic Artifacts (High-Level)
You will explore these in detail later, but it’s helpful to see the big picture now.
Process and Memory Information
- Running processes and their trees
- Open files and sockets
- Mapped libraries and modules
These answer questions like:
- “What is running right now?”
- “What is this process connected to?”
- “Is this binary what it claims to be?”
Linux tooling in this space includes ps, lsof, ss, top, direct /proc inspection, and dedicated memory acquisition tools; all of these are discussed in later chapters.
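As a taste of what that looks like in practice (a sketch; PID 1234 is a placeholder for whatever process you are investigating):

```bash
# Process tree with owners, start times, and full command lines
ps -eo pid,ppid,user,lstart,cmd --forest

# Open files, sockets, and mapped libraries for one suspicious process
lsof -p 1234

# Listening sockets and the processes that own them
ss -tulpn

# Which binary is this process really running? (deleted binaries show up here too)
ls -l /proc/1234/exe
```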
Log Files and Journals
Linux logging is typically split between:
- System logs in /var/log/ (e.g., auth.log, secure, messages, syslog).
- Application logs (web server logs, database logs, etc.).
- The systemd journal (via journalctl) on modern systems.
From a forensic perspective, logs are your timeline source:
- Who authenticated successfully and when?
- What services started/stopped?
- Which IPs were making requests?
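A few representative queries that help answer these questions (a sketch; the log paths, the ssh unit name, and the date range are assumptions that differ between distributions):

```bash
# Successful and failed SSH authentications from file-based logs
grep -E 'Accepted|Failed password' /var/log/auth.log    # Debian/Ubuntu
grep -E 'Accepted|Failed password' /var/log/secure      # RHEL/Fedora

# The same questions asked of the systemd journal
journalctl -u ssh --since "2024-05-01" --until "2024-05-08"
journalctl --list-boots        # reboots make useful timeline anchors

# Logins and reboots recorded in wtmp
last -f /var/log/wtmp
```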
You’ll see concrete techniques for log and file recovery in the corresponding chapter.
Filesystem and Metadata
Not just what’s in a file, but how the filesystem describes it:
- Ownership, permissions, and extended attributes
- Timestamps (access, modification, inode change, and, where the filesystem supports it, creation/birth time)
- Mount options and filesystem types
- Deleted files that can sometimes be recovered or analyzed from raw disk
These artifacts help you:
- Spot unusual writable locations (e.g., world‑writable directories).
- Identify newly added or altered binaries and scripts.
- Investigate when and by whom items were modified.
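Typical metadata checks include the following (a sketch; the suspicious path is a placeholder):

```bash
# Owner, mode, and all timestamps the filesystem exposes for one file
stat /usr/local/bin/suspicious-binary

# World-writable directories outside the expected tmpfs locations
find / -xdev -type d -perm -0002 ! -path '/tmp*' ! -path '/var/tmp*' 2>/dev/null

# Files modified in the last 3 days under common persistence locations
find /etc /usr/local/bin /var/spool/cron -xdev -type f -mtime -3 -ls 2>/dev/null

# Setuid binaries, to compare against a known-good baseline
find / -xdev -perm -4000 -type f 2>/dev/null
```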
Network State
Even if you have centralized network monitoring, the host view is crucial:
- Which connections are open now?
- Which ports are listening, and which processes own them?
- What routes and DNS settings are in effect?
Linux provides rich tools for this; they are covered in the networking and monitoring chapters, but in DFIR you use them to detect unexpected or malicious communication.
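For example (a sketch; resolvectl only applies on hosts using systemd-resolved):

```bash
# Established connections and the processes behind them
ss -tnp state established

# Listening services, including the owning PID and program name
ss -tulpn

# Routing table, ARP cache, and DNS settings currently in effect
ip route
ip neigh
cat /etc/resolv.conf
resolvectl status 2>/dev/null
```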
Forensics vs Incident Response
You can think of DFIR as two intertwined activities:
- Forensics: Observe and understand. Collect and analyze evidence to reconstruct events.
- Incident response: Decide and act. Contain, eradicate, recover, and communicate.
On Linux, the interplay looks like this:
- Detection
- A monitoring alert, a log anomaly, or a user report triggers suspicion.
- Triage
- High‑level checks: is the host obviously compromised or misconfigured?
- Evidence collection
- Capture volatile data and key artifacts with minimal disturbance.
- Analysis
- Use the collected data to identify root cause, affected components, and scope.
- Containment/Eradication
- Stop the attack, remove persistence, and close exploited paths.
- Recovery and Hardening
- Restore from clean sources, patch vulnerabilities, and improve defenses.
- Post‑incident review
- Document the case and feed lessons learned into improvements.
The later “Incident response workflow” chapter will break this down into concrete, repeatable steps; here you just need the conceptual framework.
Live vs Offline Forensics
On Linux you often choose between investigating a running system or analyzing it offline:
Live Forensics
You analyze the system while it is still running.
Advantages:
- Access to volatile data (processes, memory, connections).
- Quicker triage and potential rapid containment.
Disadvantages:
- You inevitably alter the system by interacting with it.
- An adversary might detect your actions in real time.
- Rootkits may hide reality from your tools.
Use cases:
- Active attacks, urgent containment, or when you can’t take systems down.
Offline Forensics
You shut down (or isolate) the system and work from images or copies:
- Disk images
- Memory dumps
- Log exports
Advantages:
- You can work without fear of altering the live system.
- Easier to preserve and demonstrate evidence integrity.
- You can use more intensive tools without impacting production.
Disadvantages:
- No chance to collect volatile data if you already powered off.
- May require downtime or failover.
In Linux environments with clustering or high availability, offline forensics is often practical: fail over traffic, isolate the suspect node, then image it.
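Once the node is isolated, imaging and hashing from trusted media might look like this (a sketch; /dev/sda and the evidence path are assumptions, and dedicated imagers such as dc3dd or ewfacquire are common alternatives):

```bash
# Bit-for-bit image of the suspect disk, written to external evidence storage
dd if=/dev/sda of=/mnt/evidence/server01-sda.img bs=4M conv=noerror,sync status=progress

# Hash the source device and the image so integrity can be demonstrated later
sha256sum /dev/sda /mnt/evidence/server01-sda.img
```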
Working in Compromised Environments
Forensics on a potentially compromised Linux host has extra complications:
- Untrusted binaries
- System tools like ps or ls may have been replaced.
- Prefer known-good tools from a read-only source.
- Kernel‑level tampering
- Rootkits can hide processes, files, and network connections.
- Cross-checking information from multiple sources (e.g., /proc vs command output) can reveal inconsistencies; see the sketch at the end of this section.
- Time manipulation
- System clock or timestamps may be altered.
- Correlate with external sources (other hosts, infrastructure logs) when building timelines.
- Log tampering
- Logs can be truncated or edited.
- Remote log aggregation and journaling can mitigate this risk.
Your goal is to gather enough evidence to corroborate a story using multiple independent artifacts, not just a single suspicious log line or one command’s output.
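One such cross-check compares the process list reported by ps with what /proc itself exposes (a sketch; write the output to your evidence mount rather than the suspect disk, and repeat it a few times, since short-lived processes cause harmless one-off differences):

```bash
# PIDs according to the procfs directory listing
ls /proc | grep -E '^[0-9]+$' | sort -n > /mnt/evidence/pids.proc

# PIDs according to ps
ps -eo pid= | tr -d ' ' | sort -n > /mnt/evidence/pids.ps

# A PID visible in one view but not the other deserves a much closer look
diff /mnt/evidence/pids.proc /mnt/evidence/pids.ps
```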
Collaboration and Documentation
Linux DFIR rarely happens in isolation. You’ll often work with:
- System administrators and DevOps engineers (knowledge of normal configs and behavior).
- Security team members (SIEM, IDS, vulnerability data).
- Application owners (impact on business logic and data).
To make that collaboration effective, you need to:
- Document everything
- Commands run (with timestamps and outputs, where possible).
- Copies and artifacts you created (paths, hashes, storage locations).
- Decisions made and by whom.
- Use consistent time references
- Prefer UTC when writing reports.
- Note the system’s timezone and any NTP offsets.
- Be explicit about uncertainty
- Separate what you know from what you suspect.
- Call out missing data (e.g., logs rotated before you could collect them).
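A lightweight way to cover the first two points is to record the whole terminal session to external storage and note the host's clock context up front (a sketch; the evidence path is an assumption):

```bash
# Record every command and its output for the rest of the session
script -a /mnt/evidence/ir-session-$(date -u +%Y%m%dT%H%M%SZ).log

# Capture the clock context before trusting any local timestamps
date -u          # current time in UTC
timedatectl      # timezone and NTP synchronization status on systemd hosts
```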
Building Linux DFIR Skills
To become strong in Linux forensics and incident response, it helps to:
- Deepen your system internals understanding (processes, memory, filesystem behavior, logs).
- Practice in labs and capture‑the‑flag (CTF) environments with realistic attack traces.
- Build your own checklists and runbooks: tailored command sequences, evidence‑collection scripts, and triage processes for your environment.
- Integrate DFIR capabilities into your monitoring and configuration management:
- Centralized logging
- Baseline configuration and file integrity tracking
- Automated collection of key artifacts during incidents
The following sub‑chapters in this part of the course will now walk through:
- How to collect evidence on Linux safely and systematically
- How to recover and interpret logs and files
- How to analyze suspicious processes and behaviors
- How to implement a practical, repeatable incident response workflow tailored to Linux systems