Incident response mindset for evidence collection
In forensics and incident response, evidence collection must balance speed, accuracy, and preservation. In practice this means:
- Act as if every action could be replayed in court or audited.
- Prioritize preserving volatile data that will disappear soon.
- Avoid changing the system more than absolutely necessary.
- Record exactly who did what, when, and how.
This chapter focuses on what to collect, in which order, and how, with Linux-specific tools and practices.
Key principles:
- Order of volatility (OOV): Collect the most volatile data first.
- Minimal footprint: Prefer read-only access, external tools, and mounted media.
- Reproducibility: Everything must be re-doable from your notes, commands, and hashes.
- Chain of custody: Evidence must be clearly traceable through its lifetime.
Preparation and environment
Response workstation and toolset
Never rely solely on the compromised system. Ideally, use:
- A trusted response workstation (your laptop or dedicated IR box).
- A set of statically linked, known-good tools, e.g.:
- BusyBox static binaries
- Forensic toolkits (The Sleuth Kit, Volatility, LiME, etc.)
- External media:
- Write-once (or write-protected) USB drives for evidence storage.
- Separate media for tools vs collected evidence.
Before the incident:
- Pre-build a forensic USB with:
- A minimal Linux (e.g. Kali, CAINE, Tails, custom build).
- Acquisition tools (dd, dcfldd, bulk_extractor, aff4, ewfacquire).
- Hashing tools (sha256sum, md5sum, b3sum).
- Network tools (nmap, ssh, socat, netcat).
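The known-good toolset itself should be hashed at build time so it can be re-verified in the field. A sketch, assuming the USB mounts at /mnt/usb (make_tool_manifest is an illustrative helper, not a standard tool):

```shell
# Build a SHA-256 manifest of every tool on the response USB. Run this at
# build time on a trusted workstation; re-check with sha256sum -c before use.
make_tool_manifest() {
  # $1: tools directory, $2: manifest output path (keep it outside $1)
  find "$1" -type f -exec sha256sum {} + > "$2"
}
# make_tool_manifest /mnt/usb/tools /mnt/usb/MANIFEST.sha256
# sha256sum -c /mnt/usb/MANIFEST.sha256   # verify in the field
```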
During the incident:
- Use trusted binaries from:
- Mounted read-only media (/mnt/usb/tools/...).
- A read-only NFS/SSHFS share.
- Avoid using the potentially compromised /bin and /usr/bin on the suspect system if you suspect rootkits.
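One way to prefer the trusted binaries in an interactive response shell is to put the tool mount first in PATH; a sketch (use_trusted_tools and the mount path are illustrative assumptions):

```shell
use_trusted_tools() {
  # $1: directory of known-good static binaries (assumed mounted read-only)
  PATH="$1:$PATH"
  export PATH
  hash -r 2>/dev/null || true   # flush the shell's cached command locations
}
# use_trusted_tools /mnt/usb/tools/bin
# command -v ps   # should now resolve inside the trusted directory
```

Confirm with command -v before each critical invocation; a rootkit cannot intercept what never runs from the suspect filesystem.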
Order of volatility and collection strategy
General order of volatility (most to least volatile):
- CPU state, registers, running processes, and memory (RAM).
- Network connections and ephemeral data (ARP, routing cache, netstat state).
- Logged-in users and sessions.
- Temporary files: /tmp, /run, swap.
- Application logs and system logs.
- Filesystem metadata (timestamps, extended attributes).
- Full disk images.
- Off-host logs and cloud metadata (less volatile but critical).
For Linux, a practical workflow:
- Stabilize and isolate (network isolation) while minimizing changes.
- Immediately start documentation (photos, notes, typed command history).
- Collect live data:
- System clock, uptime.
- Logged-in users.
- Running processes, open files.
- Network state.
- Memory image (if in scope and tools available).
- Collect key logs and configuration.
- Acquire disk-level images (or at least key partition images).
- Collect external logs (IDS, firewall, VPN, cloud provider, auth systems).
Documentation and chain of custody
Field notes
Document in real time:
- Who is performing the actions.
- When (timestamp in UTC).
- Where (hostname, IPs, environment).
- What was done (exact commands and tools, their versions).
- Why each action was taken (brief justification).
Example: keep a text log on a separate device:
# All timestamps in UTC
2025-04-01T13:20:10Z - IR lead: Alice
- Host: web01.example.com (10.0.5.12)
- Access over out-of-band console
- Command: /mnt/ir-tools/bin/ps -eo pid,ppid,cmd > /mnt/evidence/web01_ps.txt
- SHA256(web01_ps.txt)=...
If possible, capture:
- Screenshots or photographs of the console.
- The original terminal scrollback (e.g. script or scriptreplay).
Example using script (from a trusted location):
/mnt/ir-tools/bin/script -t 2> /mnt/evidence/typescript.timing \
  /mnt/evidence/typescript.log
Chain of custody records
For each evidence item:
- Unique ID.
- Description (host, path, context).
- Acquisition method and tools.
- Date/time (with timezone or clearly stated UTC).
- Hashes (at least SHA-256).
- Who collected it.
- All transfers, storage locations, and access.
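These fields lend themselves to a small append-only log written at acquisition time; a sketch (custody_record is a hypothetical helper of mine, and the pipe-delimited layout is just one reasonable choice):

```shell
custody_record() {
  # $1: evidence file, $2: collector name.
  # Emits: UTC timestamp | collector | path | SHA-256
  printf '%s|%s|%s|%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$2" "$1" \
    "$(sha256sum "$1" | awk '{print $1}')"
}
# custody_record /mnt/evidence/web01_ps.txt Alice >> /mnt/evidence/custody.log
```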
Hash example:
sha256sum memdump.lime > memdump.lime.sha256
sha256sum disk-image.dd > disk-image.dd.sha256
Use consistent naming and immutable storage (e.g. WORM storage, or object storage with versioning and retention policies) where possible.
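Whenever evidence moves, re-verify it against those manifests. A hedged sketch using GNU coreutils' check mode (verify_hashes is my wrapper name):

```shell
verify_hashes() {
  # $@: one or more .sha256 manifests; re-reads each listed file and fails
  # if any hash does not match. --strict also fails on malformed lines.
  sha256sum -c --strict "$@"
}
# verify_hashes memdump.lime.sha256 disk-image.dd.sha256
```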
Live data collection on Linux
Assume the system is still powered on. The challenge is to collect as much as possible without altering evidence more than necessary.
Time and system identity
First, capture reference information:
# Use trusted binaries if available
date -u
hostname -f
uname -a
id
who
last -n 20
Also capture:
- System clock vs external time if possible, for correlation.
- /etc/hostname, /etc/hosts, /etc/os-release.
Running processes and system state
Collect comprehensive process and system info:
ps auxww > /mnt/evidence/ps_aux.txt
ps -eo pid,ppid,user,group,start,time,command > /mnt/evidence/ps_detailed.txt
top -b -n 1 > /mnt/evidence/top_snapshot.txt
lsof -nP > /mnt/evidence/lsof_all.txt
lsmod > /mnt/evidence/lsmod.txt
Prefer -ww to avoid truncated command lines.
If procps tools are distrusted, read directly from /proc:
cp -a /proc /mnt/evidence/proc_snapshot
(Be aware this is large, may block or balloon on special files such as /proc/kcore, and will change while you copy; note that in your documentation.)
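A more targeted alternative is to pull just the per-process artifacts analysts usually need; a sketch (collect_proc_basics is an illustrative helper, not a standard tool):

```shell
collect_proc_basics() {
  # $1: output directory. For each PID, capture the command line and the
  # resolved executable path straight from /proc, bypassing ps entirely.
  mkdir -p "$1"
  for p in /proc/[0-9]*; do
    pid=${p#/proc/}
    tr '\0' ' ' < "$p/cmdline" > "$1/cmdline_$pid" 2>/dev/null || true
    readlink "$p/exe" > "$1/exe_$pid" 2>/dev/null || true
  done
}
# collect_proc_basics /mnt/evidence/proc_basics
```

readlink on /proc/PID/exe is also a quick rootkit check: a deleted-but-running binary shows up with a "(deleted)" suffix.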
Network connections and configuration
Network state is highly volatile; gather early:
ip addr show > /mnt/evidence/ip_addr.txt
ip route show > /mnt/evidence/ip_route.txt
ip neigh show > /mnt/evidence/ip_neigh.txt
ss -tulpn > /mnt/evidence/ss_listen.txt
ss -tanp > /mnt/evidence/ss_tcp_all.txt
ss -uanp > /mnt/evidence/ss_udp_all.txt
If ss is unavailable:
netstat -plant > /mnt/evidence/netstat_plant.txt
netstat -rn > /mnt/evidence/netstat_rn.txt
Capture firewall rules:
iptables-save > /mnt/evidence/iptables_save.txt 2>/dev/null || true
ip6tables-save > /mnt/evidence/ip6tables_save.txt 2>/dev/null || true
nft list ruleset > /mnt/evidence/nft_ruleset.txt 2>/dev/null || true
Also note:
- VPN connections (e.g. ip addr, wg show for WireGuard).
- Proxy configuration (env | grep -i proxy, /etc/environment).
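If ss and netstat are both distrusted, the raw kernel tables under /proc/net remain available; a sketch (collect_proc_net is my helper name):

```shell
collect_proc_net() {
  # $1: output directory. Copies the kernel's own socket, route, ARP, and
  # interface tables, independent of possibly replaced ss/netstat binaries.
  mkdir -p "$1"
  for f in tcp tcp6 udp udp6 unix route arp dev; do
    cat "/proc/net/$f" > "$1/proc_net_$f" 2>/dev/null || true
  done
}
# collect_proc_net /mnt/evidence/proc_net
```

The addresses in /proc/net/tcp are hex-encoded; tools on the analysis workstation can decode them later, which is preferable to parsing on the suspect host.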
Users, sessions, and authentication state
Capture who is (and was recently) logged in:
who > /mnt/evidence/who.txt
w > /mnt/evidence/w.txt
last -n 100 > /mnt/evidence/last.txt
id > /mnt/evidence/id_current_user.txt
Optionally gather:
- /etc/passwd, /etc/shadow (restrict access; treat as highly sensitive).
- /etc/group, /etc/sudoers, /etc/sudoers.d/.
Example:
cp /etc/passwd /mnt/evidence/etc_passwd
cp /etc/group /mnt/evidence/etc_group
cp /etc/shadow /mnt/evidence/etc_shadow_restricted
chmod 600 /mnt/evidence/etc_shadow_restricted
Kernel and system configuration
Capture runtime kernel parameters:
sysctl -a > /mnt/evidence/sysctl_all.txt 2>/dev/null || true
cat /proc/cmdline > /mnt/evidence/proc_cmdline.txt
dmesg > /mnt/evidence/dmesg.txt 2>/dev/null || true
Note SELinux/AppArmor state:
getenforce > /mnt/evidence/selinux_enforce.txt 2>/dev/null || true
sestatus > /mnt/evidence/selinux_status.txt 2>/dev/null || true
aa-status > /mnt/evidence/apparmor_status.txt 2>/dev/null || true
Filesystem overview and suspicious artifacts
Collect a high-level snapshot of the filesystem without recursively dumping everything:
df -hT > /mnt/evidence/df_hT.txt
mount > /mnt/evidence/mount.txt
lsblk -f > /mnt/evidence/lsblk_f.txt
Capture directory listings that help later triage:
# Common executable and temp locations
ls -alR /bin /sbin /usr/bin /usr/sbin > /mnt/evidence/ls_system_bins.txt
ls -alR /tmp /var/tmp /dev/shm > /mnt/evidence/ls_temp_areas.txt
# Home directories
ls -alR /home > /mnt/evidence/ls_home.txt
Depending on size, you may restrict recursion, or use find to capture metadata only:
find / -xdev -printf '%p|%u|%g|%m|%s|%TY-%Tm-%Td %TH:%TM:%TS\n' \
  > /mnt/evidence/find_metadata_root.txt 2>/dev/null
Memory acquisition on Linux
Memory acquisition is technically complex and kernel-version-dependent, but extremely valuable for:
- In-memory malware.
- Decrypted secrets (keys, credentials).
- Network connections not visible in logs.
Considerations before dumping RAM
- Impact: Memory acquisition usually impacts performance and may cause instability.
- Size: A 64 GB system will produce a large image; ensure enough storage and bandwidth.
- Legality and privacy: Memory may contain highly sensitive data; ensure authorization and handling procedures.
Common acquisition approaches
- Kernel modules like LiME (Linux Memory Extractor).
- Hypervisor-level dumps for virtual machines (preferred when available).
- Hardware-based methods (outside scope here).
Using LiME (conceptual overview)
Typical steps (from a trusted source):
- Load the LiME kernel module:
insmod lime.ko "path=/mnt/evidence/memdump.lime format=lime"
- Verify file creation and size.
- Unload the module after dump:
rmmod lime
Then hash the image:
sha256sum /mnt/evidence/memdump.lime > /mnt/evidence/memdump.lime.sha256
For VMs, use hypervisor tools (e.g. virsh dump for KVM, snapshot features in cloud providers) to get a memory snapshot without touching the guest, when possible. That approach is usually preferable for IR.
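For KVM/libvirt specifically, the hypervisor-side dump can be sketched as follows (vm_memory_dump is my wrapper; the flags assume a reasonably recent libvirt, and the DRY_RUN guard lets the command be reviewed before it touches a hypervisor):

```shell
vm_memory_dump() {
  # $1: libvirt domain name, $2: output path.
  # --memory-only dumps guest RAM without the disk; ELF is a common format
  # accepted by memory-analysis tools.
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "virsh dump --memory-only --format elf $1 $2"
  else
    virsh dump --memory-only --format elf "$1" "$2"
  fi
}
# DRY_RUN=1 vm_memory_dump web01-vm /mnt/evidence/web01-vm.mem
```

Hash the resulting file exactly as you would a LiME image.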
Disk and filesystem evidence
Triaging vs full imaging
If rapid triage is needed you often will not start with a full disk image, but you should understand when imaging is required:
- Full disk images:
- Used when legal proceedings or detailed forensic analysis is expected.
- Capture everything, including deleted data, slack space, unallocated space.
- Targeted collections:
- Faster, smaller.
- Focus on logs, configs, suspicious directories, application data.
When possible, aim for both: quick triage evidence now, and full imaging soon after.
Creating a disk/partition image with `dd` or `dcfldd`
Use a read-only source:
- For physical disks:
- Prefer connecting them to a forensic workstation via a write-blocker.
- For logical volumes:
- Consider imaging from a live response environment with as little system activity as possible.
Example:
# Identify target
lsblk -f
# Imaging with dd (careful with if/of order)
dd if=/dev/sda of=/mnt/evidence/disk-sda.img bs=4M conv=noerror,sync status=progress
# Hash the image
sha256sum /mnt/evidence/disk-sda.img > /mnt/evidence/disk-sda.img.sha256
dcfldd adds built-in hashing and split output:
dcfldd if=/dev/sda hash=sha256 hashlog=/mnt/evidence/sda_hash.log \
  split=2000M of=/mnt/evidence/disk-sda.img
Imaging virtual disks
For VMs, often you can:
- Copy the virtual disk files (e.g. qcow2, vmdk, vhdx).
- Use hypervisor snapshot features.
Example (KVM/libvirt):
# Stop or snapshot the VM (as policy allows)
virsh suspend vmname
# Copy the disk
cp /var/lib/libvirt/images/vmname.qcow2 /mnt/evidence/vmname.qcow2
# Resume VM if needed
virsh resume vmname
Document the VM configuration as well (e.g. virsh dumpxml vmname).
Log and configuration evidence
System logs
On Linux, key logs are usually under /var/log. Collect at least:
- /var/log/auth.log or /var/log/secure
- /var/log/syslog or /var/log/messages
- /var/log/kern.log (if present)
- /var/log/dmesg (if distinct from the dmesg command output)
- /var/log/wtmp, /var/log/btmp, /var/log/lastlog
- /var/log/faillog
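On systemd-based distributions, much of this lands in the journal rather than (or in addition to) flat files. A hedged sketch for exporting it (collect_journal is my helper name; -o export is journald's lossless dump format):

```shell
collect_journal() {
  # $1: output directory.
  mkdir -p "$1"
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -o export > "$1/journal.export" 2>/dev/null || true
  fi
  # Raw persistent journal files, if the host keeps them:
  [ -d /var/log/journal ] && cp -a /var/log/journal "$1/" 2>/dev/null || true
}
# collect_journal /mnt/evidence/journal
```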
Example:
mkdir -p /mnt/evidence/var_log
cp -a /var/log/* /mnt/evidence/var_log/
Also capture log rotation configs:
mkdir -p /mnt/evidence/logrotate
cp -a /etc/logrotate.conf /etc/logrotate.d /mnt/evidence/logrotate/
Application and service logs
This depends on the system; common locations:
- Web servers:
- Apache: /var/log/apache2/ or /var/log/httpd/
- Nginx: /var/log/nginx/
- Databases:
- MySQL/MariaDB: often /var/log/mysql/ or /var/log/mariadb/
- PostgreSQL: often /var/log/postgresql/ or custom directories.
- SSH:
- Activity appears in syslog/auth logs; configuration lives in /etc/ssh/.
Copy relevant directories recursively, preserving permissions and timestamps:
cp -a /var/log/nginx /mnt/evidence/var_log_nginx 2>/dev/null || true
cp -a /var/log/apache2 /mnt/evidence/var_log_apache2 2>/dev/null || true
cp -a /var/log/mysql /mnt/evidence/var_log_mysql 2>/dev/null || true
Configuration files
Configurations are crucial for reconstructing how a system was supposed to behave vs how it was misused.
Common targets:
- /etc/ (as a whole, if feasible):
cp -a /etc /mnt/evidence/etc_full
- Specific apps: /etc/ssh/, /etc/httpd/ or /etc/apache2/, /etc/nginx/.
- Package manager state:
- APT: /var/lib/dpkg/, /etc/apt/.
- RPM/YUM/DNF: /var/lib/rpm/, /etc/yum.repos.d/.
- Pacman: /var/lib/pacman/.
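Alongside the raw databases, a plain inventory of installed packages is cheap to capture and easy to diff against a baseline later; a sketch (collect_pkg_inventory is an illustrative helper):

```shell
collect_pkg_inventory() {
  # $1: output directory. Records the installed-package list for whichever
  # package manager is present; helps spot packages an attacker added.
  mkdir -p "$1"
  command -v dpkg   >/dev/null 2>&1 && dpkg -l   > "$1/dpkg_list.txt"   2>/dev/null
  command -v rpm    >/dev/null 2>&1 && rpm -qa   > "$1/rpm_list.txt"    2>/dev/null
  command -v pacman >/dev/null 2>&1 && pacman -Q > "$1/pacman_list.txt" 2>/dev/null
  true
}
# collect_pkg_inventory /mnt/evidence/packages
```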
Remote and off-host evidence
Modern environments generate lots of evidence outside the compromised machine.
Networking and security devices
- Firewalls (iptables/nftables logs, or external firewalls).
- IDS/IPS (Snort/Suricata/Zeek logs, SIEM platforms).
- Load balancers and WAFs.
- VPN concentrators and RADIUS servers.
Coordinate with relevant teams to acquire:
- Time-bounded log exports (e.g. all logs for host 10.0.5.12 from T1 to T2).
- Configuration snapshots to understand rule sets at the time.
Cloud and orchestration logs
For Linux systems in cloud or containerized environments, evidence may include:
- Cloud provider logs:
- AWS: CloudTrail, VPC Flow Logs, CloudWatch logs.
- Azure: Activity logs, NSG flow logs.
- GCP: Cloud Audit Logs, VPC Flow Logs.
- Orchestration logs:
- Kubernetes: kubectl logs, kubectl describe, audit logs.
- Container engine logs and metadata.
These are usually viewed and exported via provider tools or APIs (not covered here in depth). The key for this chapter: treat them as first-class evidence, with the same hash and chain-of-custody discipline.
Minimizing contamination and anti-forensic defenses
Reducing your footprint
Every action you take may:
- Change timestamps (atime, mtime, ctime).
- Create new log entries.
- Alter process and network state.
Mitigations:
- Use mount -o ro or mount -o ro,noload where possible for offline mounts.
- If you must browse files:
- Use tools that avoid writing extended attributes or generating thumbnails, e.g. avoid GUI file managers.
- Disable services that may overwrite evidence only after capturing their state and logs.
Handling anti-forensics and rootkits
If you suspect:
- Kernel-level rootkits.
- Binary replacement of core tools (ps, ls, netstat).
- Log tampering.
Then:
- Prefer raw artifact collection (e.g. /proc, raw logs, full disk images, memory dumps).
- Use external verification:
- Compare hashes of binaries to known-good sources.
- Run tools from trusted, read-only media and point them at mounted images.
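Comparing binaries to known-good sources can lean on the package manager's own recorded hashes; bear in mind the suspect host's package database may itself be tampered with, so ideally run the check from a clean system against a mounted image. A sketch (verify_package_files is my helper name):

```shell
verify_package_files() {
  # $1: package name, $2: report path. Any output lines indicate files that
  # deviate from the packaged versions (changed hash, mode, owner, etc.).
  if command -v rpm >/dev/null 2>&1; then
    rpm -V "$1" > "$2" 2>&1 || true
  elif command -v dpkg >/dev/null 2>&1; then
    dpkg --verify "$1" > "$2" 2>&1 || true
  else
    echo "no supported package manager found" > "$2"
  fi
}
# verify_package_files coreutils /mnt/evidence/coreutils_verify.txt
```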
Example: analyzing a mounted image from a clean system:
# On a forensic workstation
mount -o ro,loop,noexec,noatime disk-sda.img /mnt/image
# Now use host tools (not the suspect system's) to inspect
ls -al /mnt/image/bin
Packaging and transferring evidence
Structuring evidence directories
Use a logical structure, for example:
/mnt/evidence/
  host-web01/
    metadata/
      hostinfo.txt
      acquisition-notes.txt
    live/
      date.txt
      ps_aux.txt
      ss_tcp_all.txt
      ...
    logs/
      var_log/
      app_logs/
    configs/
      etc_full/
    images/
      disk-sda.img
      disk-sda.img.sha256
      memdump.lime
      memdump.lime.sha256
This makes it easier for analysts to navigate later.
Secure transfer and storage
When moving evidence:
- Prefer encrypted channels: scp, rsync -e ssh, SFTP, or encrypted archives (gpg, age).
- Re-verify hashes on arrival and record them.
Example:
# On source
tar czf - host-web01 | gpg --encrypt --recipient forensics@example.com \
> host-web01.tar.gz.gpg
# Transfer
scp host-web01.tar.gz.gpg forensics@example.com:/evidence/incoming/
# On destination, verify and re-hash
sha256sum host-web01.tar.gz.gpg > host-web01.tar.gz.gpg.sha256
Implement access control on the evidence repository (least privilege), and configure backups to avoid accidental loss without allowing uncontrolled modification.
Practicing evidence collection
To become effective, practice in non-production environments:
- Set up a lab system, simulate incidents (e.g. deploy simple malware, misconfigurations, or CTF challenges).
- Perform full evidence collection:
- Live triage.
- Memory acquisition (if feasible).
- Disk imaging.
- Log and configuration collection.
- Review your:
- Commands and their impact.
- Documentation and chain-of-custody records.
- Directory structure and naming conventions.
Over time, refine:
- A standard playbook of commands.
- A scripted collection toolkit that:
- Runs from trusted media.
- Automates repetitive collection tasks.
- Logs everything it does to a file.
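A minimal sketch of such a scripted collector (the paths, names, and command list here are illustrative, not a fixed standard; EVID would normally point at the evidence mount, e.g. EVID=/mnt/evidence/web01):

```shell
#!/bin/sh
# Minimal self-logging collector sketch. Every collector run is logged with
# a UTC timestamp, and every output file is hashed into a manifest.
EVID="${EVID:-./evidence}"
mkdir -p "$EVID"

run() {
  # run <output-name> <command...>: execute one collector, log the exact
  # command line, then record the SHA-256 of its output.
  out="$EVID/$1"; shift
  printf '%s UTC | %s\n' "$(date -u +%FT%TZ)" "$*" >> "$EVID/collection.log"
  "$@" > "$out" 2>> "$EVID/collection.log" || true
  sha256sum "$out" >> "$EVID/hashes.sha256"
}

run date.txt  date -u
run uname.txt uname -a
```

Extending the playbook is then a matter of adding run lines for ps, ss, lsof, and so on, all executed from the trusted tool mount.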
That combination of tooling and discipline is what turns ad-hoc data gathering into reliable, defensible forensic evidence collection.