Incident response mindset for evidence collection
In forensics and incident response, evidence collection must balance speed, accuracy, and preservation. In practice this means:
- Act as if every action could be replayed in court or audited.
- Prioritize preserving volatile data that will disappear soon.
- Avoid changing the system more than absolutely necessary.
- Record exactly who did what, when, and how.
This chapter focuses on what to collect, in which order, and how, with Linux-specific tools and practices.
Key principles:
- Order of volatility (OOV): Collect the most volatile data first.
- Minimal footprint: Prefer read-only access, external tools, and mounted media.
- Reproducibility: Everything must be re-doable from your notes, commands, and hashes.
- Chain of custody: Evidence must be clearly traceable through its lifetime.
Preparation and environment
Response workstation and toolset
Never rely solely on the compromised system. Ideally, use:
- A trusted response workstation (your laptop or dedicated IR box).
- A set of statically linked, known-good tools, e.g.:
- BusyBox static binaries
- Forensic toolkits (The Sleuth Kit, Volatility, LiME, etc.)
- External media:
- Write-once (or write-protected) USB drives for evidence storage.
- Separate media for tools vs collected evidence.
Before the incident:
- Pre-build a forensic USB with:
- A minimal Linux (e.g. Kali, CAINE, Tails, custom build).
- Acquisition tools (dd, dcfldd, bulk_extractor, aff4, ewfacquire).
- Hashing tools (sha256sum, md5sum, b3sum).
- Network tools (nmap, ssh, socat, netcat).
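The known-good toolset itself should be hashed at build time so it can be re-verified in the field. A sketch, assuming the USB mounts at /mnt/usb (make_tool_manifest is an illustrative helper, not a standard tool):

```shell
# Build a SHA-256 manifest of every tool on the response USB. Run this at
# build time on a trusted workstation; re-check with sha256sum -c before use.
make_tool_manifest() {
  # $1: tools directory, $2: manifest output path (keep it outside $1)
  find "$1" -type f -exec sha256sum {} + > "$2"
}
# make_tool_manifest /mnt/usb/tools /mnt/usb/MANIFEST.sha256
# sha256sum -c /mnt/usb/MANIFEST.sha256   # verify in the field
```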
During the incident:
- Use trusted binaries from:
- Mounted read-only media (/mnt/usb/tools/...).
- A read-only NFS/SSHFS share.
- Avoid using the potentially compromised /bin and /usr/bin on the suspect system if you suspect rootkits.
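One way to prefer the trusted binaries in an interactive response shell is to put the tool mount first in PATH; a sketch (use_trusted_tools and the mount path are illustrative assumptions):

```shell
use_trusted_tools() {
  # $1: directory of known-good static binaries (assumed mounted read-only)
  PATH="$1:$PATH"
  export PATH
  hash -r 2>/dev/null || true   # flush the shell's cached command locations
}
# use_trusted_tools /mnt/usb/tools/bin
# command -v ps   # should now resolve inside the trusted directory
```

Confirm with command -v before each critical invocation; a rootkit cannot intercept what never runs from the suspect filesystem.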
Order of volatility and collection strategy
General order of volatility (most to least volatile):
- CPU state, registers, running processes, and memory (RAM).
- Network connections and ephemeral data (ARP, routing cache, netstat state).
- Logged-in users and sessions.
- Temporary files: /tmp, /run, swap.
- Application logs and system logs.
- Filesystem metadata (timestamps, extended attributes).
- Full disk images.
- Off-host logs and cloud metadata (less volatile but critical).
For Linux, a practical workflow:
- Stabilize and isolate (network isolation) while minimizing changes.
- Immediately start documentation (photos, notes, typed command history).
- Collect live data:
- System clock, uptime.
- Logged-in users.
- Running processes, open files.
- Network state.
- Memory image (if in scope and tools available).
- Collect key logs and configuration.
- Acquire disk-level images (or at least key partition images).
- Collect external logs (IDS, firewall, VPN, cloud provider, auth systems).
Documentation and chain of custody
Field notes
Document in real time:
- Who is performing the actions.
- When (timestamp in UTC).
- Where (hostname, IPs, environment).
- What was done (exact commands and tools, their versions).
- Why each action was taken (brief justification).
Example: keep a text log on a separate device:
# All timestamps in UTC
2025-04-01T13:20:10Z - IR lead: Alice
- Host: web01.example.com (10.0.5.12)
- Access over out-of-band console
- Command: /mnt/ir-tools/bin/ps -eo pid,ppid,cmd > /mnt/evidence/web01_ps.txt
- SHA256(web01_ps.txt)=...
If possible, capture:
- Screenshots or photographs of the console.
- The original terminal scrollback (e.g. script or scriptreplay).
Example using script (from a trusted location):
/mnt/ir-tools/bin/script -t 2> /mnt/evidence/typescript.timing \
  /mnt/evidence/typescript.log
Chain of custody records
For each evidence item:
- Unique ID.
- Description (host, path, context).
- Acquisition method and tools.
- Date/time (with timezone or clearly stated UTC).
- Hashes (at least SHA-256).
- Who collected it.
- All transfers, storage locations, and access.
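These fields lend themselves to a small append-only log written at acquisition time; a sketch (custody_record is a hypothetical helper of mine, and the pipe-delimited layout is just one reasonable choice):

```shell
custody_record() {
  # $1: evidence file, $2: collector name.
  # Emits: UTC timestamp | collector | path | SHA-256
  printf '%s|%s|%s|%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$2" "$1" \
    "$(sha256sum "$1" | awk '{print $1}')"
}
# custody_record /mnt/evidence/web01_ps.txt Alice >> /mnt/evidence/custody.log
```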
Hash example:
sha256sum memdump.lime > memdump.lime.sha256
sha256sum disk-image.dd > disk-image.dd.sha256
Use consistent naming and immutable storage (e.g. WORM storage, or object storage with versioning and retention policies) where possible.
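Whenever evidence moves, re-verify it against those manifests. A hedged sketch using GNU coreutils' check mode (verify_hashes is my wrapper name):

```shell
verify_hashes() {
  # $@: one or more .sha256 manifests; re-reads each listed file and fails
  # if any hash does not match. --strict also fails on malformed lines.
  sha256sum -c --strict "$@"
}
# verify_hashes memdump.lime.sha256 disk-image.dd.sha256
```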
Live data collection on Linux
Assume the system is still powered on. The challenge is to collect as much as possible without altering evidence more than necessary.
Time and system identity
First, capture reference information:
# Use trusted binaries if available
date -u
hostname -f
uname -a
id
who
last -n 20
Also capture:
- System clock vs external time if possible, for correlation.
- /etc/hostname, /etc/hosts, /etc/os-release.
Running processes and system state
Collect comprehensive process and system info:
ps auxww > /mnt/evidence/ps_aux.txt
ps -eo pid,ppid,user,group,start,time,command > /mnt/evidence/ps_detailed.txt
top -b -n 1 > /mnt/evidence/top_snapshot.txt
lsof -nP > /mnt/evidence/lsof_all.txt
lsmod > /mnt/evidence/lsmod.txt
Prefer -ww to avoid truncated command lines.
If procps tools are distrusted, read directly from /proc:
cp -a /proc /mnt/evidence/proc_snapshot
(Be aware this is large, may block or balloon on special files such as /proc/kcore, and will change while you copy; note that in your documentation.)
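A more targeted alternative is to pull just the per-process artifacts analysts usually need; a sketch (collect_proc_basics is an illustrative helper, not a standard tool):

```shell
collect_proc_basics() {
  # $1: output directory. For each PID, capture the command line and the
  # resolved executable path straight from /proc, bypassing ps entirely.
  mkdir -p "$1"
  for p in /proc/[0-9]*; do
    pid=${p#/proc/}
    tr '\0' ' ' < "$p/cmdline" > "$1/cmdline_$pid" 2>/dev/null || true
    readlink "$p/exe" > "$1/exe_$pid" 2>/dev/null || true
  done
}
# collect_proc_basics /mnt/evidence/proc_basics
```

readlink on /proc/PID/exe is also a quick rootkit check: a deleted-but-running binary shows up with a "(deleted)" suffix.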
Network connections and configuration
Network state is highly volatile; gather early:
ip addr show > /mnt/evidence/ip_addr.txt
ip route show > /mnt/evidence/ip_route.txt
ip neigh show > /mnt/evidence/ip_neigh.txt
ss -tulpn > /mnt/evidence/ss_listen.txt
ss -tanp > /mnt/evidence/ss_tcp_all.txt
ss -uanp > /mnt/evidence/ss_udp_all.txt
If ss is unavailable:
netstat -plant > /mnt/evidence/netstat_plant.txt
netstat -rn > /mnt/evidence/netstat_rn.txt
Capture firewall rules:
iptables-save > /mnt/evidence/iptables_save.txt 2>/dev/null || true
ip6tables-save > /mnt/evidence/ip6tables_save.txt 2>/dev/null || true
nft list ruleset > /mnt/evidence/nft_ruleset.txt 2>/dev/null || true
Also note:
- VPN connections (e.g. ip addr, wg show for WireGuard).
- Proxy configuration (env | grep -i proxy, /etc/environment).
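If ss and netstat are both distrusted, the raw kernel tables under /proc/net remain available; a sketch (collect_proc_net is my helper name):

```shell
collect_proc_net() {
  # $1: output directory. Copies the kernel's own socket, route, ARP, and
  # interface tables, independent of possibly replaced ss/netstat binaries.
  mkdir -p "$1"
  for f in tcp tcp6 udp udp6 unix route arp dev; do
    cat "/proc/net/$f" > "$1/proc_net_$f" 2>/dev/null || true
  done
}
# collect_proc_net /mnt/evidence/proc_net
```

The addresses in /proc/net/tcp are hex-encoded; tools on the analysis workstation can decode them later, which is preferable to parsing on the suspect host.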
Users, sessions, and authentication state
Capture who is (and was recently) logged in:
who > /mnt/evidence/who.txt
w > /mnt/evidence/w.txt
last -n 100 > /mnt/evidence/last.txt
id > /mnt/evidence/id_current_user.txt
Optionally gather:
- /etc/passwd, /etc/shadow (restrict access; treat as highly sensitive).
- /etc/group, /etc/sudoers, /etc/sudoers.d/.
Example:
cp /etc/passwd /mnt/evidence/etc_passwd
cp /etc/group /mnt/evidence/etc_group
cp /etc/shadow /mnt/evidence/etc_shadow_restricted
chmod 600 /mnt/evidence/etc_shadow_restricted
Kernel and system configuration
Capture runtime kernel parameters:
sysctl -a > /mnt/evidence/sysctl_all.txt 2>/dev/null || true
cat /proc/cmdline > /mnt/evidence/proc_cmdline.txt
dmesg > /mnt/evidence/dmesg.txt 2>/dev/null || true
Note SELinux/AppArmor state:
getenforce > /mnt/evidence/selinux_enforce.txt 2>/dev/null || true
sestatus > /mnt/evidence/selinux_status.txt 2>/dev/null || true
aa-status > /mnt/evidence/apparmor_status.txt 2>/dev/null || true
Filesystem overview and suspicious artifacts
Collect a high-level snapshot of the filesystem without recursively dumping everything:
df -hT > /mnt/evidence/df_hT.txt
mount > /mnt/evidence/mount.txt
lsblk -f > /mnt/evidence/lsblk_f.txt
Capture directory listings that help later triage:
# Common executable and temp locations
ls -alR /bin /sbin /usr/bin /usr/sbin > /mnt/evidence/ls_system_bins.txt
ls -alR /tmp /var/tmp /dev/shm > /mnt/evidence/ls_temp_areas.txt
# Home directories
ls -alR /home > /mnt/evidence/ls_home.txt
Depending on size, you may restrict recursion, or use find to capture metadata only:
find / -xdev -printf '%p|%u|%g|%m|%s|%TY-%Tm-%Td %TH:%TM:%TS\n' \
  > /mnt/evidence/find_metadata_root.txt 2>/dev/null
Memory acquisition on Linux
Memory acquisition is technically complex and kernel-version-dependent, but extremely valuable for:
- In-memory malware.
- Decrypted secrets (keys, credentials).
- Network connections not visible in logs.
Considerations before dumping RAM
- Impact: Memory acquisition usually impacts performance and may cause instability.
- Size: A 64 GB system will produce a large image; ensure enough storage and bandwidth.
- Legality and privacy: Memory may contain highly sensitive data; ensure authorization and handling procedures.
Common acquisition approaches
- Kernel modules like LiME (Linux Memory Extractor).
- Hypervisor-level dumps for virtual machines (preferred when available).
- Hardware-based methods (outside scope here).
Using LiME (conceptual overview)
Typical steps (from a trusted source):
- Load the LiME kernel module:
insmod lime.ko "path=/mnt/evidence/memdump.lime format=lime"
- Verify file creation and size.
- Unload the module after dump:
rmmod lime
Then hash the image:
sha256sum /mnt/evidence/memdump.lime > /mnt/evidence/memdump.lime.sha256
For VMs, use hypervisor tools (e.g. virsh dump for KVM, snapshot features in cloud providers) to get a memory snapshot without touching the guest, when possible. That approach is usually preferable for IR.
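For KVM/libvirt specifically, the hypervisor-side dump can be sketched as follows (vm_memory_dump is my wrapper; the flags assume a reasonably recent libvirt, and the DRY_RUN guard lets the command be reviewed before it touches a hypervisor):

```shell
vm_memory_dump() {
  # $1: libvirt domain name, $2: output path.
  # --memory-only dumps guest RAM without the disk; ELF is a common format
  # accepted by memory-analysis tools.
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "virsh dump --memory-only --format elf $1 $2"
  else
    virsh dump --memory-only --format elf "$1" "$2"
  fi
}
# DRY_RUN=1 vm_memory_dump web01-vm /mnt/evidence/web01-vm.mem
```

Hash the resulting file exactly as you would a LiME image.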
Disk and filesystem evidence
Triaging vs full imaging
If rapid triage is needed you often will not start with a full disk image, but you should understand when imaging is required:
- Full disk images:
- Used when legal proceedings or detailed forensic analysis is expected.
- Capture everything, including deleted data, slack space, unallocated space.
- Targeted collections:
- Faster, smaller.
- Focus on logs, configs, suspicious directories, application data.
When possible, aim for both: quick triage evidence now, and full imaging soon after.
Creating a disk/partition image with `dd` or `dcfldd`
Use a read-only source:
- For physical disks:
- Prefer connecting them to a forensic workstation via a write-blocker.
- For logical volumes:
- Consider imaging from a live response environment with as little system activity as possible.
Example:
# Identify target
lsblk -f
# Imaging with dd (careful with if/of order)
dd if=/dev/sda of=/mnt/evidence/disk-sda.img bs=4M conv=noerror,sync status=progress
# Hash the image
sha256sum /mnt/evidence/disk-sda.img > /mnt/evidence/disk-sda.img.sha256
dcfldd adds built-in hashing and split output:
dcfldd if=/dev/sda hash=sha256 hashlog=/mnt/evidence/sda_hash.log \
  split=2000M of=/mnt/evidence/disk-sda.img
Imaging virtual disks
For VMs, often you can:
- Copy the virtual disk files (e.g. qcow2, vmdk, vhdx).
- Use hypervisor snapshot features.
Example (KVM/libvirt):
# Stop or snapshot the VM (as policy allows)
virsh suspend vmname
# Copy the disk
cp /var/lib/libvirt/images/vmname.qcow2 /mnt/evidence/vmname.qcow2
# Resume VM if needed
virsh resume vmname
Document the VM configuration as well (e.g. virsh dumpxml vmname).
Log and configuration evidence
System logs
On Linux, key logs are usually under /var/log. Collect at least:
- /var/log/auth.log or /var/log/secure
- /var/log/syslog or /var/log/messages
- /var/log/kern.log (if present)
- /var/log/dmesg (if distinct from the dmesg command output)
- /var/log/wtmp, /var/log/btmp, /var/log/lastlog
- /var/log/faillog
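On systemd-based distributions, much of this lands in the journal rather than (or in addition to) flat files. A hedged sketch for exporting it (collect_journal is my helper name; -o export is journald's lossless dump format):

```shell
collect_journal() {
  # $1: output directory.
  mkdir -p "$1"
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -o export > "$1/journal.export" 2>/dev/null || true
  fi
  # Raw persistent journal files, if the host keeps them:
  [ -d /var/log/journal ] && cp -a /var/log/journal "$1/" 2>/dev/null || true
}
# collect_journal /mnt/evidence/journal
```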
Example:
mkdir -p /mnt/evidence/var_log
cp -a /var/log/* /mnt/evidence/var_log/
Also capture log rotation configs:
mkdir -p /mnt/evidence/logrotate
cp -a /etc/logrotate.conf /etc/logrotate.d /mnt/evidence/logrotate/
Application and service logs
This depends on the system; common locations:
- Web servers:
- Apache: /var/log/apache2/ or /var/log/httpd/
- Nginx: /var/log/nginx/
- Databases:
- MySQL/MariaDB: often /var/log/mysql/ or /var/log/mariadb/
- PostgreSQL: often /var/log/postgresql/ or custom directories.
- SSH:
- Activity appears in syslog/auth logs; configuration lives in /etc/ssh/.
Copy relevant directories recursively, preserving permissions and timestamps:
cp -a /var/log/nginx /mnt/evidence/var_log_nginx 2>/dev/null || true
cp -a /var/log/apache2 /mnt/evidence/var_log_apache2 2>/dev/null || true
cp -a /var/log/mysql /mnt/evidence/var_log_mysql 2>/dev/null || true
Configuration files
Configurations are crucial for reconstructing how a system was supposed to behave vs how it was misused.
Common targets:
- /etc/ (as a whole, if feasible):
cp -a /etc /mnt/evidence/etc_full
- Specific apps: /etc/ssh/, /etc/httpd/ or /etc/apache2/, /etc/nginx/.
- Package manager state:
- APT: /var/lib/dpkg/, /etc/apt/.
- RPM/YUM/DNF: /var/lib/rpm/, /etc/yum.repos.d/.
- Pacman: /var/lib/pacman/.
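Alongside the raw databases, a plain inventory of installed packages is cheap to capture and easy to diff against a baseline later; a sketch (collect_pkg_inventory is an illustrative helper):

```shell
collect_pkg_inventory() {
  # $1: output directory. Records the installed-package list for whichever
  # package manager is present; helps spot packages an attacker added.
  mkdir -p "$1"
  command -v dpkg   >/dev/null 2>&1 && dpkg -l   > "$1/dpkg_list.txt"   2>/dev/null
  command -v rpm    >/dev/null 2>&1 && rpm -qa   > "$1/rpm_list.txt"    2>/dev/null
  command -v pacman >/dev/null 2>&1 && pacman -Q > "$1/pacman_list.txt" 2>/dev/null
  true
}
# collect_pkg_inventory /mnt/evidence/packages
```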
Remote and off-host evidence
Modern environments generate lots of evidence outside the compromised machine.
Networking and security devices
- Firewalls (iptables/nftables logs, or external firewalls).
- IDS/IPS (Snort/Suricata/Zeek logs, SIEM platforms).
- Load balancers and WAFs.
- VPN concentrators and RADIUS servers.
Coordinate with relevant teams to acquire:
- Time-bounded log exports (e.g. all logs for host 10.0.5.12 from T1 to T2).
- Configuration snapshots to understand rule sets at the time.
Cloud and orchestration logs
For Linux systems in cloud or containerized environments, evidence may include:
- Cloud provider logs:
- AWS: CloudTrail, VPC Flow Logs, CloudWatch logs.
- Azure: Activity logs, NSG flow logs.
- GCP: Cloud Audit Logs, VPC Flow Logs.
- Orchestration logs:
- Kubernetes: kubectl logs, kubectl describe, audit logs.
- Container engine logs and metadata.
These are usually viewed and exported via provider tools or APIs (not covered here in depth). The key for this chapter: treat them as first-class evidence, with the same hash and chain-of-custody discipline.
Minimizing contamination and anti-forensic defenses
Reducing your footprint
Every action you take may:
- Change timestamps (atime, mtime, ctime).
- Create new log entries.
- Alter process and network state.
Mitigations:
- Use mount -o ro or mount -o ro,noload where possible for offline mounts.
- If you must browse files:
- Use tools that avoid writing extended attributes or generating thumbnails, e.g. avoid GUI file managers.
- Disable services that may overwrite evidence only after capturing their state and logs.
Handling anti-forensics and rootkits
If you suspect:
- Kernel-level rootkits.
- Binary replacement of core tools (ps, ls, netstat).
- Log tampering.
Then:
- Prefer raw artifact collection (e.g. /proc, raw logs, full disk images, memory dumps).
- Use external verification:
- Compare hashes of binaries to known-good sources.
- Run tools from trusted, read-only media and point them at mounted images.
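Comparing binaries to known-good sources can lean on the package manager's own recorded hashes; bear in mind the suspect host's package database may itself be tampered with, so ideally run the check from a clean system against a mounted image. A sketch (verify_package_files is my helper name):

```shell
verify_package_files() {
  # $1: package name, $2: report path. Any output lines indicate files that
  # deviate from the packaged versions (changed hash, mode, owner, etc.).
  if command -v rpm >/dev/null 2>&1; then
    rpm -V "$1" > "$2" 2>&1 || true
  elif command -v dpkg >/dev/null 2>&1; then
    dpkg --verify "$1" > "$2" 2>&1 || true
  else
    echo "no supported package manager found" > "$2"
  fi
}
# verify_package_files coreutils /mnt/evidence/coreutils_verify.txt
```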
Example: analyzing a mounted image from a clean system:
# On a forensic workstation
mount -o ro,loop,noexec,noatime disk-sda.img /mnt/image
# Now use host tools (not the suspect system's) to inspect
ls -al /mnt/image/bin
Packaging and transferring evidence
Structuring evidence directories
Use a logical structure, for example:
/mnt/evidence/
  host-web01/
    metadata/
      hostinfo.txt
      acquisition-notes.txt
    live/
      date.txt
      ps_aux.txt
      ss_tcp_all.txt
      ...
    logs/
      var_log/
      app_logs/
    configs/
      etc_full/
    images/
      disk-sda.img
      disk-sda.img.sha256
      memdump.lime
      memdump.lime.sha256
This makes it easier for analysts to navigate later.
Secure transfer and storage
When moving evidence:
- Prefer encrypted channels: scp, rsync -e ssh, SFTP, or encrypted archives (gpg, age).
- Re-verify hashes on arrival and record them.
Example:
# On source
tar czf - host-web01 | gpg --encrypt --recipient forensics@example.com \
> host-web01.tar.gz.gpg
# Transfer
scp host-web01.tar.gz.gpg forensics@example.com:/evidence/incoming/
# On destination, verify and re-hash
sha256sum host-web01.tar.gz.gpg > host-web01.tar.gz.gpg.sha256
Implement access control on the evidence repository (least privilege), and configure backups to avoid accidental loss without allowing uncontrolled modification.
Practicing evidence collection
To become effective, practice in non-production environments:
- Set up a lab system, simulate incidents (e.g. deploy simple malware, misconfigurations, or CTF challenges).
- Perform full evidence collection:
- Live triage.
- Memory acquisition (if feasible).
- Disk imaging.
- Log and configuration collection.
- Review your:
- Commands and their impact.
- Documentation and chain-of-custody records.
- Directory structure and naming conventions.
Over time, refine:
- A standard playbook of commands.
- A scripted collection toolkit that:
- Runs from trusted media.
- Automates repetitive collection tasks.
- Logs everything it does to a file.
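A minimal sketch of such a scripted collector (the paths, names, and command list here are illustrative, not a fixed standard; EVID would normally point at the evidence mount, e.g. EVID=/mnt/evidence/web01):

```shell
#!/bin/sh
# Minimal self-logging collector sketch. Every collector run is logged with
# a UTC timestamp, and every output file is hashed into a manifest.
EVID="${EVID:-./evidence}"
mkdir -p "$EVID"

run() {
  # run <output-name> <command...>: execute one collector, log the exact
  # command line, then record the SHA-256 of its output.
  out="$EVID/$1"; shift
  printf '%s UTC | %s\n' "$(date -u +%FT%TZ)" "$*" >> "$EVID/collection.log"
  "$@" > "$out" 2>> "$EVID/collection.log" || true
  sha256sum "$out" >> "$EVID/hashes.sha256"
}

run date.txt  date -u
run uname.txt uname -a
```

Extending the playbook is then a matter of adding run lines for ps, ss, lsof, and so on, all executed from the trusted tool mount.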
That combination of tooling and discipline is what turns ad-hoc data gathering into reliable, defensible forensic evidence collection.