Understanding Goals and Constraints
In incident response, “log and file recovery” has two distinct but related goals:
- Recovering evidence that was deleted, tampered with, rotated, or partially overwritten (forensic focus).
- Recovering operational data (e.g., configuration files, application data) to restore service or reconstruct events (response focus).
At this stage, you should:
- Preserve as much of the current state as possible (including free space and slack space).
- Avoid unnecessary writes to affected disks.
- Work from copies or disk images whenever possible.
This chapter focuses on how to recover and reconstruct logs and files, not on the overall incident workflow or general evidence collection (covered elsewhere in the parent section).
Types of Data Relevant to Recovery
For log and file recovery, you will primarily deal with:
- Plain-text logs (e.g., `/var/log/secure`, `/var/log/auth.log`, web server logs).
- Binary/application logs (e.g., SQLite DBs, custom binary formats).
- Rotated/compressed logs (e.g., `*.1`, `*.gz`, `*.xz`).
- Configuration files (sometimes deleted or replaced by attackers).
- User and application data files (documents, database files, etc.).
- Filesystem metadata (timestamps, directory entries, inodes) that imply file and log history even when contents are gone.
Each has different recovery strategies and success probabilities depending on filesystem, time elapsed, and system activity.
Immediate Preservation Measures
Before attempting recovery:
- Remount read-only if appropriate and feasible:
  - Using `mount -o remount,ro /dev/sdXN /mountpoint` on a live system, or
  - Preferably, power down and use forensic boot media or a hardware write-blocker.
- Acquire a disk image:
- Use tools that support hashing and verification, such as:
```shell
dc3dd if=/dev/sdX of=/evidence/disk.img hash=sha256 log=/evidence/dc3dd.log
```

- For large systems, consider partition-level or LVM snapshot–based images.
All recovery work should be done on the image, not on the original disk.
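Hash verification can be rehearsed without touching real evidence. The sketch below is a minimal stand-in for what `dc3dd`'s inline hashing gives you: it "images" a scratch file with plain `dd` and confirms the copy's SHA-256 matches the source. All paths here are hypothetical.

```shell
# Sketch: confirm an image matches its source by hash before analysis.
# A scratch file stands in for the real device (/dev/sdX); dc3dd would
# compute the same hash inline during acquisition.
workdir=$(mktemp -d)
printf 'fake disk contents\n' > "$workdir/source.dev"

# "Acquire" the image, then hash both sides independently.
dd if="$workdir/source.dev" of="$workdir/disk.img" bs=4096 2>/dev/null
src_hash=$(sha256sum "$workdir/source.dev" | awk '{print $1}')
img_hash=$(sha256sum "$workdir/disk.img" | awk '{print $1}')

# Refuse to proceed unless the hashes agree.
[ "$src_hash" = "$img_hash" ] && verify_status=OK || verify_status=MISMATCH
echo "$verify_status"
```

Record both hashes in the acquisition log; a mismatch means the image must be reacquired, not analyzed.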
Sources of Log Data Beyond Obvious Files
Attackers frequently try to clean or modify visible logs (e.g., auth.log), but logs or equivalents may exist elsewhere:
- Rotated logs:
  - `/var/log/auth.log.1`, `/var/log/auth.log.2.gz`
  - `/var/log/secure-20251201`, `/var/log/messages-2025-12-10.xz`
- Journal log storage:
  - `systemd` journals in `/var/log/journal/` or `/run/log/journal/`
- Application-specific directories:
  - Web: `/var/log/nginx/`, `/var/log/apache2/`
  - Databases: `/var/log/mysql/`, transaction logs, WAL files.
- Service-specific logs in `$HOME`:
  - E.g., `.xsession-errors`, application logs under `~/.local/share/` or `~/.cache/`.
- Remote/centralized logging:
  - `rsyslog`/`syslog-ng` forwarding to log servers.
  - SIEM systems (e.g., ELK, Splunk) and cloud logging (CloudWatch, Stackdriver, etc.).
- Non-log artifacts with evidential value:
  - Shell history files (`~/.bash_history`, `~/.zsh_history`).
  - Cron logs and crontab files.
  - SSH known_hosts, authorized_keys, and configuration files.
  - Web server config and `.htaccess` files that show redirections, backdoors, or tampering.
  - Browser history (for user actions relevant to the incident).
Before deep recovery, inventory all obvious and secondary log locations; many “lost logs” turn out to exist in a rotated or alternate location.
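That inventory step can be partly automated. The following sketch walks a log tree and lists live, rotated, and compressed logs; the directory layout here is fabricated sample data, and in practice you would point `logroot` at the mounted image's `/var/log`.

```shell
# Sketch: inventory plain, rotated, and compressed logs. The tree below is
# fabricated; point logroot at the mounted image's /var/log in practice.
logroot=$(mktemp -d)
touch "$logroot/auth.log" "$logroot/auth.log.1"
printf 'old entries\n' | gzip > "$logroot/auth.log.2.gz"
mkdir -p "$logroot/nginx"
touch "$logroot/nginx/access.log"

# Catch live logs, numeric rotations, and common compressed rotations,
# including application subdirectories.
inventory=$(find "$logroot" -type f \
  \( -name '*.log' -o -name '*.log.[0-9]*' -o -name '*.gz' -o -name '*.xz' \) \
  | sort)
echo "$inventory"
```

Keep the resulting list with the case notes; it documents what existed at acquisition time.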
Recovering Rotated and Compressed Logs
Log rotation is not adversarial; it’s routine. But it can hide or fragment the timeline.
Identifying Rotated Logs
- Use filename patterns:

```shell
ls -1 /var/log | grep -E '(\.1|\.2|\.gz|\.xz|[0-9]{8})$'
```

- For application logs with custom rotation:
  - Inspect logrotate configs in `/etc/logrotate.conf` and `/etc/logrotate.d/`.
Decompressing and Viewing
Common formats:
- gzip (`*.gz`):

```shell
zcat /var/log/auth.log.2.gz | less
```

- xz (`*.xz`):

```shell
xzcat /var/log/journal-20251210.xz | less
```

- bzip2 (`*.bz2`):

```shell
bzcat /var/log/someapp.log.3.bz2 | less
```

Use `zgrep`, `xzgrep`, etc. to filter without full decompression.
Reconstructing Log Timelines
To rebuild a continuous timeline:
- Order files by rotation policy; usually the oldest has the highest suffix (e.g., `.7.gz`).
- Concatenate in the correct order:

```shell
zcat /var/log/auth.log.7.gz /var/log/auth.log.6.gz ... \
  | cat - /var/log/auth.log.1 /var/log/auth.log \
  > /tmp/auth_full_timeline.log
```

- Adjust for overlapping or missing time windows:
- Verify continuity with timestamps.
- Note explicit gaps where compression or truncation removed data.
Be explicit in your notes where logs are missing due to rotation; do not silently infer actions in missing periods.
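The rotation-ordered merge described above can be exercised end to end on throwaway data. Everything below (file names, dates, log lines) is hypothetical sample data standing in for `/var/log/auth.log*` on a mounted image.

```shell
# Sketch: rebuild one continuous timeline from rotated logs, oldest first.
# File names, dates, and entries are fabricated stand-ins for /var/log/auth.log*.
d=$(mktemp -d)
printf '2025-12-08 ssh login A\n' | gzip > "$d/auth.log.2.gz"
printf '2025-12-09 ssh login B\n' > "$d/auth.log.1"
printf '2025-12-10 ssh login C\n' > "$d/auth.log"

# Highest suffix = oldest: decompress that first, then append the plain
# rotation and finally the live log.
zcat "$d/auth.log.2.gz" \
  | cat - "$d/auth.log.1" "$d/auth.log" \
  > "$d/auth_full_timeline.log"
cat "$d/auth_full_timeline.log"
```

After merging, scan the combined file for out-of-order timestamps; any reversal points to a rotation mistake or tampering.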
Recovering `systemd` Journals
If the system uses systemd journaling:
- Check journal directories:
  - Persistent: `/var/log/journal/`
  - Volatile: `/run/log/journal/`
Viewing and Exporting
- Full journal export to text:
```shell
journalctl --no-pager --output=short-iso > /tmp/journal_full.log
```

- Restricted by service, unit, or time:

```shell
journalctl -u ssh.service --since "2025-12-10 00:00" \
  --until "2025-12-11 00:00" > ssh_journal.log
```

Recovering Deleted or Truncated Journal Entries
If `journalctl` shows gaps or journals appear wiped:
- Look for rotated journal files:
  - `.journal` and `.journal~` artifacts may exist in the journal directories.
- Carve journal files from unallocated space:
  - Journal files contain identifiable headers (the `LPKSHHRH` magic, etc.); use file carvers (discussed below) with custom signatures.
- Recover from backups or snapshots:
  - Journals are often captured in filesystem-level snapshots (Btrfs, LVM, ZFS).
Direct manual repair of journal binary files is non-trivial; focus on preservation and extraction rather than editing.
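One low-tech way to locate carving candidates is to scan the raw image for the journal magic directly. The sketch below fabricates a tiny "image" with the `LPKSHHRH` signature at a known offset and finds it with GNU `grep`'s byte-offset mode; a real carver would then extract data starting at such offsets.

```shell
# Sketch: find candidate journal headers in a raw image by byte offset.
# The "image" is fabricated: 100 bytes of padding, the LPKSHHRH magic,
# then more padding.
img=$(mktemp)
{ head -c 100 /dev/zero; printf 'LPKSHHRH'; head -c 50 /dev/zero; } > "$img"

# GNU grep: -a treats binary as text, -o prints only matches, -b prefixes
# each match with its byte offset in the file.
offsets=$(grep -aob 'LPKSHHRH' "$img" | cut -d: -f1)
echo "$offsets"
```

Offsets found this way can seed custom signatures for scalpel or foremost rather than being extracted by hand.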
File Recovery from Filesystem Artifacts
When logs or other files were deleted (whether intentionally or due to normal rotation), recovery depends heavily on the filesystem.
EXT Family (ext2/ext3/ext4)
Common tools:
- TestDisk – partition and filesystem recovery.
- extundelete – undelete files on an unmounted ext filesystem or image.
- PhotoRec – file carving (filesystem-agnostic).
Workflow Example (ext4, using `extundelete`)
- Work on an image:

```shell
losetup -P /dev/loop0 /evidence/disk.img
mkdir /mnt/recovered
```

- Identify the partition with `fdisk -l /dev/loop0` (assume `/dev/loop0p2`).
- Run extundelete:

```shell
extundelete /dev/loop0p2 --restore-file var/log/auth.log \
  --output-dir /mnt/recovered
```

- If the path is unknown, try:

```shell
extundelete /dev/loop0p2 --restore-all --output-dir /mnt/recovered
```

Results may be partial if blocks have been overwritten.
XFS/Btrfs/Other Filesystems
Each filesystem has different recovery tools and limitations:
- XFS:
  - `xfs_repair` (integrity, not undelete).
  - `xfs_metadump` for metadata extraction.
  - File carving is often needed for content.
- Btrfs:
  - Use `btrfs restore` to access older versions.
  - Snapshots greatly increase the chances of recovery.
- ZFS:
  - Native snapshots (`zfs list -t snapshot`, `zfs rollback`, `zfs clone`).
In many non-ext cases, snapshots are more productive than raw undelete attempts.
File Carving for Logs and Text
When filesystem-level undelete fails, you can attempt file carving from unallocated space or raw images.
Basic Principles
- File carving ignores filesystem metadata and searches for:
- Signatures (magic bytes) to determine start of files.
- Optional heuristics to guess file boundaries.
Tools
- PhotoRec:
  - Part of the TestDisk package; recovers files by type.
- scalpel / foremost:
  - Signature-based file carving.
Using PhotoRec
- Run on the image:
```shell
photorec /log /d /mnt/recovered /cmd /evidence/disk.img options,search
```

(Interactive mode is typical; configure it to search for log, txt, and gz.)
- After carving:
  - Use the `file` command to identify recovered files.
  - Use `grep`, `zgrep` to search for relevant event patterns.
Carving for Text-Based Logs
- Many logs are plain text; carving them is noisy.
- Strategy:
- Carve to a separate directory.
- Use indicators of compromise (IOCs) to filter:

```shell
grep -R "suspicious_user" /mnt/recovered | head
```

- Look for recognizable log headers or date patterns.
Remember that carved logs may not retain original names or paths; correlate with contents and timestamps where possible.
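Filtering carved output by IOCs and by log-shaped lines can be scripted. In this sketch the carve directory, file names, and the `suspicious_user` IOC are all hypothetical sample data.

```shell
# Sketch: separate signal from noise in carved output. The directory,
# file names, and IOC string are fabricated sample data.
carved=$(mktemp -d)
printf 'Dec 10 03:12:01 host sshd[991]: Accepted password for suspicious_user\n' \
  > "$carved/f0001.txt"
printf 'random binary junk with no log structure\n' > "$carved/f0002.txt"

# Pass 1: files mentioning a known IOC.
ioc_hits=$(grep -Rl 'suspicious_user' "$carved")

# Pass 2: files whose lines look like syslog entries (month, day, time).
loglike=$(grep -RlE '^[A-Z][a-z]{2} +[0-9]+ [0-9:]{8}' "$carved")
echo "$loglike"
```

Running both passes and intersecting the results shrinks thousands of carved fragments down to a reviewable set.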
Recovering from Snapshots and Backups
Modern Linux systems frequently use snapshotting or regular backups. These can be invaluable when logs or files have been cleaned.
LVM Snapshots
If the system used LVM and snapshots existed before the incident:
- List snapshots:

```shell
lvs
```

- Mount a snapshot (read-only) and copy logs:

```shell
lvcreate -s -n forensic_snap -L 10G /dev/vg0/root
mount -o ro /dev/vg0/forensic_snap /mnt/forensic
cp -a /mnt/forensic/var/log /analysis/logs_preincident
```

If historical snapshots are gone, this may not help, but on some systems backup LVs or older snapshots persist.
Btrfs Snapshots
- List snapshots:

```shell
btrfs subvolume list /
```

- Mount or access snapshot subvolumes (often under `.snapshots`):

```shell
btrfs subvolume show /.snapshots/123/snapshot
mount -o ro,bind /.snapshots/123/snapshot /mnt/snap123
```

Copy logs or configuration from these snapshots for comparison.
ZFS Snapshots
- List:

```shell
zfs list -t snapshot
```

- Clone for analysis (read-only, mounted at a forensics path):

```shell
zfs clone -o readonly=on -o mountpoint=/mnt/forensics \
  pool/root@pre-incident pool/forensics
```

Traditional Backups
Backups might be:
- `rsync`-based directories.
- `tar` archives.
- Commercial backup products.
For logs:
- Extract only needed time windows to avoid clutter.
- Note: Backups may be incomplete or filtered; validate their policy (e.g., logs only kept for N days).
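Extracting a time window from restored backup logs is straightforward when timestamps are ISO 8601, because they sort lexically in chronological order. The sketch below assumes that format; the backup file and its entries are fabricated.

```shell
# Sketch: pull only a time window out of a restored backup log. Assumes
# ISO 8601 timestamps in field 1 (they sort lexically = chronologically);
# the entries are fabricated.
bak=$(mktemp)
cat > "$bak" <<'EOF'
2025-12-09T23:50:00 pre-window event
2025-12-10T01:15:00 in-window event A
2025-12-10T22:40:00 in-window event B
2025-12-11T00:05:00 post-window event
EOF

# String comparison is enough: ISO timestamps order correctly as text.
window=$(awk -v start='2025-12-10T00:00:00' -v end='2025-12-10T23:59:59' \
  '$1 >= start && $1 <= end' "$bak")
echo "$window"
```

For syslog-style timestamps ("Dec 10 03:12:01"), normalize to ISO or epoch first; lexical comparison does not work across month names.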
Reconstruction When Logs Are Missing or Tampered
Often, you will not fully “recover” logs, but rather reconstruct events from partial artifacts.
Cross-Correlation of Multiple Sources
If /var/log/auth.log is wiped, consider:
- SSH daemon `systemd` journal entries.
- `last`/`lastb` output derived from `/var/log/wtmp` and `/var/log/btmp`.
- Application logs (e.g., VPN, web app).
- Network device logs (firewalls, routers).
- Cloud provider audit logs (e.g., AWS CloudTrail).
A single event (like an SSH login) usually leaves traces in several locations; one may be intact even if another is removed.
Using Metadata as Evidence
Even without file contents, metadata can indicate activity:
- Timestamps on the logs themselves:

```shell
ls -l --full-time /var/log/auth.log
```

  - Sudden truncation times can be linked to attacker actions.
- Directory metadata:
  - `stat /var/log`, or timeline tools like `log2timeline` (from other toolsets).
- Inode reuse patterns:
  - Multiple recreated log files may share changed inodes, visible through forensic tools.
Detecting and Handling Log Tampering
Indicators:
- Gaps in timestamps or sequences.
- Inconsistent time zones or formats.
- Missing log entries you know should exist (e.g., associated events elsewhere).
- Traces of "log cleaner" scripts, or manual-edit markers (`vi` backup files, etc.).
For reconstruction:
- Document where logs are suspected to be incomplete.
- Use other corroborative evidence (process lists, network captures, backups).
- Avoid “filling in” missing data with assumptions; instead, state possible interpretations.
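A simple gap check can support (never replace) that documentation. The sketch below flags jumps between consecutive timestamps larger than a threshold; it assumes the log has already been normalized to epoch seconds in the first field, and the entries are fabricated sample data.

```shell
# Sketch: flag large jumps between consecutive timestamps as potential
# tampering indicators. Assumes epoch seconds in field 1; entries are
# fabricated.
log=$(mktemp)
cat > "$log" <<'EOF'
1733800000 cron job ran
1733800060 cron job ran
1733800120 cron job ran
1733815000 cron job ran
EOF

# Print any gap wider than max seconds (here one hour) between lines.
gaps=$(awk -v max=3600 \
  'NR > 1 && $1 - prev > max { print prev " -> " $1 } { prev = $1 }' "$log")
echo "$gaps"
```

A reported gap is only a lead: rotation, downtime, or quiet periods also produce them, so corroborate before concluding tampering.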
Recovering Configuration and Application Files
Non-log files often matter as much as logs:
- Configuration files:
  - `/etc/*` (e.g., `sshd_config`, `sudoers`, web server configs).
  - Application config under `/opt`, `/usr/local`, or `$HOME`.
- Application state:
  - Web roots (`/var/www/`), database data directories, custom binaries/scripts.
To recover:
- Apply the same toolkit as for logs (undelete, carving, snapshots).
- For textual configs, even partial fragments are valuable for spotting attacker-introduced directives (e.g., `ProxyCommand`, `ForceCommand`, extra `Include` lines).
Recovered versions can be used to:
- Compare pre- and post-incident states.
- Restore services to known good configurations.
- Identify persistence mechanisms embedded in configs.
Handling Partially Overwritten or Fragmented Files
Sometimes logs or files are only partially available:
- Start or end of file missing.
- Fragmentation due to disk usage after deletion.
- Overwritten sectors in the middle.
Strategies:
- Partial reconstruction:
- Combine intact portions from:
- Recovered file fragments.
- Rotated or backup copies.
- Use text pattern recognition to align fragments by timestamp or line structure.
- Marking uncertainty:
- Clearly annotate which parts are continuous and which involve gaps.
- Do not alter content; preserve original fragments separately from reconstructed views used for analysis.
Integrity, Hashing, and Documentation
All recovered content must be handled with forensic rigor.
Hashing Recovered Files
- Compute cryptographic hashes:
```shell
sha256sum /mnt/recovered/auth.log > auth.log.sha256
```

- For large sets, generate a manifest:

```shell
find /mnt/recovered -type f -exec sha256sum {} \; > recovered_hashes.txt
```

These hashes let you:
- Prove files were not modified after recovery.
- Reference exact versions in reports and legal processes.
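The manifest approach pairs naturally with `sha256sum -c` for later verification. This sketch builds a manifest for a fabricated evidence tree, keeps the manifest outside that tree so it never hashes itself, then shows that a post-recovery modification is detected.

```shell
# Sketch: build a hash manifest for recovered files and verify it later
# with sha256sum -c. All paths and contents are fabricated sample data.
rec=$(mktemp -d)
printf 'recovered log line\n' > "$rec/auth.log"
printf 'recovered config\n' > "$rec/sshd_config"

# Manifest at recovery time, stored outside the evidence tree so the
# manifest never hashes itself.
find "$rec" -type f -exec sha256sum {} \; > "$rec.manifest"

# Verification at report time: non-zero exit means something changed.
sha256sum -c --quiet "$rec.manifest" >/dev/null 2>&1 && check=PASS || check=FAIL

# Simulate post-recovery tampering and re-check: verification must fail.
printf 'x' >> "$rec/auth.log"
sha256sum -c --quiet "$rec.manifest" >/dev/null 2>&1 && check2=PASS || check2=FAIL
echo "$check $check2"
```

Store the manifest with the case artifacts and re-run the check before every report revision.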
Documenting Recovery Steps
Maintain a clear record of:
- Tools and versions used (e.g., `extundelete 0.2.4`).
- Exact commands, including options.
- Time and host environment of each operation.
- Any errors or warnings observed.
Use a read-only operations log (e.g., plain text file under version control) and include it with the case artifacts.
Common Pitfalls and How to Avoid Them
- Performing recovery on the live compromised system:
- Risk of overwriting evidence; always favor disk images and read-only mounts.
- Running defragmentation or disk cleanup tools:
- On Linux, this is rare but some admins try “optimization.” It destroys unallocated artifacts; avoid during investigations.
- Not checking off-system logs:
- Missing opportunities from log servers, network equipment, or cloud services.
- Over-relying on a single tool:
- If `extundelete` fails, try carving or snapshot analysis; different approaches reveal different data.
Putting It Together: Practical Mini-Workflow
As a concise example for log and file recovery during an incident:
- Preserve:
- Acquire a disk image with hashing.
- Mount image read-only.
- Inventory existing logs and configs:
- Plain logs, rotated/compressed, journals, application logs.
- Consolidate:
- Decompress and merge rotated logs into full timelines.
- Export `systemd` journal segments.
- Recover:
- Attempt undelete (filesystem-appropriate tools) for key logs and config files.
- Use snapshots or backups to retrieve earlier states.
- Apply file carving to unallocated space for additional fragments.
- Correlate:
- Combine logs, recovered files, metadata, and off-system sources to build an event timeline.
- Mark explicit gaps or tampering suspicions.
- Verify and document:
- Hash recovered artifacts.
- Record commands and tools used.
- Preserve both raw recovered files and any derived analysis products (e.g., combined timelines).
This structured approach maximizes the chance of recovering useful logs and files while maintaining forensic soundness.