Understanding Goals and Constraints
In incident response, “log and file recovery” has two distinct but related goals:
- Recovering evidence that was deleted, tampered with, rotated, or partially overwritten (forensic focus).
- Recovering operational data (e.g., configuration files, application data) to restore service or reconstruct events (response focus).
At this stage, you should:
- Preserve as much of the current state as possible (including free space and slack space).
- Avoid unnecessary writes to affected disks.
- Work from copies or disk images whenever possible.
This chapter focuses on how to recover and reconstruct logs and files, not on the overall incident workflow or general evidence collection (covered elsewhere in the parent section).
Types of Data Relevant to Recovery
For log and file recovery, you will primarily deal with:
- Plain-text logs (e.g., `/var/log/secure`, `/var/log/auth.log`, web server logs).
- Binary/application logs (e.g., SQLite DBs, custom binary formats).
- Rotated/compressed logs (e.g., `*.1`, `*.gz`, `*.xz`).
- Configuration files (sometimes deleted or replaced by attackers).
- User and application data files (documents, database files, etc.).
- Filesystem metadata (timestamps, directory entries, inodes) that imply file and log history even when contents are gone.
Each has different recovery strategies and success probabilities depending on filesystem, time elapsed, and system activity.
Immediate Preservation Measures
Before attempting recovery:
- Remount read-only if appropriate and feasible:
  - Using `mount -o remount,ro /dev/sdXN /mountpoint` on a live system, or
  - Preferably, power down and use forensic boot media or a hardware write-blocker.
- Acquire a disk image:
- Use tools that support hashing and verification, such as:
```shell
dc3dd if=/dev/sdX of=/evidence/disk.img hash=sha256 log=/evidence/dc3dd.log
```

- For large systems, consider partition-level or LVM snapshot–based images.
All recovery work should be done on the image, not on the original disk.
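Hash verification can be rehearsed without touching real evidence. The sketch below is a minimal stand-in for what `dc3dd`'s inline hashing gives you: it "images" a scratch file with plain `dd` and confirms the copy's SHA-256 matches the source. All paths here are hypothetical.

```shell
# Sketch: confirm an image matches its source by hash before analysis.
# A scratch file stands in for the real device (/dev/sdX); dc3dd would
# compute the same hash inline during acquisition.
workdir=$(mktemp -d)
printf 'fake disk contents\n' > "$workdir/source.dev"

# "Acquire" the image, then hash both sides independently.
dd if="$workdir/source.dev" of="$workdir/disk.img" bs=4096 2>/dev/null
src_hash=$(sha256sum "$workdir/source.dev" | awk '{print $1}')
img_hash=$(sha256sum "$workdir/disk.img" | awk '{print $1}')

# Refuse to proceed unless the hashes agree.
[ "$src_hash" = "$img_hash" ] && verify_status=OK || verify_status=MISMATCH
echo "$verify_status"
```

Record both hashes in the acquisition log; a mismatch means the image must be reacquired, not analyzed.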
Sources of Log Data Beyond Obvious Files
Attackers frequently try to clean or modify visible logs (e.g., auth.log), but logs or equivalents may exist elsewhere:
- Rotated logs:
  - `/var/log/auth.log.1`, `/var/log/auth.log.2.gz`
  - `/var/log/secure-20251201`, `/var/log/messages-2025-12-10.xz`
- Journal log storage:
  - `systemd` journals in `/var/log/journal/` or `/run/log/journal/`
- Application-specific directories:
  - Web: `/var/log/nginx/`, `/var/log/apache2/`
  - Databases: `/var/log/mysql/`, transaction logs, WAL files.
- Service-specific logs in `$HOME`:
  - E.g., `.xsession-errors`, application logs under `~/.local/share/` or `~/.cache/`.
- Remote/centralized logging:
  - `rsyslog`/`syslog-ng` forwarding to log servers.
  - SIEM systems (e.g., ELK, Splunk) and cloud logging (CloudWatch, Stackdriver, etc.).
- Non-log artifacts with evidential value:
  - Shell history files (`~/.bash_history`, `~/.zsh_history`).
  - Cron logs and crontab files.
  - SSH known_hosts, authorized_keys, and configuration files.
  - Web server config and `.htaccess` files that show redirections, backdoors, or tampering.
  - Browser history (for user actions relevant to the incident).
Before deep recovery, inventory all obvious and secondary log locations; many “lost logs” turn out to exist in a rotated or alternate location.
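That inventory step can be partly automated. The following sketch walks a log tree and lists live, rotated, and compressed logs; the directory layout here is fabricated sample data, and in practice you would point `logroot` at the mounted image's `/var/log`.

```shell
# Sketch: inventory plain, rotated, and compressed logs. The tree below is
# fabricated; point logroot at the mounted image's /var/log in practice.
logroot=$(mktemp -d)
touch "$logroot/auth.log" "$logroot/auth.log.1"
printf 'old entries\n' | gzip > "$logroot/auth.log.2.gz"
mkdir -p "$logroot/nginx"
touch "$logroot/nginx/access.log"

# Catch live logs, numeric rotations, and common compressed rotations,
# including application subdirectories.
inventory=$(find "$logroot" -type f \
  \( -name '*.log' -o -name '*.log.[0-9]*' -o -name '*.gz' -o -name '*.xz' \) \
  | sort)
echo "$inventory"
```

Keep the resulting list with the case notes; it documents what existed at acquisition time.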
Recovering Rotated and Compressed Logs
Log rotation is not adversarial; it’s routine. But it can hide or fragment the timeline.
Identifying Rotated Logs
- Use filename patterns:

```shell
ls -1 /var/log | grep -E '(\.1|\.2|\.gz|\.xz|[0-9]{8})$'
```

- For application logs with custom rotation:
  - Inspect logrotate configs in `/etc/logrotate.conf` and `/etc/logrotate.d/`.
Decompressing and Viewing
Common formats:
- gzip (`*.gz`):

```shell
zcat /var/log/auth.log.2.gz | less
```

- xz (`*.xz`):

```shell
xzcat /var/log/journal-20251210.xz | less
```

- bzip2 (`*.bz2`):

```shell
bzcat /var/log/someapp.log.3.bz2 | less
```

Use `zgrep`, `xzgrep`, etc. to filter without full decompression.
Reconstructing Log Timelines
To rebuild a continuous timeline:
- Order files by rotation policy; usually the oldest has the highest suffix (e.g., `.7.gz`).
- Concatenate in the correct order:

```shell
zcat /var/log/auth.log.7.gz /var/log/auth.log.6.gz ... \
  | cat - /var/log/auth.log.1 /var/log/auth.log \
  > /tmp/auth_full_timeline.log
```

- Adjust for overlapping or missing time windows:
- Verify continuity with timestamps.
- Note explicit gaps where compression or truncation removed data.
Be explicit in your notes where logs are missing due to rotation; do not silently infer actions in missing periods.
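The rotation-ordered merge described above can be exercised end to end on throwaway data. Everything below (file names, dates, log lines) is hypothetical sample data standing in for `/var/log/auth.log*` on a mounted image.

```shell
# Sketch: rebuild one continuous timeline from rotated logs, oldest first.
# File names, dates, and entries are fabricated stand-ins for /var/log/auth.log*.
d=$(mktemp -d)
printf '2025-12-08 ssh login A\n' | gzip > "$d/auth.log.2.gz"
printf '2025-12-09 ssh login B\n' > "$d/auth.log.1"
printf '2025-12-10 ssh login C\n' > "$d/auth.log"

# Highest suffix = oldest: decompress that first, then append the plain
# rotation and finally the live log.
zcat "$d/auth.log.2.gz" \
  | cat - "$d/auth.log.1" "$d/auth.log" \
  > "$d/auth_full_timeline.log"
cat "$d/auth_full_timeline.log"
```

After merging, scan the combined file for out-of-order timestamps; any reversal points to a rotation mistake or tampering.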
Recovering `systemd` Journals
If the system uses systemd journaling:
- Check journal directories:
  - Persistent: `/var/log/journal/`
  - Volatile: `/run/log/journal/`
Viewing and Exporting
- Full journal export to text:
```shell
journalctl --no-pager --output=short-iso > /tmp/journal_full.log
```

- Restricted by service, unit, or time:

```shell
journalctl -u ssh.service --since "2025-12-10 00:00" \
  --until "2025-12-11 00:00" > ssh_journal.log
```

Recovering Deleted or Truncated Journal Entries
If `journalctl` shows gaps or journals appear wiped:
- Look for rotated journal files:
  - `.journal` and `.journal~` artifacts may exist in the journal directories.
- Carve journal files from unallocated space:
  - Journal files contain identifiable headers (the `LPKSHHRH` magic, etc.); use file carvers (discussed below) with custom signatures.
- Recover from backups or snapshots:
  - Journals are often captured in filesystem-level snapshots (Btrfs, LVM, ZFS).
Direct manual repair of journal binary files is non-trivial; focus on preservation and extraction rather than editing.
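One low-tech way to locate carving candidates is to scan the raw image for the journal magic directly. The sketch below fabricates a tiny "image" with the `LPKSHHRH` signature at a known offset and finds it with GNU `grep`'s byte-offset mode; a real carver would then extract data starting at such offsets.

```shell
# Sketch: find candidate journal headers in a raw image by byte offset.
# The "image" is fabricated: 100 bytes of padding, the LPKSHHRH magic,
# then more padding.
img=$(mktemp)
{ head -c 100 /dev/zero; printf 'LPKSHHRH'; head -c 50 /dev/zero; } > "$img"

# GNU grep: -a treats binary as text, -o prints only matches, -b prefixes
# each match with its byte offset in the file.
offsets=$(grep -aob 'LPKSHHRH' "$img" | cut -d: -f1)
echo "$offsets"
```

Offsets found this way can seed custom signatures for scalpel or foremost rather than being extracted by hand.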
File Recovery from Filesystem Artifacts
When logs or other files were deleted (whether intentionally or due to normal rotation), recovery depends heavily on the filesystem.
EXT Family (ext2/ext3/ext4)
Common tools:
- TestDisk – partition and filesystem recovery.
- extundelete – undelete files on an unmounted ext filesystem or image.
- PhotoRec – file carving (filesystem-agnostic).
Workflow Example (ext4, using `extundelete`)
- Work on an image:

```shell
losetup -P /dev/loop0 /evidence/disk.img
mkdir /mnt/recovered
```

- Identify the partition with `fdisk -l /dev/loop0` (assume `/dev/loop0p2`).
- Run extundelete:

```shell
extundelete /dev/loop0p2 --restore-file var/log/auth.log \
  --output-dir /mnt/recovered
```

- If the path is unknown, try:

```shell
extundelete /dev/loop0p2 --restore-all --output-dir /mnt/recovered
```

Results may be partial if blocks have been overwritten.
XFS/Btrfs/Other Filesystems
Each filesystem has different recovery tools and limitations:
- XFS:
  - `xfs_repair` (integrity, not undelete).
  - `xfs_metadump` for metadata extraction.
  - File carving is often needed for content.
- Btrfs:
  - Use `btrfs restore` to access older versions.
  - Snapshots greatly increase the chances of recovery.
- ZFS:
  - Native snapshots (`zfs list -t snapshot`, `zfs rollback`, `zfs clone`).
In many non-ext cases, snapshots are more productive than raw undelete attempts.
File Carving for Logs and Text
When filesystem-level undelete fails, you can attempt file carving from unallocated space or raw images.
Basic Principles
- File carving ignores filesystem metadata and searches for:
- Signatures (magic bytes) to determine start of files.
- Optional heuristics to guess file boundaries.
Tools
- PhotoRec:
  - Part of the TestDisk package; recovers files by type.
- scalpel / foremost:
  - Signature-based file carving.
Using PhotoRec
- Run on the image:
```shell
photorec /log /d /mnt/recovered /cmd /evidence/disk.img options,search
```

(Interactive mode is typical; configure it to search for log, txt, and gz.)
- After carving:
  - Use the `file` command to identify recovered files.
  - Use `grep`, `zgrep` to search for relevant event patterns.
Carving for Text-Based Logs
- Many logs are plain text; carving them is noisy.
- Strategy:
- Carve to a separate directory.
- Use indicators of compromise (IOCs) to filter:

```shell
grep -R "suspicious_user" /mnt/recovered | head
```

- Look for recognizable log headers or date patterns.
Remember that carved logs may not retain original names or paths; correlate with contents and timestamps where possible.
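Filtering carved output by IOCs and by log-shaped lines can be scripted. In this sketch the carve directory, file names, and the `suspicious_user` IOC are all hypothetical sample data.

```shell
# Sketch: separate signal from noise in carved output. The directory,
# file names, and IOC string are fabricated sample data.
carved=$(mktemp -d)
printf 'Dec 10 03:12:01 host sshd[991]: Accepted password for suspicious_user\n' \
  > "$carved/f0001.txt"
printf 'random binary junk with no log structure\n' > "$carved/f0002.txt"

# Pass 1: files mentioning a known IOC.
ioc_hits=$(grep -Rl 'suspicious_user' "$carved")

# Pass 2: files whose lines look like syslog entries (month, day, time).
loglike=$(grep -RlE '^[A-Z][a-z]{2} +[0-9]+ [0-9:]{8}' "$carved")
echo "$loglike"
```

Running both passes and intersecting the results shrinks thousands of carved fragments down to a reviewable set.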
Recovering from Snapshots and Backups
Modern Linux systems frequently use snapshotting or regular backups. These can be invaluable when logs or files have been cleaned.
LVM Snapshots
If the system used LVM and snapshots existed before the incident:
- List snapshots:

```shell
lvs
```

- Mount a snapshot (read-only) and copy logs:

```shell
lvcreate -s -n forensic_snap -L 10G /dev/vg0/root
mount -o ro /dev/vg0/forensic_snap /mnt/forensic
cp -a /mnt/forensic/var/log /analysis/logs_preincident
```

If historical snapshots are gone, this may not help, but on some systems backup LVs or older snapshots persist.
Btrfs Snapshots
- List snapshots:

```shell
btrfs subvolume list /
```

- Mount or access snapshot subvolumes (often under `.snapshots`):

```shell
btrfs subvolume show /.snapshots/123/snapshot
mount -o ro,bind /.snapshots/123/snapshot /mnt/snap123
```

Copy logs or configuration from these snapshots for comparison.
ZFS Snapshots
- List:

```shell
zfs list -t snapshot
```

- Clone for analysis (read-only, mounted at a forensics path):

```shell
zfs clone -o readonly=on -o mountpoint=/mnt/forensics \
  pool/root@pre-incident pool/forensics
```

Traditional Backups
Backups might be:
- `rsync`-based directories.
- `tar` archives.
- Commercial backup products.
For logs:
- Extract only needed time windows to avoid clutter.
- Note: Backups may be incomplete or filtered; validate their policy (e.g., logs only kept for N days).
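Extracting a time window from restored backup logs is straightforward when timestamps are ISO 8601, because they sort lexically in chronological order. The sketch below assumes that format; the backup file and its entries are fabricated.

```shell
# Sketch: pull only a time window out of a restored backup log. Assumes
# ISO 8601 timestamps in field 1 (they sort lexically = chronologically);
# the entries are fabricated.
bak=$(mktemp)
cat > "$bak" <<'EOF'
2025-12-09T23:50:00 pre-window event
2025-12-10T01:15:00 in-window event A
2025-12-10T22:40:00 in-window event B
2025-12-11T00:05:00 post-window event
EOF

# String comparison is enough: ISO timestamps order correctly as text.
window=$(awk -v start='2025-12-10T00:00:00' -v end='2025-12-10T23:59:59' \
  '$1 >= start && $1 <= end' "$bak")
echo "$window"
```

For syslog-style timestamps ("Dec 10 03:12:01"), normalize to ISO or epoch first; lexical comparison does not work across month names.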
Reconstruction When Logs Are Missing or Tampered
Often, you will not fully “recover” logs, but rather reconstruct events from partial artifacts.
Cross-Correlation of Multiple Sources
If /var/log/auth.log is wiped, consider:
- SSH daemon `systemd` journal entries.
- `last`/`lastb` output derived from `/var/log/wtmp` and `/var/log/btmp`.
- Application logs (e.g., VPN, web app).
- Network device logs (firewalls, routers).
- Cloud provider audit logs (e.g., AWS CloudTrail).
A single event (like an SSH login) usually leaves traces in several locations; one may be intact even if another is removed.
Using Metadata as Evidence
Even without file contents, metadata can indicate activity:
- Timestamps on the logs themselves:

```shell
ls -l --full-time /var/log/auth.log
```

  - Sudden truncation times can be linked to attacker actions.
- Directory metadata:
  - `stat /var/log`, or timeline tools like `log2timeline` (from other toolsets).
- Inode reuse patterns:
  - Multiple recreated log files may share changed inodes, visible through forensic tools.
Detecting and Handling Log Tampering
Indicators:
- Gaps in timestamps or sequences.
- Inconsistent time zones or formats.
- Missing log entries you know should exist (e.g., associated events elsewhere).
- Traces of "log cleaner" scripts, or manual-edit markers (`vi` backup files, etc.).
For reconstruction:
- Document where logs are suspected to be incomplete.
- Use other corroborative evidence (process lists, network captures, backups).
- Avoid “filling in” missing data with assumptions; instead, state possible interpretations.
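A simple gap check can support (never replace) that documentation. The sketch below flags jumps between consecutive timestamps larger than a threshold; it assumes the log has already been normalized to epoch seconds in the first field, and the entries are fabricated sample data.

```shell
# Sketch: flag large jumps between consecutive timestamps as potential
# tampering indicators. Assumes epoch seconds in field 1; entries are
# fabricated.
log=$(mktemp)
cat > "$log" <<'EOF'
1733800000 cron job ran
1733800060 cron job ran
1733800120 cron job ran
1733815000 cron job ran
EOF

# Print any gap wider than max seconds (here one hour) between lines.
gaps=$(awk -v max=3600 \
  'NR > 1 && $1 - prev > max { print prev " -> " $1 } { prev = $1 }' "$log")
echo "$gaps"
```

A reported gap is only a lead: rotation, downtime, or quiet periods also produce them, so corroborate before concluding tampering.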
Recovering Configuration and Application Files
Non-log files often matter as much as logs:
- Configuration files:
  - `/etc/*` (e.g., `sshd_config`, `sudoers`, web server configs).
  - Application config under `/opt`, `/usr/local`, or `$HOME`.
- Application state:
  - Web roots (`/var/www/`), database data directories, custom binaries/scripts.
To recover:
- Apply the same toolkit as for logs (undelete, carving, snapshots).
- For textual configs, even partial fragments are valuable for spotting attacker-introduced directives (e.g., `ProxyCommand`, `ForceCommand`, extra `Include` lines).
Recovered versions can be used to:
- Compare pre- and post-incident states.
- Restore services to known good configurations.
- Identify persistence mechanisms embedded in configs.
Handling Partially Overwritten or Fragmented Files
Sometimes logs or files are only partially available:
- Start or end of file missing.
- Fragmentation due to disk usage after deletion.
- Overwritten sectors in the middle.
Strategies:
- Partial reconstruction:
- Combine intact portions from:
- Recovered file fragments.
- Rotated or backup copies.
- Use text pattern recognition to align fragments by timestamp or line structure.
- Marking uncertainty:
- Clearly annotate which parts are continuous and which involve gaps.
- Do not alter content; preserve original fragments separately from reconstructed views used for analysis.
Integrity, Hashing, and Documentation
All recovered content must be handled with forensic rigor.
Hashing Recovered Files
- Compute cryptographic hashes:
```shell
sha256sum /mnt/recovered/auth.log > auth.log.sha256
```

- For large sets, generate a manifest:

```shell
find /mnt/recovered -type f -exec sha256sum {} \; > recovered_hashes.txt
```

These hashes let you:
- Prove files were not modified after recovery.
- Reference exact versions in reports and legal processes.
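The manifest approach pairs naturally with `sha256sum -c` for later verification. This sketch builds a manifest for a fabricated evidence tree, keeps the manifest outside that tree so it never hashes itself, then shows that a post-recovery modification is detected.

```shell
# Sketch: build a hash manifest for recovered files and verify it later
# with sha256sum -c. All paths and contents are fabricated sample data.
rec=$(mktemp -d)
printf 'recovered log line\n' > "$rec/auth.log"
printf 'recovered config\n' > "$rec/sshd_config"

# Manifest at recovery time, stored outside the evidence tree so the
# manifest never hashes itself.
find "$rec" -type f -exec sha256sum {} \; > "$rec.manifest"

# Verification at report time: non-zero exit means something changed.
sha256sum -c --quiet "$rec.manifest" >/dev/null 2>&1 && check=PASS || check=FAIL

# Simulate post-recovery tampering and re-check: verification must fail.
printf 'x' >> "$rec/auth.log"
sha256sum -c --quiet "$rec.manifest" >/dev/null 2>&1 && check2=PASS || check2=FAIL
echo "$check $check2"
```

Store the manifest with the case artifacts and re-run the check before every report revision.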
Documenting Recovery Steps
Maintain a clear record of:
- Tools and versions used (e.g., `extundelete 0.2.4`).
- Exact commands, including options.
- Time and host environment of each operation.
- Any errors or warnings observed.
Use a read-only operations log (e.g., plain text file under version control) and include it with the case artifacts.
Common Pitfalls and How to Avoid Them
- Performing recovery on the live compromised system:
- Risk of overwriting evidence; always favor disk images and read-only mounts.
- Running defragmentation or disk cleanup tools:
- On Linux, this is rare but some admins try “optimization.” It destroys unallocated artifacts; avoid during investigations.
- Not checking off-system logs:
- Missing opportunities from log servers, network equipment, or cloud services.
- Over-relying on a single tool:
- If `extundelete` fails, try carving or snapshot analysis; different approaches reveal different data.
Putting It Together: Practical Mini-Workflow
As a concise example for log and file recovery during an incident:
- Preserve:
- Acquire a disk image with hashing.
- Mount image read-only.
- Inventory existing logs and configs:
- Plain logs, rotated/compressed, journals, application logs.
- Consolidate:
- Decompress and merge rotated logs into full timelines.
- Export `systemd` journal segments.
- Recover:
- Attempt undelete (filesystem-appropriate tools) for key logs and config files.
- Use snapshots or backups to retrieve earlier states.
- Apply file carving to unallocated space for additional fragments.
- Correlate:
- Combine logs, recovered files, metadata, and off-system sources to build an event timeline.
- Mark explicit gaps or tampering suspicions.
- Verify and document:
- Hash recovered artifacts.
- Record commands and tools used.
- Preserve both raw recovered files and any derived analysis products (e.g., combined timelines).
This structured approach maximizes the chance of recovering useful logs and files while maintaining forensic soundness.