Artifact Collection Feature

Overview

The artifact collection feature lets sus copy files that match analysis patterns while preserving their metadata. It is inspired by forensic tools such as UAC (Unix-like Artifacts Collector) and KAPE (Kroll Artifact Parser and Extractor).

Usage

Enable artifact collection with the --collect flag:

# Basic collection
sus /path/to/analyze --collect --output-dir ./investigation

# With profiles
sus /evidence --profile profiles/composite/forensic-investigation.toml --collect --output-dir ./collected

# Incident response collection
sus /compromised-system --profile profiles/composite/incident-response.toml --collect

Output Structure

When collection is enabled, artifacts are organized in the output directory:

output/
├── analysis.db              # Analysis database
├── files/                   # Analyzed file metadata
├── collected/               # Collected artifacts (NEW)
│   ├── default/            # Files without specific tags
│   ├── server1/            # Files tagged as 'server1'
│   └── manifest.json       # Collection manifest
└── manifest.json           # Artifact manifest

Artifact Organization

Collected artifacts are organized by tag and filename:

  • Files are grouped by their assigned tag (set with the --tag-dir option); files without a tag go to default/
  • Each filename is prefixed with the file's SHA-256 hash to prevent name conflicts
  • The original directory structure is flattened for easier review

Example:

collected/
├── server1/
│   ├── a1b2c3d4...xyz_suspicious.exe
│   ├── e5f6g7h8...abc_malware.dll
│   └── ...
└── server2/
    ├── 9i0j1k2l...def_backdoor.sh
    └── ...
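
The naming scheme shown above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the helper names are hypothetical, and it assumes the prefix is the full hex SHA-256 digest followed by an underscore and the original basename (the example above shows the digest abbreviated).

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collected_path(collected_root: Path, source: Path, tag: Optional[str] = None) -> Path:
    """Build <collected_root>/<tag or 'default'>/<sha256>_<basename>.

    ASSUMPTION: the full digest is used as the prefix; the documentation
    only shows it in abbreviated form.
    """
    return collected_root / (tag or "default") / f"{sha256_of(source)}_{source.name}"
```

Because the digest depends only on file content, two different files that share a basename land under different names, which is the conflict the prefix is meant to prevent.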

Artifact Manifest

The manifest (collected/manifest.json) contains detailed information about each collected artifact:

{
  "version": "1.0",
  "collection_started": "2024-01-15T10:30:00Z",
  "collection_completed": "2024-01-15T11:45:00Z",
  "artifact_count": 42,
  "total_size": 1048576,
  "artifacts": [
    {
      "original_path": "/path/to/file.exe",
      "collected_path": "/output/collected/server1/abc123...xyz_file.exe",
      "sha256": "abc123...",
      "file_size": 24576,
      "created": "2024-01-10T08:00:00Z",
      "modified": "2024-01-14T15:30:00Z",
      "accessed": "2024-01-15T10:25:00Z",
      "permissions": 755,
      "uid": 1000,
      "gid": 1000,
      "collection_timestamp": "2024-01-15T10:35:00Z",
      "tag": "server1",
      "mime_type": "application/x-executable"
    }
  ],
  "notes": []
}
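
As a rough illustration of how the manifest can be consumed, the Python sketch below loads it and totals artifacts per tag. The function names are hypothetical. It also assumes that the permissions field holds the octal mode written as decimal digits (755 meaning rwxr-xr-x), which the sample suggests but does not state.

```python
import json
from collections import defaultdict


def load_manifest(path: str) -> dict:
    """Parse a manifest.json file into a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_manifest(manifest: dict) -> dict:
    """Return {tag: {"count": n, "bytes": total}} for all artifacts."""
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for artifact in manifest.get("artifacts", []):
        tag = artifact.get("tag") or "default"
        summary[tag]["count"] += 1
        summary[tag]["bytes"] += artifact.get("file_size", 0)
    return dict(summary)


def mode_from_manifest(permissions: int) -> int:
    """Interpret e.g. 755 as the octal mode 0o755.

    ASSUMPTION: the manifest stores octal digits as a decimal number.
    """
    return int(str(permissions), 8)
```

A typical call would be summarize_manifest(load_manifest("collected/manifest.json")).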

Metadata Preservation

The collection feature preserves file metadata:

All Platforms

  • File size
  • Creation time (if available)
  • Modification time
  • Access time
  • SHA256 hash
  • MIME type

Unix/Linux

  • File permissions (mode)
  • User ID (UID)
  • Group ID (GID)

Notes

  • Ownership preservation (chown) typically requires root privileges
  • Timestamps are preserved where the filesystem supports it
  • Symbolic links are not followed (only their metadata is collected)
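
To make the notes above concrete, here is a minimal Python sketch of a metadata-preserving copy. It is not the tool's code: the function name is hypothetical, it relies on shutil.copy2 for permission bits and timestamps, and it treats ownership as best effort because chown usually needs elevated privileges. Symbolic links are simply rejected here rather than having their metadata recorded.

```python
import os
import shutil
from pathlib import Path


def copy_preserving_metadata(source: Path, dest: Path) -> bool:
    """Copy a regular file, keeping mode and timestamps.

    Returns True if ownership (uid/gid) was also preserved, else False.
    """
    if source.is_symlink():
        # This sketch does not copy links; it never follows them.
        raise ValueError(f"refusing to copy symbolic link: {source}")

    dest.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies content, permission bits, and access/modification times
    # where the platform and filesystem allow it.
    shutil.copy2(source, dest)

    if not hasattr(os, "chown"):  # os.chown is unavailable on Windows
        return False
    info = source.stat()
    try:
        os.chown(dest, info.st_uid, info.st_gid)
        return True
    except PermissionError:
        # Typical when not running as root and the owner differs.
        return False
```

Creation time is not handled here; most platforms offer no portable way to set it on the copy, which is why the manifest snapshot of the original timestamps matters.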

Collection Process

  1. Analysis Phase: Files are analyzed normally using profiles and patterns
  2. Pattern Matching: Files with pattern matches are identified
  3. Collection Phase: During cleanup, matched files are collected:
    • Query database for files with pattern matches
    • Exclude extracted/decoded files (collect originals only)
    • Copy files while preserving metadata
    • Create manifest records
  4. Manifest Saving: Complete manifest is saved as JSON
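
The record creation and manifest saving steps can be sketched as follows. This is a hedged Python illustration with hypothetical function names, not the tool's implementation. It fills only the fields that can be derived from a plain stat call, omits created and mime_type, and assumes total_size is the sum of the artifact sizes.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def _iso(ts: float) -> str:
    """Format a POSIX timestamp as UTC ISO 8601 with a trailing Z."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def manifest_record(source: Path, collected: Path, tag: str = "default") -> dict:
    """Snapshot the source file's metadata at collection time."""
    info = source.stat()  # taken before reading, so atime reflects the original
    return {
        "original_path": str(source),
        "collected_path": str(collected),
        # Reads the whole file at once; acceptable for a sketch.
        "sha256": hashlib.sha256(source.read_bytes()).hexdigest(),
        "file_size": info.st_size,
        "modified": _iso(info.st_mtime),
        "accessed": _iso(info.st_atime),
        # ASSUMPTION: octal mode digits stored as a decimal number (e.g. 755).
        "permissions": int(oct(info.st_mode & 0o777)[2:]),
        "uid": info.st_uid,
        "gid": info.st_gid,
        "collection_timestamp": _now(),
        "tag": tag,
    }


def write_manifest(records: List[dict], path: Path, started: str) -> dict:
    """Write the complete manifest once, after all records exist."""
    manifest = {
        "version": "1.0",
        "collection_started": started,
        "collection_completed": _now(),
        "artifact_count": len(records),
        "total_size": sum(r["file_size"] for r in records),  # assumed meaning
        "artifacts": records,
        "notes": [],
    }
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```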

Filtering

Collection automatically filters files:

  • Only files with pattern matches are collected
  • Extracted files from archives are excluded (originals collected instead)
  • Decoded files are excluded (originals collected instead)
  • Files that no longer exist are skipped
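
The filtering rules above amount to a simple predicate. The Python sketch below expresses them over a hypothetical AnalyzedFile record; the real database schema is not documented here, so every field name is an assumption. It only excludes extracted and decoded files and does not try to model how the tool picks up their originals.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, List


@dataclass
class AnalyzedFile:
    """Hypothetical stand-in for one row of analysis results."""
    path: Path
    match_count: int
    is_extracted: bool = False  # produced by unpacking an archive
    is_decoded: bool = False    # produced by decoding another file


def select_for_collection(files: Iterable[AnalyzedFile]) -> List[AnalyzedFile]:
    """Keep matched originals that still exist on disk."""
    return [
        f for f in files
        if f.match_count > 0       # only files with pattern matches
        and not f.is_extracted     # skip archive members
        and not f.is_decoded       # skip decoded derivatives
        and f.path.is_file()       # skip files that have disappeared
    ]
```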

Integration with Profiles

Collection works with all profiles:

# Collect malware samples
sus /samples --profile profiles/base/malware.toml --collect

# Collect PCI compliance violations
sus /data --profile profiles/composite/pci-compliance.toml --collect

# Comprehensive forensic collection
sus /evidence --profile profiles/composite/forensic-investigation.toml --collect

Use Cases

Incident Response

Collect evidence of compromise:

sus /var/log --profile profiles/composite/incident-response.toml \
    --collect --output-dir ./ir-evidence

Compliance Auditing

Collect files with PII or sensitive data:

sus /share/documents --profile profiles/composite/pci-compliance.toml \
    --collect --output-dir ./compliance-violations

Malware Triage

Collect suspicious executables:

sus /downloads --profile profiles/base/malware.toml \
    --collect --output-dir ./malware-samples

Multi-System Collection

Tag and collect from multiple systems:

sus /mnt/server1 --tag-dir '/mnt/server1:server1' \
    /mnt/server2 --tag-dir '/mnt/server2:server2' \
    --profile profiles/composite/forensic-investigation.toml \
    --collect --output-dir ./multi-system-collection
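
One way to picture how the DIR:TAG values passed to --tag-dir might map files to tags is the Python sketch below. The parsing and matching rules are assumptions made for illustration (split on the last colon, deepest matching directory wins, fall back to default); the tool's actual behavior may differ.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable


def parse_tag_dirs(specs: Iterable[str]) -> Dict[PurePosixPath, str]:
    """Parse 'DIR:TAG' strings, splitting on the last colon."""
    mapping: Dict[PurePosixPath, str] = {}
    for spec in specs:
        directory, sep, tag = spec.rpartition(":")
        if not sep or not directory or not tag:
            raise ValueError(f"expected DIR:TAG, got {spec!r}")
        mapping[PurePosixPath(directory)] = tag
    return mapping


def tag_for(path: str, mapping: Dict[PurePosixPath, str], default: str = "default") -> str:
    """Return the tag of the deepest mapped directory that contains path."""
    candidate = PurePosixPath(path)
    best = None
    for directory, tag in mapping.items():
        # Compare whole path components, so /mnt/server1 does not
        # accidentally match /mnt/server10.
        if directory == candidate or directory in candidate.parents:
            if best is None or len(directory.parts) > len(best[0].parts):
                best = (directory, tag)
    return best[1] if best else default
```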

Chain of Custody

The manifest provides chain of custody information:

  • Original file path and collection time
  • File hashes for integrity verification
  • Metadata snapshots at collection time
  • Collection version and tool information

Performance Considerations

  • Collection happens after analysis completes
  • Files are copied on blocking worker threads via spawn_blocking, so file I/O does not stall the async runtime
  • Large files are handled with memory-mapped I/O during analysis
  • Manifest is written once at the end of collection
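
The idea behind running copies on blocking worker threads can be shown with a Python analogue. This is not the tool's code: asyncio.to_thread stands in for spawn_blocking, and the function name and the concurrency limit are illustrative choices.

```python
import asyncio
import shutil
from pathlib import Path
from typing import Iterable, List, Tuple


async def copy_all(pairs: Iterable[Tuple[Path, Path]], max_concurrent: int = 4) -> List[Path]:
    """Copy (source, dest) pairs without blocking the event loop."""
    limit = asyncio.Semaphore(max_concurrent)

    async def copy_one(source: Path, dest: Path) -> Path:
        async with limit:
            dest.parent.mkdir(parents=True, exist_ok=True)
            # shutil.copy2 is a blocking call; hand it to a worker thread
            # so other coroutines keep running while the copy proceeds.
            await asyncio.to_thread(shutil.copy2, source, dest)
            return dest

    # gather preserves the order of the input pairs in its result list.
    return await asyncio.gather(*(copy_one(s, d) for s, d in pairs))
```

The semaphore caps how many copies run at once, which keeps a large collection from saturating the disk or the thread pool.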

Security Considerations

  • Collected files may contain malware - handle with care
  • Collection does not sanitize or quarantine files
  • Protect the collected directory with restrictive permissions
  • Consider encrypting the collected artifacts directory
  • Verify manifest hashes before using collected artifacts
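
Hash verification against the manifest can be done with a short script. The Python sketch below is one possible approach, with hypothetical function names. It assumes each sha256 field holds the full hex digest (the sample manifest shows it elided) and that collected_path is still valid from where the script runs.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> List[str]:
    """Return collected paths that are missing or whose hash differs."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    failures = []
    for artifact in manifest["artifacts"]:
        collected = Path(artifact["collected_path"])
        expected = artifact["sha256"].lower()
        if not collected.is_file() or file_sha256(collected) != expected:
            failures.append(str(collected))
    return failures
```

An empty result means every listed artifact is present and unchanged since collection.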

Limitations

  • Does not create forensic images (E01, AFF) - files are copied as-is
  • Does not preserve NTFS Alternate Data Streams (ADS)
  • Does not preserve extended attributes on all platforms
  • Requires sufficient disk space for collected artifacts
  • Ownership preservation requires appropriate privileges

Future Enhancements

Planned improvements:

  • Support for forensic image formats (E01, AFF)
  • Preservation of NTFS ADS and extended attributes
  • Encryption of collected artifacts
  • Compression of collection
  • Incremental collection (collect only new matches)
  • Collection reports in PDF/HTML format

Example Workflow

Complete incident response workflow:

# 1. Analyze and collect (store the output directory in a variable so the
#    later steps refer to this run even if older ir-* directories exist)
OUT="./ir-$(date +%Y%m%d-%H%M%S)"
sus /compromised-system \
    --profile profiles/composite/incident-response.toml \
    --collect \
    --output-dir "$OUT"

# 2. Review manifest
jq '.artifact_count' "$OUT/collected/manifest.json"

# 3. Extract specific artifacts (treat a missing mime_type as empty)
jq -r '.artifacts[] | select((.mime_type // "") | contains("executable")) | .collected_path' \
    "$OUT/collected/manifest.json"

# 4. Generate report
sus --server-only --output-dir "$OUT"
# Access http://localhost:8080 to review findings

Troubleshooting

No artifacts collected

  • Verify files match patterns (check analysis.db)
  • Ensure --collect flag is specified
  • Check file permissions for reading source files

Missing metadata

  • Some filesystems don't support all metadata
  • Creation time may not be available on all platforms
  • Extended attributes require platform-specific support

Permission errors

  • Ensure read access to source files
  • Ensure write access to output directory
  • Ownership preservation requires root/admin

Disk space issues

  • Monitor available space before collection
  • Use --max-file-size to limit large files
  • Consider selective profiles to reduce matches
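
A pre-flight space check along the lines suggested above could look like the following Python sketch. The function name and the 10% headroom are illustrative assumptions. The output directory must already exist for the free-space query to work.

```python
import shutil
from pathlib import Path
from typing import Iterable


def has_room_for(sources: Iterable[Path], output_dir: Path, headroom: float = 1.1) -> bool:
    """Return True if output_dir's filesystem can hold copies of sources.

    headroom multiplies the required size to leave space for the manifest
    and filesystem overhead.
    """
    needed = sum(p.stat().st_size for p in sources if p.is_file())
    free = shutil.disk_usage(output_dir).free
    return needed * headroom <= free
```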