
Chapter 3: Detection and Analysis

Introduction

Detection and analysis represents the most technically challenging phase of incident response. Organizations face a fundamental asymmetry: defenders must detect every successful attack, while adversaries need only evade detection once. This chapter explores the indicators, sources, methodologies, and decision frameworks that enable security teams to identify genuine incidents amidst vast quantities of security events, false positives, and sophisticated evasion techniques.

Effective detection and analysis requires both art and science—the science of systematic log analysis and correlation, combined with the art of recognizing subtle anomalies that indicate adversary presence.

Indicators of Compromise (IOCs)

Indicators of Compromise (IOCs) are forensic artifacts or observables that suggest a system has been breached or is currently under attack. IOCs serve as the foundational clues that security analysts use to detect malicious activity.

Types of Indicators

Atomic Indicators

Simple, discrete values that cannot be broken down further:

  • IP addresses associated with command and control (C2) servers
  • Domain names used by malware
  • File hashes (MD5, SHA-1, SHA-256) of known malicious files
  • Email addresses used in phishing campaigns

Hash Limitations

File hashes are highly specific but fragile—changing a single byte produces a completely different hash. Adversaries trivially evade hash-based detection by recompiling malware.
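
A minimal sketch of this fragility, using Python's standard `hashlib`. The two byte strings are made-up stand-ins for a binary and differ only in their final byte:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical payloads: identical except for the last byte.
original = b"example-payload-v1\x00"
modified = b"example-payload-v1\x01"

digest_a = sha256_hex(original)
digest_b = sha256_hex(modified)

# Count how many hex characters differ between the two digests.
differing = sum(1 for a, b in zip(digest_a, digest_b) if a != b)

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
```

A blocklist containing only `digest_a` would not match the modified payload, which is why hash matches are best treated as one signal among several.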

Behavioral Indicators

Actions or sequences that indicate malicious intent:

  • Privilege escalation attempts
  • Lateral movement between systems
  • Persistence mechanism creation
  • Credential dumping activities
  • Data staging and exfiltration

The Pyramid of Pain

Security researcher David Bianco developed the "Pyramid of Pain" to categorize IOCs by how difficult they are for adversaries to change:

graph TD
    A[TTP - Tactics, Techniques, Procedures] --> B[Tools]
    B --> C[Network/Host Artifacts]
    C --> D[Domain Names]
    D --> E[IP Addresses]
    E --> F[Hash Values]

    style A fill:#ff0000
    style B fill:#ff6600
    style C fill:#ffcc00
    style D fill:#ffff00
    style E fill:#ccff00
    style F fill:#00ff00

From Bottom to Top:

  1. Hash Values: Trivial to change (recompile malware)
  2. IP Addresses: Easy to change (rent new infrastructure)
  3. Domain Names: Simple to change (register new domains)
  4. Network/Host Artifacts: Annoying to change (requires code modifications)
  5. Tools: Challenging to change (investment in custom malware)
  6. TTPs: Tough to change (fundamental adversary methodology)

Focus on TTPs

Mature detection programs emphasize behavioral detection of tactics, techniques, and procedures rather than relying solely on atomic indicators that adversaries easily change.

MITRE ATT&CK Framework

The MITRE ATT&CK framework provides a comprehensive knowledge base of adversary tactics and techniques based on real-world observations. ATT&CK organizes adversary behavior into:

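Tactics: The adversary's tactical goals (what they are trying to accomplish). Enterprise tactics include:

  • Initial Access, Execution, Persistence, Privilege Escalation, Defense Evasion
  • Credential Access, Discovery, Lateral Movement, Collection
  • Command and Control, Exfiltration, Impact

The current Enterprise matrix also lists Reconnaissance and Resource Development, which cover adversary activity before initial access.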

Techniques: How adversaries achieve tactical goals (specific methods)

Organizations use ATT&CK to map detection capabilities, identify gaps, prioritize investments, and communicate using common vocabulary.
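
Gap identification can be sketched as a simple coverage check. The tactic names below come from the list in this section; the rule names and their mappings are hypothetical and stand in for whatever detections an organization actually runs:

```python
# Tactics as listed in this section.
TACTICS = [
    "Initial Access", "Execution", "Persistence", "Privilege Escalation",
    "Defense Evasion", "Credential Access", "Discovery", "Lateral Movement",
    "Collection", "Command and Control", "Exfiltration", "Impact",
]

# Hypothetical detection rules mapped to the tactics they cover.
detection_rules = {
    "office_spawns_shell": ["Execution"],
    "new_scheduled_task": ["Persistence", "Privilege Escalation"],
    "lsass_memory_read": ["Credential Access"],
    "dns_beaconing": ["Command and Control"],
}


def coverage_gaps(rules, tactics):
    """Return the tactics that no rule is mapped to, in matrix order."""
    covered = {tactic for mapped in rules.values() for tactic in mapped}
    return [tactic for tactic in tactics if tactic not in covered]


gaps = coverage_gaps(detection_rules, TACTICS)
print("Tactics with no detection coverage:", ", ".join(gaps))
```

A real mapping would usually work at the technique level rather than the tactic level, since one rule per tactic says little about depth of coverage.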

Detection Sources

Effective incident detection requires visibility across multiple data sources.

Security Information and Event Management (SIEM)

SIEMs aggregate, normalize, and correlate logs from diverse sources:

Common Log Sources:

  • Firewalls and network security devices
  • Intrusion detection/prevention systems
  • Web proxies and DNS servers
  • Endpoint operating systems (Windows Event Logs, syslog)
  • Authentication systems (Active Directory, LDAP)
  • Cloud infrastructure (AWS CloudTrail, Azure Monitor)

SIEM Capabilities:

  • Real-time event correlation
  • Alert generation based on rules and anomalies
  • Historical log search and analysis
  • Compliance reporting
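
To make event correlation concrete, here is a sketch of one common rule: a successful login that follows a burst of failures from the same source. The event fields (`time`, `user`, `src_ip`, `outcome`), the threshold, and the window are assumptions for illustration, not the schema or defaults of any particular SIEM:

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def correlate_bruteforce(events, threshold=5, window=timedelta(minutes=10)):
    """Return (user, src_ip, time) for each success that follows at least
    `threshold` failures from the same user and source within `window`."""
    failures = defaultdict(deque)  # (user, src_ip) -> recent failure times
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        recent = failures[(ev["user"], ev["src_ip"])]
        # Drop failures that have aged out of the correlation window.
        while recent and ev["time"] - recent[0] > window:
            recent.popleft()
        if ev["outcome"] == "failure":
            recent.append(ev["time"])
        elif ev["outcome"] == "success":
            if len(recent) >= threshold:
                alerts.append((ev["user"], ev["src_ip"], ev["time"]))
            recent.clear()
    return alerts


# Hypothetical normalized authentication events.
start = datetime(2024, 1, 1, 9, 0)
events = [
    {"time": start + timedelta(minutes=i), "user": "alice",
     "src_ip": "203.0.113.7", "outcome": "failure"}
    for i in range(5)
]
events += [
    {"time": start + timedelta(minutes=5), "user": "alice",
     "src_ip": "203.0.113.7", "outcome": "success"},
    {"time": start, "user": "bob",
     "src_ip": "198.51.100.4", "outcome": "failure"},
    {"time": start + timedelta(minutes=1), "user": "bob",
     "src_ip": "198.51.100.4", "outcome": "success"},
]

alerts = correlate_bruteforce(events)
print(alerts)
```

Only the first account produces an alert; a single mistyped password followed by a success does not reach the threshold.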

SIEM Challenges

Common pitfalls include misconfigured log sources, overwhelming alert volumes from poorly tuned rules, retention periods too short to support investigations, and high licensing costs driven by data volume.

Endpoint Detection and Response (EDR)

EDR platforms provide deep visibility into endpoint activity:

Telemetry Collected:

  • Process execution and command-line arguments
  • Network connections initiated from endpoints
  • File system modifications
  • Registry changes (Windows)
  • Authentication and user activity
  • Memory analysis

EDR Advantages:

  • Continuous monitoring beyond signature-based antivirus
  • Behavioral analysis detecting unknown threats
  • Forensic data available for investigation
  • Remote containment and remediation capabilities
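
Behavioral analysis over process telemetry can be illustrated with a parent/child rule: an Office application launching a command interpreter. The event fields (`parent_image`, `image`, `command_line`) and the process lists are assumptions for this sketch, not the schema of any specific EDR product:

```python
# Hypothetical rule inputs: Office parents and shell children.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}
SHELL_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe"}


def basename(windows_path: str) -> str:
    """Return the lowercase file name from a Windows-style path."""
    return windows_path.replace("/", "\\").rsplit("\\", 1)[-1].lower()


def is_suspicious_spawn(event: dict) -> bool:
    """True when an Office process starts a shell or script host."""
    return (basename(event["parent_image"]) in OFFICE_PARENTS
            and basename(event["image"]) in SHELL_CHILDREN)


# Hypothetical process-creation events.
events = [
    {"parent_image": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
     "image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
     "command_line": "powershell.exe -NoProfile -EncodedCommand <redacted>"},
    {"parent_image": r"C:\Windows\explorer.exe",
     "image": r"C:\Windows\System32\cmd.exe",
     "command_line": "cmd.exe"},
]

flagged = [e for e in events if is_suspicious_spawn(e)]
for e in flagged:
    print("Suspicious spawn:", basename(e["parent_image"]), "->", basename(e["image"]))
```

The rule matches on the relationship between processes rather than on a file hash, so it still fires when the payload itself has never been seen before.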

Network Traffic Analysis

Specialized tools analyze network traffic for threats:

Capabilities:

  • Full packet capture for forensic analysis
  • Metadata extraction (NetFlow, Zeek logs)
  • Encrypted traffic analysis (certificate inspection, traffic patterns)
  • Detection of lateral movement and C2 communication
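
Traffic-pattern analysis works even when payloads are encrypted. One heuristic is to look for beaconing: connections to the same destination at nearly constant intervals. The sketch below assumes connection timestamps (in seconds) have already been extracted from flow metadata; the thresholds are illustrative, not tuned values:

```python
from statistics import mean, pstdev


def looks_like_beacon(timestamps, min_events=6, max_jitter=0.1):
    """Flag connection times whose intervals are nearly constant.

    max_jitter is the largest allowed coefficient of variation
    (standard deviation of the intervals divided by their mean).
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(intervals)
    if average == 0:
        return False
    return pstdev(intervals) / average <= max_jitter


# Hypothetical connection times to a single destination, in seconds.
implant_like = [0, 60, 121, 180, 241, 300, 360]   # roughly every 60 seconds
browsing_like = [0, 2, 3, 50, 400, 410, 900]      # irregular human activity

beacon_result = looks_like_beacon(implant_like)
browsing_result = looks_like_beacon(browsing_like)
print(beacon_result, browsing_result)
```

Adversaries can add random jitter to defeat a fixed threshold, and legitimate software (update checks, telemetry agents) also beacons, so results from a check like this need allowlisting and analyst review.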

Threat Intelligence Feeds

External intelligence sources provide context about known threats:

Types of Intelligence:

  • Strategic: High-level trends and adversary motivations
  • Tactical: Adversary TTPs and campaign information
  • Operational: Details of ongoing campaigns
  • Technical: IOCs (IPs, domains, hashes)

Intelligence-Driven Detection

When threat intelligence indicates adversaries targeting your industry are using specific malware families, proactively hunt for associated IOCs and TTPs in your environment rather than waiting for alerts.
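
A proactive IOC sweep over historical logs can be sketched as a set lookup. The field names (`dest_domain`, `dest_ip`), the indicator values, and the log records are all hypothetical; the addresses use reserved documentation ranges:

```python
def hunt(records, iocs):
    """Return (record, field, value) for every log field that matches an IOC.

    `iocs` maps a log field name to a set of lowercase indicator values.
    """
    hits = []
    for record in records:
        for field_name, bad_values in iocs.items():
            value = record.get(field_name)
            if value is not None and value.lower() in bad_values:
                hits.append((record, field_name, value))
    return hits


# Hypothetical technical intelligence for a malware family.
iocs = {
    "dest_domain": {"c2.example.net"},
    "dest_ip": {"203.0.113.50"},
}

# Hypothetical proxy log records already retained in the environment.
proxy_logs = [
    {"host": "ws-014", "dest_domain": "intranet.example.com", "dest_ip": "192.0.2.10"},
    {"host": "ws-027", "dest_domain": "C2.example.net", "dest_ip": "203.0.113.50"},
]

hits = hunt(proxy_logs, iocs)
for record, field_name, value in hits:
    print(f"{record['host']}: {field_name} matched {value}")
```

Because atomic indicators are easy for adversaries to rotate, a sweep like this complements, rather than replaces, hunting for the associated TTPs.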

Triage Methodology

When security alerts fire, analysts must quickly determine which represent genuine incidents requiring response versus false positives or low-priority events.

Triage Process Flow

flowchart TD
    A[Alert Generated] --> B{Initial Assessment}
    B -->|Clear False Positive| C[Tune Detection Rule]
    B -->|Potentially Malicious| D[Gather Context]

    D --> E[Enrich with Threat Intel]
    D --> F[Check Historical Activity]
    D --> G[Identify Affected Systems]

    E --> H{Validate Incident}
    F --> H
    G --> H

    H -->|False Positive| I[Document & Close]
    H -->|True Positive| J[Classify & Prioritize]

    J --> K[Assign Severity]
    J --> L[Determine Scope]

    K --> M[Escalate to Incident Handler]
    L --> M

    style A fill:#e1f5ff
    style H fill:#ffe1e1
    style M fill:#ff9999

Initial Assessment

Within minutes of alert generation, analysts should:

  1. Review Alert Details: What triggered the detection? What IOCs or behaviors were observed?
  2. Check Alert Confidence: Is this a high-fidelity detection rule or a noisy signature?
  3. Quick Context Lookup: Is the source IP internal or external? Is the user account legitimate?
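
The triage flow can be sketched as a small decision function. The outcomes mirror the endpoints of the flowchart; the input fields (`fidelity`, `matches_known_benign`, `intel_match`, `unusual_for_user`, `other_hosts_affected`) are assumptions chosen for illustration, and a real process would weigh far more evidence:

```python
def triage(alert: dict, context: dict) -> str:
    """Return the next step for an alert, following the triage flow."""
    # Initial assessment: a noisy rule firing on a known-benign pattern
    # is a clear false positive, so the rule itself needs tuning.
    if alert["fidelity"] == "low" and alert["matches_known_benign"]:
        return "tune_detection_rule"

    # Validation: any corroborating context makes this a true positive.
    corroborating = (
        context["intel_match"],
        context["unusual_for_user"],
        context["other_hosts_affected"],
    )
    if any(corroborating):
        return "classify_and_prioritize"
    return "document_and_close"


# Hypothetical alerts and gathered context.
no_context = {"intel_match": False, "unusual_for_user": False,
              "other_hosts_affected": False}

scanner_noise = triage({"fidelity": "low", "matches_known_benign": True}, no_context)
confirmed = triage({"fidelity": "high", "matches_known_benign": False},
                   {**no_context, "intel_match": True})
benign = triage({"fidelity": "high", "matches_known_benign": False}, no_context)
print(scanner_noise, confirmed, benign)
```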

Context Gathering

For alerts that aren't immediately dismissible:

System Context:

  • What is the affected system's role? (Domain controller, file server, workstation)
  • Who owns or uses the system?
  • What sensitive data does it access?

User Context:

  • Is the associated user account active and legitimate?
  • What are normal activities for this user?
  • Has this account shown suspicious activity previously?

Threat Intelligence Enrichment:

  • Are associated IPs/domains known malicious infrastructure?
  • Do file hashes match known malware families?
  • Are observed techniques associated with specific threat actors?

Automated Enrichment

Security orchestration, automation, and response (SOAR) platforms automatically enrich alerts with context from threat intelligence, CMDB, and other sources, accelerating triage.
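
A minimal sketch of what such enrichment does, with in-memory dictionaries standing in for the CMDB and the threat intelligence service. The field names and lookup data are hypothetical; a SOAR platform would make these lookups through its own integrations:

```python
def enrich(alert: dict, cmdb: dict, intel: dict) -> dict:
    """Return a copy of the alert with asset and threat-intel context added."""
    enriched = dict(alert)
    asset = cmdb.get(alert["host"], {})
    enriched["asset_role"] = asset.get("role", "unknown")
    enriched["asset_owner"] = asset.get("owner", "unknown")
    # None means the destination is not in the intelligence data set.
    enriched["intel"] = intel.get(alert["dest_ip"])
    return enriched


# Hypothetical lookup sources.
cmdb = {"srv-db-01": {"role": "database server", "owner": "finance-it"}}
intel = {"203.0.113.50": {"label": "known C2", "confidence": "high"}}

alert = {"rule": "outbound_to_rare_ip", "host": "srv-db-01", "dest_ip": "203.0.113.50"}
enriched = enrich(alert, cmdb, intel)
print(enriched)
```

With the asset role and intelligence label attached, an analyst can see at a glance that a database server is talking to known malicious infrastructure, without running either lookup by hand.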

Incident Classification and Prioritization

Once an incident is validated, it must be classified and prioritized to allocate appropriate resources.

Classification Schemes

By Attack Vector:

  • Phishing / Social Engineering
  • Malware Infection
  • Web Application Attack
  • Denial of Service
  • Unauthorized Access
  • Data Breach
  • Insider Threat
  • Supply Chain Compromise

By Affected Assets:

  • Endpoint (workstation, mobile device)
  • Server (application, database, domain controller)
  • Network infrastructure
  • Cloud infrastructure
  • IoT / Operational Technology

Severity Rating

Organizations typically use 4-5 severity levels:

| Severity | Criteria | Response Time | Examples |
| --- | --- | --- | --- |
| Critical | Imminent threat to critical systems; data exfiltration in progress | < 15 minutes | Ransomware encrypting servers; active data breach; compromised domain controller |
| High | Confirmed compromise of important systems; potential for significant damage | < 1 hour | Malware on multiple endpoints; credential theft from privileged account |
| Medium | Confirmed incident with limited scope; potential for escalation | < 4 hours | Single workstation infection; successful phishing (no payload execution) |
| Low | Isolated event with minimal impact | < 24 hours | Failed login attempts; external port scanning |
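
The response-time targets in the table can be turned into concrete deadlines. This sketch encodes the four targets as upper bounds; the function name and the sample detection time are illustrative:

```python
from datetime import datetime, timedelta, timezone

# Response-time targets from the severity table, as upper bounds.
RESPONSE_TARGETS = {
    "critical": timedelta(minutes=15),
    "high": timedelta(hours=1),
    "medium": timedelta(hours=4),
    "low": timedelta(hours=24),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the latest time by which response should have started."""
    return detected_at + RESPONSE_TARGETS[severity.lower()]


# Hypothetical detection time, recorded in UTC.
detected = datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)
critical_deadline = response_deadline("Critical", detected)
low_deadline = response_deadline("low", detected)
print(critical_deadline.isoformat(), low_deadline.isoformat())
```

Storing timestamps with an explicit timezone avoids ambiguity when responders in different regions work from the same record.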

Dynamic Severity

Incident severity can change as investigation reveals additional information. Continuously reassess and escalate as needed.

Prioritization Factors

Beyond severity, consider:

Business Impact:

  • Systems affected and their criticality to operations
  • Potential financial losses
  • Regulatory or legal implications
  • Reputational damage

Attack Sophistication:

  • Commodity malware versus advanced persistent threat
  • Automated attack versus targeted human adversary
  • Known versus zero-day vulnerabilities

Scope and Spread:

  • Number of affected systems
  • Potential for lateral movement
  • Access to sensitive data or critical infrastructure
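
One way to combine these factors is a weighted score used to order the incident queue. The weights and the 1 to 5 rating scale below are assumptions for illustration only; each organization would choose its own scheme:

```python
# Hypothetical weights for the three prioritization factors (sum to 1.0).
WEIGHTS = {"business_impact": 0.5, "sophistication": 0.2, "scope": 0.3}


def priority_score(ratings: dict) -> float:
    """Combine 1 (low) to 5 (high) ratings into a weighted score."""
    for name in WEIGHTS:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical validated incidents with analyst-assigned ratings.
incidents = {
    "ransomware on file server": {"business_impact": 5, "sophistication": 3, "scope": 4},
    "adware on one laptop": {"business_impact": 1, "sophistication": 1, "scope": 1},
}

queue = sorted(incidents, key=lambda name: priority_score(incidents[name]), reverse=True)
for name in queue:
    print(f"{priority_score(incidents[name]):.1f}  {name}")
```

A score like this supports ordering within a severity level; it should not override the severity rating or analyst judgment.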

Forensic Evidence Preservation

From the moment an incident is detected, all response actions must preserve digital evidence.

Chain of Custody

Chain of custody is the documented, unbroken chronological record showing who collected, handled, transferred, analyzed, or stored evidence.

Required Documentation:

  • What evidence was collected (description, hash values)
  • Who collected it (name, role)
  • When it was collected (date, time, timezone)
  • Where it was collected from (system identifier, physical location)
  • How it was collected (tools, methodology)
  • Where it was stored (evidence locker, storage media)
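
The required documentation maps naturally onto a structured record. The sketch below is one possible shape, with hypothetical class and field names; it enforces only that custody events are appended in chronological order:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyEvent:
    action: str          # e.g. "collected", "transferred", "analyzed"
    handler: str         # name and role of the person responsible
    timestamp: datetime  # timezone-aware
    location: str        # where the evidence is after this action


@dataclass
class EvidenceItem:
    description: str
    sha256: str
    source_system: str
    collection_tool: str
    history: list = field(default_factory=list)

    def record(self, action, handler, location, when=None):
        """Append a custody event, rejecting out-of-order timestamps."""
        when = when or datetime.now(timezone.utc)
        if self.history and when < self.history[-1].timestamp:
            raise ValueError("custody events must be chronological")
        self.history.append(CustodyEvent(action, handler, when, location))


# Hypothetical evidence item; the hash is a placeholder, not a real digest.
item = EvidenceItem(
    description="Memory image of workstation ws-027",
    sha256="<sha256 of the image file>",
    source_system="ws-027",
    collection_tool="WinPMem",
)
t0 = datetime(2024, 3, 5, 15, 0, tzinfo=timezone.utc)
item.record("collected", "J. Doe (IR analyst)", "ws-027, Building 2", when=t0)
item.record("transferred", "J. Doe (IR analyst)", "Evidence locker A",
            when=datetime(2024, 3, 5, 16, 30, tzinfo=timezone.utc))

for event in item.history:
    print(event.timestamp.isoformat(), event.action, event.handler, event.location)
```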

Legal Admissibility

Evidence with a broken chain of custody may be inadmissible in legal proceedings. Treat every incident as potentially leading to litigation.

Evidence Collection Best Practices

Order of Volatility: Collect evidence from most volatile (will be lost quickly) to least volatile:

  1. CPU registers and cache
  2. Memory (RAM)
  3. Network connections and process tables
  4. Temporary file systems
  5. Disk storage
  6. Remote logging and monitoring data
  7. Archival media and backups
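
The ordering above can be applied mechanically when planning collection. In this sketch the source identifiers are hypothetical short names for the seven categories in the list:

```python
# Most volatile first, following the order of volatility list.
VOLATILITY_ORDER = [
    "cpu_registers_cache",
    "memory",
    "network_and_processes",
    "temp_filesystems",
    "disk",
    "remote_logs",
    "archival_media",
]


def plan_collection(available_sources):
    """Sort the available evidence sources from most to least volatile."""
    rank = {name: position for position, name in enumerate(VOLATILITY_ORDER)}
    return sorted(available_sources, key=lambda source: rank[source])


plan = plan_collection(["disk", "remote_logs", "memory", "network_and_processes"])
print(plan)
```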

Collection Techniques:

Memory Acquisition:

  • Use tools like FTK Imager, WinPMem, or LiME
  • Capture before shutting down systems
  • Critical for detecting fileless malware

Disk Imaging:

  • Create forensic copies using write-blocking hardware
  • Generate and document cryptographic hashes
  • Work from copies, never original evidence
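
Hash generation and later verification can be sketched with `hashlib`, reading the image in chunks so large files do not need to fit in memory. The temporary file below is a stand-in for a real disk image, and the function names are illustrative:

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 of a file by reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(recorded_hash, copy_path):
    """True when the working copy still matches the hash recorded at acquisition."""
    return sha256_file(copy_path) == recorded_hash


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for a forensic image; a real one would come from an imaging tool.
    image_path = os.path.join(workdir, "evidence.img")
    with open(image_path, "wb") as handle:
        handle.write(b"\x00" * 4096 + b"stand-in disk contents")

    recorded = sha256_file(image_path)        # documented at acquisition time
    copy_ok = verify_copy(recorded, image_path)

    # Simulate an accidental modification of the working copy.
    with open(image_path, "ab") as handle:
        handle.write(b"!")
    tampered_ok = verify_copy(recorded, image_path)

print(copy_ok, tampered_ok)
```

Re-verifying the hash before and after analysis shows that the working copy was not altered along the way.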

Log Preservation:

  • Immediately export relevant logs to secure storage
  • Prevent log rotation from destroying evidence
  • Capture logs from all systems in scope

Detection Challenges and Best Practices

Common Challenges

Alert Fatigue:

  • High volumes of false positives exhaust analysts
  • Important alerts missed in the noise

Mitigation:

  • Continuously tune detection rules to reduce false positives
  • Implement tiered alerting (automated response for low-severity)
  • Use SOAR platforms for automated triage and enrichment
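
Rule tuning can be guided by measuring each rule's precision: the share of its closed alerts that analysts marked as true positives. The verdict labels, rule names, and thresholds below are assumptions for illustration:

```python
from collections import Counter


def noisy_rules(closed_alerts, min_alerts=20, max_precision=0.1):
    """Return {rule: precision} for rules with enough volume and low precision.

    `closed_alerts` is an iterable of (rule_name, verdict) pairs, where the
    verdict is "true_positive" or "false_positive".
    """
    totals = Counter()
    true_positives = Counter()
    for rule, verdict in closed_alerts:
        totals[rule] += 1
        if verdict == "true_positive":
            true_positives[rule] += 1

    report = {}
    for rule, total in totals.items():
        precision = true_positives[rule] / total
        if total >= min_alerts and precision <= max_precision:
            report[rule] = precision
    return report


# Hypothetical closed-alert history.
closed = (
    [("generic_port_scan", "false_positive")] * 48
    + [("generic_port_scan", "true_positive")] * 2
    + [("lsass_memory_read", "true_positive")] * 9
    + [("lsass_memory_read", "false_positive")] * 1
)

candidates = noisy_rules(closed)
print(candidates)
```

Rules that surface here are candidates for tuning, downgrading to a lower alert tier, or retirement.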

Sophisticated Evasion:

  • Adversaries disable logging and monitoring
  • Living-off-the-land techniques using legitimate tools
  • Encrypted command and control channels

Mitigation:

  • Monitor for security tool tampering
  • Focus on behavioral detection, not just signatures
  • Implement defense in depth with multiple detection layers

Insider Threats:

  • Authorized users with legitimate access
  • Difficult to distinguish malicious from normal activity

Mitigation:

  • User and entity behavior analytics (UEBA)
  • Principle of least privilege
  • Separation of duties for sensitive operations
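
The core idea behind UEBA, comparing current activity against a per-user baseline, can be sketched with a z-score. This is a deliberately simple stand-in, not how any particular UEBA product works; the metric (files accessed per day) and the threshold are assumptions:

```python
from statistics import mean, stdev


def is_anomalous(history, today, z_threshold=3.0):
    """Flag today's value when it deviates strongly from the user's baseline."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today != baseline
    return abs(today - baseline) / spread > z_threshold


# Hypothetical daily counts of files accessed by one user.
files_per_day = [20, 25, 22, 18, 24, 21, 23]

bulk_download = is_anomalous(files_per_day, 400)
normal_day = is_anomalous(files_per_day, 24)
print(bulk_download, normal_day)
```

Production systems model many signals per user and peer group, but the principle is the same: an authorized user's activity is judged against what is normal for that user.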

Quality Over Quantity

A small number of well-tuned, high-fidelity detection rules is more valuable than hundreds of noisy signatures that generate alert fatigue.

Conclusion

Detection and analysis transforms raw security events into actionable incident intelligence. Success requires the right combination of technology (SIEM, EDR, threat intelligence), process (systematic triage and classification), and people (skilled analysts with training and tools).

Organizations that master detection and analysis gain the ability to identify adversaries early in the attack lifecycle—before significant damage occurs. This capability, combined with preparation activities, positions security teams to execute effective containment operations.

The next chapter explores containment strategies—the critical decisions and actions required to limit incident impact and prevent further compromise.

Key Takeaways

  • Focus detection on tactics, techniques, and procedures (TTPs), not just atomic indicators
  • Leverage multiple detection sources for comprehensive visibility
  • Implement systematic triage to efficiently validate incidents and reduce false positives
  • Classify and prioritize incidents based on severity, business impact, and scope
  • Preserve forensic evidence with proper chain of custody from the start
  • Continuously tune detections and address analyst alert fatigue
  • Use frameworks like MITRE ATT&CK for common vocabulary and gap analysis