Chapter 3: Detection and Analysis
Introduction
Detection and analysis represents the most technically challenging phase of incident response. Organizations face a fundamental asymmetry: defenders must detect every successful attack, while adversaries need only evade detection once. This chapter explores the indicators, sources, methodologies, and decision frameworks that enable security teams to identify genuine incidents amidst vast quantities of security events, false positives, and sophisticated evasion techniques.
Effective detection and analysis requires both art and science—the science of systematic log analysis and correlation, combined with the art of recognizing subtle anomalies that indicate adversary presence.
Indicators of Compromise (IOCs)
Indicators of Compromise (IOCs) are forensic artifacts or observables that suggest a system has been breached or is currently under attack. IOCs serve as the foundational clues that security analysts use to detect malicious activity.
Types of Indicators
Atomic Indicators
Simple, discrete values that cannot be broken down further:
- IP addresses associated with command and control (C2) servers
- Domain names used by malware
- File hashes (MD5, SHA-1, SHA-256) of known malicious files
- Email addresses used in phishing campaigns
Hash Limitations
File hashes are highly specific but fragile—changing a single byte produces a completely different hash. Adversaries trivially evade hash-based detection by recompiling malware.
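The fragility described above is easy to demonstrate. The following is a minimal sketch using Python's standard `hashlib`; the byte string is a made-up stand-in for a sample, not real malware, and the variable names are illustrative only.

```python
import hashlib

# Hypothetical stand-in for a malicious file's contents (assumption for illustration).
original = b"MZ\x90\x00 example payload bytes"
# Flip the lowest bit of the final byte: a one-bit change to the "file".
modified = original[:-1] + bytes([original[-1] ^ 0x01])


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original_hash = sha256_hex(original)
modified_hash = sha256_hex(modified)

# Count hex positions where the two digests still happen to agree.
matching_positions = sum(a == b for a, b in zip(original_hash, modified_hash))

print(original_hash)
print(modified_hash)
print(f"matching hex positions: {matching_positions} of {len(original_hash)}")
```

A blocklist containing only `original_hash` would not match the modified bytes, which is why hash matching works best as one layer among several.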
Behavioral Indicators
Actions or sequences that indicate malicious intent:
- Privilege escalation attempts
- Lateral movement between systems
- Persistence mechanism creation
- Credential dumping activities
- Data staging and exfiltration
The Pyramid of Pain
Security researcher David Bianco developed the "Pyramid of Pain" to categorize IOCs by how difficult they are for adversaries to change:
graph TD
A[TTP - Tactics, Techniques, Procedures] --> B[Tools]
B --> C[Network/Host Artifacts]
C --> D[Domain Names]
D --> E[IP Addresses]
E --> F[Hash Values]
style A fill:#ff0000
style B fill:#ff6600
style C fill:#ffcc00
style D fill:#ffff00
style E fill:#ccff00
style F fill:#00ff00
From Bottom to Top:
- Hash Values: Trivial to change (recompile malware)
- IP Addresses: Easy to change (rent new infrastructure)
- Domain Names: Simple to change (register new domains)
- Network/Host Artifacts: Annoying to change (requires code modifications)
- Tools: Challenging to change (investment in custom malware)
- TTPs: Tough to change (fundamental adversary methodology)
Focus on TTPs
Mature detection programs emphasize behavioral detection of tactics, techniques, and procedures rather than relying solely on atomic indicators that adversaries easily change.
MITRE ATT&CK Framework
The MITRE ATT&CK framework provides a comprehensive knowledge base of adversary tactics and techniques based on real-world observations. ATT&CK organizes adversary behavior into:
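Tactics: The adversary's technical goals (what they are trying to accomplish). The Enterprise matrix also includes two pre-compromise tactics, Reconnaissance and Resource Development, ahead of the following:
- Initial Access, Execution, Persistence, Privilege Escalation, Defense Evasion
- Credential Access, Discovery, Lateral Movement, Collection
- Command and Control, Exfiltration, Impact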
Techniques: How adversaries achieve tactical goals (specific methods)
Organizations use ATT&CK to map detection capabilities, identify gaps, prioritize investments, and communicate using common vocabulary.
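Mapping detections to tactics and looking for gaps can be sketched in a few lines. This example assumes a hypothetical rule inventory (`detection_rules`) maintained by the team; the rule names and their tactic assignments are illustrative, and `TACTICS` uses the tactic names from Initial Access through Impact given in this section.

```python
# Tactic names from Initial Access through Impact, as listed in this section.
TACTICS = [
    "Initial Access", "Execution", "Persistence", "Privilege Escalation",
    "Defense Evasion", "Credential Access", "Discovery", "Lateral Movement",
    "Collection", "Command and Control", "Exfiltration", "Impact",
]

# Hypothetical inventory: rule name -> tactics the team believes it covers.
detection_rules = {
    "Office application spawns script interpreter": ["Execution"],
    "New scheduled task created": ["Execution", "Persistence"],
    "LSASS process memory read": ["Credential Access"],
    "Periodic outbound beaconing": ["Command and Control"],
}


def coverage_by_tactic(rules):
    """Group rule names under each tactic they claim to cover."""
    coverage = {tactic: [] for tactic in TACTICS}
    for rule_name, tactics in rules.items():
        for tactic in tactics:
            coverage[tactic].append(rule_name)
    return coverage


coverage = coverage_by_tactic(detection_rules)
# A tactic with no rules at all is a candidate detection gap.
gaps = [tactic for tactic in TACTICS if not coverage[tactic]]

for tactic in TACTICS:
    print(f"{tactic:<22} {len(coverage[tactic])} rule(s)")
print("Uncovered tactics:", ", ".join(gaps))
```

Counting rules per tactic is a coarse measure; a real assessment would normally work at the technique level and account for rule quality.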
Detection Sources
Effective incident detection requires visibility across multiple data sources.
Security Information and Event Management (SIEM)
SIEMs aggregate, normalize, and correlate logs from diverse sources:
Common Log Sources:
- Firewalls and network security devices
- Intrusion detection/prevention systems
- Web proxies and DNS servers
- Endpoint operating systems (Windows Event Logs, syslog)
- Authentication systems (Active Directory, LDAP)
- Cloud infrastructure (AWS CloudTrail, Azure Monitor)
SIEM Capabilities:
- Real-time event correlation
- Alert generation based on rules and anomalies
- Historical log search and analysis
- Compliance reporting
SIEM Challenges
Common pitfalls include misconfigured log sources, overwhelming alert volumes from poorly tuned rules, retention periods too short to support investigations, and high licensing costs tied to data volume.
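To make "event correlation" concrete, here is a minimal sketch of one classic correlation rule: a successful login that follows a burst of failures for the same account. The event tuple layout, the `WINDOW` and `THRESHOLD` values, and the function name are assumptions for illustration; production SIEMs express this in their own rule languages.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)   # assumed look-back window
THRESHOLD = 5                   # assumed number of failures that makes a success suspicious


def correlate(events):
    """Return (user, time) pairs where a success follows >= THRESHOLD recent failures.

    events: iterable of (timestamp, user, outcome) with outcome "failure" or "success".
    """
    recent_failures = defaultdict(deque)
    alerts = []
    for ts, user, outcome in sorted(events):
        failures = recent_failures[user]
        # Drop failures that have aged out of the window.
        while failures and ts - failures[0] > WINDOW:
            failures.popleft()
        if outcome == "failure":
            failures.append(ts)
        elif outcome == "success":
            if len(failures) >= THRESHOLD:
                alerts.append((user, ts))
            failures.clear()
    return alerts


base = datetime(2024, 1, 1, 9, 0, 0)
events = [(base + timedelta(seconds=10 * i), "alice", "failure") for i in range(6)]
events.append((base + timedelta(seconds=60), "alice", "success"))
events.append((base + timedelta(seconds=5), "bob", "failure"))
events.append((base + timedelta(seconds=25), "bob", "success"))

for user, ts in correlate(events):
    print(f"Possible brute-force success for {user} at {ts.isoformat()}")
```

Both thresholds are tuning knobs: set too low, the rule adds to alert volume; set too high, slow password guessing slips under it.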
Endpoint Detection and Response (EDR)
EDR platforms provide deep visibility into endpoint activity:
Telemetry Collected:
- Process execution and command-line arguments
- Network connections initiated from endpoints
- File system modifications
- Registry changes (Windows)
- Authentication and user activity
- Memory analysis
EDR Advantages:
- Continuous monitoring beyond signature-based antivirus
- Behavioral analysis detecting unknown threats
- Forensic data available for investigation
- Remote containment and remediation capabilities
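As an illustration of behavioral analysis over process telemetry, the sketch below flags an Office application launching a script interpreter. The event dictionaries and their field names (`parent_image`, `image`, `command_line`) are hypothetical and do not correspond to any specific EDR product's schema.

```python
# Assumed lists for illustration; a real rule would be broader and tuned per environment.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}
SCRIPT_HOSTS = {"powershell.exe", "cmd.exe", "wscript.exe", "mshta.exe"}


def is_suspicious_spawn(event: dict) -> bool:
    """True when an Office process starts a script interpreter."""
    parent = event["parent_image"].lower()
    child = event["image"].lower()
    return parent in OFFICE_PARENTS and child in SCRIPT_HOSTS


# Hypothetical process-creation events.
process_events = [
    {"host": "ws-022", "parent_image": "WINWORD.EXE", "image": "powershell.exe",
     "command_line": "powershell.exe -enc <redacted>"},
    {"host": "ws-014", "parent_image": "explorer.exe", "image": "cmd.exe",
     "command_line": "cmd.exe /c dir"},
]

flagged = [e for e in process_events if is_suspicious_spawn(e)]
for e in flagged:
    print(f"{e['host']}: {e['parent_image']} -> {e['image']}")
```

The rule keys on a relationship between processes rather than on a file hash, so it keeps working when the payload itself changes.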
Network Traffic Analysis
Specialized tools analyze network traffic for threats:
Capabilities:
- Full packet capture for forensic analysis
- Metadata extraction (NetFlow, Zeek logs)
- Encrypted traffic analysis (certificate inspection, traffic patterns)
- Detection of lateral movement and C2 communication
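Traffic-pattern analysis works even when payloads are encrypted. One simple heuristic, sketched below, scores how regular the gaps between connections are: automated C2 check-ins tend to be evenly spaced, while human-driven traffic is bursty. The function name, the sample timestamps, and the use of the coefficient of variation as the score are illustrative choices, not a standard.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of the intervals between connections.

    Values near 0 mean very regular timing. Returns None when there are
    too few connections to judge.
    """
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(intervals) < 2:
        return None
    average = mean(intervals)
    if average == 0:
        return None
    return pstdev(intervals) / average


# Connection start times in seconds (hypothetical flow metadata).
suspected_beacon = [0, 60, 121, 180, 241, 300]
interactive_user = [0, 5, 200, 230, 900, 905]

print(f"beacon-like host: {beacon_score(suspected_beacon):.3f}")
print(f"interactive host: {beacon_score(interactive_user):.3f}")
```

Implants that add random jitter to their sleep interval raise this score, so the heuristic is a hunting aid rather than a verdict.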
Threat Intelligence Feeds
External intelligence sources provide context about known threats:
Types of Intelligence:
- Strategic: High-level trends and adversary motivations
- Tactical: Adversary TTPs and campaign information
- Operational: Details of ongoing campaigns
- Technical: IOCs (IPs, domains, hashes)
Intelligence-Driven Detection
When threat intelligence indicates adversaries targeting your industry are using specific malware families, proactively hunt for associated IOCs and TTPs in your environment rather than waiting for alerts.
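A proactive hunt for technical IOCs can be as simple as sweeping existing logs against a feed. The sketch below assumes a hypothetical feed structure and proxy-log records; the addresses come from documentation ranges and the domains use the reserved `.example` TLD.

```python
# Hypothetical technical intelligence (documentation IP range, reserved TLD).
ioc_feed = {
    "ips": {"203.0.113.45"},
    "domains": {"malicious.example"},
}

# Hypothetical proxy log records.
proxy_records = [
    {"host": "ws-014", "dest_ip": "198.51.100.7", "dest_domain": "intranet.example.org"},
    {"host": "ws-022", "dest_ip": "203.0.113.45", "dest_domain": "cdn.malicious.example"},
    {"host": "srv-003", "dest_ip": "198.51.100.9", "dest_domain": "notmalicious.example"},
]


def matches_domain(hostname, bad_domains):
    """Match the domain itself or any subdomain, but not look-alike suffixes."""
    hostname = hostname.lower().rstrip(".")
    return any(hostname == d or hostname.endswith("." + d) for d in bad_domains)


def hunt(records, feed):
    """Return (host, indicator_type, value) for every record that matches the feed."""
    hits = []
    for record in records:
        if record["dest_ip"] in feed["ips"]:
            hits.append((record["host"], "ip", record["dest_ip"]))
        if matches_domain(record["dest_domain"], feed["domains"]):
            hits.append((record["host"], "domain", record["dest_domain"]))
    return hits


hits = hunt(proxy_records, ioc_feed)
for host, kind, value in hits:
    print(f"{host}: matched {kind} indicator {value}")
```

Matching on the dotted suffix avoids flagging `notmalicious.example`, a small detail that noticeably cuts false positives in domain sweeps.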
Triage Methodology
When security alerts fire, analysts must quickly determine which represent genuine incidents requiring response versus false positives or low-priority events.
Triage Process Flow
flowchart TD
A[Alert Generated] --> B{Initial Assessment}
B -->|Clear False Positive| C[Tune Detection Rule]
B -->|Potentially Malicious| D[Gather Context]
D --> E[Enrich with Threat Intel]
D --> F[Check Historical Activity]
D --> G[Identify Affected Systems]
E --> H{Validate Incident}
F --> H
G --> H
H -->|False Positive| I[Document & Close]
H -->|True Positive| J[Classify & Prioritize]
J --> K[Assign Severity]
J --> L[Determine Scope]
K --> M[Escalate to Incident Handler]
L --> M
style A fill:#e1f5ff
style H fill:#ffe1e1
style M fill:#ff9999
Initial Assessment
Within minutes of alert generation, analysts should:
- Review Alert Details: What triggered the detection? What IOCs or behaviors were observed?
- Check Alert Confidence: Is this a high-fidelity detection rule or a noisy signature?
- Quick Context Lookup: Is the source IP internal or external? Is the user account legitimate?
Context Gathering
For alerts that aren't immediately dismissible:
System Context:
- What is the affected system's role? (Domain controller, file server, workstation)
- Who owns or uses the system?
- What sensitive data does it access?
User Context:
- Is the associated user account active and legitimate?
- What are normal activities for this user?
- Has this account shown suspicious activity previously?
Threat Intelligence Enrichment:
- Are associated IPs/domains known malicious infrastructure?
- Do file hashes match known malware families?
- Are observed techniques associated with specific threat actors?
Automated Enrichment
Security orchestration, automation, and response (SOAR) platforms automatically enrich alerts with context from threat intelligence, CMDB, and other sources, accelerating triage.
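The enrichment and first-pass routing described in this section can be sketched as follows. The lookup tables stand in for a threat-intelligence platform and a CMDB, and the routing policy in `triage` is a deliberately simple assumption for illustration, not a recommended standard or any SOAR product's API.

```python
# Hypothetical lookups standing in for a threat-intel platform and a CMDB.
THREAT_INTEL = {"203.0.113.45": {"verdict": "malicious", "family": "ExampleRAT"}}
ASSETS = {
    "dc-01": {"role": "domain controller", "owner": "identity team", "critical": True},
    "ws-022": {"role": "workstation", "owner": "j.doe", "critical": False},
}


def enrich(alert):
    """Attach threat-intel and asset context to a copy of the alert."""
    enriched = dict(alert)
    enriched["intel"] = THREAT_INTEL.get(alert["remote_ip"], {"verdict": "unknown"})
    enriched["asset"] = ASSETS.get(alert["host"], {"role": "unknown", "critical": False})
    return enriched


def triage(alert):
    """Route an alert using an assumed, simplified policy."""
    context = enrich(alert)
    known_bad = context["intel"]["verdict"] == "malicious"
    critical_asset = context["asset"]["critical"]
    if known_bad and critical_asset:
        return "escalate"
    if known_bad or critical_asset or alert["rule_fidelity"] == "high":
        return "investigate"
    return "review as likely false positive"


alerts = [
    {"host": "dc-01", "remote_ip": "203.0.113.45", "rule_fidelity": "high"},
    {"host": "ws-022", "remote_ip": "203.0.113.45", "rule_fidelity": "low"},
    {"host": "ws-022", "remote_ip": "198.51.100.7", "rule_fidelity": "low"},
]
decisions = [triage(a) for a in alerts]
for alert, decision in zip(alerts, decisions):
    print(f"{alert['host']} -> {alert['remote_ip']}: {decision}")
```

Even this small amount of automation answers the system-context and threat-intelligence questions before an analyst opens the alert.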
Incident Classification and Prioritization
Once an incident is validated, it must be classified and prioritized to allocate appropriate resources.
Classification Schemes
By Attack Vector:
- Phishing / Social Engineering
- Malware Infection
- Web Application Attack
- Denial of Service
- Unauthorized Access
- Data Breach
- Insider Threat
- Supply Chain Compromise
By Affected Assets:
- Endpoint (workstation, mobile device)
- Server (application, database, domain controller)
- Network infrastructure
- Cloud infrastructure
- IoT / Operational Technology
Severity Rating
Organizations typically use four or five severity levels. The table below shows a four-level scheme:
| Severity | Criteria | Response Time | Examples |
|---|---|---|---|
| Critical | Imminent threat to critical systems; data exfiltration in progress | < 15 minutes | Ransomware encrypting servers; active data breach; compromised domain controller |
| High | Confirmed compromise of important systems; potential for significant damage | < 1 hour | Malware on multiple endpoints; credential theft from privileged account |
| Medium | Confirmed incident with limited scope; potential for escalation | < 4 hours | Single workstation infection; successful phishing (no payload execution) |
| Low | Isolated event with minimal impact | < 24 hours | Failed login attempts; external port scanning |
Dynamic Severity
Incident severity can change as investigation reveals additional information. Continuously reassess and escalate as needed.
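The response-time targets in the table translate directly into deadlines. The sketch below encodes them and adds a small reassessment helper; the function names are illustrative, and the helper only ever raises severity, which is a simplifying assumption (in practice severity may also be lowered after review).

```python
from datetime import datetime, timedelta, timezone

# Ordered from least to most severe, matching the table above.
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]

RESPONSE_TARGET = {
    "Critical": timedelta(minutes=15),
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
    "Low": timedelta(hours=24),
}


def response_deadline(severity, detected_at):
    """Latest time by which response should begin for this severity."""
    return detected_at + RESPONSE_TARGET[severity]


def reassess(current, observed):
    """Return the higher of two severities (this sketch never downgrades)."""
    return max(current, observed, key=SEVERITY_ORDER.index)


detected = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
severity = "Medium"
print(severity, response_deadline(severity, detected).isoformat())

# Investigation finds the infected workstation holds domain admin credentials.
severity = reassess(severity, "Critical")
print(severity, response_deadline(severity, detected).isoformat())
```

Note that escalating from Medium to Critical moves the deadline earlier, measured from the original detection time.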
Prioritization Factors
Beyond severity, consider:
Business Impact:
- Systems affected and their criticality to operations
- Potential financial losses
- Regulatory or legal implications
- Reputational damage
Attack Sophistication:
- Commodity malware versus advanced persistent threat
- Automated attack versus targeted human adversary
- Known versus zero-day vulnerabilities
Scope and Spread:
- Number of affected systems
- Potential for lateral movement
- Access to sensitive data or critical infrastructure
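One way to combine these factors when several incidents compete for the same responders is a weighted score. The sketch below is purely illustrative: the 1 to 5 rating scales, the weights, and the sample incidents are assumptions, and any real scheme should be agreed with business stakeholders.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    business_impact: int  # 1 (minor) to 5 (operations halted)
    sophistication: int   # 1 (commodity, automated) to 5 (targeted, zero-day)
    scope: int            # 1 (single system) to 5 (widespread, sensitive data)


# Assumed weights; they sum to 1 so the score stays on the 1-5 scale.
WEIGHTS = {"business_impact": 0.5, "sophistication": 0.2, "scope": 0.3}


def priority_score(incident):
    """Weighted average of the three factor ratings."""
    total = sum(weight * getattr(incident, factor) for factor, weight in WEIGHTS.items())
    return round(total, 2)


queue = [
    Incident("Adware on one workstation", 1, 1, 1),
    Incident("Ransomware on file servers", 5, 3, 4),
    Incident("Targeted intrusion on dev server", 3, 5, 2),
]

ranked = sorted(queue, key=priority_score, reverse=True)
for incident in ranked:
    print(f"{priority_score(incident):>4}  {incident.name}")
```

A score like this is a tie-breaker within a severity level, not a replacement for analyst judgment.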
Forensic Evidence Preservation
From the moment an incident is detected, all response actions must preserve digital evidence.
Chain of Custody
Chain of custody is the documented, unbroken chronological record showing who collected, handled, transferred, analyzed, or stored evidence.
Required Documentation:
- What evidence was collected (description, hash values)
- Who collected it (name, role)
- When it was collected (date, time, timezone)
- Where it was collected from (system identifier, physical location)
- How it was collected (tools, methodology)
- Where it was stored (evidence locker, storage media)
Legal Admissibility
Evidence with a broken chain of custody may be inadmissible in legal proceedings. Treat every incident as potentially leading to litigation.
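The required documentation maps naturally onto a structured record. The following is a minimal sketch with hypothetical class and field names; it is not a legal standard, and real programs typically use dedicated evidence-management tooling or signed paper forms.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEntry:
    timestamp: datetime   # timezone-aware, so the timezone is always recorded
    handler: str          # name and role of the person acting
    action: str           # collected, transferred, analyzed, stored
    location: str         # where the evidence is after this action


@dataclass
class EvidenceItem:
    description: str
    sha256: str
    source_system: str
    collection_method: str
    log: list = field(default_factory=list)

    def record(self, handler, action, location, when):
        """Append a custody entry, refusing entries that break chronology."""
        if self.log and when < self.log[-1].timestamp:
            raise ValueError("custody entries must be chronological")
        self.log.append(CustodyEntry(when, handler, action, location))


# Hypothetical memory image contents, hashed at collection time.
image_bytes = b"example memory image contents"
item = EvidenceItem(
    description="Memory image of ws-022",
    sha256=hashlib.sha256(image_bytes).hexdigest(),
    source_system="ws-022 (Building A, desk 14)",
    collection_method="Live memory acquisition to external drive",
)
item.record("A. Analyst (IR lead)", "collected", "ws-022",
            datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc))
item.record("B. Custodian (evidence clerk)", "stored", "Evidence locker 3",
            datetime(2024, 5, 1, 14, 0, tzinfo=timezone.utc))

for entry in item.log:
    print(entry.timestamp.isoformat(), entry.handler, entry.action, entry.location)
```

Recording the hash at collection time lets anyone later confirm that the item being analyzed is the item that was collected.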
Evidence Collection Best Practices
Order of Volatility: Collect evidence from most volatile (will be lost quickly) to least volatile:
- CPU registers and cache
- Memory (RAM)
- Network connections and process tables
- Temporary file systems
- Disk storage
- Remote logging and monitoring data
- Archival media and backups
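The order of volatility can drive a collection plan directly. This sketch ranks a set of hypothetical collection tasks using the list above; the task names are made up for illustration.

```python
# Most volatile first, following the list above.
VOLATILITY_ORDER = [
    "cpu registers and cache",
    "memory",
    "network connections and process tables",
    "temporary file systems",
    "disk storage",
    "remote logging and monitoring data",
    "archival media and backups",
]
RANK = {source: position for position, source in enumerate(VOLATILITY_ORDER)}

# Hypothetical tasks in the order someone happened to write them down.
tasks = [
    ("Image system disk", "disk storage"),
    ("Dump RAM", "memory"),
    ("Pull last month's backups", "archival media and backups"),
    ("List open sockets and running processes", "network connections and process tables"),
]

# Collect the evidence that disappears fastest before anything else.
plan = sorted(tasks, key=lambda task: RANK[task[1]])
for step, (task, source) in enumerate(plan, start=1):
    print(f"{step}. {task} ({source})")
```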
Collection Techniques:
Memory Acquisition:
- Use tools like FTK Imager, WinPmem, or LiME
- Capture before shutting down systems
- Critical for detecting fileless malware
Disk Imaging:
- Create forensic copies using write-blocking hardware
- Generate and document cryptographic hashes
- Work from copies, never original evidence
Log Preservation:
- Immediately export relevant logs to secure storage
- Prevent log rotation from destroying evidence
- Capture logs from all systems in scope
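Generating and checking cryptographic hashes for an image and its working copy can be sketched as below. The helper names are hypothetical, and the demo uses small temporary files in place of real disk images; the file is read in chunks so that large images never need to fit in memory.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original_path, copy_path):
    """True when the working copy is bit-for-bit identical to the original."""
    return sha256_file(original_path) == sha256_file(copy_path)


with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "original.img")
    working_copy = os.path.join(workdir, "working_copy.img")

    # Stand-in for an acquired image.
    with open(original, "wb") as handle:
        handle.write(os.urandom(4096))
    shutil.copyfile(original, working_copy)
    intact = verify_copy(original, working_copy)

    # Simulate accidental modification of the working copy.
    with open(working_copy, "ab") as handle:
        handle.write(b"\x00")
    altered = verify_copy(original, working_copy)

print(f"copy verified before modification: {intact}")
print(f"copy verified after modification:  {altered}")
```

Recording the digest alongside the custody documentation makes later verification straightforward.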
Detection Challenges and Best Practices
Common Challenges
Alert Fatigue:
- High volumes of false positives exhaust analysts
- Important alerts missed in the noise
Mitigation:
- Continuously tune detection rules to reduce false positives
- Implement tiered alerting (automated response for low-severity alerts)
- Use SOAR platforms for automated triage and enrichment
Sophisticated Evasion:
- Adversaries disable logging and monitoring
- Living-off-the-land techniques using legitimate tools
- Encrypted command and control channels
Mitigation:
- Monitor for security tool tampering
- Focus on behavioral detection, not just signatures
- Implement defense in depth with multiple detection layers
Insider Threats:
- Authorized users with legitimate access
- Difficult to distinguish malicious from normal activity
Mitigation:
- User and entity behavior analytics (UEBA)
- Principle of least privilege
- Separation of duties for sensitive operations
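The core idea behind UEBA is comparing current activity with a per-user baseline. The sketch below uses a plain z-score over daily outbound data volume; the figures, the threshold of 3, and the function name are assumptions for illustration, and commercial UEBA products use considerably richer models.

```python
from statistics import mean, stdev


def zscore(history, observed):
    """Standard deviations between the observed value and the historical mean."""
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        return 0.0 if observed == average else float("inf")
    return (observed - average) / spread


# Hypothetical daily outbound volume (MB) for one user over the past week.
baseline_mb = [120, 95, 130, 110, 105, 125, 115]
THRESHOLD = 3.0  # assumed alerting threshold

for today_mb in (118, 2400):
    score = zscore(baseline_mb, today_mb)
    status = "ANOMALOUS" if abs(score) > THRESHOLD else "normal"
    print(f"{today_mb:>5} MB  z={score:8.2f}  {status}")
```

Because the user's access is legitimate, the signal here is the deviation from their own normal behavior rather than any rule violation.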
Quality Over Quantity
A small number of well-tuned, high-fidelity detection rules is more valuable than hundreds of noisy signatures that generate alert fatigue.
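Rule fidelity can be measured from triage outcomes. This sketch computes the share of true positives per rule from a hypothetical list of closed alerts and flags low performers for tuning; the rule names, the counts, and the 20% cut-off are assumptions.

```python
from collections import Counter

# Hypothetical closed alerts: (rule name, analyst disposition).
closed_alerts = (
    [("LSASS process memory read", "true_positive")] * 4
    + [("LSASS process memory read", "false_positive")] * 1
    + [("Any PowerShell execution", "true_positive")] * 1
    + [("Any PowerShell execution", "false_positive")] * 19
)

MIN_PRECISION = 0.2  # assumed cut-off below which a rule is queued for tuning


def rule_precision(alerts):
    """Fraction of each rule's alerts that analysts confirmed as true positives."""
    totals = Counter()
    confirmed = Counter()
    for rule, disposition in alerts:
        totals[rule] += 1
        if disposition == "true_positive":
            confirmed[rule] += 1
    return {rule: confirmed[rule] / totals[rule] for rule in totals}


precision = rule_precision(closed_alerts)
needs_tuning = [rule for rule, value in precision.items() if value < MIN_PRECISION]

for rule, value in precision.items():
    print(f"{value:5.0%}  {rule}")
print("Queue for tuning:", needs_tuning)
```

Tracking this figure over time shows whether tuning is actually reducing the noise analysts face.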
Conclusion
Detection and analysis transforms raw security events into actionable incident intelligence. Success requires the right combination of technology (SIEM, EDR, threat intelligence), process (systematic triage and classification), and people (skilled analysts with training and tools).
Organizations that master detection and analysis gain the ability to identify adversaries early in the attack lifecycle—before significant damage occurs. This capability, combined with preparation activities, positions security teams to execute effective containment operations.
The next chapter explores containment strategies—the critical decisions and actions required to limit incident impact and prevent further compromise.
Key Takeaways
- Focus detection on tactics, techniques, and procedures (TTPs), not just atomic indicators
- Leverage multiple detection sources for comprehensive visibility
- Implement systematic triage to efficiently validate incidents and reduce false positives
- Classify and prioritize incidents based on severity, business impact, and scope
- Preserve forensic evidence with proper chain of custody from the start
- Continuously tune detections and address analyst alert fatigue
- Use frameworks like MITRE ATT&CK for common vocabulary and gap analysis