1 Incident Classification

Incidents are classified using a priority matrix based on impact and urgency. This classification drives response SLAs, update cadences, and escalation requirements. All SOC analysts must apply this matrix consistently at the point of incident declaration.

Priority | Impact | Urgency | Response SLA | Update Frequency | Examples
P1 - Critical | Organization-wide | Immediate | 15 min | Every 30 min | Active ransomware, confirmed data breach, nation-state compromise
P2 - High | Multiple departments | High | 1 hour | Every 2 hours | Lateral movement detected, C2 confirmed, privilege escalation
P3 - Medium | Single department | Medium | 4 hours | Every 8 hours | Malware on single endpoint, phishing with credential harvest, policy violation
P4 - Low | Single user | Low | 24 hours | Daily | Adware, PUP detection, low-confidence alerts, informational findings
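
As a rough illustration, the response SLA and update cadence from the priority matrix above could be encoded as a lookup table so that deadlines are computed rather than remembered. This is a minimal sketch: the names `PriorityPolicy`, `PRIORITY_MATRIX`, `response_deadline`, and `next_update_due` are hypothetical, and "Daily" is assumed to mean every 24 hours.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PriorityPolicy:
    response_sla: timedelta   # time allowed for the initial response
    update_every: timedelta   # interval between stakeholder updates


# Values copied from the priority matrix; "Daily" is assumed to be 24 hours.
PRIORITY_MATRIX = {
    "P1": PriorityPolicy(timedelta(minutes=15), timedelta(minutes=30)),
    "P2": PriorityPolicy(timedelta(hours=1), timedelta(hours=2)),
    "P3": PriorityPolicy(timedelta(hours=4), timedelta(hours=8)),
    "P4": PriorityPolicy(timedelta(hours=24), timedelta(hours=24)),
}


def response_deadline(priority: str, declared_at: datetime) -> datetime:
    """Latest time by which the initial response is due."""
    return declared_at + PRIORITY_MATRIX[priority].response_sla


def next_update_due(priority: str, last_update_at: datetime) -> datetime:
    """Time at which the next stakeholder update is due."""
    return last_update_at + PRIORITY_MATRIX[priority].update_every


declared = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(response_deadline("P1", declared))  # 15 minutes after declaration
print(next_update_due("P2", declared))    # 2 hours after the last update
```

Keeping the values in one table means a change to the matrix only has to be made in one place.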

Category Taxonomy

Each incident must be assigned a category and sub-category at the point of declaration. This taxonomy supports reporting, trend analysis, and playbook routing.

Category | Sub-Categories | Typical Priority
Malware | Ransomware, Trojan, Worm, Dropper, RAT, Cryptominer | P1-P3
Unauthorized Access | Credential Theft, Privilege Escalation, Account Compromise | P1-P2
Data Breach | Exfiltration, Exposure, Loss, Unauthorized Disclosure | P1-P2
Denial of Service | DDoS, Resource Exhaustion, Service Disruption | P2-P3
Policy Violation | Acceptable Use, Data Handling, Shadow IT | P3-P4
Insider Threat | Malicious Insider, Negligent Insider, Compromised Insider | P1-P3
Supply Chain | Compromised Vendor, Malicious Update, Third-Party Breach | P1-P2

2 Incident Lifecycle

The incident lifecycle follows a structured nine-phase model aligned with NIST SP 800-61 and ITIL incident management best practices. Each phase has defined entry criteria, activities, and exit criteria.

1. Detection
Alert or event triggers investigation. Sources include MDE alerts, SIEM correlation rules, threat hunting findings, user reports, and external notifications.
2. Triage
Initial assessment and classification. Analyst validates the alert, determines true/false positive, assigns initial priority, and decides whether to declare an incident.
3. Declaration
Formal incident declaration. Incident ticket created, priority assigned, incident commander designated, and initial stakeholder notifications sent.
4. Investigation
Scoping and evidence collection. Determine the full extent of compromise, identify all affected assets, collect forensic evidence, and establish the attack timeline.
5. Containment
Stop the spread and limit damage. Isolate affected systems, block malicious indicators, disable compromised accounts, and implement network segmentation as needed.
6. Eradication
Remove threat from environment. Eliminate malware, close attack vectors, patch exploited vulnerabilities, and verify complete removal of adversary presence.
7. Recovery
Restore normal operations. Rebuild or restore affected systems, re-enable accounts with new credentials, validate system integrity, and monitor for recurrence.
8. Closure
Administrative closure. Complete all documentation, finalize the incident ticket, confirm all stakeholders are notified of resolution, and archive evidence.
9. Post-Incident Review
Lessons learned. Conduct a structured review meeting, identify improvements, update playbooks and detection rules, and track remediation actions to completion.

Phase Transition Criteria Reference

From → To | Trigger Criteria
Detection → Triage | Alert assigned to analyst
Triage → Declaration | Confirmed true positive, meets incident threshold
Declaration → Investigation | Incident commander assigned, war room established
Investigation → Containment | Threat scope identified, containment plan approved
Containment → Eradication | Spread halted, all affected assets identified
Eradication → Recovery | All malicious artifacts removed, root cause addressed
Recovery → Closure | All systems restored, no recurrence in 24-48h monitoring
Closure → PIR | Incident closed, PIR scheduled within 5 business days
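
The transition criteria above lend themselves to a small state machine that refuses to advance an incident until every listed criterion is met. The following is a sketch under assumptions: the `Phase` enum, the `TRANSITION_CRITERIA` table, and the `advance` function are hypothetical names, and each criterion is represented as a plain string that an analyst or tool marks as satisfied.

```python
from enum import Enum


class Phase(Enum):
    DETECTION = 1
    TRIAGE = 2
    DECLARATION = 3
    INVESTIGATION = 4
    CONTAINMENT = 5
    ERADICATION = 6
    RECOVERY = 7
    CLOSURE = 8
    POST_INCIDENT_REVIEW = 9


# Criteria taken from the phase transition table, one string per criterion.
TRANSITION_CRITERIA = {
    (Phase.DETECTION, Phase.TRIAGE): {"alert assigned to analyst"},
    (Phase.TRIAGE, Phase.DECLARATION): {
        "confirmed true positive", "meets incident threshold"},
    (Phase.DECLARATION, Phase.INVESTIGATION): {
        "incident commander assigned", "war room established"},
    (Phase.INVESTIGATION, Phase.CONTAINMENT): {
        "threat scope identified", "containment plan approved"},
    (Phase.CONTAINMENT, Phase.ERADICATION): {
        "spread halted", "all affected assets identified"},
    (Phase.ERADICATION, Phase.RECOVERY): {
        "all malicious artifacts removed", "root cause addressed"},
    (Phase.RECOVERY, Phase.CLOSURE): {
        "all systems restored", "no recurrence in 24-48h monitoring"},
    (Phase.CLOSURE, Phase.POST_INCIDENT_REVIEW): {
        "incident closed", "PIR scheduled within 5 business days"},
}


def advance(current: Phase, target: Phase, satisfied: set) -> Phase:
    """Return the target phase if the transition is defined and its criteria are met."""
    required = TRANSITION_CRITERIA.get((current, target))
    if required is None:
        raise ValueError(f"No defined transition from {current.name} to {target.name}")
    missing = required - set(satisfied)
    if missing:
        raise ValueError(f"Unmet criteria: {sorted(missing)}")
    return target


phase = advance(Phase.DETECTION, Phase.TRIAGE, {"alert assigned to analyst"})
print(phase.name)
```

Skipping a phase, or moving on with a criterion outstanding, raises an error instead of silently succeeding.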

3 Communication Templates

Standardized communication templates ensure consistent, timely, and professional notifications throughout the incident lifecycle. Use these templates as starting points and customize based on the specific incident context.

Initial Incident Notification

Initial Incident Notification - Priority Template
INCIDENT ID: [INC-YYYY-NNNN]
DATE/TIME DETECTED: [UTC timestamp]
CLASSIFICATION: [P1/P2/P3/P4]
CATEGORY: [Malware/Unauthorized Access/etc.]
AFFECTED SYSTEMS: [List of device names/count]
AFFECTED USERS: [List of usernames/count]
INITIAL ASSESSMENT: [Brief description of what was observed]
IMMEDIATE ACTIONS TAKEN: [Steps already performed]
ASSIGNED TO: [Lead analyst name]
NEXT UPDATE: [Expected time of next update]

Management Brief (CISO/CTO)

Management Brief (CISO/CTO) - Executive Template
INCIDENT: [ID] - [One-line summary]
PRIORITY: [P-level and justification]
BUSINESS IMPACT: [Plain language impact statement]
CURRENT STATUS: [Phase: Investigation/Containment/etc.]
TIMELINE: [Key timestamps]
ACTIONS COMPLETED: [Bulleted list]
ACTIONS PENDING: [Bulleted list]
RESOURCE NEEDS: [Additional support required]
ESTIMATED RESOLUTION: [Time estimate or "TBD"]
NEXT UPDATE: [Scheduled time]

Stakeholder Update

Stakeholder Update - Status Update
INCIDENT: [ID] - Update #[N]
TIME: [Current UTC timestamp]
STATUS: [Current phase]
SINCE LAST UPDATE: [Actions completed]
CURRENT ACTIONS: [What's being done now]
BLOCKERS: [Any issues preventing progress]
NEXT STEPS: [Planned actions]
NEXT UPDATE: [Scheduled time]

External/Regulatory Notification

External/Regulatory Notification - Legal/Compliance
ORGANIZATION: [Company name]
INCIDENT REFERENCE: [ID]
DATE OF DISCOVERY: [Date]
NATURE OF INCIDENT: [Type and scope]
DATA AFFECTED: [Types of data potentially impacted]
INDIVIDUALS AFFECTED: [Count or estimate]
CONTAINMENT STATUS: [Current status]
REMEDIATION STEPS: [Actions taken and planned]
CONTACT: [DPO or legal contact information]

Incident Closure Notice

Incident Closure Notice - Closure
INCIDENT: [ID] - RESOLVED
TOTAL DURATION: [Time from detection to closure]
ROOT CAUSE: [Brief root cause summary]
IMPACT SUMMARY: [Final scope: devices, users, data]
RESOLUTION: [How the incident was resolved]
PREVENTIVE MEASURES: [Controls implemented to prevent recurrence]
PIR SCHEDULED: [Date and time for post-incident review]
LESSONS LEARNED: [Key preliminary takeaways]

4 Escalation Matrix

The escalation matrix defines when, to whom, and how incidents must be escalated. Timely and accurate escalation is critical for minimizing business impact and ensuring appropriate leadership visibility.

Condition | Escalate To | Method | Timeframe
P1 incident declared | SOC Manager + Incident Commander | Phone + Teams | Immediate
P1 > 30 min unresolved | CISO | Phone | Within 30 min
P1 > 2 hours unresolved | CTO + Legal | Phone + Email | Within 2 hours
Data breach suspected | DPO + Legal | Phone + Email | Within 1 hour
Nation-state indicators | CISO + External IR | Phone | Within 15 min
P2 incident declared | SOC Manager | Teams + Ticket | Within 15 min
P2 > 4 hours unresolved | SOC Manager for re-assessment | Phone | Within 4 hours
Regulatory notification required | Legal + Compliance | Email + Phone | Within 1 hour
P3 incident declared | Shift Lead | Ticket | Within 1 hour
P3 > 24 hours unresolved | SOC Manager | Ticket | Within 24 hours
P4 incident declared | Assignment queue | Ticket | Next business day

Escalation Tip

When in doubt, escalate. It is always better to escalate unnecessarily than to miss a critical escalation window. Document all escalation attempts and outcomes.
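
The time-based rows of the escalation matrix can be expressed as data and evaluated against how long an incident has been unresolved. The sketch below is illustrative only: `TIME_BASED_ESCALATIONS` and `escalations_due` are hypothetical names, and the condition-based rows (suspected data breach, nation-state indicators, regulatory notification) are deliberately left out because they depend on analyst judgment rather than elapsed time.

```python
from datetime import timedelta

# (priority, unresolved-for threshold, escalate to, method)
# A threshold of None means "escalate as soon as the incident is declared".
TIME_BASED_ESCALATIONS = [
    ("P1", None, "SOC Manager + Incident Commander", "Phone + Teams"),
    ("P1", timedelta(minutes=30), "CISO", "Phone"),
    ("P1", timedelta(hours=2), "CTO + Legal", "Phone + Email"),
    ("P2", None, "SOC Manager", "Teams + Ticket"),
    ("P2", timedelta(hours=4), "SOC Manager for re-assessment", "Phone"),
    ("P3", None, "Shift Lead", "Ticket"),
    ("P3", timedelta(hours=24), "SOC Manager", "Ticket"),
    ("P4", None, "Assignment queue", "Ticket"),
]


def escalations_due(priority: str, unresolved_for: timedelta) -> list:
    """List (escalate_to, method) pairs triggered for an open incident."""
    due = []
    for rule_priority, threshold, escalate_to, method in TIME_BASED_ESCALATIONS:
        if rule_priority != priority:
            continue
        # The matrix uses "> 30 min", so the comparison is strict.
        if threshold is None or unresolved_for > threshold:
            due.append((escalate_to, method))
    return due


for target, method in escalations_due("P1", timedelta(minutes=45)):
    print(f"Escalate to {target} via {method}")
```

A check like this could run on a schedule and flag any escalation that has come due but is not yet documented on the ticket.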

5 RACI Matrix

The RACI matrix defines roles and responsibilities across the incident management lifecycle. Every activity must have exactly one Accountable party, with Responsible, Consulted, and Informed roles clearly assigned.

Activity SOC Analyst SOC Manager Incident Commander CISO IT Operations Legal HR Communications
Alert Triage R I
Incident Declaration C R/A I I
Containment Actions R A C I C
Evidence Preservation R A C I
Investigation Lead R C A I C
Executive Communication C R A I C
External/Legal Notification I C A R C
Recovery Actions C A C I R
Post-Incident Review R A C I C
Lessons Learned Implementation C R A I C

RACI Legend

R = Responsible (does the work) | A = Accountable (owns the outcome) | C = Consulted (provides input) | I = Informed (kept in the loop)
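
The "exactly one Accountable party" rule can be checked mechanically once the matrix is held as structured data. The sketch below assumes a hypothetical representation (activity name mapped to a role-to-letters dictionary); the sample assignments are invented purely for illustration and are not the assignments from the matrix above.

```python
def accountable_count(assignments: dict) -> int:
    """Count roles holding A; a combined cell such as 'R/A' counts as A."""
    return sum("A" in cell.split("/") for cell in assignments.values())


def activities_violating_rule(raci: dict) -> list:
    """Return activities that do not have exactly one Accountable role."""
    return [
        activity
        for activity, assignments in raci.items()
        if accountable_count(assignments) != 1
    ]


# Hypothetical sample data for illustration only.
example_raci = {
    "Example activity A": {"SOC Analyst": "R", "SOC Manager": "A", "CISO": "I"},
    "Example activity B": {"SOC Analyst": "R", "SOC Manager": "I"},
    "Example activity C": {"SOC Analyst": "C", "SOC Manager": "R/A"},
}

print(activities_violating_rule(example_raci))
```

Running such a check whenever the matrix is edited helps catch rows that end up with no Accountable role or more than one.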

6 Post-Incident Review

Post-Incident Reviews (PIRs) are mandatory for all P1 and P2 incidents and recommended for P3 incidents with noteworthy findings. The PIR process is blameless and focused on systemic improvement.

PIR Meeting Agenda

  1. Review incident timeline (present facts, not blame)
  2. Identify what went well (effective detections, fast response, good communication)
  3. Identify what needs improvement (gaps, delays, tool issues)
  4. Perform root cause analysis using the 5-Whys method
  5. Define action items with owners and deadlines
  6. Discuss detection improvements and new KQL rules needed
  7. Review and update playbooks based on findings
  8. Schedule follow-up meeting to track action items

5-Whys Template

5-Whys Root Cause Analysis - RCA Template
INCIDENT: [ID]
PROBLEM STATEMENT: [What happened]
WHY 1: [First level cause]
WHY 2: [Why did WHY 1 happen?]
WHY 3: [Why did WHY 2 happen?]
WHY 4: [Why did WHY 3 happen?]
WHY 5: [Root cause]
ROOT CAUSE CATEGORY: [People / Process / Technology]
CORRECTIVE ACTION: [What will prevent recurrence]
ACTION OWNER: [Name]
TARGET DATE: [Date]

Metrics

Track the following metrics for every incident to identify trends, measure team performance, and drive continuous improvement.

Metric | Description | Target | How to Calculate
MTTD (Mean Time to Detect) | Time from initial compromise to first detection | < 1 hour for P1 | Alert timestamp − estimated compromise time
MTTC (Mean Time to Contain) | Time from detection to successful containment | < 4 hours for P1 | Containment timestamp − detection timestamp
MTTR (Mean Time to Resolve) | Time from detection to full resolution | < 24 hours for P1 | Closure timestamp − detection timestamp
Blast Radius | Number of affected devices and users | Minimize | Count of unique affected entities
Data Impact | Volume/type of data potentially compromised | None | Assessment of accessed/exfiltrated data
Escalation Accuracy | Were escalations timely and appropriate | 100% | Audit of escalation decisions vs policy
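
The three time-based metrics are means of per-incident timestamp differences, following the "How to Calculate" column. A minimal sketch of that arithmetic is shown here; the `IncidentTimes` record and the function names are hypothetical, and the sample timestamps are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class IncidentTimes:
    compromised_at: datetime  # estimated time of initial compromise
    detected_at: datetime     # alert timestamp
    contained_at: datetime
    closed_at: datetime


def _mean_delta(deltas) -> timedelta:
    """Average a sequence of timedeltas."""
    return timedelta(seconds=mean(d.total_seconds() for d in deltas))


def mttd(incidents) -> timedelta:
    return _mean_delta(i.detected_at - i.compromised_at for i in incidents)


def mttc(incidents) -> timedelta:
    return _mean_delta(i.contained_at - i.detected_at for i in incidents)


def mttr(incidents) -> timedelta:
    return _mean_delta(i.closed_at - i.detected_at for i in incidents)


# Invented sample data: two incidents on the same day.
day = datetime(2024, 1, 1)
sample = [
    IncidentTimes(day, day + timedelta(minutes=30),
                  day + timedelta(hours=2, minutes=30),
                  day + timedelta(hours=10, minutes=30)),
    IncidentTimes(day, day + timedelta(minutes=90),
                  day + timedelta(hours=5, minutes=30),
                  day + timedelta(hours=21, minutes=30)),
]

print(mttd(sample), mttc(sample), mttr(sample))
```

Because the compromise time is an estimate, MTTD carries more uncertainty than MTTC or MTTR, which are computed from recorded timestamps.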

7 SLA Tracking

Service Level Agreements define the expected timeframes for each phase of incident handling. SLA compliance is tracked per incident and reported monthly to SOC leadership.

Priority | Initial Response | Containment | Resolution | Update Cadence
P1 - Critical | 15 minutes | 4 hours | 24 hours | Every 30 minutes
P2 - High | 1 hour | 8 hours | 48 hours | Every 2 hours
P3 - Medium | 4 hours | 24 hours | 5 business days | Every 8 hours
P4 - Low | 24 hours | 72 hours | 10 business days | Daily

SLA Clock Rules

The SLA clock starts when the incident is declared. The clock pauses while: (1) waiting on a documented third-party vendor response, (2) an approved management hold is in effect, or (3) awaiting user availability for evidence collection. The clock does NOT pause for shift changes, weekends or holidays (P1/P2), or internal resource constraints.
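
The clock rules amount to subtracting documented pause intervals from the wall-clock time since declaration. The following sketch shows that calculation under stated assumptions: `elapsed_sla_time` is a hypothetical helper, pause intervals are assumed not to overlap one another, and deciding whether a pause reason is permitted is left to the caller.

```python
from datetime import datetime, timedelta


def elapsed_sla_time(declared_at: datetime, now: datetime, pauses: list) -> timedelta:
    """Time counted against the SLA: wall-clock time minus approved pauses.

    `pauses` is a list of (start, end) tuples for documented pause intervals,
    assumed not to overlap one another.
    """
    paused = timedelta(0)
    for start, end in pauses:
        # Only count the part of the pause that lies between declaration and now.
        start = max(start, declared_at)
        end = min(end, now)
        if end > start:
            paused += end - start
    return (now - declared_at) - paused


declared = datetime(2024, 1, 1, 9, 0)
now = datetime(2024, 1, 1, 15, 0)
vendor_wait = (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 12, 0))

elapsed = elapsed_sla_time(declared, now, [vendor_wait])
print(elapsed)                        # time counted against the SLA
print(elapsed <= timedelta(hours=4))  # within the P1 containment SLA?
```

Shift changes and internal resource constraints never appear in `pauses`, so they continue to count against the SLA as the rules require.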

8 Ticketing Integration

All incidents must be tracked in the ticketing system with proper field mapping from Microsoft Defender for Endpoint. Consistent ticket creation enables accurate reporting, SLA tracking, and audit compliance.

Required Ticket Fields

Field | MDE Mapping | Notes
Incident ID | AlertId / IncidentId | Auto-generated or mapped
Title | Alert Title | May be customized during triage
Priority | Severity (High/Medium/Low/Info) | Map to P1-P4
Category | Category field | Use org taxonomy
Affected Device(s) | DeviceName | From AlertEvidence
Affected User(s) | AccountName | From AlertEvidence
Description | Alert description + analyst notes | Combine MDE context with analysis
Status | Investigation status | Sync with MDE investigation status
Assigned To | Analyst name | Based on shift/skill matrix
MDE Incident URL | Portal deep link | For quick reference
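
The field mapping above could be applied by a small function that builds a ticket record from alert data. This is a sketch, not the MDE API: the input is a hypothetical dictionary whose keys are chosen to echo the mapping column, and the one-to-one severity-to-priority assignment is an assumption, since the table only says "Map to P1-P4".

```python
# Assumed mapping; the source table does not specify which severity maps to
# which priority, so replace this with the organization's own policy.
SEVERITY_TO_PRIORITY = {"High": "P1", "Medium": "P2", "Low": "P3", "Info": "P4"}


def build_ticket(alert: dict, analyst_notes: str, assignee: str, portal_url: str) -> dict:
    """Build a ticket record containing the ten required fields."""
    return {
        "Incident ID": alert["AlertId"],
        "Title": alert["Title"],
        "Priority": SEVERITY_TO_PRIORITY[alert["Severity"]],
        "Category": alert["Category"],
        "Affected Device(s)": sorted(set(alert["DeviceNames"])),
        "Affected User(s)": sorted(set(alert["AccountNames"])),
        "Description": f"{alert['Description']}\n\nAnalyst notes: {analyst_notes}",
        "Status": alert["InvestigationStatus"],
        "Assigned To": assignee,
        "MDE Incident URL": portal_url,
    }


# Hypothetical alert data for illustration only.
sample_alert = {
    "AlertId": "example-alert-001",
    "Title": "Example alert title",
    "Severity": "Medium",
    "Category": "Malware",
    "DeviceNames": ["host-b", "host-a", "host-a"],
    "AccountNames": ["user1"],
    "Description": "Example alert description.",
    "InvestigationStatus": "Running",
}

ticket = build_ticket(sample_alert, "Validated as true positive.", "Analyst name",
                      "https://example.invalid/incident")
print(ticket["Priority"], ticket["Affected Device(s)"])
```

Deduplicating device and user names before writing them to the ticket keeps the blast-radius counts consistent with the "unique affected entities" metric.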

Ticket Workflow
