Alert Triage — CyberSOC Playbook

1 Alert Severity Classification

Microsoft Defender for Endpoint (MDE) assigns a severity level to each alert based on the potential impact and confidence of the detection. The following table defines the SOC triage expectations for each severity tier.

Severity	Description	Triage SLA	Action
High	Alerts associated with APTs, credential theft, or ransomware	15 min	Immediate investigation
Medium	EDR-detected suspicious behavior indicating possible compromise	1 hour	Prioritized investigation
Low	Prevalent malware or hack tools that do not indicate an APT	4 hours	Scheduled investigation
Informational	Not considered harmful, but can raise awareness of potential security issues	24 hours	Review and classify

SLA Enforcement

Triage SLAs are measured from the time the alert appears in the MDE queue to the time an analyst begins active investigation. Automated investigation results from MDE should be reviewed as part of the initial triage, not as a substitute for it.
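The SLA measurement described above could be checked in a reporting script along the following lines. This is a minimal sketch, not an MDE feature: the names `TRIAGE_SLA` and `sla_met` are hypothetical, and it assumes the queue-arrival time and the investigation-start time are both available as timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

# Triage SLAs copied from the severity table above.
TRIAGE_SLA = {
    "High": timedelta(minutes=15),
    "Medium": timedelta(hours=1),
    "Low": timedelta(hours=4),
    "Informational": timedelta(hours=24),
}


def sla_met(severity: str, queued_at: datetime, investigation_started_at: datetime) -> bool:
    """Return True if active investigation began within the SLA for this severity."""
    elapsed = investigation_started_at - queued_at
    if elapsed < timedelta(0):
        raise ValueError("investigation cannot start before the alert is queued")
    return elapsed <= TRIAGE_SLA[severity]


# Example timestamps (illustrative only).
queued = datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)
print(sla_met("High", queued, queued + timedelta(minutes=12)))  # True
print(sla_met("High", queued, queued + timedelta(minutes=20)))  # False
```

The clock starts at queue arrival rather than at detection time, matching the definition above.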

2 MDE Alert Categories

MDE alert categories map directly to threat behaviors. The following table provides the mapping between MDE alert categories and MITRE ATT&CK tactics along with associated technique identifiers for reference during triage.

MDE Alert Category	MITRE ATT&CK Tactic	Key Techniques
Ransomware	Impact	T1486
Malware	Execution	Multiple
Phishing	Initial Access	T1566
Credential Access	Credential Access	T1003, T1558
Command and Control	Command and Control	T1071, T1095
Lateral Movement	Lateral Movement	T1021, T1570
Persistence	Persistence	T1547, T1053
Defense Evasion	Defense Evasion	T1055, T1036
Exfiltration	Exfiltration	T1041, T1567
Discovery	Discovery	T1087, T1082
Privilege Escalation	Privilege Escalation	T1068, T1134
Execution	Execution	T1059, T1204
Suspicious Activity	Various	Multiple
Unwanted Software	Impact	Various
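If the mapping above is needed in triage tooling, it could be held as a simple lookup. The sketch below is illustrative only: `CATEGORY_TO_ATTACK` and `attack_context` are hypothetical names, and rows listed as "Multiple" or "Various" are deliberately left out because they carry no fixed technique identifiers.

```python
from typing import List, Tuple

# Rows from the category table above that name specific techniques.
CATEGORY_TO_ATTACK = {
    "Ransomware": ("Impact", ["T1486"]),
    "Phishing": ("Initial Access", ["T1566"]),
    "Credential Access": ("Credential Access", ["T1003", "T1558"]),
    "Command and Control": ("Command and Control", ["T1071", "T1095"]),
    "Lateral Movement": ("Lateral Movement", ["T1021", "T1570"]),
    "Persistence": ("Persistence", ["T1547", "T1053"]),
    "Defense Evasion": ("Defense Evasion", ["T1055", "T1036"]),
    "Exfiltration": ("Exfiltration", ["T1041", "T1567"]),
    "Discovery": ("Discovery", ["T1087", "T1082"]),
    "Privilege Escalation": ("Privilege Escalation", ["T1068", "T1134"]),
    "Execution": ("Execution", ["T1059", "T1204"]),
}


def attack_context(category: str) -> Tuple[str, List[str]]:
    """Return (tactic, technique IDs); unmapped categories yield ("Unmapped", [])."""
    return CATEGORY_TO_ATTACK.get(category, ("Unmapped", []))


print(attack_context("Credential Access"))  # ('Credential Access', ['T1003', 'T1558'])
```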

3 Triage Decision Tree

Follow this structured decision tree for every incoming MDE alert. Each step ensures consistent triage outcomes across the analyst team.

ALERT RECEIVED — Begin Triage Workflow
- Step 1: Is the alert a duplicate or known false positive?
  - Yes: Close as Duplicate/Known FP. Document the reason in the alert closure notes.
  - No: Continue to Step 2.
- Step 2: Check alert context in MDE (review the device timeline, alert story, and automated investigation results).
- Step 3: Is the activity from a sanctioned tool/process?
  - Yes: Verify with asset inventory/CMDB.
    - Confirmed sanctioned: Close as Benign True Positive (BTP). Consider adding a suppression rule.
    - Not in inventory: Escalate for review by senior analyst or tool owner.
  - No: Continue to Step 4.
- Step 4: Does the alert correlate with known threat intelligence?
  - Yes: Escalate as True Positive. Create Incident and assign to IR team.
  - No: Continue to Step 5.
- Step 5: Run enrichment queries (see Section 6: Enrichment Workflows).
- Step 6: Based on enrichment results, classify the alert:
  - True Positive (TP): Create incident, assign severity, begin incident response procedures.
  - False Positive (FP): Close alert, document rationale, consider detection tuning.
  - Benign True Positive (BTP): Close alert, document expected behavior, consider suppression rule.
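One reading of the decision tree can be sketched as a small function, which may help when checking that analysts reach the same outcome from the same inputs. Everything here is an assumption for illustration: the `AlertContext` fields, the `triage` function, and the returned strings are hypothetical, and Step 2 (context review) remains a manual step that produces these inputs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertContext:
    duplicate_or_known_fp: bool       # Step 1
    from_sanctioned_tool: bool        # Step 3
    confirmed_in_inventory: bool      # Step 3, result of the asset inventory/CMDB check
    matches_threat_intel: bool        # Step 4
    enrichment_verdict: Optional[str] = None  # Step 6: "TP", "FP" or "BTP" once enrichment is done


CLASSIFICATION_ACTIONS = {
    "TP": "Create incident, assign severity, begin incident response",
    "FP": "Close alert, document rationale, consider detection tuning",
    "BTP": "Close alert, document expected behavior, consider suppression rule",
}


def triage(alert: AlertContext) -> str:
    """Walk Steps 1-6 and return the next action for the analyst."""
    if alert.duplicate_or_known_fp:
        return "Close as Duplicate/Known FP"
    if alert.from_sanctioned_tool:
        if alert.confirmed_in_inventory:
            return "Close as BTP"
        return "Escalate to senior analyst or tool owner"
    if alert.matches_threat_intel:
        return "Escalate as TP and create incident"
    if alert.enrichment_verdict is None:
        return "Run enrichment queries (Step 5)"
    return CLASSIFICATION_ACTIONS[alert.enrichment_verdict]


print(triage(AlertContext(False, True, True, False)))  # Close as BTP
```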

4 Initial Investigation KQL Queries

The following KQL queries support the triage workflow. Run these in Microsoft 365 Defender Advanced Hunting to gather context around an alert entity.

Device Timeline Around Alert

Retrieves all process events within a configurable time window around the alert timestamp on the target device. Provides immediate context for what was executing before, during, and after the alert fired.

let alertTime = datetime(2024-01-15T10:30:00Z);
let deviceName = "WORKSTATION-01";
let timeWindow = 30m;
DeviceProcessEvents
| where Timestamp between ((alertTime - timeWindow) .. (alertTime + timeWindow))
| where DeviceName == deviceName
| project Timestamp, FileName, ProcessCommandLine, InitiatingProcessFileName, AccountName
| sort by Timestamp asc

Expected Output

Returns all process events within ±30 minutes of the alert. Columns include timestamp, file name, command line, parent process, and the user account that launched the process.

When to Use

First step in any alert investigation. Establishes the timeline of activity surrounding the alert event to identify related suspicious processes.

Alert Process Tree Analysis

Joins alert evidence with device process events to reconstruct the full process execution chain associated with the alert. Reveals parent-child relationships and command line arguments.

// Replace "your-alert-id" with the AlertId under investigation.
AlertEvidence
| where AlertId == "your-alert-id"
| where EntityType == "Process"
// The documented AlertEvidence schema has no InitiatingProcess* columns,
// so parent details are taken from DeviceProcessEvents in the join below.
| project DeviceId, FileName, AlertCommandLine = ProcessCommandLine
| join kind=leftouter (
    DeviceProcessEvents
    | project DeviceId, FileName, ProcessCommandLine, SHA256,
             ParentProcess = InitiatingProcessFileName, ParentCommandLine = InitiatingProcessCommandLine
) on DeviceId, FileName
| project-away DeviceId1, FileName1

Expected Output

Process execution chain from alert evidence, including file names, command lines, SHA256 hashes, and parent process details for each process in the chain.

When to Use

Understanding the attack chain. Use this query when you need to trace how a suspicious process was launched and what it spawned.

Lateral Movement Check for Alert Entity

Identifies all devices a suspected compromised account has authenticated to within the last 24 hours. Summarizes logon types and device spread by hourly buckets to surface lateral movement patterns.

let suspiciousAccount = "compromised_user";
let timeRange = 24h;
DeviceLogonEvents
| where Timestamp > ago(timeRange)
| where AccountName == suspiciousAccount
| where LogonType in ("RemoteInteractive", "Network", "Batch")
| summarize LogonCount = count(),
            Devices = make_set(DeviceName),
            LogonTypes = make_set(LogonType)
    by AccountName, bin(Timestamp, 1h)
| sort by Timestamp desc

Expected Output

Shows all devices the account authenticated to, grouped by hour. Includes logon count, distinct device names, and logon types used in each time bucket.

When to Use

When account compromise is suspected. Run this query to determine the blast radius of a potentially compromised identity and identify lateral movement across the environment.

File Hash Prevalence Check

Checks how many devices in the environment have seen a specific file hash. Low prevalence files are more likely to be targeted or malicious, while high prevalence suggests a legitimate application.

let targetHash = "abc123...";
DeviceFileEvents
| where SHA256 == targetHash
| summarize DeviceCount = dcount(DeviceName),
            Devices = make_set(DeviceName),
            FirstSeen = min(Timestamp),
            LastSeen = max(Timestamp)
| extend PrevalenceAssessment = iff(DeviceCount < 3, "Rare - Investigate",
                                iff(DeviceCount < 10, "Uncommon - Review", "Common - Likely Legitimate"))

Expected Output

How many devices have the file, a list of those device names, first and last seen timestamps, and a prevalence assessment label (Rare, Uncommon, or Common).

When to Use

Determining if a file is targeted or widespread. A file present on only 1-2 devices is far more suspicious than one present on hundreds.

Network IOC Fleet-Wide Check

Searches the entire fleet for connections to a suspicious IP address or domain over the last 7 days. Used to scope the breadth of impact when a network-based indicator of compromise is identified.

let suspiciousIP = "203.0.113.50";
let suspiciousDomain = "malicious-domain.com";
DeviceNetworkEvents
| where Timestamp > ago(7d)
| where RemoteIP == suspiciousIP or RemoteUrl has suspiciousDomain
| summarize ConnectionCount = count(),
            Devices = make_set(DeviceName),
            Ports = make_set(RemotePort),
            FirstConnection = min(Timestamp),
            LastConnection = max(Timestamp)
    by RemoteIP, RemoteUrl
| sort by ConnectionCount desc

Expected Output

All devices connecting to the suspicious IP or domain, including connection count, distinct devices, ports used, and the time window of communication.

When to Use

Scoping impact of network-based IOCs. Run this when a C2 domain or malicious IP is identified to determine how many endpoints have been communicating with the threat infrastructure.

Parent Process Chain Analysis

Traces the full parent-child-grandparent process chain for a suspicious process on a specific device. Reveals the execution origin, which is critical for determining whether the process was launched via a legitimate mechanism or through an exploit chain.

let targetDevice = "WORKSTATION-01";
let targetProcess = "suspicious.exe";
DeviceProcessEvents
| where DeviceName == targetDevice
| where FileName == targetProcess or InitiatingProcessFileName == targetProcess
| project Timestamp, FileName, ProcessCommandLine,
         Parent = InitiatingProcessFileName,
         ParentCmd = InitiatingProcessCommandLine,
         GrandParent = InitiatingProcessParentFileName
| sort by Timestamp asc

Expected Output

Full parent-child process chain including timestamps, file names, command lines for the process, its parent, and its grandparent process.

When to Use

Understanding how the suspicious process was launched. For example, if powershell.exe was spawned by winword.exe, that strongly suggests a malicious document execution chain.

Prior Alerts for Entity

Queries the last 30 days of alert history for a specific entity (device, user, or file). Identifies repeat offenders and can reveal ongoing campaigns or persistent threats targeting the same assets.

let entityName = "WORKSTATION-01";
AlertInfo
| where Timestamp > ago(30d)
| join kind=inner (
    AlertEvidence
    | where EntityType in ("Machine", "User", "File")
    | where DeviceName == entityName or AccountName == entityName or FileName == entityName
) on AlertId
// dcount(AlertId) counts each alert once, even when several evidence rows match.
| summarize AlertCount = dcount(AlertId),
            AlertTitles = make_set(Title),
            Severities = make_set(Severity),
            Categories = make_set(Category)
    by DeviceName, AccountName
| sort by AlertCount desc

Expected Output

Shows the alert history for the entity over the past 30 days, including total alert count, distinct alert titles, severity levels, and categories grouped by device and account.

When to Use

Identifying repeat offenders or ongoing campaigns. A device with multiple high-severity alerts over a short period likely indicates an active compromise rather than isolated events.

5 Evidence Collection Procedures

When an alert is confirmed as a True Positive or requires further investigation, the following evidence must be collected and preserved before any containment actions are taken.

Evidence Collection Checklist (Required)

- Screenshot alert details from MDE portal
- Export device timeline (CSV) for relevant time window
- Record process tree from alert story
- Document network connections from device timeline
- Collect file hashes (SHA1, SHA256, MD5) for relevant files
- Initiate "Collect Investigation Package" via MDE if needed
- Save enrichment results (VirusTotal, WHOIS, etc.)
- Document analyst observations and initial hypothesis

Evidence Naming Convention

All collected artifacts must follow the standard naming format: [IncidentID]_[DeviceName]_[ArtifactType]_[YYYYMMDD_HHMMSS]

Examples: INC-2024-0142_WKS01_Timeline_20240115_103000 or INC-2024-0142_WKS01_InvestigationPackage_20240115_110000
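A small helper can keep artifact names consistent with the convention above. This is a sketch under stated assumptions: the function name `artifact_name` is hypothetical, and rejecting underscores inside individual fields is a design choice made here so that names can be split back into their parts, not a rule stated by the playbook.

```python
from datetime import datetime


def artifact_name(incident_id: str, device_name: str, artifact_type: str,
                  collected_at: datetime) -> str:
    """Build [IncidentID]_[DeviceName]_[ArtifactType]_[YYYYMMDD_HHMMSS]."""
    for part in (incident_id, device_name, artifact_type):
        # Assumption: fields must be non-empty and free of the "_" separator.
        if not part or "_" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{incident_id}_{device_name}_{artifact_type}_{collected_at:%Y%m%d_%H%M%S}"


print(artifact_name("INC-2024-0142", "WKS01", "Timeline", datetime(2024, 1, 15, 10, 30, 0)))
# INC-2024-0142_WKS01_Timeline_20240115_103000
```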

6 Enrichment Workflows

Enrichment adds external and internal context to alert entities (files, IPs, domains, users). Follow this procedure for every alert that progresses past Step 4 in the Triage Decision Tree.

Enrichment Procedure Steps

Check file hash on VirusTotal (manual upload or API query). Record detection ratio, first seen date, and any sandbox behavioral reports.
Check domain/IP reputation on VirusTotal, AbuseIPDB, and Shodan. Record abuse confidence score, geolocation, ASN, and any associated malware campaigns.
Perform WHOIS lookup for suspicious domains. Note registration date, registrar, nameservers, and registrant information. Recently registered domains (<30 days) are high risk.
Query internal threat intelligence database/platform. Cross-reference IOCs with any active threat hunts, previous incidents, or known adversary infrastructure tracked by the team.
Check Microsoft Threat Intelligence for the entity. Review the MDE file profile, URL reputation, and any associated global alert prevalence data within the Microsoft ecosystem.
Verify device compliance status in MDE. Check whether the affected device has missing patches, disabled security features, or policy violations that may have contributed to the alert.
Check user risk score in Azure AD Identity Protection. Review sign-in risk events, impossible travel detections, and any existing risk flags on the user account associated with the alert.
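The WHOIS step above flags domains registered fewer than 30 days ago. A minimal sketch of that check follows. It assumes the registration date has already been extracted from the WHOIS record by whatever tool the team uses, and `is_recently_registered` is a hypothetical helper name.

```python
from datetime import date

RECENT_REGISTRATION_DAYS = 30  # threshold from the WHOIS step above


def is_recently_registered(registered_on: date, today: date) -> bool:
    """Return True if the domain is younger than the 30-day threshold."""
    age_days = (today - registered_on).days
    if age_days < 0:
        raise ValueError("registration date is in the future")
    return age_days < RECENT_REGISTRATION_DAYS


print(is_recently_registered(date(2024, 1, 1), date(2024, 1, 15)))   # True (14 days old)
print(is_recently_registered(date(2023, 12, 1), date(2024, 1, 15)))  # False (45 days old)
```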

Enrichment Decision Matrix

Entity Type	Primary Source	Secondary Sources	Key Data Points
File Hash	VirusTotal	MDE File Profile, Any.Run	Detection ratio, first/last seen, sandbox behavior
IP Address	AbuseIPDB	VirusTotal, Shodan, GreyNoise	Abuse confidence, geolocation, open ports, ISP
Domain	VirusTotal	WHOIS, URLhaus, MDE TI	Registration date, registrar, DNS history, reputation
URL	VirusTotal	URLhaus, Google Safe Browsing	Redirection chain, hosted content, blocklist status
User Account	Azure AD	MDE User Profile, UEBA	Risk level, recent activity, sign-in anomalies

7 Alert Closure Documentation

Every alert must be closed with proper documentation to maintain audit trails, enable metrics reporting, and support detection tuning. The following fields are required or recommended for each alert closure.

Required Closure Fields

Field	Required	Description
Classification	Yes	TP (True Positive), FP (False Positive), or BTP (Benign True Positive)
Determination	Yes	Malware, SecurityTesting, LineOfBusinessApplication, UnwantedSoftware, or Other
Analyst Notes	Yes	Minimum 2 sentences explaining the classification rationale, including key evidence reviewed and conclusions drawn
Actions Taken	Yes	List of all response actions performed (e.g., isolated device, blocked hash, disabled account, no action required)
Tuning Action	If FP	Recommended detection tuning or suppression rule to prevent recurrence of the false positive
Linked Incident	If TP	Incident ID for the created incident. All TP alerts must be associated with an incident for tracking and reporting
Enrichment Summary	Recommended	Key findings from enrichment sources (VirusTotal scores, reputation data, TI matches) that supported the classification
Time Spent	Recommended	Minutes spent on triage for workload metrics and SLA reporting. Helps identify detection rules that consume disproportionate analyst time
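The conditional requirements in the table (Tuning Action for FP, Linked Incident for TP) are easy to miss at closure time. The sketch below shows one way a closure record could be checked before submission. The dictionary-based record and the name `missing_closure_fields` are assumptions for illustration, not an MDE API.

```python
from typing import Dict, List

ALWAYS_REQUIRED = ("Classification", "Determination", "Analyst Notes", "Actions Taken")
# Extra field required only for a given classification.
CONDITIONALLY_REQUIRED = {"FP": "Tuning Action", "TP": "Linked Incident"}


def missing_closure_fields(closure: Dict[str, str]) -> List[str]:
    """Return the required closure fields that are absent or empty."""
    missing = [field for field in ALWAYS_REQUIRED if not closure.get(field)]
    extra = CONDITIONALLY_REQUIRED.get(closure.get("Classification", ""))
    if extra and not closure.get(extra):
        missing.append(extra)
    return missing


closure = {
    "Classification": "FP",
    "Determination": "Other",
    "Analyst Notes": "Scheduled backup job matched the rule. No malicious activity found.",
    "Actions Taken": "No action required",
}
print(missing_closure_fields(closure))  # ['Tuning Action']
```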

Quality Review

All P1/P2 alert closures must be reviewed by a senior analyst within 24 hours. If a single detection rule produces more than 5 FP classifications in a week, it triggers an automatic tuning review.
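The tuning-review trigger can be illustrated with a simple count per detection rule. This sketch assumes one week of closures is available as (detection rule, classification) pairs. The function name `rules_needing_tuning_review` and the rule names in the example are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

FP_WEEKLY_THRESHOLD = 5  # more than 5 FPs per rule per week triggers a review


def rules_needing_tuning_review(closures: Iterable[Tuple[str, str]]) -> List[str]:
    """Return detection rules with more than 5 FP closures in the given week."""
    fp_counts = Counter(rule for rule, classification in closures if classification == "FP")
    return sorted(rule for rule, count in fp_counts.items() if count > FP_WEEKLY_THRESHOLD)


week = [("Rule A", "FP")] * 6 + [("Rule B", "FP")] * 5 + [("Rule A", "TP")]
print(rules_needing_tuning_review(week))  # ['Rule A']
```

Rule B has exactly 5 FP closures and is therefore not flagged, since the threshold is "exceeding 5".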
