Detection Outcomes and Tiering

The Precision vs Coverage Tradeoff

Detection engineering requires balancing coverage (detecting as much genuinely malicious activity as possible) against precision (keeping false positives low). Widening a rule's scope improves coverage but generates more noise; tightening it reduces noise but risks missed detections.

Additionally, single-layer alerting lacks sufficient context to accurately assess activity in isolation. This makes it difficult to identify multi-stage or chained attacks, where individual events may appear benign but are malicious when correlated.
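The tradeoff above can be quantified with the standard precision and recall metrics computed from outcome counts. A minimal sketch (the counts are illustrative, not drawn from any real rule):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of alerts that were real threats (alert quality)."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of real threats that produced an alert (coverage)."""
    return tp / (tp + fn)

# Illustrative counts for a single rule over a review period.
tp, fp, fn = 40, 160, 10
print(f"precision={precision(tp, fp):.2f}")  # 40/200 = 0.20
print(f"recall={recall(tp, fn):.2f}")        # 40/50  = 0.80
```

A rule tuned for recall (wide net) tends to score like this example: high coverage, low precision. Tiering, discussed later, is one way to recover precision without giving up the wide net.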


Detection Outcome Model

We define detection outcomes across three distinct stages:

Stage 1: Detection Intent - Was the alert expected to fire based on rule design?

Stage 2: Investigation Outcome - What did the analyst determine?

Stage 3: Validation - Was that determination correct?

Each alert can be evaluated across three dimensions: whether it should have fired, what it represents, and whether that assessment was correct.
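The three dimensions can be represented as a small data model. This is a hedged sketch; the type and field names are illustrative, not part of the model's formal definition:

```python
from dataclasses import dataclass
from enum import Enum

class DetectionIntent(Enum):
    # Stage 1: was the alert expected to fire based on rule design?
    WITHIN = "within"    # the rule was designed to catch this activity
    OUTSIDE = "outside"  # the activity falls outside the rule's scope

class Verdict(Enum):
    # Stage 2: what the analyst determined during investigation.
    TRUE_POSITIVE = "TP"
    FALSE_POSITIVE = "FP"
    TRUE_NEGATIVE = "TN"
    FALSE_NEGATIVE = "FN"

class Validation(Enum):
    # Stage 3: was the analyst's determination correct?
    CONFIRMED = "confirmed"
    OVERTURNED = "overturned"

@dataclass
class DetectionOutcome:
    intent: DetectionIntent
    verdict: Verdict
    validation: Validation
```

Keeping the three stages as separate fields, rather than a single label, preserves the distinction between what the rule intended, what the analyst concluded, and whether QA upheld that conclusion.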


Detection Outcome Matrix (v1.0)

This model extends traditional definitions by anchoring outcomes on detection intent.

Rows represent detection intent; columns represent Stage 1 (Analyst Triage), Stage 2 (Investigation Verdict), and Stage 3 (QA Validation).

Within Detection Intent (FN = miss, TP = hit)

- Stage 1 - Analyst Triage: False Negative (FN)
  - Uncaught red-team-relevant activities
  - Threat hunting findings (unknowns)
  - Undetected bug bounty findings
  - Self-identified incidents
  - Any other valid form of reported incident (detection opportunities)
- Stage 2 - Investigation Verdict: True Positive (TP)
  - Malicious (various categories of attacks compromising CIA)
  - Abusive (violating compliance, T&C)
  - Approved (e.g., red team)
  - Benign (e.g., anomalies that warrant investigation but are found not to be malicious)
- Stage 3 - QA Validation: False-False Positive (F-FP)
  - The FP determination was erroneous and should have been a TP.

Outside Detection Intent (TN = correct reject, FP = false alarm)

- Stage 1 - Analyst Triage: True Negative (TN)
  - No alerts, as expected.
- Stage 2 - Investigation Verdict: False Positive (FP)
  - Noise (should be suppressed/excluded by accurate detection logic)
  - Inaccurate logic
- Stage 3 - QA Validation: False-True Positive (F-TP)
  - The Benign-TP determination was made erroneously.



Detection Outcome Taxonomy

The following taxonomy defines how detection outcomes can be consistently classified during investigation and reporting.

True Negative (TN)
- Definition: An event that does not meet the intended criteria of a detection rule and therefore does not generate an alert, consistent with the rule's defined scope. This matters when validating that a rule does not match events it was not designed to catch. Such events are not necessarily benign; they may be malicious, but they fall outside the scope of the specific rule being tested.
- Example: A rule designed to detect suspicious PowerShell execution does not trigger when a system service executes a signed, vendor-provided PowerShell script as part of a standard software update process.

False Negative (FN)
- Definition: Malicious or abusive activity that occurs without being detected by existing detection logic and is subsequently discovered through alternative mechanisms.
- Example: An attacker successfully exploits a vulnerable web service, but the relevant exploitation detection rule fails to alert. The activity is later identified during threat hunting.

True Positive (TP)
- Definition: An event that meets the intended detection criteria of a rule and correctly generates an alert in accordance with the rule's defined scope.
- Example: A rule designed to detect suspicious PowerShell execution correctly alerts on malicious script activity observed on an endpoint.

TP - Malicious
- Definition: An event involving malicious activity that threatens or violates the confidentiality, integrity, or availability of systems, data, or services, and is correctly detected by the rule. For reporting purposes, this may be further classified into a high-level threat category (for example, malware, social engineering, or denial of service).
- Example: A detection alerts on ransomware encryption activity impacting file integrity.

TP - Abusive
- Definition: An event involving behavior that violates organizational policy, compliance requirements, acceptable use standards, or terms and conditions, and is correctly detected by the rule.
- Example: A web application detection rule alerts on automated shopping cart inventory hoarding behavior intended to artificially reserve limited stock items, violating the platform's acceptable use and fair access policies.

TP - Approved
- Definition: An event exhibiting malicious characteristics that is generated through authorized security testing activities and is correctly detected by the rule.
- Example: A credential dumping attempt performed during an authorized red team exercise triggers the appropriate detection rule.

TP - Benign
- Definition: An event that satisfies the intended anomalous or investigative criteria of the detection rule, generates an alert as designed, and is subsequently determined not to involve malicious or policy-violating behavior.
- Example: A detection designed to surface anomalous data transfer volume alerts on a large outbound file upload from an engineering workstation. Investigation determines the activity was a legitimate bulk export related to an approved product release.

False-TP (F-TP), after validation
- Definition: An alert initially classified as a TP that is later determined not to have met the intended detection criteria and should have been classified as an FP.
- Example: An alert for suspected data exfiltration is classified as a TP, triggering an automated SOAR response that blocks outbound traffic from a production server. Post-incident review determines the traffic was a legitimate replication process that should have been excluded from the detection criteria, resulting in unnecessary operational impact.

False Positive (FP)
- Definition: An event that does not meet the intended detection criteria of a rule but nevertheless generates an alert.
- Example: A credential misuse detection rule alerts on a user authenticating from a VPN IP address that overlaps with a known threat intelligence feed, but the IP was reassigned and is no longer associated with malicious activity.

FP - Noise
- Definition: An alert generated from activity that falls outside the intended scope of the detection and represents expected, routine, or otherwise irrelevant events that should be excluded or suppressed through proper tuning.
- Example: A lateral movement detection alerts on a service account authenticating across multiple servers, but the account is a documented configuration management account performing scheduled orchestration tasks. The detection lacks a suppression condition for this known behavior.

FP - Inaccurate Logic
- Definition: An alert generated due to flawed, overly broad, or improperly constructed detection logic that does not accurately represent the intended detection criteria.
- Example: A detection rule incorrectly matches on an unrelated log field due to improper query construction, generating alerts for irrelevant activity.

False-FP (F-FP), after validation
- Definition: An alert that was resolved as an FP but is later determined, through validation or retrospective review, to have met the intended detection criteria and should have been classified as a TP.
- Example: An analyst closes an alert as benign administrative activity, but later investigation reveals it was part of an actual credential compromise and met the rule's intended detection criteria.
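The validation stage of the taxonomy reduces to a small rule: a confirmed verdict keeps its label, while an overturned verdict flips to its "false-" counterpart. A minimal sketch (function and parameter names are illustrative):

```python
def final_outcome(verdict: str, validated_correct: bool) -> str:
    """Map an analyst verdict plus QA validation to the taxonomy label.

    verdict: "TP" or "FP" as resolved at investigation.
    validated_correct: True if QA confirms the verdict, False if overturned.
    """
    if validated_correct:
        return verdict  # TP stays TP, FP stays FP
    # An overturned determination becomes its "false-" counterpart.
    return {"TP": "F-TP", "FP": "F-FP"}[verdict]

print(final_outcome("TP", True))   # TP
print(final_outcome("FP", False))  # F-FP
```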



Alert vs Incident

Model Flow
- Model 1: detection rule → triggers alert → creates incident ticket
- Model 2: detection rule → triggers alert → creates case ticket → incident created for TP

Organizations implement alert and incident workflows differently, as illustrated in the two models above.

In both approaches, the alert is the unit evaluated against the detection outcome model; the models differ in when an incident record is created - immediately on alert (Model 1), or only after an investigation reaches a TP verdict (Model 2).
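The two models can be sketched as simple functions. This is illustrative only; ticket identifiers and naming are hypothetical, not a prescribed schema:

```python
def model_1(alert: dict) -> str:
    # Model 1: every alert opens an incident ticket directly.
    return f"incident:{alert['id']}"

def model_2(alert: dict, verdict: str) -> str:
    # Model 2: every alert opens a case; only a TP verdict
    # escalates the case to an incident.
    if verdict == "TP":
        return f"incident:{alert['id']}"
    return f"case:{alert['id']}"

print(model_1({"id": "A1"}))        # incident:A1
print(model_2({"id": "A2"}, "FP"))  # case:A2
```

Model 2 keeps incident volume aligned with confirmed threats, at the cost of an extra triage step for every alert.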


Detection Tiering Overview

Detection tiering addresses the tradeoff between false positives and false negatives by layering detections across different levels of context and confidence.

Lower tiers are designed to cast a wider net, surfacing potential signals and anomalous activity. Higher tiers build on these signals by adding context, correlation, and logic to identify activity that is more likely to represent a real threat.

This layered approach improves overall visibility while increasing detection fidelity, allowing analysts to focus on what matters most.

The intended outcomes are:

- broad visibility at the lower tiers, reducing false negatives;
- higher fidelity at the upper tiers, reducing the false positives that reach analysts;
- analyst attention concentrated on high-confidence, high-context alerts.


Detection Tiering Model

The following model is derived from common patterns observed across SIG member implementations. It serves as a reference model to illustrate how detection tiering can be structured, rather than a prescriptive standard.

Tier 1 - Signal Generation (Baseline / Hunt)
- Description: Captures atomic signals, indicators, and simple patterns to surface potential activity.
- Typical characteristics: IoCs, simple pattern matching, hunt queries; high volume, low confidence.
- Outputs: Raw signals, hunt breadcrumbs.

Tier 2 - Signal Enrichment and Correlation
- Description: Combines and enriches Tier 1 signals by adding context and basic logic.
- Typical characteristics: Thresholds, temporal patterns, risk scoring, basic correlation.
- Outputs: Contextualized alerts, intermediate findings.

Tier 3 - Analytical Detection
- Description: Identifies suspicious behaviors by aggregating signals and applying context-aware detection logic.
- Typical characteristics: Behavior-based rules, signal aggregation, technique-focused detection.
- Outputs: Medium- to high-confidence alerts requiring investigation.

Tier 4 - High-Confidence Detection
- Description: Produces actionable detections based on strong signals and correlation.
- Typical characteristics: Composed detections, strong behavioral indicators; low volume, high confidence.
- Outputs: Incident-ready alerts, automated response triggers.
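A Tier 1 → Tier 2 promotion can be sketched as a threshold over a temporal window: raw signals are grouped by entity, and a burst within the window produces a contextualized alert. Field names, the threshold, and the window are illustrative assumptions, not values from the reference model:

```python
from collections import defaultdict

THRESHOLD = 3         # signals per entity needed to escalate (illustrative)
WINDOW_SECONDS = 300  # temporal correlation window (illustrative)

def correlate(signals: list[dict]) -> list[dict]:
    """Group Tier 1 signals by entity; emit one Tier 2 alert per entity
    when THRESHOLD signals fall inside WINDOW_SECONDS."""
    by_entity = defaultdict(list)
    for s in sorted(signals, key=lambda s: s["ts"]):
        by_entity[s["entity"]].append(s)
    alerts = []
    for entity, events in by_entity.items():
        # Slide over the time-ordered events looking for a qualifying burst.
        for i in range(len(events) - THRESHOLD + 1):
            burst = events[i : i + THRESHOLD]
            if burst[-1]["ts"] - burst[0]["ts"] <= WINDOW_SECONDS:
                alerts.append({"tier": 2, "entity": entity,
                               "count": len(burst),
                               "first_ts": burst[0]["ts"]})
                break  # one alert per entity is enough for this sketch
    return alerts

# Three signals from host-1 within 300s escalate; a lone pair does not.
signals = [{"entity": "host-1", "ts": t} for t in (0, 120, 240)]
signals += [{"entity": "host-2", "ts": t} for t in (0, 1000)]
print(correlate(signals))
```

Tiers 3 and 4 follow the same shape with richer logic: aggregating across signal types, scoring behavior against techniques, and composing multiple Tier 2 outputs into a single high-confidence detection.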