If you have spent any time in security operations, you have seen these two numbers on a dashboard: Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR). They show up in quarterly business reviews, board presentations, vendor sales decks, and job descriptions for SOC managers. They are, in the language of Post 1 of this series, the metrics that feel like security measurement because they are associated with real operational processes. Detection and response are genuinely important. The question is whether the way we calculate and report these numbers is actually telling us anything meaningful, or whether we have simply found a more sophisticated way to count things that are easy to count.
My honest answer, after working with these metrics across multiple environments, is: both. MTTD and MTTR have real value when they are understood correctly, scoped honestly, and paired with the right context. They also have significant limitations that most organizations do not acknowledge, and those limitations are where the trouble lives.
What These Metrics Are Actually Good For
Let me start with the affirmative case, because it deserves to be made clearly before the criticism lands. MTTD and MTTR, when calculated honestly, do capture something real about operational security performance. Detection speed matters. The relationship between how quickly an attacker is identified and the damage they are able to cause is not theoretical. According to Mandiant’s M-Trends 2024 report, the global median attacker dwell time has dropped to just 10 days, with ransomware-related breaches averaging around five days. [1] That improvement in dwell time reflects genuine progress in detection capability across the industry. MTTD is one of the mechanisms through which that progress gets measured and tracked.
MTTR matters for the same reason. Once a threat is detected, the time between detection and containment determines how far the attacker gets. Breaches with a lifecycle exceeding 200 days cost significantly more: organizations in that category averaged USD 5.46 million in breach costs in 2024. [2] The speed of the response lifecycle has a direct and measurable impact on the financial and operational consequences of a breach. That is a meaningful connection between a metric and a real-world outcome, which is exactly what Post 1 of this series argued most security metrics lack.
Used at the operational level, MTTD and MTTR are also useful diagnostic tools. If MTTD is trending upward over time, it is worth asking why. Are detection rules degrading? Is telemetry coverage shrinking as the environment grows? Is analyst capacity being overwhelmed? The metric does not answer those questions, but it surfaces the signal that prompts someone to ask them. That has genuine value for SOC managers who are trying to understand where their operation is breaking down.
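To make that diagnostic concrete, here is a minimal sketch of the trend calculation, assuming each incident record carries a timestamp for when the malicious activity occurred and when it was detected. The field names and values are illustrative, not tied to any particular SIEM schema.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical incident records: when the alertable activity occurred
# and when the SOC detected it. Schema and values are assumptions.
incidents = [
    {"occurred": datetime(2024, 1, 3, 8, 0),  "detected": datetime(2024, 1, 3, 11, 0)},
    {"occurred": datetime(2024, 2, 10, 9, 0), "detected": datetime(2024, 2, 10, 14, 30)},
    {"occurred": datetime(2024, 3, 7, 2, 0),  "detected": datetime(2024, 3, 7, 9, 15)},
]

# Bucket detection delays by month so the trend, not just a single
# quarterly average, is visible.
by_month = defaultdict(list)
for inc in incidents:
    delay_hours = (inc["detected"] - inc["occurred"]).total_seconds() / 3600
    by_month[inc["detected"].strftime("%Y-%m")].append(delay_hours)

for month in sorted(by_month):
    print(f"{month}: MTTD = {mean(by_month[month]):.1f}h over {len(by_month[month])} incidents")
```

A SOC that only ever looks at the single average misses exactly the month-over-month drift that this kind of breakdown surfaces.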
Where the Honest Problems Begin
Here is the limitation that almost nobody reports alongside the metric: MTTD only measures the incidents that were detected. It says nothing whatsoever about the incidents that were not.
This is not a subtle distinction. It is a fundamental flaw in how these numbers get used in practice. If your SOC has solid detection coverage for endpoint malware and weak coverage for lateral movement and credential abuse, your MTTD might look excellent because the incidents that make it into your detection pipeline are the ones you are best equipped to catch quickly. The ones you are not catching are simply absent from the denominator. Your metric improves as your coverage narrows, and the number tells a more flattering story precisely because the hardest problems are invisible to it.
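A small worked example makes the denominator problem visible. In the sketch below, with all numbers invented for illustration, the reported MTTD “improves” by an order of magnitude between quarters, not because detection got faster but because the slowest category stopped being detected at all.

```python
from statistics import mean

# Hypothetical detection delays in hours, grouped by threat category.
last_quarter = {
    "endpoint_malware": [2, 3, 4],
    "lateral_movement": [40, 55],  # hard to detect, but at least detected
}
this_quarter = {
    "endpoint_malware": [1, 2, 2],
    # lateral_movement: zero detections this quarter. Those incidents are
    # absent from the denominator entirely, not resolved faster.
}

def reported_mttd(quarter):
    # MTTD is computed only over incidents that produced a detection.
    delays = [d for delays_per_cat in quarter.values() for d in delays_per_cat]
    return mean(delays)

print(f"last quarter: {reported_mttd(last_quarter):.1f}h")  # 20.8h
print(f"this quarter: {reported_mttd(this_quarter):.1f}h")  # 1.7h, an apparent win
```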
The same problem affects MTTR differently but just as seriously. MTTR is highly sensitive to how you define “responded.” There is no industry-standard approach to measuring either of these indicators, so granular comparisons between organizations tend to be apples-to-oranges affairs. [3] Does response end when the affected endpoint is isolated? When the root cause is identified? When all persistence mechanisms are removed and the environment is verified clean? Organizations answer that question inconsistently, often in the way that produces the most favorable number, and the result is that MTTR figures across different teams and organizations are frequently incomparable even when they appear to be measuring the same thing.
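The sensitivity is easy to demonstrate. The sketch below takes one hypothetical incident with milestone timestamps and computes “MTTR” under three different definitions of responded; the answers differ by a factor of thirty.

```python
from datetime import datetime

# One hypothetical incident. Which milestone counts as "responded" is a
# policy decision, and each choice yields a very different number.
incident = {
    "detected":          datetime(2024, 6, 1, 9, 0),
    "endpoint_isolated": datetime(2024, 6, 1, 13, 0),
    "root_cause_found":  datetime(2024, 6, 3, 17, 0),
    "verified_clean":    datetime(2024, 6, 6, 10, 0),
}

for definition in ("endpoint_isolated", "root_cause_found", "verified_clean"):
    hours = (incident[definition] - incident["detected"]).total_seconds() / 3600
    print(f"MTTR if responded means {definition}: {hours:.0f}h")  # 4h, 56h, 121h
```

The point is not that one definition is right. The point is that the number is meaningless unless you state which one you used.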
The Dwell Time Problem and What It Reveals
Dwell time is the metric that sits underneath both MTTD and MTTR, and it is more honest about what is actually being measured. Dwell time is the total period an attacker spends in an environment before they are discovered and expelled, encompassing both the detection gap and the response gap. One common formulation defines it simply as MTTD plus MTTR, and the combined figure is sometimes referred to as the breach detection gap. [4]
Published dwell time figures, like the Mandiant medians cited above, look considerably more sobering than most organizations’ reported MTTD figures. Part of the reason is that dwell time is calculated retrospectively after a breach is fully investigated, while MTTD is calculated from the alerts that fire in near real time. The retrospective analysis frequently reveals that the attacker was present far longer than any alert suggested, because the early stages of the intrusion generated no alert at all. The MTTD clock does not start until detection occurs. The dwell time clock starts at compromise. That gap between the two clocks is where the real measurement problem lives.
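Here is the gap between the two clocks in code form, with hypothetical timestamps. MTTD starts counting at the first alert; dwell time, reconstructed during the investigation, starts at compromise.

```python
from datetime import datetime

# Hypothetical intrusion timeline. The compromise timestamp comes from
# forensics after the fact; no alert fired for the first seventeen days.
initial_compromise = datetime(2024, 3, 1, 4, 0)
first_alert        = datetime(2024, 3, 18, 9, 0)
confirmed          = datetime(2024, 3, 18, 11, 0)  # analyst confirms intrusion
expelled           = datetime(2024, 3, 22, 16, 0)  # attacker removed

mttd_hours = (confirmed - first_alert).total_seconds() / 3600
dwell_days = (expelled - initial_compromise).days

print(f"MTTD as typically reported: {mttd_hours:.0f} hours")
print(f"Dwell time: {dwell_days} days")  # the 17 pre-alert days are invisible to MTTD
```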
How MTTD Gets Gamed Without Anyone Intending To
There is a gaming dynamic with MTTD that does not require anyone to act in bad faith. It happens naturally as a consequence of how detection programs evolve. As detection engineers tune rules and reduce false positive rates, the alerts that make it through the queue tend to be higher confidence and faster firing. The average time to detect those alerts goes down. The MTTD looks better. What has actually improved is the efficiency of the detection logic for a specific and relatively well-understood set of threat behaviors.
Meanwhile, the detection coverage for newer, less understood, or more subtle threat techniques may have remained stagnant or even declined as engineering effort concentrated on tuning existing detections rather than building new ones. The MTTD number improved while the detection program’s actual breadth narrowed. Nobody gamed anything deliberately. The incentive structure around the metric simply pointed in the wrong direction, and the number responded accordingly.
Making These Metrics More Honest
None of this means abandoning MTTD and MTTR. It means reporting them with the context and caveats they require to be genuinely useful rather than superficially impressive. A few practical adjustments change the quality of the conversation significantly.
Report MTTD segmented by detection method and threat category. A four-hour average across all incidents tells you almost nothing. A four-hour average for endpoint detections, a twelve-hour average for identity-based detections, and a note that cloud-native lateral movement has no meaningful detection baseline is a picture leadership can actually act on. The segmentation is where the honest information lives.
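As a sketch of what that segmented report might look like in practice, with category names and numbers purely illustrative:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical incidents tagged by detection category. The value of the
# report is the segmentation plus an explicit note for uncovered categories.
incidents = [
    {"category": "endpoint", "detect_hours": 3},
    {"category": "endpoint", "detect_hours": 5},
    {"category": "identity", "detect_hours": 11},
    {"category": "identity", "detect_hours": 13},
]
no_baseline = ["cloud-native lateral movement"]  # nothing detected, nothing to average

segments = defaultdict(list)
for inc in incidents:
    segments[inc["category"]].append(inc["detect_hours"])

for category, delays in sorted(segments.items()):
    print(f"{category}: MTTD {mean(delays):.0f}h (n={len(delays)})")
for category in no_baseline:
    print(f"{category}: no meaningful detection baseline yet")
```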
Define MTTR explicitly and consistently before you calculate it. Decide what “responded” means in your environment, write it down, and apply the same definition every quarter. If you change the definition, flag the change when you report the trend. Inconsistency in definition is the primary reason MTTR trends are unreliable over time.
Pair both metrics with a coverage gap acknowledgment. Every MTTD and MTTR figure should come with an honest statement about what the detection program is not yet covering, so that leadership understands the metric in the context of its actual scope rather than assuming it represents the full attack surface. Detection coverage represents the percentage of MITRE ATT&CK techniques that have been instrumented and tested, indicating how likely the team is to receive an alert when an adversary executes each behavior. [5] That coverage figure belongs next to every MTTD you report.
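A minimal sketch of that coverage calculation, assuming you maintain a list of ATT&CK techniques relevant to your environment and track whether each detection has been both instrumented and tested. The technique IDs below are real ATT&CK identifiers; the statuses are hypothetical.

```python
# Coverage status per relevant MITRE ATT&CK technique. "tested" means an
# instrumented detection that has been validated, not just written.
relevant_techniques = {
    "T1059": "tested",        # Command and Scripting Interpreter
    "T1021": "instrumented",  # Remote Services: alert exists, never validated
    "T1003": "tested",        # OS Credential Dumping
    "T1078": "none",          # Valid Accounts: no detection at all
}

tested = sum(1 for status in relevant_techniques.values() if status == "tested")
coverage = tested / len(relevant_techniques)
print(f"Tested coverage: {coverage:.0%} of {len(relevant_techniques)} relevant techniques")
```

Counting only tested detections is deliberately conservative. An alert that has never been validated against the behavior it claims to catch is closer to a hypothesis than to coverage.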
MTTD and MTTR are useful. They are also among the most consistently misrepresented metrics in security operations, not through dishonesty but through incomplete reporting that strips out the context that makes the numbers meaningful. Speed of detection only matters if you are detecting the right things. Speed of response only matters if containment actually means what it implies. The numbers are worth tracking. The conversation around them is worth having more honestly than most organizations currently do.
Post 3 in this series takes on detection coverage directly: what it means to actually measure how much of your attack surface you can see, why the MITRE ATT&CK framework is the right organizing structure for that question, and why most organizations’ coverage estimates are significantly more optimistic than their actual detection capability warrants.
