Why We’re Managing Detections Like It’s 2005 Production Code
“Simplicity is a prerequisite for reliability.”
- Edsger Dijkstra
There’s an old lesson in engineering that shows up everywhere, from aviation to distributed systems to software infrastructure: when systems fail, they rarely fail because they were too simple. They fail because complexity hid the real problem.
Imagine a commercial airplane cockpit overloaded with switches, alerts, and warning lights: many of them redundant, some of them outdated, a few of them broken. In theory, it offers more control. In reality, it makes failure harder to diagnose and recovery harder to execute. Reliability doesn’t come from more knobs. It comes from clear systems, strong guarantees, and feedback you can trust.
Detection engineering today looks a lot like that cockpit.
We’ve layered rules on rules, tools on tools, alerts on alerts, then wondered why reliability keeps slipping.
The Uncomfortable Truth
Security teams like to believe detection engineering is a modern discipline. We talk about Detection-as-Code, CI/CD for security, and now AI-powered SOCs. But if you look closely at how detections are actually managed day to day, the reality is much less flattering.
In practice, most organizations are managing detections the same way software teams managed production code twenty years ago.
And that’s the real problem.
A Pattern That Feels Uncomfortably Familiar
There’s a particular smell to systems that technically work but aren’t reliable. You see it in old factories, legacy IT environments, and pre-DevOps production systems.
Nothing is obviously broken. Things mostly function. But everyone knows where the fragile parts are.
There’s the machine nobody wants to touch. The script that only one person understands. The process that works as long as nothing unexpected happens.
Detection engineering environments often feel exactly like this.
A small group of trusted experts owns the most important detections. Changes are made slowly and carefully, not because they’re hard, but because the blast radius is unclear. New rules are deployed directly into production systems with fingers crossed. Validation is informal, tribal, and mostly manual.
When something goes wrong, it’s rarely caught by a test or a dashboard. It’s caught by a human, usually a SOC analyst, who feels the pain first.
This isn’t what broken systems look like.
It’s what fragile systems look like.
Even the way detections are tested follows the same pattern.
In many organizations, detection testing still looks like a waterfall process. A rule is written, reviewed, and deployed, then validated later using breach-and-attack simulation tools, purple team exercises, or manual attack tests.
Those tests rarely integrate into the detection workflow itself. Instead, engineers and analysts sit side-by-side, watching dashboards, hoping to see an alert fire.
Did it trigger? Did it trigger the right way? Did anything else break?
This kind of testing is slow, manual, and episodic. It produces point-in-time confidence at best and depends heavily on expert coordination. When a detection changes, the process starts over from scratch.
Just like early software testing, this approach finds problems late and does nothing to prevent the same failures from reappearing the next time a rule is modified.
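The alternative is for a detection change to carry its own replayable test cases, so validation runs on every modification instead of as a one-off exercise. A minimal sketch, in which the rule structure, the event fields, and the `matches` helper are all hypothetical stand-ins rather than any real SIEM API:

```python
# Hypothetical sketch: a detection rule bundled with test cases that can be
# replayed automatically, instead of watching dashboards after deployment.

def matches(rule: dict, event: dict) -> bool:
    """Return True when every field condition in the rule matches the event."""
    return all(event.get(field) == value for field, value in rule["conditions"].items())

# An illustrative rule: flag PowerShell launched by a script host.
RULE = {
    "name": "powershell_from_wscript",
    "conditions": {
        "process_name": "powershell.exe",
        "parent_process": "wscript.exe",
    },
}

# Test cases shipped alongside the rule: (sample event, should it fire?)
TEST_CASES = [
    ({"process_name": "powershell.exe", "parent_process": "wscript.exe"}, True),
    ({"process_name": "powershell.exe", "parent_process": "explorer.exe"}, False),
]

def run_tests(rule: dict, cases: list) -> list:
    """Return the cases where the rule's actual behavior differs from expectation."""
    return [(event, want) for event, want in cases if matches(rule, event) != want]

if __name__ == "__main__":
    failures = run_tests(RULE, TEST_CASES)
    print("failures:", failures)
```

Because the expectations travel with the rule, the next engineer who modifies it inherits the tests instead of restarting the validation process from scratch.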
How Software Used to Be Run (Before We Knew Better)
Early production software looked remarkably similar:
- Code shipped straight to production
- Minimal automated testing
- Logging added after incidents
- Bugs discovered by users
- Hero engineers fixing issues under pressure
At the time, teams believed the solution was “better developers.”
It wasn’t.
Software reliability improved only when the industry embraced boring but powerful ideas:
- CI/CD pipelines
- Automated testing
- Observability and telemetry
- Clear ownership of production health
- Feedback loops that surfaced failure early
Software didn’t become reliable because engineers got smarter. It became reliable because systems got simpler and safer to operate.
How Detections Are Still Managed Today
Despite all the progress in software engineering, detection engineering largely skipped this evolution, and in some ways it’s in a worse position than early production software.
In software, even brittle systems had some form of runtime feedback. Exceptions crashed processes. Errors surfaced in logs. Users complained immediately. Failure was visible.
Detections don’t get that luxury.
Most detections today:
- Are written as queries or rules
- Are deployed directly into SIEMs, EDRs, or cloud platforms
- Have no automated schema or log validation
- Have no continuous health monitoring
- Have no concept of runtime errors
A detection can be syntactically correct, deployed successfully, and completely non-functional, and no one will know.
In software, a null pointer exception is loud. In detection engineering, failure is silent.
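One way to make that failure loud is a pre-deployment lint that checks every field a detection references against the current log schema. A minimal sketch, assuming a simple `field=value` query style; the schema set and the regex-based field extraction are illustrative assumptions, not how any particular SIEM exposes its schema:

```python
import re

# Hypothetical current schema for one log source; in practice this would be
# pulled from the SIEM or a schema registry, not hard-coded.
KNOWN_FIELDS = {"process_name", "parent_process", "command_line", "user_name"}

def referenced_fields(query: str) -> set:
    """Naively extract field names from a simple `field=value` style query."""
    return set(re.findall(r"\b(\w+)\s*=", query))

def lint(query: str) -> list:
    """Return fields the query references that no longer exist in the schema."""
    return sorted(referenced_fields(query) - KNOWN_FIELDS)

# A detection referencing a field that was renamed upstream:
query = 'process_name="powershell.exe" AND parent_image="wscript.exe"'
print(lint(query))  # the drifted field is surfaced before deployment
```

A check like this catches schema drift at review time, where it costs a failed pipeline run instead of a missed intrusion.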
The closest thing to a runtime error handler is a human, usually a SOC analyst, who notices something feels off. An alert never fires. Or fires constantly. Or fires too late to matter.
Often, the first true signal that something is broken isn’t an alert at all. It’s a confirmed compromise. Or a red team report. Or an uncomfortable post-incident question: Why didn’t we see this?
By the time detection failure becomes undeniable, the cost has already been paid.
That’s what makes detection engineering uniquely dangerous.
Software failures interrupt service. Detection failures rewrite history.
Where AI Actually Enters the Picture
AI isn’t the problem here and it isn’t the punchline either.
In fact, detection engineering is one of the domains where modern AI models are genuinely strong. Large language models are excellent at reading, writing, and reasoning about code. They can translate between formats, explain complex logic, and accelerate iteration in ways that were impossible just a few years ago.
The issue isn’t that teams are using AI.
It’s where and how AI is being applied.
Today, AI is most often used at the edges of the detection process:
- Drafting or modifying detection queries
- Translating rules between formats like Sigma
- Generating documentation or runbooks
These are useful accelerations, but they operate on individual artifacts, not the system that governs how detections are built, tested, deployed, and monitored.
As a result, teams move faster, but reliability doesn’t improve.
A detection can be written more quickly and still:
- Reference fields that no longer exist
- Depend on logs that aren’t present
- Fail silently in production
- Create noise that overwhelms the SOC
AI didn’t cause these problems and it can’t fix them in isolation.
This is a familiar lesson from software engineering. Better code generation never made production systems reliable on its own. Reliability came from workflows, validation, and feedback loops that surrounded the code.
The same is true for detection engineering.
AI becomes transformative only when it operates inside a well-defined detection lifecycle and inside a platform where all of those lifecycle stages are actually connected.
Without a shared platform, detections live as disconnected artifacts: queries in one system, validation in another, tests run manually, health signals buried in dashboards, and feedback trapped in tickets or tribal knowledge.
A true workflow doesn’t just define steps. It connects them.
When those pieces are connected, AI can move beyond assistance and into agentic workflows, where models don’t just generate detections, but help detection engineers reason about change, validate assumptions, surface risk, and suggest safer paths forward.
In that world, AI isn’t replacing human judgment. It’s reinforcing it by operating across the full system, not just a single file or prompt.
The Missing Piece in Detection Engineering
Detections look deceptively lightweight. They’re often just text, YAML, or a saved query in a console. Easy to write. Easy to copy. Easy to deploy.
But they don’t behave like lightweight artifacts.
In practice, detections behave much more like production systems:
- They generate ongoing operational load
- They influence real human decisions under pressure
- They fail silently rather than loudly
- They carry real business and risk implications
This mismatch is at the heart of the problem.
We treat detections like configuration, but they behave like software.
The issue isn’t that teams lack AI, intelligence, or effort. It’s that detections are rarely managed inside engineering-grade workflows that assume failure, validate continuously, and surface risk early.
What Changes When Detections Are Treated Like Software
Once detections are managed inside a real engineering system, the entire dynamic shifts.
Validation stops being an event and becomes a property of the system. Change stops being scary because impact is visible. Failure stops being silent because signals are wired into the workflow.
In that environment, AI finally has something solid to operate on.
Not a single query in isolation, but a connected system with history, context, and feedback.
Suddenly, AI can assist in ways that actually improve reliability:
- Linting detections against schemas before deployment
- Verifying required log sources exist and are healthy
- Generating and replaying tests automatically
- Backtesting changes against historical data
- Summarizing how a revision changes alert volume or coverage
- Flagging elevated risk before it reaches production
This is the real shift:
AI doesn’t make detections safer. Engineering discipline makes AI useful.
Without structure, AI accelerates risk. With structure, AI accelerates confidence.
The Force Multiplier Effect
When core engineering practices come together, AI stops feeling like a novelty and starts acting like leverage.
Detection-as-Code provides a source of truth. CI-style workflows make change observable. Continuous validation ensures assumptions stay valid. Health metrics close the loop with reality. SOC feedback grounds everything in outcomes.
When those pieces are connected inside a single platform, AI can operate across the full system instead of a single step.
The interaction changes fundamentally.
AI evolves from:
- “Help me write a rule”
Into:
- “Tell me if this change is dangerous”
- “Show me what broke after deployment”
- “Explain why alert volume shifted”
- “Recommend safer alternatives based on past behavior”
This is the difference between AI as a generator and AI as a reviewer.
And reviewers — systems that continuously assess change and surface risk — are what production-grade environments actually need.
Why This Matters Now
Detection environments are becoming more complex every quarter.
There’s more telemetry. Schemas change faster. Vendors ship updates continuously. Alert volumes keep climbing.
At the same time, the human side of the system is under strain.
SOCs are understaffed. Analysts are burned out. Leadership wants fewer false positives and stronger coverage, without adding headcount.
In this environment, speed without safety is reckless.
AI alone gives teams speed. Engineering systems provide safety.
When the two are combined inside a connected platform, teams finally get leverage-moving faster without losing reliability.
Security Is Catching Up to Engineering
There’s a clear pattern running through every failure described above.
When systems become unreliable, the root cause is almost never a lack of intelligence, effort, or tooling. It’s fragmentation. Signals live in different places. Decisions are made without context. Feedback arrives late, or not at all.
Software engineering learned this lesson years ago. Reliability didn’t emerge from writing better code or hiring smarter engineers. It emerged when teams built platforms that connected code, testing, deployment, telemetry, and feedback into a single operational system.
Detection engineering hasn’t fully made that shift yet.
Today, detections still live as disconnected artifacts. Validation happens outside the system. Testing is episodic. Health is inferred indirectly. And failure is often discovered only after an attacker, a red team, or an incident response report tells you what should have been obvious earlier.
This is why simplicity matters.
Not simplicity in ambition or coverage, but simplicity in how systems are structured. Fewer handoffs. Fewer disconnected tools. Clearer guarantees about what is running, what is working, and what is at risk.
When detection lifecycles are unified inside a single platform, workflows stop being checklists and start becoming systems. Context is preserved. History is visible. Change becomes safer.
Only then does AI reach its full potential.
Not as a replacement for detection engineers, and not as a black-box decision-maker, but as an agentic assistant operating across the entire detection lifecycle. One that helps engineers reason about changes, validate assumptions continuously, surface risk early, and move faster without breaking trust.
As Dijkstra warned decades ago, reliability begins with simplicity.
In detection engineering, that simplicity doesn’t come from doing less. It comes from finally connecting the pieces that were always meant to work together.