There's a pattern I've seen repeat across enough organisations to call it a trend. A team adopts event-driven architecture — usually during a migration away from a monolith — and within six months, everything is an event. Service-to-service communication? Events. User actions? Events. State transitions that used to be a synchronous call with a clear response? Events. The team has internalised “event-driven” as a philosophy rather than a tool, and the result is a system that's harder to reason about than the monolith it replaced.
Event-driven architecture is powerful. It's also one of the most consistently over-applied patterns in enterprise software. The failure mode isn't choosing events — it's choosing events for everything.
The distributed monolith, event edition
The promise of event-driven architecture is decoupling. Services publish facts about what happened. Other services subscribe to those facts and react according to their own logic. No service needs to know who's listening. The coupling is loose, the boundaries are clean, and each service can evolve independently.
That's the theory. In practice, when teams default to events for all inter-service communication, something else happens. The coupling doesn't disappear — it moves. Instead of explicit service-to-service calls with clear contracts, you get implicit dependencies mediated by event schemas. Service A publishes an event. Services B, C, and D consume it. Nobody owns the contract in between. When Service A changes the shape of the event, all three consumers are affected — but the dependency isn't visible in any service's codebase. It's in the event bus, in the schema registry if you have one, and in the runtime behaviour of systems that were supposed to be independent.
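The schema-coupling failure is easy to sketch. A hypothetical minimal example (dict payloads and field names invented for illustration), assuming no schema registry is enforcing the contract:

```python
# Service A's original event shape; consumers were written against it.
order_placed_v1 = {"order_id": "ord-123", "total": 4999}

def fulfilment_handler(event: dict) -> int:
    # Implicit dependency: nothing in this codebase declares that
    # "total" must exist, or who is allowed to change it.
    return event["total"]

# Service A renames a field. No compiler, no contract, no visible break --
# the failure surfaces only at runtime, in someone else's service.
order_placed_v2 = {"order_id": "ord-123", "total_cents": 4999}

fulfilment_handler(order_placed_v1)  # still works
try:
    fulfilment_handler(order_placed_v2)
except KeyError as exc:
    broken_field = exc.args[0]  # the missing "total" key
```

Nothing in Service A's repository, or in the consumer's, would have flagged this before deployment.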
The monolith at least had the courtesy of making its coupling visible in the code.
Events are facts, not instructions
The most useful distinction I've found is this: events should describe things that have happened, not instruct other services on what to do. An event is a fact. “OrderPlaced.” “PaymentReceived.” “InventoryReserved.” These are statements about state transitions in the publishing service's domain. They carry meaning regardless of who's listening.
The failure mode is when events become disguised commands. “ProcessPayment.” “ReserveInventory.” “SendConfirmationEmail.” These aren't facts — they're instructions wrapped in event syntax. The publishing service isn't describing what happened in its own domain. It's telling another service what to do. That's a command, and commands have different requirements: they need acknowledgement, they need error handling, and they need the caller to know what happens when the command fails.
If the consumer needs to report an error back, there's no natural channel for it. If the operation needs to happen exactly once, you're now building idempotency guarantees on top of a pattern that wasn't designed for them.
This isn't an argument against asynchronous communication. It's an argument for using the right pattern for the interaction. Some interactions are genuinely fire-and-forget — a domain event that multiple services may or may not care about. Others are request-response — one service asking another to do something and needing to know the outcome. Forcing the second into the shape of the first creates complexity that serves nobody.
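In code, the distinction is mostly naming and direction. A minimal sketch with hypothetical types — `OrderPlaced` and `process_payment` are invented for illustration, not a prescribed API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A fact: a past-tense statement about the publisher's own domain.
# It carries meaning whether zero or ten services are listening.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    total_cents: int
    occurred_at: datetime

# "ProcessPayment" belongs in request-response instead: the caller
# needs the outcome and must handle failure explicitly.
def process_payment(order_id: str, amount_cents: int) -> bool:
    """Stand-in for a synchronous call to a payment service."""
    return amount_cents > 0

event = OrderPlaced("ord-123", 4999, datetime.now(timezone.utc))
payment_ok = process_payment(event.order_id, event.total_cents)
```

The event is immutable and self-describing; the command is a function call whose return value forces the caller to confront the failure path.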
Where events earn their keep
Events are the right choice when the interaction has three characteristics: the publisher genuinely doesn't need to know who consumes the event, the consumers can handle the event independently without coordinating with each other, and eventual consistency is acceptable for the business process in question.
Domain events between bounded contexts are the clearest example. When the Order service publishes “OrderPlaced,” the Fulfilment service, the Notification service, and the Analytics service can each react independently. The Order service doesn't need to know they exist. If a new service wants to react to the same event next year, nothing changes in the publisher. That's real decoupling. That's the pattern working as designed.
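The decoupling is visible even in a toy in-process bus. This is a sketch, not a production broker; the class and topic names are invented:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """In-memory stand-in for a real broker (Kafka, SNS, RabbitMQ, ...)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The publisher delivers to whoever happens to be subscribed;
        # it never names them.
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
shipments: list[str] = []
emails: list[str] = []

# Fulfilment and Notification subscribe independently of each other:
bus.subscribe("OrderPlaced", lambda e: shipments.append(e["order_id"]))
bus.subscribe("OrderPlaced", lambda e: emails.append(e["order_id"]))

# The Order service publishes a fact, knowing nothing about the consumers:
bus.publish("OrderPlaced", {"order_id": "ord-123"})
```

Adding a fourth consumer next year is one more `subscribe` call; the publisher's code never changes.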
Event sourcing — capturing state changes as an immutable sequence of events — is another strong use case, but a specialised one. It gives you a complete audit trail and the ability to reconstruct state at any point in time. It's powerful for domains where auditability and temporal queries matter. It's expensive and complex for domains where they don't. Choosing event sourcing because it's intellectually elegant rather than because the domain requires it is one of the more costly over-applications I've seen.
Change data capture and integration boundaries also benefit from events. When you need to propagate state changes across systems — particularly across team or organisational boundaries — events provide a natural integration surface that doesn't require the systems to be available simultaneously.
Where events cost more than they're worth
Events are the wrong choice when the publisher needs to know the outcome of the operation. If a user submits an order and the system needs to confirm the payment was processed before acknowledging the order, that's a synchronous interaction. Modelling it as a chain of events — “OrderSubmitted” triggers “ProcessPayment” which triggers “PaymentProcessed” which triggers “ConfirmOrder” — creates an asynchronous saga for something that could have been a straightforward request-response call. You've added infrastructure, latency, failure modes, and debugging complexity without gaining any meaningful decoupling.
Events are also the wrong choice when ordering and exactly-once processing are hard requirements. Events can arrive out of order. Consumers can receive duplicates. These are solvable problems, but solving them adds significant complexity. If the business process requires strict ordering and exactly-once semantics, a synchronous call with a well-defined contract is simpler, cheaper, and easier to debug.
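That complexity has a concrete shape. Under at-least-once delivery, every consumer needs deduplication along roughly these lines — a sketch with hypothetical field names; real implementations persist the seen-set (ideally in the same transaction as the side effect) so it survives restarts:

```python
from typing import Callable

class IdempotentConsumer:
    """Drops redundant deliveries by remembering processed event IDs."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: set[str] = set()  # in-memory only; a sketch

    def on_event(self, event: dict) -> bool:
        event_id = event["event_id"]
        if event_id in self._seen:
            return False  # duplicate delivery from the broker; skip
        self._seen.add(event_id)
        self._handler(event)
        return True

charges: list[int] = []
consumer = IdempotentConsumer(lambda e: charges.append(e["amount_cents"]))

delivery = {"event_id": "evt-1", "amount_cents": 4999}
consumer.on_event(delivery)  # processed
consumer.on_event(delivery)  # redelivered; ignored
```

A synchronous call with a well-defined contract needs none of this machinery.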
And events are almost always the wrong choice for simple CRUD operations within a single bounded context. If Service A needs to read data from Service B, an API call with a clear contract is simpler and more legible than publishing a request event and subscribing to a response event. The event pattern adds indirection without adding value.
The observability cost
Every event in the system is a decision point you need to be able to trace. When a customer reports that they never received their order confirmation, you need to reconstruct the chain: did the Order service publish the event? Did the Notification service receive it? Did it process it successfully? Did the email provider accept the request?
In a synchronous call chain, this trace is relatively straightforward. Each service calls the next, and the correlation ID propagates through the request path. In an event-driven system, the trace is fragmented across publishers, brokers, and consumers. Each hop is a potential gap in observability.
This isn't an argument against events. It's an argument for being deliberate about where you introduce them, because every event-driven interaction adds observability overhead that compounds across the system.
The teams that use events well invest heavily in correlation IDs, structured event metadata, dead letter queues with alerting, and consumer lag monitoring. The teams that over-use events find themselves drowning in tracing complexity for interactions that never needed to be asynchronous in the first place.
Events and AI: legibility matters more than you think
There's an emerging dimension to this that most teams aren't considering yet. AI agents are becoming active participants in enterprise systems — not just generating code, but operating within architectures: triaging incidents, automating workflows, orchestrating processes across service boundaries.
Through my work on the Model Context Protocol, I see what AI agents actually require from the systems they interact with. They need the same things human engineers need, but they're less forgiving when those things are missing: typed contracts, explicit schemas, observable state transitions, and traceable causality chains.
A well-implemented event-driven system gives AI tooling a powerful foundation. Typed event schemas tell an agent exactly what data is available at each stage of a process. Correlation IDs let an agent trace a customer journey across services. Explicit domain events — “OrderPlaced,” “PaymentReceived” — give an agent a vocabulary for understanding what the system is doing and why.
A poorly implemented one gives them very little: untyped payloads, implicit dependencies, causality chains that break at every hop. AI agents operating in that environment will make the same locally reasonable, architecturally inconsistent decisions that under-informed humans make, but at a much higher velocity.
This isn't a new argument for events or against them. It's the same argument this article has been making — use events deliberately, instrument them properly, keep the contracts explicit — with one additional reason to get it right. The systems you're building now will have AI agents as first-class participants sooner than most roadmaps assume. The discipline you apply to event design today determines whether those agents cut through the chaos or add to it.
Drawing the line
The question isn't “should we use events?” It's “which interactions benefit from the trade-offs that events introduce?” Every event adds asynchrony, eventual consistency, and observability overhead. Those trade-offs are worth it when the interaction genuinely benefits from loose coupling and independent processing. They're not worth it when a synchronous call would be simpler, faster, and easier to debug.
A useful heuristic: if the publisher needs to know the outcome, it's not an event — it's a command or a query. Use a synchronous pattern. If the publisher is stating a fact about its own domain and doesn't care who reacts, that's an event. If you're unsure, start synchronous. You can always introduce events later when the coupling becomes a real constraint. Removing events that shouldn't have been events is significantly harder than adding them where they're needed.
Event-driven architecture is a powerful tool for specific problems: cross-boundary integration, domain event propagation, and decoupling services that genuinely don't need to know about each other.
The architecture should be event-driven where events are the right pattern. It shouldn't be event-first.