I saw a pull request last Tuesday that made me want to close my laptop and go live in the woods. It was a classic “event-driven” architecture mistake, the kind that looks efficient on a whiteboard but turns into a distributed monolith the second you deploy it to production.
The developer had hooked up Debezium to our user service’s Postgres instance and was piping the raw Change Data Capture (CDC) logs directly into a public Kafka topic for the billing team to consume. “It’s real-time,” they said. “It’s decoupled.”
No. It’s not decoupled. You just took your database schema, signed it in blood, and handed it to another team as a binding contract. And if I rename a column in my table next month, the billing service crashes. That’s the opposite of decoupled.
We need to talk about the difference between internal events and external events, because treating them as the same thing is why so many microservices architectures end up being a nightmare to maintain.
The “Just Expose It” Trap
Here’s the scenario. You’re building an Order Service. When an order is placed, you need to let the Shipping Service know. The lazy way — and I’ve done this, so I’m not judging too hard — is to just emit an event whenever your Order aggregate changes.
If you’re using a framework that auto-magically publishes domain events, it might look like this:
```json
{
  "topic": "orders.events",
  "event_type": "OrderUpdated",
  "payload": {
    "id": "ord_8723",
    "status": "PAID",
    "customer_id": 992,
    "internal_flags": 4,
    "version": 12,
    "updated_at": "2026-02-26T14:30:00Z",
    "db_shard": "us-east-1a"
  }
}
```
See the problem? internal_flags? version? db_shard? None of this is the Shipping Service’s business. This is implementation detail. This is an internal event.
Internal events are for you. They represent the granular steps your service takes to do its job. They are chatty, specific, and tied tightly to your domain logic. If you refactor your code, these events probably change.
External events (or integration events) are for others. They are a promise. A contract. They say, “Hey, the world has changed in this specific way.”
The Refactoring Nightmare
Let’s say six months from now, I decide to split the Order table. Maybe I move status into a separate lifecycle table for performance reasons. I was running Postgres 17.2 locally and realized the locking contention on the main table was killing us during the last holiday spike.
But if I exposed that internal OrderUpdated event directly to the public, I can’t change my database schema without breaking the Shipping Service, the Email Service, and that weird Analytics script Dave wrote three years ago.
I am now paralyzed. I cannot improve my service’s internals because my internals have leaked out.
Designing the Integration Layer
The fix isn’t complicated, but it requires discipline. You need a translation layer. You need to explicitly design your public API events just like you design your REST or gRPC endpoints.
An external event should look like this:
```json
{
  "topic": "public.orders",
  "event_type": "OrderPlaced",
  "payload": {
    "order_id": "ord_8723",
    "customer_reference": "cust_992",
    "items": [
      {"sku": "A-123", "qty": 1}
    ],
    "shipping_address": {...}
  }
}
```
Notice what’s missing? No internal versions. No database flags. Just the intent. “An order was placed.”
And if I change my underlying database from Postgres to Mongo next week, this event doesn’t change. The consumers don’t care. That is actual decoupling.
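In code, that translation layer is nothing exotic: an explicit mapping function sitting at your service boundary. Here's a minimal Python sketch using the field names from the examples above; the function name and the placeholder handling of items and address are mine, not from any particular framework:

```python
def to_public_event(internal: dict) -> dict:
    """Translate the internal OrderUpdated payload into the public
    OrderPlaced contract. Internal-only fields (version, db_shard,
    internal_flags) are deliberately dropped, never forwarded."""
    payload = internal["payload"]
    return {
        "topic": "public.orders",
        "event_type": "OrderPlaced",
        "payload": {
            "order_id": payload["id"],
            "customer_reference": f"cust_{payload['customer_id']}",
            # In a real service, items and shipping_address would be
            # loaded from the order itself, not from the change event.
            "items": payload.get("items", []),
            "shipping_address": payload.get("shipping_address"),
        },
    }
```

The point is that this function is the only place in the codebase that knows both schemas. When the internal shape changes, you update one mapping and the contract stays put.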
The “Outbox” Pattern is Your Friend
So how do you implement this without race conditions? You don’t want to commit to the database and then try to publish to Kafka, because if the publisher fails, your system is in an inconsistent state.
I’ve been burned by this. The fix is the Transactional Outbox pattern:
- Start a database transaction.
- Update your domain state (e.g., the orders table).
- Insert the integration event into a separate outbox table in the same transaction.
- Commit.
Then, you have a separate process (a “relay”) that reads from the outbox table and pushes to your message broker. If the broker is down, you just retry. You never lose an event.
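The steps above fit in a few lines. This sketch uses SQLite standing in for Postgres and a plain callback standing in for a Kafka producer; the table and function names are illustrative, not from any library:

```python
import json
import sqlite3


def place_order(conn: sqlite3.Connection, order_id: str, customer_ref: str) -> None:
    """Write the domain state AND the integration event in one transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'PAID')", (order_id,))
        event = {
            "event_type": "OrderPlaced",
            "payload": {"order_id": order_id, "customer_reference": customer_ref},
        }
        conn.execute(
            "INSERT INTO outbox (topic, body) VALUES (?, ?)",
            ("public.orders", json.dumps(event)),
        )


def relay(conn: sqlite3.Connection, publish) -> None:
    """The separate relay process: drain the outbox and push to the broker.
    If publish() raises (broker down), the row survives and we retry later."""
    rows = conn.execute("SELECT rowid, topic, body FROM outbox ORDER BY rowid").fetchall()
    for rowid, topic, body in rows:
        publish(topic, body)  # may raise; the row is only deleted after success
        with conn:
            conn.execute("DELETE FROM outbox WHERE rowid = ?", (rowid,))
```

Because the outbox insert shares the domain transaction, you get at-least-once delivery for free; consumers just need to tolerate the occasional duplicate.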
When to Break the Rules
Look, I’m not a purist. Sometimes you don’t need this level of separation. But the moment an event crosses a team boundary? It’s an API. Treat it with the same respect you treat your HTTP endpoints.
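One cheap way to enforce that respect is a contract check in the publisher's CI that fails the build the moment an internal field leaks into a public payload. A minimal sketch, with an illustrative field list for the OrderPlaced contract:

```python
# The published OrderPlaced contract. Change this set deliberately,
# with a version bump and a heads-up to consumers -- never by accident.
PUBLIC_ORDER_PLACED_FIELDS = {"order_id", "customer_reference", "items", "shipping_address"}


def check_contract(payload: dict) -> None:
    """Raise if the payload carries anything outside the published contract."""
    leaked = set(payload) - PUBLIC_ORDER_PLACED_FIELDS
    if leaked:
        raise ValueError(f"internal fields leaked into public event: {sorted(leaked)}")
```

In practice you'd reach for a real schema registry or JSON Schema validation, but even this one assertion catches the "just expose it" regression before it reaches a consumer.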
Summary (Or: What I Told the Junior Dev)
I rejected that PR. Not to be mean, but to save us from a 3 AM pager duty call six months from now.
We ended up building a small publisher service that listens to the raw CDC stream, filters out the noise, transforms the schema into a stable contract, and publishes that to the public topic. It added maybe 5 milliseconds of latency. In exchange, we can now rewrite our entire user service backend without asking the billing team for permission.
Worth it.
