Exactly-once isn't a myth — it's discipline

Lilit Hovhannisyan

Ask an experienced distributed-systems engineer about “exactly-once delivery” and you’ll often get a knowing smile: at the network level, it doesn’t exist. Messages get lost, retried, and redelivered. A client double-clicks. A gateway times out and your service never finds out whether the call landed.

And yet, when the message is “move $500”, the result has to be exactly once. The money moves a single time, or people stop trusting the system. The trick is that exactly-once isn’t a property of the network; it’s a property you build, at the layer where it matters.

Idempotency is the foundation

Every state-changing operation in a payment system should carry a client-supplied idempotency key: a unique identifier for that specific intent. The rule is simple and absolute:

The same idempotency key must take effect exactly once and return the same result every time, no matter how many times the request arrives.

The first request with a given key does the work and records the outcome. Every subsequent request with that key returns the recorded outcome — it does not redo the work. A retried “charge $500” returns the original receipt; it does not charge again.

This one discipline turns an unreliable, at-least-once world into an effectively exactly-once one, without requiring the impossible from the network.
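
To make the rule concrete, here is a minimal in-memory sketch in Python. Everything named here (`IdempotentCharger`, `Receipt`, the `do_charge` callback) is hypothetical and only illustrates the idea; a production service would store receipts in a database with a unique constraint on the key rather than in a dictionary behind a lock.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Receipt:
    idempotency_key: str
    amount_cents: int
    charge_id: str


class IdempotentCharger:
    """Runs the charge once per idempotency key and replays the receipt after that."""

    def __init__(self, do_charge: Callable[[int], str]) -> None:
        self._do_charge = do_charge  # hypothetical side-effecting call
        self._receipts: Dict[str, Receipt] = {}
        self._lock = threading.Lock()

    def charge(self, idempotency_key: str, amount_cents: int) -> Receipt:
        # The lock keeps "check for a receipt" and "record a receipt" together,
        # so two concurrent retries cannot both do the work.
        with self._lock:
            recorded = self._receipts.get(idempotency_key)
            if recorded is not None:
                return recorded  # retry: return the original receipt
            charge_id = self._do_charge(amount_cents)
            receipt = Receipt(idempotency_key, amount_cents, charge_id)
            self._receipts[idempotency_key] = receipt
            return receipt
```

Calling `charge("key-123", 50_000)` twice invokes `do_charge` once; the second call returns the receipt recorded by the first.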

Make the ledger the source of truth

Idempotency keys protect the entrance. The ledger protects the truth. We build on a few non-negotiables:

  • Double-entry accounting. Every movement is a balanced pair of entries. The books must sum to zero. If they don’t, something is wrong — and you find out immediately, not at month-end.
  • Append-only history. You never edit a posted entry. Corrections are new entries. The full history is immutable and auditable.
  • Atomic balance updates. The debit, the credit, and the idempotency record commit in a single transaction. Either all of it happened, or none of it did.
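
The three rules above can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not a reference design: the table names, the sign convention (negative for money leaving an account, positive for money arriving), and the choice to derive balances by summing entries are all made up for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE idempotency (key TEXT PRIMARY KEY, transfer_id INTEGER NOT NULL);
CREATE TABLE entries (
    id INTEGER PRIMARY KEY,
    transfer_id INTEGER NOT NULL,
    account TEXT NOT NULL,
    amount_cents INTEGER NOT NULL  -- negative: leaves account, positive: arrives
);
-- Append-only: reject any attempt to edit or delete a posted entry.
CREATE TRIGGER entries_no_update BEFORE UPDATE ON entries
BEGIN SELECT RAISE(ABORT, 'entries are append-only'); END;
CREATE TRIGGER entries_no_delete BEFORE DELETE ON entries
BEGIN SELECT RAISE(ABORT, 'entries are append-only'); END;
"""


def open_ledger() -> sqlite3.Connection:
    # isolation_level=None: no implicit transactions, we BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(SCHEMA)
    return conn


def transfer(conn: sqlite3.Connection, key: str, src: str, dst: str,
             amount_cents: int) -> int:
    """Post one balanced pair of entries per idempotency key; return the transfer id."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT transfer_id FROM idempotency WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            conn.execute("ROLLBACK")  # nothing to write: this is a retry
            return row[0]
        transfer_id = conn.execute(
            "SELECT COALESCE(MAX(transfer_id), 0) + 1 FROM entries"
        ).fetchone()[0]
        # Both entries and the idempotency record commit together or not at all.
        conn.executemany(
            "INSERT INTO entries (transfer_id, account, amount_cents) VALUES (?, ?, ?)",
            [(transfer_id, src, -amount_cents), (transfer_id, dst, amount_cents)],
        )
        conn.execute(
            "INSERT INTO idempotency (key, transfer_id) VALUES (?, ?)",
            (key, transfer_id),
        )
        conn.execute("COMMIT")
        return transfer_id
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


def balance(conn: sqlite3.Connection, account: str) -> int:
    return conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM entries WHERE account = ?",
        (account,),
    ).fetchone()[0]


def books_sum(conn: sqlite3.Connection) -> int:
    # With balanced pairs this is zero; anything else signals a bug.
    return conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM entries"
    ).fetchone()[0]
```

Because balances are computed from the entries rather than stored separately, there is no second number that can drift out of step with the history.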

Handle the in-between state honestly

The hardest cases live in the gaps: the request that timed out, the worker that died mid-transaction, the callback that never came. The answer is to make “unknown” a first-class state rather than a guess.

  1. Record an operation as pending before you attempt it, keyed by its idempotency key.
  2. Attempt the work. On success, transition the record to completed in the same transaction that commits the work.
  3. If you crash in between, recovery re-reads the pending record and either completes it or marks it failed — deterministically, using the key to avoid doing it twice.

A process that can be safely re-run from any point is one you can recover without fear. That’s what lets you retry aggressively instead of hoping.
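
The three steps can be sketched as a small state machine, again using Python's `sqlite3`. The schema and function names are hypothetical, and the sketch assumes the "work" is posting ledger entries in the same database, which is what lets the status change share a transaction with it. The rule that a non-positive amount is marked failed is invented purely to show the failed branch.

```python
import sqlite3
from typing import List


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
    conn.executescript("""
        CREATE TABLE operations (
            key TEXT PRIMARY KEY,
            status TEXT NOT NULL
                CHECK (status IN ('pending', 'completed', 'failed')),
            src TEXT NOT NULL,
            dst TEXT NOT NULL,
            amount_cents INTEGER NOT NULL
        );
        CREATE TABLE entries (
            op_key TEXT NOT NULL,
            account TEXT NOT NULL,
            amount_cents INTEGER NOT NULL
        );
    """)
    return conn


def record_pending(conn: sqlite3.Connection, key: str, src: str, dst: str,
                   amount_cents: int) -> None:
    # Step 1: write the intent first. INSERT OR IGNORE makes a retry a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO operations VALUES (?, 'pending', ?, ?, ?)",
        (key, src, dst, amount_cents),
    )


def complete(conn: sqlite3.Connection, key: str) -> str:
    """Step 2: do the work and flip the status in one transaction."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT status, src, dst, amount_cents FROM operations WHERE key = ?",
            (key,),
        ).fetchone()
        if row is None:
            raise KeyError(key)
        status, src, dst, amount = row
        if status == "pending":
            if amount <= 0:
                status = "failed"  # invented rule, for illustration only
            else:
                conn.executemany(
                    "INSERT INTO entries VALUES (?, ?, ?)",
                    [(key, src, -amount), (key, dst, amount)],
                )
                status = "completed"
            conn.execute(
                "UPDATE operations SET status = ? WHERE key = ?", (status, key)
            )
        # Already completed or failed: nothing is redone, the status is returned.
        conn.execute("COMMIT")
        return status
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


def recover(conn: sqlite3.Connection) -> List[str]:
    """Step 3: after a crash, re-drive every operation still marked pending."""
    keys = [row[0] for row in conn.execute(
        "SELECT key FROM operations WHERE status = 'pending' ORDER BY key"
    )]
    for key in keys:
        complete(conn, key)
    return keys
```

Running `recover` twice in a row is harmless: the second pass finds nothing pending, and calling `complete` on a finished operation only reports its status.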

Reconcile, always

Even with all of the above, you reconcile — continuously, automatically, and against external sources of truth like your payment providers. Reconciliation isn’t an admission that the system is wrong; it’s the control that proves it’s right. When every balance ties out to the cent, every day, you’ve earned the one thing a money system can’t function without: the confidence that the numbers are real.
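
At its core, a reconciliation pass is a comparison of two views of the same transactions. The sketch below assumes both sides can be reduced to a mapping from a shared reference to an amount in cents; the `Mismatch` type and the reference format are hypothetical, and a real provider report would need parsing and normalizing first.

```python
from typing import Dict, List, NamedTuple, Optional


class Mismatch(NamedTuple):
    reference: str
    ledger_cents: Optional[int]    # None: missing from our ledger
    provider_cents: Optional[int]  # None: missing from the provider report


def reconcile(ledger: Dict[str, int], provider: Dict[str, int]) -> List[Mismatch]:
    """Return every reference where the two sides disagree or one side is missing."""
    mismatches = []
    for ref in sorted(ledger.keys() | provider.keys()):
        ours, theirs = ledger.get(ref), provider.get(ref)
        if ours != theirs:
            mismatches.append(Mismatch(ref, ours, theirs))
    return mismatches
```

An empty list means the two sides tie out; anything else is a specific item to investigate rather than a vague feeling that the totals are off.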

Exactly-once, in the end, isn’t magic. It’s idempotency keys, an honest ledger, recoverable state, and the humility to keep checking. Discipline, applied everywhere it counts.