Delivery semantics: at-least-once, exactly-once, and how it's actually built

Walk me through it

Imagine texting a friend "I sent the money." The text might not arrive (network drops it). So you set your phone to resend until you get a reply. Now a new problem: maybe the first text arrived fine and only the reply got lost — so your friend gets the same message twice. You've traded "might lose it" for "might duplicate it." There is no third option for free: a system that retries to avoid loss will, sooner or later, send a duplicate.

That's the entire delivery-semantics problem in one anecdote. In streaming, a "message" might be a ShopFlow paid event (see Meet ShopFlow), and a duplicate might be that order's revenue counted twice in fact_revenue_1m. This lesson defines the three guarantees, shows why the easy ones are easy and the hard one is hard, and then builds exactly-once out of four concrete parts — because exactly-once isn't a checkbox you flip, it's a chain, and a guide that hand-waves it leaves you with silent double-counts in production.

The three guarantees

When an event flows source → processor → sink, how many times can it affect the result?

At-most-once — each event is processed zero or one times. It may be lost, never duplicated. (Commit your progress before you finish processing; a crash in between drops the event.) Acceptable only when occasional loss is fine — some metrics, some telemetry.
At-least-once — each event is processed one or more times. It's never lost, but may be duplicated. (Process first, commit progress after; a crash before the commit means you reprocess on restart.) This is the common default and is correct for any operation that doesn't mind seeing an event twice.
Exactly-once — each event affects the result exactly one time, even across failures. Never lost, never double-counted. The gold standard — and the expensive one.

The single design knob behind at-most vs at-least-once is when you commit your offset relative to doing the work — commit-then-work risks loss; work-then-commit risks duplication. Exactly-once needs more than knob-twiddling.
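To see the knob in action, here is a minimal sketch of one consumer loop run with both orderings and a simulated crash. The function name `run`, its parameters, and the in-memory "offset" are hypothetical and stand in for a real consumer and a durable offset store.

```python
def run(events, commit_first, crash_at):
    """Process events with one simulated crash, then restart from the committed offset.

    Returns the events whose work actually took effect, in order.
    """
    applied = []    # side effects that happened (the "work")
    committed = 0   # durable offset: index of the next event to read after a restart

    def attempt(start, crash_index):
        nonlocal committed
        for i in range(start, len(events)):
            if commit_first:
                committed = i + 1          # commit, then work
                if i == crash_index:
                    return False           # crash between the commit and the work
                applied.append(events[i])
            else:
                applied.append(events[i])  # work, then commit
                if i == crash_index:
                    return False           # crash between the work and the commit
                committed = i + 1
        return True

    if not attempt(0, crash_at):
        attempt(committed, None)           # restart resumes at the last committed offset
    return applied


at_most_once = run(["a", "b", "c"], commit_first=True, crash_at=1)
at_least_once = run(["a", "b", "c"], commit_first=False, crash_at=1)
```

With the crash placed on the second event, commit-then-work never applies "b", while work-then-commit applies it twice. Nothing else differs between the two runs.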

Why "exactly-once" is hard — and what it really means

The naive hope is "just don't send duplicates." But you must retry (to avoid loss), retries cause duplicates, and the processor can crash after a side effect but before recording that it happened. So duplicates are inevitable at the transport layer. The trick is to make them not matter:

Exactly-once doesn't mean "delivered exactly once on the wire." It means "the effect is applied exactly once." Duplicates may be produced; the system neutralizes them so the end result is as if each event counted once. The accurate term many engines use is exactly-once processing / effective exactly-once.

Achieving that end-to-end takes a chain of four cooperating mechanisms. Break any link and you fall back to at-least-once.

Part 1 — The idempotent producer (no duplicates into the log)

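First problem: a producer writes to Kafka, the write succeeds, but the acknowledgment is lost, so the producer retries, and now the same record is in the log twice. An idempotent producer fixes this: the broker assigns each producer an ID, the producer tags the records it sends to each partition with increasing sequence numbers, and the broker recognizes and discards a retry whose sequence number it has already written. The result: even with retries, each record appears in the partition exactly once. In Kafka this is the enable.idempotence producer setting, which recent client versions turn on by default, and it is the foundation everything else stands on. Note the scope: it covers the producer's own retries within one session, not an application that sends the same event twice as two separate records. (Idempotent = doing it twice has the same effect as doing it once.)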
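The broker-side check can be sketched in a few lines. This is a toy model, not Kafka's implementation: the class `Partition`, its `append` method, and the return strings are hypothetical, and a real broker also rejects out-of-order sequence numbers, which the sketch skips.

```python
class Partition:
    """Toy partition log that drops retried writes using (producer_id, sequence)."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number appended so far

    def append(self, producer_id, seq, value):
        # A sequence number at or below the last one written means this is a retry.
        if seq <= self.last_seq.get(producer_id, -1):
            return "duplicate-ignored"
        self.log.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


partition = Partition()
first = partition.append("producer-1", 0, "evt-9f3")   # write succeeds, ack is lost
retry = partition.append("producer-1", 0, "evt-9f3")   # producer resends, same sequence
following = partition.append("producer-1", 1, "evt-a17")
```

The retry is acknowledged but not appended, so the log holds "evt-9f3" once. The second event ID is made up for the example.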

Part 2 — Transactions (atomic multi-write across partitions)

A stream processor typically does: read input records → update state → write output records → commit input offsets. For exactly-once, all of that must be all-or-nothing. Kafka transactions let a processor group writes (the output records and the consumed input offsets) into a single atomic unit that either all commits or all aborts. Consumers configured with isolation.level=read_committed see a transaction's records only after it commits; records from aborted transactions are never delivered to them. So a crash mid-transaction leaves no partial output: on restart, the aborted attempt is invisible and the processor cleanly retries. This is what ties "I produced this output" and "I consumed this input" together so they can't drift apart. The atomicity covers writes to Kafka only; a write to an external system is outside the transaction, which is the problem Part 3 takes up.
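A toy model makes the visibility rule concrete. Everything here (`TxLog`, `process`, the record format) is hypothetical and single-threaded. It only illustrates that output records and the offset advance become visible together or not at all, and does not model Kafka's transaction coordinator.

```python
class TxLog:
    """Toy log whose records stay hidden until their transaction commits."""

    def __init__(self):
        self.records = []        # (txn_id, value) in append order, including uncommitted
        self.committed = set()   # ids of transactions that committed
        self.input_offset = 0    # consumer position, advanced only by a commit

    def read_committed(self):
        return [value for txn_id, value in self.records if txn_id in self.committed]


def process(log, txn_id, outputs, next_offset, crash_before_commit=False):
    for value in outputs:
        log.records.append((txn_id, value))   # written, but hidden until commit
    if crash_before_commit:
        return                                # the transaction never commits
    log.committed.add(txn_id)                 # outputs become visible ...
    log.input_offset = next_offset            # ... together with the offset advance


log = TxLog()
process(log, "txn-1", ["10:01=65"], next_offset=1, crash_before_commit=True)
after_crash = (log.read_committed(), log.input_offset)
process(log, "txn-2", ["10:01=65"], next_offset=1)
after_retry = (log.read_committed(), log.input_offset)
```

After the crash a committed reader sees nothing and the offset has not moved, so the retry starts from the same input. The aborted record is still physically in the log; it is simply never shown.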

Part 3 — The transactional sink / two-phase commit (exactly-once to the outside world)

The hardest link: the sink — the external system you write results to (a database, a warehouse, another Kafka topic, a lakehouse table). Internal checkpoints (lesson 9.4) make the processor's own state exactly-once, but on recovery the processor replays, and replay re-emits output — so without care the sink gets the result twice. Two ways to make the sink exactly-once:

Idempotent sink: design the write so doing it twice equals doing it once. The classic move is to upsert by a key instead of append/INSERT. For ShopFlow's fact_revenue_1m (see Meet ShopFlow), the natural sink key is the window (the minute): upsert each row keyed on minute (or window + grouping key), so re-emitting minute 10:01 = $65 on replay overwrites the same row rather than adding a second. This works because the row carries the window's absolute total, recomputed from restored state; a sink that instead adds each emitted amount to the stored value would double-count on replay even with a key. Simple and robust when your sink supports keyed upserts.
Transactional sink via two-phase commit (2PC): coordinate the sink's commit with the processor's checkpoint so they're atomic. Two-phase commit is a protocol with two steps: (1) prepare, in which the sink writes the data but doesn't yet make it visible (a pending transaction); (2) commit, in which the engine tells the sink to make the data visible, and does so only once the corresponding checkpoint succeeds. If the job crashes between phases, recovery commits a pending transaction whose checkpoint completed and aborts one whose checkpoint did not, so the external write becomes visible exactly once, in lockstep with the checkpoint. The price is latency: output stays hidden until the checkpoint that covers it completes. (Flink's TwoPhaseCommitSinkFunction packages this pattern, and Flink's exactly-once Kafka sink builds it on Kafka's transactional producer.)
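The prepare/commit/recover cycle of the transactional sink can be sketched as follows. This is an in-memory illustration under stated assumptions, not Flink's API: `TwoPhaseSink`, `recover`, and the checkpoint IDs are hypothetical names, and a real sink would keep its pending transactions in durable storage.

```python
class TwoPhaseSink:
    """Toy sink: rows are staged per checkpoint and become visible only on commit."""

    def __init__(self):
        self.visible = []   # what downstream readers can see
        self.pending = {}   # checkpoint_id -> rows written but not yet visible

    def prepare(self, checkpoint_id, rows):
        # Phase 1: persist the rows in a pending transaction.
        self.pending[checkpoint_id] = list(rows)

    def commit(self, checkpoint_id):
        # Phase 2: publish. pop() makes a repeated commit a no-op.
        self.visible.extend(self.pending.pop(checkpoint_id, []))

    def abort(self, checkpoint_id):
        self.pending.pop(checkpoint_id, None)


def recover(sink, completed_checkpoints):
    """After a crash, commit transactions whose checkpoint completed, abort the rest."""
    for checkpoint_id in list(sink.pending):
        if checkpoint_id in completed_checkpoints:
            sink.commit(checkpoint_id)
        else:
            sink.abort(checkpoint_id)


sink = TwoPhaseSink()
sink.prepare(1, [("10:01", 65)])   # checkpoint 1 completes, crash before the commit call
sink.prepare(2, [("10:02", 40)])   # checkpoint 2 never completes
recover(sink, completed_checkpoints={1})
sink.commit(1)                     # a repeated commit after recovery changes nothing
```

Only the row tied to the completed checkpoint becomes visible, and it does so once. The row from checkpoint 2 is discarded and will be produced again when the processor replays from checkpoint 1.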

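The chain, end to end: idempotent producer (no dup into the log) → transactions (atomic read-process-write) → consistent checkpoints (exactly-once internal state, lesson 9.4) → transactional/idempotent sink (exactly-once external effect). All four, or you have at-least-once. Checkpoints get no part of their own here because lesson 9.4 covers them; Part 4 below is not a fifth link but a fallback for when one of these links is unavailable.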

Part 4 — Dedup on a key + event_id (the pragmatic fallback)

Full end-to-end transactions aren't always available — maybe a source can only give you at-least-once, or a sink can't do 2PC. The workhorse fallback is explicit deduplication: stamp every event at the source with a stable unique event_id (and usually its key), then have the processor or sink remember IDs it has already applied (in keyed state, lesson 9.4, or via a unique constraint / upsert in the sink) and skip any it has seen before.

This converts an at-least-once stream into an effectively exactly-once result: duplicates still arrive, but the second copy is recognized by its (key, event_id) and dropped. It's the most broadly applicable technique because it needs only a stable ID and a place to remember it, with no special transactional plumbing. The cost is the state to track seen IDs. Bound it with a time window ("dedup within the last N minutes"), since duplicates from retries arrive close together, and size the window with care: a duplicate that arrives after its ID has expired is treated as new.
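A minimal sketch of time-bounded dedup, assuming timestamps are passed in and never go backwards. `WindowedDeduper`, `is_new`, and the key "order-42" are hypothetical; in a stream processor this state would live in keyed state with a TTL rather than in a Python dict.

```python
from collections import OrderedDict


class WindowedDeduper:
    """Remembers (key, event_id) pairs for a limited time and flags repeats."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = OrderedDict()  # (key, event_id) -> time first seen, oldest first

    def is_new(self, key, event_id, now):
        # Evict entries older than the window so the state stays bounded.
        while self.seen:
            _, first_seen = next(iter(self.seen.items()))
            if now - first_seen <= self.ttl:
                break
            self.seen.popitem(last=False)
        ident = (key, event_id)
        if ident in self.seen:
            return False           # duplicate inside the window: skip it
        self.seen[ident] = now
        return True


dedup = WindowedDeduper(ttl_seconds=600)
first = dedup.is_new("order-42", "evt-9f3", now=0)
retry = dedup.is_new("order-42", "evt-9f3", now=30)     # retry shortly after
late = dedup.is_new("order-42", "evt-9f3", now=1000)    # arrives after the ID expired
```

The retry at 30 seconds is rejected, but the copy at 1000 seconds is accepted because its ID has already been evicted. That is the trade the window makes: bounded state in exchange for missing very late duplicates.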

A traced example: a duplicate `order_event` that won't double-count revenue

A processor consumes ShopFlow's paid events from order_events and writes fact_revenue_1m rows to the warehouse. We want no inflated revenue even if everything that can fail, fails — the duplicate-order_event problem: a retried producer or a replay re-delivers the same paid event, and a naive pipeline would add its amount to the minute twice.

Source stamps each event with a stable event_id (e.g. "evt-9f3"), the event_id field in the contract, so every copy of the same event carries the same ID.
Producer is idempotent → even after a retry, evt-9f3 is in order_events exactly once.
Processor reads evt-9f3, adds its order's amount to the minute window, upserts the fact_revenue_1m row keyed on the window/minute (idempotent sink), and then commits the input offset. The warehouse row and the offset live in different systems, so these are two separate writes and a crash can land between them.
Crash right after writing the row but before the offset commit is durable.
Restart: the processor restores its last checkpoint, rewinds Kafka, and replays — evt-9f3 is read again.
It recomputes the minute and writes again — but the upsert keyed on the minute overwrites the same row with the same total instead of adding a second contribution. Revenue counted once. ✓ (Where the source can't guarantee no duplicates, the processor also dedups on (order_id, event_id) so a re-delivered paid is recognized and skipped before it ever touches the window.)

Trace the failure modes: a lost ack? The idempotent producer absorbs it. A crash mid-write? The transaction/checkpoint plus the idempotent sink absorbs it. A duplicate paid from replay? The window-keyed upsert (or a dedup on (order_id, event_id)) absorbs it. Every path lands on the correct minute revenue. That's what "exactly-once" actually buys you — and why it takes four cooperating parts, not one setting.
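The trace can be replayed against an in-memory SQLite database standing in for the warehouse. This is a sketch under assumptions: the second table, the helpers `emit`, `process`, and `run_attempt`, the order IDs, the amounts, and the event ID "evt-a17" are all made up for illustration, and SQLite's INSERT OR REPLACE stands in for whatever upsert the real sink offers.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE fact_revenue_1m (minute TEXT PRIMARY KEY, revenue REAL)")
db.execute("CREATE TABLE fact_revenue_appended (minute TEXT, revenue REAL)")  # naive sink


def emit(minute, total):
    # Idempotent sink: write the window's absolute total, keyed on the minute.
    db.execute("INSERT OR REPLACE INTO fact_revenue_1m VALUES (?, ?)", (minute, total))
    # Naive sink for contrast: append a row on every emit.
    db.execute("INSERT INTO fact_revenue_appended VALUES (?, ?)", (minute, total))


def process(events, seen):
    """Sum paid amounts per minute, skipping (order_id, event_id) pairs already applied."""
    totals = {}
    for e in events:
        ident = (e["order_id"], e["event_id"])
        if ident in seen:
            continue
        seen.add(ident)
        totals[e["minute"]] = totals.get(e["minute"], 0.0) + e["amount"]
    return totals


events = [
    {"order_id": 42, "event_id": "evt-9f3", "minute": "10:01", "amount": 40.0},
    {"order_id": 43, "event_id": "evt-a17", "minute": "10:01", "amount": 25.0},
    {"order_id": 42, "event_id": "evt-9f3", "minute": "10:01", "amount": 40.0},  # re-delivered
]


def run_attempt():
    # Each attempt starts from the last checkpoint, which here predates all three events.
    for minute, total in process(events, seen=set()).items():
        emit(minute, total)


run_attempt()   # first attempt: row written, then crash before the offset commit
run_attempt()   # restart: replay from the checkpoint re-emits the same window

upserted = db.execute("SELECT minute, revenue FROM fact_revenue_1m").fetchall()
appended_total = db.execute("SELECT SUM(revenue) FROM fact_revenue_appended").fetchone()[0]
```

The upserted table ends with a single 10:01 row of 65, while the appended table sums to 130 for the same minute. The in-processor dedup handles the re-delivered event, and the keyed upsert handles the replay.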

Why it matters

Delivery semantics is where "it works in the demo" and "it's correct in production" diverge. At-least-once is the right, cheap default for anything duplicate-tolerant — and you can make almost anything duplicate-tolerant with an idempotent sink (upsert) or a dedup on (key, event_id), which is why those two tricks are the most valuable in this chapter. True end-to-end exactly-once exists (idempotent producer + transactions + consistent checkpoints + transactional sink), but it costs latency and complexity, so reach for it only where a duplicate is genuinely harmful (money, inventory, billing). The fatal mistake is assuming exactly-once because a checkbox said so, while the sink quietly appends duplicates.

Common pitfalls

Believing a checkpoint alone gives exactly-once. Checkpoints make the processor's internal state exactly-once, but on replay the sink still gets re-written. Without an idempotent or transactional sink, you have at-least-once with duplicates downstream.
Appending to the sink instead of upserting. INSERT on replay = duplicate rows. Upsert on a key (or a unique constraint) turns the replay into a harmless overwrite.
No stable event_id. Dedup needs a deterministic ID assigned at the source (not generated downstream, which changes on replay). Without it, you can't recognize a duplicate.
Unbounded dedup state. Remembering every event_id forever exhausts memory. Bound it to a time window matching how late duplicates realistically arrive.
Paying for exactly-once everywhere. It adds latency and operational cost. Use at-least-once + idempotency by default; reserve full transactions for the genuinely money-critical paths.
Assuming the source is exactly-once. Many sources (sensors, mobile, third-party webhooks) are at-least-once. Plan to dedup at your boundary regardless of what the upstream system promises.

→ Going deeper: delivery guarantees rest on the checkpoints and replay of lesson 9.4 and the ISR/acks durability of lesson 9.2. How these guarantees are operated when records break — dead-letter queues, replay, lag — is Streaming operations & failure modes. Where these exactly-once results land — the lakehouse — is Schema management & streaming architecture and Chapter 10.

← Related: idempotency here is the same idempotency principle as in Ingestion & integration.

