Iceberg, Delta Lake, and Hudi

Three open table formats dominate the lakehouse: Apache Iceberg, Delta Lake, and Apache Hudi. All three implement the idea from the last lesson: metadata over files, with atomic commits via a pointer swap or a log append. They differ in their history, in what their architecture emphasizes, and in what they're best at. This lesson gives you a working map of all three, then a candid view of where each shines and which one fits ShopFlow's fact_sales (see Meet ShopFlow).

They were born from different needs, and it shows:

Apache Iceberg came out of Netflix to fix correctness and scale problems with huge tables, including slow file-listing operations. Its emphasis is a clean, engine-agnostic table spec.
Delta Lake came out of Databricks (the Spark company) to add reliability to Spark on the lake. Its emphasis is a simple, ordered transaction log and tight engine integration.
Apache Hudi ("Hadoop Upserts Deletes and Incrementals") came out of Uber to make record-level updates and deletes on the lake fast. Its emphasis is streaming ingestion and upserts.

Apache Iceberg: the snapshot architecture

You already met Iceberg's metadata tree in lesson 10.2: catalog → metadata file → snapshot → manifest list → manifest files → data files. Each commit produces a new snapshot and atomically swaps the catalog's pointer to a new metadata file. Two features make Iceberg distinctive.

Hidden partitioning. In a classic lake, partitioning leaks into your data and your queries. You'd physically store fact_sales in folders like /order_date=2026-06-01/ and then analysts had to remember to filter on that exact column, written that exact way, or the engine scanned all of fact_sales. Iceberg instead records the partitioning as a transform in metadata — e.g. "partition by day(order_ts)" — and applies it for you. An analyst simply filters WHERE order_ts > '2026-06-01'; Iceberg figures out which partitions that touches. The partitioning is hidden: analysts don't need to know it exists, and they can't accidentally bypass it.
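The idea can be sketched in a few lines of plain Python. This is a toy model, not Iceberg's API: the file names, the `files` dictionary, and the helper names are all hypothetical. It only illustrates how a transform recorded in metadata lets the planner turn a filter on order_ts into a set of partitions.

```python
from datetime import date, datetime


def day_transform(ts: datetime) -> date:
    """Partition transform: map a timestamp to its calendar day."""
    return ts.date()


# Toy "metadata": each data file records the partition value it was written to.
# File names are hypothetical.
files = {
    "f1.parquet": date(2026, 5, 31),
    "f2.parquet": date(2026, 6, 1),
    "f3.parquet": date(2026, 6, 2),
}


def files_for_filter(lower_bound: datetime) -> list[str]:
    """Plan a scan for WHERE order_ts > lower_bound.

    Applying the transform to the bound gives the earliest day that can
    still contain matching rows; files for earlier days are skipped.
    """
    first_day = day_transform(lower_bound)
    return sorted(name for name, day in files.items() if day >= first_day)


selected = files_for_filter(datetime(2026, 6, 1))
print(selected)  # ['f2.parquet', 'f3.parquet']
```

The analyst's filter never mentions a partition column; the planner derives the partition predicate from the transform.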

Partition evolution. Because partitioning lives in metadata rather than in the physical folder layout, you can change it without rewriting the table. Started partitioning fact_sales by month and now it's too big, so you want daily? Change the partition spec; new order lines are written daily, old data keeps its monthly layout, and Iceberg tracks both. In a folder-based lake, changing partitioning meant rewriting every fact_sales file. This is a direct payoff of "partitioning is metadata, not directory names."
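A rough way to picture "Iceberg tracks both" is that each data file remembers which partition spec it was written under, and the planner applies the matching transform per file. The sketch below is a simplified illustration under that assumption; the spec ids, file paths, and function names are made up for the example and are not Iceberg's actual metadata layout.

```python
from datetime import date


def month_of(d: date) -> tuple[int, int]:
    """Original spec: partition by (year, month)."""
    return (d.year, d.month)


def day_of(d: date) -> date:
    """Newer spec: partition by calendar day."""
    return d


# Partition specs by id. Spec 0 is the old monthly layout, spec 1 the daily one.
specs = {0: month_of, 1: day_of}

# Each data file records the spec it was written under and its partition value.
data_files = [
    {"path": "old-a.parquet", "spec_id": 0, "partition": (2026, 4)},
    {"path": "old-b.parquet", "spec_id": 0, "partition": (2026, 5)},
    {"path": "new-a.parquet", "spec_id": 1, "partition": date(2026, 6, 1)},
    {"path": "new-b.parquet", "spec_id": 1, "partition": date(2026, 6, 2)},
]


def plan_scan(target: date) -> list[str]:
    """Files that may hold rows for a single order date."""
    return [
        f["path"]
        for f in data_files
        if specs[f["spec_id"]](target) == f["partition"]
    ]


may_files = plan_scan(date(2026, 5, 17))   # served by a monthly file
june_files = plan_scan(date(2026, 6, 2))   # served by a daily file
print(may_files, june_files)
```

Changing the spec only affects how new files are written; the old files are never touched, which is why no rewrite is needed.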

:::tip Why hidden partitioning is a big deal
The classic lake's number-one performance footgun is an analyst who filters on the source column instead of the derived partition column (filtering WHERE order_ts > ... when fact_sales was partitioned by a separate order_date string), causing a full-table scan of every order line. Hidden partitioning removes the footgun: you filter on the real column, and the engine maps it to partitions for you. You get fewer accidental full scans, and you can evolve the layout later without rewriting existing data.
:::

Delta Lake: the transaction log and its tooling

Delta's architecture is the _delta_log transaction log from lesson 10.2: an ordered series of JSON commits, periodically compacted into Parquet checkpoints, from which the current table state is reconstructed by replay. Concurrency is handled by optimistic concurrency control: a writer reads the current log version, does its work, then tries to commit as the next log entry; if someone else committed first, it lost the race and retries against the new state. ("Optimistic" means: assume no conflict, check at commit time, and retry if you were wrong. It works well when conflicts are rare.)
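The commit race can be sketched with an in-memory log. This is a toy under stated assumptions, not Delta's implementation: `ToyLog`, `CommitConflict`, and `try_commit` are hypothetical names, and the toy treats any intervening commit as a conflict, whereas a real engine would first check whether the two commits actually touch the same data.

```python
class CommitConflict(Exception):
    """Raised when another writer already claimed the next log version."""


class ToyLog:
    """In-memory stand-in for an ordered commit log (hypothetical, not Delta's API)."""

    def __init__(self) -> None:
        self.entries: list[str] = []  # entries[i] is the commit at version i

    def latest_version(self) -> int:
        return len(self.entries) - 1  # -1 means the log is empty

    def try_commit(self, read_version: int, actions: str) -> int:
        """Append `actions` as version read_version + 1, or fail if it is taken."""
        if self.latest_version() != read_version:
            raise CommitConflict(f"version {read_version + 1} already exists")
        self.entries.append(actions)
        return self.latest_version()


log = ToyLog()
log.try_commit(-1, "create table")  # becomes version 0

# Writers A and B both start from version 0.
a_read = log.latest_version()
b_read = log.latest_version()

a_version = log.try_commit(a_read, "A: add file-1")  # A commits first

try:
    b_version = log.try_commit(b_read, "B: add file-2")
    b_lost_race = False
except CommitConflict:
    b_lost_race = True
    # B re-reads the log and retries against the new state.
    b_version = log.try_commit(log.latest_version(), "B: add file-2")

print(a_version, b_lost_race, b_version)  # 1 True 2
```

Neither writer took a lock up front; the only coordination point is the attempt to claim the next version number.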

Delta is known less for its log shape than for its maintenance and layout tooling, much of which set the template the others followed:

OPTIMIZE — compacts many small files into fewer big ones (the small-files problem; lesson 10.5).
Z-ORDER — reorders rows across files so that values you commonly filter on cluster together, so file-skipping prunes more aggressively. (A Z-order curve interleaves multiple columns' values so points close in several dimensions land near each other on disk.)
Liquid clustering — a newer, automatic replacement for fixed partitioning and Z-ordering: you declare clustering keys and Delta keeps data laid out by them incrementally, without you choosing partition boundaries or re-running Z-ORDER. It adapts as data and query patterns shift.
Deletion vectors — instead of rewriting a whole Parquet file to delete a few rows (slow), Delta writes a small side file marking which rows in that file are deleted. Reads apply the vector to skip those rows; the heavy rewrite is deferred to a later compaction. This is a merge-on-read technique (lesson 10.4) and it makes deletes and updates much cheaper.
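Two of the ideas in this list are easy to picture in plain Python: the bit interleaving behind a Z-order curve, and a read that applies a deletion vector. The sketch below is illustrative only. The function names and sample rows are hypothetical, it handles just two integer columns, and it says nothing about how Delta encodes either structure on disk.

```python
def z_value(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y (a two-column Morton code)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return z


# Sorting by the interleaved value keeps points that are close in BOTH
# columns near each other, so they tend to land in the same file.
points = [(7, 7), (0, 1), (1, 1), (0, 0), (1, 0)]
z_sorted = sorted(points, key=lambda p: z_value(*p))


def read_with_deletion_vector(rows: list[str], deleted: set[int]) -> list[str]:
    """Return the rows of one file, skipping positions marked as deleted."""
    return [row for pos, row in enumerate(rows) if pos not in deleted]


file_rows = ["order-10", "order-11", "order-12", "order-13"]
deletion_vector = {1, 3}  # row positions within this file, not key values
visible = read_with_deletion_vector(file_rows, deletion_vector)

print(z_sorted)
print(visible)  # ['order-10', 'order-12']
```

In the deletion-vector half, the data file itself is never modified: the delete is recorded as a small set of positions, and the cost moves to read time until a later compaction rewrites the file.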

:::note UniForm: Delta that also speaks Iceberg
Delta Lake's UniForm (Universal Format) writes the extra Iceberg metadata alongside Delta's, so the same data files can be read by Iceberg clients and Delta clients. It's a direct response to the 2026 reality that Iceberg has become the interoperability lingua franca; we return to it in lesson 10.5.
:::

Apache Hudi: built for record-level upserts

Iceberg and Delta were designed first for appends and bulk overwrites, then grew row-level update/delete features. Hudi was designed from day one around the opposite problem: efficiently updating and deleting individual records on the lake — exactly what you need when ingesting ShopFlow's order_events stream (or CDC off the orders table, Chapter 6) where the same order_id flips placed → paid → shipped and updates the same fact_sales row again and again.

To make per-record upserts fast, Hudi maintains an index mapping each record key to the file that currently holds it. When a shipped event for order_id = K arrives, the index says "K lives in file 7," so Hudi can go straight there instead of scanning all of fact_sales to find it. This indexing is Hudi's signature strength and why it excels at high-frequency streaming upserts.
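The routing role of the index can be sketched as a dictionary from record key to file id. This is a minimal toy under that assumption, not Hudi's index implementation: `ToyUpsertTable`, the file ids, and the `file_for_new_keys` parameter are hypothetical names chosen for the example.

```python
class ToyUpsertTable:
    """Toy table with a record-key index (hypothetical, not Hudi's API)."""

    def __init__(self) -> None:
        self.files: dict[str, dict[str, dict]] = {}  # file id -> {key: row}
        self.index: dict[str, str] = {}              # record key -> file id

    def upsert(self, key: str, row: dict, file_for_new_keys: str) -> str:
        """Write `row`, returning the id of the single file that was touched.

        A known key is routed to the file that already holds it; only an
        unseen key goes to the file the writer chose for new records.
        """
        file_id = self.index.get(key, file_for_new_keys)
        self.index[key] = file_id
        self.files.setdefault(file_id, {})[key] = row
        return file_id


table = ToyUpsertTable()
table.upsert("K", {"status": "placed"}, file_for_new_keys="file-7")
table.upsert("J", {"status": "placed"}, file_for_new_keys="file-8")

# A later event for K is routed by the index, not by scanning every file.
touched = table.upsert("K", {"status": "shipped"}, file_for_new_keys="file-9")
print(touched)  # file-7
```

The lookup is a single index probe per record, so the cost of an upsert does not grow with the number of files in the table.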

Hudi also pioneered the copy-on-write vs merge-on-read choice that every format now grapples with — important enough that lesson 10.4 is devoted to it.

Where each one is strong (in 2026)

Iceberg is, in 2026, the default open standard. Its vendor-neutral governance (Apache) and broad engine + catalog support — Snowflake, Trino, Dremio, Spark, Flink, BigQuery, plus a standard REST catalog — made it the format everyone agrees to interoperate on. When in doubt, and especially when you want freedom to switch engines, Iceberg is the safe default.
Delta Lake is strongest if you live in Spark/Databricks, where its integration and layout tooling are excellent — and with UniForm it no longer means giving up Iceberg readers.
Hudi is strongest for streaming, mutation-heavy ingestion — high-volume CDC where records are constantly upserted — thanks to its indexing and merge-on-read maturity.

For ShopFlow's fact_sales, concretely: if fact_sales is rebuilt by the nightly batch (Ch. 5/7) and read all day by dashboards — mostly appends and bulk overwrites, read-heavy — Iceberg (or Delta) is the natural fit, and Iceberg keeps every engine able to read it. If instead fact_sales is kept live by upserting straight from the order_events stream as each order flips status — write-heavy, mutation-heavy — Hudi earns its keep. Same table, two operating modes; the read/write ratio decides, which is exactly the copy-on-write vs merge-on-read trade-off in lesson 10.4.

:::tip Don't agonize over the choice
For most teams the three are more alike than different: all give you ACID, time travel, and schema evolution on open files. The decision is increasingly driven by which engines and catalog you use, not by the format's intrinsic merits. We give you a real decision framework in lesson 10.5. Two emerging names to file away: Apache Paimon (a streaming-first lakehouse format, strong with Flink) and DuckLake (an approach that puts table metadata in a SQL database instead of files).
:::

Why it matters

Iceberg, Delta Lake, and Hudi all implement the metadata-over-files idea, but with different DNA. Iceberg's snapshot/manifest architecture plus hidden partitioning and partition evolution make it the engine-agnostic, vendor-neutral spec that became 2026's default open standard. Delta Lake's _delta_log with optimistic concurrency is paired with strong layout tooling — OPTIMIZE, Z-ORDER, liquid clustering, deletion vectors — and shines in the Spark/Databricks world, now bridging to Iceberg via UniForm. Hudi was built for record-level upserts and deletes with an index, making it the choice for mutation-heavy streaming ingestion. They share the headline features; the differentiators are partitioning ergonomics, mutation efficiency, and engine/catalog fit. Next, the features they all promise — and the read-vs-write trade-off hiding inside every update.

Next: ACID, time travel, and schema evolution →
