Analytic SQL: windows, CTEs & grouping sets
Most people who say "I know SQL" mean they can write SELECT … FROM … WHERE … JOIN … GROUP BY. That is the CRUD layer — Create, Read, Update, Delete — and it is genuinely the foundation. But the SQL that actually fills a data engineer's day, and roughly half of every analytics interview, lives one floor up: window functions, CTEs, and multi-level aggregation. This lesson teaches that floor from first principles, assuming only that you've seen a basic SELECT.
We'll use one small running table the whole way through so you can trace every result by hand: a slice of ShopFlow's orders table (ShopFlow — see Meet ShopFlow). Each row is one placed order, with its customer, the timestamp it was placed (order_ts), and the order total (amount).
orders
┌──────────┬─────────────┬─────────────────────┬─────────┬────────┐
│ order_id │ customer_id │ order_ts │ status │ amount │
├──────────┼─────────────┼─────────────────────┼─────────┼────────┤
│ 1 │ 7 │ 2026-01-02 09:14 │ paid │ 100 │
│ 2 │ 7 │ 2026-01-05 16:40 │ paid │ 300 │
│ 3 │ 7 │ 2026-01-09 11:02 │ paid │ 200 │
│ 4 │ 9 │ 2026-01-03 08:55 │ paid │ 500 │
│ 5 │ 9 │ 2026-01-08 19:20 │ paid │ 400 │
└──────────┴─────────────┴─────────────────────┴─────────┴────────┘
Customer 7 placed three orders, customer 9 placed two — that's the grouping we'll trace throughout.
A quick floor-setting: joins and aggregation
A join combines rows from two tables by matching a condition (usually a shared key). The durable mental model: a join produces every pair of rows from the two sides that satisfy the ON condition.
INNER JOINkeeps only matched pairs.LEFT JOINkeeps every row from the left side, fillingNULLwhere the right has no match (the workhorse for "all customers, with their orders if any").FULL OUTER JOINkeeps unmatched rows from both sides.
Aggregation collapses many rows into one summary row per group. GROUP BY customer_id makes one output row per customer; aggregate functions (SUM, COUNT, AVG, MIN, MAX) summarize the rows inside each group.
SELECT customer_id, SUM(amount) AS total, COUNT(*) AS n_orders
FROM orders
GROUP BY customer_id;
-- 7 │ 600 │ 3
-- 9 │ 900 │ 2
The key limitation to feel: GROUP BY destroys the individual rows. After grouping, you can't also see each order_id — the rows are gone, replaced by one summary per group. That limitation is exactly what window functions remove.
:::note WHERE vs HAVING
WHERE filters rows before grouping; HAVING filters groups after aggregation. WHERE amount > 100 drops small orders; HAVING SUM(amount) > 700 drops small customers. Mixing them up is one of the most common SQL bugs.
:::
The single most common ShopFlow aggregate is revenue per day: bucket orders by the date of order_ts and sum amount. (CAST(order_ts AS DATE) collapses each timestamp to its calendar day.)
SELECT CAST(order_ts AS DATE) AS order_date, SUM(amount) AS revenue
FROM orders
WHERE status <> 'cancelled' -- don't count cancelled orders as revenue
GROUP BY CAST(order_ts AS DATE)
ORDER BY order_date;
-- 2026-01-02 │ 100
-- 2026-01-03 │ 500
-- 2026-01-05 │ 300
-- 2026-01-08 │ 400
-- 2026-01-09 │ 200
This exact daily-revenue rollup reappears as a Spark batch job in Chapter 5 and as a per-minute streaming aggregate in Chapter 9 — same measure, different engines.
Window functions: aggregate without collapsing rows
A window function computes a value across a set of rows related to the current row, without collapsing them. You keep every original row and get a computed column alongside it. That single capability — "summarize, but keep the detail" — is why window functions are everywhere in data engineering.
The anatomy of a window function:
function() OVER (
PARTITION BY <columns> -- split rows into independent groups (like GROUP BY, but rows survive)
ORDER BY <columns> -- define an ordering within each partition
<frame> -- which rows around the current one to include (optional)
)
The clause after OVER is the window: the set of rows the function can see for the current row. PARTITION BY divides the data into groups that are computed independently; ORDER BY sequences rows inside each partition; the frame narrows the window to a sliding range.
Ranking functions: ROW_NUMBER, RANK, DENSE_RANK
These number rows within each partition, ordered by something. They are the single most-used window functions in DE.
SELECT
customer_id, order_id, amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn,
RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk,
DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS dense
FROM orders;
Traced by hand, partitioning by customer_id and ordering by amount descending:
customer_id │ order_id │ amount │ rn │ rnk │ dense
7 │ 2 │ 300 │ 1 │ 1 │ 1
7 │ 3 │ 200 │ 2 │ 2 │ 2
7 │ 1 │ 100 │ 3 │ 3 │ 3
9 │ 4 │ 500 │ 1 │ 1 │ 1
9 │ 5 │ 400 │ 2 │ 2 │ 2
The differences only show up on ties (rows with equal ORDER BY values):
ROW_NUMBERalways gives distinct numbers1,2,3,…— even for ties, it breaks them arbitrarily.RANKgives ties the same rank, then skips:1,1,3.DENSE_RANKgives ties the same rank with no gap:1,1,2.
Use ROW_NUMBER for "give me exactly one row per group" — the largest order per customer, the latest record per key. This is the engine of deduplication, which the next lesson builds on directly.
LAG and LEAD: reach to other rows
LAG(col, n) reads the value of col from n rows before the current row in the window; LEAD(col, n) reads n rows after. They are how you compute "change since last time" without a self-join.
SELECT
customer_id, order_ts, amount,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts) AS prev_amount,
amount - LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts) AS delta
FROM orders;
-- 7 │ 2026-01-02 │ 100 │ NULL │ NULL (no prior order)
-- 7 │ 2026-01-05 │ 300 │ 100 │ 200
-- 7 │ 2026-01-09 │ 200 │ 300 │ -100
-- 9 │ 2026-01-03 │ 500 │ NULL │ NULL
-- 9 │ 2026-01-08 │ 400 │ 500 │ -100
The first order of each customer has no predecessor, so LAG returns NULL. This pattern — order-over-order deltas, time between purchases, sessionization, gap detection — is bread-and-butter analytics.
Frames: running totals and moving averages
By default, an ORDER BY window includes every row from the start of the partition up to the current row — which gives you a running total for free:
SELECT
customer_id, order_ts, amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_ts
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM orders;
-- 7 │ 2026-01-02 │ 100 │ 100 (100)
-- 7 │ 2026-01-05 │ 300 │ 400 (100+300)
-- 7 │ 2026-01-09 │ 200 │ 600 (100+300+200)
-- 9 │ 2026-01-03 │ 500 │ 500
-- 9 │ 2026-01-08 │ 400 │ 900
This running total per customer is exactly how you'd compute a customer's cumulative spend over time (the basis for lifetime_value in Chapter 4). The ROWS BETWEEN … AND … clause is the frame — it bounds which rows the aggregate sees. Common frames:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW→ running total (the default for an ordered window).ROWS BETWEEN 2 PRECEDING AND CURRENT ROW→ a 3-row moving window (e.g. a 3-order moving-average basket size).ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING→ the whole partition (e.g. each order's share of its customer's total spend).
:::warning ROWS vs RANGE — a real footgun
ROWS counts physical rows; RANGE counts logical peers that tie on the ORDER BY value. With duplicate order keys, RANGE lumps the ties into one frame step and they can all get the same running total — usually not what you want. Default to ROWS unless you specifically need RANGE semantics, and always state the frame explicitly; the implicit default frame for an ordered window is RANGE, which surprises people.
:::
Common Table Expressions (CTEs)
A Common Table Expression is a named, temporary result set defined with WITH, used to break a complex query into readable, named steps. It's the difference between one unreadable 200-line query and a pipeline you can follow top to bottom.
WITH per_customer AS (
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id
),
ranked AS (
SELECT customer_id, total,
RANK() OVER (ORDER BY total DESC) AS rnk
FROM per_customer
)
SELECT * FROM ranked WHERE rnk = 1;
-- 9 │ 900 │ 1 ← ShopFlow's top customer by total spend
Each WITH block is a named step you can read in order: summarize → rank → filter. CTEs don't (by themselves) make a query faster — they make it readable and maintainable, which is why production SQL and dbt models lean on them heavily (Chapter 7).
:::note CTE vs subquery
A subquery is a query nested inside another (in FROM, WHERE, or SELECT). A CTE is the same idea, named and pulled to the top. Anything a CTE does, a subquery can do — CTEs just read far better when steps build on each other. One caveat: in some engines a CTE referenced multiple times may be re-computed each time (not cached), so a CTE isn't automatically a performance win.
:::
Recursive CTEs: querying hierarchies
A recursive CTE references itself to walk a hierarchy or generate a sequence — the one thing plain SQL otherwise can't do. It has two parts joined by UNION ALL: a base case (the starting rows) and a recursive step (rows derived from the previous iteration), which repeats until it produces no new rows.
ShopFlow's four source tables are deliberately flat (orders → items → products, no self-reference), so the canonical recursive example is a hierarchy table like an org chart or a category tree — picture ShopFlow growing a product_categories table where Electronics → Phones → Accessories nest. The shape is identical:
WITH RECURSIVE org AS (
-- base case: the top of the tree
SELECT employee_id, manager_id, name, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- recursive step: everyone who reports to someone already in `org`
SELECT e.employee_id, e.manager_id, e.name, o.level + 1
FROM employees e
JOIN org o ON e.manager_id = o.employee_id
)
SELECT * FROM org ORDER BY level;
Each iteration finds the next layer of reports and stamps its depth. Recursive CTEs handle org charts, category trees, bill-of-materials explosions, and graph traversals. Always make sure the recursion terminates — a cycle in the data (A manages B manages A) loops forever unless the engine has a depth guard.
Multi-level aggregates: GROUPING SETS, ROLLUP, CUBE
Often you want totals at several levels in one query — per product category, and a grand total; per (category, product), and per category, and overall. Running separate GROUP BY queries and UNION-ing them works but is verbose. SQL has dedicated syntax:
GROUPING SETS— you list exactly the grouping combinations you want.ROLLUP(a, b)— hierarchical subtotals:(a,b), then(a), then()the grand total. For drill-down hierarchies (category → product).CUBE(a, b)— every combination:(a,b),(a),(b),(). For cross-tab "all dimensions" reports.
To get product-level revenue we need the finest grain, which means joining orders to its line items and the product catalog (ShopFlow — see Meet ShopFlow). The revenue of a line item is quantity × unit_price captured at sale time (order_items.unit_price):
SELECT p.category, SUM(oi.quantity * oi.unit_price) AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status <> 'cancelled'
GROUP BY ROLLUP(p.category);
-- Books │ 1200
-- Electronics │ 300
-- NULL │ 1500 ← the grand-total row; NULL marks "all categories"
The NULL in a grouping column marks "this column was rolled up." If your real data also contains NULLs, use the GROUPING() function to tell a real null from a subtotal null. ROLLUP(p.category, p.name) would add a subtotal row per product underneath each category. These are how BI tools generate subtotal rows; knowing the SQL means you can produce the same output in a pipeline instead of pushing the work to the dashboard. This three-table join — orders × order_items × products — is the core of ShopFlow's fact_sales table you'll model in Chapter 4.
Why it matters
SELECT/JOIN/GROUP BY reads data; window functions, CTEs, and multi-level aggregates shape it — and that shaping is the substance of analytics engineering. Window functions let you summarize while keeping detail (rankings, running totals, period-over-period deltas, and — crucially — "one row per key" deduplication). CTEs turn sprawling queries into readable named steps, and recursive CTEs walk hierarchies. ROLLUP/CUBE/GROUPING SETS produce subtotals in one pass. Skip these and you'll write ten clumsy queries (and fail the interview) where one clean query would do.
Common pitfalls
- Reaching for
GROUP BYwhen you mean a window. "Largest order per customer while keeping the row" isROW_NUMBER, notGROUP BY. If you find yourself self-joining a table back to its own aggregate, you almost certainly want a window function. - Forgetting the frame. An ordered window without an explicit
ROWS …frame uses the implicitRANGEdefault, which mishandles ties. State the frame. - Assuming a CTE is cached / faster. It's primarily a readability tool; some engines re-evaluate it on each reference.
- Unbounded recursion. A cycle in hierarchical data loops forever — guard with a depth limit.
WHEREon a window result. You can't filter on a window function in the sameWHERE(the window is computed afterWHERE). Wrap it in a CTE/subquery — or useQUALIFY, covered in the next lesson.
Cross-links
- Next: Idempotent SQL & semi-structured data → —
QUALIFY ROW_NUMBER()deduplication builds directly on the ranking windows here. - Builds on: Storage & file formats — these queries run over the columnar files from Chapter 2.
- Leads to: Inside a query engine — how the engine actually executes the SQL you just wrote; and Transformation & dbt, where CTE-heavy SQL becomes versioned models.
Checkpoint
Using the orders table at the top:
- Write a query that returns, per customer, only that customer's single highest order (keep
order_idandamount). Which window function and value do you filter on? - What does
LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts)return for the first order of each customer, and why? - You wrote
SUM(amount) OVER (ORDER BY order_ts)and got a running total, but ties onorder_tsall share the same total. What frame clause fixes it?
Answers
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC), then keep rows where it equals1. (RANKwould return all tied top orders;ROW_NUMBERguarantees exactly one.)NULL— the first row in a partition has no preceding row forLAGto read.- The implicit default frame is
RANGE, which groups order-key ties together. Switch toROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWfor a strict row-by-row running total.