Skip to main content

Analytic SQL: windows, CTEs & grouping sets

Most people who say "I know SQL" mean they can write SELECT … FROM … WHERE … JOIN … GROUP BY. That is the CRUD layer — Create, Read, Update, Delete — and it is genuinely the foundation. But the SQL that actually fills a data engineer's day, and roughly half of every analytics interview, lives one floor up: window functions, CTEs, and multi-level aggregation. This lesson teaches that floor from first principles, assuming only that you've seen a basic SELECT.

We'll use one small running table the whole way through so you can trace every result by hand: a slice of ShopFlow's orders table (ShopFlow — see Meet ShopFlow). Each row is one placed order, with its customer, the timestamp it was placed (order_ts), and the order total (amount).

orders
┌──────────┬─────────────┬─────────────────────┬─────────┬────────┐
│ order_id │ customer_id │ order_ts │ status │ amount │
├──────────┼─────────────┼─────────────────────┼─────────┼────────┤
│ 1 │ 7 │ 2026-01-02 09:14 │ paid │ 100 │
│ 2 │ 7 │ 2026-01-05 16:40 │ paid │ 300 │
│ 3 │ 7 │ 2026-01-09 11:02 │ paid │ 200 │
│ 4 │ 9 │ 2026-01-03 08:55 │ paid │ 500 │
│ 5 │ 9 │ 2026-01-08 19:20 │ paid │ 400 │
└──────────┴─────────────┴─────────────────────┴─────────┴────────┘

Customer 7 placed three orders, customer 9 placed two — that's the grouping we'll trace throughout.

A quick floor-setting: joins and aggregation

A join combines rows from two tables by matching a condition (usually a shared key). The durable mental model: a join produces every pair of rows from the two sides that satisfy the ON condition.

  • INNER JOIN keeps only matched pairs.
  • LEFT JOIN keeps every row from the left side, filling NULL where the right has no match (the workhorse for "all customers, with their orders if any").
  • FULL OUTER JOIN keeps unmatched rows from both sides.

Aggregation collapses many rows into one summary row per group. GROUP BY customer_id makes one output row per customer; aggregate functions (SUM, COUNT, AVG, MIN, MAX) summarize the rows inside each group.

SELECT customer_id, SUM(amount) AS total, COUNT(*) AS n_orders
FROM orders
GROUP BY customer_id;
-- 7 │ 600 │ 3
-- 9 │ 900 │ 2

The key limitation to feel: GROUP BY destroys the individual rows. After grouping, you can't also see each order_id — the rows are gone, replaced by one summary per group. That limitation is exactly what window functions remove.

:::note WHERE vs HAVING WHERE filters rows before grouping; HAVING filters groups after aggregation. WHERE amount > 100 drops small orders; HAVING SUM(amount) > 700 drops small customers. Mixing them up is one of the most common SQL bugs. :::

Revenue by day — the most-run query at ShopFlow

The single most common ShopFlow aggregate is revenue per day: bucket orders by the date of order_ts and sum amount. (CAST(order_ts AS DATE) collapses each timestamp to its calendar day.)

SELECT CAST(order_ts AS DATE) AS order_date, SUM(amount) AS revenue
FROM orders
WHERE status <> 'cancelled' -- don't count cancelled orders as revenue
GROUP BY CAST(order_ts AS DATE)
ORDER BY order_date;
-- 2026-01-02 │ 100
-- 2026-01-03 │ 500
-- 2026-01-05 │ 300
-- 2026-01-08 │ 400
-- 2026-01-09 │ 200

This exact daily-revenue rollup reappears as a Spark batch job in Chapter 5 and as a per-minute streaming aggregate in Chapter 9 — same measure, different engines.

Window functions: aggregate without collapsing rows

A window function computes a value across a set of rows related to the current row, without collapsing them. You keep every original row and get a computed column alongside it. That single capability — "summarize, but keep the detail" — is why window functions are everywhere in data engineering.

The anatomy of a window function:

function() OVER (
PARTITION BY <columns> -- split rows into independent groups (like GROUP BY, but rows survive)
ORDER BY <columns> -- define an ordering within each partition
<frame> -- which rows around the current one to include (optional)
)

The clause after OVER is the window: the set of rows the function can see for the current row. PARTITION BY divides the data into groups that are computed independently; ORDER BY sequences rows inside each partition; the frame narrows the window to a sliding range.

Ranking functions: ROW_NUMBER, RANK, DENSE_RANK

These number rows within each partition, ordered by something. They are the single most-used window functions in DE.

SELECT
customer_id, order_id, amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn,
RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk,
DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS dense
FROM orders;

Traced by hand, partitioning by customer_id and ordering by amount descending:

customer_id │ order_id │ amount │ rn │ rnk │ dense
7 │ 2 │ 300 │ 1 │ 1 │ 1
7 │ 3 │ 200 │ 2 │ 2 │ 2
7 │ 1 │ 100 │ 3 │ 3 │ 3
9 │ 4 │ 500 │ 1 │ 1 │ 1
9 │ 5 │ 400 │ 2 │ 2 │ 2

The differences only show up on ties (rows with equal ORDER BY values):

  • ROW_NUMBER always gives distinct numbers 1,2,3,… — even for ties, it breaks them arbitrarily.
  • RANK gives ties the same rank, then skips: 1,1,3.
  • DENSE_RANK gives ties the same rank with no gap: 1,1,2.

Use ROW_NUMBER for "give me exactly one row per group" — the largest order per customer, the latest record per key. This is the engine of deduplication, which the next lesson builds on directly.

LAG and LEAD: reach to other rows

LAG(col, n) reads the value of col from n rows before the current row in the window; LEAD(col, n) reads n rows after. They are how you compute "change since last time" without a self-join.

SELECT
customer_id, order_ts, amount,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts) AS prev_amount,
amount - LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts) AS delta
FROM orders;
-- 7 │ 2026-01-02 │ 100 │ NULL │ NULL (no prior order)
-- 7 │ 2026-01-05 │ 300 │ 100 │ 200
-- 7 │ 2026-01-09 │ 200 │ 300 │ -100
-- 9 │ 2026-01-03 │ 500 │ NULL │ NULL
-- 9 │ 2026-01-08 │ 400 │ 500 │ -100

The first order of each customer has no predecessor, so LAG returns NULL. This pattern — order-over-order deltas, time between purchases, sessionization, gap detection — is bread-and-butter analytics.

Frames: running totals and moving averages

By default, an ORDER BY window includes every row from the start of the partition up to the current row — which gives you a running total for free:

SELECT
customer_id, order_ts, amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_ts
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM orders;
-- 7 │ 2026-01-02 │ 100 │ 100 (100)
-- 7 │ 2026-01-05 │ 300 │ 400 (100+300)
-- 7 │ 2026-01-09 │ 200 │ 600 (100+300+200)
-- 9 │ 2026-01-03 │ 500 │ 500
-- 9 │ 2026-01-08 │ 400 │ 900

This running total per customer is exactly how you'd compute a customer's cumulative spend over time (the basis for lifetime_value in Chapter 4). The ROWS BETWEEN … AND … clause is the frame — it bounds which rows the aggregate sees. Common frames:

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW → running total (the default for an ordered window).
  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW → a 3-row moving window (e.g. a 3-order moving-average basket size).
  • ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING → the whole partition (e.g. each order's share of its customer's total spend).

:::warning ROWS vs RANGE — a real footgun ROWS counts physical rows; RANGE counts logical peers that tie on the ORDER BY value. With duplicate order keys, RANGE lumps the ties into one frame step and they can all get the same running total — usually not what you want. Default to ROWS unless you specifically need RANGE semantics, and always state the frame explicitly; the implicit default frame for an ordered window is RANGE, which surprises people. :::

Common Table Expressions (CTEs)

A Common Table Expression is a named, temporary result set defined with WITH, used to break a complex query into readable, named steps. It's the difference between one unreadable 200-line query and a pipeline you can follow top to bottom.

WITH per_customer AS (
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id
),
ranked AS (
SELECT customer_id, total,
RANK() OVER (ORDER BY total DESC) AS rnk
FROM per_customer
)
SELECT * FROM ranked WHERE rnk = 1;
-- 9 │ 900 │ 1 ← ShopFlow's top customer by total spend

Each WITH block is a named step you can read in order: summarize → rank → filter. CTEs don't (by themselves) make a query faster — they make it readable and maintainable, which is why production SQL and dbt models lean on them heavily (Chapter 7).

:::note CTE vs subquery A subquery is a query nested inside another (in FROM, WHERE, or SELECT). A CTE is the same idea, named and pulled to the top. Anything a CTE does, a subquery can do — CTEs just read far better when steps build on each other. One caveat: in some engines a CTE referenced multiple times may be re-computed each time (not cached), so a CTE isn't automatically a performance win. :::

Recursive CTEs: querying hierarchies

A recursive CTE references itself to walk a hierarchy or generate a sequence — the one thing plain SQL otherwise can't do. It has two parts joined by UNION ALL: a base case (the starting rows) and a recursive step (rows derived from the previous iteration), which repeats until it produces no new rows.

ShopFlow's four source tables are deliberately flat (orders → items → products, no self-reference), so the canonical recursive example is a hierarchy table like an org chart or a category tree — picture ShopFlow growing a product_categories table where Electronics → Phones → Accessories nest. The shape is identical:

WITH RECURSIVE org AS (
-- base case: the top of the tree
SELECT employee_id, manager_id, name, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- recursive step: everyone who reports to someone already in `org`
SELECT e.employee_id, e.manager_id, e.name, o.level + 1
FROM employees e
JOIN org o ON e.manager_id = o.employee_id
)
SELECT * FROM org ORDER BY level;

Each iteration finds the next layer of reports and stamps its depth. Recursive CTEs handle org charts, category trees, bill-of-materials explosions, and graph traversals. Always make sure the recursion terminates — a cycle in the data (A manages B manages A) loops forever unless the engine has a depth guard.

Multi-level aggregates: GROUPING SETS, ROLLUP, CUBE

Often you want totals at several levels in one query — per product category, and a grand total; per (category, product), and per category, and overall. Running separate GROUP BY queries and UNION-ing them works but is verbose. SQL has dedicated syntax:

  • GROUPING SETS — you list exactly the grouping combinations you want.
  • ROLLUP(a, b) — hierarchical subtotals: (a,b), then (a), then () the grand total. For drill-down hierarchies (category → product).
  • CUBE(a, b)every combination: (a,b), (a), (b), (). For cross-tab "all dimensions" reports.

To get product-level revenue we need the finest grain, which means joining orders to its line items and the product catalog (ShopFlow — see Meet ShopFlow). The revenue of a line item is quantity × unit_price captured at sale time (order_items.unit_price):

SELECT p.category, SUM(oi.quantity * oi.unit_price) AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status <> 'cancelled'
GROUP BY ROLLUP(p.category);
-- Books │ 1200
-- Electronics │ 300
-- NULL │ 1500 ← the grand-total row; NULL marks "all categories"

The NULL in a grouping column marks "this column was rolled up." If your real data also contains NULLs, use the GROUPING() function to tell a real null from a subtotal null. ROLLUP(p.category, p.name) would add a subtotal row per product underneath each category. These are how BI tools generate subtotal rows; knowing the SQL means you can produce the same output in a pipeline instead of pushing the work to the dashboard. This three-table join — orders × order_items × products — is the core of ShopFlow's fact_sales table you'll model in Chapter 4.

Why it matters

SELECT/JOIN/GROUP BY reads data; window functions, CTEs, and multi-level aggregates shape it — and that shaping is the substance of analytics engineering. Window functions let you summarize while keeping detail (rankings, running totals, period-over-period deltas, and — crucially — "one row per key" deduplication). CTEs turn sprawling queries into readable named steps, and recursive CTEs walk hierarchies. ROLLUP/CUBE/GROUPING SETS produce subtotals in one pass. Skip these and you'll write ten clumsy queries (and fail the interview) where one clean query would do.

Common pitfalls

  • Reaching for GROUP BY when you mean a window. "Largest order per customer while keeping the row" is ROW_NUMBER, not GROUP BY. If you find yourself self-joining a table back to its own aggregate, you almost certainly want a window function.
  • Forgetting the frame. An ordered window without an explicit ROWS … frame uses the implicit RANGE default, which mishandles ties. State the frame.
  • Assuming a CTE is cached / faster. It's primarily a readability tool; some engines re-evaluate it on each reference.
  • Unbounded recursion. A cycle in hierarchical data loops forever — guard with a depth limit.
  • WHERE on a window result. You can't filter on a window function in the same WHERE (the window is computed after WHERE). Wrap it in a CTE/subquery — or use QUALIFY, covered in the next lesson.

Checkpoint

Using the orders table at the top:

  1. Write a query that returns, per customer, only that customer's single highest order (keep order_id and amount). Which window function and value do you filter on?
  2. What does LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_ts) return for the first order of each customer, and why?
  3. You wrote SUM(amount) OVER (ORDER BY order_ts) and got a running total, but ties on order_ts all share the same total. What frame clause fixes it?
Answers
  1. ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC), then keep rows where it equals 1. (RANK would return all tied top orders; ROW_NUMBER guarantees exactly one.)
  2. NULL — the first row in a partition has no preceding row for LAG to read.
  3. The implicit default frame is RANGE, which groups order-key ties together. Switch to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for a strict row-by-row running total.