Jinja, macros, packages & testing
You've already seen Jinja sneaking into models — {{ ref(...) }}, {% if is_incremental() %}, {{ this }}. This lesson explains what Jinja actually is, how to package repeated logic into macros, how to borrow community code with packages, and then turns to the feature that makes dbt trustworthy: testing. These two themes — DRY templating and automated tests — are what elevate dbt from "SQL files" to "software engineering for data."
Jinja: SQL that can write SQL
Jinja is a templating language (borrowed from the Python world) that dbt runs over your .sql files before sending them to the warehouse. It lets you put programming constructs — variables, conditionals, loops, function calls — inside your SQL. The warehouse never sees Jinja; dbt evaluates it during compilation and emits plain SQL (the compiled-SQL mental model again).
Jinja has two delimiters, and the distinction is worth memorizing:
{{ ... }}— an expression. It evaluates to text that gets substituted into the SQL.{{ ref('stg_orders') }}becomes a table name;{{ this }}becomes the current model's name.{% ... %}— a statement. It controls logic but produces no output itself:{% if ... %},{% for ... %},{% set ... %}.
Here's a loop that writes repetitive SQL for you — pivoting ShopFlow's order amount across its status values (ShopFlow — see Meet ShopFlow):
select
customer_id,
{% for status in ['placed', 'paid', 'shipped', 'cancelled'] %}
sum(case when status = '{{ status }}' then order_amount else 0 end)
as {{ status }}_amount
{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('stg_orders') }}
group by customer_id
That {% for %} loop compiles to four nearly-identical sum(case when ...) lines — you wrote the pattern once, Jinja expanded it. (loop.last is a Jinja helper so the last item skips the trailing comma.) This is the value of Jinja: stop copy-pasting SQL.
Macros: reusable SQL functions
A macro is a named, reusable chunk of Jinja-templated SQL — essentially a function for SQL. You define it once in the macros/ folder and call it from any model. For ShopFlow the natural one is line_revenue — quantity × unit_price, the measure that defines fact_sales and that you don't want re-typed (and possibly mis-typed) in every model that needs it:
-- macros/line_revenue.sql
{% macro line_revenue(qty_col, price_col) %}
({{ qty_col }} * {{ price_col }})::numeric(12, 2)
{% endmacro %}
-- in fact_sales.sql:
select
order_id,
product_id,
{{ line_revenue('quantity', 'unit_price') }} as line_revenue
from {{ ref('stg_order_items') }}
{{ line_revenue('quantity', 'unit_price') }} compiles to (quantity * unit_price)::numeric(12, 2). Now the revenue rule lives in one place. If finance ever redefines line revenue (say, net of a discount column), you fix one macro and every model updates. That's DRY — Don't Repeat Yourself — applied to SQL. (In fact, ref, source, and is_incremental are themselves macros that ship with dbt.)
Packages: borrow, don't reinvent
A package is a bundle of models and macros someone else wrote that you install into your project — like a library in any other language. You list packages in packages.yml and run dbt deps to install them. The most important one to know by name is dbt_utils, a community toolbox of battle-tested macros for things you'd otherwise hand-roll badly:
# packages.yml
packages:
- package: dbt-labs/dbt_utils
version: 1.3.0
-- generate a surrogate key for the order-line grain by hashing its columns
select
{{ dbt_utils.generate_surrogate_key(['order_id', 'product_id']) }} as sales_key,
...
from {{ ref('stg_order_items') }}
dbt_utils gives you macros for surrogate keys, date spines, deduplication, pivoting, and dozens more — all maintained and cross-warehouse-safe. The rule: before you write a clever macro, check whether dbt_utils already has it.
The danger of over-templating
Now the honest warning, because this is a real failure mode. Jinja is powerful enough to make SQL unreadable. A model so wrapped in loops, conditionals, and nested macros that you can't tell what SQL it produces without compiling it has defeated dbt's main virtue — that transformations are readable SQL anyone can follow.
:::caution Don't out-clever your team
Reach for Jinja to remove genuine repetition, not to show off. If a macro is only used once, it's probably just hiding SQL — inline it. If a model needs five nested {% if %}s to understand, it's too clever. The test: could a SQL-literate analyst read the model and roughly predict the compiled output? If not, simplify. Readable SQL beats DRY SQL when they conflict.
:::
Testing: making data trustworthy
Templating makes models DRY; tests make them trustworthy. A dbt test is an assertion about your data that dbt checks by running a query: the test passes if the query returns zero rows (zero rows = nothing violates the assertion). Run them with dbt test. There are two kinds.
Generic tests (the four you'll use constantly)
A generic test is a reusable, parameterized assertion you attach to a column in a YAML file — no SQL to write. Four ship with dbt and cover most needs:
# models/staging/_staging.yml (ShopFlow — see Meet ShopFlow)
models:
- name: stg_orders
columns:
- name: order_id
tests:
- unique # no two orders share this id
- not_null # every order has an id
- name: status
tests:
- accepted_values: # value is one of ShopFlow's allowed statuses
values: ['placed', 'paid', 'shipped', 'cancelled']
- name: stg_order_items
columns:
- name: product_id
tests:
- relationships: # every line's product exists in products
to: ref('stg_products')
field: product_id
unique— the column has no duplicate values (great for primary keys;order_idonstg_orderscatches the duplicate-row bug from a badappendincremental load).not_null— the column is never null (a ShopFlow order line without anorder_idwould be unjoinable).accepted_values— every value is in an allowed set (catches a surprise newstatusthe ShopFlow app started sending beyondplaced/paid/shipped/cancelled).relationships— referential integrity: every value here has a matching row in another model (catches orphaned foreign keys — anorder_itemsline pointing at aproduct_idthat doesn't exist inproducts).
Behind the scenes each generic test compiles to a SELECT that finds violations. not_null on order_id compiles to roughly select * from stg_orders where order_id is null — if that returns any rows, the test fails and names them.
Singular tests (one-off SQL assertions)
When an assertion is too specific for a generic test, write a singular test — a .sql file in tests/ containing a SELECT that returns the bad rows. If it returns nothing, the test passes.
-- tests/assert_no_future_orders.sql
-- A ShopFlow order can't be placed in the future. Return any that are.
select order_id, order_ts
from {{ ref('stg_orders') }}
where order_ts > current_timestamp
That's it — any business rule you can express as "find the rows that violate this" becomes a test.
Shift-left: where data tests belong
The strategic point is when these tests run. "Shift left" means moving quality checks earlier — closer to where data enters and is transformed — instead of discovering bad data downstream when an executive's dashboard looks wrong. dbt tests run as part of the build, in CI before a change merges and on every scheduled production run, so a broken assumption is caught at the source, not in a board meeting.
dbt build runs models and their tests together in DAG order, so a model whose tests fail can stop its downstream models from building on bad data. This is the data-engineering version of "tests gate the deploy" — and it's the foundation of the data-quality work in Chapter 11. Tooling like Elementary layers dbt-native observability on top of these tests (anomaly detection, alerting); the durable idea underneath is just assert things about your data, automatically, on every run.
Why it matters
Jinja lets SQL write SQL — {{ }} for substituted expressions, {% %} for logic — so you can loop, branch, and DRY up transformations; macros package repeated logic into reusable functions, and packages like dbt_utils let you borrow proven ones instead of reinventing them. But power invites over-templating: keep models readable enough that an analyst can predict the compiled SQL. Tests make the output trustworthy: generic tests (unique, not_null, accepted_values, relationships) assert column-level rules in YAML, singular tests assert any "find the bad rows" query, and running them shift-left in the build means bad data fails the pipeline instead of reaching a dashboard. Next we'll handle the inputs and history side: declaring sources, checking their freshness, capturing slowly-changing history with snapshots, and turning the whole DAG into living documentation.