If CTEs and Subqueries confuse you, read this
Hey — Tom here.
CTEs and subqueries are similar things: they let you perform "on the fly" subsets of data and integrate them into queries like you would with tables.
So why do we have two ways to do it?
More importantly, when should you use each one?
Here's what we're covering this week:
- What CTEs and subqueries are
- The readability vs. performance trade-off
- Performance considerations
- A framework I use for deciding when to use CTE vs subquery
Let's get into it...
The Core Difference
A subquery is SQL inside SQL — you nest one SELECT statement inside another. The database processes it from the inside out.
A CTE (Common Table Expression) is a named result set you define at the top of your query with WITH.
They often produce identical results, but the optimiser may treat them differently depending on the database engine.
Example — subquery (derived table):
SELECT customer_id, total_spend
FROM (
SELECT customer_id, SUM(order_value) as total_spend
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY customer_id
) as customer_totals
WHERE total_spend > 1000
Example — CTE:
WITH customer_totals AS
(
SELECT customer_id, SUM(order_value) as total_spend
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY customer_id
)
SELECT customer_id, total_spend
FROM customer_totals
WHERE total_spend > 1000
Same output. Different approach.
When Subqueries Make Sense
Use subqueries for simple, one-off logic.
If you need to filter based on a quick calculation and you’ll never reference it again, a subquery keeps things concise.
SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products)
This answers: "Show me products above average price."
You don't need to name this calculation.
You don't need to see it again.
Subquery wins.
Subqueries work well in WHERE clauses for filtering.
SELECT name, department
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'London'
)
The nested logic makes sense here because the subquery acts as a filter condition, not a data transformation you'll reuse.
When CTEs Are Superior
CTEs shine when you’re building logic step by step — especially for analytics or reporting queries.
Imagine you need to:
-
Calculate monthly spend per customer
-
Rank customers by spend
-
Identify the top 10% of spenders
-
Calculate their average order frequency
Doing that with nested subqueries gets messy fast.
With CTEs, you can build it in stages:
WITH monthly_spend AS (
SELECT
customer_id,
DATE_TRUNC('month', order_date) as month,
SUM(order_value) as spend
FROM orders
GROUP BY customer_id, month
),
ranked_customers AS (
SELECT
customer_id,
spend,
NTILE(10) OVER (ORDER BY spend DESC) as spend_decile
FROM monthly_spend
),
top_spenders AS (
SELECT customer_id
FROM ranked_customers
WHERE spend_decile = 1
)
SELECT
t.customer_id,
COUNT(o.order_id) / COUNT(DISTINCT DATE_TRUNC('month', o.order_date)) as avg_orders_per_month
FROM top_spenders t
JOIN orders o ON t.customer_id = o.customer_id
GROUP BY t.customer_id
Each step is readable, testable, and easy to modify.
That’s maintainable SQL.
Reuse and Maintainability
If you reference the same logic more than once, a CTE keeps your query cleaner:
WITH high_value_customers AS (
SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING SUM(order_value) > 5000
)
SELECT
(SELECT COUNT(*) FROM high_value_customers) as total_high_value,
(SELECT COUNT(DISTINCT o.product_id)
FROM orders o
JOIN high_value_customers h ON o.customer_id = h.customer_id) as products_bought_by_high_value
But remember — not every database computes a CTE once.
In BigQuery, for instance, the same CTE referenced twice may be re-executed each time.
In PostgreSQL and Snowflake, the optimiser decides whether to materialise or inline it.
So if performance matters, always check the query plan (EXPLAIN keyword).
What About Performance?
In most modern databases, CTEs and subqueries perform identically — the optimiser is clever enough to recognise the logic and run it efficiently either way.
But there are exceptions:
-
PostgreSQL used to materialise every CTE (pre-v12), which sometimes slowed things down. Newer versions can inline CTEs automatically, and you can now specify
MATERIALIZEDorNOT MATERIALIZEDto control this. -
BigQuery usually inlines CTEs but will re-run them if referenced multiple times — use temporary tables or materialised views for heavy reuse.
-
SQL Server treats CTEs mostly as inline views — performance is usually identical to subqueries.
-
Snowflake may materialise or inline depending on the plan — always inspect the query profile.
So: readability first, benchmark later.
How I decide
-
Simple filter or quick calculation? → Subquery
-
Multi-step logic? → CTE
-
Reusing logic multiple times? → CTE (but verify engine behaviour)
-
Collaborating with others or debugging later? → CTE
-
Still not sure? → Default to a CTE — readability wins
And always give your CTEs meaningful names.monthly_revenue > cte1 every single time.
TL;DR
-
Subqueries → great for simple, one-off filters or calculations
-
CTEs → best for multi-step logic, reuse, and readable collaboration
-
Performance → rarely the deciding factor; modern databases optimise both well
-
Exception → some engines materialise CTEs or re-run them per reference — test with
EXPLAINor query profiling -
Default → CTEs are clearer, easier to debug, and friendlier for teams
Worth knowing: Most of this applies to any SQL dialect, but performance quirks (like whether a CTE is re-run or materialised) mainly matter in PostgreSQL and BigQuery. Always check your database’s query plan before making assumptions.
If you want to practice yourself...
Here’s a sample from this week’s Premium edition. Test your skills with the basic SQL exercise included:
This is just the tip of the iceberg. In Premium, you get:
-
Full local sql database setup & realistic e-commerce data (provided)
-
Step-by-step exercises from basic → intermediate → advanced
-
Self-paced learning with hands-on examples you can run and modify
-
Reference materials, cheatsheets, and interview questions
Ready to push yourself?
Upgrade to Premium and get all exercises, code, and past issues instantly👇🏻