Skip to content

Common Table Expressions (CTEs)

Difficulty: Intermediate
Generated on: 2025-07-10 02:28:43
Category: SQL Cheatsheet for Database Development


What is a CTE? A Common Table Expression (CTE) is a temporary, named result set that you can reference within a single SELECT, INSERT, UPDATE, or DELETE statement. Think of it as a virtual table that exists only for the duration of the query.

When to use CTEs:

  • Improve Readability: Break down complex queries into logical, manageable chunks.
  • Recursive Queries: Implement hierarchical or graph traversal logic (e.g., organizational charts, bill of materials).
  • Simplify Subqueries: Replace nested subqueries with named CTEs.
  • Refer to a Result Set Multiple Times: Avoid repeating the same subquery multiple times within a query.
  • Window Functions: Use CTEs to pre-filter or aggregate data before applying window functions.
  • Modifying Data with Recursive Logic: (In some databases) Recursively update or delete data based on hierarchical relationships.

Basic CTE Syntax:

WITH CTE_Name AS (
-- CTE Query Definition (SELECT statement)
)
-- Main Query (SELECT, INSERT, UPDATE, or DELETE) referencing the CTE
SELECT * FROM CTE_Name;

Multiple CTEs:

WITH
CTE_1 AS (
-- CTE 1 Query Definition
),
CTE_2 AS (
-- CTE 2 Query Definition (can reference CTE_1)
)
-- Main Query referencing CTE_1 and/or CTE_2
SELECT * FROM CTE_1 JOIN CTE_2 ON CTE_1.ID = CTE_2.ID;

Recursive CTE Syntax (for hierarchical data):

WITH RECURSIVE RecursiveCTE AS (
-- Anchor Member (Base Case): Selects the initial row(s)
SELECT initial_data_columns
FROM base_table
WHERE condition
UNION ALL
-- Recursive Member: References the CTE itself to iterate
SELECT derived_data_columns
FROM base_table
JOIN RecursiveCTE ON base_table.parent_id = RecursiveCTE.id
WHERE condition
)
-- Main Query referencing the RecursiveCTE
SELECT * FROM RecursiveCTE;

Important Notes:

  • The WITH keyword starts the CTE definition.
  • Each CTE is defined with a name (CTE_Name) and an AS keyword followed by parentheses containing the CTE’s query.
  • The main query follows the CTE definitions.
  • Multiple CTEs are separated by commas.
  • Recursive CTEs must have an anchor member (non-recursive part) and a recursive member connected by UNION ALL. The recursive member must reference the CTE itself.
  • Recursive CTEs require the RECURSIVE keyword in some databases (e.g., PostgreSQL).
Use CaseDescription
Calculating Running TotalsCompute a cumulative sum of a value over a sequence of rows. Useful for financial reporting, inventory tracking, etc.
Finding Top N RecordsSelect the top N records from a dataset based on a specific criterion (e.g., top 10 customers by sales).
Hierarchical Data TraversalNavigate through hierarchical data structures (e.g., organizational charts, file system directories) to find related records at different levels.
Date Series GenerationGenerate a series of dates within a specified range. Useful for reporting on time-based data, filling in missing dates, etc.
Data TransformationRestructure or manipulate data before applying further analysis or reporting. Can simplify complex calculations or aggregations.
Removing DuplicatesIdentify and remove duplicate rows from a table based on specific criteria. CTEs can make the duplicate detection and removal process more efficient.
PaginationImplement pagination for large datasets by calculating row numbers and selecting a specific range of rows for each page.
Implementing Graph AlgorithmsCan be used to implement breadth-first search, depth-first search, or other graph algorithms. Useful in scenarios like social network analysis or route finding.
  • Keep CTEs Focused: Each CTE should perform a specific, well-defined task. This improves readability and maintainability.
  • Name CTEs Meaningfully: Use descriptive names that clearly indicate the purpose of the CTE.
  • Avoid Overly Complex CTEs: If a CTE becomes too complex, consider breaking it down into smaller, more manageable CTEs.
  • Index Appropriately: Ensure that the tables referenced in the CTEs have appropriate indexes to optimize query performance.
  • Use UNION ALL Instead of UNION (for Recursive CTEs): UNION removes duplicate rows, which can be computationally expensive. UNION ALL is faster but preserves duplicates, which is usually what you want in a recursive CTE. You can handle duplicates in the main query if needed.
  • Limit Recursion Depth: Recursive CTEs can potentially run indefinitely if the recursion condition is not met or if there is a cycle in the data. Most databases have a recursion limit to prevent this. Consider adding a WHERE clause to the recursive member to limit the recursion depth.
  • Materialization Considerations: The database engine may materialize CTEs (store the results in a temporary table) or inline them (rewrite the query). Materialization can improve performance in some cases, but it can also add overhead. Use query profiling tools to analyze the query execution plan and identify potential performance bottlenecks.
  • Security: CTEs do not inherently introduce any new security risks. However, always follow best practices for SQL injection prevention and data access control.

Sample Data:

-- Employees Table
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
DepartmentID INT,
Salary DECIMAL(10, 2),
ManagerID INT NULL -- Self-referencing foreign key
);
INSERT INTO Employees (EmployeeID, FirstName, LastName, DepartmentID, Salary, ManagerID) VALUES
(1, 'John', 'Doe', 10, 60000.00, NULL),
(2, 'Jane', 'Smith', 20, 75000.00, 1),
(3, 'Robert', 'Jones', 10, 55000.00, 1),
(4, 'Emily', 'Brown', 20, 80000.00, 2),
(5, 'Michael', 'Davis', 30, 90000.00, 1),
(6, 'Jessica', 'Wilson', 30, 70000.00, 5);
-- Departments Table
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(50)
);
INSERT INTO Departments (DepartmentID, DepartmentName) VALUES
(10, 'Sales'),
(20, 'Marketing'),
(30, 'Engineering');

Example 1: Calculating Average Salary by Department

WITH DepartmentSalaries AS (
SELECT
DepartmentID,
AVG(Salary) AS AverageSalary
FROM
Employees
GROUP BY
DepartmentID
)
SELECT
d.DepartmentName,
ds.AverageSalary
FROM
Departments d
JOIN
DepartmentSalaries ds ON d.DepartmentID = ds.DepartmentID;
-- Expected Output:
-- DepartmentName | AverageSalary
-- -------------- | -------------
-- Sales | 57500.00
-- Marketing | 77500.00
-- Engineering | 80000.00

Example 2: Finding Employees Above Department Average Salary

WITH DepartmentAvgSalaries AS (
SELECT
DepartmentID,
AVG(Salary) AS AvgSalary
FROM
Employees
GROUP BY
DepartmentID
)
SELECT
e.FirstName,
e.LastName,
e.Salary,
d.DepartmentName
FROM
Employees e
JOIN
DepartmentAvgSalaries das ON e.DepartmentID = das.DepartmentID
JOIN
Departments d ON e.DepartmentID = d.DepartmentID
WHERE
e.Salary > das.AvgSalary;
-- Expected Output:
-- FirstName | LastName | Salary | DepartmentName
-- --------- | -------- | ------- | --------------
-- Jane | Smith | 75000.00 | Marketing
-- Emily | Brown | 80000.00 | Marketing
-- Michael | Davis | 90000.00 | Engineering

Example 3: Recursive CTE - Organizational Hierarchy

WITH RECURSIVE EmployeeHierarchy AS (
-- Anchor Member: Select the top-level manager (ManagerID is NULL)
SELECT
EmployeeID,
FirstName,
LastName,
ManagerID,
1 AS Level
FROM
Employees
WHERE
ManagerID IS NULL
UNION ALL
-- Recursive Member: Find employees reporting to the current level
SELECT
e.EmployeeID,
e.FirstName,
e.LastName,
e.ManagerID,
eh.Level + 1 AS Level
FROM
Employees e
JOIN
EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT
EmployeeID,
FirstName,
LastName,
ManagerID,
Level
FROM
EmployeeHierarchy
ORDER BY
Level, EmployeeID;
-- Expected Output:
-- EmployeeID | FirstName | LastName | ManagerID | Level
-- ---------- | --------- | -------- | --------- | -----
-- 1 | John | Doe | NULL | 1
-- 2 | Jane | Smith | 1 | 2
-- 3 | Robert | Jones | 1 | 2
-- 5 | Michael | Davis | 1 | 2
-- 4 | Emily | Brown | 2 | 3
-- 6 | Jessica | Wilson | 5 | 3

Example 4: Paginating Results

WITH PagedResults AS (
SELECT
EmployeeID,
FirstName,
LastName,
Salary,
ROW_NUMBER() OVER (ORDER BY LastName, FirstName) AS RowNum
FROM
Employees
)
SELECT
EmployeeID,
FirstName,
LastName,
Salary
FROM
PagedResults
WHERE
RowNum BETWEEN 1 AND 3 -- Get the first 3 records (page 1)
ORDER BY
RowNum;
-- Expected Output (first 3 employees alphabetically):
-- EmployeeID | FirstName | LastName | Salary
-- ---------- | --------- | -------- | -------
-- 4 | Emily | Brown | 80000.00
-- 5 | Michael | Davis | 90000.00
-- 1 | John | Doe | 60000.00

Example 5: Date Series Generation (PostgreSQL)

WITH RECURSIVE DateSeries AS (
SELECT DATE('2023-01-01') AS dt
UNION ALL
SELECT dt + INTERVAL '1 day'
FROM DateSeries
WHERE dt < DATE('2023-01-10')
)
SELECT dt FROM DateSeries;
-- Expected Output:
-- dt
-- ----------
-- 2023-01-01
-- 2023-01-02
-- 2023-01-03
-- 2023-01-04
-- 2023-01-05
-- 2023-01-06
-- 2023-01-07
-- 2023-01-08
-- 2023-01-09
-- 2023-01-10
  • Incorrect Recursion Logic: A common mistake in recursive CTEs is defining the recursive member incorrectly, leading to infinite loops or incorrect results. Carefully review the recursion condition and ensure it eventually terminates.
  • Forgetting the Anchor Member: Recursive CTEs must have an anchor member. Omitting it will result in a syntax error.
  • Exceeding Recursion Limit: If a recursive CTE runs for too long, it may exceed the database’s recursion limit. Adjust the recursion logic or increase the limit (with caution, as this can consume significant resources). MySQL does not support recursive CTEs without explicitly setting the max_recursion_depth variable.
  • Performance Issues with Complex CTEs: Overly complex CTEs can impact query performance. Break down complex CTEs into smaller, more manageable CTEs or consider alternative query strategies.
  • Incorrect UNION vs. UNION ALL: Using UNION when UNION ALL is appropriate can lead to unnecessary duplicate removal, which can significantly slow down the query. Only use UNION if you explicitly need to remove duplicates.
  • Confusing CTE Scope: CTEs are only valid within the single query they are defined in. You cannot reference them in subsequent queries without redefining them.
  • Data Type Mismatches: Ensure that the data types of columns used in the UNION ALL clause of a recursive CTE are compatible. Mismatched data types can lead to unexpected results or errors.
  • Incorrect Use of Window Functions within CTEs: When using window functions inside CTEs, make sure the PARTITION BY and ORDER BY clauses are correctly defined to achieve the desired results.
FeatureMySQLPostgreSQLSQL ServerOracle
Basic CTE SupportYes (from version 8.0)YesYesYes
Recursive CTEsYes (from version 8.0.1) Requires SET SESSION max_recursion_depth = <value>; to enable.Yes, requires RECURSIVE keyword.YesYes
MaterializationMySQL’s query optimizer decides whether to materialize or inline CTEs.PostgreSQL’s query optimizer decides whether to materialize or inline CTEs.SQL Server’s query optimizer decides whether to materialize or inline CTEs.Oracle’s query optimizer decides whether to materialize or inline CTEs.
LimitationsMySQL has a default recursion limit of 0. You must set max_recursion_depth to use recursive CTEs.PostgreSQL has a default recursion limit (configurable).SQL Server has a default recursion limit (configurable).Oracle has a default recursion limit (configurable).
SyntaxStandard SQL syntax.Standard SQL syntax, requires RECURSIVE keyword for recursive CTEs.Standard SQL syntax.Standard SQL syntax.

Key takeaways regarding database differences:

  • MySQL: Requires explicit enablement of recursive CTEs and setting a recursion depth limit.
  • PostgreSQL: Requires the RECURSIVE keyword for recursive CTEs.
  • SQL Server & Oracle: Generally follow standard SQL syntax and have configurable recursion limits.

Always consult the specific documentation for your database version for the most accurate and up-to-date information. Understanding these nuances will help you write portable and efficient SQL code across different database platforms.