Performance Tuning and Execution Plans
Difficulty: Advanced
Generated on: 2025-07-10 02:33:55
Category: SQL Cheatsheet for Database Development
SQL Performance Tuning & Execution Plans Cheatsheet (Advanced)
This cheatsheet provides a comprehensive guide to SQL performance tuning and execution plans. It's designed for database developers seeking to optimize query performance and understand how the database engine executes their code.
1. Quick Overview
- Performance Tuning: The process of optimizing SQL queries and database configurations to improve query execution speed and overall system performance.
- Execution Plan: A detailed roadmap generated by the database engine that outlines the steps it will take to execute a SQL query. Understanding execution plans is crucial for identifying performance bottlenecks.
When to Use:
- Slow-running queries
- High CPU utilization by database server
- Frequent timeouts
- When optimizing for scale (e.g., handling larger datasets)
- Proactive optimization during development
2. Syntax
A. Obtaining Execution Plans:
- SQL Server:

```sql
-- Show estimated execution plan (text)
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM Customers WHERE City = 'London';
GO
SET SHOWPLAN_ALL OFF;
GO

-- Show estimated execution plan (XML)
SET SHOWPLAN_XML ON;
GO
SELECT * FROM Customers WHERE City = 'London';
GO
SET SHOWPLAN_XML OFF;
GO

-- Show actual execution plan (requires query execution, more accurate)
-- In SSMS, right-click in the query window and select "Include Actual Execution Plan"
SELECT * FROM Customers WHERE City = 'London';

-- Using STATISTICS PROFILE (deprecated, but sometimes useful)
SET STATISTICS PROFILE ON;
GO
SELECT * FROM Customers WHERE City = 'London';
GO
SET STATISTICS PROFILE OFF;
GO
```
- PostgreSQL:

```sql
EXPLAIN SELECT * FROM customers WHERE city = 'London';

-- Include the actual execution time (requires query execution)
EXPLAIN ANALYZE SELECT * FROM customers WHERE city = 'London';
```
- MySQL:

```sql
EXPLAIN SELECT * FROM customers WHERE city = 'London';

-- Extended EXPLAIN (deprecated in MySQL 5.7; its extra output is produced by default in 8.0)
EXPLAIN EXTENDED SELECT * FROM customers WHERE city = 'London';
SHOW WARNINGS; -- To see the rewritten query after optimization

-- Using the optimizer trace (for deeper analysis)
SET optimizer_trace = "enabled=on";
SELECT * FROM customers WHERE city = 'London';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace = "enabled=off";
```
- Oracle:

```sql
EXPLAIN PLAN FOR SELECT * FROM customers WHERE city = 'London';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY); -- or DBMS_XPLAN.DISPLAY_CURSOR for a cached cursor
```
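The vendor commands above all answer the same question: which access path will the engine choose? As a portable, runnable sketch of the workflow (using SQLite via Python purely as a stand-in; table and index names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany("INSERT INTO customers (city) VALUES (?)",
                 [("London",), ("Paris",), ("London",)])

# Without an index, the plan reports a full scan of the table.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE city = 'London'"
).fetchall()[0][3]
print(before)  # e.g. "SCAN customers"

# After adding an index, the plan switches to an index search.
conn.execute("CREATE INDEX ix_customers_city ON customers (city)")
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE city = 'London'"
).fetchall()[0][3]
print(after)   # e.g. "SEARCH customers USING INDEX ix_customers_city (city=?)"
```

The loop is the same on any engine: read the plan, change the schema or the query, read the plan again.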
B. Query Hints (Use with Caution):
- SQL Server:

```sql
SELECT * FROM Customers WITH (INDEX(IX_Customers_City)) WHERE City = 'London'; -- Force index usage
SELECT * FROM Orders OPTION (MAXDOP 4);     -- Limit parallelism
SELECT * FROM Products OPTION (RECOMPILE);  -- Force recompilation
```
- PostgreSQL: (has no native index hints; the planner is steered through planner settings, or extensions such as pg_hint_plan)

```sql
SET enable_seqscan = off; -- Discourage sequential scans (use sparingly, mainly for diagnosis)
SELECT * FROM customers WHERE city = 'London';
SET enable_seqscan = on;
```
- MySQL:

```sql
SELECT * FROM customers USE INDEX (ix_customers_city) WHERE city = 'London';   -- Suggest the index
SELECT * FROM customers FORCE INDEX (ix_customers_city) WHERE city = 'London'; -- Require the index
```
- Oracle:

```sql
SELECT /*+ INDEX(customers IX_Customers_City) */ * FROM customers WHERE city = 'London';
```
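SQLite happens to offer an analogous pair of hints, which makes for a compact, runnable illustration of the suggest-versus-force distinction (a sketch only; names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("CREATE INDEX ix_customers_city ON customers (city)")

# INDEXED BY behaves like FORCE INDEX: the statement is an error
# if the named index cannot be used for the query.
forced = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM customers INDEXED BY ix_customers_city WHERE city = 'London'"
).fetchall()[0][3]

# NOT INDEXED tells the planner to ignore indexes entirely,
# which is handy for comparing plans while tuning.
unforced = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers NOT INDEXED WHERE city = 'London'"
).fetchall()[0][3]
print(forced, unforced, sep="\n")
```

The same caution applies as with the vendor hints above: pinning a plan this way masks optimizer decisions and ages badly as data changes.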
C. Index Management:
```sql
-- Create index
CREATE INDEX IX_Customers_City ON Customers (City);

-- Create composite index
CREATE INDEX IX_Orders_CustomerID_OrderDate ON Orders (CustomerID, OrderDate);

-- Create filtered index (SQL Server)
CREATE NONCLUSTERED INDEX IX_Products_Price_Active ON Products (Price) WHERE IsActive = 1;

-- Drop index (SQL Server syntax)
DROP INDEX IX_Customers_City ON Customers;
```

D. Statistics Management:
- SQL Server:

```sql
-- Update all statistics on a table
UPDATE STATISTICS Customers;

-- Update statistics for a specific index
UPDATE STATISTICS Customers IX_Customers_City;

-- Update statistics for every table in the database
EXEC sp_updatestats;

-- Create statistics manually (if auto-create is off)
CREATE STATISTICS Stats_Customers_City ON Customers (City);
```
- PostgreSQL:

```sql
ANALYZE customers;         -- Analyze the table
ANALYZE VERBOSE customers; -- More detailed output
```
- MySQL: (statistics are updated automatically in most cases)

```sql
ANALYZE TABLE customers; -- Explicitly analyze the table
```
- Oracle:

```sql
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'your_schema', TABNAME => 'customers');
```
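All four commands do the same underlying job: materialize row counts and value distributions for the cost-based planner to consult. A runnable miniature of that idea, using SQLite's `ANALYZE` only as a stand-in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("CREATE INDEX ix_customers_city ON customers (city)")
conn.executemany("INSERT INTO customers (city) VALUES (?)",
                 [("London",)] * 90 + [("Paris",)] * 10)

# ANALYZE writes per-index statistics into sqlite_stat1; the planner
# reads them when costing candidate plans.
conn.execute("ANALYZE")
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
# 'stat' is the total row count followed by the average number of rows
# per distinct key, e.g. ('customers', 'ix_customers_city', '100 50')
print(stats)
```

Stale numbers here mislead the planner in exactly the way outdated statistics do on the larger engines.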
3. Common Use Cases
- Identifying Slow Queries: Use database monitoring tools or query logs to identify queries that consume significant resources or have long execution times.
- Index Optimization: Analyze execution plans to determine if indexes are being used effectively. Create, modify, or remove indexes as needed.
- Query Rewriting: Rewrite queries to use more efficient algorithms or take advantage of database-specific features.
- Data Partitioning: Divide large tables into smaller, more manageable partitions to improve query performance.
- Materialized Views: Create pre-computed views to speed up frequently executed queries that involve complex aggregations or joins.
- Parameter Sniffing Issues (SQL Server): The plan compiled for the first parameter value is reused for later calls, which can be slow for values with very different selectivity. Fix with `OPTION (RECOMPILE)`, `OPTIMIZE FOR`, or by rewriting the query.
- Statistics Issues: Outdated or missing statistics can lead to suboptimal execution plans. Update statistics regularly.
- Deadlocks: Analyze deadlock graphs to identify the queries and resources involved in deadlocks. Optimize queries to reduce lock contention.
4. Best Practices
- Use Indexes Wisely:
  - Index columns used in `WHERE`, `JOIN`, `ORDER BY`, and `GROUP BY` clauses.
  - Avoid indexing columns with low cardinality (few distinct values).
  - Consider composite indexes for queries that filter on multiple columns.
  - Regularly review and remove unused indexes.
- Write Efficient Queries:
  - Avoid `SELECT *`; specify only the necessary columns.
  - Use `WHERE` clauses to filter data early in the query.
  - Avoid `DISTINCT` unless absolutely necessary.
  - Use `EXISTS` instead of `COUNT(*)` when checking for the existence of rows.
  - Use `UNION ALL` instead of `UNION` when duplicate rows are not a concern.
- Optimize Data Types:
  - Use the smallest data types that fit the data to reduce storage space and improve performance.
  - Avoid implicit data type conversions in `WHERE` clauses.
- Keep Statistics Up-to-Date: Regularly update statistics to ensure the query optimizer has accurate information about data distribution.
- Monitor Performance Regularly: Use database monitoring tools to track query performance and identify potential bottlenecks.
- Use Parameterized Queries: Parameterized queries prevent SQL injection and can improve performance by allowing the database to reuse execution plans.
- Normalize Your Database: Proper normalization reduces data redundancy and improves data integrity, which can lead to better query performance. However, be aware of the potential for increased JOIN operations, which can sometimes degrade performance.
- Denormalize Strategically: In some cases, denormalization (adding redundant data) can improve query performance by reducing the need for JOIN operations. However, this should be done carefully to avoid data inconsistencies.
- Avoid Cursors (Generally): Cursors are typically slower than set-based operations. Try to rewrite queries to use set-based operations whenever possible. If you must use a cursor, optimize it by fetching multiple rows at a time.
- Batch Operations: Batch insert, update, and delete operations to reduce the overhead of individual transactions.
- Optimize Database Configuration: Tune database server settings (e.g., memory allocation, buffer pool size) to match the workload.
- Hardware Considerations: Ensure the database server has sufficient CPU, memory, and disk I/O capacity to handle the workload. Consider using SSDs for improved I/O performance.
- Understand Your Data: Knowing the distribution and characteristics of your data can help you write more efficient queries and choose appropriate indexes.
- Test Thoroughly: Test all performance optimizations in a non-production environment before deploying them to production.
- Use Query Profilers: Tools like SQL Server Profiler (or Extended Events), pgAdmin's graphical EXPLAIN, and MySQL Workbench's performance dashboard can help you identify performance bottlenecks.
- Avoid Functions in the WHERE Clause: Applying a function to an indexed column in the `WHERE` clause can prevent the database from using the index. Rewrite the predicate, or index a computed column. For example, `WHERE YEAR(OrderDate) = 2023` can be rewritten as `WHERE OrderDate >= '2023-01-01' AND OrderDate < '2024-01-01'`.
- Use `WITH (NOLOCK)` Carefully (SQL Server): `NOLOCK` can improve read performance but may return uncommitted data (dirty reads). Use it only when the risk of dirty reads is acceptable.
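The sargability advice above (functions in the WHERE clause versus range predicates) can be verified against any engine's plan output. A sketch with SQLite via Python, using illustrative names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT)")
conn.execute("CREATE INDEX ix_orders_order_date ON orders (order_date)")

# Wrapping the indexed column in a function hides it from the index.
bad = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE strftime('%Y', order_date) = '2023'"
).fetchall()[0][3]

# The equivalent half-open range predicate stays sargable.
good = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'"
).fetchall()[0][3]

print(bad)   # a SCAN: every row is visited and the function evaluated
print(good)  # a SEARCH using ix_orders_order_date
```

The same before/after comparison works with `SET SHOWPLAN_ALL`, `EXPLAIN ANALYZE`, or `EXPLAIN` on the engines covered in section 2.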
5. Examples
Sample Data:
```sql
-- SQL Server example
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    City VARCHAR(50),
    Country VARCHAR(50)
);

INSERT INTO Customers (CustomerID, FirstName, LastName, City, Country) VALUES
(1, 'John', 'Doe', 'London', 'UK'),
(2, 'Jane', 'Smith', 'Paris', 'France'),
(3, 'Peter', 'Jones', 'London', 'UK'),
(4, 'Mary', 'Brown', 'New York', 'USA');

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATETIME,
    TotalAmount DECIMAL(10, 2),
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);

INSERT INTO Orders (OrderID, CustomerID, OrderDate, TotalAmount) VALUES
(101, 1, '2023-01-15', 100.00),
(102, 2, '2023-02-20', 250.00),
(103, 1, '2023-03-10', 50.00),
(104, 3, '2023-04-05', 120.00);
```

Example 1: Identifying Index Usage
```sql
-- SQL Server
-- Check whether an index is being used
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM Customers WHERE City = 'London';
GO
SET SHOWPLAN_ALL OFF;
GO

-- If no index is used, create one
CREATE INDEX IX_Customers_City ON Customers (City);

-- Re-run the query and check the execution plan
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM Customers WHERE City = 'London';
GO
SET SHOWPLAN_ALL OFF;
GO
```

Example 2: Optimizing JOIN Operations
```sql
-- SQL Server
-- Slow query (no index on CustomerID in the Orders table)
SELECT c.FirstName, c.LastName, o.OrderDate, o.TotalAmount
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE c.City = 'London';

-- Add an index on CustomerID in the Orders table
CREATE INDEX IX_Orders_CustomerID ON Orders (CustomerID);

-- Re-run the query and check the execution plan
SELECT c.FirstName, c.LastName, o.OrderDate, o.TotalAmount
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE c.City = 'London';
```

Example 3: Using a Covering Index
```sql
-- SQL Server
-- Create a covering index (includes all columns needed by the query)
CREATE INDEX IX_Customers_City_FirstName_LastName ON Customers (City, FirstName, LastName);

-- Query that can be answered entirely from the covering index
SELECT FirstName, LastName FROM Customers WHERE City = 'London';
```

Example 4: Parameter Sniffing (SQL Server)
```sql
-- Simulate a parameter sniffing issue
CREATE PROCEDURE GetOrdersByCustomerID (@CustomerID INT)
AS
BEGIN
    SELECT * FROM Orders WHERE CustomerID = @CustomerID;
END;
GO

-- Run with a CustomerID that has few orders (e.g., CustomerID = 2)
EXEC GetOrdersByCustomerID 2; -- Initial execution compiles and caches the plan

-- Run with a CustomerID that has many orders (e.g., CustomerID = 1)
EXEC GetOrdersByCustomerID 1; -- Reuses the cached plan, which may now be suboptimal

-- Fix: use OPTION (RECOMPILE)
ALTER PROCEDURE GetOrdersByCustomerID (@CustomerID INT)
AS
BEGIN
    SELECT * FROM Orders WHERE CustomerID = @CustomerID OPTION (RECOMPILE);
END;
GO

-- Alternatively, use OPTION (OPTIMIZE FOR UNKNOWN) to compile a plan
-- from average statistics rather than the sniffed value
```

Example 5: Statistics Update (PostgreSQL)
```sql
-- PostgreSQL
-- Analyze the table to update statistics
ANALYZE customers;

-- Analyze a specific column
ANALYZE customers (city);
```

Example 6: Using EXISTS instead of COUNT(*)
```sql
-- Less efficient (counts all matching rows)
SELECT CustomerID
FROM Customers
WHERE (SELECT COUNT(*) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) > 0;

-- More efficient (stops as soon as a row is found)
SELECT CustomerID
FROM Customers
WHERE EXISTS (SELECT 1 FROM Orders WHERE Orders.CustomerID = Customers.CustomerID);
```

Example 7: Partitioning (SQL Server)
```sql
-- 1. Create a partition function
CREATE PARTITION FUNCTION OrderDatePartitionFunction (DATETIME)
AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01', '2024-01-01');

-- 2. Create a partition scheme
CREATE PARTITION SCHEME OrderDatePartitionScheme
AS PARTITION OrderDatePartitionFunction
TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY]);

-- 3. Create the table on the partition scheme
-- Note: the partitioning column must be part of any unique key,
-- so OrderDate is included in the primary key here.
CREATE TABLE OrdersPartitioned (
    OrderID INT,
    CustomerID INT,
    OrderDate DATETIME,
    TotalAmount DECIMAL(10, 2),
    PRIMARY KEY (OrderID, OrderDate)
) ON OrderDatePartitionScheme (OrderDate);

-- Insert data (example)
INSERT INTO OrdersPartitioned (OrderID, CustomerID, OrderDate, TotalAmount) VALUES
(1, 1, '2021-06-15', 100.00),
(2, 2, '2022-06-20', 250.00),
(3, 1, '2023-03-10', 50.00),
(4, 3, '2024-04-05', 120.00);

-- Query that benefits from partition elimination
SELECT * FROM OrdersPartitioned
WHERE OrderDate >= '2023-01-01' AND OrderDate < '2024-01-01';
```

6. Common Pitfalls
- Ignoring Execution Plans: Failing to analyze execution plans is a common mistake. Always examine the plan to understand how the database is executing your query.
- Blindly Adding Indexes: Adding too many indexes can negatively impact write performance. Evaluate the impact of each index on both read and write operations.
- Outdated Statistics: Using outdated statistics can lead to suboptimal execution plans. Regularly update statistics.
- Assuming Correlation: Just because two columns are often used together in `WHERE` clauses doesn't automatically mean a composite index is the best solution. Test different index configurations.
- Over-Reliance on Hints: Query hints should be used sparingly and with caution. They can mask underlying problems with the query or database configuration. Consider rewriting the query or optimizing the data model instead.
- Ignoring Hardware Limitations: Performance problems can sometimes be caused by insufficient hardware resources (CPU, memory, disk I/O). Monitor hardware utilization and upgrade as needed.
- Not Testing Changes: Always test performance optimizations in a non-production environment before deploying them to production.
- Assuming Causation from Correlation: Observing slow performance and making assumptions about the cause without proper investigation. For example, assuming a slow query is due to missing statistics when it’s actually a locking issue.
- Ignoring the Impact of Data Volume: A query that performs well on a small dataset may perform poorly on a large dataset. Test your queries with realistic data volumes.
- Not Monitoring After Deployment: After deploying performance optimizations, continue to monitor performance to ensure that the changes are having the desired effect and that no new problems have been introduced.
Troubleshooting Tips:
- Start with the Basics: Check for obvious problems, such as missing indexes or outdated statistics.
- Simplify the Query: Try simplifying the query to isolate the source of the performance problem.
- Break Down the Query: Break the query down into smaller parts and test each part separately.
- Use a Query Profiler: Use a query profiler to identify the most time-consuming steps in the query execution.
- Consult the Database Documentation: The database documentation is a valuable resource for understanding query optimization techniques.
- Search Online Forums: Search online forums for similar performance problems and solutions.
7. Database Variations
- SQL Server: Uses `SET SHOWPLAN_ALL`, `SET SHOWPLAN_XML`, and `SET STATISTICS PROFILE` for execution plans. Has a powerful query optimizer with extensive hint options. Good support for filtered indexes.
- PostgreSQL: Uses `EXPLAIN` and `EXPLAIN ANALYZE` for execution plans. Has a cost-based optimizer. Supports partial indexes (similar to SQL Server's filtered indexes).
- MySQL: Uses `EXPLAIN` for execution plans. The query optimizer is less sophisticated than SQL Server's or PostgreSQL's. Supports index hints via `USE INDEX` and `FORCE INDEX`. The optimizer trace feature is useful for deeper analysis.
- Oracle: Uses `EXPLAIN PLAN` and `DBMS_XPLAN.DISPLAY` for execution plans. Has a cost-based optimizer with a rich set of hint options. Supports function-based indexes and virtual columns.
This cheatsheet provides a solid foundation for understanding SQL performance tuning and execution plans. Remember to practice and experiment with these techniques to become proficient in optimizing your SQL queries. Good luck!