Indexes and Query Optimization
Difficulty: Intermediate
Generated on: 2025-07-10 02:27:09
Category: SQL Cheatsheet for Database Development
SQL Cheatsheet: Indexes and Query Optimization (Intermediate)
Section titled “SQL Cheatsheet: Indexes and Query Optimization (Intermediate)”1. Quick Overview
Section titled “1. Quick Overview”What is it?
- Indexes: Data structures that improve the speed of data retrieval operations on a database table. They act like an index in a book, allowing the database to quickly locate specific rows without scanning the entire table.
- Query Optimization: The process of analyzing and improving the performance of SQL queries, making them execute faster and more efficiently. It involves techniques like index usage, query rewriting, and statistics updates.
When to use it?
- Indexes:
- When queries are frequently slow on large tables.
- For columns used in
WHERE,JOIN,ORDER BY, andGROUP BYclauses. - When read operations significantly outweigh write operations.
- Query Optimization:
- When application performance is slow due to database queries.
- After adding new indexes or modifying data structures.
- During performance testing and load testing.
2. Syntax
Section titled “2. Syntax”Creating Indexes
Section titled “Creating Indexes”-- Single-column indexCREATE INDEX idx_lastname ON customers (last_name);
-- Composite index (multiple columns)CREATE INDEX idx_lastname_firstname ON customers (last_name, first_name);
-- Unique index (enforces uniqueness and improves lookup speed)CREATE UNIQUE INDEX idx_email ON users (email);
-- Partial index (PostgreSQL, useful for indexing a subset of data)CREATE INDEX idx_active_users ON users (user_id) WHERE is_active = TRUE;
-- Functional index (MySQL, PostgreSQL, SQL Server) - index based on a function of a columnCREATE INDEX idx_lower_email ON users (LOWER(email)); -- Case-insensitive email lookups
-- Clustered Index (SQL Server, Oracle) - determines the physical order of dataCREATE CLUSTERED INDEX idx_order_date ON orders (order_date);Dropping Indexes
Section titled “Dropping Indexes”DROP INDEX idx_lastname ON customers; -- MySQL, PostgreSQL
DROP INDEX customers.idx_lastname ON database_name; -- SQL Server
DROP INDEX idx_lastname; -- Oracle (requires schema prefix)Analyzing Queries (EXPLAIN)
Section titled “Analyzing Queries (EXPLAIN)”-- MySQL, PostgreSQL, SQL Server (Execution plan)EXPLAIN SELECT * FROM orders WHERE customer_id = 123 AND order_date > '2023-01-01';
-- Oracle (EXPLAIN PLAN, then SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);)EXPLAIN PLAN FOR SELECT * FROM orders WHERE customer_id = 123 AND order_date > '2023-01-01';SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);Updating Statistics
Section titled “Updating Statistics”-- MySQLANALYZE TABLE customers;
-- PostgreSQLANALYZE customers;
-- SQL ServerUPDATE STATISTICS customers;
-- OracleEXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'your_schema', tabname => 'customers');3. Common Use Cases
Section titled “3. Common Use Cases”- Filtering Data: Speeding up queries with
WHEREclauses. - Joining Tables: Optimizing
JOINoperations between tables. - Sorting Data: Improving
ORDER BYperformance. - Grouping Data: Making
GROUP BYqueries faster. - Unique Constraints: Enforcing uniqueness and improving lookup speed for unique columns.
- Range Queries: Indexing date or numeric columns for efficient range-based searches.
- Text Search: Using full-text indexes for faster text-based searches (database-specific).
4. Best Practices
Section titled “4. Best Practices”- Index Selectively: Don’t index every column. Focus on columns frequently used in
WHERE,JOIN,ORDER BY, andGROUP BYclauses. - Composite Indexes: Create composite indexes for columns used together in queries. The order of columns matters (most selective column first).
- Index Cardinality: Indexes are most effective on columns with high cardinality (many distinct values). Indexing low-cardinality columns (e.g., gender) might not be beneficial.
- Index Size: Smaller indexes are generally faster. Avoid including large columns in indexes unless absolutely necessary.
- Regularly Update Statistics: Database optimizers rely on statistics to choose the best execution plan. Keep statistics up-to-date, especially after significant data changes.
- Use
EXPLAIN: Analyze query execution plans to identify bottlenecks and areas for improvement. - Query Rewriting: Sometimes, rewriting a query can significantly improve its performance. Look for opportunities to simplify queries or use more efficient constructs.
- Consider Covering Indexes: A covering index includes all the columns needed to satisfy a query, so the database doesn’t need to access the table itself.
- Avoid Functions in
WHEREClauses (if possible): Using functions on indexed columns inWHEREclauses can prevent index usage. Consider using functional indexes (if your database supports them) or rewriting the query. - Avoid Implicit Type Conversions: Ensure that data types in
WHEREclauses match the data types of the columns. Implicit type conversions can prevent index usage. - Monitor Index Usage: Use database monitoring tools to track index usage and identify unused or redundant indexes.
5. Examples
Section titled “5. Examples”Sample Data (Customers Table)
Section titled “Sample Data (Customers Table)”CREATE TABLE customers ( customer_id INT PRIMARY KEY, first_name VARCHAR(50), last_name VARCHAR(50), email VARCHAR(100), city VARCHAR(50), country VARCHAR(50));
INSERT INTO customers (customer_id, first_name, last_name, email, city, country) VALUES(1, 'John', 'Doe', 'john.doe@example.com', 'New York', 'USA'),(2, 'Jane', 'Smith', 'jane.smith@example.com', 'London', 'UK'),(3, 'Peter', 'Jones', 'peter.jones@example.com', 'Paris', 'France'),(4, 'Alice', 'Brown', 'alice.brown@example.com', 'Tokyo', 'Japan'),(5, 'Bob', 'Wilson', 'bob.wilson@example.com', 'Sydney', 'Australia'),(6, 'Charlie', 'Garcia', 'charlie.garcia@example.com', 'Madrid', 'Spain'),(7, 'David', 'Rodriguez', 'david.rodriguez@example.com', 'Berlin', 'Germany'),(8, 'Eve', 'Martinez', 'eve.martinez@example.com', 'Rome', 'Italy'),(9, 'Frank', 'Anderson', 'frank.anderson@example.com', 'Toronto', 'Canada'),(10, 'Grace', 'Thomas', 'grace.thomas@example.com', 'Mexico City', 'Mexico');
CREATE INDEX idx_lastname ON customers (last_name);CREATE INDEX idx_city_country ON customers (city, country);Example 1: Filtering by Last Name
Section titled “Example 1: Filtering by Last Name”-- Without index (slow on large tables)SELECT * FROM customers WHERE last_name = 'Smith';
-- With index (idx_lastname) - much fasterSELECT * FROM customers WHERE last_name = 'Smith';
-- Explain Plan (MySQL)EXPLAIN SELECT * FROM customers WHERE last_name = 'Smith';-- Possible Output:-- id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |-- -- | ----------- | --------- | ---------- | ---- | --------------- | ------------ | ------- | ----- | ---- | -------- | ------------ |-- 1 | SIMPLE | customers | NULL | ref | idx_lastname | idx_lastname | 152 | const | 1 | 100.00 | Using index condition |Example 2: Filtering by City and Country
Section titled “Example 2: Filtering by City and Country”-- Without index (slow)SELECT * FROM customers WHERE city = 'London' AND country = 'UK';
-- With composite index (idx_city_country) - fasterSELECT * FROM customers WHERE city = 'London' AND country = 'UK';
-- Explain Plan (PostgreSQL)EXPLAIN SELECT * FROM customers WHERE city = 'London' AND country = 'UK';-- Possible Output:-- QUERY PLAN-- ------------------------------------------------------------------------- Index Scan using idx_city_country on customers (cost=0.15..8.17 rows=1 width=165)-- Index Cond: ((city = 'London'::text) AND (country = 'UK'::text))-- (2 rows)Example 3: Using a Covering Index
Section titled “Example 3: Using a Covering Index”-- Without covering index (requires table lookup)SELECT first_name, last_name FROM customers WHERE city = 'New York';
-- Create a covering index (if your database supports it)CREATE INDEX idx_city_name ON customers (city, first_name, last_name); -- MySQL, PostgreSQL-- orCREATE INDEX idx_city_name ON customers (city) INCLUDE (first_name, last_name); -- SQL Server
-- With covering index (idx_city_name) - fastest (no table lookup)SELECT first_name, last_name FROM customers WHERE city = 'New York';Example 4: Using a Functional Index
Section titled “Example 4: Using a Functional Index”-- Without functional index (case-sensitive search, may miss results)SELECT * FROM customers WHERE email = 'John.Doe@example.com';
-- With functional index (case-insensitive search)CREATE INDEX idx_lower_email ON customers (LOWER(email));
SELECT * FROM customers WHERE LOWER(email) = 'john.doe@example.com';Example 5: Query Rewriting (Avoiding OR)
Section titled “Example 5: Query Rewriting (Avoiding OR)”-- Original query (using OR - can be slow)SELECT * FROM customers WHERE city = 'London' OR country = 'USA';
-- Rewritten query (using UNION ALL - can be faster if indexes exist on city and country)SELECT * FROM customers WHERE city = 'London'UNION ALLSELECT * FROM customers WHERE country = 'USA'WHERE NOT EXISTS (SELECT 1 FROM customers WHERE city = 'London'); -- Avoid duplicates6. Common Pitfalls
Section titled “6. Common Pitfalls”- Over-Indexing: Creating too many indexes can slow down write operations (inserts, updates, deletes) and increase storage space.
- Ignoring Cardinality: Indexing low-cardinality columns can be ineffective and waste resources.
- Not Updating Statistics: Outdated statistics can lead to poor query plans and slow performance.
- Using Functions on Indexed Columns: Using functions in
WHEREclauses can prevent index usage. - Implicit Type Conversions: Data type mismatches can prevent index usage.
- Ignoring
EXPLAIN: Not analyzing query execution plans to identify bottlenecks. - Blindly Adding Indexes: Adding indexes without understanding the query patterns can be counterproductive.
- Not Considering Composite Index Order: The order of columns in a composite index matters. Place the most selective column first.
- Index Fragmentation: Over time, indexes can become fragmented, reducing their performance. Rebuild or reorganize indexes periodically. (Database-specific commands)
7. Database Variations
Section titled “7. Database Variations”| Feature | MySQL | PostgreSQL | SQL Server | Oracle |
|---|---|---|---|---|
| Index Creation | CREATE INDEX, CREATE UNIQUE INDEX, CREATE FULLTEXT INDEX | CREATE INDEX, CREATE UNIQUE INDEX, CREATE INDEX ... WHERE (partial indexes) | CREATE INDEX, CREATE UNIQUE INDEX, CREATE CLUSTERED INDEX, CREATE XML INDEX, CREATE FULLTEXT INDEX | CREATE INDEX, CREATE UNIQUE INDEX, CREATE BITMAP INDEX, CREATE CLUSTERED INDEX |
| Dropping Indexes | DROP INDEX index_name ON table_name; | DROP INDEX index_name; | DROP INDEX index_name ON table_name; | DROP INDEX index_name; |
| Analyze Table | ANALYZE TABLE table_name; | ANALYZE table_name; | UPDATE STATISTICS table_name; | EXEC DBMS_STATS.GATHER_TABLE_STATS(...); |
| Explain Plan | EXPLAIN SELECT ...; | EXPLAIN SELECT ...; | EXPLAIN SELECT ...; | EXPLAIN PLAN FOR SELECT ...; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY); |
| Functional Indexes | Supported via expression indexes. | Supported via expression indexes. | Supported via computed columns and indexing the computed column. | Supported via function-based indexes. |
| Covering Indexes | Supported via covering indexes (InnoDB 5.6+). | Supported via INCLUDE clause in CREATE INDEX. | Supported via INCLUDE clause in CREATE INDEX. | Supported via INCLUDE clause in CREATE INDEX (Oracle 11g R2 and later). |
| Auto Indexing | No automatic index creation. | No automatic index creation. | Auto-create stats (but not indexes) configurable. | No automatic index creation. |
Security Note: Be mindful of who has permissions to create and drop indexes, as these operations can significantly impact database performance and stability. Limit access to these privileges to authorized database administrators.