Segment Tree
segment-tree: The Ultimate Cheatsheet
Section titled “segment-tree: The Ultimate Cheatsheet”1. Detailed Explanation
Section titled “1. Detailed Explanation”-
What is a segment-tree? A segment-tree (also known as an interval tree) is a tree data structure used for storing information about intervals, or segments. It allows querying which of the stored segments contain a given point. More generally, it allows fast retrieval of aggregated information (e.g., sum, min, max) over a range or segment of an array.
-
Why is it important and what kind of problems does it solve? Segment-trees are important because they provide an efficient way to perform range queries on an array. They excel in scenarios where you need to repeatedly query a range of elements in an array for a specific property (e.g., sum, minimum, maximum) and/or update individual elements within the array. They solve problems involving:
- Range Sum Queries
- Range Minimum/Maximum Queries (RMQ)
- Range Update Queries (lazy propagation is often used here)
-
Core concepts, underlying principles, and key terminology:
- Tree Structure: A segment-tree is a binary tree where each node represents an interval.
- Leaf Nodes: Leaf nodes represent single elements of the original array.
- Internal Nodes: Internal nodes represent the aggregation of their children’s intervals. For example, if the operation is sum, an internal node stores the sum of the elements represented by its children.
- Complete Binary Tree (Ideally): The structure is often designed to be a complete or nearly complete binary tree for optimal performance. Padding the array with neutral elements (e.g., 0 for sum, infinity for min/max) can help achieve this.
- Recursive Construction: The tree is typically built recursively, dividing the array into halves at each level.
- Querying: Queries are also performed recursively, traversing the tree to find the nodes that cover the query range.
- Updating: Updates can be done in O(log n) time by updating the relevant nodes in the tree.
- Lazy Propagation: A technique used to efficiently handle range updates. Instead of updating every node in the range immediately, updates are “lazily” applied to nodes as they are visited during queries or further updates.
2. When to Use segment-tree (and When Not To)
Section titled “2. When to Use segment-tree (and When Not To)”-
Identify problem patterns that suggest segment-tree is a good fit:
- Frequent Range Queries: The problem involves a large number of range queries (e.g., sum, min, max) on an array.
- Updates: The array elements are frequently updated.
- Non-Overlapping Intervals (Less Critical): While segment trees can handle overlapping intervals, they are particularly efficient for non-overlapping intervals after preprocessing.
-
Discuss scenarios where a different data structure or algorithm would be more appropriate:
- Static Array, Few Queries: If the array is static (not updated) and there are only a few queries, a simple prefix sum array can be more efficient for range sum queries (O(1) query time).
- Single Point Queries, Frequent Updates: If you only need to query individual elements and updates are very frequent, a simple array might be sufficient or a Fenwick Tree (Binary Indexed Tree) could be a better choice as it often has a smaller constant factor.
- Simple Updates and Limited Range Queries: For extremely simple updates and infrequent range queries, iterating through the range is often sufficient.
- Immutable Data: If the data is immutable, persistent segment trees might be considered.
3. Core Algorithm / Data Structure Template
Section titled “3. Core Algorithm / Data Structure Template”Here’s a high-level template for approaching segment-tree problems:
- Define the Operation: Determine the operation to be performed on the range (e.g., sum, min, max).
- Choose Data Structure: Decide on the type of segment tree (e.g., basic, lazy propagation).
- Build the Tree:
- Calculate the size of the segment tree array (usually 4 * n).
- Recursively build the tree from the bottom up.
- Leaf nodes store the values of the original array.
- Internal nodes store the aggregated values of their children.
- Query:
- Recursively traverse the tree to find the nodes that cover the query range.
- If a node’s range is completely within the query range, return its value.
- If a node’s range is completely outside the query range, return a neutral value (e.g., 0 for sum, infinity for min, -infinity for max).
- Otherwise, recursively query the left and right children and combine their results.
- Update:
- Recursively traverse the tree to find the leaf node corresponding to the element to be updated.
- Update the leaf node’s value.
- Update the values of the parent nodes up to the root.
- Lazy Propagation (if applicable):
- Store pending updates in nodes (lazy values).
- Before visiting a node during a query or update, push down the lazy values to its children.
- Apply the lazy value to the current node.
4. Code Implementations (Python, Java, C++)
Section titled “4. Code Implementations (Python, Java, C++)”Python
Section titled “Python”class SegmentTree: def __init__(self, arr): self.arr = arr self.n = len(arr) self.tree = [0] * (4 * self.n) # Allocate space for the tree self.build(0, 0, self.n - 1) # root, start, end
def build(self, node, start, end): if start == end: self.tree[node] = self.arr[start] return
mid = (start + end) // 2 self.build(2 * node + 1, start, mid) # Left child self.build(2 * node + 2, mid + 1, end) # Right child self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2] # Sum operation
def query(self, node, start, end, left, right): if right < start or end < left: return 0 # Out of range
if left <= start and end <= right: return self.tree[node] # Completely within range
mid = (start + end) // 2 left_sum = self.query(2 * node + 1, start, mid, left, right) right_sum = self.query(2 * node + 2, mid + 1, end, left, right) return left_sum + right_sum # Combine results
def update(self, node, start, end, idx, val): if start == end: self.arr[idx] = val # Update the original array self.tree[node] = val # Update the leaf node return
mid = (start + end) // 2 if idx <= mid: self.update(2 * node + 1, start, mid, idx, val) # Update left child else: self.update(2 * node + 2, mid + 1, end, idx, val) # Update right child self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2] # Update parentclass SegmentTree { private int[] arr; private int[] tree; private int n;
public SegmentTree(int[] arr) { this.arr = arr; this.n = arr.length; this.tree = new int[4 * n]; // Allocate space for the tree build(0, 0, n - 1); // root, start, end }
private void build(int node, int start, int end) { if (start == end) { tree[node] = arr[start]; return; }
int mid = (start + end) / 2; build(2 * node + 1, start, mid); // Left child build(2 * node + 2, mid + 1, end); // Right child tree[node] = tree[2 * node + 1] + tree[2 * node + 2]; // Sum operation }
public int query(int node, int start, int end, int left, int right) { if (right < start || end < left) { return 0; // Out of range }
if (left <= start && end <= right) { return tree[node]; // Completely within range }
int mid = (start + end) / 2; int leftSum = query(2 * node + 1, start, mid, left, right); int rightSum = query(2 * node + 2, mid + 1, end, left, right); return leftSum + rightSum; // Combine results }
public void update(int node, int start, int end, int idx, int val) { if (start == end) { arr[idx] = val; // Update the original array tree[node] = val; // Update the leaf node return; }
int mid = (start + end) / 2; if (idx <= mid) { update(2 * node + 1, start, mid, idx, val); // Update left child } else { update(2 * node + 2, mid + 1, end, idx, val); // Update right child } tree[node] = tree[2 * node + 1] + tree[2 * node + 2]; // Update parent }}#include <iostream>#include <vector>
using namespace std;
class SegmentTree {public: vector<int> arr; vector<int> tree; int n;
SegmentTree(vector<int>& arr) { this->arr = arr; this->n = arr.size(); tree.resize(4 * n); // Allocate space for the tree build(0, 0, n - 1); // root, start, end }
void build(int node, int start, int end) { if (start == end) { tree[node] = arr[start]; return; }
int mid = (start + end) / 2; build(2 * node + 1, start, mid); // Left child build(2 * node + 2, mid + 1, end); // Right child tree[node] = tree[2 * node + 1] + tree[2 * node + 2]; // Sum operation }
int query(int node, int start, int end, int left, int right) { if (right < start || end < left) { return 0; // Out of range }
if (left <= start && end <= right) { return tree[node]; // Completely within range }
int mid = (start + end) / 2; int left_sum = query(2 * node + 1, start, mid, left, right); int right_sum = query(2 * node + 2, mid + 1, end, left, right); return left_sum + right_sum; // Combine results }
void update(int node, int start, int end, int idx, int val) { if (start == end) { arr[idx] = val; // Update the original array tree[node] = val; // Update the leaf node return; }
int mid = (start + end) / 2; if (idx <= mid) { update(2 * node + 1, start, mid, idx, val); // Update left child } else { update(2 * node + 2, mid + 1, end, idx, val); // Update right child } tree[node] = tree[2 * node + 1] + tree[2 * node + 2]; // Update parent }};5. Complexity Analysis
Section titled “5. Complexity Analysis”| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Build | O(n) | O(n) |
| Query | O(log n) | O(1) |
| Update | O(log n) | O(1) |
| Lazy Propagation | Varies, O(log n) per operation in many cases | O(n) |
- Best Case: The best-case scenario for queries and updates is when the query range or update index corresponds directly to a node in the tree, resulting in O(1) time. This is rare.
- Average Case: The average case for queries and updates is O(log n), as the tree is typically balanced.
- Worst Case: The worst-case scenario for queries and updates is also O(log n), as the height of the tree is logarithmic with respect to the size of the array.
6. Pro Tips, Tricks, and Common Pitfalls
Section titled “6. Pro Tips, Tricks, and Common Pitfalls”-
Pro Tips:
- Lazy Propagation: Master lazy propagation for efficient range updates. It significantly improves performance when dealing with multiple overlapping updates.
- Non-Commutative Operations: Be careful when using segment trees with non-commutative operations (e.g., matrix multiplication). The order of operations matters.
- Memory Allocation: Allocate enough memory for the segment tree array. A common mistake is to allocate only
2 * ninstead of4 * n. - Neutral Elements: Understand and correctly use neutral elements for the chosen operation (e.g., 0 for sum, infinity for min/max).
- Index Management: Be mindful of index offsets (0-based vs. 1-based indexing) when implementing the tree.
-
Common Pitfalls:
- Incorrect Tree Size: Allocating insufficient memory for the tree array. Always use
4 * nor calculate the required size precisely. - Off-by-One Errors: Errors in calculating the
midpoint or in comparing indices during queries and updates. - Missing Base Cases: Forgetting the base cases in the recursive
build,query, andupdatefunctions. - Incorrect Neutral Elements: Using the wrong neutral element for the chosen operation. This will lead to incorrect results.
- Not Pushing Down Lazy Updates: Forgetting to push down lazy updates before visiting a node during a query or further update in lazy propagation implementations.
- Integer Overflow: Ensure that intermediate calculations (e.g., sum of a large range) don’t overflow. Use appropriate data types (e.g.,
long longin C++,longin Java).
- Incorrect Tree Size: Allocating insufficient memory for the tree array. Always use
7. Classic Problem Examples
Section titled “7. Classic Problem Examples”Description: Given an integer array nums, handle multiple queries of the following types: Implement the NumArray class: Example 1: Constraints:
High-Level Approach:
- Use a Segment Tree: A segment tree is perfectly suited for this problem because it allows both range sum queries and updates to individual elements.
- Build the Tree: Construct a segment tree from the given
numsarray. Each node in the tree will store the sum of the elements in its corresponding range. - Update (update(index, val)):
- Traverse the tree to find the leaf node corresponding to the given
index. - Update the value of the leaf node to
val. - Update the sums of all parent nodes along the path from the leaf to the root.
- Traverse the tree to find the leaf node corresponding to the given
- Query (sumRange(left, right)):
- Traverse the tree to find the nodes that cover the range
[left, right]. - If a node’s range is completely within
[left, right], return its sum. - If a node’s range is completely outside
[left, right], return 0. - Otherwise, recursively query the left and right children and combine their results.
- Traverse the tree to find the nodes that cover the range