Radix Sort
radix-sort: The Ultimate Cheatsheet
1. Detailed Explanation
- What is radix-sort? Radix sort is a non-comparative sorting algorithm. It sorts data with integer keys by processing the keys one digit position at a time and grouping together the keys that share the same digit value at that position. The radix is the base used to split each key into digits. Radix sort can also be applied to data that can be lexicographically sorted, such as strings, by treating each character as a digit.
- Why is it important and what kind of problems does it solve? Radix sort runs in O(nk) time, where n is the number of elements and k is the length of the longest key. When k is small relative to log n, this is faster than comparison-based sorting algorithms like merge sort or quicksort, which need O(n log n) time. It solves the problem of sorting integers or strings efficiently when the range of values is known or can be estimated. It's particularly useful when sorting large datasets of integers with a limited range.
- Core concepts, underlying principles, and key terminology.
  - Radix: The base of the number system being used (e.g., 10 for decimal, 2 for binary, 256 when each byte of a key is treated as one digit).
  - Digit/Character: An individual component of a key based on the radix.
  - Least Significant Digit (LSD): The rightmost digit. LSD radix sort starts with the least significant digit and works leftward.
  - Most Significant Digit (MSD): The leftmost digit. MSD radix sort starts with the most significant digit and works rightward.
  - Counting Sort: A linear-time, stable sorting algorithm often used as a subroutine within radix sort to sort the elements based on a single digit. It works by counting the occurrences of each digit and then using those counts to determine the correct position of each element in the sorted output.
  - Buckets/Bins: Temporary storage used to group elements with the same digit value during each pass of the radix sort algorithm.
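The terms above can be made concrete with a small sketch of LSD passes built on explicit buckets. The helper names `digit_at` and `bucket_pass` are illustrative only and do not come from any library; the sketch assumes non-negative integers.

```python
def digit_at(value, place, radix=10):
    """Return the digit of a non-negative integer at `place` (0 = least significant)."""
    return (value // radix ** place) % radix


def bucket_pass(values, place, radix=10):
    """One stable pass: distribute values into buckets by digit, then concatenate."""
    buckets = [[] for _ in range(radix)]
    for v in values:
        buckets[digit_at(v, place, radix)].append(v)  # appending keeps arrival order
    return [v for bucket in buckets for v in bucket]


data = [170, 45, 75, 90, 802, 24, 2, 66]
after_ones = bucket_pass(data, 0)
after_tens = bucket_pass(after_ones, 1)
after_hundreds = bucket_pass(after_tens, 2)

print(after_ones)      # [170, 90, 802, 2, 24, 45, 75, 66]
print(after_tens)      # [802, 2, 24, 45, 66, 170, 75, 90]
print(after_hundreds)  # [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass only looks at one digit, yet the list ends up fully sorted because every pass keeps the relative order produced by the earlier ones.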
2. When to Use radix-sort (and When Not To)
- Identify problem patterns that suggest radix-sort is a good fit.
  - You need to sort integers or strings.
  - The range of values is known or can be estimated.
  - You want to achieve linear time complexity (O(nk)).
  - The keys can be easily broken down into digits or characters.
- Discuss scenarios where a different data structure or algorithm would be more appropriate.
  - Small datasets: Comparison-based sorts (like insertion sort) might be faster due to lower overhead.
  - Floating-point numbers: Radix sort is not directly applicable, because the sign, exponent, and mantissa fields do not order like a plain integer. The bit patterns must first be mapped to integer keys that preserve numeric order.
  - Large range of values: If the range of values is extremely large, the keys are long (k is large). Every extra digit adds a full pass over the data and the auxiliary arrays add memory overhead, so comparison-based sorts might be preferable.
  - Data that doesn't easily decompose into digits/characters: Radix sort relies on the ability to break down the data into discrete components, which is not always possible or efficient.
  - Already partially sorted data: Insertion sort can be very efficient here, whereas radix sort gains nothing from existing order.
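As an illustration of the pre-processing that floating-point keys need, the sketch below maps each 64-bit float to an unsigned integer whose order matches the numeric order, then runs a byte-wise LSD radix sort on those keys. The function names are hypothetical, the mapping assumes IEEE 754 doubles (which is what Python's `struct` format `d` packs), and NaN values are deliberately not handled.

```python
import struct

ALL_ONES_64 = 0xFFFFFFFFFFFFFFFF


def float_to_key(x):
    """Map a float to a 64-bit unsigned key that sorts in the same order (NaN excluded)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    if bits >> 63:                 # sign bit set: flip every bit so larger magnitude sorts first
        return bits ^ ALL_ONES_64
    return bits | (1 << 63)        # non-negative: set the top bit so it sorts after negatives


def radix_sort_floats(values):
    """LSD radix sort over the 8 bytes of each key (radix 256)."""
    items = [(float_to_key(v), v) for v in values]
    for shift in range(0, 64, 8):
        buckets = [[] for _ in range(256)]
        for key, v in items:
            buckets[(key >> shift) & 0xFF].append((key, v))
        items = [pair for bucket in buckets for pair in bucket]
    return [v for _, v in items]


print(radix_sort_floats([3.5, -1.25, 0.0, -10.0, 2.0, 1e-3]))
# [-10.0, -1.25, 0.0, 0.001, 2.0, 3.5]
```

The eight fixed passes are the price of the transformation: for small inputs a comparison sort will usually be simpler and faster.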
3. Core Algorithm / Data Structure Template
Pseudo-code (LSD Radix Sort):
    function radix_sort(array, radix):
        // Find the maximum number to determine the number of digits
        max_value = find_max(array)
        num_digits = number_of_digits(max_value, radix)

        for digit_place from 0 to num_digits - 1:
            // Use counting sort to sort the array based on the current digit
            array = counting_sort(array, radix, digit_place)

        return array

    function counting_sort(array, radix, digit_place):
        // Create count array to store the count of each digit
        count = array of size radix, initialized to 0

        // Count the occurrences of each digit in the current digit place
        for element in array:
            digit = get_digit(element, radix, digit_place)
            count[digit] = count[digit] + 1

        // Calculate the cumulative counts
        for i from 1 to radix - 1:
            count[i] = count[i] + count[i - 1]

        // Create output array
        output = array of size length(array)

        // Place the elements in the correct position based on the counts.
        // Iterating in reverse order keeps the sort stable.
        for element in array (iterate in reverse order):
            digit = get_digit(element, radix, digit_place)
            index = count[digit] - 1
            output[index] = element
            count[digit] = count[digit] - 1

        return output

    function get_digit(number, radix, digit_place):
        // Extract the digit at the specified digit place
        // ('/' is integer division, '^' is exponentiation)
        return (number / (radix ^ digit_place)) mod radix

    function number_of_digits(number, radix):
        // Determine the number of digits in a number based on the radix
        // ('/' is integer division; returns 0 for the number 0)
        count = 0
        while number > 0:
            number = number / radix
            count = count + 1
        return count

4. Code Implementations (Python, Java, C++)
Python

    def radix_sort(arr):
        """Sorts a list of non-negative integers using radix sort (LSD)."""
        if not arr:
            return arr

        max_val = max(arr)
        radix = 10  # Using base 10

        # Perform counting sort for each digit, starting from the least significant
        exp = 1
        while max_val // exp > 0:
            arr = counting_sort(arr, exp, radix)
            exp *= radix
        return arr

    def counting_sort(arr, exp, radix):
        """Sorts arr[] according to the digit represented by exp."""
        n = len(arr)
        output = [0] * n
        count = [0] * radix

        # Store count of occurrences in count[]
        for i in range(n):
            index = arr[i] // exp
            count[index % radix] += 1

        # Change count[i] so that count[i] now contains the actual
        # position of this digit in the output array
        for i in range(1, radix):
            count[i] += count[i - 1]

        # Build the output array (walk backwards to keep the sort stable)
        i = n - 1
        while i >= 0:
            index = arr[i] // exp
            output[count[index % radix] - 1] = arr[i]
            count[index % radix] -= 1
            i -= 1

        return output

    # Example Usage
    if __name__ == '__main__':
        arr = [170, 45, 75, 90, 802, 24, 2, 66]
        sorted_arr = radix_sort(arr)
        print("Sorted array:", sorted_arr)  # Output: Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]

Java

    import java.util.Arrays;
    class RadixSort {

        public static void radixSort(int[] arr) {
            if (arr == null || arr.length == 0) {
                return;
            }

            int maxVal = Arrays.stream(arr).max().getAsInt();
            int radix = 10; // Using base 10

            // exp is a long so that exp *= radix cannot overflow
            // when maxVal is close to Integer.MAX_VALUE
            for (long exp = 1; maxVal / exp > 0; exp *= radix) {
                countingSort(arr, exp, radix);
            }
        }

        private static void countingSort(int[] arr, long exp, int radix) {
            int n = arr.length;
            int[] output = new int[n];
            int[] count = new int[radix];

            // Store count of occurrences in count[]
            for (int i = 0; i < n; i++) {
                int index = (int) ((arr[i] / exp) % radix);
                count[index]++;
            }

            // Change count[i] so that count[i] now contains the actual
            // position of this digit in the output array
            for (int i = 1; i < radix; i++) {
                count[i] += count[i - 1];
            }

            // Build the output array (walk backwards to keep the sort stable)
            for (int i = n - 1; i >= 0; i--) {
                int index = (int) ((arr[i] / exp) % radix);
                output[count[index] - 1] = arr[i];
                count[index]--;
            }

            // Copy the output array to arr[], so that arr[] now
            // contains sorted numbers according to current digit
            System.arraycopy(output, 0, arr, 0, n);
        }

        public static void main(String[] args) {
            int[] arr = {170, 45, 75, 90, 802, 24, 2, 66};
            radixSort(arr);
            System.out.println("Sorted array: " + Arrays.toString(arr)); // Output: Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]
        }
    }

C++

    #include <iostream>
    #include <vector>
    #include <algorithm>
    using namespace std;

    void countingSort(vector<int>& arr, long long exp, int radix) {
        int n = static_cast<int>(arr.size());
        vector<int> output(n);
        vector<int> count(radix, 0);

        // Store count of occurrences in count[]
        for (int i = 0; i < n; i++) {
            int index = static_cast<int>((arr[i] / exp) % radix);
            count[index]++;
        }

        // Change count[i] so that count[i] now contains the actual
        // position of this digit in the output array
        for (int i = 1; i < radix; i++) {
            count[i] += count[i - 1];
        }

        // Build the output array (walk backwards to keep the sort stable)
        for (int i = n - 1; i >= 0; i--) {
            int index = static_cast<int>((arr[i] / exp) % radix);
            output[count[index] - 1] = arr[i];
            count[index]--;
        }

        // Copy the output array to arr[], so that arr[] now
        // contains sorted numbers according to current digit
        for (int i = 0; i < n; i++) {
            arr[i] = output[i];
        }
    }

    void radixSort(vector<int>& arr) {
        if (arr.empty()) {
            return;
        }

        int maxVal = *max_element(arr.begin(), arr.end());
        int radix = 10; // Using base 10

        // exp is a long long so that exp *= radix cannot overflow
        // when maxVal is close to INT_MAX
        for (long long exp = 1; maxVal / exp > 0; exp *= radix) {
            countingSort(arr, exp, radix);
        }
    }

    int main() {
        vector<int> arr = {170, 45, 75, 90, 802, 24, 2, 66};
        radixSort(arr);

        cout << "Sorted array: ";
        for (int num : arr) {
            cout << num << " ";
        }
        cout << endl; // Output: Sorted array: 2 24 45 66 75 90 170 802
        return 0;
    }

5. Complexity Analysis
| Case | Time Complexity | Space Complexity |
|---|---|---|
| Best Case | O(nk) | O(n + r) |
| Average Case | O(nk) | O(n + r) |
| Worst Case | O(nk) | O(n + r) |

Where:
- n is the number of elements in the input array.
- k is the maximum number of digits in the largest number (or the length of the longest string).
- r is the radix, i.e., the number of possible digit values. The time bounds treat r as a constant.
Explanation:
- Time Complexity: Radix sort performs k passes, where k is the number of digits in the largest number. Each pass uses counting sort, which takes O(n + radix) time. Since the radix is usually a constant (e.g., 10 for decimal numbers, 256 for byte-sized digits), it can be considered O(1). Thus, each pass takes O(n) time, and the overall time complexity is O(nk).
- Space Complexity: Counting sort uses an auxiliary array of size radix for counting and an output array of size n. Hence, the space complexity is O(n + radix). Again, if the radix is considered constant, the space complexity becomes O(n).
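The trade-off between the number of passes and the size of the counting array can be estimated with a few lines of arithmetic. This is a rough sketch: `passes_needed` and `estimated_work` are illustrative names, and the work estimate simply evaluates k * (n + radix) while ignoring constant factors and cache effects.

```python
def passes_needed(max_value, radix):
    """Number of LSD passes (k) needed for non-negative keys up to max_value."""
    passes = 0
    while max_value > 0:
        max_value //= radix
        passes += 1
    return passes


def estimated_work(n, max_value, radix):
    """Rough operation count k * (n + radix); constant factors are ignored."""
    return passes_needed(max_value, radix) * (n + radix)


n = 1_000_000
max_value = 2**32 - 1
for radix in (2, 10, 256, 65536):
    print(radix, passes_needed(max_value, radix), estimated_work(n, max_value, radix))
# 2 32 32000064
# 10 10 10000100
# 256 4 4001024
# 65536 2 2131072
```

For 32-bit keys, moving from base 10 to base 256 cuts the passes from 10 to 4, while the counting array grows only from 10 to 256 entries.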
6. Pro Tips, Tricks, and Common Pitfalls
- Pro Tips:
  - Choosing the right radix can significantly impact performance. A larger radix can reduce the number of passes but may increase the space required for the counting array.
  - Radix sort is efficient for fixed-length keys. If keys have highly variable lengths, padding with zeros might be necessary, but it can affect performance.
  - Use a stable sorting algorithm (like counting sort) for each digit to ensure the correct order is preserved.
- Common Pitfalls:
  - Negative Numbers: The standard implementation of radix sort typically works with non-negative integers. Handling negative numbers requires pre-processing (e.g., separating negative and positive numbers, sorting them independently, and then merging). Another approach is to shift the range of numbers so that every key is non-negative.
  - Inconsistent Radix: Any radix of 2 or more sorts correctly, but the digit extraction and the size of the counting array must use the same radix. A mismatch leads to out-of-range indices or incorrect ordering.
  - Memory Usage: Radix sort needs an output array of size n plus a counting array of size radix on every pass, which can be significant with a large radix and a large number of elements. Monitor memory usage carefully.
  - Forgetting to use a stable sorting algorithm for the digit sorting step: This is essential for LSD radix sort to work correctly, because each pass relies on the order established by the previous passes.
  - Integer Overflow: With fixed-width integers, the running exponent (exp *= radix) can overflow when the maximum value is close to the type's limit, which makes the loop produce wrong digits or fail to terminate. Use a wider type for the exponent, or extract digits with shifts and masks.
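Several of the points above (a larger radix, a stable digit pass, and digit extraction without a growing exponent) can be combined in one sketch. It assumes non-negative integers and a power-of-two radix; `radix_sort_pow2` and its `bits_per_pass` parameter are hypothetical names chosen for illustration.

```python
def radix_sort_pow2(values, bits_per_pass=8):
    """LSD radix sort for non-negative ints using radix 2**bits_per_pass."""
    out = list(values)
    if not out:
        return out
    if min(out) < 0:
        raise ValueError("this sketch only handles non-negative integers")

    radix = 1 << bits_per_pass
    mask = radix - 1
    max_value = max(out)
    shift = 0
    while (max_value >> shift) > 0:
        # Count how often each digit occurs at this position.
        count = [0] * radix
        for v in out:
            count[(v >> shift) & mask] += 1

        # Turn counts into the starting index of each digit's run.
        total = 0
        for d in range(radix):
            count[d], total = total, total + count[d]

        # Forward scan with starting indices keeps equal digits in order (stable).
        buf = [0] * len(out)
        for v in out:
            d = (v >> shift) & mask
            buf[count[d]] = v
            count[d] += 1

        out = buf
        shift += bits_per_pass
    return out


print(radix_sort_pow2([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Shifts and masks replace the division by a growing exponent, so in fixed-width languages the same structure avoids the overflow pitfall as long as the shift stays below the key width.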
7. Classic Problem Examples
Description: Given an array of integers nums, sort the array in ascending order and return it. You must solve the problem without using any built-in functions in O(n log n) time complexity and with the smallest space complexity possible.
High-level Approach:
- Determine Applicability: Radix sort can be used to solve this problem, especially if the constraints on the number range allow it to be more efficient than O(n log n) comparison sorts. However, since the problem explicitly asks for O(n log n) time and minimal space, radix sort, which needs O(n) auxiliary space, might not be the solution LeetCode intends.
- Handle Negative Numbers (if necessary): Check if the input array contains negative numbers. If it does, you can either:
  - Separate the negative and positive numbers, sort them independently, and then merge the results.
  - Find the minimum value in the array and subtract it from every element so that all values become non-negative. After sorting, add the minimum back to each element. In fixed-width languages, make sure the shifted values cannot overflow.
- Implement Radix Sort: Use the LSD radix sort algorithm as described above. Find the maximum number to determine the number of digits. Then, iterate through each digit place (from least significant to most significant) and use counting sort to sort the array based on that digit.
- Consider the Constraints: Evaluate whether the range of numbers allows radix sort to be more efficient than comparison-based sorts like merge sort, which would also satisfy the O(n log n) requirement. If the range is very large, a comparison-based sort might be preferable because the number of passes grows with the key length.
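The approach above can be sketched as follows, using the shift-by-minimum option for negative numbers and a bucket-based digit pass in base 10. The function name `sort_array` is an assumption rather than a signature taken from the problem statement, and the sketch still calls `min` and `max`, which a strict reading of "no built-in functions" might require replacing with explicit loops.

```python
def sort_array(nums):
    """Sort integers (possibly negative) with LSD radix sort after shifting by the minimum."""
    if not nums:
        return []

    low = min(nums)
    shifted = [x - low for x in nums]      # every value is now >= 0
    max_val = max(shifted)

    exp = 1
    while max_val // exp > 0:
        buckets = [[] for _ in range(10)]
        for x in shifted:
            buckets[(x // exp) % 10].append(x)   # stable: keeps arrival order
        shifted = [x for bucket in buckets for x in bucket]
        exp *= 10

    return [x + low for x in shifted]      # undo the shift


print(sort_array([5, -3, 0, 12, -40, 7]))
# [-40, -3, 0, 5, 7, 12]
```

Python integers do not overflow, so the shift is safe here; in Java or C++ the shifted values would need a wider type when the input spans most of the `int` range.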