Radix Sort

radix-sort: The Ultimate Cheatsheet

1. Detailed Explanation

What is radix-sort? Radix sort is a non-comparative sorting algorithm. It sorts data with integer keys by processing the keys one digit at a time and grouping together the keys that have the same digit value at the current position. The number of possible digit values is the radix. Radix sort can also be applied to data that can be lexicographically sorted, such as strings, by treating each character as a digit.
Why is it important and what kind of problems does it solve? Radix sort runs in O(nk) time, where n is the number of elements and k is the number of digits in the longest key. When k is small compared with log n, this beats the O(n log n) bound of comparison-based sorting algorithms like merge sort or quicksort. It solves the problem of sorting integers or strings efficiently when the key length is bounded, and it is particularly useful for large datasets of integers with a limited range.
Core concepts, underlying principles, and key terminology.
- Radix: The base of the number system being used (e.g., 10 for decimal, 2 for binary, 256 when each byte of the key is treated as one digit).
- Digit/Character: An individual component of a key based on the radix.
- Least Significant Digit (LSD): The rightmost digit. LSD radix sort starts with the least significant digit.
- Most Significant Digit (MSD): The leftmost digit. MSD radix sort starts with the most significant digit.
- Counting Sort: A linear-time sorting algorithm often used as a subroutine within radix sort to sort the elements based on a single digit. It works by counting the occurrences of each digit and then using those counts to determine the correct position of each element in the sorted output.
- Buckets/Bins: Temporary storage used to group elements with the same digit value during each pass of the radix sort algorithm.
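
To make the terms above concrete, here is a minimal sketch of LSD passes that uses explicit buckets instead of counting sort. The function name lsd_passes is hypothetical, and the sketch assumes non-negative integers.

```python
def lsd_passes(values, radix=10):
    """Yield a snapshot of the list after each LSD pass (non-negative ints only)."""
    values = list(values)
    if not values:
        return
    largest = max(values)
    exp = 1  # place value of the current digit: 1, radix, radix**2, ...
    while largest // exp > 0:
        # One bucket per possible digit value.
        buckets = [[] for _ in range(radix)]
        for v in values:
            buckets[(v // exp) % radix].append(v)
        # Concatenating buckets in order keeps equal digits in arrival order (stable).
        values = [v for bucket in buckets for v in bucket]
        yield list(values)
        exp *= radix


for step, snapshot in enumerate(lsd_passes([170, 45, 75, 90, 802, 24, 2, 66]), start=1):
    print(f"after pass {step}: {snapshot}")
# after pass 1: [170, 90, 802, 2, 24, 45, 75, 66]
# after pass 2: [802, 2, 24, 45, 66, 170, 75, 90]
# after pass 3: [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass only looks at one digit, and the list is fully sorted once the most significant digit of the largest key has been processed.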

2. When to Use radix-sort (and When Not To)

Radix sort is a good fit when:
- You need to sort integers or strings.
- The range of values is known or can be estimated.
- You want to achieve linear time complexity (O(nk)).
- The keys can be easily broken down into digits or characters.
A different algorithm is usually more appropriate for:
- Small datasets: Comparison-based sorts (like insertion sort) might be faster due to lower overhead.
- Floating-point numbers: radix sort is not directly applicable, because the sign, exponent, and mantissa fields do not order like an unsigned integer. The bit patterns must first be transformed into order-preserving integer keys.
- Very long keys: if k is large (for example, a very wide value range or long strings), the k passes over the data dominate the running time, and comparison-based sorts might be preferable.
- Data that doesn’t easily decompose into digits/characters: Radix sort relies on the ability to break down the data into discrete components, which is not always possible or efficient.
- Already partially sorted data: Insertion sort can be very efficient.
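
As an illustration of the pre-processing mentioned for floating-point numbers, the sketch below shows one well-known key transform for IEEE 754 doubles: flip all bits of negative values and set the sign bit of non-negative ones. The names float_key and radix_sort_floats are hypothetical, and the sketch assumes the input contains no NaN values.

```python
import struct


def float_key(x):
    """Map a double to a 64-bit unsigned int whose order matches the float order."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    if bits >> 63:                        # sign bit set: negative value
        return bits ^ 0xFFFFFFFFFFFFFFFF  # flip every bit so larger magnitude sorts first
    return bits | (1 << 63)               # non-negative: move above all negatives


def radix_sort_floats(values):
    """LSD radix sort on the transformed keys, one byte (radix 256) per pass."""
    items = [(float_key(v), v) for v in values]
    for shift in range(0, 64, 8):
        buckets = [[] for _ in range(256)]
        for key, v in items:
            buckets[(key >> shift) & 0xFF].append((key, v))
        items = [item for bucket in buckets for item in bucket]
    return [v for _, v in items]


print(radix_sort_floats([3.5, -1.25, 0.0, 2.0, -100.0]))
# [-100.0, -1.25, 0.0, 2.0, 3.5]
```

The eight fixed passes and the per-element packing are the overhead that makes a comparison sort the simpler choice for small float arrays.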

3. Core Algorithm / Data Structure Template

Pseudo-code (LSD Radix Sort):

function radix_sort(array, radix):
  // Find the maximum number to determine the number of digits
  max_value = find_max(array)
  num_digits = number_of_digits(max_value, radix)

  for digit_place from 0 to num_digits - 1:
    // Use counting sort to sort the array based on the current digit
    array = counting_sort(array, radix, digit_place)

  return array

function counting_sort(array, radix, digit_place):
  // Create count array to store the count of each digit
  count = array of size radix, initialized to 0

  // Count the occurrences of each digit in the current digit place
  for element in array:
    digit = get_digit(element, radix, digit_place)
    count[digit] = count[digit] + 1

  // Calculate the cumulative counts
  for i from 1 to radix - 1:
    count[i] = count[i] + count[i - 1]

  // Create output array
  output = array of size length(array)

  // Place the elements in the correct position based on the counts
  for element in array (iterate in reverse order):
    digit = get_digit(element, radix, digit_place)
    index = count[digit] - 1
    output[index] = element
    count[digit] = count[digit] - 1

  return output

function get_digit(number, radix, digit_place):
  // Extract the digit at the specified digit place
  // "^" is exponentiation; the division is integer (floor) division
  return floor(number / (radix ^ digit_place)) mod radix

function number_of_digits(number, radix):
  // Determine the number of digits in a number based on the radix
  count = 0
  while number > 0:
    number = floor(number / radix)
    count = count + 1
  return count

4. Code Implementations (Python, Java, C++)

Python

def radix_sort(arr):
    """Sorts a list of non-negative integers using radix sort (LSD)."""
    if not arr:
        return arr

    max_val = max(arr)
    radix = 10  # Using base 10

    # Perform counting sort for each digit, starting from the least significant
    exp = 1
    while max_val // exp > 0:
        arr = counting_sort(arr, exp, radix)
        exp *= radix
    return arr


def counting_sort(arr, exp, radix):
    """Stable counting sort of arr by the digit at place value exp."""
    n = len(arr)
    output = [0] * n
    count = [0] * radix

    # Store count of occurrences in count[]
    for i in range(n):
        index = arr[i] // exp
        count[index % radix] += 1

    # Change count[i] so that count[i] now contains actual
    # position of this digit in output array
    for i in range(1, radix):
        count[i] += count[i - 1]

    # Build the output array
    i = n - 1
    while i >= 0:
        index = arr[i] // exp
        output[count[index % radix] - 1] = arr[i]
        count[index % radix] -= 1
        i -= 1

    return output


# Example Usage
if __name__ == '__main__':
    arr = [170, 45, 75, 90, 802, 24, 2, 66]
    sorted_arr = radix_sort(arr)
    print("Sorted array:", sorted_arr)  # Output: Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]

Java

import java.util.Arrays;

class RadixSort {

    public static void radixSort(int[] arr) {
        if (arr == null || arr.length == 0) {
            return;
        }

        int maxVal = Arrays.stream(arr).max().getAsInt();
        int radix = 10; // Using base 10

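        // exp is a long so that exp *= radix cannot overflow when maxVal is
        // close to Integer.MAX_VALUE; inside the loop exp <= maxVal, so the cast is safe.
        for (long exp = 1; maxVal / exp > 0; exp *= radix) {
            countingSort(arr, (int) exp, radix);
        }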
    }

    private static void countingSort(int[] arr, int exp, int radix) {
        int n = arr.length;
        int[] output = new int[n];
        int[] count = new int[radix];

        // Store count of occurrences in count[]
        for (int i = 0; i < n; i++) {
            int index = (arr[i] / exp) % radix;
            count[index]++;
        }

        // Change count[i] so that count[i] now contains actual
        // position of this digit in output array
        for (int i = 1; i < radix; i++) {
            count[i] += count[i - 1];
        }

        // Build the output array
        for (int i = n - 1; i >= 0; i--) {
            int index = (arr[i] / exp) % radix;
            output[count[index] - 1] = arr[i];
            count[index]--;
        }

        // Copy the output array to arr[], so that arr[] now
        // contains sorted numbers according to current digit
        System.arraycopy(output, 0, arr, 0, n);
    }

    public static void main(String[] args) {
        int[] arr = {170, 45, 75, 90, 802, 24, 2, 66};
        radixSort(arr);
        System.out.println("Sorted array: " + Arrays.toString(arr)); // Output: Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]
    }
}

C++

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

void countingSort(vector<int>& arr, int exp, int radix) {
    int n = arr.size();
    vector<int> output(n);
    vector<int> count(radix, 0);

    // Store count of occurrences in count[]
    for (int i = 0; i < n; i++) {
        int index = (arr[i] / exp) % radix;
        count[index]++;
    }

    // Change count[i] so that count[i] now contains actual
    // position of this digit in output array
    for (int i = 1; i < radix; i++) {
        count[i] += count[i - 1];
    }

    // Build the output array
    for (int i = n - 1; i >= 0; i--) {
        int index = (arr[i] / exp) % radix;
        output[count[index] - 1] = arr[i];
        count[index]--;
    }

    // Copy the output array to arr[], so that arr[] now
    // contains sorted numbers according to current digit
    for (int i = 0; i < n; i++) {
        arr[i] = output[i];
    }
}

void radixSort(vector<int>& arr) {
    if (arr.empty()) {
        return;
    }

    int maxVal = *max_element(arr.begin(), arr.end());
    int radix = 10; // Using base 10

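    // exp is a long long so that exp *= radix cannot overflow when maxVal is
    // close to INT_MAX; inside the loop exp <= maxVal, so the cast is safe.
    for (long long exp = 1; maxVal / exp > 0; exp *= radix) {
        countingSort(arr, static_cast<int>(exp), radix);
    }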
}

int main() {
    vector<int> arr = {170, 45, 75, 90, 802, 24, 2, 66};
    radixSort(arr);

    cout << "Sorted array: ";
    for (int num : arr) {
        cout << num << " ";
    }
    cout << endl; // Output: Sorted array: 2 24 45 66 75 90 170 802
    return 0;
}

5. Complexity Analysis

Case	Time Complexity	Space Complexity
Best Case	O(nk)	O(n + r)
Average Case	O(nk)	O(n + r)
Worst Case	O(nk)	O(n + r)

Where:

n is the number of elements in the input array.
k is the maximum number of digits in the largest number (or the length of the longest string).
r is the radix (the size of the count array used in each pass).

Explanation:

Time Complexity: Radix sort performs k passes, where k is the number of digits in the largest number. Each pass uses counting sort, which takes O(n + radix) time. Since the radix is usually a fixed constant (e.g., 10 for decimal numbers, 256 for bytes), each pass takes O(n) time. Therefore, the overall time complexity is O(nk).
Space Complexity: Counting sort uses an auxiliary array of size radix for counting and an output array of size n. Hence, the space complexity is O(n + radix). Again, if radix is considered constant, the space complexity becomes O(n).
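
Because k depends on the radix, it can help to compute the number of passes for a given maximum key. The following is a small sketch; the helper name passes_needed is hypothetical and mirrors the number_of_digits pseudo-code from section 3.

```python
def passes_needed(max_value, radix):
    """Number of digits of max_value in the given radix, i.e. k."""
    count = 0
    while max_value > 0:
        max_value //= radix
        count += 1
    return count


# Passes required to sort 32-bit unsigned keys with different radices.
for radix in (2, 10, 256, 65536):
    print(f"radix {radix}: {passes_needed(2**32 - 1, radix)} passes")
# radix 2: 32 passes
# radix 10: 10 passes
# radix 256: 4 passes
# radix 65536: 2 passes
```

A larger radix lowers k, but the count array grows with it, which is the trade-off behind the O(n + radix) space term.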

6. Pro Tips, Tricks, and Common Pitfalls

Pro Tips:
- Choosing the right radix can significantly impact performance. A larger radix can reduce the number of passes but may increase the space required for the counting array.
- Radix sort is efficient for fixed-length keys. If keys have highly variable lengths, padding with zeros might be necessary, but it can affect performance.
- Use a stable sorting algorithm (like counting sort) for each digit to ensure the correct order is preserved.
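
As a sketch of the first tip, the variant below uses radix 256, so each digit is one byte and can be extracted with a shift and a mask instead of division. The function name radix_sort_bytes is hypothetical, and the sketch assumes non-negative integers.

```python
def radix_sort_bytes(arr):
    """LSD radix sort for non-negative ints, one byte (radix 256) per pass."""
    out = list(arr)
    if not out:
        return out
    max_val = max(out)
    shift = 0
    while (max_val >> shift) > 0:
        # Count how often each byte value occurs at this position.
        count = [0] * 256
        for v in out:
            count[(v >> shift) & 0xFF] += 1
        # Exclusive prefix sums: count[b] becomes the first output index for byte b.
        total = 0
        for b in range(256):
            count[b], total = total, total + count[b]
        # A forward scan with start offsets keeps equal bytes in input order (stable).
        placed = [0] * len(out)
        for v in out:
            b = (v >> shift) & 0xFF
            placed[count[b]] = v
            count[b] += 1
        out = placed
        shift += 8
    return out


print(radix_sort_bytes([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

With this radix, 32-bit keys need at most four passes instead of ten, at the cost of a 256-entry count array per pass.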
Common Pitfalls:
- Negative Numbers: The standard implementation of radix sort typically works with non-negative integers. Handling negative numbers requires pre-processing (e.g., separating negative and positive numbers, sorting them independently, and then merging). Another approach is to shift the range of numbers.
- Inconsistent Radix: The radix used to extract digits must match the size of the count array and the step used to advance to the next digit. A mismatch leads to out-of-range indices or incorrect ordering.
- Memory Usage: Radix sort can consume significant memory, especially with a large radix and a large number of elements. Monitor memory usage carefully.
- Forgetting to use a stable sorting algorithm for the digit sorting step: This is essential for radix sort to work correctly.
- Integer Overflow: In fixed-width integer languages, the place value (exp) is multiplied by the radix after every pass and can overflow when the maximum key is close to the largest representable integer. Use a wider type for exp or stop after a known number of passes.
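
The stability pitfall is easy to reproduce. The sketch below is a variant of the counting sort from section 4 with a hypothetical reverse_scan flag; scanning the input forward while filling each digit's slots from the back makes every pass unstable, and the final result can be wrong.

```python
def counting_pass(arr, exp, radix=10, reverse_scan=True):
    """One counting-sort pass on the digit at place value exp."""
    count = [0] * radix
    for v in arr:
        count[(v // exp) % radix] += 1
    for d in range(1, radix):
        count[d] += count[d - 1]
    out = [0] * len(arr)
    # Slots are filled from the back of each digit's range, so the input
    # must be scanned in reverse to keep equal digits in their original order.
    order = reversed(arr) if reverse_scan else arr
    for v in order:
        d = (v // exp) % radix
        count[d] -= 1
        out[count[d]] = v
    return out


def radix_sort(arr, reverse_scan=True):
    arr = list(arr)
    largest = max(arr, default=0)
    exp = 1
    while largest // exp > 0:
        arr = counting_pass(arr, exp, reverse_scan=reverse_scan)
        exp *= 10
    return arr


print(radix_sort([12, 11]))                      # [11, 12]
print(radix_sort([12, 11], reverse_scan=False))  # [12, 11]  (wrong: the tens pass undid the ones pass)
```

In the unstable run, the ones pass correctly produces [11, 12], but the tens pass reverses the two keys because they share the same tens digit.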

7. Classic Problem Examples

Example: Sort an Array

Description: Given an array of integers nums, sort the array in ascending order and return it. You must solve the problem without using any built-in functions, in O(n log n) time complexity and with the smallest space complexity possible.

High-level Approach:

Determine Applicability: Radix sort can be used to solve this problem, especially if the constraints on the number range allow it to be more efficient than O(n log n) comparison sorts. However, since the problem explicitly mentions O(n log n) and small space, a radix sort solution might not be the intended solution by LeetCode.
Handle Negative Numbers (if necessary): Check if the input array contains negative numbers. If it does, you can either:
- Separate the negative and positive numbers, sort them independently, and then merge the results.
- Find the minimum value in the array and subtract it from every element, so that the smallest element becomes 0 and all elements are non-negative. After sorting, add the minimum back to each element. In fixed-width integer languages, do the subtraction in a wider type to avoid overflow.
Implement Radix Sort: Use the LSD radix sort algorithm as described above. Find the maximum number to determine the number of digits. Then, iterate through each digit place (from least significant to most significant) and use counting sort to sort the array based on that digit.
Consider the Constraints: Evaluate whether the range of numbers allows radix sort to be more efficient than comparison-based sorts like merge sort, which would also satisfy the constraints (O(n log n)). If the range is very large, more passes are needed and a comparison-based sort might be preferable.
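
The approach above can be sketched as follows, using the minimum-shift option for negative numbers and byte-sized digits. The function name sort_array is an assumption, and min and max are used for brevity; a strict reading of the problem statement may require replacing them with explicit loops.

```python
def sort_array(nums):
    """Sort signed integers: shift to non-negative, LSD radix sort, shift back."""
    if not nums:
        return []
    low = min(nums)
    shifted = [v - low for v in nums]  # smallest element becomes 0
    largest = max(shifted)
    shift = 0
    while (largest >> shift) > 0:
        # One bucket per byte value; appending preserves order, so the pass is stable.
        buckets = [[] for _ in range(256)]
        for v in shifted:
            buckets[(v >> shift) & 0xFF].append(v)
        shifted = [v for bucket in buckets for v in bucket]
        shift += 8
    return [v + low for v in shifted]


print(sort_array([5, 2, 3, 1]))           # [1, 2, 3, 5]
print(sort_array([5, 1, 1, 2, 0, 0]))     # [0, 0, 1, 1, 2, 5]
print(sort_array([-5, 3, 0, -50000, 7]))  # [-50000, -5, 0, 3, 7]
```

The number of passes depends on the spread between the smallest and largest value, not on the number of elements, which is the constraint to check before preferring this over merge sort.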