Edit Distance (Levenshtein)
Generated on 2025-07-10 02:01:45 Algorithm Cheatsheet for Technical Interviews
Edit Distance (Levenshtein) Cheatsheet
Section titled “Edit Distance (Levenshtein) Cheatsheet”1. Quick Overview
Section titled “1. Quick Overview”What it is: The Edit Distance (Levenshtein Distance) measures the similarity between two strings by calculating the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.
When to use it:
- Spell checking and correction
- DNA sequencing
- Information retrieval (fuzzy string matching)
- Data cleaning and deduplication
- String similarity scoring
2. Time & Space Complexity
Section titled “2. Time & Space Complexity”| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Dynamic Programming (Standard) | O(m * n) | O(m * n) |
| Dynamic Programming (Space Optimized) | O(m * n) | O(min(m, n)) |
Where:
mis the length of the first string.nis the length of the second string.
3. Implementation
Section titled “3. Implementation”Python:
def edit_distance(s1, s2): """ Calculates the Levenshtein distance between two strings.
Args: s1: The first string. s2: The second string.
Returns: The Levenshtein distance between s1 and s2. """ m, n = len(s1), len(s2) dp = [[0] * (n + 1) for _ in range(m + 1)]
# Initialize the first row and column for i in range(m + 1): dp[i][0] = i for j in range(n + 1): dp[0][j] = j
# Fill in the DP table for i in range(1, m + 1): for j in range(1, n + 1): if s1[i - 1] == s2[j - 1]: dp[i][j] = dp[i - 1][j - 1] # No cost if characters match else: dp[i][j] = 1 + min( dp[i - 1][j], # Deletion dp[i][j - 1], # Insertion dp[i - 1][j - 1] # Substitution )
return dp[m][n]
# Example usages1 = "kitten"s2 = "sitting"distance = edit_distance(s1, s2)print(f"Edit distance between '{s1}' and '{s2}': {distance}") # Output: 3Java:
class Solution { public int editDistance(String s1, String s2) { int m = s1.length(); int n = s2.length();
int[][] dp = new int[m + 1][n + 1];
// Initialize the first row and column for (int i = 0; i <= m; i++) { dp[i][0] = i; } for (int j = 0; j <= n; j++) { dp[0][j] = j; }
// Fill in the DP table for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { if (s1.charAt(i - 1) == s2.charAt(j - 1)) { dp[i][j] = dp[i - 1][j - 1]; } else { dp[i][j] = 1 + Math.min( dp[i - 1][j], // Deletion Math.min(dp[i][j - 1], // Insertion dp[i - 1][j - 1]) // Substitution ); } } }
return dp[m][n]; }
public static void main(String[] args) { Solution sol = new Solution(); String s1 = "kitten"; String s2 = "sitting"; int distance = sol.editDistance(s1, s2); System.out.println("Edit distance between '" + s1 + "' and '" + s2 + "': " + distance); // Output: 3 }}C++:
#include <iostream>#include <string>#include <vector>#include <algorithm>
using namespace std;
int editDistance(string s1, string s2) { int m = s1.length(); int n = s2.length();
vector<vector<int>> dp(m + 1, vector<int>(n + 1));
// Initialize the first row and column for (int i = 0; i <= m; i++) { dp[i][0] = i; } for (int j = 0; j <= n; j++) { dp[0][j] = j; }
// Fill in the DP table for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { if (s1[i - 1] == s2[j - 1]) { dp[i][j] = dp[i - 1][j - 1]; } else { dp[i][j] = 1 + min({ dp[i - 1][j], // Deletion dp[i][j - 1], // Insertion dp[i - 1][j - 1] // Substitution }); } } }
return dp[m][n];}
int main() { string s1 = "kitten"; string s2 = "sitting"; int distance = editDistance(s1, s2); cout << "Edit distance between '" << s1 << "' and '" << s2 << "': " << distance << endl; // Output: 3 return 0;}4. Common Patterns
Section titled “4. Common Patterns”-
Standard Dynamic Programming: Builds a 2D table
dp[i][j]wheredp[i][j]represents the edit distance betweens1[0...i-1]ands2[0...j-1]. -
Space Optimization: Since each row
ionly depends on the previous rowi-1, we can optimize space by using only two rows (or even one row) to store the necessary values. This reduces space complexity from O(m*n) to O(min(m, n)). -
Variations:
- Weighted Edit Distance: Different edit operations (insertion, deletion, substitution) have different costs.
- Damerau-Levenshtein Distance: Includes transposition (swapping adjacent characters) as an edit operation.
5. Pro Tips
Section titled “5. Pro Tips”-
Base Cases: Always handle the base cases correctly (empty strings).
dp[i][0] = ianddp[0][j] = j. -
Early Exit: If the strings are identical, the edit distance is 0. Check this before starting the DP process for a minor optimization.
-
Space Optimization Implementation: When using space optimization, be careful with the order of updates to avoid overwriting values needed in subsequent calculations. Use temporary variables if necessary.
-
Understand the DP Table: Visualizing the DP table and tracing the path to the final result can help debug and understand the algorithm.
-
Edge Cases: Consider cases with very long strings or strings with special characters.
6. Practice Problems
Section titled “6. Practice Problems”- LeetCode 72. Edit Distance: (Classic edit distance problem)
- LeetCode 583. Delete Operation for Two Strings: (Similar to edit distance, but only allows deletions)
- LeetCode 161. One Edit Distance: (Check if two strings are one edit away)
7. Templates
Section titled “7. Templates”These templates provide basic structure and can be easily adapted for specific problems.
C++ Template:
#include <iostream>#include <string>#include <vector>#include <algorithm>
using namespace std;
int solveEditDistance(string s1, string s2) { int m = s1.length(); int n = s2.length();
// Create DP table vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
// Initialize base cases
// Fill DP table for (int i = 1; i <= m; ++i) { for (int j = 1; j <= n; ++j) { // Implement logic here based on s1[i-1] and s2[j-1] } }
return dp[m][n]; // Return final result}
int main() { string s1, s2; cin >> s1 >> s2; cout << solveEditDistance(s1, s2) << endl; return 0;}Python Template:
def solve_edit_distance(s1, s2): m, n = len(s1), len(s2)
# Create DP table dp = [[0] * (n + 1) for _ in range(m + 1)]
# Initialize base cases
# Fill DP table for i in range(1, m + 1): for j in range(1, n + 1): # Implement logic here based on s1[i-1] and s2[j-1] pass
return dp[m][n] # Return final result
# Example usages1 = input()s2 = input()result = solve_edit_distance(s1, s2)print(result)Java Template:
class Solution { public int solveEditDistance(String s1, String s2) { int m = s1.length(); int n = s2.length();
// Create DP table int[][] dp = new int[m + 1][n + 1];
// Initialize base cases
// Fill DP table for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { // Implement logic here based on s1.charAt(i-1) and s2.charAt(j-1)
} }
return dp[m][n]; // Return final result }
public static void main(String[] args) { Solution sol = new Solution(); String s1 = "example"; String s2 = "sample"; int result = sol.solveEditDistance(s1, s2); System.out.println(result); }}