Edit Distance (Levenshtein): Dynamic Programming on Two Strings
Edit distance (also called Levenshtein distance) is the minimum number of single-character edits — insert, delete, or replace — needed to turn one string into another. Turning "horse" into "ros" takes 3 edits. It powers spell-checkers, fuzzy search, and diff, and it’s a beautiful 2D DP closely related to Longest Common Subsequence.
Defining the DP
Compare prefixes of the two strings. The subproblem is “how many edits to turn the first i characters of a into the first j characters of b?”
- State:
dp[i][j]= edit distance betweena[0..i-1]andb[0..j-1]. - Transition:
- if
a[i-1] == b[j-1]: no edit needed →dp[i][j] = dp[i-1][j-1]. - else take the cheapest of three operations, plus 1:
- replace:
dp[i-1][j-1] - delete from
a:dp[i-1][j] - insert into
a:dp[i][j-1]
- replace:
- if
- Base cases:
dp[i][0] = i(delete allichars),dp[0][j] = j(insert alljchars).
The answer is dp[m][n].
Tabulation: O(mn) time, O(mn) space
int editDistance(const std::string& a, const std::string& b) {
int m = a.size(), n = b.size();
std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
for (int i = 0; i <= m; i++) dp[i][0] = i; // base: delete all
for (int j = 0; j <= n; j++) dp[0][j] = j; // base: insert all
for (int i = 1; i <= m; i++)
for (int j = 1; j <= n; j++)
if (a[i - 1] == b[j - 1])
dp[i][j] = dp[i - 1][j - 1];
else
dp[i][j] = 1 + std::min({ dp[i - 1][j - 1], // replace
dp[i - 1][j], // delete
dp[i][j - 1] }); // insert
return dp[m][n];
}
int editDistance(String a, String b) {
int m = a.length(), n = b.length();
int[][] dp = new int[m + 1][n + 1];
for (int i = 0; i <= m; i++) dp[i][0] = i; // base: delete all
for (int j = 0; j <= n; j++) dp[0][j] = j; // base: insert all
for (int i = 1; i <= m; i++)
for (int j = 1; j <= n; j++)
if (a.charAt(i - 1) == b.charAt(j - 1))
dp[i][j] = dp[i - 1][j - 1];
else
dp[i][j] = 1 + Math.min(dp[i - 1][j - 1], // replace
Math.min(dp[i - 1][j], dp[i][j - 1])); // delete / insert
return dp[m][n];
}
function editDistance(a, b) {
const m = a.length, n = b.length;
const dp = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0));
for (let i = 0; i <= m; i++) dp[i][0] = i; // base: delete all
for (let j = 0; j <= n; j++) dp[0][j] = j; // base: insert all
for (let i = 1; i <= m; i++)
for (let j = 1; j <= n; j++)
if (a[i - 1] === b[j - 1])
dp[i][j] = dp[i - 1][j - 1];
else
dp[i][j] = 1 + Math.min(dp[i - 1][j - 1], // replace
dp[i - 1][j], // delete
dp[i][j - 1]); // insert
return dp[m][n];
}
def edit_distance(a, b):
m, n = len(a), len(b)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = i # base: delete all
for j in range(n + 1):
dp[0][j] = j # base: insert all
for i in range(1, m + 1):
for j in range(1, n + 1):
if a[i - 1] == b[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
dp[i][j] = 1 + min(dp[i - 1][j - 1], # replace
dp[i - 1][j], # delete
dp[i][j - 1]) # insert
return dp[m][n]
Reading the three moves
Each transition maps to a concrete edit on a to make it match b:
dp[i-1][j-1]— characters already match (free) or are replaced.dp[i-1][j]— deletea[i-1], then solve the rest.dp[i][j-1]— insertb[j-1], then solve the rest.
For beginners: The base cases carry the whole logic for empty prefixes. To turn
""into"abc"you must insert 3 characters, sodp[0][3] = 3. Forgetting to initialize that first row/column is the most common bug.
Space optimization
Each cell depends only on the current and previous row, so you can keep two rows (or even one, with care) to drop space to O(min(m, n)). Keep the full grid only if you need to reconstruct the actual sequence of edits.
int editDistance(const std::string& a, const std::string& b) {
int m = a.size(), n = b.size();
std::vector<int> prev(n + 1), cur(n + 1);
for (int j = 0; j <= n; j++) prev[j] = j;
for (int i = 1; i <= m; i++) {
cur[0] = i;
for (int j = 1; j <= n; j++)
cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1]
: 1 + std::min({ prev[j - 1], prev[j], cur[j - 1] });
std::swap(prev, cur);
}
return prev[n];
}
int editDistance(String a, String b) {
int m = a.length(), n = b.length();
int[] prev = new int[n + 1], cur = new int[n + 1];
for (int j = 0; j <= n; j++) prev[j] = j;
for (int i = 1; i <= m; i++) {
cur[0] = i;
for (int j = 1; j <= n; j++)
cur[j] = (a.charAt(i - 1) == b.charAt(j - 1)) ? prev[j - 1]
: 1 + Math.min(prev[j - 1], Math.min(prev[j], cur[j - 1]));
int[] tmp = prev; prev = cur; cur = tmp;
}
return prev[n];
}
function editDistance(a, b) {
const m = a.length, n = b.length;
let prev = new Array(n + 1), cur = new Array(n + 1);
for (let j = 0; j <= n; j++) prev[j] = j;
for (let i = 1; i <= m; i++) {
cur[0] = i;
for (let j = 1; j <= n; j++)
cur[j] = a[i - 1] === b[j - 1] ? prev[j - 1]
: 1 + Math.min(prev[j - 1], prev[j], cur[j - 1]);
[prev, cur] = [cur, prev];
}
return prev[n];
}
def edit_distance(a, b):
m, n = len(a), len(b)
prev = list(range(n + 1))
cur = [0] * (n + 1)
for i in range(1, m + 1):
cur[0] = i
for j in range(1, n + 1):
cur[j] = (prev[j - 1] if a[i - 1] == b[j - 1]
else 1 + min(prev[j - 1], prev[j], cur[j - 1]))
prev, cur = cur, prev
return prev[n]
Complexity
| Approach | Time | Space |
|---|---|---|
| 2D DP | O(mn) | O(mn) |
| Rolling rows | O(mn) | O(min(m, n)) |
Pitfall: When
a[i-1] == b[j-1], takedp[i-1][j-1]directly — do not add 1. Adding 1 on a match is the classic error that inflates every answer.
Edit distance and LCS are siblings; both generalize to the subsequence & string DP patterns. Next, see two-dimensional movement DP in DP on Grids.