Sweet Snippet 之 字符串编辑距离
source link: https://blog.csdn.net/tkokof1/article/details/100709721
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Sweet Snippet 之 字符串编辑距离
字符串编辑距离的简单实现
字符串编辑距离应该是动态规划中的代表问题了:
给定两个字符串 a a a 与 b b b,求解将 a a a 编辑至 b b b 的操作步数(距离),编辑包含以下两种操作:
- 删除某一字符
- 增加某一字符
(这里我们不允许变更某一字符,注意一下)
求解方法则是根据子问题的结果"递推"出原问题的结果:
设字符串 a a a 的长度为 m m m, 字符串 b b b 的长度为 n n n, 我们定义问题 C ( i , j ) C(i, j) C(i,j)
C ( i , j ) C(i, j) C(i,j) : a a a 的(前缀)子串(长度为 i i i) 与 b b b 的(前缀)子串(长度为 j j j) 的字符串编辑距离.
接着就是 C ( i , j ) C(i, j) C(i,j) 的递推公式了(这里就不做细节的讲述了,不熟悉的朋友可以参考进一步的资料)
C ( i , j ) = { i , i f j = 0 j , i f i = 0 C ( i − 1 , j − 1 ) , i f a [ i ] = b [ j ] m i n ( C ( i − 1 , j ) , C ( i , j − 1 ) ) + 1 , o t h e r w i s e C(i, j) = \left\{
下面简单列份实现(Lua):
-- get key from two index
function get_key(m, n)
return m .. "_" .. n
end
function edit_dist_iter(a, b, m, n)
local edit_dist_buffer = {}
edit_dist_buffer[get_key(0, 0)] = 0
for i = 1, m do
edit_dist_buffer[get_key(i, 0)] = i
end
for i = 1, n do
edit_dist_buffer[get_key(0, i)] = i
end
for i = 1, m do
for j = 1, n do
local ac = a:sub(i, i)
local bc = b:sub(j, j)
if ac == bc then
edit_dist_buffer[get_key(i, j)] = edit_dist_buffer[get_key(i - 1, j - 1)]
else
local d1 = edit_dist_buffer[get_key(i - 1, j)]
local d2 = edit_dist_buffer[get_key(i, j - 1)]
edit_dist_buffer[get_key(i, j)] = math.min(d1, d2) + 1
end
end
end
return edit_dist_buffer[get_key(m, n)]
end
function edit_dist(a, b)
return edit_dist_iter(a, b, #a, #b)
end
以上的代码使用了迭代形式,我们也可以用递归形式(来编写代码),只是递归会引起不少的重复计算,所以(工程)实现上,我们需要使用缓存来记录计算过的子问题结果(迭代版本也使用了缓存,作用上和递归版本其实也是一致的,记录的也是子问题的结果):
-- get key from two index
function get_key(m, n)
return m .. "_" .. n
end
function edit_dist_recur(a, b, m, n, buffer)
if m <= 0 then
-- result is trivial, do not need buffer
return n
elseif n <= 0 then
-- result is trivial, do not need buffer
return m
else
local ac = a:sub(m, m)
local bc = b:sub(n, n)
if ac == bc then
local d = buffer[get_key(m - 1, n - 1)]
if d then
buffer[get_key(m, n)] = d
return d
else
local d = edit_dist_recur(a, b, m - 1, n - 1, buffer)
buffer[get_key(m, n)] = d
return d
end
else
local d1 = buffer[get_key(m - 1, n)]
if not d1 then
d1 = edit_dist_recur(a, b, m - 1, n, buffer)
end
local d2 = buffer[get_key(m, n - 1)]
if not d2 then
d2 = edit_dist_recur(a, b, m, n - 1, buffer)
end
local d = math.min(d1, d2) + 1
buffer[get_key(m, n)] = d
return d
end
end
end
function edit_dist(a, b)
-- create buffer
local edit_dist_buffer = {}
return edit_dist_recur(a, b, #a, #b, edit_dist_buffer)
end
另外还看到一种基于编辑图(Edit Graph)的实现方式,不过思路上仍然和之前的讲述是一致的,实现上则会更复杂些,在此就不列代码了~
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK