Consider two sequences X[1..m] and Y[1..n]. The memoization algorithm computes the LCS in O(m*n) time. Is there a faster algorithm for finding the LCS? I guess memoization done diagonally could give us O(min(m,n)) time complexity.
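For reference, here is a minimal sketch of the O(m*n) baseline the question refers to, written bottom-up rather than with memoization. The function name `lcs_length_dp` is made up for illustration. It keeps only two rows of the table, so memory is O(min(m,n)), but the running time is still O(m*n).

```python
def lcs_length_dp(x, y):
    """Length of the longest common subsequence of x and y (classic DP)."""
    # Put the shorter sequence along the row so each row has min(m, n) + 1 cells.
    if len(y) > len(x):
        x, y = y, x
    m, n = len(x), len(y)

    prev = [0] * (n + 1)  # row i - 1 of the DP table
    for i in range(1, m + 1):
        curr = [0] * (n + 1)  # row i of the DP table
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Matching symbols extend the LCS of the two prefixes.
                curr[j] = prev[j - 1] + 1
            else:
                # Otherwise drop one symbol from either sequence.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]
```

Every cell of the m-by-n table is still visited once, which is where the O(m*n) time comes from.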
- Perhaps you mean Longest Common Substring? en.wikipedia.org/wiki/Longest_common_substring_problem – Tomer Vromen, Jun 9, 2010 at 6:19
- Nope. en.wikipedia.org/wiki/Longest_common_subsequence_problem – dan04, Jun 9, 2010 at 6:21
- Nope, it's subsequence all right. – tsudot, Jul 15, 2010 at 19:44
3 Answers
Gene Myers in 1986 came up with a very nice algorithm for this, described here: An O(ND) Difference Algorithm and Its Variations.
This algorithm takes O(ND) time, where N is the combined length of the two sequences and D is the size of the shortest edit script (insertions and deletions) between them, so it is much faster when the difference is small. It works by looping over all possible edit distances, starting from 0, until it finds a distance for which an edit script (in some ways the dual of an LCS) can be constructed. This means that you can "bail out early" if the difference grows above some threshold, which is sometimes convenient.
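A rough sketch of the greedy loop from the paper, computing only the distance and the LCS length rather than the edit script itself. The names `myers_edit_distance`, `lcs_length` and the `max_d` threshold parameter are my own, not from the paper, and the dictionary `v` stands in for the paper's array V indexed by diagonal k = x - y.

```python
def myers_edit_distance(a, b, max_d=None):
    """Size of the shortest insert/delete edit script turning a into b.

    Returns None if that size exceeds max_d (hypothetical early-exit threshold).
    """
    n, m = len(a), len(b)
    limit = n + m if max_d is None else min(max_d, n + m)
    # v[k] = furthest x reached so far on diagonal k (where k = x - y).
    v = {1: 0}
    for d in range(limit + 1):
        for k in range(-d, d + 1, 2):
            # Pick the better neighbouring diagonal from the previous round.
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: insert a symbol of b
            else:
                x = v[k - 1] + 1    # step right: delete a symbol of a
            y = x - k
            # Follow the "snake": matching symbols cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return None


def lcs_length(a, b):
    """LCS length from the identity D = len(a) + len(b) - 2 * LCS."""
    d = myers_edit_distance(a, b)
    return (len(a) + len(b) - d) // 2
```

For the example pair used in the paper, `lcs_length("ABCABBA", "CBABAC")` gives 4 (the edit distance is 5), and `myers_edit_distance("ABCABBA", "CBABAC", max_d=2)` returns `None`, which is the "bail out early" case.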
I believe this algorithm is still used in many diff implementations.
Yes, we could create a better algorithm than O(m*n), i.e. O(min(m,n)), to find the length: just compare the diagonal elements. Whenever an increment is done, suppose it occurred in c[2,2], then increment all the values from c[2,2++] and c[2++,2] by 1, and proceed till c[m,m] (suppose m