Sunday, February 18, 2018

[LintCode]Longest Common Substring


经典的DP问题,寻找最长公共子串的。请注意和公共最长子序列是有区别的。子序列可以不连续,而子串是必须要连续的。假设输入字符串为s1和s2,我们用DP[i][j]表示s1中在i结尾,s2中在j结尾的最长公共子串的长度。那么我们有递推公式:

  • DP[i][j] = DP[i - 1][j - 1] + 1, if s1[i] == s2[j]
  • else, DP[i][j] = 0
我们return DP中的最大值即可,假设输入串长度分别为m和n,时间空间复杂度均为O(m * n),代码如下:


class Solution {
public:
/**
* @param A: A string
* @param B: A string
* @return: the length of the longest common substring.
*/
int longestCommonSubstring(string &A, string &B) {
// write your code here
int lenA = A.size(), lenB = B.size();
//dp[i][j] represents length of LCS ended at i in A and j in B
//dp[i][j] = A[i] == B[j]? dp[i - 1][j - 1] + 1: 0;
//return max value in dp
vector<vector<int>> dp(lenA, vector<int>(lenB, 0));
int res = 0;
for(int i = 0; i < lenA; ++i)
{
for(int j = 0; j < lenB; ++j)
{
dp[i][j] = A[i] == B[j]? (i && j? dp[i - 1][j - 1] + 1: 1): 0;
res = max(res, dp[i][j]);
}
}
return res;
}
};



优化方面,我们注意到DP每次只会用到当前行和上一行的数据,所以我们开一个2 * n的滚动数组即可。空间复杂度可以降低为O(n)。注意在代码中我们用e来控制行的滚动,e每次在0,1,0,1...中切换,e为0的时候,上一行就在1;e为1的时候,上一行就在0。代码如下:


class Solution {
public:
/**
* @param A: A string
* @param B: A string
* @return: the length of the longest common substring.
*/
int longestCommonSubstring(string &A, string &B) {
// write your code here
int lenA = A.size(), lenB = B.size();
//dp[i][j] represents length of LCS ended at i in A and j in B
//dp[i][j] = A[i] == B[j]? dp[i - 1][j - 1] + 1: 0;
//return max value in dp
//rolling array optimization
vector<vector<int>> dp(2, vector<int>(lenB, 0));
int res = 0, e = 1;
for(int i = 0; i < lenA; ++i, e ^= 1)
{
for(int j = 0; j < lenB; ++j)
{
dp[e][j] = A[i] == B[j]? (j? dp[e ^ 1][j - 1] + 1: 1): 0;
res = max(res, dp[e][j]);
}
}
return res;
}
};

No comments:

Post a Comment