区间DP的问题,我们用dp[i][j]表示子串s[i:j]的最小压缩串,我们有如下递推方程:
- dp[i][j] = min(dp[i][k] + dp[k + 1][j]), where 1 <= k < j, '+' means concatenate here
- dp[i][j] = min(dp[i][j], compress(s[i:j])), compress is a function to find if we can shorten input string by find a repeating pattern in itself
这里compress的功能是,找到string自身的重复串来压缩string。比如对于abcabcabc,我们只进行区间dp是没有办法找到3[abc]的,所以我们必须对自身进行compress。对自身进行压缩的就是寻找自身最小覆盖子串的过程。有多个覆盖子串我们肯定是取最小的,对于d[str]的形式,如果我们double str的长度最多也就使得d减少一位,所以str越短越好。
时间复杂度O(n^3),寻找自身最小覆盖子串需要线性时间,代码如下:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
class Solution { | |
public: | |
string encode(string s) { | |
int len = s.size(); | |
//dp represents minimum encoded stirng for substring s[i:j] | |
//dp[i][j] = min(min(dp[i][k] + dp[k + 1][j]), compress(s[i:j])), where i <= k < j | |
//compress(s) represent compressed s by finding a repeating pattern in s itself | |
//e.g., for s = "abcabcabc" compress(s) = 3[abc] | |
//for s = "abcabcab", compress(s) = abcabcab since we can't find a repeating | |
//pattern which can reconstruct s | |
vector<vector<string>> dp(len, vector<string>(len)); | |
for(int j = 0; j < len; ++j) | |
{ | |
for(int i = j; i >= 0 ; --i) | |
{ | |
dp[i][j] = s.substr(i, j - i + 1); | |
if(j - i >= 4) | |
{ | |
int subLen = compress(dp[i][j]); | |
if(subLen) | |
{ | |
string compressed = to_string((j - i + 1) / subLen) + "[" + dp[i][i + subLen - 1] + "]"; | |
if(compressed.size() < dp[i][j].size())dp[i][j] = compressed; | |
} | |
for(int k = i; k < j; ++k) | |
{ | |
string str = dp[i][k] + dp[k + 1][j]; | |
if(str.size() < dp[i][j].size()) | |
dp[i][j] = str; | |
} | |
} | |
} | |
} | |
return dp[0][len - 1]; | |
} | |
private: | |
//if current string itself can find a repeating pattern | |
int compress(string str) | |
{ | |
//kmp to find minimum substring cover | |
//same as https://leetcode.com/problems/repeated-substring-pattern/description/ | |
int k = -1, i = 0, len = str.size(); | |
vector<int> next(len + 1, -1); | |
while(i < len) | |
{ | |
if(k == -1 || str[k] == str[i]) | |
next[++i] = ++k; | |
else k = next[k]; | |
} | |
int subLen = len - next[len]; | |
return len % subLen? 0: subLen; | |
} | |
}; |
No comments:
Post a Comment