Given a non-empty string, encode the string such that its encoded length is the shortest.
The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times.
Note:
k will be a positive integer and encoded string will not be empty or have extra space.
You may assume that the input string contains only lowercase English letters. The string's length is at most 160.
If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, returning any of them is fine.
Example 1:
Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.
Example 2:
Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.
Example 3:
Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".
Example 4:
Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".
Example 5:
Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can be "2[2[abbb]c]".
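To make the k[encoded_string] rule concrete, here is a minimal sketch of a decoder that expands an encoded string back to the original. It is not part of the problem or of the solution below; the class name BracketDecoder and its methods are hypothetical, and it assumes the input is well-formed (every digit run is followed by a matching bracket pair).

```java
class BracketDecoder {
    private final String src;
    private int pos = 0;

    private BracketDecoder(String src) {
        this.src = src;
    }

    /** Expands an encoded string such as "2[2[abbb]c]" back to its original form. */
    static String decode(String encoded) {
        return new BracketDecoder(encoded).parseSequence();
    }

    // Reads plain letters and k[...] groups until end of input or a closing ']'.
    private String parseSequence() {
        StringBuilder out = new StringBuilder();
        while (pos < src.length() && src.charAt(pos) != ']') {
            char c = src.charAt(pos);
            if (Character.isDigit(c)) {
                // Read the repeat count k, which may have several digits.
                int k = 0;
                while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
                    k = k * 10 + (src.charAt(pos) - '0');
                    pos++;
                }
                pos++; // skip '['
                String inner = parseSequence(); // nested groups are handled recursively
                pos++; // skip ']'
                for (int r = 0; r < k; r++) {
                    out.append(inner);
                }
            } else {
                out.append(c);
                pos++;
            }
        }
        return out.toString();
    }
}
```

Calling BracketDecoder.decode on any of the example outputs above should give back the corresponding input, which is a handy way to sanity-check a candidate encoding.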
DP:
Initially I thought of a 1D DP, where dp[i] stands for the shortest encoding of the first i characters:
dp[i] = minLen{dp[k] + encode(substring(k+1, i))}
Then I realized that the second part, encode(substring(k+1, i)), is itself an instance of the same DP problem. So the transition function becomes
dp[i] = minLen{dp[k] + dp(substring(k+1, i))}
So 1D is not enough. I introduce a second dimension that marks the end: dp[i][j] is the shortest encoded string for the substring from i to j.
But the hardest part of this problem is how to generate dp[i][j] from dp[i][k] and dp[k+1][j].
I thought about cases like:
dp[i][k] = 3[abc], dp[k+1][j] = 2[abc], then dp[i][j] = 5[abc]
dp[i][k] = 3[abc], dp[k+1][j] = xyz, then dp[i][j] = 3[abc]xyz
dp[i][k] = aabc, dp[k+1][j] = aabc, then dp[i][j] = 2[aabc]
I had no idea how to implement this conveniently, so I referred to the idea at https://discuss.leetcode.com/topic/71963/accepted-solution-in-java
The idea is to first concatenate dp[i][k] and dp[k+1][j] directly to construct dp[i][j], and then check whether the original substring s.substring(i, j+1) consists of a repeated pattern that could shorten dp[i][j] further.
The use of replaceAll is really clever: if removing every occurrence of a candidate pattern from the substring leaves an empty string, the substring is made up entirely of that pattern.
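The repeat check can be looked at in isolation. The following is a small sketch of it; the class RepeatCheck and the method isRepeatOf are hypothetical names used only for illustration. It assumes the string contains only lowercase letters, so the candidate pattern has no regex metacharacters when passed to replaceAll.

```java
class RepeatCheck {
    /** Returns true if s consists of whole copies of its prefix of length unitLen. */
    static boolean isRepeatOf(String s, int unitLen) {
        // Cheap filter: the unit length has to divide the total length.
        if (unitLen <= 0 || s.length() % unitLen != 0) {
            return false;
        }
        String unit = s.substring(0, unitLen);
        // replaceAll removes non-overlapping matches from left to right,
        // so nothing is left over only when s is unit repeated back to back.
        return s.replaceAll(unit, "").isEmpty();
    }
}
```

For example, isRepeatOf("abcabcabc", 3) holds because removing every "abc" leaves nothing, while "abcabcabd" leaves "abd" behind and is rejected.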
public class Solution {
    public String encode(String s) {
        if (s == null || s.length() == 0) return "";
        // dp[i][j] holds the shortest encoding of s.substring(i, j+1)
        String[][] dp = new String[s.length()][s.length()];

        // len is j - i, so the substring has len + 1 characters
        for (int len = 0; len < s.length(); len++) {
            for (int i = 0; i + len < s.length(); i++) {
                int j = i + len;
                String subStr = s.substring(i, j + 1);
                dp[i][j] = subStr; // initialize with the raw substring
                if (len < 4) continue; // fewer than 5 characters can never get shorter

                // try every split point and keep the shortest concatenation
                for (int k = i; k < j; k++) {
                    if (dp[i][k].length() + dp[k+1][j].length() < dp[i][j].length()) {
                        dp[i][j] = dp[i][k] + dp[k+1][j];
                    }
                }

                // check if subStr has repeat pattern
                for (int k = i; k < j; k++) {
                    String repeat = s.substring(i, k + 1);
                    if (subStr.length() % (k - i + 1) == 0 && subStr.replaceAll(repeat, "").length() == 0) {
                        String ss = subStr.length() / repeat.length() + "[" + dp[i][k] + "]";
                        if (ss.length() < dp[i][j].length())
                            dp[i][j] = ss;
                    }
                }
            }
        }
        return dp[0][s.length() - 1];
    }
}
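As a variation, the inner loop over candidate patterns can be replaced by a different repeat test: search for the substring inside its own doubling, starting at index 1. If it is found before index length, that index is the shortest repeating unit. This is a swapped-in technique, not the one from the referenced post, and the sketch below (with the hypothetical class name ShortestEncoder) tries only that shortest unit. It has been checked against the five examples above but not proven optimal in general.

```java
class ShortestEncoder {
    /** Returns an encoding of s; assumes s holds only lowercase letters. */
    static String encode(String s) {
        if (s == null || s.isEmpty()) return "";
        int n = s.length();
        // dp[i][j] is the shortest encoding found for s.substring(i, j + 1)
        String[][] dp = new String[n][n];
        for (int len = 1; len <= n; len++) {
            for (int i = 0; i + len <= n; i++) {
                int j = i + len - 1;
                String sub = s.substring(i, j + 1);
                dp[i][j] = sub;
                if (len < 5) continue; // too short for k[...] to save anything

                // Best concatenation of two already-solved halves.
                for (int k = i; k < j; k++) {
                    if (dp[i][k].length() + dp[k + 1][j].length() < dp[i][j].length()) {
                        dp[i][j] = dp[i][k] + dp[k + 1][j];
                    }
                }

                // sub reappears inside sub + sub before index len
                // exactly when sub is a shorter unit repeated.
                int period = (sub + sub).indexOf(sub, 1);
                if (period < len) {
                    String packed = (len / period) + "[" + dp[i][i + period - 1] + "]";
                    if (packed.length() < dp[i][j].length()) {
                        dp[i][j] = packed;
                    }
                }
            }
        }
        return dp[0][n - 1];
    }
}
```

Since several answers can share the same length, it is safer to compare result lengths than exact strings when checking this against the examples; for instance, ShortestEncoder.encode("aaaaaaaaaa") may come back as "10[a]", "a9[a]" or "9[a]a".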
Leetcode: Encode String with Shortest Length && Google interview question