1, Application scenario - string matching
2, Solution - violence matching
If we use the idea of violent matching and assume that str1 is now matched to position i and the substring str2 is matched to position j, then:
public static int violenceMatch(String source, String target) { int i = 0, j = 0; char[] s1 = source.toCharArray(); char[] s2 = target.toCharArray(); //If any index is out of bounds, it will end. Generally, if i is out of bounds, it may not be found. If j is out of bounds, it is found while (i < source.length() && j < target.length()) { count++; //The first character is matched successfully, and then the following character is matched if (s1[i] == s2[j]) { i++; j++; } else { //Failed to match. Start matching again from the next position i = i - j + 1; j = 0; } //j points to the end of the target string. The match is successful if (j == target.length()) return i - j; } return -1; }
3, Solution KMP algorithm
3.1 prefix and suffix
3.2 next array and partial matching value:
3.3 examples
If we have already matched here, we can see that the next character does not match.
If the violence method is used, the next matching position should be the last position of the first matching position:
Using KMP algorithm, it is calculated that the next backward shift is 4 bits:
(number of matched characters (6) - partial matching value of the last matched character (2))
IV. code demonstration of KMP
1. First, get the next array. This is an online version. I don't understand it. I achieved the same result with other ideas in the main program
//Get the online version of the next array. I can't understand it public static int[] Next1(String s) { int[] next = new int[s.length()]; next[0] = 0;//The first element is 0 anyway, which does not match itself //Index backward for (int i = 1, k = 0; i < s.length(); i++) { while (k > 0 && s.charAt(k) != s.charAt(i)) k = next[k - 1];// if (s.charAt(k) == s.charAt(i)) k++;//If a match is found, make the matching length +1 next[i] = k;//Assign k to the corresponding element of the next, that is, the matching length, regardless of whether it is matched each time } return next; }
2. main program
Note: the idea of next array:
1. i points to the first bit of the string, and j points to the last bit of the first bit of the string. J moves backward continuously until i and j move backward at the same time when they encounter the same character as the first character. At this time, the matching length is recorded in the next array at the same time.
2. Once a different character is encountered, i returns to the first place of the string, and j always moves back. Clear the matching length variable and wait for the next matching.
3. If the matching is successful again, repeat the operation similar to 1 (i and j move back at the same time). Until J points to the end of the string.
public static Integer count=0;//Match count, compare the two methods public static int KMPMatch(String source,String target){ //1. Get the next array, which is the key of kmp algorithm, so that we can decide where to start matching next time int[] next = new int[target.length()]; int matchedLength = 0, i = 0, j = 1; next[0] = 0;//The first element is 0 anyway, which does not match itself //i points to the first character and j moves back. while (j < target.length()) { //If the characters indicated by i and j match successfully, they will be moved back at the same time, and the length of the matched characters will be recorded at the same time if (target.charAt(i) == target.charAt(j)) { next[j] = ++matchedLength; i++; //Otherwise, i will be moved to the first place and the matching length will be cleared } else { i = 0; matchedLength = 0; } //j always traverse backwards j++; } //2. Real matching process int k = 0, l = 0;//k points to the source string, l points to the destination string int matched=0; char[] s1 = source.toCharArray(); char[] s2 = target.toCharArray(); //If any index is out of bounds, it will end. Generally, if i is out of bounds, it may not be found. If j is out of bounds, it is found while (k < source.length() && l < target.length()) { count++; //The first character is matched successfully, and then the following character is matched if (s1[k] == s2[l]) { k++; l++; matched++; } else { //The matching is unsuccessful. The matching starts again from the back position, which has been optimized by kmp algorithm if(matched>=1) k=k-l+matched-next[matched-1];//If the character length is not 0, select the position to start the comparison next time according to the algorithm else k=k-l+1;//If the character length is 0, it will return to the next bit of the original position like the violence method l = 0; matched=0; } //l points to the end of the target string, and the match is successful if (l == target.length()) return k - l; } return -1; }
5, Final comparison
@Test public void testViolenceMatch() { String source = "BBC ABCDAB ABCDABCDABDE"; String target = "ABCDABD"; System.out.println(solution.violenceMatch(source, target)); System.out.println("Matching times:"+solution.count); } 15 Matching times: 36 @Test public void testKMPMatch() { String source = "BBC ABCDAB ABCDABCDABDE"; String target = "ABCDABD"; System.out.println(solution.KMPMatch(source,target)); System.out.println("Matching times:"+solution.count); } 15 Matching times: 29