Hi, in this video, you will find and learn the Knuth-Morris-Pratt algorithm that allows to find all the occurrences of a pattern in the text in the time linear in terms of the length of the pattern and length of the text. So instead of the product of those lengths as in the brute force algorithm, we will need just some of those lengths to find all the occurrences. The algorithm goes as following. First we create a new long string S which consists of the pattern, the text and between them we insert a special character called dollar. Which is basically. Any character that is absent from both pattern and text. It cannot be specifically the character dollar. It is just a placeholder for some character that is absent from both pattern and the text. After we've assembled this longstring S, we need to compute it's prefix function. And after we've computed its prefix function, we need to look in the positions in the string s, which are inside the text part of it. So we'll look at all positions i such that i is more than length of the pattern. So after the pattern and after the dollar, and if the prefix function for that position is equal to the length of the pattern, then we know that there is an occurrence of the pattern in text ending in that position. For example, here we have a prefix function value of four. And we have an occurrence of the pattern ending in the corresponding position. We need to find all the positions where the pattern starts in the text. So from the position where it ends we need to compute the position where it starts, and to do that we need to subtract length of the pattern -1, but that would be the position in S. And to compute the position in the text, we also need to subtract the length of the pattern and one for the dollar. So in total, we need to subtract two length of the pattern from the position in the string S to find the starting position of the pattern in the initial text. And there is another place where prefix function of string big S is equal to the length of the pattern. And again, there is an occurrence of the pattern ending in that position. And why does algorithm even works? First, we need to notice that the prefix function for this string big S is always less than or equal to the length of the pattern. Because of the dollar sign it occurs right after the end of the pattern so when the border is bigger, we would need to have another occurrence of dollar in the string big S. But dollar is only between pattern on the text and is absent from the text. So prefix function cannot be bigger in under the life of the pattern. If we look at the position i, which is to the right from the dollar and the prefix function is equal to the length of the pattern. And that means that the pattern is a border of the corresponding prefix of S. And so it ends in position i. And we only need to determine the position in which it starts in the text T. And to do that, we need to do a few computations. And we will see that this position is i minus two length of the pattern. However, if the prefix function in some position i is strictly less than the length of the pattern, then it means that the pattern doesn't end in that position in the string S. And that means that it doesn't end in the corresponding position in the text. And that means that we've found all the positions in which the pattern ends in the text. And so we've of course also found all the positions where it starts in the texts by subtracting two lines of pattern from each side position. So the codes for this algorithm is already pretty simple. We take as input pattern P and text T, we assemble string S by pattern with special symbol dollar and with the text, then we compute the prefix function of this long string S. We initialize the resulting list of positions where pattern occurs in the text. And we go through all the positions, i and s, which are to the right from the dollar sign. And if, at some position, we see that the value of the prefix function is the same as the length of the pattern, we just append i minus two length of the pattern to the result. And we return this resulting list of positions in the end. This algorithm is already pretty simple, and it works in time proportional to sum of the length of the pattern and the text. To prove that, we know that string S can be built in the time proportional to sum of the length of strings P and T. Computing prefix function is done in the proportional time. And the four loop runs through part of the string. So it also runs in time proportional to sum of the length of the pattern and the text. In conclusion, you now know the Knuth-Morris-Pratt algorithm for exact pattern matching. You can find all occurrences of the same pattern, in all of the text in time linear in terms of the length of the pattern and length of the text. You can also compute prefix function of any string in linear time. And you can go through all the borders of any string in the order of decreasing length using prefix function. And in the next lessons, we will learn how to build suffix array and suffix tree in time which will allow you to find many different patterns in the same text even faster than if you use algorithms like Knuth-Morris-Pratt's.