Now we'll see how our understanding of basic properties of random lots on the non negative integers unlocks the analysis of Papadimitriou's Algorithm, showing that randomized local search yields a polynomial time algorithm for the two set problem. Let me remind you the details of Papadimitriou's two sat algorithm. Consider an instance of two sets. Suppose it has n boolean variables. The algorithm has two for loops. The outer for loop, it's only responsibility is to run log n independent random trials. In each iteration of this outer for loop we're just going to do randomized local search. So we begin with an initial random assignment. We flip a fair coin independently for each of the n variables to determine whether it starts out true or starts out false. Then we get a budget of 2n squared local moves in an attempt to transform this initial assignment into one that's satisfying. In each of these 2n squared inner iterations, we first check if the current assignment is satisfying. Obviously if it is, we're done. So suppose the current assignment is not a satisfying assignment. Then there's at least one, possibly many unsatisfied clauses. Of them we pick one, arbitrarily. It's a two set instance so this clause involves two variables and if we flip the value of either of those two variables we're going to make amends with this clause. It will flip from unsatisfied to satisfied. There's a tricky question of how do you pick which of the two variables you flip, but we take the easy way out. We just pick one of the two uniformly at random. If, at the end of the day, after these 2 n squared times log n local moves, Papadimitriou's algorithm has failed to find a satisfying assignment, it's somewhat arrogantly declares that therefore one must not exist. What's clear about this algorithm is that it runs in polynomial time, after all, we've explicitly bounded the number of steps it's permitted to take. Secondly, it's clear that if there is no satisfying assignment, then the algorithm will naturally fail to find one and correctly report that it's unsatisfiable. What's not at all obvious is the probability that Papadimitriou's algorithm will find a satisfying assignment in instances where at least one such assignment exists. That analysis is the subject of this video. And again what's remarkable and surprising about this theorem is that if you think about the most natural measure of progress for Papadimitriou's algorithm, namely the number of clauses which are currently satisfied. That number can actually go down when you make a local step. By flipping the assignment to one variable, yeah you fixed the one clause but you might screw up a bunch of others leaving you with an assignment that seems worse in the sense that even fewer clauses are satisfied then before. Nonetheless if you define the right measure of progress you can argue, by a random walk analogy, that you are making progress over time resulting eventually with a satisfying assignment. Lets see the details. Each iteration of the outer for loop represents a fresh opportunity to discover a satisfying assignment. All of these iterations of the outer for loop are identical, they just use fresh random coins. So lets just analyze for now the success probability of a particular outer iteration. We're assuming that this instance is satisfiable. So there's at least one, perhaps many satisfying assignments. Pick one, an arbitrary one, I don't care which, call it a*. Let's denote by a sub t the assignment that Papadimitriou's algorithm is considering after t iterations of the inner for loop have completed. So a sub 0, that's the random initial assignment with which we begin this iteration of the outer for loop. And then after we've done t local flips, that results in the assignment a sub t. This is of course, a random variable. It's a function of the random choices made by Papadimitriou's algorithm. Now the great idea in this analysis is to use the right progress measure. Rather than counting the number of clauses that the current assignment satisfies, we're going to count the number of variable assignments that it got right. That is the number of variable assignments with which it agrees with this reference satisfying assignment a*. Evidently, X sub t is an integer. It's 0 if a assigns every variable the opposite of what a* does, and it's equal to n if a and a* are exactly the same assignment. Of course, if we ever get to the point where X sub t equals n, and a* is the same as a t, then a t is itself a satisfying assignment and we're good to go. The algorithm halts correctly. Now we come to the key point in the analysis, now we're going to understand the sense in which the algorithm on average makes progress. So suppose we are currently looking at the assignment a sub t, and suppose it's not a satisfying assignment, then there's one or more clauses that are unsatisfied. The algorithm picks one of those. Let's say it's a clause involving the variables x sub i and x sub j. So for example if it looks at the clause not x sub 3 or x sub 7 then it's involving the variables x sub 3 and x sub 7. So here's an easy observation. Our current assignment a sub t is failing to satisfy this clause. The reference satisfying assignment a* naturally is satisfying this clause. Therefore, a* must give a different value to one of the two variables xi or xj, than our assignment a sub t does. It is also possible, of course, that a* makes different assignments to both of the variables xi and xj. All we know is that a* satisfies the clause, whereas our assignment a sub t does not. And the amazing part is just this simple observation is enough to imply that we're going to make progress on average. Here's exactly what I mean. So Papadimitriou's algorithm is going to choose either xi or xj, that choice is uniform at random. And it's going to flip the value currently assigned to that variable. Now the situation at which this is guaranteed to be good for us, is where we gave the opposite assignment to both xi and xj, relative to the assignment a*. If both of our value assignments of these variables was the opposite of a*, whichever one we flip we are going to share one more variable in agreement with a* than we did before. That is, in our previous notation, x sub t + 1, the number of variables in which we agree next time step will be guaranteed to be one more than it is right now. The situation is more subtle when the current assignment a sub t agrees with the reference satisfying assignment a* on exactly one of the two variables xi and xj. So for example, maybe the current assignment agrees with a* at xi, but it disagrees with the reference assignment a* on xj. Now of course, the algorithm has no idea what a* is, this arbitrary satisfying assignment, so it has no idea which of the two variables xi or xj it should flip. It just picked one of the two at random. Now if we get lucky and we flip the value of the variable on which a sub t and a* disagree then this is just like case one. The flip causes the two assignments to agree on one more variable than they did previously. As a consequence, x sub t + 1 is going to be one more than x sub t. In the unlucky case we wind up flipping the variable on which we agreed already with the reference assignment a*, then once we flip it we're going to disagree on yet another variable with the reference assignment a*. So in that case, x sub t + 1 is actually going to be 1 less than x sub t. And now at this point, I hope things are starting to look a little familiar relative to our previous video about random walks on the non-negative integers. There we again had a random variable, namely the position of the walk, and at each time step it either went up by one or down by one, with 50-50 probability each. So the situation here where the x sub t's is clearly not identical to random walks on non-negative integers. So the next quiz asks you to identify the differences between the random process governing the x sub t's and your vanilla random walk on the non-negative integers as studied in the previous video. So the answer is, D. If we think about two random processes the first one is your straight forward random walk on the non-negative integers as discussed in the last video, the second random process is the evolution of these random variables the x sub t's. The number of variables on which the current assignment in Papadimitriue's algorithm agrees with the reference satisfying assignment a*. There are exactly three differences. One we already pointed out. In the vanilla random walk on the line unless you're at position zero, if you're at any bigger position it's always 50-50 left or right. In Papadimitriou's algorithm, you might sometimes have a 0% probability of moving to the left. A 100% chance of moving to the right. That's when the algorithm zooms in on an unsatisfied clause in which the current assignment a sub t happens to differ with the reference assignment a* on both of the two variables. Then it doesn't matter which of the variables you pick, you're guaranteed to agree with the reference assignment a* on one more variable than before. The second difference is that when we studied random walks by definition we always started at position zero. that would correspond to this first random variable Xo = 0. However, in general Xo will not be equal to 0. In general it will be strictly bigger than 0. Why is that? Well remember, Xo is defined as the number of variables that have the same value in the reference assignment a* and the initial random assignment ao. So the only way that Xo is going to be zero is if you randomly chose an assignment where every variable had the opposite value as what it is in a*. And that's very unlikely to happen. Generally you start with a random assignment, you're going to get lucky and some of the values will be correct will be the same as in a*. So Xo will generally start out bigger than zero. Here's the third and final difference, in the random walks that we studied, the process by definition stopped exactly the first time you reach position n. Now in Papadimitriou's algorithm, what is true is that if you ever get to the point that x sub t equals n, then at that point the algorithm will terminate. Why? Well if x sub t equals n, that means that in the assignment a sub t every single variable has exactly the same value as in the reference assignment a*. Now a* is a satisfying assignment and Papadimitriou's algorithm halts as soon as it finds a satisfying assignment. So if x sub t equals n you found a* and you're good to go, you stop. However, the converse is not true. Papadimitrious's Algorithm might halt even though the current value of x sub t is less than n. Why is that? Well remember x sub t measures your distance from just this one satisfying assignment a* but nobody said that was the only one. There might be other satisfying assignments out there in the world as well. And Papadimitriou's algorithm might well stumble upon one of them before it finds a*. So it halts the first time it finds a satisfying assignment. And that can happen before x sub t equals n. So where does this leave us? On the one hand, there does seem to be a strong metaphor between the distance between the assignment in Papadimitriou's algorithm, and this reference assignment a*, and the position of a random walk on the non-negative integers. On the other hand, there's kind of a lot of differences. Some of them are pretty subtle. So what you've gotta be wondering is, was that whole analysis last video useful? Well what's amazing and what allows us to transfer the random walk analysis over to Papadimitriou's algorithm is that every single one of these three discrepancies between the two random processes only helps us. It only means the algorithm's going to terminate even faster than we might have suspected from the random walk analysis. So here's another way to think about it. Imagine I force you to run Papadimitriou's algorithm under an extremely unfortunate set of circumstances. So first I give you a two set instance with just one, satisfying assignment. So there's a unique satisfying assignment, a*, you're looking for a needle in the hay stack, there's no way you can get lucky by halting early with a different satisfying assignment. Secondly I force you to begin from the worst possible initial assignment where every variable's value initially is the wrong one, is flipped from what it should be in a*. And finally imagine I further contrived things. So that whenever you pick an unsatisfied clause to do a variable flip, I force you to pick a clause where one of the variables is set the same as a a* and the other variable is set the opposite of a*. Well, for this kind of execution of Papadimitriou's Algorithm, the correspondence between the evolution of the random variables, the x sub t's and a random walk on the non-negative integers is perfect. The two random processes are identical under this worst case set of circumstances. Therefore, in the worst case circumstances, the probability that Papadimitriou's algorithm fails to find a satisfying assignment, that is, a*, within 2n squared steps is exactly the same as the probability that a random walk beginning at zero fails to reach position n within 2n squared steps. And we analyzed that failure probability in the previous video. We proved its at most one-half. That is the success probability is at least one-half. Now if you look at any other circumstances other than the worse case circumstances, Papadimitriou's algorithm is only more likely to succeed. So first of all it doesn't start as far away from a satisfying assignment. Second of all, there's other satisfying assignments it might be able to find. And thirdly, sometimes it's guaranteed to make progress, as opposed to having only a 50-50 chance. So the success probability under any circumstances of Papadimitriou's algorithm is at least 50% in a given iteration of this outer for loop. And now we're home free. The probability that a given iteration of the outer for loop fails, is at most one-half. The probability that each of the log based 2n iterations of the outer for loop fail, is then at most one-half, raised to the log base 2 of N, also know as 1 over n. So the overall success probability of the algorithm, is at least 1 minus 1 over n as claimed.