0:00

Let's complete the proof of the master method. Let me remind you about the story

so far, the first thing we did is we analyzed the work done by a recursive

algorithm using a recursion tree. So we zoomed in on a given level J, we

identified the total amount of work done at level J and then we summed up over all

of the levels, resulting in this rather intimidating expression, star: C times N to the D, times a sum over the levels J, from zero to log base B of N, of the quantity A over B to the D, raised to the J. Having derived this expression star, we then spent some time interpreting it, attaching to it some semantics. And we realized that the role of this ratio, A over B to the D, is to distinguish between three fundamentally

different types of recursion trees. Those in which A = B to the D and the amount of

work is the same at every level. Those in which A is less than B to the D and

therefore the amount of work is going down with the level. And those where A is

bigger than B to the D in which case the amount of work is growing with the level.
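These three regimes are easy to see numerically. Here's a small illustrative Python sketch (the parameter triples are examples of my own choosing, not from the lecture) that tabulates the level-J work, C times N to the D times the ratio raised to the J:

```python
# Illustrative sketch: per-level work c * n^d * (a / b**d)**j for three
# example recurrences, showing constant / decreasing / increasing work.
import math

def level_work(a, b, d, n, c=1):
    """Work done at each level j = 0 .. log_b(n) of the recursion tree."""
    levels = round(math.log(n, b))
    return [c * n**d * (a / b**d)**j for j in range(levels + 1)]

n = 1024
same = level_work(a=2, b=2, d=1, n=n)   # a = b^d: equal work per level
less = level_work(a=2, b=2, d=2, n=n)   # a < b^d: work shrinks per level
more = level_work(a=4, b=2, d=1, n=n)   # a > b^d: work grows per level
```

Printing the three lists shows a flat sequence, a geometrically decreasing one, and a geometrically increasing one, matching the three recursion-tree types.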

This gave us intuition about the three cases of the master method, and even gave us predictions for the running times we might see. So what remains to do is turn this hopeful intuition into a rigorous proof. So we need to verify that, in fact, the simplest possible scenarios outlined in the previous video actually occur. In

addition, we need to demystify the third case and understand what the case three expression has to do with the number of leaves of the recursion tree. Let's begin with the

simplest case, which is case one. Recall that in case one, we're assuming that A equals B to the D. This is the case where we have a perfect equilibrium between the forces of good and evil, where the rate of subproblem proliferation exactly cancels out with the rate at which we do less work per subproblem. And now,

examining the expression, star, we can see how easy our lives get when A equals B to

the D. In that case, this ratio is equal to one. So naturally this ratio raised to

the J is also equal to one for all J. And then of course this sum evaluates to

something very simple: namely, one summed with itself log base B of N plus one times. So the sum simply equals log base B of N plus one, and that's going to get multiplied by this C N to the D term, which is independent of the sum. So, summarizing, when A equals B to the D, we find that star equals C N to the D times log base B of N plus one. Writing this in big O notation, we would write big O of N to the D times log N. And again, I'm going to suppress the base of the logarithms. Since all

logarithms differ only by a constant factor we don't have to specify the base.
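As a quick numerical check of case one, here's an illustrative sketch (merge sort's parameters a=2, b=2, d=1 are used purely as an example) showing that the sum collapses to log base B of N plus one:

```python
# Illustrative check of case one: when a = b**d, each ratio term is 1,
# so the sum over levels j = 0 .. log_b(n) is exactly log_b(n) + 1.
import math

def star(a, b, d, n, c=1):
    """The expression (star): c * n^d * sum_j (a / b**d)**j."""
    levels = round(math.log(n, b))
    return c * n**d * sum((a / b**d)**j for j in range(levels + 1))

a, b, d, n, c = 2, 2, 1, 1024, 1            # example with a = b**d
expected = c * n**d * (math.log(n, b) + 1)  # c * n^d * (log_b n + 1)
```

Here star evaluates to 1024 times 11, agreeing with the c·n^d·(log_b n + 1) form derived above.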

The base is just suppressed by the constant hidden in the big O notation. So that's it

for case one. Like I said, this is the easy case. So what do we do when A is not

equal to B to the D? And remember A could be either less than or bigger than B to

the D. To answer that question, let's take a short detour into geometric series. For this single-slide detour, we're going to think about a single constant number R. What you want to think about is R representing that ratio A over B to the D from the previous slide. But for this slide only, let's just call it R. This is a

constant. It's bigger than zero, and it's not equal to one. Now, suppose we sum up

powers of R, stopping, let's say, at the K-th power of R. I claim that this sum has a nice closed-form formula. Specifically, it is exactly R to the K plus one, minus one, divided by R minus one. Now, whenever you see a general formula like

this, it's useful to keep in mind a couple of canonical values of the parameters that

you can plug in to develop intuition. And for this expression, you might wanna think

canonically about the cases R=2 and R=1/2. So when R=2, we're summing powers of two: one plus two plus four plus eight plus sixteen, and so on. When R is a half, we have one, plus a half, plus a quarter, plus an eighth, and so on. Now, I'm not gonna prove this for you; I'd like you to prove this yourself, if you don't already know this fact. The way to prove it is simply by induction, and I will leave that as an exercise. What I wanna focus on instead is what this fact can do for us. The way that we use this fact is to formalize the idea that in recursion trees where the

amount of work is increasing in the levels, the leaves dominate the overall

running time, and in recursion trees where the amount of work is decreasing with the level, the root dominates the running time, in the sense that we can ignore all of the other levels of the recursion tree. So, on this slide, we have two upshots. First of all, for the case when R is less than one: in this case, the expression on the right-hand side, R to the K plus one minus one over R minus one, can be upper bounded by one over one minus R. So again, remember, you might want to have a canonical value of R in mind here, namely one half. So what we're claiming here is that the right-hand side is no more than two for the case of

R=1/2. And that's easy to see if you think about one plus one-half plus one-fourth plus one-eighth, and so on. That sum converges to two as K grows large.
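Both the closed-form formula and the two upshots can be sanity-checked numerically. This is an illustrative sketch only, with example values of R and K:

```python
# Illustrative check of the geometric series facts:
#   1 + r + r^2 + ... + r^k  ==  (r**(k+1) - 1) / (r - 1)   for r != 1,
# bounded by the constant 1/(1 - r) when r < 1, and by a constant
# times r**k when r > 1.
def geometric_sum(r, k):
    """Sum of powers r^0 + r^1 + ... + r^k, term by term."""
    return sum(r**j for j in range(k + 1))

def closed_form(r, k):
    """The claimed closed form for the same sum."""
    return (r**(k + 1) - 1) / (r - 1)
```

For instance, geometric_sum(2, 7) is 1 + 2 + ... + 128 = 255, exactly the example discussed below, and geometric_sum(0.5, k) never exceeds 2 no matter how large K gets.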

So in general, for R a constant less than one, the sum is at most one over one minus R. Now, we're not actually gonna care about the particulars of this formula, one over one minus R. The point for us is just that this is a constant. And by constant, I mean independent of K, independent of how many terms we sum up. Obviously, it depends on R, the ratio, but it does not depend on how many things we sum up, on K. So the way to think about this is: when we sum up a bunch of terms where R is less than one, the very first term dominates. The first term is just the one. And no matter how many terms we sum up, we never grow bigger than some constant. A

similar situation holds for the case where r is a constant bigger than one. When r is

bigger than one, a tiny bit of algebra shows that we can upper bound the right-hand side by R to the K times something which is constant, independent of K. So

again, let's interpret the second upshot in terms of a canonical value of R.

Namely, R equals two. Then our sum is one plus two plus four plus eight plus

sixteen, and so on. And what this is saying is that no matter how many terms we

sum up, the overall sum is never gonna be more than twice the largest and final term. So if we sum up to, say, 128, the sum, you'll notice, will be 255, which is at most twice that largest term, 128. And the same is true for any K: the entire sum is no more than twice the largest term. In this sense, the largest term in the series dominates the whole thing. So, to summarize this slide in one sentence: when we sum up powers of a constant R, if R is bigger than one, the largest power of that constant dominates the sum; if R is smaller than one, the sum is just a constant. Let's now apply this to prove case two of the master method. In case two of the master method, we assume that A is less than B to the D.

That is, the rate at which sub problems are proliferating is drowned out by the

rate at which we do less work per sub problem. So this is the case where the

amount of work is decreasing with each level of the recursion tree. And our

intuition said that, well, in the simplest possible scenario, we might hope that all

of the work, up to a constant factor, is being done at the root. So let's make that

intuition precise by using the basic sums fact from the previous slide. So, since A is less than B to the D, this ratio is less than one. So let's call this ratio R. So R, you'll notice, does depend on the three

parameters, A, B and D. But R is a constant, it does not depend on N. So what

is this sum? We're just summing up powers of this constant R,

where R is less than one. What did we just learn? We just learned that any such sum

is bounded above by a constant, independent of the number of terms that

you sum up. So, therefore, what does this expression star evaluate to? It evaluates to C, which is a constant, times N to the D, times another constant. So, suppressing the product of these two constants in big O notation, we can say that the expression star is upper bounded by O(N to the D). And this makes precise our intuition that

indeed the overall running time of the algorithm, in this type of recursion tree

with decreasing work per level, is dominated by the root. The overall amount

of work is only a constant factor larger than the work done merely at level

zero of the tree. Let's move on to the final and most challenging part of the

proof, the final case. In case three we assume that A is bigger than B to the D.

So in conceptual terms, we're assuming the rate at which sub problems proliferate is

exceeding the rate at which we do less work per sub problem. So these are

recursion trees where the amount of work is increasing with each level, with the

most work being done at the leaves. And once again, using the basic sums fact, we

can make precise the hope that, in fact, we only have to worry about the leaves. We

can throw away the rest of the work, losing only a constant factor. To see that, we'll again denote this ratio between A and B to the D as R, and in this case R is bigger than one. So this sum is a sum of a bunch of powers of R, where R is bigger than one. What did we just learn about that two slides ago, in the basic sums fact? We learned that such sums are dominated by the largest and last term of the sum. Okay, so we can bound it by a constant factor times the largest term. Therefore, we can simplify the expression star to the following. I'm

gonna write it in terms of big O notation. And, like, on the last slide, I'll use it

to suppress two different constants. On the one hand, I'm gonna be suppressing the

constant C, which we inherited way back when from the original recurrence. And on

the other hand, I'm gonna use it to also suppress this constant that comes from the

basic sums fact. So ignoring those two constants, what do we have left? We have N

to the D. Times the largest term of the sum. So what is the largest term of the

sum? Well, it's the last one so we plug in the biggest value of J that we're ever

going to see. So what's the biggest value of J that we're ever going to see? Well, it's just log base B of N. So we get the ratio A over B to the D, raised to the log base B of N power. Now, don't despair at how messy this looks. We can do

some remarkable simplifications. So what I want to do next is I want to focus just on

this one over B to the D, raised to the log base B of N, term. You can write that as B to the minus D log base B of N, which, if I factor this exponent into two successive parts, I can write as B raised to the log base B of N power, and only then raised to the minus D. And now, of course, what happens is that taking the logarithm of N base B, followed by raising B to that power, those are inverse operations that cancel, so that leaves us just with the N. So this results in an N to the minus D. And now, remarkably, this N to the minus D is just gonna cancel out with this N to the D, leaving us with merely A raised to the log base B of N. And thus, out of this crazy sea of letters, rises a formula we can

actually understand. So A to the log base B of N, if we step back and think for a minute, is actually a very natural quantity, describing something about the recursion tree that we already knew was supposed to pop up in the analysis. I'll let you think through exactly what that is in the following quiz. So, the correct answer to this quiz is the fourth response: A raised to the log base B of N

is precisely the number of leaves of the recursion tree. And remember in our

intuition for case three, recursion trees where the amount of work is increasing per

level, we thought that perhaps the work would be dominated by the work done at the

leaves, which is proportional to the number of leaves. So why is this the answer? Well, just remember what recursion trees look like. At level zero, we have a single node, and then with each level we have A times as many nodes as before.

That is, with each level of the recursion tree, the number of nodes goes up by a

factor of A. How long does this process go on? Well, it goes on until we reach the leaves. Recall that the input size starts at N up at the root, it gets divided by a factor of B each time, and the recursion terminates once we get down to one. So the leaves reside at precisely level log base B of N. Therefore, the number of leaves is just the branching factor A raised to the number of times that we actually multiply by A, which is just the number of levels, log base B of N. So each time we go down a level, we

increase the number of nodes by a factor of A and we go down a level log base B of

N times. Leaving us with a number of leaves equal to A raised to the log base B

of N. So what we've done is we've mathematically confirmed, in a very cool

way, our intuition about what case three should look like in the master method.
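The leaf count can also be confirmed by literally walking down the tree. Here's an illustrative sketch, using a = 3, b = 2 (Karatsuba-style parameters) purely as an example:

```python
# Illustrative count of recursion-tree leaves: starting from input size n,
# each node spawns `a` children of size n/b, until the size reaches 1.
import math

def count_leaves(n, a, b):
    """Number of leaves of the recursion tree on an input of size n."""
    if n <= 1:
        return 1
    return a * count_leaves(n // b, a, b)

a, b, n = 3, 2, 1024                   # example parameters, Karatsuba-like
leaves = count_leaves(n, a, b)         # multiplies by a once per level
formula = a ** round(math.log(n, b))   # a^(log_b n)
```

With n = 1024 there are log base 2 of 1024, i.e. ten, levels below the root, so the tree has 3 to the 10th, or 59049, leaves, exactly matching the a^(log_b n) formula.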

We've proven that in case three, when A is bigger than B to the D, the running time is O of the number of leaves in the recursion tree, just as the intuition

predicted. But, this leaves us with one final mystery. If you go back to the

statement of the master method, we didn't say A to the log base B of N. In case three, it says the running time is N to the log base B of A. And not only that,

we've used this case three formula over and over again, to evaluate Gauss's

recursive algorithm for integer multiplication, to evaluate Strassen's

matrix multiplication algorithm, and so on. So, what's the story? How come we're

not getting the same thing, as in the statement of the master method? Well

there's a very simple explanation, which is simply that, believe it or not, A to the log base B of N and N to the log base B of A are exactly the same thing. This looks like the kind of mistake you'd make in freshman algebra. But actually, if you think about it, these are simply the same quantity. If you don't believe me, just take the logarithm base B of both sides, and you'll get the same thing on both sides. Now, you might well be wondering why I didn't just state in the master

method that the running time of case three is this very sensible and meaningful

expression, A raised to the log base B of N, i.e., the number of leaves in the recursion tree. Well, it turns out that, while the expression on the left-hand side is the more meaningful one conceptually, the right-hand side, N to the log base B of A, is the easier one to apply. So recall, when we worked through a bunch of examples of the master method, this right-hand side was super convenient when we evaluated the running times of algorithms, when we plugged in the values of A and B. In any case, whether you want to think about the running time in case three as proportional to the number of leaves in the tree, or as proportional to N to the log base B of A, we're done. We've proved it. That's case three.
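If you'd like numerical reassurance for that identity, here's an illustrative check; the function names and the parameter triples are example choices of my own:

```python
# Illustrative check that a^(log_b n) == n^(log_b a): taking log base b
# of both sides gives log_b(n) * log_b(a) either way.
import math

def leaves_form(a, b, n):
    return a ** math.log(n, b)   # a^(log_b n): the number of leaves

def master_form(a, b, n):
    return n ** math.log(a, b)   # n^(log_b a): the master method's form
```

Evaluating both forms on, say, a = 3, b = 2, n = 1000 gives the same number up to floating-point rounding.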

That was the last one. So we're done with the master method. QED. Now, that was a lot of hard work proving the master method, and I would never expect someone to be able to regurgitate all of the details of this proof at, you know, something like a cocktail party (well, maybe except the nerdiest of all cocktail parties). But I do think there are a few high-level conceptual points of this proof that are worth remembering in the long term. So, we started by just writing down a recursion tree for the recursive algorithm, in a generic way, and, going level by level, we counted up the work done by the algorithm. This part of the proof had nothing to do with how A and B related to each other. Then we recognized that there

are three fundamentally different types of recursion trees. Those with the same

amount of work per level, those where it increases with the level, and those where

it decreases with the level. If you can remember that, you can even remember what

the running times of the three cases should be. In the case where you do the same amount of work at each level, we know there's a logarithmic number of levels, and we know we do N to the D work at the root. So that gives us the running time in case one of N to the D log N. When the amount of work is decreasing with the levels, we now know that the root dominates: up to a constant, we can throw out the rest of the levels, and we know N to the D work gets done at the root,

so that's the overall running time. And in the third case, where it's increasing in

the levels, the leaves dominate. The number of leaves is A raised to the log base B of N, and that's the same as N to the log base B of A. And that's proportional to the running time in case three of the master method.
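Finally, the whole statement can be collapsed into a small illustrative helper for recurrences of the form T(n) = a·T(n/b) + O(n^d). The function name and interface are my own, not from the lecture:

```python
# Illustrative summary of the master method for T(n) = a*T(n/b) + O(n^d):
# compare a against b**d to pick the case and the running-time bound.
import math

def master_method(a, b, d):
    """Return (case number, big-O bound as a string)."""
    if a == b**d:
        return 1, f"O(n^{d} log n)"                # equal work per level
    elif a < b**d:
        return 2, f"O(n^{d})"                      # root dominates
    else:
        return 3, f"O(n^{math.log(a, b):.3f})"     # leaves dominate
```

For example, merge sort (a=2, b=2, d=1) lands in case one, while the naive recursive integer multiplication (a=4, b=2, d=1) lands in case three with exponent log base 2 of 4, which is 2.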