0:16

As a warm up, I want to talk first about binary search,

which is everybody's first divide-and-conquer algorithm.

And that's where we have a sorted array, and

we look to see if a particular value is in the array by looking at the middle.

If the value we're looking for is less than the item at the middle, go left.

If it's greater, go right.

And in either case, we divide the size of the array about in two.

That's the code for binary search, and you can find it in the algorithms book or

on the algorithms book site, and its behavior is described by that recurrence.

The number of compares in the worst case to find an item satisfies that recurrence.

1:03

The size of the subarray that you look in is going to be the floor of N over 2; that's where you take N over 2, and it's the biggest integer less than or equal to that.

If it's an odd size, then that's an exact equation; for an even size, one side goes down by one. So just check for yourself that that is the number of compares in the worst case.
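To make that concrete, here's a sketch in Python (the counting convention is an assumption of this sketch: one three-way compare per probe, with initial value B(1) = 1), checking the instrumented worst case against the recurrence B(N) = B(floor(N/2)) + 1:

```python
import math

def search(a, key):
    """Binary search in a sorted list; returns (found, number of compares)."""
    lo, hi, compares = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        compares += 1               # one three-way compare against a[mid]
        if key < a[mid]:
            hi = mid - 1            # go left
        elif key > a[mid]:
            lo = mid + 1            # go right
        else:
            return True, compares
    return False, compares

# B(N) = B(floor(N/2)) + 1, with B(1) = 1 (our counting convention).
B = {1: 1}
for n in range(2, 200):
    B[n] = B[n // 2] + 1

for n in range(1, 200):
    a = list(range(0, 2 * n, 2))    # n sorted even keys: 0, 2, 4, ...
    worst = max(search(a, k)[1] for k in range(-1, 2 * n))  # hits and misses
    assert worst == B[n] == math.floor(math.log2(n)) + 1
```

The final assertion is the point: the measured worst case, the recurrence, and the closed form floor(lg N) + 1 all agree.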

And so now we have this mathematical model and that's what we want to study.

Now usually we are trying to get an approximate solution, or just an idea of about how many compares are taken.

And the easy case for binary search is if the file size is exactly a power of two.

Then when it divides in half, the part of the array you are looking at is also a power of two.

So it becomes a recurrence that telescopes.

So we will take a sub n = B sub 2 to the n.

If we start with a power of 2, then we just have a simple recurrence that telescopes; that's just substituting 2 to the n into the recurrence for B.

And then that recurrence telescopes to a sub n equals n: you pick up a 1 each time, n times. And that means that,

3:51

Maybe it's a little bit easier to think about counting the number of bits in

the binary representation of N than it is thinking

about the binary search algorithm.

And so that's an easy model.

And then it's pretty straightforward to prove that number: the number of bits in the binary representation of n is the floor of the log base 2 of n, plus 1.

And it's worthwhile to check that math for yourself; this table and these formulas are here for you, to be sure that you understand how that might be the case.

It's actually a pretty simple calculation.

It's just that the integer part of log base two of n changes at the powers of two.

And if you ignore the rest of it, then you have the floor of log base two of n, and then we're adding one to that.

This math and that table are for you to check, and to prove to yourself that this is true.
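A quick numerical check of that formula (a Python sketch; `bin(n)` gives the binary representation with a '0b' prefix):

```python
import math

# The number of bits in the binary representation of n is floor(lg n) + 1.
for n in range(1, 1024):
    bits = len(bin(n)) - 2          # bin(6) == '0b110', so 3 bits
    assert bits == math.floor(math.log2(n)) + 1
    assert bits == n.bit_length()   # Python's built-in agrees
```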

We had an algorithm, binary search, and we had an idea, the number of bits in the binary representations of the numbers, and these two are matched up through the recurrence.

5:20

Divide the array into two halves.

Sort the two halves and then merge to put them back together.

For simplicity, we'll assume that the merge implementation always uses

N compares, and then if you do that, you get this recurrence for

the number of compares for the sort.

And this kind of recurrence is more complicated than the ones that we've looked at, because of the appearance of the floor and ceiling functions on the right-hand side.

They're not immediately easy to deal with, and they result in some interesting effects if we're trying to come up with a formula for the answer: what is C sub N?

We can go ahead and do our computation, but let's first talk about the easy case, which we already did.

It's the same as for binary search.

That is if N is a power of 2 then the floors and the ceilings go away.

6:30

For C sub 2 to the n, we get this recurrence here.

And that's one of the first ones we solved with the telescoping sum: the summation factor is 2 to the n, so divide by 2 to the n, and then we get the result n times 2 to the n. In terms of the original,

it means that C sub N equals N lg N when N is a power of 2.
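Here's that telescoping solution checked numerically (a sketch; C(1) = 0 is assumed as the initial value):

```python
# At powers of two the recurrence is C(N) = 2*C(N/2) + N, with C(1) = 0,
# and the telescoping sum gives C(N) = N lg N exactly.
C = {1: 0}
for k in range(1, 21):
    N = 2 ** k
    C[N] = 2 * C[N // 2] + N
    assert C[N] == N * k            # N lg N, since lg N == k here
```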

But what if N is not a power of 2?

So that's the next question to address.

So there's a natural question that arises.

So for Quicksort, we did the number of compares and got a very precise

analysis that we could check against the performance of the algorithm and it was

7:58

accurately approximate the performance of mergesort.

It's a prototype of what can happen in many computer algorithms that have this same kind of characteristic.

So how do I know that? Well, here is a demonstration of what goes on. And this is really why I wanted to spend the time at the beginning to motivate you to go ahead and do plots.

This is simply using the same structure that we used for Fibonacci and Quicksort to go ahead and compute values of the mergesort recurrence.

And then the second for loop is just going through and scaling; it's trying to determine if there's this alpha value, and so just divide by

9:51
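The computation just described can be sketched like this (assumptions of the sketch: merge always costs N compares and C(1) = 0; here we subtract N lg N and divide by N to expose the oscillating term, rather than plotting it):

```python
import math

# First loop: compute the mergesort recurrence with floors and ceilings,
#   C(N) = C(ceil(N/2)) + C(floor(N/2)) + N, with C(1) = 0.
C = [0, 0]                          # C[0] unused
for n in range(2, 1025):
    C.append(C[(n + 1) // 2] + C[n // 2] + n)

# Second loop: scale, to look at (C(N) - N lg N) / N.
osc = [(C[n] - n * math.log2(n)) / n for n in range(2, 1025)]

# It is zero at powers of two but does not converge in between:
assert abs(osc[2 - 2]) < 1e-9 and abs(osc[512 - 2]) < 1e-9
assert max(osc) > 0.05              # the oscillation persists
```

Plotting `osc` against N shows the sawtooth-like wiggle between successive powers of two.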

This is a fundamental example; it is a relatively simple example, and it comes up over and over again.

And really, this kind of oscillation is something that is intrinsic to the analysis, because of the idea that we need to go down to the discrete.

And that means that we get stuck in this kind of oscillation all the time.

But I do want to talk about solving the mergesort recurrence specifically, in the general case, because it's so important.

And there is a little trick involved to simplify it a little bit.

That again I'm not going to completely motivate.

But you'll see what I mean.

So if we write down the same formula for n plus 1.

Then we are going to have to do some math with floors and ceilings.

So floor of n plus 1 over 2 is the ceiling of n over 2.

And ceiling of n plus 1 over 2 is the floor of n over 2, plus 1.

That's just, you can do a little math to convince yourself of that.

So that's what this table does.

And then once we have that, then we can subtract those two formulas.

And that gives us something that telescopes; so, a little magic trickery based on the relationships between floor and ceiling with the plus ones, and that gives us a simpler formula to work with.

So we'll just define that difference to be D sub N, and that's going to then be the equation for D sub N.

But that equation is a familiar one: that's the binary search equation.

It has a different initial value, so the solution is D sub N = floor of lg N + 2.

And then telescoping that one: that's C sub N+1 minus C sub N equals that, so C sub N+1 equals C sub N plus that, and it immediately telescopes to give this sum.
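Both steps can be verified directly (a sketch, assuming C(1) = 0; `k.bit_length()` counts the bits of k, so floor(lg k) is `k.bit_length() - 1`):

```python
# C(N) = C(ceil(N/2)) + C(floor(N/2)) + N, with C(1) = 0.
C = [0, 0]
for n in range(2, 2001):
    C.append(C[(n + 1) // 2] + C[n // 2] + n)

# D(N) = C(N+1) - C(N) = floor(lg N) + 2: the binary search equation,
# with a different initial value.
for n in range(1, 2000):
    assert C[n + 1] - C[n] == (n.bit_length() - 1) + 2

# Telescoping: C(N) = (N - 1) + total bits in the binary
# representations of 1, 2, ..., N-1.
total_bits = 0
for n in range(2, 2001):
    total_bits += (n - 1).bit_length()
    assert C[n] == (n - 1) + total_bits
```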

And so, now, what's interesting about that sum is that this thing here is the number of bits in the binary representation of k.

So this is a proof that C sub N equals N minus 1 plus the number of bits in the binary representations of all the numbers less than N.

And that's an interesting fact, and here's a combinatorial way to prove the same thing.

So if S sub N is the number of bits in the binary representations of all the numbers less than N, then what we can do is: over on the left are all the numbers less than 15.

And what we do is carve off the rightmost bit.

If you carve off the rightmost bit, then taking every other number, you get S of N over 2; it's just counting one, two, three, up to N over 2.

And then the alternate numbers give S of the ceiling of N over 2, and then the rightmost bits contribute N minus 1.

So that's a combinatorial proof that the number of bits in the binary

representation of all the numbers less than N satisfies this recurrence,

that's the same recurrence that merge sort satisfies.
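The carving argument can be checked numerically too (a sketch; S(N) here is the total number of bits in the binary representations of 1 through N−1):

```python
def S(n):
    """Total bits in the binary representations of 1, 2, ..., n-1."""
    return sum(k.bit_length() for k in range(1, n))

# Carving off the rightmost bits: every other number contributes S(floor(N/2)),
# the alternates contribute S(ceil(N/2)), and the carved-off bits total N - 1.
for n in range(2, 400):
    assert S(n) == S(n // 2) + S((n + 1) // 2) + (n - 1)
```

Note the cost term is N − 1 rather than the N in the mergesort recurrence, which is where the "N minus 1 plus the bits" form comes from.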

So that's a proof: the number of compares taken by mergesort is N minus 1 plus the number of bits in the binary representations of all the numbers less than N.

And when you think about it, that number of bits in the binary representation

of all the numbers less than n, that's going to have some kind of oscillation.

16:15

If you substitute this formula into this solution for the number of compares by mergesort, you can split it off into three functions, and that's again just the algebra, from this, for the coefficient of N.

In the two functions, the coefficient of N involves the fractional part of lg N: one piece is linear in it, and the other is 2 to a power involving it. They look kind of anti-symmetric, but actually they don't cancel out.

What's left is that little oscillating function.

And if you multiply that by N, you get the exact function that we plotted before.

So you can check this math.
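One compact way to write the exact solution, consistent with the recurrence and the bit-counting sum above (this closed form is an assumption of the sketch, verified numerically below; `(n-1).bit_length()` computes the ceiling of lg n for n ≥ 2):

```python
# Closed form to check: C(N) = N*ceil(lg N) - 2^ceil(lg N) + N.
# For N not a power of two, ceil(lg N) = lg N + (1 - fractional part of lg N),
# which is where the oscillating linear term comes from; it vanishes
# at powers of two, leaving exactly N lg N there.
C = [0, 0]
for n in range(2, 5000):
    C.append(C[(n + 1) // 2] + C[n // 2] + n)

for n in range(2, 5000):
    ceil_lg = (n - 1).bit_length()      # ceil(lg n) for n >= 2
    assert C[n] == n * ceil_lg - 2 ** ceil_lg + n
```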

This is also described in the book, but it's just an indication of the kind of challenges that we're going to face when trying to analyze algorithms.

It looks like things are going fine, but we might be stuck with oscillation, where things are just complicated, because we're working, on the one hand, with wanting to work with familiar functions like lg N, and on the other hand needing to work with functions that are forced to be integer-valued, like the floor of lg N, the largest integer less than or equal to lg N.