0:04

That fast algorithm for sorting that we're going to look at is called

Mergesort and it's a very simple method.

We take the array that we need to sort,

and we divide it into two halves,

and then what we do is recursively,

we resort each of the half.

So, now we have two sorted sub arrays,

half the size of the original one.

And then, we merge the two halves together to make a sorted whole, that's Mergesort.

It was invented actually by one of the pioneers of computing,

John von Neumann, we'll talk about him later on in the course.

The early focus when computers were first being built,

was on numerical calculations but what John von Neumann did was he invented Mergesort,

really as a test to see how the computer would work on other tasks,

and Mergesort definitely has stood the test of time.

So, let's look in more detail at how Mergesort works.

First, what we need to do is implement the merge.

Now, it turns out that merging is a little more complicated than you might think.

One thing that is hard to do is to merge an array in place.

If we have our two sort of sub arrays to do it within the same array,

is a little tricky, is very tricky in fact.

So, instead what we're going to do is take an auxiliary array,

an extra array to put the results into.

And then, when we've got the merge done then we'll copy it back.

So, this is a flaw of Mergesort,

there's other methods that don't have this characteristic that you need an extra array,

and you can learn about those in an algorithm course.

So, we're going to keep an array called aux,

that's an extra array,

and then we're going to write this program

merge that behaves as if it did the merge in place.

So that is, we take our array,

we take low, mid, and high,

and it's going to say low,

mid, again using our notation,

doesn't include mid, and then mid high and location doesn't include I.

And then, we're gonna put the result into aux and then merge it

back and then copy it back as if the merge happened in place.

So, that sets up our indices,

all we do is maintain one pointer I that points to the low part of the array,

another pointer J that points to the high part of the array.

We compute the number of things that we're going to have to do,

which is just hi minus lo,

and then k is an index into aux the result.

So, if we get to a point where the lo part is exhausted,

we copy over, this is an edge case that we'll talk about later.

We just copy over but,

the main part of this is to check the element at J against the element

at I and then move the smaller one into aux and increment that pointer.

So, if J is less then we move the one at j,

if I less we move the one at I,

and then we copy it back into a.

So, let's look at how this works.

So, I starts out pointing to Alice and J starts out pointing to Bob.

So, Alice is smaller so we move that one over,

and now Bob smaller so we move that one over.

Each time the aux of k gets the next element that belongs there,

but all we're doing each time is one comparison picking

the smaller from the two sub arrays.

So now Eve is smaller,

so that's going to go, and Frank is smaller,

so that's going to go and the one that we move,

that's the index that we increment.

It's quite a simple process,

and you look at this code,

you'll understand how it works.

And then eventually, one of the arrays exhausts

and that's what the two first conditions in the if statement are for,

it's just to move everything from the other one over.

So, that's merge. And then,

we copy it back so the effect is as if we did it in place,

but we had that auxiliary array.

Now, given that merge,

then Mergesort is a recursive program

that uses that abstract in place merge to get the job done.

And so, there's the merge method and aux is

a global array that everybody can refer to and we do that so that within our sort method,

we allocate that array just once,

and then call a recursive method that takes indices just like a binary search did.

And here's the recursive method that sorts the sub array of an array,

a, between index lo and hi,

again not including a of hi using our standard notation.

So, we compute N,

if it's less than one element we return otherwise,

we compute our midpoint,

sort the half for the smallest indices,

sort the half with the larger indices, and then merge.

And so, all of that setup for the merge is to make this code so transparent.

Divide the array in half,

sort the two halves,

merge the two halves to make a sorted whole, that's Mergesort.

And we'll use the same test client as for insertion,

which is just read strings of standard input and then sort them.

Again, this is going to sort for our example and we'll do a trace in just a minute.

That's a Mergesort, pretty straightforward implementation

although I have to make sure that you get the auxiliary array part right.

If you put that within the recursive routine,

it would be too slow.

Okay. So, let's look at how it works dynamically.

So, what do we have to do to sort that array?

We have to first sort the first half,

what do we have to do that one?

We have to first sort the first half of that one.

To sort that one, we have to sort Wendy and Alice.

And finally, we get down to an array of size one that can be considered sorted.

So, then we have to consider the other size,

array of size one and now those are considered sorted.

And they are sorted they're just one,

now we merge those two sorted sub files together to

get Alice and Wendy in the proper order.

Now, we do the second half of the array of size four at the beginning,

and then we have Dave and Walter and they're already in order.

Now, we have two sub arrays of size two that we can

merge together to get the first four elements in sorted order.

Now we're finished with the first half of the array of size eight at the beginning,

and we can work on the second half in the same way.

File size two, second file size to merge

those together to get two sub files of size four,

and now we can merge those together to get the first eight elements sorted.

Now we do the same thing for the second half of the array,

and you can follow through the dynamics of Mergesort.

Now we have the one of size four sorted,

we do the second one of size four same way.

Now merge together those two fours to get an eight,

and now we have the two eights and we can merge those together to

get the entire array sorted, that's Mergesort.

So, let's do the analysis.

So, to make it even simpler,

we'll count just the number of times a string moves from one array to the other,

the comparison is a little more complicated to analyze but this is quite simple.

And again, we'll do the analysis when N is the power of two this

time and that's now if big N equals two to the little n,

little n is exactly log base to big N.

And we start out with an array of size two to the little n,

and we split it in half,

we have two arrays the size two to the little n minus one.

Divide two them by two you get two to the little n minus one,

split all those in half you get four sub arrays,

the size two the n minus two.

Now, if you look down at the bottom,

it does the calculation of how many data moves do you do.

Well, the merge for the beginning you

do a merge of two things of size N over two into the auxiliary array so,

that's n data moves and then you move them back so,

it's two n data moves.

And that's true for each sub array but if we do it for two sub arrays,

then it's going to be two n data moves total for both of those sub arrays.

In going all that way every time,

the total number of data moves for all arrays of that size is going to be two n. So,

that means the total number of data moves is 2N natural log N because we

have N phases that's counted by the sub array size little n phases,

which is log of big N so,

the total number is 2 and natural log of big N. That's a quick way to see

that the running time of Mergesort is proportional to n log N. And again,

this basic argument works even when N is not a power of two,

and you can learn more detail about that in a course on algorithms.

That's our mathematical model and again,

we want to do empirical tests to check the mathematical model,

again using the same model that we use for insertion sort.

And now, it takes just a second to do a million,

two seconds to do two million,

and then in 16 million, 20 seconds,

so you could do a billion in not too much time in a few minutes,

20 minutes to do sort of billion items which is

definitely fast enough for Alice and Bob type applications.

And it confirms the hypothesis that

the order growth is either linear and log N,all right,

and our math model said n log N and this confirms that.

So, that's a fast sorting method.

Now sorting is so important that people have studied other methods and there are

methods that are constant factors faster than this,

but certainly by comparison with insertion sort,

this method solves our problem.

It scales and that's what we need,

it is a scalable method so that as our company grows,

we can grow with it along with Moore's Law.

And so, Alice is okay,

I'm ready to go now,

I've got my whiteless generator.