The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

来自 斯坦福大学 的课程

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

402 评分

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

从本节课中

Week 3

Huffman codes; introduction to dynamic programming.

- Tim RoughgardenProfessor

Computer Science

So now that we've done some work with our thought experiment, understanding exactly

what the optimal solution, the maximum rate independent set had to look like in

the path graph, let's turn that work into a linear time algorithm.

Let me quickly remind you, the bottom line from the previous video.

So we argued two things. First of all, if the max weight

independent set of a path graph happens to exclude the rightmost vertex, then it

has to in fact be a max weight independent set with the smaller graph g

prime, obtained from g by plucking off the rightmost vertex.

If on the other hand a max weight independent set does include the

rightmost vertex v sub n, then if you take out v sub n and look at the

remaining part of the independent set, that has to be max weight among the

smaller graph g double prime, defined by plucking the two rightmost vertices off

of the original graph g. Ergo, if we happen to know, if a little

birdie told us which of these two cases we were in we could recursively compute

the optimal solution of either G prime or G double prime as appropriate and be

done. If we recurse on G prime we just return

the result. If we recurse on G double prime we

augment it by V sub N and return the result.

Now, there is no little birdie, we don't know which of the two cases, we're in.

So we're concluded the previous video by proposing just trying both case.

So let's write down what that proposed algorithm would look like before we take

a step back and critique it. So, the good news about this algorithm is

it is correct. It's guaranteed to return the maximum

weight independence set. So, how would you prove this formally?

Well, it would be by induction, in exactly the same way we proceeded with

divide and conquer algorithms. And for the same reason, I'm not going to

talk about the details here. If you're interested, I'll leave it as an

optional exercise to write this out, formally.

But intuitively, the inductive hypothesis guarantees correctness of our recursive

calls. So computing the maximum weight solution

in either G prime or G double prime, and then, in the previous video, the

whole point of that, in effect, was arguing the correctness of our inductive

step, given the correct solution to the sub-problem, we argued what has to be the

optimal way to extend it to a solution, to the original graph, G.

The bad news, on the other hand, is that this algorithm takes exponential time.

It's essentially no better than brute force search.

So while the correctness of this kind of recursive algorithm, follows the template

of divide and conquer pretty much exactly.

the running time is blown up to exponential.

And the reason for that difference is, in our divide an conquer algorithms, think

of Merge Sort as a canonical example, we made a ton of progress before we

recursed. Right?

We threw out half of the array, 50% of the stuff before we bothered with

any recursion. How much progress are we making in this

algorithm? Well, very little.

It's positive progress, but very small. We throw out either one or two vertices

out of maybe this graph with say a million vertices before recursing.

So we're branching by a factor two and making very little progress before each

branch. That's why we give this exponential

running time rather than something more in the neighborhood of n log n.

So this brings us to the following question.

This is an important question. I want you to think about it carefully

before you respond. So we have this exponential time

algorithm, it makes an exponential number of recursive calls.

Each recursive call is handed some graph, for which it's responsible for computing

the maximum-weight independence set. The question is this. Over all of these

exponentially many different sub-problems, each of which is passed

some graph as a sub-problem, how many distinct,

How many fundamentally different sub problems are ever considered across these

exponentially many recursive calls? So the answer to this question, and the

key to unlock the crazy efficiency of our dynamic programming implementation, is B.

So despite the fact that there's an exponential number of recursive calls, we

only ever solve a linear number of distinct sub-problems.

In fact, we can explicitly say what are the different sub-problems it could solve

throughout the recursion. What happens before you recurse?

Well you pluck vertices from the graph you were given off from the right.

Maybe you pluck off one vertex that's in the first recursive call, or in the

second recursive call you pluck off two vertices, but both from the right.

So an invariant maintains throughout the recursion is that the sub-problem you're

handed was obtained by successive plucking off of vertices from the right

part of the graph, from the end of the, end of the graph.

So, however you got to where you are, whatever the sequence of recursive calls

led to where you are now, you are guaranteed to be handed a prefix of the

graph. The graph induced by the first I

vertices, where I is some number between zero and n.

So therefore there's only a linear number of sub-problems you ever have to worry

about, the prefixes of the original input graph.

From this, we conclude that the exponential running time of the previous

algorithm arises solely from the spectacular redundancy of solving exactly

the same sub-problem from scratch, over and over and over and over again.

So this observation offers up the promise of a linear time implementation of this

algorithm. After all, there's no need to solve a

sub-problem more than once. Once you've solved it once you know the

answer. So an obvious way to speed up this

algorithm, to speed it up dramatically is to simply cache the results of a

sub-problem the first time you see it. Then you can look it up in some array,

constant time, at any point later on in the algorithm.

There is a word for this, I won't really use it in this class, but just so you

that know what it is, it's called memoization.

So in case this is a little vague, what I mean is you would have some array, some

global array, indexed one to N or maybe zero to N with the semantics that the Ith

entry of this array, is the value of the solution of the Ith sub-problem.

Now when you do a recursive call and you're handed the first I vertices of the

graph, and again remember that we know that the sub-problem has to look like the

first I vertices of the graph for sub I. You check the array, if it's already been

filled in, if you already know the answer, great.

You just return it and count the time, you don't bother to resolve.

If this is the first time you've ever seen.

this sub problem then you recurse and you solve it,

as as we saw, as we suggested in the previous slot.

So with this simple memoization fixed, this action, this algorithm is linear

time. The easiest way to see that, and actually

in fact a better implementation, is to go away from this recursive top down

paradigm, and instead implement the solution in a bottom up way.

So solving sub problems in a principled way from the smallest to the original

one, the biggest. So a little bit more precisely,

let me use the notation G sub I to denote the sub graph of the original graph,

consisting of the first I vertices. So we're going to again going to ha-,

going to have an array with the same semantics as in the recursive version.

The Ith entry denotes the solution to the Ith sub-problem.

That is the max rate independent set of G sub I, and the plan is to populate that

bottom up, that is from left to right. So we'll handle the edge cases of the

first two entries of, of this array specially G sub zero would be the empty

graph, so there's no independent set. So lets define the max weight,

independent set of the map graph, to be zero.

And, if you graph in G one, where the only vertex is v one, the max weight

independent set is obviously that one vertex.

Remember weights are not negative. So our main four loop just builds up

solutions to all of the sub-problems in a systematic way, going from smallest

graph, G sub two up to the biggest graph, the original one, G sub n.

And when we consider the graph G sub I, how do we figure out what the max weight

independence set is of that graph? Well now we use, completely directly, the

work that we put in the previous video. The previous video told us what an

optimal solution has to look like. And it has to have one of two forms.

We know, we proved, either, the maximum independent set of G sub I.

Excludes the last vertex V sub I, and then is merely an optimal solution of the

graph G sub I minus one. If it's not that, there is nothing else

it can be, other than including the last vertex V sub I.

And being an optimal max weighted independent set for the graph G sub I

minus two. We know its one of those two things.

We don't know which. We do brute force search among the two

possibilities, and that gives us the optimal solution

for the Ith sub problem. Crucially, when we need to do this brute

force search for the Ith sub problem, we already know.

We've already computed the optimal solutions to the smaller sub problems

that are relevant, those can be looked up in constant time, and that's what makes

this, each iteration of this four loop run in constant time.

So we've done a fair amount of work to get to this point,

but, after seeing that the greedy algorithm design paradigm failed us.

The divide-and-conquer algorithm design paradigm was inadequate.

Brute force search is too slow. With this, as we'll see, dynamic

programming algorithm, we now have a one line solution, effectively, to the max

weight independent set problem in path graphs.

Pretty cool. What's the run time?

Well this is probably the easiest algorithm is run time we've ever had to

analyze. Obviously, it's linear time,

constant time per each iteration of the four loop.

Why is the algorithm correct? Well it's as same as our recursive

algorithm. It makes exactly the same decisions.

The only difference is it doesn't bother with the spectacular redundancy of

resolving sub-problems it's already solved.

Again if you wanted to prove it by scratch, it would just be a straight

forward induction, like in our divide and conquer algorithm, the recursive calls

are correct by the inductive hypothesis. The inductive step is justified by the

case analysis of the of the previous video.