The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

来自 Stanford University 的课程

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

664 个评分

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

从本节课中

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim RoughgardenProfessor

Computer Science

Okay. So to this point we've proven the correctness of Prim's minimum spanning

sheet algorithm under an assumption, under an assumption that the cut property

is true. So, the purpose of this video was to

supply the missing proof to convince you of this cut property.

Let me remind you where we stand. So through all the minimum spanning tree

videos, we're assuming distinct edge costs.

All of this can be extended to edge costs with ties.

In particular, there's a version of the cut property when the edges have ties,

but we're just going to focus on the main ideas which were exposed already with

distinct edge costs. So what does the cut property say?

Well, it's meant to be a guarantee that an edge is safe to include on, in your

tree so far. So it justifies an iteration of a greedy

algorithm like Prim's algorithm. Specifically, consider an edge of a

graph, and suppose you can find some cut A, B.

So that, amongst all the edges that are crossing this cut, E is the cheapest.

So E, the edge E has to not just cross this cut,

but it has to be cheaper than any edge that crosses this cut.

If you can find just one cut of this form, so that E is the cheapest crossing

edge, then it's definitely not a mistake to

include E in your tree. You definitely want it.

It is in the MST. So this is a non-trivial claim and, let's

turn to the proof. At a high level, the plan will be not

that different than the correctness of our greedy scheduling algorithm for

minimizing the weighted sum of the completion times.

That is, we're going to use an exchange argument embedded in a proof by

contradiction. [COUGH] The type of contradiction will be

of the same form. We'll start with an optimal solution.

Suppose it doesn't have the property that we want it to have,

and then we do an exchange to make it even better, contradicting the assumption

that this solution is optimal. So specifically, if we argue by

contradiction we assume, that the cup property is false.

So let's just make sure we understand what that means.

If the cup property is false, then there's a graph and there's an edge,

which actually is the cheapest crossing some cut, and yet, that edge does not

belong to the minimum cost spanning tree T star.

The plan then is to exchange this missing edge E with some edge that isn't a tree T

star, which is more expensive, thereby getting a better spanning tree providing

the contradiction. So, this idea currently is a little bit

hand-wavy. To really execute it, we have to specify

which edge we're going to exchange the edge E with.

That's a subtle point and we'll develop it over the next couple of slides.

So let's begin with a sort of first cut attempt at an exchange argument.

So what's the world look like? Well, we have some cut of a graph.

So at one blob, the vertices is A and then the rest of the vertices are in this

blob B. this is the cut for which edge E is the

cheapest. And by our assumption in this proof by

contradiction, this cheapest edge E does not belong to the minimum spanning tree T

star. That said,

I claim while T star may not have this edge E to cross in this cut, it better

possess some other edge crossing this cut A, D.

So why is that true? Well, suppose the opposite, suppose in

fact T star did not have any edge crossing this cut A,

B, well then, T star wouldn't be connected. It wouldn't span all the

vertices, right?

Remember our proof of empty cut lemma, so if you had this empty cut and there's no

way to have a path from one side to the other side.

Okay? But that's spanning trees have to have

paths between each pair of vertices. So T star as a spanning tree have to

contain something from this cut, by assumption it doesn't contain edge E. So

it contains some other edge, let's call it F, that crosses this cut.

Now of course, since E is the cheapest edge crossing this cut and F is some

other edge crossing this cut, F is strictly more expensive than E.

And at this point, we seem beautifully set up to execute the desired exchange

arguement. We have the edge that the optimal solutions missing. We have a

canvid replacement edge F, which is more expensive.

So if we swap E and F, hopefully we get a new spanning tree that has strictly

smaller cost providing the desired contradiction.

But, things are more settled than they were with these scheduling applications.

The reason being is that schedules are simply sequences of jobs.

So whenever you do an exchange of two jobs, it's clear you get another

schedule. But spanning trees and graphs are subtle

objects and there's a question, if we take a spanning tree and we add one new

edge and delete an edge, do we get another spanning tree of the

graph or not? So the following quiz is going to ask you

to think about that question carefully. Okay. So what we wish that the answer to

this quiz was, was either answer A or answer C.

So A would be the cleanest solution. If it were always true that when you take

a spanning tree, you take an edge outside of the spanning tree and then you swap

those two edges, you get a new spanning tree,

then in fact, our proof of the cut property would be done, right?

We would just go on that previous slide. We would rip out the edge F from the

spanning tree. We'd plug in the edge E,

because E costs less than F, we'd get a spanning tree which was

cheaper. And we'd be done,

that would be our contradiction. Now if A wasn't true, we'd still be okay

if C was true. If maybe not every swap yields a new

spanning tree, but at least if you're swapping in the

edge that's the cheapest crossing some cut,

you get a spanning tree. Then we'd also be golden, because in

fact, we're only are trying to execute the swap,

the exchange, using an edge, which is the cheapest crossing some cut.

If you go back to the previous slide, you'll see that was the only case we

needed this fact to be true. Unfortunately, the correct answer to this

quiz is D. You need not get a new spanning tree when

you execute an exchange, even if you're plugging an edge which is the cheapest

edge crossing some cut. So to understand this better, let me the

picture that we had on the previous slide.

We had our cut A, B, we had our cheapest edge E which by assumption does not

belong to the spanning three T star, but we observed that T star has to contain at

least one other edge crossing this cut because it's connected and we called that

F. And we're wondering if swapping E and F

yields a new spanning tree or not. So, to reason about this, let me just

draw you what the rest of the spanning tree might look like.

So in this picture, this spanning tree T star is given by the pink edges.

And it's just to this path of five edges on the six vertices.

So, what happens if we exchange E and F? Well unfortunately, something bad

happens. So we certainly get a new subgraph of

five edges, after all, we just subtracted one and

added one. But this new spanning, this new subgraph

fails to be a spanning tree. It fails on both counts.

First of all, it obviously has a cycle and it's a four cycle and secondly, it's

not connected. The upper right vertex is just totally

disconnected from the rest of the rest of the vertices.

So that's no good. That's an exchange which just does not

produce a feasible solution and it is therefore not useful in our proof by

contradiction. Now, if you just want to salvage some

hope from this seemingly promising proof plan, we could take solace from the fact

that there is not just one pink edge crossing the cut A, B.

F isn't the only one, there's actually this other one on the bottom.

so let me call that E prime. Now, E being the cheapest edge crossing

this cut overall. Not only is E cheaper than F, it's

cheaper than E prime also. So in some sense with our motivation, we

could care less which edge we exchange E from crossing this cut, because it's

cheaper than all of them. And we see that at least in this example,

swapping with E prime yields a good solution, yields a feasible spanning

tree. So what have we learned?

What we've learned is that if we want to execute this exchange argument, we cannot

blithely exchange with any edge of T star that crosses this cut.

So the best case scenario, so what we're hoping is true that we can always find

some suitable edge, like E prime on the previous slide.

So that when we execute this swap, we do in fact, get a spanning tree.

And I'm happy to report that we can indeed always do this.

So what I need to explain is the procedure by which we exhibit edge, this

edge E Prime, which doesn't get us into trouble after we swap, which still gives

us a spanning tree after the swap. So let me explain the procedure by which

we identify this magical edge E prime that we can swap with and still be a

spanning tree. So here's the way to think about it, so

we've got this spanning tree T star, we've got this edge which is not yet in T

star. Now, if we just plug E into T star, we're

going to get a cycle. Why?

Well, a spanning tree, remember, it has a path between each pair of vertices. So if

this new edge, maybe its endpoints are U and V. T star already has a path between

U and V, so when you plug in this new direct edge between U and V, it closes

the loop, it gives you a cycle. So let's go ahead and call that cycle

capital C. Let me also redraw the picture from the

example on the previous slide so you can see what that cycle was in that special

case. Now, here's the pattern to notice about

this cycle capital C, at least in this example, which is that the lousy edge,

the edge F, for which when we swapped, we didn't get a spanning tree, that's off of

this capital C. Whereas, the good edge, the edge where we could do a swap and get

a spanning tree, E prime that's on this same cycle capital C.

And that turns out to be true in general. So, when you add the edge to the spanning

tree and you get a new cycle, that cycle is what furnishes the candidates for

swaps that will give you a new spanning tree.

So the one lingering concern then, we have this cycle.

We would, all edges of the cycle are going to be good candidates for the

swapping. Wee just need to make sure that there is some edge that actually crosses

this cut A, B like the edge E prime does in the picture.

But here, we're going to rely on a fact from a previous video, the double

crossing lemma. Remember the double crossing lemma says,

that if you have a cycle that crosses a cut at least once, then it has to cross

it twice. All right.

So if it'd cross once, then it has to loop back around, then in looping back

around, it's going to cross it a second time.

So, in this cycle capital C, we know it crosses the cut A, B once,

that's because it includes the original cheapest edge across the cut E.

So, it's got to cross it a second time. There's got to be an E prime in the cycle

crossing the cut and that's going to allow us to do the swap and get a new,

cheaper spanning tree completing the contradiction.

So, just to spell things out in a little more detail.

So what we do is we first say we use the double crossing lemma.

So, we have this reported minimum spanning tree T star.

We have this cheap edge E knot in it. We plug E into the spanning tree, we get

cycle, we call the cycle capital C. The cycle crosses the cut A, B once,

through the edge E. It crosses it a second time by the double

crossing lemma. We're going to call that edge E prime.

Since E prime crosses the same cut as E, we know that E prime is strictly more

expensive than E. Remember we use the cheapest one crossing

this cut, A, B.

So now what we do is we execute the swap with this new edge E prime.

So E prime in T star. The cheapest edge, E, is not in T star so

we can swap them. We can take,

we can rip E prime out, we can stick E in.

Now something I want you to think through carefully at home, convince yourself this

is true, is that because we plucked E prime from the cycle, this new tree which

I'm going to call capital T, this is a spanning tree necessarily.

You know, intuitively, the reason being, you plug in E and you get this one cycle

involving E, and then when you rip out E prime, you destroy the cycle.

And because it's on a cycle, you don't destroy connectivity between any pair of

veriticies, there is still one path between each pair.

But make sure you believe that, convince yourself at home.

And once you're so convinced, you will also realize that we've finished a proof.

We've executed our proof by contradiction.

since E was the cheapest edge crossing the cut, and E prime is another edge

crossing the cut, E is got to be cheaper.

Since T differs from T star only in the swap of these two edges,

it's aggregated cost has gone down and that contradicts the purported optimality

of T star comp completing the proof of the cut property.