0:00

In this video, and then in the next, we're going to develop, from scratch, an

algorithm for the all-pairs shortest paths problem,

rather than relying on the aforementioned reduction to the single-source

version of the problem. So, in this video, we're going to develop

the relevant optimal substructure lemma. The next video will culminate in the

dynamic programming algorithm, a famous one:

the Floyd-Warshall algorithm. Before developing the optimal

substructure, let me just tell you a little bit about where we're going.

The optimal sub-structure we're about to identify will lead to a dynamic

programming solution via the usual recipe, which we've now seen over, and

over, and over again. It's going to solve the all-pairs

shortest paths problem in O of N cubed time.

So we get a bound independent of the sparsity.

N cubed, even if the graph has negative edge lengths.

This algorithm is known as the Floyd-Warshall algorithm.

0:50

Last video we discussed the performance that you get by running a single-source

shortest-path subroutine N times. And the Floyd-Warshall algorithm is a

better solution in many situations. So first of all, for graphs that have

general edge lengths, that is, edge lengths that are allowed to be

negative, Floyd-Warshall is looking pretty good.

Remember, the previous solution was to run the Bellman-Ford algorithm N

times. We can't use Dijkstra's algorithm because

that's not generally correct with negative edge costs.

And if you run the Bellman-Ford algorithm N times, you get a running time of N

squared times M. So even in the best case for

Bellman-Ford, sparse graphs, we're doing just as well with Floyd-Warshall.

We've got a cubic running time. And we're doing quite a bit better for

dense graphs. There, N invocations of Bellman-Ford will give us N

to the fourth, a quartic running time, whereas here we're

getting cubic. Now, for graphs with nonnegative edge

costs, reducing to the single source problem is actually a pretty good

solution because Dijkstra is so fast. So if you run Dijkstra N times, your

running time is going to be N times M times log N.

So for sparse graphs, you'd really want to use Dijkstra N times, rather than

Floyd-Warshall, because you get a running time roughly

quadratic in N, as opposed to the cubic running time we're going to get here.

For dense graphs, it's not so clear. If you run Dijkstra N times, you're going to

get a running time which is ballpark cubic, so that's going to be roughly the

same as Floyd-Warshall. And you'd expect them to be comparable.

It's not clear which would be better in practice.

If you had to choose between those two solutions

for dense graphs, you'd really want to code them both up and pick whichever one

seemed to do better in your domain. One place you see the Floyd-Warshall

algorithm used quite a bit is for computing the transitive closure of a

binary relation. In graph-theoretic language, you can

think of the transitive closure problem as computing all-pairs reachability.

So that's a special case of all-pairs shortest paths, where you just want to

know whether the shortest path distance is finite or infinite.

For each pair of vertices you want to know does there exist a path from the one

vertex to the other or not. And if all you care about is the

transitive closure problem, if all you care about is one bit of information for

each pair of nodes, connected or not, as opposed to more generally what the

shortest path length is, then you can do some optimizations on the Floyd-Warshall

algorithm to speed up the constant factors.

But I'll leave it up to you to read up on the web about that if you're interested.
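As a sketch of that boolean specialization (the function name, adjacency-list representation, and 0-indexed vertices here are my own assumptions, not from the lecture): replace min and + on path lengths with or and and on a single reachability bit per pair.

```python
# Transitive closure via the boolean ("or/and") specialization of
# Floyd-Warshall: track one reachability bit per pair of vertices
# instead of a path length.
def transitive_closure(n, edges):
    # reach[i][j] will mean: does there exist a path from i to j?
    reach = [[i == j for j in range(n)] for i in range(n)]
    for (u, v) in edges:
        reach[u][v] = True
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # j is reachable from i if it already was, or if we
                # can route through vertex k
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach
```

For example, on the three-vertex path 0 -> 1 -> 2, this reports that 2 is reachable from 0 but not vice versa.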

3:02

Now, I realize that this cubic running time is maybe not super

impressive. In general, as we've been talking about

dynamic programming algorithms, we've been starting to see, for the first time

in these courses, a proliferation of algorithms

which, you know, while polynomial time, while way better than exponential time,

you wouldn't really call blazingly fast. We've seen a lot of quadratic running

times. Now we're seeing some cubic running

times. So that's kind of a bummer, but I am

still teaching you the state of the art. As we mentioned in the last video,

it remains an open question to this day whether there are significantly

subcubic algorithms that solve the all-pairs shortest paths problem.

So now, let's formalize the optimal substructure in all-pairs shortest paths

which is exploited by the Floyd-Warshall algorithm.

A quick digression first though. At the beginning of the class I hope that

you know, if I had shown you this Floyd-Warshall algorithm, you would have

said wow, that's a really elegant algorithm.

How could I have ever come up with this? Whereas now that you're approaching the

black belt level in dynamic programming and you see just how mechanically the

algorithm, this famous algorithm is going to fall out of the really quite

easy optimal substructure. I hope you say, how could I not have come

up with this algorithm? Now I don't mean to take any credit away

from the solutions of Floyd and Warshall. There is one really great idea in how

they phrase this optimal substructure. Remember, we talked about this before with

the Bellman-Ford algorithm: it can be tricky to apply dynamic programming to graph

problems because there isn't really an ordering in the input.

You just get this bag of vertices and this bag of edges.

And it's often unclear how to define bigger and smaller subproblems.

4:39

So the really nice solution in the Bellman-Ford algorithm was to introduce

this extra parameter I, which was a budget on the number of edges, or

roughly equivalently the number of vertices that you are allowed to use in a

path between the origin and the destination for a

given subproblem. This naturally induces an ordering on

subproblems, the bigger the edge budget, the bigger the subproblem.
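As a reminder of that edge-budget idea, here is a hedged sketch of the Bellman-Ford recurrence (the variable names and the (tail, head, length) edge representation are my own assumptions): A[i][v] is the length of a shortest path from the source to v using at most i edges.

```python
import math

def bellman_ford(s, n, edges):
    # A[i][v] = length of a shortest s -> v path using at most i edges;
    # edges is a list of (tail, head, length) triples
    A = [[math.inf] * n for _ in range(n)]
    A[0][s] = 0
    for i in range(1, n):
        for v in range(n):
            A[i][v] = A[i - 1][v]  # case 1: don't spend the extra edge
        for (u, v, c) in edges:
            # case 2: append edge (u, v) to a budget-(i-1) path to u
            A[i][v] = min(A[i][v], A[i - 1][u] + c)
    return A[n - 1]  # n-1 edges suffice when there's no negative cycle
```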

We're going to do something similar but even a little bit more stringent in the

Floyd-Warshall solution. We're again going to restrict the number

of vertices that can appear in the middle of the path between a given origin and

destination in a subproblem. But more than just the number of

vertices we're allowed to use, we're going to restrict exactly which vertices,

the identities of the vertices, that the algorithm is permitted to use to get from

the given origin to the given destination.

6:12

So let me start now setting up the optimal substructure lemma.

In contrast to the Bellman-Ford case, I'm only going to prove the optimal

substructure limit of four input graphs that have no negative cycle.

We'll address the case of negative cycles after we're done with the algorithm.

Suppose for now there isn't one. So what then are the subproblems going to

be? Well it's going to be quite analogous to

Bellman-Ford. In Bellman-Ford we were solving a single

source shortest path problem so we needed to compute something for each

destination. So that gave us a linear number of

sub-problems and then for a given destination we had this parameter I

controlling the edge budget. So that was another linear factor, so we

had a quadratic number of subproblems. Here it's going to be the same except we

also have to range over our own origin. So we're going to get a cubic number of

subproblems, specifically a choice for the origin, N choices, a choice for the

destination, that's N more choices, plus which prefix of vertices we're allowed to

use. So that's still another N choices.

So cubic total. So now we focus on an optimal solution to

this sub problem. By which I mean amongst all paths which

begin at the vertex I, end at the vertex J, and strictly in between I and J, as

internal nodes contain only vertices from one through K.

Amongst all those paths look at the one with the shortest length.

And because we're looking at the case of no negative cycles, we can assume that

this path has no cycles. It's a cycle free path.

So rather than state the conclusion of the optimal substructure lemma right now,

let me just make sure you understand what one of these sub-problems is.

So, let's suppose that the origin I is the vertex that we've arbitrarily

labeled seventeen, the destination J is the vertex we've labeled ten, and

let's suppose K at the moment is only five.

Consider the following graph. Imagine this is a little piece of some

much bigger graph. Now I hope it's clear what the shortest

path is between seventeen and ten. It's the bottom two hop path, that path

has total length minus twenty. On the other hand, for the sub problem

we're dealing with, K equals five. So what that means is we have this extra

constraint on which vertices we can use in the middle of our paths.

Now, to be clear, this constraint, K at most five, doesn't apply to the origin or

the destination. Right?

I is 17, J is 10, both of those are bigger than five, but we don't care.

Obviously, any path from I to J, has to include both the vertices I and J.

So the constraint only applies to vertices in the middle of the path.

But this bottom two-hop path unfortunately makes use of the node

seven. So that path does not obey the

constraint, seven is not at most five, so that path is disallowed.

We cannot use it. Therefore, the shortest path from

seventeen to ten that makes use of only the first five labeled vertices

as intermediate ones would be the top three-hop path, which has the bigger

length of three. Okay, so I hope it's clear what a

subproblem corresponds to. It corresponds to a choice of I, the

origin. That's again just some vertex labeled

something from one to N. Similarly there's a choice of the

destination, some other vertex J which is labeled something from one to N.

And then there's a choice of the bound K, which governs which vertices you're

permitted to use internal to your path. So again the constraint doesn't apply to

the end points, just the vertices in between the end points.

And in the subproblem the choice of K says you can only use vertices one

through K, internal to your paths. So hopefully that makes sense, let's move

on to the full statement of the optimal substructure lemma.
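Before the lemma, the subproblem value itself can be pinned down by brute force. This is only an illustrative, exponential-time sketch; the function name, the cost-matrix representation, and the 0-indexing (so the allowed prefix is {0, ..., k-1} rather than {1, ..., k}) are my own assumptions:

```python
import math
from itertools import permutations

def subproblem_value(cost, i, j, k):
    # cost[u][v] = direct edge length (math.inf if absent, 0 when u == v).
    # Returns the length of a shortest cycle-free i -> j path whose
    # internal vertices all come from the prefix {0, ..., k-1}.
    best = cost[i][j]  # the direct edge uses no internal vertices at all
    allowed = [v for v in range(k) if v not in (i, j)]
    for r in range(1, len(allowed) + 1):
        for mid in permutations(allowed, r):
            path = (i,) + mid + (j,)
            length = sum(cost[u][v] for u, v in zip(path, path[1:]))
            best = min(best, length)
    return best
```

Just as in the lecture's example, a cheap path through a high-numbered vertex is simply disallowed when k is small, and the subproblem value falls back to the best path that obeys the prefix constraint.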

9:42

All right, so this is actually going to be one of the simpler optimal

substructure lemmas that we've seen in a while.

We're back to the glory days of only two cases.

The first is trivial: if the shortest path from I to J using only vertices one

through K as intermediaries doesn't even bother to use the vertex K, well then it

had better be a shortest path from I to J that uses internal nodes only from one

to K-1. Now it's in case two that it's going to be

apparent what a great idea ordering the vertices and looking at prefixes of

the vertices is, when you're considering the all-pairs version of the shortest

path problem. Suppose this path P from I to J does

indeed use the vertex K in the middle. Well then we can think of the path P as

comprising two sub paths. The first path P1 which starts at I and

goes to the vertex K. And then a path P2 which starts at K and

goes to the destination J. Now here's what's cool.

So on this path P the internal nodes between I and J all lie in one through K.

Moreover, this path P, remember, is cycle-free, so the vertex K appears exactly

once, not more than once. So therefore if we split the path P into

these two parts, P1 and P2, then internal to P1, strictly in between I and K,

there are only vertices between one and

Similarly, strictly between K and J in P2, there are only vertices between one

through K-1. Thus both of these paths, P1 and P2 can

be thought of as feasible solutions to smaller subproblems.

Subproblems with an even tighter budget of K minus 1 on the internal nodes that

are allowed. And these aren't just feasible solutions for smaller

subproblems, they're optimal solutions as well.

So, how slick is that? The maximum-indexed internal node

just splits this shortest path into two shortest paths for smaller sub-problems?

And, you know what? At this point, I am so confident in your

facility with dynamic programming, I am not going to even bother to prove this

statement. You are totally capable of proving this

yourself. The argument, it's really quite similar

to Bellman-Ford. I encourage you to do it as an exercise.
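As a preview of where that exercise leads (the next video develops the algorithm properly, so treat the names and the 2-D-array space optimization here as my own assumptions), the two cases of the lemma translate directly into a recurrence:

```python
import math

def floyd_warshall(n, cost):
    # cost[i][j] = direct edge length (math.inf if absent, 0 when i == j)
    A = [row[:] for row in cost]  # k = 0: no internal vertices allowed
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Case 1: the best path skips vertex k (keep A[i][j]).
                # Case 2: it passes through k, splitting into two
                # subproblems whose internal vertices lie below k.
                A[i][j] = min(A[i][j], A[i][k] + A[k][j])
    return A
```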
