0:00

We're now well positioned to state Johnson's algorithm in its full

generality. The input as usual is a directed graph G.

Each edge has an edge length C sub E, which could be positive or negative.

The input graph G may or may not have a negative cost cycle.

As usual, when we talk about solving a shortest path problem we mean that if

there is no negative cost cycle, then we have no excuse. So we better correctly

compute all of the desired shortest paths.

In this case between all pairs of vertices.

If there is a negative cost cycle, we're off the hook.

We don't have to compute shortest paths, but we

need to correctly report that the graph does indeed have a negative cost cycle.

Step one is the same hack that we used in the example.

We want to make sure that there's a vertex from which we can reach everybody

else. So let's just add a new one that by

construction has that property. So we add one new vertex, little s, and we

add N new edges, one from s to each of the original

vertices, and each of those new N edges has length zero.

We're going to call this new, bigger graph g prime.

The second step, just like in the example, is to run a shortest path

computation using this new vertex S as the source.

Since the graph G in general has negative edge costs, we have to resort to the

Bellman-Ford algorithm to execute the shortest path computation.
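
To make this concrete, here is a rough sketch of steps one and two in Python. The representation is an assumption for illustration, not from the lecture: vertices are numbered 0 through n minus 1, edges are (tail, head, length) triples, and the function name is made up.

```python
# Steps 1 and 2 of Johnson's algorithm: augment G with a new source s
# joined to every vertex by a zero-length edge, then run Bellman-Ford
# from s to get vertex weights (or detect a negative-cost cycle).

def bellman_ford_from_new_source(n, edges):
    """Return shortest-path distances from the added source s to each
    original vertex, or None if there is a negative-cost cycle."""
    s = n  # the new vertex s
    augmented = edges + [(s, v, 0) for v in range(n)]
    dist = [float("inf")] * (n + 1)
    dist[s] = 0
    # n relaxation passes suffice for the (n+1)-vertex graph G'
    for _ in range(n):
        for u, v, length in augmented:
            if dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
    # one extra pass: any further improvement reveals a negative cycle
    for u, v, length in augmented:
        if dist[u] + length < dist[v]:
            return None  # negative-cost cycle (necessarily in the original G)
    return dist[:n]  # the vertex weights, one per original vertex
```

Since s reaches every vertex through a zero-length edge, all these distances are finite whenever the graph has no negative cycle.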

Recall that the Bellman-Ford Algorithm will do one of two things.

Either it will correctly compute the shortest path distances that we're after,

from the source vertex S to everybody else, or it will correctly report

that the graph it was fed, namely G prime, contains a negative cost

cycle. Now, if G prime contains a negative cost

cycle, that cycle has to be in the original graph, G.

After all, the new vertex s that we added has no incoming arcs at all, so it

certainly can't be on any cycle, so any negative cycle is the

responsibility of the original graph, G. Therefore, if Bellman-Ford finds

us a negative cost cycle, we're free to just halt and correctly report that same

result. So from here on out, we can safely assume

that G and G prime have no negative cost cycle, and therefore that the

Bellman-Ford algorithm did indeed correctly compute shortest path distances

from S to everybody else. As in the example, we are going to use

these shortest path distances, all of which are finite by construction,

as our vertex weights, our P sub Vs. We then form the new edge lengths, the C

primes, in the usual way. C prime of an edge E going from U to V is

the original length C sub E plus the weight of the tail P sub U minus the weight of the

head P sub V. In the example, we saw that the new edge

lengths C prime are all non-negative. We will shortly prove that that property

holds in general. If you define your vertex weights in

terms of shortest path distances from the new vertex S to everybody else in this

augmented graph G prime, it is always the case that your new edge lengths will be

non-negative. Assuming that for now, it makes sense to

use Dijkstra's algorithm N times, once for each choice of source vertex, to compute all

pairs shortest paths in the graph G with the new edge lengths C prime.

In a particular iteration of this for loop,

say, when we're using the vertex u as the source vertex,

we're going to be computing n of the shortest path distances that we care

about, from u to the n possible

destinations. And let's call those shortest path distances

computed by Dijkstra in this iteration d prime of u, v.
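
Steps three and four might look like the following sketch, assuming a hypothetical edge-list representation (edges as (tail, head, length) triples) and a list p of vertex weights produced by the Bellman-Ford step; all the function names here are illustrative.

```python
import heapq

# Step 3: form the modified lengths C'(u, v) = C(u, v) + p[u] - p[v],
# where p[v] is the vertex weight (shortest s-v distance in G').
def reweight(edges, p):
    return [(u, v, length + p[u] - p[v]) for u, v, length in edges]

# A textbook heap-based Dijkstra; valid only because the modified
# lengths are non-negative.
def dijkstra(n, edges, source):
    adj = [[] for _ in range(n)]
    for u, v, length in edges:
        adj[u].append((v, length))
    dist = [float("inf")] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in adj[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

# Step 4: one Dijkstra run per choice of source vertex u; row u holds
# the distances d'(u, v) under the modified lengths.
def all_pairs_modified(n, edges, p):
    c_prime = reweight(edges, p)
    return [dijkstra(n, c_prime, u) for u in range(n)]
```

Each row of the output is the d prime of u, v for one iteration of the for loop described above.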

3:28

So perhaps you're thinking that at this point we're done.

Right? We transformed the edge lengths using

re-weighting to make them non-negative, and as we know, once edge lengths are

non-negative, you can run Dijkstra and you're good to go.

But there's one final piece of bookkeeping that we have to keep track

of. So, the shortest path distances that we

compute in step four, these D primes, these are the shortest path distances

with respect to the modified costs, the C primes.

And what we're responsible for outputting is the shortest path distances with respect

to the original lengths, the C's. But fortunately there's a very easy way

to extract the true shortest path distances from

the D primes, from the shortest path distances relative to the modified costs.

We know that the D primes are wrong; they're computed with respect to the wrong costs.

But we know exactly what they're off by. So the D primes are off by exactly a P sub U

minus P sub V amount. So to go from the D primes back to

shortest path distances that we really care about, we just need to subtract that

term back off, and that's step five. That completes the description of

Johnson's Algorithm which, in effect, is a reduction from the all pairs shortest

path problem with general edge lengths to N plus one instances of the single source

version of the problem, only one of which has general edge lengths, N of which have

only non-negative edge lengths. Analyzing the running time of

Johnson's algorithm is straightforward. Let's just look at the work done in each

of the five steps. In step one, we just add one new vertex

and n new edges, so that takes O of n time to accomplish.

6:06

So why is this running time so cool? Well two reasons.

The first reason is that for sparse graphs this solution blows away the

previous algorithms that we had that could handle negative edge lengths.

The two previous solutions that we knew, that you could use with graphs with

negative edge lengths, were either running Bellman-Ford N times or using the

Floyd-Warshall algorithm. And in sparse graphs, where M is big O of

N, both of those solutions run in cubic time, N cubed time.

This solution, for sparse graphs when M is O of N, is O of N squared times a log

N factor, so it's way better than either using Bellman-Ford N times or using

Floyd-Warshall. The second reason this running time is so

cool is that it matches the running time we were getting already in the special

case when edge lengths were non-negative. So this is very different from how we

were conditioned to think about the world when we talked about single source

shortest path problems. Remember, in those problems, we have Dijkstra's

algorithm, which only handles non-negative edge lengths but runs in near linear time,

and there's no algorithm known for single source shortest paths that's near linear

that can handle negative edge lengths; the Bellman-Ford algorithm certainly does not

run in near linear time. That's big O of M times N.

Johnson's Algorithm shows that while negative edge lengths are indeed an

issue, they can be handled in one shot using just one invocation of

Bellman-Ford. While the Bellman-Ford algorithm is

indeed quite a bit slower than Dijkstra's algorithm, it's only roughly a factor of

N slower. So one invocation of Bellman-Ford doesn't

change the overall running time when we already have to do N invocations of

Dijkstra's algorithm, even in the special case of the problem where all of the edge

lengths are non-negative. So that is why for all pairs shortest

paths we see a convergence in the performance of algorithms that solve the

problem in general, and algorithms that only solve the problem in the special

case of non-negative edge lengths. The missing piece to the correctness

argument is understanding why, in general, the modified edge lengths, the C prime E's,

are guaranteed to be non-negative. We saw that they were non-negative in

this specific example, but we have not yet proved it in general.

We'll do that on the next slide. Assuming for the moment that that is

true, that the modified edge lengths are indeed non-negative, and therefore, when

we invoke Dijkstra it will correctly compute shortest paths, we're pretty much

done. In particular, recall that we had a quiz

in the previous video where we analyzed the ramifications of re-weighting.

And we saw that if you re-weight using particular vertex weights, some P sub Vs.

8:34

Then, for a given choice of an origin u and a given choice of a destination v,

every single u-v path has its length changed by exactly the same amount, by a

common amount: namely, the difference between u's

vertex weight, P sub u, and the destination v's vertex weight, P

sub v. So that means, when we invoke Dijkstra's

algorithm in step four, it indeed gets the shortest paths correct.

The shortest path distances are incorrect.

But we know exactly how much they're off by.

They're off by P sub U minus P sub V, and that is corrected for in step five.
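
Step five is just that subtraction, sketched below with hypothetical names: d_prime for the matrix of distances under the modified lengths, and p for the list of vertex weights.

```python
# Step 5: undo the reweighting. Every u-v distance was shifted by
# exactly p[u] - p[v], so subtracting that shift back off recovers the
# true shortest-path distances under the original edge lengths.
def correct_distances(d_prime, p):
    n = len(p)
    return [[d_prime[u][v] - p[u] + p[v] for v in range(n)]
            for u in range(n)]
```

Note that unreachable pairs stay at infinity, since subtracting a finite shift from infinity leaves infinity.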

So that's why, assuming correctness of Dijkstra's algorithm, which in turn relies

on non-negativity of the modified edge lengths,

the end result of Johnson's algorithm is indeed the correct shortest path

distances. To finish the analysis of Johnson's

algorithm all that remains is to prove the following claim.

Which is that for every single edge of the graph, its length after we do

re-weighting is non-negative. So the proof, in fact, is not hard.

It follows quite easily from properties of shortest path distances.

So fix your favorite edge E, let's say going from U to V.

The vertex weights are, by construction, just shortest path distances from the

vertex little s. Well remember S is the extra source

vertex that we added to the original input graph G.

10:38

Consider the path that follows a shortest path from S to U and then appends the edge E going from U to V.

The length of this path, which goes from S to V, is of course just the length

incurred going from S to U, which is just P sub U, plus the length of the final hop,

which is just the length of this edge, C sub U V.

Now this is just some old path from S to V.

The shortest path from S to V can, of course, only be shorter.

And P sub V, the vertex weight of V, is by definition the length of the

shortest path from S to V. And now, we're good to go because the

modified length of this edge E, that is, C prime sub E, is just the

difference between the right hand side of this inequality and the left hand side.

So that's why the difference is guaranteed to be non-negative.

Since this was an arbitrary edge, it holds for all the edges.
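
Spelled out as a quick mechanical check, the inequality P sub V at most P sub U plus C of U, V is exactly the statement that the modified length is non-negative. A small sketch, again with a hypothetical edge-list-and-weights representation:

```python
# For each edge (u, v): p[v] <= p[u] + length is equivalent to saying
# the modified length, length + p[u] - p[v], is at least zero.
def modified_lengths_nonnegative(edges, p):
    return all(length + p[u] - p[v] >= 0 for u, v, length in edges)
```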

That completes the proof and the analysis of Johnson's algorithm.