0:00

So now, let's talk about the final and most serious problem, which is that stuff

in the Internet, nodes, links, etc., is failing all the time.

You know, it's true that we just argued that the asynchronous Bellman-Ford shortest-path routing protocol is guaranteed to converge to correct shortest paths. But that was assuming that the network was static, that no links were going up and down.
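That convergence claim on a static network can be sketched with a tiny simulation. This is an illustrative toy (synchronous for simplicity, with made-up names), not real router code:

```python
# Minimal sketch of Bellman-Ford relaxation toward one destination:
# each node repeatedly sets d(v) = min over arcs (v, w) of c(v, w) + d(w).
INF = float("inf")

def bellman_ford_to(dest, arcs, nodes):
    """arcs: dict mapping (v, w) -> cost. Returns distance estimates to dest."""
    dist = {v: (0 if v == dest else INF) for v in nodes}
    for _ in range(len(nodes) - 1):  # enough rounds on a static network
        for (v, w), c in arcs.items():
            dist[v] = min(dist[v], c + dist[w])
    return dist

# The s -> v -> t network from the upcoming example, unit costs, no failures.
arcs = {("s", "v"): 1, ("v", "t"): 1}
print(bellman_ford_to("t", arcs, ["s", "v", "t"]))  # {'s': 2, 'v': 1, 't': 0}
```

On a static network these estimates only ever decrease and stabilize at the true shortest-path distances; the trouble discussed next begins when links fail after convergence.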

So one particularly simple problem in the presence of failures is what's known as

the counting-to-infinity problem. So I'm going to show you a particularly contrived version of this problem, just to illustrate the phenomenon in the

simplest way, but it's quite easy to come up with more realistic examples.

And this problem is not just some kind of theoretical flaw in the distributed

Bellman-Ford algorithm. This problem really was observed in practice with routing protocols in the 1970s.

So imagine we just have a super simple network with vertices s, v, and t, a directed arc from s to v and an arc from v to t, everything with unit cost, and we're routing to the destination t. You might, if you want a more realistic example, think of the arcs as standing in for longer paths that have multiple hops. In any case, let's imagine that we successfully computed shortest paths on this network, so that the distance from t to itself is zero, from v to t is one, and from s to t is two.

Now, again, the issue is that links are failing all the time on the internet.

So, at some point, the link from v to t might go down. v is going to notice that its link to t has failed, and it's going to have to reset its shortest-path estimate to t to be plus infinity.

And in an effort to recover connectivity to t, it makes sense for v to ask its neighbors: do they have paths to t? And if so, how long are they?

So in particular, v will poll s, and s will say, oh yeah, I have a path to t of distance only two. And v says, oh well, that's great.

Currently, my estimate to t is plus infinity. I can get to s with length one; s says it can get back to t with length two. So that gives me a path of length three to t.

Now, of course, the flaw in this reasoning is that s was depending on v to get to t, and now, all of a sudden, v is circularly going to use s to get all the way back to t. So this already illustrates the dangers

of naively implementing the asynchronous Bellman-Ford algorithm in the presence of link failures. Where does the name counting to infinity come from? Well, you can imagine that for some

implementations of this protocol, s would then say: oh boy, okay, v just revised its shortest-path estimate to t to be three.

My next hop was to v. So if v takes two longer to get to t, then I'm going to take two longer to get to t as well. So at that point, s updates its shortest-path distance to be four.

And now this keeps going on. At this point, v says, oh well, if s is going to take two longer to get to t, and I was depending on s, then I have to go up by two. And then s goes up by two, and v goes up by two, and so on, forevermore. So failures cause problems for the convergence of these distributed shortest-path routing protocols.
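The back-and-forth just described can be replayed in a few lines. This is a hypothetical toy simulation of the naive updates on the s, v, t example, not real protocol code:

```python
INF = float("inf")

# Converged estimates to destination t before the failure: t=0, v=1, s=2.
dist = {"t": 0, "v": 1, "s": 2}

# The link v -> t fails; v resets its estimate and re-polls its neighbor s.
dist["v"] = INF

# Naive distance-vector updates: v believes s's (now stale) estimate,
# then s revises based on v, and the two estimates chase each other upward.
history = []
for _ in range(5):
    dist["v"] = 1 + dist["s"]   # v: "s says it can reach t, so I can too"
    dist["s"] = 1 + dist["v"]   # s: "if v takes two longer, so do I"
    history.append((dist["v"], dist["s"]))

print(history)  # [(3, 4), (5, 6), (7, 8), (9, 10), (11, 12)]
```

The estimates increase by two forever and never reconverge, which is exactly the counting-to-infinity behavior.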

Not just the counting-to-infinity problem, there are others as well.

And this is a tough issue. Much blood and ink has been spilled over

the relative merits of different possible solutions for dealing with failures.

Back in the 1980s, people were largely proposing sort of hacks to the basic protocol. I will give them credit: they had some pretty ridiculous and fun names for their hacks, like "split horizon with poison reverse." But what eventually people converged on is moving from these so-called distance-vector protocols to what are called path-vector protocols.

3:40

And what happens in a path-vector protocol is, each vertex maintains not just the next hop to every possible destination, but it actually keeps locally the entire path that is going to get used from v to each destination t.

So there's definitely some irony here because this is essentially the solution

to reconstructing optimal solutions in dynamic programming that we've been

studiously avoiding this whole time. You might recall, when we first started discussing reconstruction algorithms, this was back with independent sets in path graphs. We first said, well, you know, you could

in the forward pass through the array store not just the optimal solution value

but the optimal solution itself. But we argued, well, that's not

really the best idea. It wastes time.

It wastes space. Much smarter is to just reconstruct an

optimal solution via a backward pass through the filled in table.

But what's happening here is this optimized version of the reconstruction

algorithm that doesn't use extra time or space, it's just not robust enough to

withstand the vagaries of internet routing.

So we are reverting to the crudest solution, where we literally just store optimal solutions, not merely the optimal solution value by itself.

So that explains why this is called a path vector protocol.

In a distance-vector protocol, you store a distance for each possible destination. Here, we're storing a path for each possible destination.

So the drawback is exactly what we've been saying all along.

If you store the optimal solutions, and not just the optimal solution value, it's

going to take quite a bit more space. And indeed, doing this blows up the

routing tables in routers, that's just a fact.

There are two quite important benefits. The first we've already discussed: path-vector protocols are more robust to failures. I am not going to discuss the details

about how you use this extra path information to handle failures better.

But certainly, in our contrived motivating example, you can see how, if s and v actually knew the entire paths they were storing to t, they could make sure that they didn't get into this counting-to-infinity problem.
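Here's one way to see that idea in code, a minimal hypothetical sketch in which a node simply rejects any advertised path that already contains itself:

```python
# Toy path-vector loop check (illustrative names, not real protocol code):
# a node refuses any route whose stored path already passes through it.
def accept_route(me, advertised_path):
    """advertised_path is the neighbor's full path to t, e.g. ["s", "v", "t"]."""
    return me not in advertised_path

# After the v -> t link fails, s advertises its stored path to t.
s_path_to_t = ["s", "v", "t"]

# v sees itself on s's path, so it rejects the route instead of
# circularly routing back through s.
print(accept_route("v", s_path_to_t))  # False: route rejected
print(accept_route("x", s_path_to_t))  # True: some other node could accept it
```

With only a distance advertised, v has no way to detect that s's route secretly depends on v itself; with the full path, the loop is visible immediately.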

But moving to a path vector protocol doesn't just add robustness, it also

enables new functionality. So in particular, passing around entire paths allows vertices to express more complicated preferences over routes than merely what their lengths are.

Imagine, for example, you were a small internet provider.

And you have a much more favorable contract with AT&T than you do with

Sprint. So it costs you much less money to send

traffic through AT&T as an intermediary than it does through Sprint.

Well, now imagine you're running this distributed shortest-path protocol.

And two of your neighbors offer you two paths to get to some destination T.

The first path goes through AT&T, the second path goes through Sprint.

And, of course, the only reason you know this information is because it's a path

vector protocol, and you're passing around the actual path and you can see

what the intermediate stops are. Well, then you're free to say: oh well, the path that goes through AT&T is the one I'm going to use as my path to t, and that's the path I'm going to advertise and store for other people.

I'm going to ignore the one that goes through Sprint.
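As a toy illustration of that kind of policy, here's a hypothetical sketch that ranks advertised paths by a made-up local preference for the first intermediary, breaking ties by length:

```python
# Assumed contract-based preferences (higher is better); purely illustrative.
PREFERENCE = {"AT&T": 2, "Sprint": 1}

def best_route(candidate_paths):
    # Policy first: prefer the higher local preference of the first
    # intermediary; only break ties by shorter path length.
    return max(candidate_paths,
               key=lambda p: (PREFERENCE.get(p[1], 0), -len(p)))

routes = [
    ["me", "Sprint", "t"],        # shorter, but through Sprint
    ["me", "AT&T", "hop", "t"],   # longer, but through the preferred AT&T
]
print(best_route(routes))  # ['me', 'AT&T', 'hop', 't']
```

Note that this selection can override pure shortest-path routing, which is only possible because the full paths, not just the distances, are being passed around.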

So that is more or less how the Border Gateway Protocol, aka BGP, works, and that is the protocol which governs how traffic gets routed through

the core of the Internet. So that wraps up our discussion of how shortest-path algorithms dating back to the 1950s have played a fundamental role in how shortest-path routing has evolved in the Internet over the past half century.