0:00

Having laid the groundwork in the previous video, we're now ready to follow

our usual dynamic programming recipe to develop a faster-than-brute-force-search

dynamic programming algorithm for the traveling salesman problem.

Working through all those quizzes in the last video, we developed some healthy

respect for the traveling salesman problem.

We understood why the more natural approaches, which were sufficient for the

shortest-path problem, for example the Bellman-Ford algorithm,

are not good enough to solve TSP. Now, no surprise, TSP is an NP-complete

problem, so you were sort of ready for this, but we now understand it in a

pretty concrete way.

In the single-source shortest path problem, in a sub-problem, all you need

to know is where the path ends and how long the path can be.

That's not enough for the traveling salesman problem, because there's this

extra tricky constraint that, at the end of the day, we

visit every vertex exactly once. So, in particular, to make sure we don't

have repeat vertices show up in our sub-problem solutions,

we're going to need to remember not just where smaller sub-problems end,

but also the identities of all of the intermediate vertices in the path.

Here, then, is the final definition of the sub-problems that we're going to use.

We still want to keep track of where this path ends, so we still have our

destination j, and rather than just specifying a number of vertices that you

visit on this path, we specify exactly which vertices get visited.

So that's going to be a subset S, a subset of 1 through n.

And, of course, it had better contain the initial vertex 1 and the final vertex j.

And so, for a given choice of j and a given choice of the set capital S,

we define L sub S, j as the length of a shortest path that

starts at vertex 1, ends at vertex j, and visits precisely the vertices in capital

S, exactly once each. Now, I can understand if your eyes widen

a little bit when you see this definition.

After all, there's an awful lot of choices for this set capital S, an

exponential number. So you'd be right to wonder whether this

approach could possibly give any savings whatsoever

over naive brute-force search. One thing worth pointing out, and in some

sense the source of the computational savings, is that in a given sub problem

while it's true we're keeping track of which vertices are visited en route from

1 to j, we're not keeping track of the order in which those vertices get

visited. And indeed, if we had one sub-problem for every order in which you

might visit vertices from a given source to a given destination, then we'd be

stuck with a factorial number of sub-problems.

As it is, since we forget the order and we only focus on subsets, that gets us

down into the roughly 2^n range, and that's ultimately

where the savings over brute-force search come from.
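To see the scale of that difference, here is a quick illustrative Python comparison; the numbers, not the algorithm, are the point (n = 10 is chosen arbitrarily):

```python
from math import factorial

n = 10
# One sub-problem per (subset, endpoint) pair. Subsets of an n-element
# vertex set number only 2^n, whereas orderings number n!, so forgetting
# the order of the intermediate vertices is where the savings come from.
num_subsets = 2 ** n          # 1,024 for n = 10
num_orderings = factorial(n)  # 3,628,800 for n = 10
print(num_subsets, num_orderings)
```

Even at n = 10 the gap is already three orders of magnitude, and it widens rapidly as n grows.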

So as usual once we've gotten the right set of sub-problems the rest of the

dynamic programming recipe pretty much writes itself.

But to fill in the details, let me just tell you exactly what the optimal

substructure lemma is, what the corresponding recurrence is, and the final

pseudocode that gives us our better-than-brute-force-search

algorithm for the traveling salesman problem.

So, for the optimal substructure lemma, we just focus on one particular sub-problem.

So that corresponds to a choice of the destination j

and a choice of the set S, which specifies 1, the starting point,

j, the ending point, plus the rest of the intermediate

vertices that we've got to visit along the way, exactly once each.

So now we proceed exactly as we did before with Bellman-Ford, namely by focusing

on the last hop of this path P. So if the last hop is, say, from vertex k

to vertex j, then we can look at the previous part of the path, call it P

prime. What can we say about it? Well, clearly it begins at the vertex 1,

and clearly it ends at the vertex k.

Because the path P visited every vertex in S exactly once,

the path P prime visits every vertex in the set S - {j} exactly once.

And, of course, the optimal substructure lemma says

not only is P prime a path

from 1 to k, visiting exactly the vertices of S - {j}, once each,

it is the shortest such path. The proof is an utterly standard cut-and-paste

argument, like we've been doing for all our other optimal substructure

lemmas. I'll leave it as an exercise for you to

fill in the details. And, as usual, the optimal substructure

lemma naturally induces a recurrence. The recurrence just says: let's look at

all of the candidates for the optimal solution identified by the lemma, and

brute-force search among those candidates. That is, for a given sub-problem indexed

by a destination j and a set of vertices s, what the recurrence does is it takes

the best-case scenario, that is, the best choice of the second-to-last vertex k.

Of course, k should lie somewhere in the set S, also it should not be the vertex

j. And for a given choice of k, we know the

corresponding candidate solution has to be a shortest path from 1 to k,

visiting exactly the vertices of S - {j}, combined with the cost of the

corresponding final hop from k to j. And effectively, we're using the same

measure of sub-problem size that we were in the Bellman-Ford solution, just the

number of allowable intermediate vertices.

Of course, here, the sub-problem also specifies exactly which those vertices

are, but as far as a measure of size, we just use the cardinality of that set.

As always, the important property of the size measure is that to solve a

sub-problem of a given size, you only need to know the answers of sub-problems

with strictly smaller size, and staring at the recurrence, you see that that is

indeed the case here. You only care about sub-problems that

visit one fewer vertex because, in particular, j is excluded.
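As a concrete, illustrative rendering of that recurrence, here is a small Python sketch. The table L, keyed by (frozenset, endpoint) pairs, and the cost dict c are hypothetical names of my choosing, not anything fixed by the lecture:

```python
import math

def solve_subproblem(L, c, S, j):
    # Recurrence: L[S, j] = min over k in S, k != j,
    # of L[S - {j}, k] + c[k][j].
    # Missing (infeasible) table entries default to +infinity.
    return min(
        L.get((S - {j}, k), math.inf) + c[k][j]
        for k in S if k != j
    )

# Tiny example: with only the base case L[{1}, 1] = 0 filled in,
# the sub-problem (S = {1, 2}, j = 2) is just the edge cost from 1 to 2.
L = {(frozenset({1}), 1): 0.0}
c = {1: {2: 3.0}, 2: {1: 3.0}}
print(solve_subproblem(L, c, frozenset({1, 2}), 2))  # 3.0
```

Note that the candidate sub-problems looked up here all have one fewer vertex than S, which is exactly the "strictly smaller size" property discussed above.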

6:14

So what about the base cases, where the destination j is 1,

the same as the starting vertex? Now, if S happens to contain just the vertex

1, then the empty path goes from 1 to 1 with no intermediate stops; that has

length 0. Otherwise, if S contains extra vertices,

there's no way to go from 1 back to 1 with intermediate stops

while visiting the vertex 1 only once. So we define the other sub-problems with

j = 1 to have the value plus infinity. Now, as usual, we solve the sub-problems

systematically using the recurrence. We want to make sure that smaller

sub-problems are solved before the bigger ones, so in the outer for loop we iterate over

the measure of sub-problem size, namely the cardinality of the set S.

Amongst sub-problems of the same size, we don't care what order we solve them,

so we just iterate through the sets S of size m in arbitrary order.
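One illustrative way to realize that ordering in Python (my sketch, not the lecture's pseudocode) is to generate the subsets containing vertex 1 grouped by cardinality:

```python
from itertools import combinations

def subsets_by_size(n):
    # Yield the subsets S of {1, ..., n} that contain vertex 1,
    # in nondecreasing order of cardinality |S|.
    for m in range(1, n + 1):                       # sub-problem size m
        for rest in combinations(range(2, n + 1), m - 1):
            yield frozenset({1}) | frozenset(rest)

sizes = [len(S) for S in subsets_by_size(3)]
print(sizes)  # [1, 2, 2, 3]
```

Any enumeration with this nondecreasing-size property works, since each sub-problem only consults answers for strictly smaller sets.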

So these outer two for loops accomplish the following: they iterate through the

relevant choices of S, that is, sets S that have cardinality at

least 2 and contain the vertex 1. So that's the first index of our 2D

array. What about the second index j? So this is

where the shortest path in a given sub-problem is supposed to stop. Well, sub-

problems only make sense, are only defined, if that final vertex j is one of

the vertices of S that you are supposed to visit. So when we get to a given choice of

capital S, at that point we are going to iterate through all the relevant choices

of j. And if you recall our argument in the base case, and the fact that this set S

has size at least 2, there is no point in trying j = 1.

So now we have a sub-problem: we have our choice of S, we have our

choice of j. We just compute the recurrence for this

particular sub-problem. So what is that? Well, we take a best-case

choice of the penultimate vertex. So that's some vertex that we've got to

visit, a vertex k in capital S,

other than the one where we stopped, so it can't be equal to j.

And then, given a choice of k, what is the

corresponding value of that candidate path from 1 to j? Well, it's whatever the shortest path length is

from 1 to k, which of course visits everything in

S except for j, plus whatever the cost of the final hop

is from k to j. So the recurrence just figures out the optimal choice of

k by brute-force search, and then, by our optimal substructure lemma, we know that's

the correct answer to this sub-problem. Now, in almost all of our dynamic

programming algorithms, after we solved for the sub problems, all we did was

return the value of the biggest one. Here we actually have to do a tiny bit of

extra work. It is not the case that the solution we

care about, the minimum-cost traveling salesman tour,

is literally one of the sub-problems. However, it is true that we can easily

compute the minimum cost of a traveling salesman tour given solutions to all of

the sub problems. So what is the biggest sub problem? Well

there's actually n of them. Remember our measure of sub problem size

is the cardinality of the set S, the number of vertices that you've got to

visit in that sub-problem. So in the biggest sub-problems, S is equal

to everything, you have to compute a shortest path that visits each of the N

vertices exactly once. So what are the semantics of one of those sub-problems?

For a given choice of the second index j, there's n different choices for

that final vertex j. That sub-problem is responsible for

computing the length of the shortest path.

It starts at the vertex 1, it concludes at the vertex j, and it visits every

single vertex exactly once. Now, how is this different from a traveling

salesman tour? Well, all that's missing is that final hop, the hop from j to 1.

So what this means is that, after we've solved all of the subproblems in our

triple for loop, we can complete the algorithm with one

final brute-force search. This time, the brute-force search

is over the choice of j, the last vertex that you visit before you

return home to vertex number 1. You can skip the choice j = 1;

that one doesn't make sense. But for the other n-1 possibilities for

that final vertex j, you take the previously computed value of the shortest

path that starts at 1, goes to j, and visits every vertex

exactly once; you tack on to that the

cost of the final hop, going home to 1 from vertex j, and of those n-1 different

solutions, you return the best of them. That is, in fact, the cost of the

minimum-cost traveling salesman tour. The correctness argument is the same as

it always is. The correctness of the optimal

substructure lemma implies the correctness of the recurrence and then

the correctness of the dynamic programming algorithm follows from the

correctness of the recurrence plus induction.

The running time is equally straightforward.

Because in the sub-problems we're keeping track of a lot of information,

specifically the identities of all the intermediate vertices,

there are a lot of sub-problems. So in particular, there's roughly 2^n

choices for the set capital S, there's roughly n choices, for the final

destination j. So combining those, we get n*2^n sub

problems. How much work do you do per sub-problem?

Well, you have to execute this brute-force search over the choices of the

vertex k, the second-to-last vertex on the path. k

can be almost anything in the set capital S, and capital S could have linear size.

So you're going to do O(n) work per sub-problem.

Thus, the overall amount of work, as promised, is O(n^2 * 2^n).
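To make that bound concrete, here is a quick, purely illustrative count (n = 20 is an arbitrary choice of mine):

```python
n = 20
# Roughly n * 2^n table entries: n choices of the endpoint j
# times 2^n choices of the subset S.
subproblems = n * 2 ** n
# O(n) candidate second-to-last vertices k per entry,
# giving O(n^2 * 2^n) total work.
work = subproblems * n
print(subproblems, work)  # 20971520 419430400
```

So even for n = 20, the work is in the hundreds of millions of basic operations, feasible but clearly not scalable to large n.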

This is still exponential time; there are still severe limits on how big a value of

n you're going to be able to execute this algorithm for.

That said, the problem is NP-complete, and you have to respect its computational

intractability. And again, at least this shows there are

often interesting algorithmic opportunities to beat brute-force search,

even for NP-complete problems like the traveling salesman problem.
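Putting the whole recipe together, here is one possible end-to-end sketch in Python (my rendering, not the lecture's pseudocode): it is 0-indexed, with vertex 0 playing the role of the lecture's vertex 1, and subsets are encoded as bitmasks.

```python
from itertools import combinations
import math

def held_karp(c):
    # c: n x n matrix of edge costs; returns the cost of a minimum tour
    # that starts and ends at vertex 0 (the lecture's vertex 1).
    n = len(c)
    # L[(mask, j)]: shortest-path length from 0 to j visiting exactly
    # the vertices whose bits are set in mask.
    L = {(1, 0): 0.0}  # base case: S = {0}, the empty path
    for m in range(2, n + 1):                     # sub-problem size |S|
        for rest in combinations(range(1, n), m - 1):
            mask = 1 | sum(1 << v for v in rest)  # S always contains 0
            for j in rest:                        # destination j, j != 0
                prev = mask ^ (1 << j)            # the set S - {j}
                # Brute-force search over the second-to-last vertex k;
                # infeasible sub-problems default to +infinity.
                L[(mask, j)] = min(
                    L.get((prev, k), math.inf) + c[k][j]
                    for k in range(n)
                    if prev >> k & 1 and k != j
                )
    full = (1 << n) - 1
    # Final brute-force search over the last hop j -> 0.
    return min(L[(full, j)] + c[j][0] for j in range(1, n))

c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
print(held_karp(c))  # 21 on this small asymmetric instance
```

The triple loop touches each of the roughly n * 2^n table entries once with O(n) work apiece, matching the O(n^2 * 2^n) bound above, and the dictionary holds on the order of 2^n entries, so memory, not just time, limits how large an n this can handle.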