0:26

heuristic of taking a random permutation gives a bad result.

We've seen a very reasonable approach that produces an optimal solution and

does this very quickly.

Namely, we started some node and at every iteration, we tried to visit

the node which is this close, which has not been yet visited so far.

And it is the closest such node, right?

So first of all, the good news about this heuristic is that it can be

implemented very efficiently, right?

We do not need to explore the whole search space,

which consists of roughly a minus 1 factorial at permutations.

But just at every iteration we need to find the closest node,

which kind of with every iteration we need to span through all possible neighbors.

So we definitely can implement everything

just roughly n squared steps, right?

So it is efficient and in many cases, particularly for

Euclidean TSP, in practice it produces quite a reasonable result.

So we are not guaranteed to find an optimal solution using

this heuristic of cost.

So let me remind you, we do not have any polynomial-time algorithms for

the traveling salesman problem.

And this algorithm is definitely polynomial, so

it works in n squared, so in polynomial time.

At the same time, it produces solutions that are in practice.

Usually are not much worse than an optimal solution, okay?

So for general graphs, not for Euclidean instances, but for general graphs,

it unfortunately may produce a solution which is much worse than an optimal one.

And we will see such an example in a minute.

For Euclidean TSP, it is it usually works better but

still it can produce a cycle which is roughly

a logarithm of n times worse than the optimal one.

So for example when n is equal to 1,000, it will produce

a solution which is 10 times longer than an optimal one, okay?

So first of all let's discuss how would you fool this heuristic or

this approach for the case of general graphs, right?

So to produce such an example, I assume that we're given

a graph where almost every edge have weight 2.

At the same time there are some edges that have weight 1.

So assume that there are just 5 edges.

And again, almost all of the edges have weight 2.

At the same time, there is an edge of weight 1..

So we start from some node and

we start constructing we start constructing our cycle.

So we go to that vertex and it turns out that from this vertex, or from this node,

there is another edge whose weight is also 1, which is good for us, right?

Because we know that almost all of the edges have weight 2.

Then we find another edge but then this point, we already have no choice, right?

We need to go to the only unvisited node.

And if we're lucky enough, this edge still have weight 1, but for

the last edge, we have absolutely no choice.

As for the previous one, actually, and it might turn out that its edge is just so

huge that it just doesn't pay out to use all these one's, right?

So in this case, and this is one of the situations where

the solution returned by this approach might be poor.

So recall so

that it might as well be the case that the weight of this edge is also huge.

It is something like, I don't know, 10 to the 6.

So we made some greedy choices at the beginning, but

it gives us no choice in the end.

And we need to take this last very heavy edge, okay?

But of course this cannot happen for the case of Euclidean TSP,

I mean exactly this picture.

Because if we assume that almost all the edges have weight 2,

then for example we should have an edge of weight 2 here.

An edge of weight 2 here, an edge of weight 2 here, an edge of weight 2 here,

and an edge of weight 2 here.

But this would mean that an edge from this node to this

node should have weight at most 8, right?

Just because it is a Euclidean graph, and in this case just by triangle inequality.

We would know that if we go like this, this, this, and

this, it should not be shorter than just going

directly from this node to this node, okay?

But as we have mentioned already, even for Euclidean instances or

for the Euclidean TSP, this heuristic might produce us a suboptimal result.

And let me just show you some small examples.

So if we have this four points on the plane,

then the optimal way of visiting all of them is shown here on the slide.

And its length is roughly 26.42.

Okay, so we assume that the lengths of these in this grid is equal to 1.

So for example, the length of this edge is something

like 2 squared + 3 squared and square root of this.

Okay, so if you compute carefully that the sum of all four

edges shown here on the slide, you will get something like 26.

At the same time, if we start to construct a solution greedily using

our nearest neighbor heuristic, it will give us the following cycle.

So I assume that we started with this node,

then the closest one is that one, okay?

So we go to this node, then from this node, we go to that node.

So it happens that this one is closer than the only remaining one.

Then we go here, and we go here, and

it gives a solution whose total length is 28.

The difference is small, but still.

So first of all,

this is an example where this heuristic produces a suboptimal result.

There are other examples where this heuristic, even for Euclidean TSP,

produces a much worse result than an optimal one.

And still let me remind you that in practice this heuristic works quite well.

So if you're okay that your fast algorithm is going to produce you a result.

Which is probably not optimal, but at the same time not much,

much worse than an optimal one.

Then this could be a reasonable choice in practice, okay?

In the next videos, we are going to consider algorithms that are guaranteed

to compute optimal result but at a cost, of course, of a longer running time.