
After helping Tang Lao Jun and Ne Zha defeat their enemy,

Zhuge Liang was transported through the time and space tunnel to the Tang Dynasty.

He had to lend a hand to Tang Seng and Wu Kong on their journey to the west.

At the time, they were passing a flaming mountain where fires were constantly burning.

Tang Seng could not stand the extreme heat,

so Zhuge Liang suggested that Wu Kong head to Cuiyun Mountain to borrow

a palm-leaf fan from Princess Iron Fan to blow out the fire.

Wu Kong and Zhuge Liang flew to Cuiyun Mountain on a somersault cloud,

and had to find the Princess's cave,

the lowermost spot in the area.

They found the surrounding landscape had been frozen by the princess's magical power.

And this affected the usefulness of their cloud.

They were forced to walk on the ground and explore the landscape instead.

Wu Kong wanted to quickly find

the fastest path towards the cave in order to save his master,

Tang Seng, from the intolerable heat.

Zhuge Liang summoned the dragon of dimensions for assistance from the professors.

So Wu Kong is taking his master to the west,

but he's come across this flaming mountain with

huge amounts of flame pouring out of it and it's too hot to pass.

So Zhuge Liang suggested he talk to Princess Iron Fan.

So Princess Iron Fan has a huge fan,

which Wu Kong will be able to use to fan

the flames back so he can safely pass with his master.

Now, the princess lives in a cave in the bottom of a valley.

And she doesn't like visitors,

and so she has magic protection around her valley.

So once you enter the valley, you're disoriented.

Now, Wu Kong is a very powerful magical being himself.

But even he is disoriented inside this valley,

and can really only move in these cardinal directions when he's there.

So, here's our problem.

Wu Kong's lost all his special powers on entering the valley,

and he wants to find the Princess' cave.

He can only walk in an east-west direction or a north-south direction.

And there's a function that defines the landscape,

the height of the landscape throughout the valley.

And we know the Princess' cave is at the trough,

the lowest part of this landscape,

and that's the entrance from the underground to her actual cave.

So Wu Kong needs to find the Princess.

So, let's step out of our problem for a moment to

talk about some things we've seen in this course.

So up to now in this course,

we've been talking about complete search methods.

So, complete search methods look for all possible solutions.

And we've seen how they typically search by having

a partial solution and extending that partial solution.

So, some variables are fixed, and then they

pick another variable and fix it, trying eventually to reach a solution.

And they're very strong in the sense that they can prove

optimality or unsatisfiability of a problem because they can basically,

say, look at all possible solutions,

and find there are none, and it's unsatisfiable.

If we look at all of them and find there's none

better than this one, we know it's optimal.

And that's really what we've been concentrating on in these courses,

because many of the problems we're looking at,

these constrained optimization problems,

complete search is the best technology.

But there is a completely different way of tackling discrete optimization problems,

and these are incomplete search methods.

So, incomplete search methods look at only some possible solutions,

and typically, what they do is move from one possible solution to another.

So we're jumping around looking at

possible solutions where all the variables take a fixed value.

They can never really prove

optimality or unsatisfiability except in some very special cases.

But they can be much faster, and in fact,

they can handle larger problems where complete search methods fall down completely.

And so, incomplete search is

a very important class of approaches for handling discrete optimization problems.

All right, so, our first example of using local search is a pure optimization problem.

So we really just have this objective here,

which is the altitude of parts of the valley,

and we've got some variables which take values between minus four and four,

and we're just trying to find the minimum possible value of this altitude.

If we do that, we've found the entrance to the Princess's cave.

So this is a very good case for local search, by the way.

Okay, so we need some local search basics, some basic ideas.

So a state is a value for all the decision variables.

So X equals minus four,

Y equals minus three, that's an example of a state.

And then, another important concept for local search is a move.

So a move is the change of some of the decision variables' values.

So, we might have that X is minus four now,

we might move to X is minus three,

where everything else stays the same.

So in this case, Y stays at minus three.

And then, in any state,

we have a neighborhood, which is the set of moves that we could consider at that state,

and which basically tells us what

other neighboring states we could move to from this particular state.
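These two ideas can be sketched in a couple of lines of code. This is just an illustration, not the course's actual representation: it assumes a state is simply a dictionary mapping each decision variable to its value.

```python
# A state gives a value to every decision variable,
# e.g. X = -4, Y = -3.
state = {"x": -4, "y": -3}

# A move changes some of the decision variables' values.
# Here X moves from -4 to -3 while Y stays at -3.
def move(state, var, new_val):
    new_state = dict(state)  # copy, so the old state is left unchanged
    new_state[var] = new_val
    return new_state

print(move(state, "x", -3))  # {'x': -3, 'y': -3}
```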

So once we have that, we can basically build

a local search minimization algorithm that looks like this.

So we've got the function that we're trying to minimize, and a domain,

D, which is basically telling us what are

our decision variables and what possible values can they take.

And then, we're going to pick some initial value to start off with, all right?

And there's some science that goes into that.

But often, we just pick a random initial value,

we just pick a variable,

we pick a random value that it could take,

and then we're going to repeat this loop.

This is the local search loop.

And we're going to keep going around this loop until we reach some stopping condition,

which might be that we've found the best solution we can,

or more likely, it's because we've run out of resources.

We're going to give up and stop at this particular point. So what do we do?

We look at the current state we're in now and we work out the neighborhood,

so all of the possible places we can move to.

And then we move to the state by picking one of our neighbors to move to.

So we move to a new state, E. And then,

we look at the function that we're trying to evaluate here

and see if we've reduced that value by moving to E. If that's the case,

then we're going to actually make the move, right?

We set our current state

to be E. Otherwise,

we're just going to look for another neighbor

and see if we can find a better way of moving.

So, if this condition doesn't hold,

we're just going to stay in the same place,

we're going to choose another neighbor and try to find a better way of going.

And we'll keep going around this in this loop until we've run out of resources,

basically, and then we'll return the solution,

the best solution we found so far.
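The loop just described can be sketched as follows. This is a hedged illustration rather than the course's actual code, and the altitude function `f` is a made-up convex stand-in, since the lecture's real valley function isn't given.

```python
import random

def local_search_min(f, domains, max_iters=10_000, seed=0):
    """Minimize f by random-neighbor local search: start from a random
    state and only make a move when it strictly reduces the objective."""
    rng = random.Random(seed)
    # Pick a random initial value for every decision variable.
    state = {v: rng.choice(list(dom)) for v, dom in domains.items()}
    for _ in range(max_iters):  # stopping condition: a resource limit
        # Neighborhood: change one variable to some other value.
        var = rng.choice(list(domains))
        neighbor = dict(state)
        neighbor[var] = rng.choice(list(domains[var]))
        if f(neighbor) < f(state):  # only move if it's strictly better
            state = neighbor
    return state  # best solution found so far, not necessarily optimal

# Hypothetical altitude function with its trough (value 0) at (0, 0).
f = lambda s: s["x"] ** 2 + s["y"] ** 2
grid = {"x": range(-4, 5), "y": range(-4, 5)}
best = local_search_min(f, grid)
```

Note that the objective can only ever decrease across the run, so whatever state the loop ends in is the best solution encountered.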

So, let's look at Wu Kong's search.

So he's trying to minimize this altitude

and he's searching in this grid from minus four to four for both variables,

and let's say his initial state is this position here,

minus four, four, which is there on the graph.

So we can evaluate the objective here,

it's 36, so not very small.

All right, so he can move either in a horizontal or a vertical direction.

So these are all the possible moves he can make.

And we're going to pick a random one.

Let's say, we pick this one here,

and we can see the value is less,

so we're going to actually make the move.

All right, once we're here, these are all the possible moves we can make.

We'll pick a random one, let's say, this one,

and we can see it's higher, so we're not going to make the move.

So let's pick another random one, let's say,

that one, and here,

we can say we're way lower,

so we're going to make that move to there.

Now again, now we're in this position.

These are the possible moves we make.

We pick a random neighbor, let's say this one, it's equal.

So we're not going to make a move.

Then we look at another neighbor, let's say this one,

so now that's smaller, so we will make that move.

Now, these are the possible neighbors we could look at.

Let's pick this one that's smaller, we'll make that move.

And now, let's say we've run out of resources.

So, this is the best solution we found.

Then that's what we're going to return,

because we know that we're not guaranteed to actually find an optimum before we stop.

So, we've given up.

The Monkey King stops at this point.

Of course, it's no good for the Monkey King who really does need to

find the entry to the Princess' cave.

So, we didn't find the optimal solution.

There is actually a better one.

In fact, where the value of this thing is minus six.

So, many times, we only want to find a good solution.

That wasn't good enough for Wu Kong in this case.

But in many optimization problems,

we're not really looking for the optimal.

We're just looking for a good solution.

In which case, basically,

we can spend as much resources as we want to,

and then give up and hope that we've got a good enough solution by using those resources.

So we can do a different kind of local search.

So here, we'll do a steepest descent kind of local search.

So, before we were just taking a random neighbor

and evaluating them and seeing whether we should move here.

So, steepest descent local search just says, "Well,

we're going to look at all the possible neighbors,

and we're going to pick the best of those in terms of objective value."

So we're basically taking the quickest route

to a good solution. So we have this tradeoff.

We evaluate all the neighbors,

so we have to do a lot more work because we have to

look at all the neighbors instead of just one,

but we can improve the objective faster because we're

basically looking at the best possible move we can make in every position.
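As a sketch, here is the same idea with steepest descent choosing the neighbor, again using a made-up convex altitude function since the lecture's isn't given.

```python
def steepest_descent(f, domains, start):
    """Steepest-descent local search: evaluate every neighbor and move
    to the best one; stop as soon as no neighbor improves on f."""
    state = dict(start)
    while True:
        # Neighborhood: change any one variable to any other value.
        neighbors = [
            {**state, var: val}
            for var, dom in domains.items()
            for val in dom
            if val != state[var]
        ]
        best = min(neighbors, key=f)  # best move by objective value
        if f(best) >= f(state):  # local minimum (or plateau): stop
            return state
        state = best

# Hypothetical altitude function with its trough (value 0) at (0, 0).
f = lambda s: s["x"] ** 2 + s["y"] ** 2
grid = {"x": range(-4, 5), "y": range(-4, 5)}
print(steepest_descent(f, grid, {"x": -4, "y": 4}))
```

Unlike the random-neighbor version, this variant has a natural stopping condition: it halts when no neighbor is strictly better, which on this convex landscape is the global minimum.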

So, if we try, Wu Kong,

using this kind of search instead, what happens?

So we're at the first stage here.

We look at all the neighbors here,

evaluate them all, and we find, in fact,

that this one is equally good as the other move here,

so we could make either of these moves,

so we pick one of those.

Let's pick the different one this time,

so Wu Kong moves to here.

Now, we can evaluate all our neighbors.

We can see that the best one is one of these two here.

We can move down to this one, let's say.

Now, we evaluate all of our neighbors.

We find that the best one is over here, minus five.

And here, we evaluate all our neighbors,

we move to this minus six.

Now, if we evaluate all of our neighbors,

in fact, none of them is better.

And we found an optimal solution.

So we have found the entrance to the Princess' cave.

All right, so you can see this is much better.

It seems it got us to the actual optimal solution.

Is it always guaranteed to do that?

Certainly not, okay?

Only for certain classes of problems,

basically convex problems, are we able to guarantee getting to the global optimal solution.

And it also depends on the definition of neighborhood.

We can actually be trapped.

So remember, we only look at the states that are in our neighborhood,

and we can actually have problems where we just can't get

out of certain regions of the state space.

All right, so let's look at that.

So the tradeoff with our steepest descent was we had to

evaluate lots and lots of potential moves.

So, what we typically do in steepest descent

is use a more restricted neighbourhood,

because we're going to have to evaluate all of them.

So let's look at doing the same search now with this restricted steepest descent where I

just consider my neighbors to be the states that are next to me, all right?

So the Monkey King, as usual, starts here,

he evaluates his two neighbors that he only has at this point,

and picks the better one to go to, which is that one.

Again, he evaluates all his neighbors,

picks a better one to go to.

So you can see what we evaluate.

Here, we're only evaluating four neighbors,

so it's much less costly than the other example, where

we had to evaluate almost 16 neighbors every time.

And if we keep following this neighborhood search, because this is a convex problem,

in fact, we're going to get down and find the same optimal solution as before.

All right, but here,

notice we didn't evaluate so many neighbors as we

did in the first version of this search.

All right, so, the effectiveness of local search

can critically depend on this definition of neighborhood.

So we had two different neighborhoods we've looked at,

we could change a variable's value arbitrarily, but then we have to evaluate

the possible effect of those moves,

all the possible moves of changing a variable's value arbitrarily,

and there are very, very many of them.

Particularly, if we have variables with very large domains like thousands.

Or we could just change a variable's value by one, and then,

of course, we've got a much smaller neighborhood.

All right, so a more restricted

neighborhood is cheaper to compute,

but it also gives you longer paths in the exploration.

And which is going to be better is going to depend on the search you end up doing.
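This tradeoff can be sketched by counting neighbors under the two definitions. The grid and variable names are illustrative: on a 9-by-9 grid, arbitrary single-variable changes give 16 neighbors to evaluate per step, while changing one variable by one gives at most 4.

```python
def arbitrary_moves(state, domains):
    """Change any one variable to any other value in its domain."""
    return [
        {**state, var: val}
        for var, dom in domains.items()
        for val in dom
        if val != state[var]
    ]

def unit_moves(state, domains):
    """Change any one variable by exactly +/- 1 (staying in its domain)."""
    return [
        {**state, var: state[var] + d}
        for var in domains
        for d in (-1, 1)
        if state[var] + d in domains[var]
    ]

grid = {"x": range(-4, 5), "y": range(-4, 5)}
centre = {"x": 0, "y": 0}
print(len(arbitrary_moves(centre, grid)))  # 16 neighbors per step
print(len(unit_moves(centre, grid)))       # only 4
```

With domains of thousands of values, the gap between the two neighborhood sizes grows accordingly, which is why the restricted version is so much cheaper per step.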

Now, the nice thing is that the steepest descent local search

will reach the optimal solution in a convex landscape,

so it's guaranteed to do that.

The problem is that almost none of the problems

we've looked at in this course have a convex landscape because,

of course, here, we're still talking about

pure optimization problems with no constraints.

Most of the concern of this course about discrete optimization is handling constraints.

And once we had constraints,

we were almost certainly going to get non-convex landscapes.

So, the problem with a non-convex landscape

is that if we follow this approach of taking the steepest descent,

then that's all very well, but we end up finding a local minimum.

So we end up here and then it's clear that any local move we can do will not escape,

and so, we can do no better.

And you can see, in this landscape here,

this very simple, two-dimensional landscape,

there are lots and lots of local minima.

How do we escape these things?

Because our steepest descent,

once we get here, gives us no possible moves.

How do we make sure we get to the global minimum, which is this one?

And actually, this picture here is really

simplifying things, because we also have these plateaus, where maybe

all or many of our neighbours have

the same value of the objective function that we're trying to optimize,

and so, we can be stuck in this plateau and every move sort of looks equally good.

So at least, in our algorithm,

we said, "Well, don't move unless it's better."

So in a plateau, we're also stuck. We can't move.

And so, plateaus can also make the search harder, right?

So, how do we eventually find this blue point here,

when we can be stuck in plateaus or local minima?

So, local search methods are very important for discrete optimization,

and there's basically a number of parameters that define that local search method.

So there's this initial valuation,

so where do I start?

Then there's the neighborhood,

so what are the possible states that I can move to from this state?

Then I have to choose a neighbor,

so I could just randomly choose a neighbor, or I can evaluate all the neighbours and

then pick the one which is best under some evaluation criterion.

And then, there's the stopping condition,

so how many resources am I going to put into this total search?

Because in the end,

a local search method is almost never

going to be able to prove optimality.

So basically, we want to say,

how many resources are we going to be willing to spend before we say,

"Okay the best solution we've found so far, we'll stick with that."

So, two big challenges for local search are,

how do we handle constraints effectively?

Because we haven't seen that at all yet so far,

and that'll make a big difference to what's going on.

And how do we escape these local minima or plateaus?

And we're going to examine these problems in the next sections.