0:01

Good morning. Okay, back to discrete optimization and local search. So far, one of the things we have done is look at all these neighborhoods, and we saw a variety of them, from very simple to very complicated ones. What we want to do today is step back a little and actually look at how we search these neighborhoods. Okay? So we have talked about the neighborhoods, but not really how we use them for searching; it was very implicit. So what we want to do in the next three lectures is take a systematic look at how we can exploit these neighborhoods and do local search on top of that. Okay? So, a more systematic presentation, and then we're going to study heuristics and metaheuristics, okay? That's going to be the focus of the lecture today and the next two lectures. So today is mostly heuristics; the next two lectures, metaheuristics.

Okay? So remember, local search works with states. At any point, we are in a state, which is either a solution, if we only explore feasible configurations, or a configuration where there may be violations, where some of the constraints are violated. Okay?

Now, essentially, the local search is going to move from state to state. And from one state, you can only move to a subset of states, which is called the neighborhood of that particular state. So we're going to use the function N(s) to denote the neighborhood of state s: all the other states that you can reach when you are at s, okay? Then we're going to introduce this legal function, which basically tells you, of all these neighbors that you can go to, which ones are legal and which ones are not. Okay? That will be the subset of the states that you can actually move to; otherwise, you would basically be preventing the search from going there, okay?

So this has been kind of implicit in what we have done before, okay? Always moving to something which is a better place. Okay, but we'll see many other ways to do that in these next three lectures. And then, once we know which neighbors are legal, what we are going to do is select one of them. And once again, there will be many, many ways of actually selecting the neighbors. Okay, so what you get is this crazy function, S of L of N of s, with the state s also appearing in the other functions. But essentially, what you do is select a legal neighbor, okay, from the state you are in. Okay?

So, the algorithm. Of course, what we want to do in most cases is minimize an objective function: almost always, it's either minimizing violations, or actually optimizing an objective function from the problem specification. Okay? So, this is the general pattern that we're going to use in the next couple of lectures. Okay?

So, you have the local search. You have the objective function f. You have the neighborhood N, and we'll give you a definition of that; we'll pass it to the local search. You have the function L and the function S: which neighbors are legal, and what the selection function is. Okay.

Most local searches, we'll adapt this a little bit from time to time, but most of them are going to start from a configuration that you generate, and you can have different ways of doing that. And of course, that's the best solution you have at that particular point; so, s* is keeping the best solution at this point of the search. Then you're going to do a number of iterations, every one of which is going to move to a neighbor, and so on, okay? And obviously, if the neighbor is feasible and it improves the current best solution, we have found a better solution and we store it. Otherwise, what we do is move to the next neighbor. And how do we do that? Well, basically, we select a legal neighbor in the neighborhood of s, okay. And at the end of the search, we return the best solution we have found, okay. So this is the schema that we're going to use.
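As a concrete sketch, the schema just described might look like this in Python. The names f, N, legal and select mirror the lecture's functions; the fixed iteration budget and the tiny example objective are my own assumptions for illustration:

```python
def local_search(f, N, legal, select, s0, max_iters=1000):
    """Generic local search: f is the objective, N(s) the neighborhood,
    legal filters the neighbors, select picks one legal neighbor."""
    s = s0          # current state
    s_star = s0     # best solution found so far
    for _ in range(max_iters):
        if f(s) < f(s_star):
            s_star = s              # found a better solution, store it
        candidates = legal(N(s), s)
        if not candidates:
            break                   # no legal neighbor left: stop
        s = select(candidates, s)   # move to a legal neighbor
    return s_star

# Tiny example: minimize f(x) = x^2, neighbors are x-1 and x+1.
f = lambda x: x * x
N = lambda x: [x - 1, x + 1]
L = lambda ns, s: [n for n in ns if f(n) < f(s)]   # legal = improving moves
S = lambda ns, s: min(ns, key=f)                   # select = greedy choice
print(local_search(f, N, L, S, 7))                 # prints 0
```

Different heuristics and metaheuristics then amount to plugging in different L and S functions, which is exactly the theme of these lectures.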

And one of the goals today is to give you some examples of what L can be, or what S can be; we have mostly seen already what the neighborhood can be. Okay, so that's the goal: to look at these L and S functions and all the variety of things we could do there.

Okay. So this is an illustration, a graphical illustration of what we do. So we start at a particular state, over there. Okay. And then, essentially, what we're going to do is define the neighborhood of that state; that's the region that you see here. Those are all the neighbors; eventually we could move to any of them. But then we put on these restrictions, okay? So I only want to look at the neighbors which satisfy this legal condition, and now you see these three neighbors there in this particular case. And then we select one of them; this is, you know, the selected neighbor. Okay? And then we move.

Okay? And so now we have moved from the original node to the selected node. Okay? We restart the process. These are all the neighbors. Okay? Some of them are legal, some of them are not. Okay? And you select one of them, okay? So, you know, here, and then you move to that place, and you continue doing that, and you see the search moving all over the place until it reaches the final state.

Isn't that beautiful? We have to do it again; I mean, it's so beautiful, you can't imagine how much time we spent doing this, so I want to show it again. So we start from this, we select a neighbor among the legal ones, and we move, okay? We do the same thing: we select a neighbor among the legal ones, and we move. Okay? And then we keep moving, okay?

And so this is essentially what we're going to do today. Okay. Looking not at the neighborhood, because we have already seen many of those things, but at what this legal function is, and how we select a particular neighbor. Okay?

And there are many, many ways to do this; people are really creative in this field. Okay? So, for instance, one of the things we have seen so far, many, many times, is only looking at local improvements: when you are in a state, you only want to move to another state which is better than the current one. Okay, so you always want to improve. In a sense, you always want to go up, up, up, better and better. Actually, we are minimizing, so down, down, down. Okay, and the selection function, many times, what we have done is something which is greedy: you look at all the neighbors that are improving, and you take the best one, because somehow, you know, you believe that this is going to give you the best solution quickly, okay?

So these two things are things that we have seen, kind of implicitly, so far. Okay, now, in this field, in local search, we typically distinguish between heuristics and what are called metaheuristics. Okay, so one of the things in optimization that you have seen so far, right, is that we are always very good at choosing nice names, like heuristic and metaheuristic, and you'll see many very fancy names today, okay? We are very creative people, okay?

A heuristic is something that, essentially, tells you how to choose the next neighbor, okay? It's basically telling you: this is the heuristic, this is the way you want to choose the next neighbor to go to, okay? And typically, it uses only local information: the state s at which you are, and then the neighborhood, obviously. Okay, and the goal of this heuristic is essentially to drive you down to a local minimum. Okay? To a local minimum, a good one hopefully. Okay? Now, the metaheuristic is a little bit different. Okay?

So, the goal there is going to be escaping local minima, okay. So, if you only do local search, let's say using greedy improvement, you're going to get to a local minimum, and most of the time you have no guarantee on its quality. Okay? So, the metaheuristic is going to say: ooh, you know, I'm stuck here, but I want to try to get out of there and see if there are other local minima that are good. Okay? So, that's what you're going to do. Okay? Essentially, you use the metaheuristic, and we'll see a bunch of them, right? We will see how they can drive the search out of local minima and try to get closer to the global optimum by using kind of non-local information, okay? So typically these things have some kind of memory; there is some kind of learning component, okay?

So this is essentially the difference between these two things: one of them is going to guide you to a local minimum, and the other one is going to try to allow you to escape from the local minima and get to global optimality. Okay? And so, today, we're going to talk mostly about heuristics, and then we'll spend the next two lectures talking about metaheuristics, okay?

So, one of the ways that you define legal moves, okay, and that also drives the heuristic, is by looking at conditions on the objective function. There will be other things that we are going to look at, and you'll see some of them today and in the next lectures. But essentially, what I'm going to show you now are just conditions on the objective function, okay?

So, local improvement, we already mentioned this: essentially, you only move to states that improve the value of the objective function, okay? That's a strong condition; you always have to move to someplace which is better. Okay?

Now, sometimes there is no such neighbor, but there are neighbors that are different from where you are. Okay? Obviously, these neighbors may not be better than the current state, but they can be equally good. And so it may still be advantageous to move there, because maybe from there, you can move to a place which is better, okay? So that's called no degradation, essentially: what you do is select something which doesn't degrade the value of the objective function. Okay? Very useful, used in many conditions. Okay? And you can also use that to introduce randomization; we'll talk about that a little bit later.

Okay? And then you can be completely, you know, brutal, and say: ooh, I can accept all the neighbors. Okay? And we'll see some search heuristics here that would actually do that. Okay? So essentially, potentially everything is a legal neighbor. Of course, once you have something that considers everything, you will have to have a selection function that does the right thing. Okay? So it's going to be a combination of what is legal and how the selection function works. Okay, so these are the kinds of properties you can use, okay?

Now, one of the things that you will have to see is that, once you have the neighborhood and once you have defined what is legal, you will have to decide how to explore this neighborhood. Okay.

And sometimes the neighborhood is tiny, sometimes it's huge, so the way you explore this neighborhood is important. Okay. So we will see various kinds of things.

Okay. So one of them is best neighbor. You want to look at the neighborhood in its entirety, and you want to select the best neighbor. A lot of local searches do things like this. But of course, that means that you have to scan the entire neighborhood for selecting the best neighbor, okay? But this is one of the things that is very popular, okay?

The next thing which you can do is to say: ooh, I'm not interested in exploring the entire neighborhood, I want to stop as soon as I find something legal, okay? And sometimes we call that the first neighbor. Okay? So you want to find something which is legal, and then that's sufficient; you stop at that point. The hope is that you don't have to explore the entire neighborhood: you can very quickly find a neighbor that is legal, you move there, and then you stop. Of course, at some point you may have to explore the entire neighborhood just to find out that there is nothing legal. But in a sense, most of the time, you hope that you will only explore a subset of the neighborhood. So if the neighborhood is big, you can still use something like that; it would take more time if you tried to do the best neighbor.

And then we have this thing which is called multi-stage selection. The key idea there is that you want to preserve some of the greedy aspect, which is really important for you, but you don't want to explore the entire neighborhood because it's very big. So what you do is basically split the neighborhood exploration in two parts: the first one is going to select part of the neighborhood, so you won't explore the entire thing, you will only select a subpart of it, and the second part is going to explore the rest of the sub-neighborhood in its entirety. Okay.

And I'm going to give an example of this, because it's very interesting and very useful in practice. Okay, now, let's look a little bit at the best neighbor. In practice, you may have many different neighbors that have the best value, okay. And in those cases, randomization is very important. In many cases in local search, you don't want to be completely deterministic; you want to introduce this randomization component, and that allows you to explore the search space in a more interesting fashion. Okay?

So we'll talk about this much more at some point, but at this point here, considering the best neighbor, there is something very easy to do. Okay? You basically take all the best neighbors; these are really the legal moves that you want to take. And then, essentially, you want to select one of them, but you want to do that randomly, okay? And so what you do, if N* is the set of best neighbors, is select each one of them with a probability which is one over the cardinality of N*. In a sense, what you do here is compute all the best neighbors, and then you basically choose one of them in a uniform, random fashion, okay? And we often do things like this, because it means you are not always going to the same place; it gives you more variety.

Okay? So essentially, for best improvement, you can parameterize the local search that I gave you at the beginning by simply using improvement as the legal moves and then selecting the best one. Okay? If you select these things, you will have a local search where the legal moves are the ones that improve the value of the objective function, and you always select the one which has the best value, with the definition that I've given you here, which basically means you compute that set and then you choose one of them randomly.
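In code, that selection rule (choose uniformly among the set N* of best neighbors, each with probability one over the size of N*) could look like this sketch; the function name is mine:

```python
import random

def select_best_random(neighbors, f):
    """Pick uniformly at random among the neighbors with the best
    (smallest) objective value, i.e. with probability 1/|N*| each."""
    best_val = min(f(n) for n in neighbors)
    n_star = [n for n in neighbors if f(n) == best_val]  # the set N*
    return random.choice(n_star)                         # uniform over N*

# With f = abs, the best neighbors of [3, -2, 2, 5] are -2 and 2;
# each is returned about half the time.
choice = select_best_random([3, -2, 2, 5], abs)
```

Over many calls, each tied best neighbor is picked roughly equally often, which is exactly the variety the lecture is after.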

Okay, so that's one of the things. You could do similar things for the first neighbor. In that particular case, what you do is essentially take the first neighbor in the neighborhood, exploring it in some kind of lexicographic order, okay, and that's what you return, okay? And so, once again, the function can be very easy to compute: you say all the legal moves are the ones that improve the objective function at the current state, and then you select the first one which is legal, in a sense. Okay? So that's how you can parameterize this search, the generic local search that I gave you before, okay?
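A sketch of that first-improvement selection, scanning the neighborhood in its given (say, lexicographic) order; the names are mine:

```python
def select_first_improving(neighbors, s, f):
    """Return the first neighbor, in the given order, that improves on s.
    Returns None only after scanning the whole neighborhood in vain."""
    for n in neighbors:      # lexicographic (or any fixed) order
        if f(n) < f(s):
            return n         # stop as soon as a legal neighbor is found
    return None

# With f the identity and current state 4, the first improving neighbor
# of [5, 3, 9, 1] is 3 -- we never even look at 9 or 1.
first = select_first_improving([5, 3, 9, 1], 4, lambda x: x)
```

The payoff is exactly the one described above: on a good step you touch only a prefix of the neighborhood, and only a fruitless step forces a full scan.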

So let me talk a little bit about the multi-stage heuristics, because, as I said, some of them are really useful in practice. And the motivation, as I said, is that you still want to keep the greedy aspect, but you don't want to scan the entire neighborhood, because for some reason it's too large, and that's not a good compromise between the quality of the solution and the efficiency of the algorithm, okay? So we have seen one of those already, okay.

One of them was essentially the way we did the search for the queens problem. Remember, when we did the queens problem, every queen could be associated with a violation: the number of queens that are currently attacking it. And then we were basically looking at the queen that is the most aggressive, and we were moving it to a position where it was much less aggressive; okay, so basically the position which minimizes the number of violations, okay? So, what we did was two steps, right? We selected only one queen, and then we moved its value to its best position as far as decreasing the violations is concerned, okay? So, in a sense, this heuristic is sometimes called max/min-conflict. You select the variable which appears in the most violations; it's greedy. Okay, so a greedy choice of the variable, and then you set the value for that queen, so a row, okay, for the column of that queen, which results in the smallest number of violations. Okay? And that's also a second step which is greedy, but the first step has already selected one queen; so in the second step we only look at one queen, not all the queens, okay?

Now, a variation on this is the original min-conflict heuristic that was proposed by Steve Minton. What you do here is just select the first queen randomly: it has to have some violations, but you select it completely randomly, okay? And then you select the value for that queen which minimizes the set of violations. So that's typically called the min-conflict heuristic: the first step is randomized, the second step is greedy. So, two examples of multi-stage heuristics: you first select some part of the neighborhood, you know, a queen, and then in the second stage you look at all the possible values and select the one which is best for that particular selected queen.

Okay, now, what would be the alternative to that? The neighborhood that you could explore is all the queens and all the possible moves for the queens, so that would be a quadratic neighborhood. Okay.

And the next states, okay, are the set of states obtained by looking at the current state, taking any queen q and any value v, and looking at the state that results. Okay. And the key point is, when you do that, you may find moves that are actually better than what I've just shown you, right? It's not necessarily the queen which has the most violations that is going to decrease the violations the most; it may be a queen which has fewer violations, but where moving it reduces the number of violations tremendously, okay? So that's basically, you know, a quadratic neighborhood; it would give you the best local move. What we do is an approximation of this, right? By using essentially a first selection of the variable and then a selection of the value, instead of considering the entire thing globally. Okay?

So, the complexity here would be quadratic. Okay? All the pairs of columns and rows. Okay? And therefore, there is a price to pay here: it's essentially proportional to the number of pairs on the chessboard. Okay? When we look at the min-conflict or the max/min-conflict heuristic, this is a neighborhood which we explore in linear time, right? We basically select a queen, okay, which may take at most linear time (in min-conflict, it's constant time), and then you have to select the value, which means looking at all the values, right? All the values in the column, and that's linear time. So, you know, constant plus linear, or linear plus linear, is linear time, okay? So, what you see here is a trade-off, okay? We won't have the best neighbor, but we'll have kind of a greedy neighbor anyway. But we take only linear time instead of quadratic time, so we can explore many, many more neighbors, in a sense. Okay, and so that's the kind of trade-off that you often have in local search.
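To make the two-stage idea concrete, here is a small sketch of the min-conflict heuristic for n-queens, with one queen per column and board[c] giving the row of the queen in column c. The representation and the step budget are my own choices for illustration, not the lecture's code:

```python
import random

def conflicts(board, col, row):
    """How many other queens attack square (col, row)? One queen per column."""
    return sum(1 for c in range(len(board)) if c != col and
               (board[c] == row or abs(board[c] - row) == abs(c - col)))

def min_conflict(n, max_steps=20000):
    board = [random.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        bad = [c for c in range(n) if conflicts(board, c, board[c]) > 0]
        if not bad:
            return board                     # no violations: solved
        col = random.choice(bad)             # stage 1: random conflicted queen
        # stage 2: greedy -- move that queen to its least-conflicted row
        board[col] = min(range(n), key=lambda r: conflicts(board, col, r))
    return None                              # gave up (stuck on a plateau)
```

For max/min-conflict you would replace stage 1 by picking the queen involved in the most violations. Either way, each step costs linear-time scans rather than the quadratic cost of evaluating every queen/value pair.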

Okay, so these multi-stage heuristics, as I said, are useful. Very useful. Okay, now think about it: we saw the car sequencing example. Remember, we had all these cars on the assembly line, and there were constraints on how many cars with a given option a production unit can handle in a row, okay. And then, essentially, we were basically scheduling all these cars on the slots, okay? And, you know, the stars here are all the cars that are basically violating some of these capacity constraints, okay? The neighborhood we selected was: oh, we want to take two cars, one of which has to be in violation, and we would just swap them, okay?

Now, suppose we want to have a completely greedy procedure, okay, for this particular neighborhood, okay.

So, it's going to be a quadratic neighborhood, right? We can scan every slot in the assembly line and then swap it with all the other slots in the assembly line. And that's a lot of swaps, right? If we have an assembly line with, say, 300 cars, this would be exploring about 45,000 configurations. So it's going to take a while to do that, okay? A multi-stage neighborhood would just say: ooh, I'm going to select one car, okay; that's kind of a linear scan of your assembly line, okay. And then you're going to say: ooh, which other car can I swap it with? That's another linear scan of the neighborhood. And that's essentially a much smaller neighborhood, right? It's like, at most 600 operations that you need to do; so it's linear in the size of the assembly line. So once again you have this trade-off here: I can get the best move, or I can get a very good move, but much faster.

Which one is the best? Well, that will depend on the problem, but you have to consider these options when you're designing a local search procedure, okay? So that's, you know, to give you a sense: you can take the best neighbor, you can take the first neighbor, or you can take something which is in between, in a sense, taking something which is greedy but with a multi-stage approach.

So, what I want to talk about now is random walks, which is a completely crazy idea in a sense, but there are applications for which this is the thing to do. Okay?

So what is this randomization? You want to select a neighbor completely at random. Okay? And then you have to decide whether to accept it, okay? Some schemes are going to accept the move only if it improves, or you can use the Metropolis algorithm, which I will talk about in a later lecture. But essentially, what you're going to do is select a neighbor randomly and then decide if you accept it, using some kind of criterion, right?

And so, essentially, let's look at the first one, which is random improvement. What you do here is look at the entire neighborhood and take one neighbor at random, such that all the neighbors can be selected with the same probability. You take one completely randomly, and then you look at whether it improves the current solution: if it does, you move to it, okay? If it doesn't, you stay in the same state. And you iterate these things, okay? So, this can actually be very effective, okay, in some parts of your local search, okay? And there are various cases where this is effective, and one of them is when the neighborhood is really, really large.
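The random-improvement scheme just described fits in a few lines; the names and the toy objective here are my own illustration:

```python
import random

def random_improvement(f, N, s, iters=10000):
    """Random walk with improvement-only acceptance: draw one neighbor
    uniformly at random, move there only if it improves the objective."""
    for _ in range(iters):
        n = random.choice(N(s))   # one neighbor, uniformly at random
        if f(n) < f(s):
            s = n                 # accept the improving move
        # otherwise stay in the current state and draw again
    return s

# Toy run: minimize |x - 10| with neighbors x-1 and x+1, starting at 0.
result = random_improvement(lambda x: abs(x - 10), lambda x: [x - 1, x + 1], 0)
```

Note that no iteration ever scans the whole neighborhood; each step costs one evaluation of f, which is exactly why this works when the neighborhood is enormous.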

Okay, so remember this traveling tournament problem, okay? We had all these teams, kind of like baseball teams, that had to play against every other team twice; it was a round-robin tournament. And so you would play some games at home, then move, play some games away, then come back, and so on. And the goal was to minimize, essentially, the travel distance. And you had these constraints making sure that a team doesn't play too many games in a row at home or away. And then you had basically this objective function, which was global and therefore difficult to optimize.

Okay? And what we saw is that there were plenty of moves that we could do, right? We could swap homes, swap teams, swap rounds, and then the partial versions of these. Okay? And some of these neighborhoods are actually pretty complicated: there are, let's say, on the order of n-cubed moves in some of these neighborhoods, where n is the number of teams. Okay? And, equally important in a sense, some of these moves were affecting the entire schedule, so you had a lot of operations to carry them out. Okay? So, this was essentially a swap, swapping the teams here. And so you would affect a large part of the schedule, and then you need to flip some of the rest to restore feasibility; what you see here in color is essentially all the things that have been changed during that particular move. When you look at something like this, you have many, many, many neighbors, and every one of them is going to affect the entire schedule, so they take long to evaluate.

So you just can't really do a best-neighbor exploration very effectively, because that would take a lot of time, and the benefits are not really clear at that point, okay? What you can do instead is just select one of these moves randomly and see if it improves your schedule: if it does, you take it; if it doesn't, you just stay in the same state, pick another one, and so on. Okay, so the best results that I've shown you, which we talked about earlier for the TTP, the traveling tournament problem, were based essentially on a random walk. And I will tell you later what the metaheuristic is, and what the heuristic is, to decide whether we accept a move or not. But it was basically using a completely random move at every iteration of the search, okay?

So this is essentially the highlight of the presentation I wanted to give you today: we saw the generic search and how you can parameterize it with a heuristic and a metaheuristic. But I haven't shown you the metaheuristics yet, and that's what we're going to talk about next time. Okay?

See you next time. Thank you.