This course is for experienced C programmers who want to program in C++. The examples and exercises require a basic understanding of algorithms and object-oriented software.

Loading...

来自 University of California, Santa Cruz 的课程

C++ For C Programmers, Part B

90 个评分

This course is for experienced C programmers who want to program in C++. The examples and exercises require a basic understanding of algorithms and object-oriented software.

从本节课中

Hex and the use of AI and C++ Move semantics

This module explains Min-Max and the Alpha-Beta algorithm for game playing. Its programming topics include C++ 11 Move semantics and a detailed example of referential garbage collection.

- Ira PohlProfessor

Computer Science

So in alpha-beta, we again have a maximizer and minimizer tree.

We again have leaf nodes.

And they're evaluated on some scale where the

maximizer is going for the largest positive score.

The minimizer is going for a largest negative score.

and then the values are backed

up alternatively between the maximizer and minimizer.

And that's the value of the game.

So if you're thinking of hex or chess.

If I have a large positive score, white's going to win.

If I have a large negative score, we're expecting black to win.

And we pick those ranges we could assign probabilities.

We could say, white winning should be one, black

winning should be minus one, a draw should be zero.

And that would be a way to normalize.

But generally speaking, there's some kind of natural scale.

So, with chess, for example, you have this point

count chess.

And historically, you would evaluate a position.

And you might be up a few points, because you have more.

pieces, more material advantage. Maybe there is some partial points

for space advantage. Again, in these tic-tac-toe games

there may be advantage in who has the longest path, and so

you would use a natural scale within.

those games rather than necessarily pure probability of winning.

So we're used to this for min-max.

And here's how we're going to show alpha-beta.

I'm going to show it on detail again.

If you want pseudocode, there's very good

pseudocode on the internet typically in the Wikipedia.

Though I'm not going to show you pseudo code, but I'm

going to give you a practical example, and we're going to

see how this works.

So again, min-max, and in min-max you

can see what would happen, and let's step through it.

This is a minimizer.

These are the, again, these are the leaf nodes.

So, you can imagine those intermediate values and the value of the game isn't

known yet. So the minimizer here is going to, let

me use blue. Minimizer goes over here, minimizer

goes over here, minimizer goes over here,

minimizer goes over here, over here, over here.

And a minimizer would go over here. And a minimizer would go over here.

So if this was an ordinary mini max tree, these would be the backup values.

You would five, four, three, six, etcetera.

Notice that, on this side, there are these.

gray (color) nodes, I'll show them.

We're going to explain that in a second.

Now we look at the maximizer.

Maximizer has a choice between five and four.

Minimizer here has, a maximizer here has no choice.

Maximizer here has a choice of sixes no choice.

Maximizer here has a choice.

The maximizer here has a choice.

Minimizer. Again, a minimizer's going to pick the

smallest value, that's three, that's a six, that's a five, and

the maximizer at the top is going to pick six, so the game value is six.

So what's this red cutoff, and why did I circle this node five?

So let's think about this.

At this point, what do we know about this maximizer value?

We've evaluated this subtree. Its value five.

That means that,

at least five

is available to the maximizer here. At least five.

Here, we come down and we find that there is an alternative for the minimizer,

and that alternative is a four. Well,

does it matter if that was a 0? It doesn't matter.

Why doesn't it matter?

If that was a zero, the minimizer would pick yet a

smaller value so the minimizer would say, I'll go to zero.

But from the maximizer's perspective, once he knows

that the minimizer can get at least four (he wont go there).

So think as the alpha-beta cut-offs as, what can I do as a maximizer or minimize?

At this point, at least. What's my, at, at least value.

That's my cut off. That's my alpha-beta cut off, in other

words, four is cut off, because five can be grabbed over here already.

So it's uninteresting, and indeed, if this was some

large-scale evaluation, even a subtree, we wouldn't need to examine it.

It's unnecessary to know what this value is.

Drawing in five there, is of no interest.

So, we don't need to do an evaluation there.

That evaluation, which is typically the most expensive

thing done in mini max, the backing up of scores, is not that hard to do.

It's the evaluations, think about, computing longest pass in a hex game.

That's going to be the hardest thing to do.

So that was a cutoff.

That was a cutoff at, at, by four, and we can cut off that tree.

Now, we still need to go here.

And so, here, we can see that at this point, the minimizer guesses three.

So that makes this at least three.

And then we go and pursue this middle tree.

And that changes to at least six.

because the maximizer can change its alpha-beta cutoff value.

And if we notice that the minimizer having six and the minimizer seeing

six here and this is at least six,

for this minimizer, this node is no longer of interest.

Now, when we get to six and we find that there's a path to five here.

So, the minimizer gets at least five.

So, it's going to be five or less, then

this whole sub-tree is not of interest.

We no longer need to know any of this.

So all those nodes need

not be evaluated to get the true value of

the tree based on what we were calling cut offs.

Notice the highest up the cut off, the more work is saved.

So that's the alpha-beta improvement. Noticing where at different

levels, there is at first we have no known cutoffs.

We don't know what the value of anything is.

The minute we move some score up the tree, we get an at least value.

Once we move a score up, we get an at least value.

because, there are contingent opportunities.

And we know at different heights in the tree what those possibly are.

So, that's alpha-beta, that's the alpha-beta improvement.

And that as I say was discovered in the 1950s by,

independently by a number of researchers,

when they were implementing Min-Max algorithm.

Maximizer is saying, I can at least get alpha, minimizer is saying, I

can at least get beta, if you want you can

think of that as starting at minus infinity plus

infinity and you could give an alpha-beta value at all the nodes.

And then as you implement alpha-beta, and make the partial evaluation, the different

nodes get an alpha and beta. And, if there's a

cutoff, if you already have an existing value that

can't be changed, you don't have to evaluate anything further.

So let's try and see what the alpha- beta cutoffs are in the following.

Just to test you, again I

recommend if this is not enough information to go and read

the, the Wikipedia treatment. So again, we have all of these

terminal nodes, static evaluated leaf nodes.

And we have the little places we want you to try and fill in.

And we also want you to do this in an

alpha, beta way not just in a Min, Max way.

So Min and Max should be easy but you should be able

to see what kind of work you're going to save in alpha, beta.

Okay. So, let's see what the answer is to this.

When we first start out,

we're just doing what we would do normally in the, in the, mini max mode.

We have a maximizer.

Selecting between four and five.

So, he's going to pick five. And that means that at this point, we

know the minimizer at least value here becomes at least five.

And when

we come down here The maximizer starts, and gets a seven.

Woops. So that's at least seven.

And that seven is greater than this five. So the maximizer knows

he doesn't want to look anymore at this. So that becomes a cutoff.

So at this point, once you evaluate that node, this sub tree,

all the other nodes hanging off there, are cut off.

Going to the next part of the analysis, we have

an eight and a four so that gives us eight.

So that's at least eight.

Since that's at least eight, the maximizer,

by the way, already is at least five.

The maximizer is still interested in that. The minimizer doesn't want

an eight. Sees that there's a three, because this is

a three. So that changes from eight to three.

at least three. Well,

since that's at least three, the maximizer has no

longer any interest, because it can at least get five.

And this whole subtree is cut off.

We don't really care what its value is.

So we still are at least five. We come down here, we get a four.

This is a four.

And again, since this is at least

four, the maximizer's going to cut that off.

And that is going to be, yet, another cutoff.

So if you notice, you're getting increasingly deep trees

that are cut off in this alpha-beta search.

And is a characteristic of alpha-beta search, that if you

examine, best

moves first, for each player,

you get the greatest computational saving.

[SOUND]

And the mathematical details of that are in, Don

[INAUDIBLE]

papers. And also in, papers by Judea Pearl.

another very famous algorithmic computer scientist who's

examined how to build intelligence into machines.

Okay, that's alpha-beta. That's detail.

That's what I'm going to expect you to use, if

you want to use alpha-beta search in your hex game.

So that's what you should be able to implement.