This course is for experienced C programmers who want to program in C++. The examples and exercises require a basic understanding of algorithms and object-oriented software.

Loading...

来自 加州大学圣克鲁兹分校 的课程

C++ For C Programmers, Part B

66 评分

This course is for experienced C programmers who want to program in C++. The examples and exercises require a basic understanding of algorithms and object-oriented software.

从本节课中

Monte Carlo Hex Program, Further advanced C++ Topics and Patterns

This module shows how to use Monte Carlo evaluation in complex games such as Hex and Go. This had led top championship level play in both these games. The module discusses C++ assertions and exceptions for error handling and the new C++ 11 feature static asserts. Finally, the last part of the module introduces the idea of “Design Pattern”, a feature of modern OO programming.

- Ira PohlProfessor

Computer Science

So what does Monte Carlo bring to the table?

Critically, Monte Carlo is a simulation where we make heavy use

of the ability to do reasonable pseudo random number generations.

So it's not truly random obviously to provide a large number of trials.

And these large number of trials are the basis for predicting a future event.

You could do a Monte Carlo to decide in the next 100 years,

is an asteroid going to collide with the Earth.

You'd have to know some probabilities.

You'd have to know some facts and figures about the solar system.

And then you can probably make an estimate that hopefully would be that very,

very small likelihood that we're going to have that kind of catastrophic event.

Indeed, people do risk management using Monte Carlo,

management of what's the case of getting

a 100 year flood or a 100 year hurricane.

And we're discovering that these things are getting more likely because

we're understanding more now about climate change.

So probabilistic trials can let us get at things and

otherwise we don't have ordinary mathematics work.

So it can be used to measure real world events,

it can be used to predict odds making.

We've seen us doing a money color trial on

dice games, on poker.

I think we had an early stage trying to predict

what the odds are of a straight flush in poker for

a five handed stud, five card stud.

And if you run enough trials on five card stud,

you've discovered that a straight flush is roughly one in 70,000.

And if you tried to ask most poker players what that number was,

they would probably not be familiar with.

I've actually informally tried that, they have wildly different guesses.

Of course, you could look it up in the table and

you could calculate, it's not that hard mathematically.

But with very little computational experience, you can readily,

you don't need to know to know the probabilistic stuff.

You readily get abilities to estimate all sorts of things.

So you can use it heavily in investment.

So it's a very useful technique.

So what about Monte Carlo and hex?

Here's our hex board, we're showing a five by five, so

it's a relatively small hex board.

And indeed, when you go to write your code and

hopefully I've said this already, don't use the bigger boards right off the bat.

Use a small board, make sure everything is working on a small board.

A small board would be much easier to debug,

if you write the code, the board size should be a parameter.

So it's not going to be hard to scale on it.

But it will be a lot easier to investigate

the quality of the moves whether everything is working in their program.

So here's a five by five board.

And we want to examine what is a good move in the five by five board.

So we make every possible move on that five by five board, so

we have essentially 25 places to move.

And we'll assume that white is the player who goes first and

we have those 25 positions to evaluate.

So we're not going to do just plausible moves, we're going to do all moves, so

if it's 11 by 11, you have to examine 121 positions.

Now you could get fancy and you could assume that really

some of these moves are quite similar to each other.

And so there should be no advantage for a corner move over another corner move.

And there should be no advantage of

making a move on the upper north side versus the lower south side.

So you could restricted some that optimization maybe the value.

But for the moment, let's forget the optimization because that

goes away pretty quickly when there's a position on the board.

Once having a position on the board, all the squares end up being unique

in relation to pieces being placed on the board.

So it's really only in the first move that you could use some mathematical

properties of symmetry to say that this move and that move are the same.

So we make all those moves and now,

here's the unexpected finding by these people examining Go.

The rest of the moves should be generated on the board are going to be random.

No possible moves, no examination of alpha beta, no nothing.

We're going to make the next 24 moves by flipping a coin.

So black moves next and black moves at random on the board.

White moves at random on the board.

And we fill out the rest of the board.

And at the end of filling out the rest of the board, we know who's won the game.

So here is a wining path at the end of this game.

This white path, white as one here.

So we could stop earlier whenever this would, here you show that there's still

some moves to be made, there's still some empty places.

But I'm going to explain today why it's not worth

bothering to stop an examine at each move whether somebody has won.

I'll explain it now, it's worth explaining now and repeating later.

Turns out you might as well fill out the board because once somebody has won,

there is no way to change that result.

Filling out the rest of the board doesn't matter.

Why?

Because once somebody has made a path from their two sides,

they've also created a block.

So there's no way for the other player to somehow also make a path.

That's the character of the hex game.

So it's a very trivial calculation to fill out the board randomly.

It's not a trivial calculation to decide who has won.

Why is that not a trivial calculation?

Because that involves essentially a Dijkstra like algorithm,

we've talked about that before.

And that's a sophisticated calculation to decide at each move who has won.

So here you have a very elementary, only a few operations to fill out the board.

And then by examining Dijkstra's once and

only once, the big calculation, you get the result.

So you might as well go to the end of the board, figure out who won.

And you do it again.

So for this position, let's say you do it 5,000 times.

Sometimes white's going to win, sometimes black's going to win.

That's what you expect.

And you're going to get some ratio, white wins over 5,000, how many trials?

And that's now going to be some assessment of that decision.

That's going to be how you evaluate that board.

You're not going to have to know anything else.

You're not going to have to do a static evaluation on a leaf note

where you can examine what the longest path is.

You're going to do this quite simply, your evaluation function is merely

run your Monte Carlo as many times as you can.

Given how efficient you write your algorithm and

how fast your computer hardware is.

So this is going to be a test for

you of how clever you're going to be in your C++ program,

because you do want maximum efficiency.

And then, if you get a relatively high number,

you're basically saying, two idiots playing from this move.

One idiot seems to do a lot better than the other idiot.

Maybe that means implicitly this is a preferrable move.

And that's the insight.

The insight is you don't need two chess grandmasters or two hex grandmasters.

Who have sophisticated ways to seek out bridges,

blocking strategies, checking strategies in whatever game or

Go masters in the Go game, territorial special patterns.

Instead, the character of the position will be revealed

by having two idiots play from that position.

And the one that wins more often

intrinsically is playing from a better position.

That's the answer.

Okay, take a second and let's think about using random numbers again.

Rand gives you an integer pseudo random number,

that's what rand in the basic library does for you.

How can you turn this integer into a probability?

This should be a review.

So here's a way to do it.

We manufacture a probability by calling double probability.

You can actually get probabilities out of the standard library as well.

And you just say return or rand divided by the maximum range,

MAX_RAND, which is a library defined max number that's

rand can be.

And in this case I use 1.0, the double and that turns this expression into a double.

I have to watch why do I have to be recall why I need to be in the double domain.

All right, I have to be in the double domain

because I want this to be double divide.

It's int divide.

So if I left out this,

probability would always return 0.

That's not what we want and

the other way to do this is to make this a static cast.

So if we static cast it, that's just another recognizable way to change domain.