0:07

In this lecture, we'll learn about Minimax search.

Â What is Minimax search?

Â It's a technique to pick the best move in an adversarial two-player game.

Â Why do we care?

Â Because this is a big help when we're

Â developing artificial intelligence in a turn-based game.

Â Let's look at an example.

Â This example comes from data structures for game programmers by Ron Penton.

Â He uses rocks, but of course,

Â I'm using teddy bears.

Â So the idea behind this game is that you never want to take the last teddy bear.

Â On your turn, you are allowed to take any number of

Â teddy bears from a single bin and you have to take at least one teddy bear.

Â So this tree is rooted at the configuration shown

Â at the top of the two teddy bears and the left bin and one teddy bear on the right,

Â and its Player ones turn.

Â And the next level of the tree,

Â and this is often called a ply,

Â Laken plywood, there are different layers of wood,

Â this is a ply in the tree.

Â So, one step down is

Â all the different configurations that could occur from Player one taking their turn.

Â So on the left side,

Â we have a Player one taking one teddy bear from the left bin.

Â In the middle, we have Player one taking two teddy bears from the left bin.

Â And on the right, we have Player one taking one teddy bear from the right bin.

Â So, to generate our Minimax search tree,

Â we're basically going through

Â the different game configurations that occur as people take turns.

Â Now it's Player two's turn.

Â And on the left sub-tree,

Â you can see Player two can either take

Â the teddy bear from the left Bin or the teddy bear from the right bin.

Â In the middle, the only thing Player two can do is

Â take the single teddy bear from the right bin,

Â and that leads to Player two losing because Player

Â two lost by taking the last teddy bear.

Â On the right hand side,

Â Player two has two choices.

Â On the right hand side,

Â Player two can be particularly stupid and take both teddy bears out of the left bin,

Â thereby losing the game,

Â or they can take one teddy bear out of the left bin.

Â And then on the next ply down,

Â it's Player one's turn,

Â and either they've won in that middle sub tree or all the way in the right,

Â or they have a configuration that only contains one teddy bear,

Â so they're going to have to take that teddy bear which leads to them losing the game.

Â So this is the game tree for a complete play of don't take the last Teddy.

Â That's the new name and that's the programming assignment for this module.

Â You'll, of course, extend what we're doing here.

Â But once we have the game tree,

Â we need to help Player one figure out what they should do for their turn.

Â Now, I've marked the game tree with the Minimax scores.

Â So, at the very top at the root of the tree,

Â Player one is the maximizing player.

Â They want to get the highest score possible,

Â so the way we're applying the Minimax scores down on

Â the leaf nodes of the tree is if player one lost, they get zero.

Â And if player two lost, they get one.

Â So, Player one wants to make a move from

Â the route that gives them the best chance of scoring one.

Â Player two on the next ply down is a minimizing player.

Â They wanted to try to minimize the score that

Â Player one can achieve so they'll make the worst move for Player one.

Â And then the next play down,

Â Player one takes their turn and they're trying to maximize their benefit, and so on.

Â So that's why this is called a Minimax search is because we're alternating

Â between minimizing and maximizing even though we start with a maximizing move.

Â Even though I've marked the game tree all the way up to the root,

Â we need to understand how that works.

Â So assume that I have now marked all these leaf nodes with

Â the appropriate scores and we'll see in

Â the next lecture that we implement this with recursion,

Â so now we back out of the recursion on

Â the maximizing apply just above those bottom leaf nodes.

Â So here, as we back out of the recursion from Player one lost,

Â Player one picks the maximum score in all its children.

Â It only has one child,

Â so it just has to pick zero.

Â Zero happens to be both maximum and minimum,

Â but it picks the best number.

Â It also does this on the next sub-tree.

Â And again, no choices.

Â This one, we're actually lucky Player one is going to score one at this ply of the tree.

Â Here, Player one picks the maximum of all its one child.

Â So it has to be zero,

Â and here's another place where Player one wins.

Â So we're backing up out of the recursion from doing our Minimax search,

Â and we just did the max level,

Â so now we need to do the min level.

Â So, the minimizing player is going to pick the lowest score of all its children.

Â Now on this left sub-tree,

Â they're both zero, so it picks zero.

Â On this middle sub-tree,

Â It only has one child,

Â so it has to accept that score.

Â And finally, we see an actual decision,

Â so the children have scores of zero and one.

Â And because we are on the minimizing ply,

Â we pick the lowest score between zero and one,

Â so we pick zero.

Â Because remember, Player two is trying to do whatever is worst for Player one.

Â But now, Player two is done,

Â this level of the recursion is done,

Â and so we backup to the root,

Â and Player one will decide between these three children.

Â One child has a score of zero,

Â one child has a score of one,

Â and one child has a score of zero.

Â So Player one, because that player is a maximizing player,

Â picks the highest score of the children.

Â And that's a one, and that makes it clear that player

Â one's next move should be to take two teddy bears out of the left hand bin.

Â Don't lose sight in the mechanics of figuring out how all this works.

Â Our end goal is to tell the root of the tree,

Â what should you do?

Â And of course, that means that this is great for AI.

Â We're trying to help AI make decisions,

Â but that's how the Minimax process works In the small example.

Â To recap, in this lecture,

Â we learned about Minimax search.

Â In practice, we usually limit the depth of search that we perform,

Â because in most cases,

Â the fan out, the branches out of a particular game state are

Â so high that we can't go all the way to the end of the game.

Â So what we do is we limit the depth of

Â our search and we use something called a heuristic evaluation function

Â to assign a score based on whatever criteria we come

Â up with for a particular game configuration.

Â Common more advanced techniques like

Â alpha beta pruning and iterative deepening are built on top of Minimax search.

Â So alpha beta pruning is a technique that uses many Minimax search,

Â but when we start exploring a branch that has

Â no possibility of being better than our current best solution,

Â we just prune that branch and we don't explore it any more.

Â Iterative deepening is a technique where we

Â perform Minimax search to one level and saved that result,

Â then perform Minimax search to two levels and save that result, and so on.

Â And this is a really useful technique when we have

Â time constraints on how long we can execute the search.

Â So in our peak game studios,

Â game the cat, we actually used iterative deepening.

Â Because based on the selected difficulty,

Â the AI could spend a certain amount of time looking for their next move.

Â And if we used iterative deepening, which we did,

Â we could actually always have some solution when time ran out,

Â but we could keep looking for a better and better solution until time ran how.

Â