1:36
if Player 2 were to choose like this if Player 2 was to make the decision that in
this situation he would decline the offer.
In this situation, he would accept it, and then again, in this situation he
would decline it. That's one pure-strategy and that's
different from this pure-strategy here. And so, the number of such
pure-strategies is 8. So now, let's try to make that a bit more
general. generally speaking, a pure-strategy for a
player in a perfect information game completely specifies how that player
would play the game for anything that could happen in the game.
specifically, it says what actions to take at every choice node that that
player gets to make a decision on it. So intuitively, the way I like to think
about pure-strategies in an extensive-form game, is in terms of
giving instructions to another person to play the game for you.
Imagine that Player 2 wants to send her friend off to play the game.
Then she would need to tell everything to her friend that her friend might ever
need to know in order to play the game properly.
And specifically, she'd have to say for every choice node that her friend might
encounter, what her friend would need to do.
So, kind of think of a pure-strategy as proxy instructions that, that you give to
someone else to play the game for you. then when we're counting the number of
pure-strategies, we really are asking how many different sets of proxy instructions
is it possible to have. If we say this formally in Math the
pure-strategies of a player in a given perfect information extensive-form game
are the cross product of the action sets for that player.
So, if we look at the sets of actions that are available at every choice node
for the player, the set of pure-strategies is then the
cross product of those sets across all of the choice nodes at which that player
gets to make a decision. Let's do an example that is a little bit
more complicated than we saw in the sharing game.
so first of all, I want to ask you here to think what are the pure-strategies for
Player 2? Here, I'm not asking you to count them, but I'm asking you actually
to say what they are. And again you may like to pause the video
and think about this for both players before I give you the answer.
So, we'll start with Player 2. Well, Player 2 has 2 choice nodes, this one
here and this one here. And so, the pure-strategies for Player 2
are going to be the cross products of the action sets at each of those different
choice nodes. So, here they are written out.
So, for example, the pure-strategy CF is saying that if this choice node Player 2
will play C c and if this choice node Player 2 will play F.
And because they are two sets of size 2, there's a total of four peer strategies.
Now, for Player 1, things are little bit more interesting.
What are the peer strategies for Player 1? Well, again Player 1 has two choice
nodes, this one and this one. And so again, the pure-strategies for
Player 1 are the cross products of those 2 two sets.
So, again, there are four pure-strategies for Player 1.
Why is this interesting? Well, if Player 1 takes this action,
then Player 1 knows that he will never reach this choice node.
Because his own action has made it impossible to reach that choice node.
Nevertheless, our definition of pure-strategies says.
That the pure-strategies AG is different from the pure strategy
So, there are still four pure-strategies for Player 1, rather than three
pure-strategies. So, pure-strategies are a bit different
in extensive-form games than they were in normal form games.
However, there's something great. Once we've got this new definition of
peer strategies, we can actually leverage it in order to use all of our old
definitions of all kinds of other concepts.
So, in normal form games, we define mixed
strategies as probability distributions over peer strategies and in an
extensive-form game, we can use exactly the same definition word for word.
A mixed strategy in an extensive-form game is a probability distribution over
mixed strategies. All the changes is that the underlying
peer strategies themselves are different. They're now policies of what to do at
every choice node in the game. Likewise, a best response in an
extensive-form game is, a mixed strategy that maximizes expected utility, given a
mixed strategy profile of the other agents. So again, that's exactly the same
definition we had in the normal form. Finally, Nash equilibrium is again a
strategy profile in which every agent is best responding to every other agent.
So, all three of these concepts are really just the same as they were before.
Now, something we might wonder is whether Nash equilibria exist, how we can reason
about them just having the definition doesn't, of course, give us that.
but there's an even tighter connection to the normal form that gives us more.
And that is that we can convert an extensive-form game into the normal form
and there are a couple of reasons why this is interesting.
The first is, because there exist in normal form game, we can leverage results
we have about the normal form, like the existence of equilibrium just by virtue
of the fact that there's a corresponding game.
also, if we find it easier to reason about the normal form game, we can
actually construct it and look at it, rather than looking at the
extensive-form. So, I'm going to show you how to do that
conversion. Here's the extensive-form game we just
thought about. The conversion is actually really straightforward.
Here's the corresponding normal form game.
And what you'll see is, we've just listed all of the pure-strategies of each agent
as the actions in the normal form game. So, you will remember these because we
went through them before in this game already.
Here are the four pure-strategies for Player 1.
And here are the four pure-strategies for Player 2.
Now, to fill in the payoff values, what we do is we just kind of simulate
play of the game. So if, for example, I wanted to figure
out how to put the numbers in this cell of the game,
I would look at the pure- strategy BG by Player 1 and the pure-strategy CF by
Player 2. So, BG means playings like this and CF
means playing like that. And then, I look at what node I would
actually reach in the game and I would get down here and follow the treat like
that and so I write in the number 2,10. And so, this whole table has been filled
in that way. And this is what we call the Induced
Normal Form of this extensive-form game. Now, one thing to notice about this
Induced Normal Form, is that it has more numbers in it, then there are leaf nodes
in the extensive-form. You'll notice there are repititions. So,
for example, 3,8 get's repeated four times here even though it only
corresponds to one payoff value. Similarly, 8,3 gets repeated four times
even though it only corresponds to one payoff value and that's not an accident.
That's because there are four pure-strategy profiles that leads to that
same leaf node in the tree. So this can be a problem because this
blowup is actually exponential. It doesn't look so bad here because the
game we're looking at is very small. But as the size of the game tree grows,
this blowout can really be profound. It can mean that in practice, it might be
very difficult for us to write down this Induced Normal Form.
Another thing that's important to notice is that we can't always do a
transformation in reverse. So, you might be curious and wonder,
if you give me a, a normal form game, can I make a perfect information
extensive-form game out of it? And the answer is, in general, no.
this kind of special structure that you see to the game where payoffs are
repeated is kind of important. And general extensive-form games so in
general, normal form games can't be turned into extensive-form games.
an example of that is matching pennies. Intuitively, in matching pennies, it's
really important that the two players play simultaneously.
And we don't really have a way of talking about two players playing simultaneously
in a perfect information game because one of the players would have to move first,
the second player would then get to see that move, and there's just no way of
representing a simultaneous move game like matching pennies that way.
So intuitively, we shouldn't expect a transformation from matching pennies into
a perfect information game. Seems like something would have to be
lost. Indeed, there's a theorem that says, that
every perfect information extensive-form game always has at least one
pure-strategy Nash equilibrium. That's not something that's true in
general of normal form games. Matching pennies, which we just talked
about, doesn't have a pure-strategy equilibrium.
it's easy to see why this theorem is true.
intuitively randomization often serves the role of confusing the other player
and there's really no reason that it, that could ever worked in a perfect
information game. If Player 1 randomized about this choice
Player 2 would nevertheless get to see what Player 1 had done.
And so it, it can't gain us anything to have randomization in the game that, that
can't create the opportunity for an equilibrium that wasn't there before.
So lastly, I want to look at this game and reason about what its three
pure-strategy equilibria are. Now, we can do this by actually looking
at the game tree and trying to think about what would make sense as an
equilibrium in that game. But that can be a little bit hard to do,
in part, because we can't quite so easily read the pure-strategies off the game
tree. instead, what can be more convenient for
a small game like this, is to construct its pure-strategy its Induced Normal Form
here, which list the pure-strategies directly.
And then to just reason about the pure-strategies directly on this game, so
let's do that. So I'll let you again, pause the video
here if you like, and try to find those equilibria for yourself and and then,
then I'll tell you what they are. So, the three pure-strategy equilibria,
are AGCF, AHCF, and BHCE. So, let's talk about how we're able to
see that those are equilibria. Recall always the way that we test for a
pure-strategy equilibrium in a normal form game is to, for each player,
check and see whether there's any deviation that would give that player
greater utility. So, let's, for example, look at BHCE.
If player 1 was to deviate here, you can see there's no other action he
could take that would give him more than 5.
And similarly, if Player 2 was to deviate here,
you can see, there's no other action she could take that would give her more than
5. in both cases, there's something that
would tie but that's okay. Because a best response just says that
there, there isn't anything better. So that confirms that it is in
equilibrium. In contrast, if I looked at something
like this, which I have claimed isn't in equilibrium, you can see that it isn't in
equilibrium by checking for each player. So, Player 2 indeed, can't do anything
better than ten. So, this is a best response, CF is a best
response to BG for Player 2. But on the other hand, Player 1 could deviate from
BG to AG and get a payoff of three instead of a payoff of two.
So BG is not a best response to CF for Player 1,
and so this is not a Nash equilibrium. And that's it for this video.
Thanks.