The second viewpoint on DPC is the game. Just like "optimization," "game" is a common English word, but here it has an unambiguous, formal definition. Power control is clearly a competition among the different transceiver pairs, and with games we can model competition as well as cooperation. The formal language of games extends our intuitive understanding: we have a set of players, the transceiver pairs in this case, and each has a certain strategy set, the set of possible strategies that player can play. And then there is a certain payoff function that says: if you play a certain strategy and the other players play their strategies, the joint outcome leads to a particular reward, or cost, coming to you. So we define a game formally as a three-tuple, that is, three things. The first is very simple: a set of players, indexed by i = 1, 2, 3, up to N. Then, for each player i, we have a certain strategy set A_i, and each element of this set is a possible strategy. For example, in soccer, a goalkeeper deciding whether to dive left or right to catch the ball has a strategy set with two strategies. Finally, there is a payoff function, which maps each profile of strategies to a particular real number, say 5 or -3. If these numbers model something desirable, it is called a payoff function, or in general a utility function, as we will see in future chapters. If they represent something bad, an undesirable outcome, it is a cost function.
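The three-tuple above can be written down directly. Here is a minimal sketch in Python of the goalkeeper example: the players, their strategy sets, and a payoff function. The function name and the strategy labels are illustrative choices, and the numbers (+10 for a save, -5 for a goal) are the example values used in this lecture.

```python
# A game as a three-tuple: players, strategy sets, payoff functions.
players = ["striker", "keeper"]

# Each player's strategy set A_i: two strategies each in this example.
strategies = {
    "striker": ["left", "right"],  # where the striker aims
    "keeper":  ["left", "right"],  # where the goalkeeper dives
}

# Payoff function for the keeper: maps a joint strategy profile
# (one strategy per player) to a real number.
def payoff_keeper(striker_aim, keeper_dive):
    # +10 for a save, -5 for conceding a goal (illustrative values)
    return 10 if striker_aim == keeper_dive else -5

print(payoff_keeper("left", "left"))   # keeper dives the right way: a save
print(payoff_keeper("left", "right"))  # keeper dives the wrong way: a goal
```

Note that the keeper's payoff depends on both players' strategies, which is exactly the coupling discussed next.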
Of course, each player would like to maximize her payoff function, or minimize her cost function, by varying only the strategy under her own control, that is, by picking a particular point in her own strategy set. For example, if the goalkeeper dives left and the ball comes over to the other side, she loses, incurring a cost of, say, -5; whereas if she catches it, she gets a reward of, say, 10. That is the function mapping the chosen strategies to numbers. The outcome is determined in part by your decision, but also in part by whoever is kicking the ball. There lies the coupling of everybody's strategies: your strategy and her strategy together determine the outcome.

We are going to come back later to power control and view it as a game, but first let's take a look at two simple but very famous examples. Perhaps by far the most famous example in game theory is the so-called Prisoner's Dilemma. It is a two-user game modeling competition, and each user has two possible strategies, so we can plot it in a table with two rows and two columns. Say the two rows represent the two strategies of player A, and the two columns the two strategies of player B. Players A and B are suspects in a crime, and the police are going to investigate the crime and examine both of them individually. They will come to know the entries in this table, but they cannot communicate, because they are being questioned separately. Each has two choices to make: do not confess, or confess. So there are altogether four possible outcomes. For each outcome, jointly determined by A's strategy and B's strategy, there is a pair of numbers.
The first number denotes the payoff to A, and the second the payoff to B. If they both decide not to confess, the police can only charge them with a lesser crime, and they each get, say, one year in prison. If they both confess, they are sent to prison for longer terms, say three years each. If one confesses and the other does not, the one who confesses is rewarded for cooperating with the investigation and walks away free, while the other is left with the most severe penalty, say five years in prison. This is a symmetric game, so the remaining cell is filled out the same way: if B confesses and A does not, B walks away with zero years and A gets five. In years of prison, written as negative payoffs, the table is:

                     B: no confess    B: confess
    A: no confess     (-1, -1)         (-5,  0)
    A: confess        ( 0, -5)         (-3, -3)

This defines the Prisoner's Dilemma game: two players, each with a strategy set of two possible strategies, and a table quantifying the payoff to each player, jointly determined by their respective strategies. So why is it called a dilemma? Imagine you are suspect A, and you say: I don't know what B will do. But suppose she chooses not to confess; then what should I do? Well, if I confess, my penalty is zero instead of one year, so I should pick confess. We call this the best response strategy of player A to player B's choice of no confession. But B may also choose to confess; in that case I am comparing -5 with -3, and confess is again the better choice, since -3 is better than -5. Therefore, confess is also the best response strategy if B chooses to confess. Now, B can only choose between these two options, so A would say: I have the same best response strategy no matter what. That is called a dominant strategy.
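The best-response reasoning just described can be checked by brute force. The following sketch encodes the Prisoner's Dilemma payoffs from this lecture (as negative years in prison, so that bigger is better) and computes A's best response to each of B's strategies; the function and variable names are my own choices for illustration.

```python
STRATEGIES = ["no_confess", "confess"]

# payoff[(a, b)] = (payoff to A, payoff to B), in negative years of prison
payoff = {
    ("no_confess", "no_confess"): (-1, -1),
    ("no_confess", "confess"):    (-5,  0),
    ("confess",    "no_confess"): ( 0, -5),
    ("confess",    "confess"):    (-3, -3),
}

def best_response_A(b):
    """A's best response to B playing strategy b: maximize A's payoff."""
    return max(STRATEGIES, key=lambda a: payoff[(a, b)][0])

# A's best response is "confess" no matter what B plays:
# a dominant strategy.
print([best_response_A(b) for b in STRATEGIES])  # ['confess', 'confess']
```

By symmetry, the same computation for B yields confess as well, which is the outcome discussed next.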
A dominant strategy is a best response to all possible strategies of the other players; in this case, it is confess. By symmetry, B is thinking in exactly the same way, and B will also pick confess as her dominant strategy. Therefore, the outcome of this game is that both A and B confess and get three years each. What's wrong with that? Why is it called a dilemma? Because they could each have got only one year. If you think about the summation of the payoffs across the players, -2 is much better than -6; but A and B cannot collaborate or communicate, so through their individual strategizing they end up picking a socially suboptimal outcome. That is why it is called a dilemma.

In what we just discussed, we have defined the best response strategy and the dominant strategy of a game. But a dominant strategy may not exist, as we will see right away in another famous example, called the coordination game. Suppose there are two users again: you and your friend, who might decide to go watch a movie together. Say there are two kinds of movies, action and romance, so each player again has two choices in her strategy set: action or romance, for player A and likewise for player B. If the two of you pick different kinds of movies, there is no agreement, you decide not to watch a movie, and you both get zero payoff. But if you both pick the same kind, you each get a strictly positive return, except the returns differ between you and your friend. Say you prefer action: if you both agree on the action movie, you get two units of reward and B gets only one unit.
Conversely, your friend B likes romance movies more than action movies, so if the outcome is the romance movie, B gets two and you get one. Obviously, you both have an incentive to try to reach some kind of consensus; but you still compete, because depending on which consensus is reached, you get different payoffs. So again we have defined a game, the coordination game. The set of players is {A, B}; each has a strategy set of two possible strategies; and the payoff function maps (action, action) to 2 for A and 1 for B, maps (romance, romance) to 1 for A and 2 for B, and maps (action, romance) or (romance, action) to 0 for both players. The game is now completely specified, so let's understand the behavior. Clearly, A has best response strategies: if B chooses action, A's best response is action, because 2 is bigger than 0; but if B chooses romance, A's best response is romance, because 1 is bigger than 0. So you see there is no dominant strategy, because A's best responses to B's different choices are different. Similarly, B does not have a dominant strategy either. However, their best response strategies can match. What do I mean by match? If A chooses action, then B's best response to that is action; similarly, if B chooses action, then A's best response to that is also action. So (action, action) is a pair of strategy choices by A and B that matches; the same story holds for (romance, romance). You see that these two best responses by A click with the two corresponding best responses by B. In general, we can define what we call the Nash equilibrium.
Basically, a Nash equilibrium describes strategic players' unilateral lack of incentive to move away from their choices. Note that this word "equilibrium" is not the same as the equilibrium we saw earlier, which described the convergence of an iterative algorithm to a particular point; in fact, we will later see yet another notion of equilibrium. More precisely, say there are two users: user 1 has utility function u1 and strategy set A, and user 2 has utility function u2 and strategy set B. We say that a particular pair (a*, b*), with a* in A and b* in B, is a Nash equilibrium if the following two conditions hold. First, user 1's utility at (a*, b*) is no smaller than her utility at (a, b*), where user 2 sticks to b* and user 1 is free to pick any a in her strategy set, that is, for all a in A. This simply says: if the other player does not move away from b*, I have no incentive to move away from a*. And, that is the key word, "and," at the same time the other user feels the same way: u2 at (a*, b*) is no smaller than u2 at (a*, b), where a* stays fixed and b is free to be any point in B. For user 2, again, this says: I have no unilateral incentive to move away from b* if the other player sticks to a*. The key point is that both statements must be true for the single point (a*, b*). That is what we mean by matching best responses. If so, we call the point (a*, b*) a Nash equilibrium: no player has a unilateral incentive to change her strategy away from it. In the coordination game, there are two Nash equilibria.
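The two-condition definition above can be checked mechanically for any small game. Here is a sketch that applies it to the coordination game: a profile (a*, b*) is a Nash equilibrium if neither player can gain by deviating unilaterally. The payoff numbers are the ones from the lecture; the function names are illustrative.

```python
# (A's payoff, B's payoff) for each joint strategy profile
payoff = {
    ("action",  "action"):  (2, 1),
    ("action",  "romance"): (0, 0),
    ("romance", "action"):  (0, 0),
    ("romance", "romance"): (1, 2),
}
STRATS = ["action", "romance"]

def is_nash(a_star, b_star):
    """Check both conditions of the Nash equilibrium definition."""
    u1, u2 = payoff[(a_star, b_star)]
    # user 1 cannot gain by deviating while user 2 sticks to b_star
    no_gain_A = all(payoff[(a, b_star)][0] <= u1 for a in STRATS)
    # user 2 cannot gain by deviating while user 1 sticks to a_star
    no_gain_B = all(payoff[(a_star, b)][1] <= u2 for b in STRATS)
    return no_gain_A and no_gain_B

equilibria = [(a, b) for a in STRATS for b in STRATS if is_nash(a, b)]
print(equilibria)  # [('action', 'action'), ('romance', 'romance')]
```

Swapping in the Prisoner's Dilemma payoffs instead would yield the single equilibrium (confess, confess).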
(action, action) is a Nash equilibrium, because neither A nor B has a unilateral incentive to move away from this choice; and (romance, romance) is another Nash equilibrium. Back in the Prisoner's Dilemma, (confess, confess) is the unique Nash equilibrium. Even though both players realize it would be better for both to not confess, there is no unilateral, that is the key word, incentive to move away from this point. Basically, A says, if you don't move, I don't move; B says, well, if you don't move, I don't move; and they are both stuck in this socially suboptimal, unique Nash equilibrium. So we see that a Nash equilibrium may not exist in general, although in both examples we just saw it does; it may not be unique if it exists, as we saw in the example with two Nash equilibria; and none of the Nash equilibria may be socially optimal, that is, maximize the summation of payoffs, or even be Pareto optimal. However, in very fundamental research in his PhD dissertation at Princeton, John Nash showed that as long as each player is allowed to flip a coin and probabilistically decide which strategy to choose, then in that so-called mixed strategy scenario there will always exist at least one Nash equilibrium.

Now, back to our power control problem. Distributed power control is competition, and competition can be modeled as a game; so what kind of game are we talking about here? Let's identify the set of players first: each player is a logical link, the transceiver pair at its two ends. Next, let's identify the strategy space, which is a little more complicated here. In the two examples we just saw, the strategy space was pretty simple, just two choices. But here it is a continuum: for each user i, the set of power levels such that the target SIR gamma_i is achieved. As you can see, the strategy space so defined, a set of powers P_i, is determined in part by what the other users' powers might be, because those determine how easy or hard it is for you to achieve your target SIR.
So it is through the coupling of strategy spaces, rather than the coupling of payoff functions, that the interference phenomenon is reflected in this power control game; the payoff function here is very simple, basically a cost function, namely P_i, which you want to make small. That is the definition of the game. And here are two simple but important facts. Fact number one, which you can easily verify for yourself: the DPC algorithm, which says that the transmit power of each transmitter i at time t+1 is the current transmit power times the ratio between the target SIR and the current SIR at time t,

    P_i(t+1) = P_i(t) * gamma_i / SIR_i(t), for all users i,

is actually the best response strategy in response to all the other players' strategies, whatever they might be, as reflected through the measured SIR value. By changing my power level this way, changing my strategy this way for the next time slot, I am executing the best response strategy. The second fact you can check is the monotonicity of the strategy spaces. What do I mean by that? I mean that if all the other users j, with j not equal to i, make their powers P_j smaller, then my strategy space, the set of powers P_i such that my target gamma_i can be achieved, becomes bigger. This is obvious once stated: if you reduce your transmit power, I see less interference, so I have more choices of P_i, including smaller values, that can still give me my target gamma_i. That is what I mean by monotonicity of the strategy space. It turns out there is a special name for this kind of game, and subject to some technical conditions that hold in this case, iterations of best responses converge to a Nash equilibrium, provided the strategy spaces of all the users are monotonic in the sense we just described.
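The best-response iteration above is easy to simulate. The following sketch runs the DPC update P_i(t+1) = P_i(t) * gamma_i / SIR_i(t) for a two-link system; the channel gains, noise powers, and target SIRs below are made-up illustrative numbers chosen so that the targets are feasible.

```python
# G[i][j]: channel gain from transmitter j to receiver i (illustrative)
G = [[1.0, 0.1],
     [0.1, 1.0]]
noise = [0.1, 0.1]        # receiver noise powers
gamma = [1.0, 2.0]        # target SIRs for the two links
P = [1.0, 1.0]            # initial transmit powers

def sir(i, P):
    """Signal-to-interference ratio seen by link i."""
    interference = sum(G[i][j] * P[j] for j in range(len(P)) if j != i)
    return G[i][i] * P[i] / (interference + noise[i])

for t in range(50):
    # each link updates using only its own locally measured SIR:
    # this is the best response to the other links' current powers
    P = [P[i] * gamma[i] / sir(i, P) for i in range(len(P))]

print([round(sir(i, P), 3) for i in range(len(P))])  # [1.0, 2.0]
```

After a few dozen iterations, each link's SIR sits at its target: the best responses have converged to the equilibrium power allocation.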
We won't have time to go through the proof of that, but it gives one way, in addition to the optimization method that we will go through in the advanced material part of the lecture, to prove convergence of the distributed power control algorithm.