In this video, we will virtually play a game to introduce a Bayesian approach to inference. Throughout the video, we will be making use of Bayes' theorem, properties of conditional probabilities, as well as probability trees. So here's the setup. I have a die in each hand. One of them is a six-sided die, which looks something like this. And the other one is a 12-sided die, which looks something like this. The ultimate goal of the game is to guess which hand is holding which die, but this is more than just a guessing game. Before you make a final decision, you will be able to collect data by asking me to roll the die in one hand, and I'll tell you whether the outcome of the roll is greater than or equal to 4. Before we delve further into the rules of the game, let's pause for a moment and think about what it means to roll a number greater than or equal to 4 with the two types of dice we have. We're going to ask two questions. What is the probability of rolling a value greater than or equal to 4 with a six-sided die? And what is that probability with a 12-sided die? With a six-sided die, the sample space is made up of the numbers between 1 and 6. We're interested in an outcome greater than or equal to 4, so the probability of getting such an outcome is 3 out of 6, or 1 out of 2, or 50%. With a 12-sided die, the sample space is bigger, the numbers between 1 and 12. And once again, we're interested in outcomes 4 or greater. The probability of getting such an outcome is 9 out of 12, or 3/4, or 75%. Say you're playing a game where the goal is to roll a number greater than or equal to 4, like the one we're playing right now. If you could have your pick, which die would you prefer to play this game with: the six-sided or the 12-sided die? Hopefully, your answer is the 12-sided die. Remember, we already worked through the probabilities here. The probability of rolling a number greater than or equal to 4 is much higher, 75%, compared to the 50% with the six-sided die.
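The two probabilities above can be checked with a few lines of Python. The helper function here is just for illustration and assumes a fair die whose faces run from 1 to the number of sides:

```python
def prob_at_least(threshold, sides):
    """Probability of rolling >= threshold with a fair die numbered 1..sides."""
    favorable = sides - threshold + 1  # outcomes threshold, ..., sides
    return favorable / sides

print(prob_at_least(4, 6))   # six-sided die: 3/6 = 0.5
print(prob_at_least(4, 12))  # twelve-sided die: 9/12 = 0.75
```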
So what we're going to call this die is the good die. This is the ultimate goal: you're going to try to figure out which hand is holding the good die, or in other words, the 12-sided die. Here are the rules. Remember, I have two dice: one six-sided, the other 12-sided. I keep one die in the left hand and the other in the right, but I won't tell you which die I'm holding in which hand. You pick a hand, left or right. I roll it. And I tell you if the outcome is greater than or equal to 4 or not. I won't tell you what the outcome actually is, since that could give away which die is in which hand. Think about it. If I tell you that you rolled an 11, you'd know that you had picked the hand holding the 12-sided die, since it's impossible to roll an 11 with a six-sided die. Then, based on that piece of information, you make a decision as to which hand holds the good die, the 12-sided die. You could also choose to try again, in other words, collect more data. You could ask me to roll again, and I could tell you one more time if the hand you picked resulted in a roll that's greater than or equal to 4 or not. But each round costs you money, so you don't want to keep trying too many times. At some point you want to make a call. This is obviously just a game, and we're making up some rules to make a point. But if you think about data collection, it's always costly. And while we love large sample sizes, it takes a huge amount of resources to obtain such samples. So the rules we're imposing aren't haphazardly made up; they reflect some reality about conducting scientific studies. Before we move on to the game, let's first evaluate the possible decisions we might make. There are two possibilities for the truth: either the good die is in the right hand, or the good die is in the left hand. If you guessed that the right hand is holding the good die, and the good die is indeed on the right, then you win the game.
However, if the good die is on the left but you picked right, you lose the game. Similarly, if you picked left and the good die is on the right, then you lose. Otherwise, you win. To avoid losing the game, you might want to collect as much data as possible, but remember, we said that's costly. So at some point, before you're entirely sure, you'll have to just go ahead and make a guess. If there are no consequences to losing the game, like in this scenario, you might not care much whether you win or lose. But say you had lots of money riding on it; then you might be conservative about calling the game too early. Here we're basically talking about balancing the cost associated with making the wrong decision and losing the game against the certainty that comes with additional data collection. Before we collect any data, you have no idea whether I'm holding the good die, the 12-sided die, in the right hand or the left hand. Then what are the probabilities associated with the following hypotheses? The first hypothesis is that the good die is on the right, and the second hypothesis is that the good die is on the left. While this is a somewhat subjective question, chances are you answered 50% chance that the good die is on the right, and 50% chance that the good die is on the left. These are your prior probabilities of the two competing claims, the two competing hypotheses. That is, these probabilities represent what you believe before seeing any data. You could have conceivably made up these probabilities, but instead, you have chosen to make an educated guess. What would be a situation where you might not pick this answer but pick something else? Say you know that I tend to favor my left. If you knew this about me, then you might put a higher probability on me holding the good die in my left hand. But if you don't have any additional information like that about me, 50-50 is going to be your best bet.
Now that we have sufficient background information on the game, we can finally play. Say you pick the right hand for the first round. I roll the die in that hand, and voila, you roll a number greater than or equal to 4. Remember, I won't tell you which die is in the right hand, and I won't tell you what the outcome is, but at least I'm telling you that you rolled a high number. Now we re-evaluate our stance. You chose the right hand, and you got a number greater than or equal to 4 as a result of rolling the die in that hand. Having observed this data point, how, if at all, do the probabilities you assign to the same set of hypotheses change? The first hypothesis was that the good die is on the right, and the second was that the good die is on the left. The calculation of the specific probability will take a few steps, and we're going to get to that in a minute. But first, let's think about whether the new probability for H1, the first hypothesis, should still be 0.5, less than 0.5, or more than 0.5. Hopefully your answer is that the probability of the good die being on the right should now be slightly more than 0.5, because we just rolled the die in that hand and got a high-valued outcome. We know that this is more likely to happen with the 12-sided die, so the probability that the right hand is holding the 12-sided die should be a little higher than what we had initially assigned. Let's actually calculate that probability. We started with two hypotheses: the good die is on the right, or the bad die is on the right. And we said that initially we're going to give these equal chances, a 50% chance of being true, before we actually get started with the data collection. Remember, these were our priors. Then, we think about the data collection stage. If it is true that the good die is on the right, the probability of rolling a number greater than or equal to 4 is going to be 75%. And the complement of that, rolling a number less than 4, is going to be 25%.
If, on the other hand, the bad die is on the right and you pick the right hand, the probability of rolling a number greater than or equal to 4 is only 50%, and the complement, rolling a number less than 4, is also 50%. Usually in probability trees, the next step is to calculate the joint probabilities, so we multiply across the branches. There's a 37.5% chance that the good die is on the right and you roll a number greater than or equal to 4. There is a 12.5% chance that the good die is on the right and you roll a number less than 4. There is a 25% chance that the bad die is on the right and you roll a number greater than or equal to 4. And there's a 25% chance that the bad die is on the right and you roll a number less than 4. Remember, we did indeed roll a number greater than or equal to 4. So these are the two outcomes that we're most interested in, the very top branch and the third branch: good die on the right and roll a number greater than or equal to 4, or bad die on the right and roll a number greater than or equal to 4. We had earlier asked you to think about how the probability changes for the first hypothesis being true. Well, that probability could formally be written as the probability that the good die is on the right, given that you rolled a number greater than or equal to 4 with the die on the right. Since this is a conditional probability, we can make use of Bayes' theorem, which basically says: if you're looking for the probability of A given B, find the joint probability of A and B, divided by the marginal probability of B. So in the numerator, we have the probability that the good die is on the right and you roll a number greater than or equal to 4, divided by simply the probability of rolling a number greater than or equal to 4 with the die on the right hand.
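The four joint probabilities from the probability tree, each obtained by multiplying the prior on a branch by the likelihood along that branch, can be checked with a short sketch. The variable names here are just illustrative:

```python
# Priors: before any data, each hand is equally likely to hold the good die.
prior_good = 0.5          # P(good, 12-sided die is on the right)
prior_bad = 0.5           # P(bad, 6-sided die is on the right)

# Likelihoods: probability of rolling >= 4 with each die.
p_high_given_good = 0.75  # 9 of 12 faces
p_high_given_bad = 0.50   # 3 of 6 faces

# Joint probabilities: multiply across the branches of the tree.
joint_good_high = prior_good * p_high_given_good       # 0.375
joint_good_low  = prior_good * (1 - p_high_given_good) # 0.125
joint_bad_high  = prior_bad * p_high_given_bad         # 0.25
joint_bad_low   = prior_bad * (1 - p_high_given_bad)   # 0.25
```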
The joint probability is the probability that we're grabbing from the first branch, the 37.5%, and the marginal probability of rolling a number greater than or equal to 4 is going to be simply the 0.375 plus the 0.25. You may be rolling a number greater than or equal to 4 with the die on the right hand either because it was the good die or because it was the bad die. And because we're saying or, for these two disjoint outcomes, we add the two probabilities. The result comes out to be 60%. Earlier we had guessed that the probability of the hypothesis being true should increase from 50%. And, in fact, now with the one data point we have observed, we can indeed see an increase, up to 60%. The probability we just calculated is also called the posterior probability. It's the probability that the good die is on the right, given that you rolled a number greater than or equal to 4 with the die on the right. Posterior probability is generally defined as the probability of the hypothesis given the data. In other words, it's the probability of a hypothesis we set forth, given the data we just observed. It depends on both the prior probability we set and the observed data. This is different from what we calculated at the end of the randomization tests on gender discrimination: the probability of observed or more extreme data, given the null hypothesis being true, in other words, the probability of data given the hypothesis, which we had called a p-value. We'll see a lot more of those throughout the rest of the course, but this time, we're making our decision based on what we call the posterior probability as opposed to the p-value. In the Bayesian approach, we evaluate claims iteratively as we collect more data. Suppose we play one more round: you ask me to roll the die in one hand again, and we calculate the posterior one more time.
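The Bayes' theorem calculation just described, joint probability over marginal probability, can be sketched directly with the numbers from the tree:

```python
# P(good die on right | rolled >= 4 on the right) via Bayes' theorem.
joint = 0.5 * 0.75                  # P(good on right AND roll >= 4) = 0.375
marginal = 0.5 * 0.75 + 0.5 * 0.5   # P(roll >= 4 with the right-hand die) = 0.625
posterior = joint / marginal

print(posterior)  # 0.6
```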
We get to take advantage of what we learned from the data. In other words, we update our prior with our posterior probability from the previous iteration. So, in the next iteration, our updated prior for the first hypothesis being true is going to be the 60%, the posterior from the previous iteration. And the complement of that, 40%, is going to be the probability of the competing hypothesis. So to recap, the Bayesian approach allows us to take advantage of prior information, like a previously published study or a physical model. It also allows us to naturally integrate data as you collect it and update your priors. We also get to avoid the counterintuitive definition of a p-value, the probability of an observed or more extreme outcome, given that the null hypothesis is true. And instead, we can base our decisions on the posterior probability, the probability that the hypothesis is true, given the observed data. A good prior helps, but a bad prior hurts. Remember that when we set our priors, the 50-50 chance for the two hypotheses being true, we said that we were taking an educated guess, so we don't want to just make up our prior probabilities. But the prior matters less the more data you have. So even if you didn't have a great prior to begin with, as you collect more data, you're going to be able to converge to the right probabilities. In this course, the Bayesian inference examples that we're going to give will be much simpler. However, they will provide a solid framework, should you decide to continue your studies with statistics and work with more advanced Bayesian models.
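The iterative updating described above, where each posterior becomes the prior for the next roll, could be sketched like this. The `update` function is illustrative, not from the video, and keeps the same likelihoods as the game (75% for the 12-sided die, 50% for the six-sided die):

```python
def update(prior_good, rolled_high):
    """One Bayesian update: return P(good die in the chosen hand | this roll)."""
    like_good = 0.75 if rolled_high else 0.25  # 12-sided die
    like_bad = 0.50                            # 6-sided die, either outcome
    numerator = prior_good * like_good
    marginal = numerator + (1 - prior_good) * like_bad
    return numerator / marginal

p = 0.5              # initial 50-50 prior
p = update(p, True)  # first high roll: prior 0.5 becomes posterior 0.6
p = update(p, True)  # another high roll: yesterday's posterior is today's prior
print(round(p, 3))   # 0.692
```

Notice that each additional high roll pushes the probability further toward the good die, which is exactly the sense in which more data gradually outweighs the prior.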