Just as we did with prior predictive distributions, we can use the posterior samples we just created to get Monte Carlo estimates of quantities that interest us from posterior predictive distributions. For example, we can use draws from the posterior distribution of mu and sigma to simulate the posterior predictive distribution of the mean for a new location. The number of simulations we'll use will be the number of posterior samples we've been using. That is the number of rows in our combined model simulations, which should be 15000.
Now, using the posterior distribution of mu and sigma, we can draw Monte Carlo samples from the predictive distribution of lambda. We'll call this lambda_pred, for prediction. We will draw these from the gamma distribution in our model. Let's bring that up really fast. We're going to use posterior samples of alpha and beta to simulate a posterior predictive distribution for lambda at a new location. So with rgamma, we're going to create n_sim simulations. The shape parameter is the alpha parameter here; let's call this post_alpha. We'll need to assign it in a moment, since we don't have that variable yet. We also need the rate parameter, which will be post_beta.
Before we can run this, we need to get posterior samples of alpha and beta. Remember, right now we only have posterior samples of mu and sigma. So post_alpha is mu squared divided by sigma squared: we take the simulations of mu, square each one, and divide by our simulations of sigma, also squared. The relationship between beta and mu and sigma is the same, except that the mu in the numerator is no longer squared. So alpha is mu squared over sigma squared, and beta is mu over sigma squared. This matches a gamma distribution with mean mu and standard deviation sigma, since its mean is alpha over beta and its variance is alpha over beta squared. We are just transforming our posterior samples of mu and sigma to get posterior samples of alpha and beta. That allows us to draw samples from the posterior predictive distribution of lambda. Let's look at a histogram of this.
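The transformation and the gamma draw described above can be sketched in R roughly as follows. This is a minimal sketch, not the lecture's exact code: the names post_mu and post_sig are assumptions, and the draws assigned to them are made-up placeholders so that the block runs on its own. In practice they would be the mu and sigma columns of the combined model simulations, and the resulting numbers would differ.

```r
set.seed(42)
n_sim <- 15000  # number of posterior samples, as stated in the lecture

# PLACEHOLDER posterior draws (hypothetical values, NOT the fitted model's output).
# Replace these with the mu and sigma samples from your own model.
post_mu  <- runif(n_sim, min = 7, max = 11)
post_sig <- runif(n_sim, min = 1, max = 3)

# Transform (mu, sigma) draws into (alpha, beta) draws for the gamma:
#   mean     = alpha / beta   = mu
#   variance = alpha / beta^2 = sigma^2
post_alpha <- post_mu^2 / post_sig^2
post_beta  <- post_mu   / post_sig^2

# One predictive lambda per posterior draw; rgamma is vectorized over shape and rate.
lambda_pred <- rgamma(n = n_sim, shape = post_alpha, rate = post_beta)

hist(lambda_pred)
```

Because each lambda is drawn using a different (alpha, beta) pair, the histogram reflects posterior uncertainty in the hyperparameters as well as location-to-location variability.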
This distribution tells us what lambda might look like if we were to draw a new lambda from a new location. If we created a new location, what would its lambda parameter look like? We could also compute posterior predictive probabilities, such as the predictive probability that a new location would have a lambda greater than 15. That is a pretty small probability, less than a 2% chance.
Now, using this predictive distribution for lambda, we can go down to the observation level and simulate the actual number of chips per cookie, which takes into account the uncertainty in lambda. The posterior predictive distribution for the number of chips per cookie will come from the Poisson distribution. We'll create n_sim simulations, and lambda will be our draws from the posterior predictive distribution of lambda. Let's look at this distribution. It represents our posterior predictive distribution of the number of chocolate chips per cookie from a new location, averaging over our uncertainty in that new location's lambda parameter. So we could ask: what is the posterior probability that a cookie produced at a new location would have more than 15 chocolate chips? The posterior probability is about 6%.
We can also compare what y at future locations might look like, based on the model, to the original data. Let's look now at a histogram of the original data, dat$chips. You can see the distribution is not too different.
Finally, we could answer questions like the following: what is the posterior probability that the next cookie produced at location 1 will have fewer than seven chips? We need to simulate predictions at location 1 specifically. We'll draw these from the Poisson distribution, with n_sim simulations. The lambda parameter here will not be our predicted values of lambda for new locations. We want to do it specifically for location 1.
So we'll take our posterior samples of lambda from location 1. Now let's look at the histogram for y_pred1. This is the posterior predictive distribution of the number of chips in a future cookie produced at location 1. We also need to calculate the posterior probability, mean(y_pred1 < 7), since we wanted the posterior probability that the next cookie contains fewer than seven chips. We run that, and it's about a 19% chance that the next cookie will have fewer than seven chips.
We've now seen several advantages of using Bayesian hierarchical models. One is that these models allow us to answer multiple questions with a single model. Not only multiple questions, but multiple kinds of questions: for example, how different locations compare and how we might predict for a new location, or, within an existing location, predictions about future chocolate chip cookies. These models provide a way to answer these questions through a unified modeling approach that combines all of the data, for example all of the data from those five locations. For this reason, hierarchical models are well suited for meta-analyses, which combine data from disparate sources, possibly collected at different times. With Monte Carlo samples from these hierarchical models, it's easy to estimate posterior probabilities and make predictions.
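To make that last point concrete, here is a minimal R sketch of the two observation-level predictions discussed earlier: a cookie from a new location and the next cookie at location 1. The name post_lam1 is an assumption, and both lambda vectors are filled with made-up placeholder draws so the block is self-contained. With real posterior samples the probabilities would be the ones reported in the lecture, not whatever these placeholders give.

```r
set.seed(1)
n_sim <- 15000

# PLACEHOLDER draws (hypothetical, NOT from the fitted model):
# lambda_pred stands in for predictive draws of lambda at a new location,
# post_lam1 stands in for posterior draws of lambda at location 1.
lambda_pred <- rgamma(n_sim, shape = 20, rate = 2)
post_lam1   <- rgamma(n_sim, shape = 80, rate = 9)

# Chips in a cookie from a new location: one Poisson draw per lambda draw,
# which averages over the uncertainty in that location's lambda.
y_pred <- rpois(n = n_sim, lambda = lambda_pred)

# Chips in the next cookie at location 1: use location 1's own lambda draws.
y_pred1 <- rpois(n = n_sim, lambda = post_lam1)

# A Monte Carlo probability estimate is the mean of a logical indicator.
p_new_gt15 <- mean(y_pred > 15)   # P(more than 15 chips, new location)
p_loc1_lt7 <- mean(y_pred1 < 7)   # P(fewer than 7 chips, location 1)

hist(y_pred)
hist(y_pred1)
```

The only difference between the two predictions is which lambda draws feed rpois. Everything else, including the probability estimate as the mean of an indicator, is the same pattern.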