[MUSIC] Let's now look at how we might use hierarchical modeling to extend a linear model. Recall that our linear model for infant mortality looked like this. Response yi given the explanatory variables for country i, the beta coefficients and the variants came independently from a normal distribution with mean beta naught plus beta 1 times x1i plus beta 2 times x2i and variance sigma squared, for country i in 1 up to n. Here, yi is the log of infant mortality for country i. x1i is the log of per capita income for country i. And x2i is an indicator that the country exports oil. We haven't yet incorporated information about where the country is located, which may also help explain some of the variability in infant mortality. There's another variable in the data set called region. And it seems reasonable that countries in the same region may be correlated with one another rather than independent which is what the current linear model assumes. One way we can do this is through the intercept. We're going to use what's called a random intercept model. Rather than place a standard prior on beta naught e intercept, we can let each region have its own intercept. And then the group of intercepts will come from a common distribution. So, to illustrate, we're going to change the name of beta naught, the intercept, to alpha. We'll write the model like this, yi given the region of country I, as well as the other explanatory variables, the alpha parameters, the beta parameters and sigma squared. In fact we only need to condition on the one alpha that corresponds to country i. But given all of these, yi is independent normal with mean alpha sub ri plus beta 1x1i plus beta 2x2i, variance sigma squared. Now, the r sub Is will be one of the numbers one up to capital R, the number of regions. And as usual, I will go from 1 up to n. Here alpha is indexed by R. Which is a number that indicates which region country i is in. So for example if country i is in region four, then our sub i will be equal to 4 and the intercept for y sub i will be alpha sub 4. In the next level, we have the alpha sub r, given hyper parameters Mu and Tau squared, will be iid from a normal distribution with mean Mu and variance Tau squared. This will be for region 1 up to capital R. To complete the hierarchical model, we need prior distributions also for the betas, for sigma squared, for Mu, and for Tau squared. Mu represents the mean of the intercepts across all regions and tau squared represents the variability of the intercepts across regions. How much they differ from one region to the next. In addition to inducing correlation and why among the countries in the same region. These parameters can be estimated from their posterior distributions. They may be interesting to researchers above and beyond just the beta coefficient parameters. The graphical model representation of this hierarchical model could be written like this. At the top level we have Mu and Tau squared and for simplicity, I'm going to write all of the alphas on a plate together. So this is alpha r for rn1 up to capital R. The alphas depend on Mu and Tau squared. And at the same level as the Taus, or, I'm sorry, at the same level as the alphas, we have parameter beta 1, beta 2, and sigma squared. Underneath these, we have the plate for the data, so we have y sub i, which is observed. Then we have the region for country i and I'm going to make this a square because it's not random, we don't place a prior distribution on r or either of the xs. So I'll put double squares on those. And this is for country i and 1 up to n. Now, the yis depend on all of the observed covariants, as well as on alphas, betas, and sigma squared. In this drawing, it would be more correct to draw the alpha sub r parameter separately. Each affecting their own respective set of yi variables. Here we drew arrows from all the the alphas affecting yi, but of course only one of them actually does. This drawing at least helps us to visualize the levels of the hierarchy. We could easily envision other hierarchical models, such as ones that place hierarchy on the beta coefficients. For example, if we use the region variable to index beta 1 as we did with the intercepts, this will indicate our belief that the coefficient or the effect of income on infant mortality is different for each region. But these effects are related and come from a common distribution. For now, we're going to focus on the random intercept only model. [MUSIC]