Next we discuss Bayesian inference in multiple regression. We will use a reference prior that provides a default or baseline analysis, one that yields a correspondence between Bayesian and frequentist results. We will illustrate this with a data set from earlier on kids' cognitive scores, where we predicted a kid's cognitive score from the mom's high school status, the mom's IQ score, whether or not the mom worked during the first three years of the kid's life, and the mom's age. In the regression model, the betas are the unknown regression coefficients and epsilon is the error term. For Bayesian inference, we need to specify a distribution for the error term epsilon. Since the cognitive score is continuous, we'll use a normal distribution with mean 0 and assume that the variance of the errors is constant across all of the individuals. We will also need to specify a prior distribution for all of the unknown regression coefficients and the unknown variance sigma squared. The conjugate family for this model is a multivariate normal distribution, given sigma squared, for all of the regression coefficients, and a gamma distribution for the inverse of the variance. For an informative prior distribution, we would need to specify the prior mean, variance, and covariances of all of the regression coefficients. We would also need to specify the prior parameters for the gamma distribution, such as a prior guess of sigma squared and how many degrees of freedom the distribution should be based on. This elicitation can be quite involved, and because I do not have prior information about the prior variances and covariances or the other prior parameters in this distribution, I'm going to adopt a reference prior that is a limiting case of the conjugate family. The reference prior that we will use has a uniform, or flat, distribution for all of the coefficients in the regression function, similar to what we used in the simple linear regression example.
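The model being described can be written out explicitly; the shorthand predictor names hs, iq, work, and age below are stand-ins for the four variables just listed, not notation from the course itself:

```latex
\[
\text{score}_i = \beta_0 + \beta_1\,\text{hs}_i + \beta_2\,\text{iq}_i
               + \beta_3\,\text{work}_i + \beta_4\,\text{age}_i + \epsilon_i,
\qquad \epsilon_i \overset{iid}{\sim} N(0, \sigma^2).
\]
The conjugate family pairs a multivariate normal on the coefficients, given
$\sigma^2$, with a gamma distribution on the precision,
\[
\beta \mid \sigma^2 \sim N(b_0, \sigma^2 \Sigma_0), \qquad
1/\sigma^2 \sim \mathrm{Gamma}(\nu_0/2,\ \nu_0 s_0^2/2),
\]
while the reference prior instead takes a flat $p(\beta_j) \propto 1$ for each coefficient.
```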
We also use the reference prior on sigma squared that is proportional to 1 over sigma squared, or 1 over the variance. The reference posterior distribution in the multiple regression case parallels simple linear regression, where the marginal posterior distribution for each of the coefficients is a Student t-distribution. The degrees of freedom for the Student t-distribution are n − p − 1, coming from the residual degrees of freedom, where we lose p + 1 degrees of freedom for estimating the intercept and the p coefficients associated with the p predictors. The posterior mean is equal to the ordinary least squares (OLS) estimate, and the scale in the t-distribution is given by the standard error of beta from the OLS regression. These can be obtained in R using the lm function. Let's look at the posterior distributions for the four predictors in the cognitive scores example, based on the marginal t-distributions. These are all centered at their respective OLS estimates of beta, with the spread of each distribution related to its standard error. Now, it may not always be convenient to show the posterior distributions, and instead posterior summaries may be more useful for reporting. Here we have a table of the point estimates given by the posterior means, the posterior standard deviations, and 95% credible intervals for each of the predictors. As in simple linear regression, the posterior estimates from the reference prior that are in the table are equivalent to the numbers reported by the lm function in R, or obtained using the confint function. These intervals are given by the posterior mean plus or minus the appropriate t quantile with n − p − 1 degrees of freedom, times the posterior standard deviation. For example, given this data we believe there is a 95% chance that the kid's score increases by 0.44 to 0.68 points with each one-point increase in the mom's IQ score.
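The course carries out these computations in R with lm and confint. As a minimal sketch of the same idea, the reference-posterior summaries can be reproduced from scratch from the OLS quantities; the data below are synthetic stand-ins (the column names hs, iq, work, age and all numbers here are assumptions for illustration, not the actual cognitive-scores dataset):

```python
# Sketch: reference-prior Bayesian multiple regression via OLS quantities.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 200
X = np.column_stack([
    np.ones(n),                      # intercept
    rng.integers(0, 2, n),           # hs: mom's high school status (0/1)
    rng.normal(100, 15, n),          # iq: mom's IQ score
    rng.integers(0, 2, n),           # work: mom worked in first 3 years (0/1)
    rng.normal(25, 4, n),            # age: mom's age
])
beta_true = np.array([20.0, 5.0, 0.6, 1.0, 0.2])   # hypothetical truth
y = X @ beta_true + rng.normal(0, 18, n)

p = X.shape[1] - 1                   # number of predictors
df = n - p - 1                       # residual degrees of freedom

# Under the reference prior, the OLS estimates are the posterior means.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
s2 = resid @ resid / df              # estimate of sigma^2
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))  # posterior scales

# 95% credible intervals: mean +/- t quantile (df = n - p - 1) * scale.
# Numerically these match frequentist confint() output in R.
t_crit = stats.t.ppf(0.975, df)
lower, upper = beta_hat - t_crit * se, beta_hat + t_crit * se

for name, b, s, lo, hi in zip(["intercept", "hs", "iq", "work", "age"],
                              beta_hat, se, lower, upper):
    print(f"{name:>9}: mean={b:8.3f}  sd={s:6.3f}  95% CI=({lo:8.3f}, {hi:8.3f})")
```

The point of the sketch is that nothing beyond standard OLS output (estimates, standard errors, residual degrees of freedom) is needed to report the reference-posterior means, standard deviations, and credible intervals.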
The mom's high school status has a larger effect: we believe that there is a 95% chance that scores are 0.55 up to 9.64 points higher for moms that have three or more years of high school. In summary, we have provided a Bayesian analysis for multiple regression using a default reference prior. The quantities for the posterior distributions are easy to obtain from any program that provides OLS estimates. And while the values are the same for the reference Bayesian analysis and OLS, credible intervals and confidence intervals have different interpretations. The credible intervals for the coefficients for work and age both include 0, suggesting that one or both variables could perhaps be dropped from the model. In the next video, we will explore model selection using the Bayes information criterion.