ANOVA is used when we have categorical explanatory variables, so that the observations belong to groups. It stands for analysis of variance, where we compare the variability of responses within groups to the variability of responses between groups. If the variability between groups is large relative to the variability within the groups, we conclude that there is a grouping effect. Categorical explanatory variables are often referred to as factor variables. Factors can have two categories, or levels, for example low and high, or true and false. Or they can have many levels. They often result from experiments where we carefully assign values of explanatory variables to randomly selected subjects and look for a cause-and-effect relationship. There could be one or many factors in an experiment. For example, if we are going to conduct an online marketing experiment, we might experiment with two factors of website design. Let's say these factors are sound and font size. We might use two levels for sound, for example having music or having no music. And for the font size factor, we might use something like small, medium, and large. This two-by-three design would result in a total of six treatment combinations. For example, we could do a website with music and a small font, or no music and a large font. There are six possible ways to do that. Let's write down the model with one explanatory variable that has capital G levels. That would be y_i, given g_i (a variable that tells us which group individual i belongs to), all of the means mu for each group, and sigma squared, independently distributed normal, where the mean is mu_{g_i}, the mu that corresponds to the group of subject i, and, of course, the variance is sigma squared. This g_i variable is just a group indicator, so it is a member of the set of numbers from 1 up to G. So if g_i is 2 for individual i, that means that y_i is a member of group 2.
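As a rough illustration of the one-factor model just described, the sketch below simulates responses y_i from a normal distribution with mean mu_{g_i} and a common variance, then compares between-group and within-group variability. The number of groups, the group means, sigma, the sample size, and the seed are all hypothetical values chosen for the example. The F ratio computed at the end is the classical way to summarize the comparison; it is used here only as one convenient summary, not as the lecture's method.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical settings (not from the lecture)
G = 3                              # number of groups
mu = np.array([1.0, 2.0, 4.0])     # group means mu_1, ..., mu_G
sigma = 1.0                        # common standard deviation
n = 300                            # number of observations

# Group indicator g_i in {1, ..., G} and responses y_i ~ N(mu_{g_i}, sigma^2)
g = rng.integers(1, G + 1, size=n)
y = rng.normal(loc=mu[g - 1], scale=sigma)

grand_mean = y.mean()
group_means = np.array([y[g == k].mean() for k in range(1, G + 1)])
group_sizes = np.array([(g == k).sum() for k in range(1, G + 1)])

# Variability between groups: group means around the grand mean
ss_between = float(np.sum(group_sizes * (group_means - grand_mean) ** 2))

# Variability within groups: observations around their own group mean
ss_within = float(
    sum(np.sum((y[g == k] - group_means[k - 1]) ** 2) for k in range(1, G + 1))
)

# Total variability, which splits exactly into the two pieces above
ss_total = float(np.sum((y - grand_mean) ** 2))

# Mean squares and their ratio (classical F statistic)
ms_between = ss_between / (G - 1)
ms_within = ss_within / (n - G)
f_ratio = ms_between / ms_within

print("between-group mean square:", ms_between)
print("within-group mean square: ", ms_within)
print("ratio:", f_ratio)
```

A large ratio suggests a grouping effect. Setting all entries of `mu` to the same value should bring the ratio close to 1 on average.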
In that model, the observations are indexed by i from 1 up to n. Alternatively, we can write the mean in terms of an intercept representing a baseline group, and then effects for each of the G minus 1 other groups. So an alternative model formulation would be to set the expected value E(y_i) = beta_0 + beta_1 x_{1i} + ... + beta_{G-1} x_{G-1,i}, like we had before in linear regression. Here the x variables are what are called dummy variables. The first one, x_{1i}, is an indicator that the group of individual i is 1. The last one, x_{G-1,i}, is an indicator that the group of individual i is G minus 1. That is, at most one of the x's will be turned on, or have a value of 1. Otherwise, if we're in group capital G, the last group, the only parameter active is beta_0, so beta naught corresponds to the mean of group G. Again, these x's are dummy variables: x_{ki} indicates the membership of observation i in group k. This formulation shows how ANOVA is related to linear regression. This formulation is also the default in R. These two formulations for the mean are equivalent, and both of them have capital G parameters for the mean. If we had more than capital G parameters, the parameters would no longer be uniquely identified, since infinitely many combinations of parameter values would result in the same mean. We must be careful not to overparameterize. Again, the assumption of constant variance across all groups might be too restrictive for your data. If that's the case, you might also consider having multiple sigma squared parameters for different groups. We're now going to consider the model where we have two factors. Let's call the factors A and B. They might correspond to sound being A and font size being B, from the website design experiment that we discussed earlier. One option for a model is analogous to the first model we used for one factor, where we have a different mean for each treatment combination.
This is called the cell means model. So factor A, which in this case is sound, has two levels, and factor B, which is font size, has three levels, representing small, medium, and large. In this model formulation, we need a different mean for each one of these combinations, so mu_{1,1} would represent the mean for the group with music and small font size. We can fill in the rest of these, from mu_{1,2} up to mu_{2,3}. So each of these means corresponds to a treatment group. The model itself is written the same as before, except that mu now has two indexes, one for each factor. With six treatment groups, we can have at most six parameters describing the means. However, we might get away with fewer parameters. For example, we can use the following additive model. This model, which has only four parameters instead of six, would be appropriate if we believe there are no interactions. Let's write this model out really quickly. The expected value for y_i would be equal to a baseline mean mu, plus alpha_2 times an indicator that a_i, the level of factor A for individual i, is 2. In other words, this is an indicator that individual i is in level 2 of factor A. Then we add beta_2 times an indicator that b_i, the level of factor B for individual i, is equal to 2, plus beta_3 times an indicator that b_i = 3. That is, E(y_i) = mu + alpha_2 I(a_i = 2) + beta_2 I(b_i = 2) + beta_3 I(b_i = 3). The mu here represents the mean for the group at level 1 of both factors. And then, if they are in a different group, for example level 2 of factor A, we would add alpha_2, which represents the effect of going from level 1 to level 2 of factor A. The same thing goes for factor B. If we want to move from level 1 of factor B up to level 2, we would add the coefficient beta_2. It's an adjustment to the mean. I mentioned earlier that this additive model would be appropriate if we had no interaction between the variables.
In other words, it is appropriate if we believe that going from level 1 to level 2 of factor A has the same effect on the mean regardless of whether we're in level 1, 2, or 3 of factor B. If the effect of factor A on the response changes between levels of factor B, then we would need more parameters to describe how that mean changes, for example by using the cell means model. This phenomenon is called interaction between the factors. In the coming segments, we will look at an example of a one-way ANOVA model. We'll work with the two-way ANOVA in the honors section.
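To make the parameter counting for the two-by-three design concrete, here is a minimal sketch that builds the design matrices for the additive model (four mean parameters) and the cell means model (six), plus a deliberately overparameterized version. It then checks which sets of cell means each model can reproduce. The function names and the numeric cell means are hypothetical, made up for illustration, and the code works directly with the six cell means rather than with noisy data.

```python
import itertools
import numpy as np

levels_a = [1, 2]        # factor A (e.g. sound): two levels
levels_b = [1, 2, 3]     # factor B (e.g. font size): three levels
cells = list(itertools.product(levels_a, levels_b))  # six treatment combinations

def additive_design(cells):
    """Columns: baseline mu, I(a = 2), I(b = 2), I(b = 3)."""
    return np.array([[1.0, a == 2, b == 2, b == 3] for a, b in cells], dtype=float)

def cell_means_design(cells):
    """One indicator column per treatment combination."""
    return np.eye(len(cells))

def overparameterized_design(cells):
    """Intercept plus an indicator for every level of both factors."""
    return np.array(
        [[1.0, a == 1, a == 2, b == 1, b == 2, b == 3] for a, b in cells],
        dtype=float,
    )

def max_abs_residual(X, means):
    """Largest gap between the cell means and their best least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, means, rcond=None)
    return float(np.max(np.abs(X @ coef - means)))

# Hypothetical cell means, in the order (1,1), (1,2), (1,3), (2,1), (2,2), (2,3)
no_interaction = np.array([10.0, 12.0, 15.0, 11.0, 13.0, 16.0])    # A adds +1 at every level of B
with_interaction = np.array([10.0, 12.0, 15.0, 11.0, 15.0, 14.0])  # effect of A depends on B

X_add = additive_design(cells)
X_cell = cell_means_design(cells)
X_over = overparameterized_design(cells)

# The rank is the number of mean parameters that can be uniquely identified
print("additive rank:         ", np.linalg.matrix_rank(X_add))
print("cell means rank:       ", np.linalg.matrix_rank(X_cell))
print("overparameterized rank:", np.linalg.matrix_rank(X_over), "with", X_over.shape[1], "columns")

print("additive, no interaction:    ", max_abs_residual(X_add, no_interaction))
print("additive, with interaction:  ", max_abs_residual(X_add, with_interaction))
print("cell means, with interaction:", max_abs_residual(X_cell, with_interaction))
```

The additive design should reproduce the means exactly only when the effect of factor A is the same at every level of factor B. The overparameterized design has more columns than its rank, which is the identifiability problem that comes from using too many parameters.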