My goal with this video is to show you where the predictive model we calculate using computer software comes from. This predictive model is called a least-squares model, and these models are widely used in companies. You've certainly seen them if you've taken a basic math or statistics class. Watch this video quickly, even if you already understand least squares. If you have limited experience with least squares, though, take a moment to look at the extra resources we've posted for you. We certainly want to give you as much help as we can.

Now in the videos in the prior module, we were looking at popcorn, and I'm going to use that example again in this class. In the popcorn experiment, our objective was to maximize the amount of popcorn created. Our outcome variable was the number of popped kernels. Here is the cube plot, and the corresponding predictive model that we created. The predictive model has four parameters: 67, 10, 4, and -1. "67" is the baseline amount, the average of all four experimental outcomes. We also refer to that as the intercept, and you'll see why in a minute. "10" is the effect of factor A, the cooking time. This is what we call the main effect for factor A. "4" is the effect of factor B, the kind of popcorn we used. And lastly, the "-1" is the two-factor interaction term. Do you recall how we calculated these numbers by hand? Go back to the videos in the previous module if you are not sure.

The most general form of the least-squares model for this system is y equals b_0, plus b_A times x_A, plus b_B times x_B, plus b_{AB} times x_A times x_B. The x_A is the coded value for factor A, and it represents the amount of cooking time. If x_A = -1, that represents 160 seconds of cooking time, and x_A = +1 represents 200 seconds of cooking time. The "-1" and "+1" are called coded units, and the 160 seconds and 200 seconds are called real-world units. Note that we cannot use real-world units in this equation, only the coded units. Similarly for x_B.
It is coded so that "-1" represents white corn and "+1" represents yellow corn. Similar to the x_A case, the -1 and +1 are the coded units, while white corn and yellow corn are the real-world units. Recall that with categorical variables we assigned the -1 and +1 arbitrarily. Which category gets which sign will not change the model's interpretation.

Now take a look at what happens if I write that equation down for each of the four experimental points in the system. We can substitute values for the coded units into this prediction equation. For the first experiment, for example, we would have y_1 equals b_0, plus b_A times x_{A-}, plus b_B times x_{B-}, plus b_{AB} times x_{A-} times x_{B-}. That's because x_A is at the minus level, and x_B is at the minus level, for the first experiment. We can repeat this process for the other three points in the cube, as shown here on the screen. Now let's substitute in -1 or +1 for the factors A and B, and we will get four equations. Notice that the 4 equations have 4 unknown parameters: b_0, b_A, b_B, and b_{AB}. If you have some mathematical background, you will recall that four equations with four unknowns represent a set of equations that we can solve. These equations are linear, and so they're very efficiently solved using matrix methods. Let me show you how.

In matrix form, the equations are written as shown here on the screen. Three things quickly become apparent. Firstly, we notice a column of 1's in the first column. That corresponds to this parameter: b_0, the intercept. Next, we notice that the second and third columns, in other words the columns that correspond to the parameters for A and B, are simply the columns from the standard order table. And finally, the last column corresponds to the two-factor interaction for AB. You'll notice that this is simply the column for A multiplied by the column for B. This comes from: minus times minus is plus; plus times minus is minus.
Minus times plus is minus; and finally, plus times plus is plus. This entire set of equations can be written as vector "y" equals matrix "X" times vector "b". Those of you with some background in least squares will realize that the solution to this set of equations is b = (X^T X)^{-1} multiplied by (X^T y). If you don't have that experience, don't worry. The computer software will solve these equations very efficiently for us. That's what computers are good for.

All we require is the "X" matrix and the "y" vector. And we have these: the "X" matrix is assembled from the standard order table, and the "y" vector is simply the four experimental outcomes. The software will calculate these four parameters, in other words, the four entries in the vector "b". Those correspond to b_0, the intercept, then b_A, b_B, and b_{AB} for the two-factor interaction. So now we are ready to use the computer software. Please watch the next video to see how those 4 parameters are calculated.
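
If you would like to see the matrix calculation spelled out before moving on, here is a minimal sketch in plain Python of solving b = (X^T X)^{-1} (X^T y) for this system. It is only an illustration and not the course software. Two things in it are assumptions. The helper names (`transpose`, `matmul`, `solve`) are hypothetical, made up for this sketch. And the four outcomes in `y` are not quoted in this video: they are the values implied, in standard order, by the model parameters 67, 10, 4, and -1 stated earlier.

```python
# Sketch only: least-squares solution b = (X^T X)^{-1} X^T y for the
# two-factor popcorn system, using nothing but the standard library.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def solve(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

# Standard order table in coded units: (x_A, x_B) for the four experiments.
standard_order = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# Columns of X: intercept (all 1's), A, B, and the AB interaction (A times B).
X = [[1, xa, xb, xa * xb] for xa, xb in standard_order]

# ASSUMPTION: outcomes implied by the stated model, not quoted in this video.
y = [52, 74, 62, 80]

Xt = transpose(X)
XtX = matmul(Xt, X)                                           # X^T X
Xty = [sum(xi * yi for xi, yi in zip(row, y)) for row in Xt]  # X^T y

b = solve(XtX, Xty)  # entries are b_0, b_A, b_B, b_AB
print(b)
```

Under that assumption about `y`, the four entries of `b` come out as 67, 10, 4, and -1, matching the parameters from the cube plot. Notice also that X^T X is just 4 times the identity matrix here, because the columns of X are orthogonal to each other. That is why each parameter could be calculated by hand as a simple signed average of the four outcomes.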