This is an assignment intro video for the item-based collaborative filtering assignment. In this assignment, much as in your user-user assignment, we're going to have you take a spreadsheet based on a ratings matrix and compute various forms of item association, as well as user predictions and recommendations, from that spreadsheet.

The spreadsheet itself looks very much the same as we start. We've got a user-row format for the spreadsheet, but it has one more thing you'll see at the bottom here called L2, which is the Euclidean length of the vector defined by each of these columns: the square root of the sum of the squares of everything in the column. We're going to use that for normalization.

The second tab we have down here is the same spreadsheet but normalized. You can see from the cells in here that if the corresponding cell on the previous sheet was blank, we put in a 0; otherwise, we take what was in the cell minus the average. So we have our distances here. We've normalized, and I should point out column V over here, which is the average: we've normalized by subtracting one average's worth.

We have a blank matrix; that's where you're going to do your work. And we have a filtering matrix, which will be helpful to you later: it just takes what you put in the blank matrix and gets rid of anything that's negative, because we're not interested in negative product associations. You'll notice this blank matrix is movie by movie, and we're going to fill it with our similarity scores.

So let's go back to the assignment sheet. Your core task in this assignment is to compute the item similarity, and you're going to do this twice, creating two versions of that sheet: one using the cosine similarity between items over the raw, un-normalized ratings, and one using cosine over the adjusted, normalized ratings. You're probably going to want to use your SUMPRODUCT function again.
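The two similarity computations described above can be sketched in Python. This is a minimal sketch, not the assignment's actual spreadsheet: the tiny `ratings` matrix below is made up for illustration, and it assumes the adjusted version mean-centers by each user's (row) average, with blanks treated as 0, as on the normalized tab.

```python
import math

# Hypothetical ratings matrix: rows are users, columns are movies.
# None marks a missing rating (a blank spreadsheet cell).
ratings = [
    [5.0, 3.0, None],
    [4.0, None, 2.0],
    [None, 4.0, 5.0],
]

def column(matrix, j, fill=0.0):
    """Extract movie j's column, treating blanks as `fill` (the sheet's 0)."""
    return [row[j] if row[j] is not None else fill for row in matrix]

def l2_norm(vec):
    """Euclidean length: square root of the sum of squares (the L2 row)."""
    return math.sqrt(sum(x * x for x in vec))

def cosine(matrix, i, j):
    """Cosine similarity between movie columns i and j:
    the dot product (SUMPRODUCT) divided by the product of the L2 norms."""
    a, b = column(matrix, i), column(matrix, j)
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))

def mean_center(matrix):
    """Adjusted version: subtract each user's average from their ratings,
    leaving blanks blank (they become 0 when the column is extracted)."""
    out = []
    for row in matrix:
        rated = [x for x in row if x is not None]
        avg = sum(rated) / len(rated)
        out.append([x - avg if x is not None else None for x in row])
    return out

raw_sim = cosine(ratings, 0, 1)               # raw cosine
adj_sim = cosine(mean_center(ratings), 0, 1)  # adjusted (normalized) cosine
```

Note how mean-centering can flip a similarity negative even when the raw cosine is positive, which is exactly the effect of normalization you'll see in your two sheets.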
And you've got the provided L2 norms, the provided vector lengths, that you can use for your denominators. When you're done with this, following the formulas you've gotten in the lectures already, you're going to return two parts as answers in a quiz.

Part one is the top five movies, for Toy Story, in order of similarity. Again, not including Toy Story, because Toy Story is, of course, perfectly similar to itself. Part two is the top five recommended movies for user 5277, using the average of the user's ratings weighted by similarity to come up with a score for every candidate movie.

You do not need to exclude movies that people have already rated. You can consider all movies with non-negative similarities; you don't need to limit the neighborhood size to the top five or the top seven. Frankly, that just makes it harder to compute in a spreadsheet. The reason we've clamped things down and gotten rid of the negative similarities is that restricting the dot product to only the items we want to be relevant is harder to do in a spreadsheet.

There's example data for you to check your work as you go along. We hope you'll find this not only relatively straightforward given what you've learned, but that it gives you a bit more of a feel for how normalization affects the process and the results of computing item-item recommendations. You'll also get a sense of how the size of the item neighborhood for any given item determines how much input really goes into each of the predicted scores that leads to those recommendations.

As always, if you're having trouble as you go along, we encourage you to use the class forums. We're looking forward to seeing you as we move forward, either into the honors track assignment, or, we hope, past the quiz and into course three, where we take a look at metrics.
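The scoring step for part two can be sketched as follows. This is a hedged sketch, not the assignment's spreadsheet formula: the dictionaries and example values are hypothetical, and it assumes a candidate movie's score is the similarity-weighted average of the user's ratings, keeping only positive similarities (the filtered matrix), normalized by the sum of the weights used.

```python
def score(candidate_sims, user_ratings):
    """Predicted score for one candidate movie.

    candidate_sims: {movie: similarity of that movie to the candidate}
    user_ratings:   {movie: this user's rating of that movie}

    Weighted average of the user's ratings, where the weights are the
    candidate's similarities to the rated movies; negative associations
    are dropped, matching the filtered similarity matrix.
    """
    num = den = 0.0
    for movie, rating in user_ratings.items():
        sim = candidate_sims.get(movie, 0.0)
        if sim > 0:  # negative similarities were filtered out
            num += sim * rating
            den += sim
    return num / den if den else 0.0

# Hypothetical usage: the candidate's similarities to movies A, B, C,
# and the user's ratings of those same movies.
sims = {"A": 0.5, "B": 0.25, "C": -0.3}
rated = {"A": 4.0, "B": 2.0, "C": 5.0}
predicted = score(sims, rated)
```

Because every weight is non-negative and the weights are renormalized, the score always lands within the range of the ratings that contributed to it, which is why no neighborhood cap is needed here.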