[MUSIC] But a question is, how are we gonna make these recommendations? How are we gonna guess what rating a person would give to a movie that they've never watched? Well, let's imagine for a moment that we have some set of features about each movie and each user. So for example, we know that movie v, which in this case is The Shawshank Redemption, is about some set of different genres, like action, and romance, and drama. And so we have this vector for the movie, which says it's 0.3 about action, it's 0.01 about romance, it's 1.5 about drama. And we have this set of things, so we know what the movie is about. And likewise, for every user, like this user u, we know which of these different genres that user likes. So for this user, we have a vector that says that this user really, really, really likes action, really does not like romance, kind of likes drama, and so on. And we're gonna call this first vector the movie vector, Rv. We're gonna call this user-specific vector Lu. So let's say we know this. Then what would you do to make a prediction of a rating? Well, one thing that might make sense is to take this movie vector and this user vector and see how much they agree. If they agree a lot, then we'd guess that the person would rate that movie very highly. If they don't agree a lot, then we're gonna say that it's very likely that they will not like that movie and will give it a low rating. So we're gonna estimate our rating. That's why we put this hat over it, to denote that this is an estimate of how much the user u is going to like some movie v that they've never seen before. And so the way we're gonna do this is just like when we were measuring similarity between two documents. We're gonna take the two vectors. In that case, that might have been a vector of different topics for the document. In this case, we're talking about a vector of different topics about the movie.
And so here we're gonna take this vector, so 0.3, 0.01, 1.5, and so on, and we're gonna multiply it, element-wise, by this other vector and add up the results. So this is our Rv. This is our Lu. And what we're gonna get out is 0.3 x 2.5 + 0 + 1.5 x 0.8 and so on. And let's just say this ends up being some number like 7.2. Just made that up. But suppose the user vector really disagreed with what the movie was about. So let's choose another color. So this is some other user. Let's call this user's vector Lu-prime. And let's say their vector said they really don't like action, they love romance, they really hate drama, and so on. Well, here the score is gonna be much lower. So we're gonna get 0 + 0.01 x 3.5 + 1.5 x 0.01 + all these other numbers that really don't agree with one another, and maybe this would come out to be some small number like 0.8. So the point here is that when the movie vector and the user vector agree a lot, we'll get a much larger number than when they don't. So we're gonna estimate a much larger rating than in the case where they disagree. And then when we think about making our recommendations, what are we gonna do? Well, we'll just take all the movies that we've predicted ratings for for that user. We'll sort by their predicted rating, and then we'll recommend those with the largest ratings. And I wanna highlight one thing here. So if you remember, the rating scale was between one and five, or rather zero and five, since we could provide no stars if you really hated a movie. But the maximum score was a five, and note here that one of our predictions is 7.2, which is clearly greater than five. So, with this type of model that we're talking about here, we're not restricted. There's nothing enforcing that we're gonna stay within a score of zero to five. But we can still use this to make recommendations, because we just look at the movies with the largest scores, even though those scores aren't necessarily representative of exactly how many stars a movie would get.
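To make the arithmetic concrete, here is a minimal Python sketch of the idea, not a definitive implementation. It assumes only the three genre entries that were spelled out (action, romance, drama), so the scores come out smaller than the made-up 7.2 and 0.8, which included the unstated "and so on" terms. The function names `predict_rating` and `recommend`, and the two extra movies in the catalog, are hypothetical and only there for illustration.

```python
# Assumed feature order: [action, romance, drama]. The remaining genres
# ("and so on" in the lecture) are left out of this sketch.
R_v = [0.3, 0.01, 1.5]        # movie vector for The Shawshank Redemption
L_u = [2.5, 0.0, 0.8]         # user u: loves action, no romance, kind of likes drama
L_u_prime = [0.0, 3.5, 0.01]  # other user: no action, loves romance, dislikes drama


def predict_rating(user_vec, movie_vec):
    """Estimated rating: multiply element-wise, then add up the products."""
    return sum(l * r for l, r in zip(user_vec, movie_vec))


def recommend(user_vec, movie_vecs, k=2):
    """Return the k titles with the largest predicted rating for this user."""
    ranked = sorted(
        movie_vecs,
        key=lambda title: predict_rating(user_vec, movie_vecs[title]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical catalog: only the first movie vector comes from the lecture.
movies = {
    "The Shawshank Redemption": R_v,
    "Hypothetical Action Movie": [2.0, 0.1, 0.2],
    "Hypothetical Romance Movie": [0.1, 1.8, 0.3],
}

score_u = predict_rating(L_u, R_v)              # about 1.95
score_u_prime = predict_rating(L_u_prime, R_v)  # about 0.05
top_for_u = recommend(L_u, movies)

print(score_u, score_u_prime)
print(top_for_u)
```

With these assumed numbers, the hypothetical action movie scores about 5.16 for user u, which is above the five-star maximum. Nothing in the model keeps the score between zero and five, but the ordering is still usable for picking which movies to recommend.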
[MUSIC]