Hi there. Now that you understand how basic collaborative filtering models work, I hope you are ready to move on to more advanced topics. In the following few lessons, you are going to learn about some state-of-the-art dimensionality reduction algorithms through many real-life applications. But first, let's look at why we would need them at all. You have already seen a few examples of simple item-based and user-based collaborative filtering models. These models are easy to implement, allow for intuitive interpretation of recommendations, and provide good accuracy. This also makes them a good baseline for evaluating other recommendation models. Another benefit of the neighborhood-based models worth mentioning is the ability to generate new recommendations almost instantly. However, as we will see soon, this is not a unique property of these models. Finally, there are several challenges which you might face when working with these models. First is scalability. In the worst case, the complexity of the algorithms, both in terms of the required storage and the model precomputation time, depends quadratically on the number of users or items. However, in many real applications the complexity can be significantly decreased due to data sparsity, and it can be improved even further with additional heuristics and incremental techniques. On the other hand, sparse data may lead to another kind of problem known as limited coverage. For example, if two users have rated too few items in common, the correlation measure between these users becomes unreliable. More than that, even if items with similar characteristics are consumed by like-minded users but their audiences never actually intersect, these items will never be recommended together. This results in weak generalization of the methods. Which finally leads us to our new topic, dimensionality reduction, which to some extent helps to mitigate the problems I've described.
Dimensionality reduction allows us to describe any user's preferences and any item's characteristics in terms of a small set of hidden parameters, also called latent features. Along with a compact representation, it also helps to uncover non-trivial patterns within the data and use them to generate meaningful recommendations. There are various ways to perform this task, such as neural networks, Markov decision processes, latent Dirichlet allocation, and some other algorithms. We will focus primarily on the matrix factorization approach, one of the most popular in the field of recommender systems. As you have already seen, interactions between users and items can be represented by the so-called utility matrix. The type of interaction doesn't matter for now. For example, in the case of explicit feedback, it can be the rating value assigned by a user. In the implicit case, it can simply be a binary value representing the fact that the user has somehow interacted with an item: searched for it, viewed its page, purchased it, and so on. The goal of the matrix factorization task is to approximately represent the sparse utility matrix as a product of two other matrices, P and Q. These matrices are typically dense. Each row Pi of the matrix P reflects the preferences of user i described in terms of some latent features. Similarly, each row Qj of the matrix Q describes the association of item j with those latent features. The vectors Pi and Qj are also called embeddings of users and items into the latent feature space. The size of the second dimension of these matrices, P and Q, corresponds to the number of latent features, denoted as r. This number is typically much smaller than the number of items or users. Such a representation of a matrix as a product of two other matrices of smaller sizes is also called a low-rank, or rank-r, approximation.
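To make this concrete, here is a minimal sketch of a rank-r factorization in NumPy. The utility matrix and the use of truncated SVD are my own illustration, not a specific method from this lesson; note that SVD treats missing entries as zeros, whereas real recommender models typically fit P and Q only on the observed entries.

```python
import numpy as np

# Toy utility matrix: 4 users x 5 items, 0 marks a missing interaction.
R = np.array([
    [5, 3, 0, 1, 0],
    [4, 0, 0, 1, 1],
    [1, 1, 0, 5, 4],
    [0, 1, 5, 4, 0],
], dtype=float)

r = 2  # number of latent features, much smaller than the numbers of users and items

# Truncated SVD gives one possible rank-r factorization R ≈ P @ Q.T.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
P = U[:, :r] * np.sqrt(s[:r])     # user embeddings, one row per user
Q = Vt[:r, :].T * np.sqrt(s[:r])  # item embeddings, one row per item

R_hat = P @ Q.T                   # rank-r approximation of the utility matrix
print(P.shape, Q.shape)           # (4, 2) (5, 2)
```

Each row of P and Q is exactly the embedding Pi or Qj described above, and R_hat reconstructs every entry of the utility matrix, including the unobserved ones.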
From here, the utility of any item j for any user i can be estimated simply by the scalar product of their latent representations, according to the matrix multiplication rule. In many cases, you will be interested not in the exact prediction of a utility but rather in a correct ranking of a list of top-n recommended items. The simplest way to do this is to just sort your recommendations according to the predicted score. So, what are those r latent features? Here's an oversimplified but useful example based on movie genres. Imagine that every user's taste is perfectly described by a combination of some genre preferences. Likewise, every movie can be decomposed into a mix of distinct genres. In this perfect scenario, each latent feature of a movie would just correspond to a particular movie genre. In other words, every item vector Qj simply describes the level of association of each genre with the item. In a similar fashion, the user vector Pi would represent the level of interest of user i in every genre. In the real world, human behavior is, of course, much more complicated, and the learned latent representation will not necessarily correspond to any real conceivable features. This, however, is not a problem, as it doesn't prevent us from making predictions. And you can infer relations between any user-item pair, including previously unobserved ones, just by virtue of this scalar product. Also note that once you compute your matrix factorization model, you can apply some of the neighborhood-based algorithms in the new, lower-dimensional latent feature space instead of the original space. This can often produce more meaningful and reliable results compared to the straightforward approach. You can also apply some clustering techniques there, or even feed your latent feature vectors into another machine learning algorithm. Before we proceed, let me summarize what you have learned so far.
You can now explain the advantages of the dimensionality reduction approach over the standard neighborhood-based models. You are now familiar with the concept of latent features and have some intuition behind it. You have seen one particular example of a dimensionality reduction technique called matrix factorization. And you know that it is even possible to build a neighborhood-based model on top of a latent representation.
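As a closing sketch, here is how two of these ideas might look in NumPy: ranking the top-n items for a user by the scalar product, and finding a user's nearest neighbors by cosine similarity in the latent space. The random embeddings here are hypothetical stand-ins for ones learned by an actual factorization model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, r = 4, 5, 2

# Stand-in embeddings; in practice these come from a trained factorization model.
P = rng.normal(size=(n_users, r))  # user embeddings
Q = rng.normal(size=(n_items, r))  # item embeddings

def recommend(i, n=3, seen=()):
    """Rank all items for user i by the scalar product and return the top-n."""
    scores = Q @ P[i]              # predicted utility of every item for user i
    scores[list(seen)] = -np.inf   # exclude items the user already interacted with
    return np.argsort(scores)[::-1][:n]

def similar_users(i, k=2):
    """Nearest neighbors of user i by cosine similarity in the latent space."""
    norms = np.linalg.norm(P, axis=1)
    sims = (P @ P[i]) / (norms * norms[i])
    sims[i] = -np.inf              # exclude the user themselves
    return np.argsort(sims)[::-1][:k]

print(recommend(0, n=3, seen={1}))
print(similar_users(0))
```

The second function is exactly the idea of running a neighborhood-based method in the latent feature space instead of the original one.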