Hi there, we continue our dive into machine learning. In this video, we will meet the basic concepts of machine learning. First we will discuss the mathematical objects and concepts we will operate with. Then we will see how to work with these concepts on examples.

When we speak about a machine learning problem in general, most often we work with the following abstractions. There are some objects for which we need to predict or estimate an unknown value. We call this value an answer or a target. We denote an object by the symbol x and the answer or target by the symbol y. So X is the space of objects and Y is the space of answers or targets. In other words, X is an input space and Y is an output space.

Let me quickly remind you of the age and gender recognition problem from the previous video. In this example the object is a photo of my face and the answer is the age and gender. The set of all photos used for age and gender recognition in this service is X, and the unknown ages and genders are Y. Here the model prediction is fully correct, so the real answer is equal to the model answer y*.

Each object is represented by a vector of features. So an object x is an n-dimensional vector of numerical features that represents the object. Its features can be not only numeric, but anything you can process with the help of a computer. In our age recognition example, the raw pixels of the photo could serve as the features.

The next concept to discuss is a dataset. A dataset is a collection of object and answer pairs, where each object is represented by a vector of features and each answer is the value we should predict for this object based on its features. To solve a machine learning problem in general means to create a model, an algorithm that will give us answers for the objects.
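As a minimal sketch of these definitions, a dataset can be written in Python as a list of (object, answer) pairs, where every object is a feature vector. The feature values and ages below are made up for illustration; they are not taken from the service shown in the video.

```python
# A toy dataset: each object x is a vector of n numerical features,
# and each answer y is the value we want to predict for it.
# All numbers are hypothetical and only illustrate the structure.
dataset = [
    ([0.12, 0.80, 0.33, 0.51], 27),  # (features of photo 1, age)
    ([0.95, 0.10, 0.42, 0.07], 34),  # (features of photo 2, age)
    ([0.30, 0.30, 0.77, 0.64], 19),  # (features of photo 3, age)
]

# Split the pairs into objects (from the input space X)
# and answers (from the output space Y).
objects = [x for x, _ in dataset]
answers = [y for _, y in dataset]

n_features = len(objects[0])
print(f"{len(dataset)} pairs, each object has {n_features} features")
```

In a real age recognition task the feature vector would be much longer, for example one value per pixel of the photo.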
In terms of face recognition, we are looking for an algorithm that will give us the correct age answer based on the given photo object. What we are trying to create while solving a machine learning problem is the algorithm: a function that is able to produce the answer based on the features of the object. Mathematically speaking, we are looking for a mapping from the object space to the answer space. So a is an algorithm, which is a mapping from X to Y, and A is a space of algorithms that belong to the same algorithm family, for instance linear models or decision trees. This algorithm space is also called the hypothesis space. We will discuss the details later on. What you should get for now is the intuition behind model training.

Basically, we are looking for the exact algorithm a from the family of algorithms A. In an ideal scenario, this algorithm should give us the correct answer for each object. But in practice, this is hardly possible. Very likely our algorithm will make mistakes on some objects. From this point of view, we are working with an optimization task. We know that we will probably not find an ideal algorithm, but most likely we can find the optimal one: an algorithm that makes as few mistakes as possible. In other words, an algorithm that has minimal loss.

How do we figure out whether an algorithm is good enough for our problem? First of all, we have to measure it numerically, and for this purpose we use the loss function. The loss function is the function we use for measuring the loss of algorithm a on the dataset X. The more mistakes our algorithm makes, the bigger the loss. So the idea behind the loss function is to use it to measure algorithm quality. For example, as you saw in the age recognition problem, our algorithm sometimes makes mistakes, like you can see in the example. Here we use the loss to measure how correct the algorithm is.
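The idea of measuring an algorithm by its loss, and then picking the algorithm with minimal loss from a family, can be sketched in Python as follows. This is only an illustration under assumptions: the loss here is the mean absolute error, which is one common choice and not necessarily the one used in this course, and both the tiny dataset and the family of "feature plus offset" algorithms are hypothetical.

```python
# Hypothetical dataset of (feature vector, true age) pairs. The single
# feature stands in for some "apparent age" score extracted from a photo.
dataset = [([25.0], 27), ([30.0], 33), ([41.0], 42), ([18.0], 20)]


def make_algorithm(offset):
    """Return one algorithm a from the family A: a(x) = x[0] + offset."""
    return lambda x: x[0] + offset


def loss(algorithm, data):
    """Mean absolute error of the algorithm on the data (an assumed choice)."""
    return sum(abs(algorithm(x) - y) for x, y in data) / len(data)


# The family A: a handful of candidate algorithms, one per offset.
family = {offset: make_algorithm(offset) for offset in [0.0, 1.0, 2.0, 3.0]}

# Measure every candidate numerically and keep the one with minimal loss.
losses = {offset: loss(a, dataset) for offset, a in family.items()}
best_offset = min(losses, key=losses.get)

print(losses)
print("best offset:", best_offset)
```

Even the best candidate here still makes mistakes on some objects, which matches the point above: we look for the optimal algorithm within the family, not an ideal one.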
So in the first photo I am 24 years old instead of 27, which is nice, but it's still a mistake. In the second I am 29 instead of 27, which is quite sad, but the mistake, or loss, is smaller.

Thus we have learned the basic concepts. Now it's time to discuss what we need to find the optimal algorithm. How exactly do we train machine learning models? This is pretty simple: basically all we need to do is choose the exact algorithm from the family of algorithms. Actually there is a small logical trick here, because to choose the exact algorithm from a chosen family, we first have to choose that family. But let us postpone this question for now. Let us assume someone gave us a good family of algorithms or, even easier, that we know only one family of algorithms, so we do not need to make any choice.

Okay, how do we choose this algorithm? We use the loss on our dataset as the selection criterion. The lower the loss, the better our algorithm is. So the logic behind model training is the following. First you have to prepare a dataset X, which is a set of object and answer pairs, where each object is represented by a vector of features. Then you have to choose the family of algorithms A and minimize the loss function Q to get the optimal algorithm from the family A. For loss minimization you can use any suitable optimization technique you like.

Now you have got the idea of machine learning problems, the basic concepts, and the model training logic. Let's try to apply it to specific tasks. We start with the credit scoring problem, where a bank should decide whether to give credit to a client. For this problem, we can take the client as an object. As the answer, we can use a binary value: 0 if the client is reliable enough and the bank should give them a loan, and 1 if the bank shouldn't. In this case, reliable means that the client will pay the loan back on time. So what is the machine learning problem here?
We have to find the algorithm that will predict whether the client is reliable or not, for each client, based on the client's feature representation. It can be a combination of age, gender, profession, education, loan history, and many more. Based on the values of these features, the algorithm should give us the binary answer, 0 or 1, for each customer.

Our next machine learning task to discuss is handwritten digit recognition. This is a very popular task; the idea is to recognize digits that were written by hand. There are some open datasets with pictures of handwritten digits, and here you have an example of one of them. Each picture is represented by 64 pixels, that is, an 8x8 pixel matrix. The objects here are pictures of the original digits written by someone. The features are all the pixels of each picture, and the answer is a digit from 0 to 9. Our model should be able to recognize the original digit from its picture.

Moving on, the next example is an exchange rate forecast. I bet many of us would be happy to know beforehand what the exchange rate will be for any pair of currencies in the future. Machine learning can help with this to some extent. In this case, the object will be a pair of currencies, for example euro and dollar. As features we can use data about previous exchange rates, for example for each day of the previous year. This kind of data is called a time series. The answer will be the exchange rate for the pair of currencies for a specific period of time in the future. Here the model should be able to predict the future exchange rate for a given pair of currencies.

Query-based search is also an example of a machine learning problem. Here, as an object, you can use a query and document pair. As an answer, we use the rank of the document for the given query. Here we can use any data regarding the document or the query as the features.
For example, the query language, the document language, the words present in the document, the query words, and many more. The model should estimate a reasonable rank for each document and query pair. By reasonable here, we mean that better documents should have a higher rank. Thus, the order is much more important than the exact rank.

It turns out that all the problems you have just seen are supervised learning problems. Supervised learning is a class of machine learning problems in which there is a subset of objects for which you know the right answer. We call such a subset of objects a training set. We will discuss it in detail in the next videos.

Credit scoring is an example of a binary classification problem. Here we have objects from two classes only, reliable and unreliable customers, and the model should choose one of the two classes as an answer. Digit recognition is an example of a multi-class classification problem. Here we have more than two classes of objects, and for each object the model will choose 1 of 10 classes. The exchange rate forecast is an example of a regression problem. In this case the model should predict not one class from a limited set of classes but a real number, which is why it is a regression problem. And document ranking is an example of a ranking problem. Here we also do not have any classes, but the answer is not simply a real number either. You should find the optimal rank for each object, since the order of the objects sorted by rank is much more important than the exact rank.

You have done a great job. In this video you have learned the basic concepts of machine learning. You have learned what we call objects and answers, features and datasets, algorithms and loss functions. You have also got the intuition about how to train models and met the concept of supervised learning. In the next video, you'll learn the standard types of machine learning problems.
We will also go deeper into supervised learning, which is a big part of machine learning. See you there.
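As a recap of the ranking problem from this video, here is a small Python sketch of why the order matters more than the exact rank values. The document names and scores are hypothetical; two models that output very different scores still give the same ranking as long as the order of their scores agrees.

```python
# Hypothetical relevance scores from two models for the same query.
scores_model_a = {"doc1": 0.9, "doc2": 0.2, "doc3": 0.5}
scores_model_b = {"doc1": 12.0, "doc2": 1.5, "doc3": 7.0}


def ranking(scores):
    """Return document ids sorted from the highest score to the lowest."""
    return sorted(scores, key=scores.get, reverse=True)


order_a = ranking(scores_model_a)
order_b = ranking(scores_model_b)

print(order_a)
print("same order:", order_a == order_b)
```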