[MUSIC] In this video, we will derive the formulas for training a Gaussian process. So here's our setup again. We have the probability of a new point given all previous points, and it equals the ratio between the joint probability and the marginal probability. So here in the denominator we have the marginal probability of the known points, and in the numerator the joint probability of the known points and the unknown point. Let me denote the unknown point as f star, and the vector of known points as f. As we saw in the previous video, we'll have the ratio of two normal distributions. The one in the numerator is over the vector (f star, f), with mean 0 and a covariance matrix with the following blocks: K(0) in the top-left corner, k transpose and k in the off-diagonal blocks, and C in the bottom-right block. The one in the denominator is a normal distribution over f, with mean 0 and covariance matrix C. All right, let's write them down. The ratio is proportional to the exponent of minus one-half times the vector (f star, f) transposed, times the inverse of the joint covariance matrix, times the vector (f star, f) again. So this is the term from the numerator, and from the denominator we also get a term: plus one-half f transposed C inverse f. All right, now let's see what this equals. We'll have the exponent of minus one-half times the following thing. We need the inverse of the joint covariance matrix, so let's write it down as some arbitrary block matrix with components d, b transpose, b, A. This is just the inverse of the covariance matrix, but we don't know its components yet. So we can plug it in here, and we'll have d times f star squared, plus 2 times b transposed f times f star, plus some terms that don't depend on f star, which I'll just write down as a constant. All right, now we have to complete the square. Let's take the term d out of the brackets, so the exponent becomes minus 1 over 2d to the power minus 1, times the following square.
We'll have f star plus b transposed f over d, squared, and times some multiplicative constant, which I'll just write down as a proportionality. Okay, so from this we can easily see that the mean value is minus b transposed f over d. So mu equals minus b transposed f over d. And the variance is simply d inverse. So those are our formulas, and the result would be a normal distribution over f star, with the mean mu from here, and the variance sigma squared. All right, so we haven't finished yet, since we have d and b here and we don't know their values yet. Those are just the components of the inverse of the joint covariance matrix, so let's derive them. Okay, the simplest way to derive them is to remember that, since one is the inverse of the other, their product is the identity matrix. So we have the matrix with blocks K(0), k transpose, k, C, times its inverse with blocks d, b transpose, b, A, and this should equal the identity matrix: a scalar 1 here, two vectors of 0 here, and an identity matrix here. All right, so actually we have many equations here. We can multiply these matrices explicitly, and we'll have K(0) times d plus k transpose b equals 1; K(0) b transpose plus k transpose A equals 0; and two more, kd plus Cb equals 0, and finally k b transpose plus CA equals the identity matrix. All right, we're interested in the values of b and d, and we actually don't have to find A. So out of those four equations we need the two that contain both d and b: K(0) d plus k transpose b equals 1, and kd plus Cb equals 0. And we can notice that the number of equations equals the number of unknown parameters, so we can hope that we'll be able to solve it. So let's start from the second of these equations: from kd plus Cb equals 0 we can see that b equals minus C inverse k d. We can plug in b from this formula into the first equation.
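The block equations above are easy to check numerically. Here is a minimal sketch, assuming a hypothetical RBF kernel and made-up inputs (none of these values come from the lecture): build the joint covariance matrix, invert it, read off d and b, and verify the two equations we actually use, K(0) d + k transpose b = 1 and kd + Cb = 0.

```python
import numpy as np

# Assumed RBF kernel; any positive-definite kernel would work here.
def k(a, b):
    return np.exp(-0.5 * (a - b) ** 2)

x = np.array([0.0, 1.0, 2.0])   # hypothetical known inputs
x_star = 1.5                    # hypothetical new input

C = k(x[:, None], x[None, :])   # covariance of the known points
kvec = k(x, x_star)             # covariances between known points and f star
K0 = k(x_star, x_star)          # prior variance at the new point

# Joint covariance with blocks [[K(0), k^T], [k, C]], and its inverse
M = np.block([[np.array([[K0]]), kvec[None, :]],
              [kvec[:, None], C]])
Minv = np.linalg.inv(M)

d = Minv[0, 0]   # scalar block of the inverse
b = Minv[1:, 0]  # vector block of the inverse

# The two equations from the derivation:
#   k d + C b = 0          (so b = -C^{-1} k d)
#   K(0) d + k^T b = 1
print(np.allclose(kvec * d + C @ b, 0))    # True
print(np.isclose(K0 * d + kvec @ b, 1.0))  # True
```

Reading d and b off the numerically inverted matrix is only a check on the algebra; the point of the derivation is that we never need to invert the full joint matrix.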
And then we'll have K(0) times d minus k transpose C inverse k d equals 1. So from this formula we can see the value for d: d equals 1 over K(0) minus k transpose C inverse k. The final step is to plug in this value of d into the formula for b. So we'll have b equals minus C inverse k over K(0) minus k transpose C inverse k. Now we need to plug d and b into the formulas for the mean and the variance. The variance would be the inverse of d, so it is K(0) minus k transpose C inverse k. And the mean, let me write it down more carefully, would be minus b transposed f over d. Minus b transposed is simply k transpose C inverse over K(0) minus k transpose C inverse k, since C is symmetric. We multiply it by f, and dividing by d means multiplying by K(0) minus k transpose C inverse k. And here we have two similar terms, which we can cancel out. And so, finally, we have that the mean value equals k transpose C inverse f. So those are our final formulas: the variance sigma squared equals K(0) minus k transpose C inverse k, and the mean equals k transpose C inverse f. [MUSIC]
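The final formulas can be sketched in code as well. This is a sanity check, not the lecture's material: the kernel and data points below are made up. We compute the mean k transpose C inverse f and the variance K(0) minus k transpose C inverse k directly, then cross-check them against mu = minus b transposed f over d and sigma squared = 1 over d, with d and b read off the inverse of the joint covariance matrix.

```python
import numpy as np

# Assumed RBF kernel and made-up training data.
def k(a, b):
    return np.exp(-0.5 * (a - b) ** 2)

x = np.array([0.0, 1.0, 2.0])    # hypothetical known inputs
f = np.array([0.5, -0.3, 0.8])   # hypothetical observed values
x_star = 1.5                     # hypothetical new input

C = k(x[:, None], x[None, :])
kvec = k(x, x_star)
K0 = k(x_star, x_star)

# Final formulas: solve with C instead of forming C^{-1} explicitly.
Cinv_k = np.linalg.solve(C, kvec)
Cinv_f = np.linalg.solve(C, f)
mu = kvec @ Cinv_f               # mean:     k^T C^{-1} f
sigma2 = K0 - kvec @ Cinv_k      # variance: K(0) - k^T C^{-1} k

# Cross-check against mu = -b^T f / d and sigma^2 = 1/d
# from the inverse of the joint covariance matrix.
M = np.block([[np.array([[K0]]), kvec[None, :]],
              [kvec[:, None], C]])
Minv = np.linalg.inv(M)
d, b = Minv[0, 0], Minv[1:, 0]
print(np.isclose(mu, -b @ f / d))   # True
print(np.isclose(sigma2, 1.0 / d))  # True
```

Using `np.linalg.solve` rather than inverting C is the usual numerically safer choice, and both checks agree because 1 over d is exactly the Schur complement K(0) minus k transpose C inverse k.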