And so what we just discussed gives the full form of the neighborhood predictor, or at least our full form of the neighborhood predictor. Basically, if we just apply that scheme of taking the neighbor's error and adding it to the baseline prediction, we'll come up with this table as the neighborhood predictor. And I thought it would be good to formalize the discussion on the neighborhood predictor, just so you have a scheme to follow.

You can obviously refer back to the example and go through that table, but let's walk through this really quickly.

For the neighborhood predictor, basically you take a user U and a movie M. The first thing you do is find M's nearest neighbor; say that's movie N. So if it's movie 5, we might have the nearest neighbor being movie 3. Then we see whether U has rated N. So we're just looking in this user's row right now, and we have M here, and N over here maybe. We see first whether or not there's an entry for N that we can use. If U hasn't rated N, then you just keep the baseline prediction where it is; you don't change it at all. But if U has rated N, then you ask yourself the next question: you look at the table of correlations between movies, from M to N, and you see what that correlation value is. If the correlation between M and N is positive, then we add the baseline error for U and N to the baseline prediction for U and M. So basically we take the value that we had from the baseline predictor, and we add whatever error there was for U and N. And if the correlation is negative, we subtract the error that we see for U and N from the baseline prediction for U and M. That's how we change it: we add if it's a positive correlation, and we subtract if it's a negative correlation. But remember that what we're adding or subtracting is the error relative to the original baseline, because we're augmenting the original predictor; it's incremental on that.
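The steps above can be sketched in Python. This is just a sketch of the scheme as described in the lecture; the lookup tables (`ratings`, `baseline`, `correlation`, `nearest_neighbor`) are hypothetical stand-ins for the tables in the example, not names defined anywhere in the course.

```python
def neighborhood_predict(u, m, ratings, baseline, correlation, nearest_neighbor):
    """Adjust the baseline prediction for user u on movie m using the
    baseline error on m's nearest-neighbor movie n."""
    n = nearest_neighbor[m]                 # step 1: find m's nearest neighbor n
    if (u, n) not in ratings:               # step 2: has u rated n?
        return baseline[(u, m)]             # no: keep the baseline prediction
    error_un = ratings[(u, n)] - baseline[(u, n)]  # baseline error for (u, n)
    if correlation[(m, n)] > 0:             # step 3: sign of the m-n correlation
        return baseline[(u, m)] + error_un  # positive: add the neighbor's error
    return baseline[(u, m)] - error_un      # negative: subtract it

# Toy example echoing the lecture: movie 5's nearest neighbor is movie 3.
nearest_neighbor = {5: 3}
correlation = {(5, 3): 0.8}                 # assumed positive correlation
ratings = {("u", 3): 4}                     # user "u" rated movie 3 as 4
baseline = {("u", 5): 3.0, ("u", 3): 3.5}   # assumed baseline predictions
print(neighborhood_predict("u", 5, ratings, baseline, correlation,
                           nearest_neighbor))  # prints 3.5 (= 3.0 + (4 - 3.5))
```

If user "u" had not rated movie 3, the function would fall back to the unchanged baseline value of 3.0, exactly as in the walkthrough.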

So this is a little hidden right now, but for the RMSE, let's try to figure out what it is for the test set and what it is for the training set. We want to see how much better we're doing, now that we've done the entire neighborhood predictor, than what we were doing before.

So again for the test set, we have the actual ratings 4, 2, 5, 3, and 4, and we're comparing those to the predictions 2.8, 1.5, 5, 3.33, and 3.27. We take the sum of the squares again: 4 minus 2.8 squared, plus 2 minus 1.5 squared, plus 5 minus 5 squared, which will just be 0, plus 3 minus 3.33 squared, plus 4 minus 3.27 squared. We divide that by 5, because there are 5 of those terms again, and take the square root. Writing out the terms: we have 1.44, we have 0.25, we get 0, we get 0.11, and we get 0.53. Divide that by 5, take the square root, and we get 0.6826. So this is the RMSE for the test set: 0.6826. And for the training set, we can similarly find the RMSE to be 0.4660.
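The test-set arithmetic above can be checked with a short sketch. One note: computing from the unrounded squared terms gives roughly 0.6829; the lecture's 0.6826 comes from rounding each squared term to two decimal places before summing.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root-mean-square error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

# Test-set values from the lecture: actual ratings vs. neighborhood predictions.
actual = [4, 2, 5, 3, 4]
predicted = [2.8, 1.5, 5, 3.33, 3.27]
print(round(rmse(actual, predicted), 4))  # prints 0.6829
```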

Now if you compare those with the values that we had back with just the simple raw averaging we did in the beginning, this is an improvement of about 33%, and this is an improvement of about 68%. And over just the baseline predictor, it's about a 5 to 10% improvement: the test set is about a 5% improvement, and the training set is about a 10% improvement. If you remember, those values before were like 28 and then 56, or something like that.

And so we are doing better. Again, you won't always do better, but you expect that you will. And in this case, we've seen that we do do better when we add this neighborhood prediction on top of the baseline.