Okay, we've arrived at the finale of the lecture. We're going to answer the question: what does Hebbian learning do, anyway?

We're going to start with the averaged Hebb rule. So, as you recall, the averaged Hebb rule is given by this differential equation, where Q is the input correlation matrix.
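
As a quick aside (not from the slides): the averaged Hebb rule dw/dt = Q w can be simulated in a few lines of NumPy. The 2x2 matrix Q below is a made-up toy correlation matrix, chosen only for illustration:

```python
import numpy as np

# Hypothetical 2x2 input correlation matrix (symmetric, as Q must be).
Q = np.array([[2.0, 1.0],
              [1.0, 2.0]])

w = np.array([1.0, 0.0])   # initial weight vector w(0)
dt = 0.01                  # Euler time step

# Euler-integrate dw/dt = Q w: each step nudges w in the direction Q w.
for _ in range(500):
    w = w + dt * Q @ w
```

One thing this sketch makes obvious is that under the plain Hebb rule the length of w grows without bound, which is part of the motivation for normalizing variants like Oja's rule later on.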

And what we would like to do is solve this differential equation to find w(t). So what is w as a function of time when it's being changed according to this differential equation? So how do we solve this equation?

Any ideas? Well, if you guessed eigenvectors, you would be right. We can always rely on our dear friends, the eigenvectors.

So, as before, let's write our vector w(t) in terms of the eigenvectors of the correlation matrix.

Now recall that the input correlation matrix is a real and symmetric matrix, which means that its eigenvectors are orthonormal, so we can write any vector, including the vector w(t), as a linear combination of the eigenvectors.

Now if we substitute our expression for w(t) into the differential equation for the averaged Hebb rule, then we can simplify as before, and we get this differential equation for the coefficients.
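
In code, these steps can be sketched as follows (again with a made-up toy Q, not the one on the slide): eigh returns the orthonormal eigenvectors of the symmetric matrix, each coefficient obeys dci/dt = lambda_i ci, and solving that gives ci(t) = ci(0) exp(lambda_i t):

```python
import numpy as np

# Hypothetical symmetric correlation matrix and initial weights.
Q = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w0 = np.array([1.0, 0.0])

# Orthonormal eigenvectors (columns of E) and eigenvalues of Q.
lam, E = np.linalg.eigh(Q)

# Coefficients of w(0) in the eigenvector basis: ci(0) = ei . w(0).
c0 = E.T @ w0

# Each coefficient obeys dci/dt = lambda_i ci, so ci(t) = ci(0) exp(lambda_i t),
# giving the weight vector w(t) = sum_i ci(t) ei.
t = 1.0
w_t = E @ (c0 * np.exp(lam * t))
```

Integrating the differential equation directly with small Euler steps gives the same w(t), which is a handy check that the closed form is right.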

And when we solve the differential equation for one of the coefficients, let's say ci, then we have this solution. And when we substitute this solution into our expression for w(t), we get this solution for the weight vector as a function of time.

So, what is this equation telling us

about the synaptic weight vector w as a function of time?

It's telling us that the synaptic weight vector w is a linear combination of the

eigenvectors of the input correlation matrix.

And furthermore, it's telling us that the coefficients for these eigenvectors have

terms that are exponentially dependent on the eigenvalues of the correlation

matrix. So what do you think will happen to w as

time goes on? So when t becomes very large, what do you think will happen to w?

When t becomes large, the term with the largest eigenvalue (let's say lambda 1 is the largest eigenvalue) dominates this linear combination. So what we get, then, is the result that the weight vector turns out to be proportional to the first eigenvector, or principal eigenvector, of the input correlation matrix.

And furthermore, if we're using Oja's rule, as you know, the length of the weight vector then converges to 1 over the square root of alpha.
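
This convergence can be checked numerically. Here is a sketch in which the input distribution, alpha, and the learning rate are all made-up toy values: Oja's rule updates w by eta (v u - alpha v^2 w), with output v = w . u.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.0    # Oja normalization constant (toy value)
eta = 0.005    # learning rate (toy value)

# Zero-mean toy inputs whose maximum variance lies along [1, 1].
A = np.array([[1.5, 1.0],
              [1.0, 1.5]])
U = rng.normal(size=(5000, 2)) @ A

w = np.array([0.5, 0.1])   # initial weight vector
for u in U:
    v = w @ u                                  # output v = w . u
    w = w + eta * (v * u - alpha * v * v * w)  # Oja's rule update

length = np.linalg.norm(w)   # should approach 1 / sqrt(alpha)
```

With these toy inputs, the length settles near 1/sqrt(2), and the direction lines up with [1, 1]/sqrt(2), the principal eigenvector of the input correlation matrix.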

So in that case, the weight vector approaches the value e1 divided by the square root of alpha. We've actually shown something very

exciting. We've shown that the brain can actually

do statistics, and that's in addition to what we showed last week, which was that

the brain can do calculus. There seems to be no stopping the brain.

Well, let's look at why we think the brain does statistics.

So it turns out the Hebbian learning rule that we just analyzed implements the same

thing as the statistical technique of principal component analysis or PCA.

So to understand what principal component analysis is all about let's look at a

simple example. So here is some two dimensional data.

We have these points, which represent the values u1 and u2, which comprise the input vector u. And if we start the Hebb rule with an

initial weight vector that's given by this dashed line, then the Hebb rule

rotates this initial weight vector to align itself with the direction of

maximum variance. So here is the cloud of data, and the final weight vector is going to be parallel to this line, which is the direction of maximum variance.
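
A small numerical sketch of this claim, with a synthetic data cloud invented for illustration: for zero-mean data, the direction of maximum variance is the principal eigenvector of Q = <u u^T>, which is exactly the direction the averaged Hebb rule converges to.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic zero-mean cloud: std 3 along [1, 1], std 0.5 across it.
e_long = np.array([1.0, 1.0]) / np.sqrt(2)
e_short = np.array([1.0, -1.0]) / np.sqrt(2)
U = (rng.normal(scale=3.0, size=(2000, 1)) * e_long
     + rng.normal(scale=0.5, size=(2000, 1)) * e_short)

# Correlation matrix Q = <u u^T>; its principal eigenvector is the
# direction the averaged Hebb rule rotates the weight vector toward.
Q = U.T @ U / len(U)
lam, E = np.linalg.eigh(Q)
principal = E[:, np.argmax(lam)]   # aligns (up to sign) with e_long
```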

Now, when we apply the Hebb rule to data that has been shifted, so the data from here is moved to a different location, let's say to an input mean of two and two, we find that the Hebb rule does not do what we want it to do: it picks this direction, going through the origin of the two-dimensional plot, as the direction of maximum variance.

And that is really not the direction of maximum variance; the direction of maximum variance is given again by this direction. But luckily, when we apply the covariance rule, we find that it does indeed find the direction of maximum variance.

So the covariance rule takes care of the fact that the input mean is no longer (0, 0) but (2, 2). The covariance-based Hebb rule is thus able to find, once again, the direction of maximum variance.
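
A numerical sketch of the difference (synthetic data, invented for illustration): the plain Hebb rule grows w along the top eigenvector of the correlation matrix Q = <u u^T>, which a nonzero mean drags toward the mean direction, while the mean-subtracting covariance rule uses C = <(u - <u>)(u - <u>)^T> and recovers the true direction of maximum variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic cloud: maximum variance along [1, -1], mean shifted to (2, 2).
e_var = np.array([1.0, -1.0]) / np.sqrt(2)
e_perp = np.array([1.0, 1.0]) / np.sqrt(2)
U = (np.array([2.0, 2.0])
     + rng.normal(scale=2.0, size=(3000, 1)) * e_var
     + rng.normal(scale=0.3, size=(3000, 1)) * e_perp)

# Plain Hebb rule: top eigenvector of Q = <u u^T>. The nonzero mean
# drags this direction toward (2, 2), i.e. through the origin.
Q = U.T @ U / len(U)
q_top = np.linalg.eigh(Q)[1][:, -1]

# Covariance rule: top eigenvector of C = <(u - <u>)(u - <u>)^T>,
# which recovers the true direction of maximum variance.
m = U.mean(axis=0)
C = (U - m).T @ (U - m) / len(U)
c_top = np.linalg.eigh(C)[1][:, -1]
```

Here q_top ends up near [1, 1]/sqrt(2), pointing at the mean, while c_top recovers [1, -1]/sqrt(2), the actual direction of maximum variance.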