Hello, neuromusketeers from around the world.

In the previous lecture, we learned that neurons don't use

Facebook or LinkedIn to connect among themselves.

Rather, they use synapses.

We learned how to model a synapse on a computer and we were even entertained by

two neurons that performed a ballroom dance by following each other and then,

danced in synchrony by inhibiting each other.

In this lecture, we will learn how to model larger networks of neurons.

When modeling networks of neurons,

one of the choices you'll have to make as

a computational neuroscientist is deciding whether you should

use spiking models of neurons or firing rate based models for neurons.

Let's look at the advantages and disadvantages of these two choices.

If you decide to use spiking neurons for your network model,

the advantages that you get include the fact that you can now

model computation and learning based on spike timing.

So, you can model phenomena such as synchrony between neurons,

which you would not be able to model if you did not have spikes as the output of neurons.

On the other hand, the disadvantages include the fact that now

you have to model these differential equations, and so if you have a very,

very large network of these neurons,

then it might be computationally expensive, or in

some cases maybe even impossible, to simulate very

large networks of perhaps millions and millions of

neurons, because you have to simulate these differential equations on your computer.

On the other hand, you might decide to use neurons with firing-rate outputs.

So, this means that the output of a neuron is not a zero or a one or a spike,

but a real-valued output denoting the firing rate of the neuron.

The advantages include the fact that now you can have

greater efficiency and so you will be able to simulate for example,

very large networks of neurons.

On the other hand, since you are using firing rates as the output,

you will not be able to model phenomena based on spike timing or synchrony.

The question that we can ask now is,

how are these two approaches related?

The firing rate model can be related to the spiking model by

using the linear filter model of a synapse that we covered in the last lecture.

If you recall, we had a synapse b that was

receiving an input spike train that we called Rho_b,

and Rho_b was just a representation of a spike train such as this one.

And if we can model the synapse b using a filter,

which we call K(t), so for example,

K(t) could be an exponential filter such as this,

then the synaptic conductance at b could be modeled using

a linear filter equation that is given down here.

And so, this particular equation just implements the filtering operation,

so you simply replace each spike by the filter for

the synapse and you get a conductance that looks something like this.
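This filtering operation is easy to sketch in code. Here is a minimal NumPy example, with an illustrative spike train and an assumed exponential filter K(t) = exp(-t/tau_s)/tau_s; all parameter values are made up for demonstration.

```python
import numpy as np

# Illustrative settings: 1 ms time step, exponential synaptic filter
dt = 0.001             # time step (s)
tau_s = 0.010          # synaptic time constant (s) -- assumed value
T = 0.2                # total duration (s)
t = np.arange(0, T, dt)

# rho_b: the input spike train, represented as 0s and 1s
rho_b = np.zeros_like(t)
rho_b[[20, 50, 55, 120]] = 1.0   # spike times chosen arbitrarily

# Exponential filter K(t) = exp(-t/tau_s)/tau_s
K = np.exp(-t / tau_s) / tau_s

# The filtering operation: each spike is replaced by a copy of K
g_b = dt * np.convolve(rho_b, K)[:len(t)]
```

Each spike produces an exponentially decaying bump in the conductance g_b, and closely spaced spikes (here at indices 50 and 55) sum on top of each other.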

We can now go from the basic synapse model that we had on

the previous slide to the model for multiple synapses.

So, here's a cartoon of a neuron receiving inputs from N other neurons.

And so each of these synapses has a synaptic weight that we're calling W_1,

W_2 all the way up to W_N.

And each of these synapses now gets spike trains given by Rho_1,

Rho_2, all the way up to Rho_N.

And if we assume that there are no nonlinear interactions

between these synapses, then the total synaptic current,

the total input to the neuron,

is given by just a summation of each of

the individual inputs coming in from each of these synapses.

And so the total input to the neuron then is

just the summation given by this particular equation.

So are you ready to make that leap from the spiking model to the firing rate model?

Some people say it's almost a leap of faith

and they compare it to converting from one religion to the other.

But I think that's a little bit extreme.

In any case here's one network and here are these spike trains from the input neurons.

And here's the equation that we had in the previous slide which

characterizes the total amount of input that the neuron is getting from all its synapses.

Now the leap that we make is going from these spike train functions,

the Rho_b that we have at each of the individual synapses

to the instantaneous firing rate that we have for each of the input neurons.

Now when can this replacement fail?

Well, this would fail if, for example, there are correlations among the input neurons,

or when there's synchrony, for example, between pairs or larger groups of neurons.

So in those cases we cannot make this replacement.

But it turns out that in many,

many cases it is quite possible to replace the spike train function,

Rho_b, with the firing rate function,

the instantaneous firing rate for the individual input neurons.

Now, I know many people don't like integrals in their equations, and some people even

get nightmares when they have integrals such as this one in an equation.

So, for those integral-phobes among you:

is there any way to simplify

the input current equation so it does not contain this integral?

Well, it turns out that if you choose the synaptic filter to be

an exponential filter, then when you substitute

this particular exponential expression for K(t) in

our equation for the input current, and take the derivative

of this equation on both sides,

you get an equation that does not contain an integral.

So here's how the input current, Is,

changes as a function of time.

And you'll notice that it's very similar in form to

the equation we had for the RC circuit.

Except that now this equation tells you how

the input current changes as a function of time and there's a particular time constant,

Tau_s, that determines how rapidly the input current reacts to changes in the input.

And the input component itself is nothing but

just a linear weighted sum of

all the inputs weighted by the corresponding synaptic weight.

And if you want to simplify this even further, you can write this weighted sum

as just the dot product of the weight vector with the input vector.
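As a sketch, here is a forward-Euler simulation of the resulting equation Tau_s dIs/dt = -Is + w.u, with hypothetical weights and static input rates.

```python
import numpy as np

# Forward-Euler integration of tau_s * dIs/dt = -Is + w.u
# (all parameter values are illustrative)
dt = 0.0001            # integration step (s)
tau_s = 0.010          # synaptic time constant (s)
w = np.array([0.5, -0.2, 1.0])   # hypothetical synaptic weights
u = np.array([10.0, 5.0, 2.0])   # hypothetical static input firing rates

Is = 0.0
for _ in range(int(0.1 / dt)):   # simulate 100 ms
    Is += (dt / tau_s) * (-Is + w @ u)

# For static input, Is converges to the weighted sum w.u = 6.0
```

The time constant tau_s sets how quickly Is tracks changes in the input; after several multiples of tau_s, Is has essentially reached w.u.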

Okay, it's time to uncork that champagne bottle.

We've arrived at our first firing rate based network model.

Here's a picture of the network we're going to look at.

The network has a single output neuron.

The firing rate of the output neuron is given by V. The neuron receives

inputs from input neurons whose firing rates are denoted by the vector

U, and the synaptic connections between

the input neurons and the single output neuron are given by the weight vector,

the synaptic weight vector,

W. We're going to assume that the firing rate of the output neuron follows an equation

that is quite similar to the equation we had for

the membrane potential from the previous lecture.

The output firing rate has a particular time constant Tau_r, which determines

how rapidly the firing rate is able to follow

inputs that the neuron is getting from other neurons.

And there is an optional non-linear function, F,

that we can pass the input current through.

And finally the input current due to

the synaptic inputs is given by the same equation as we had on the previous slide.
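Putting the two equations together, here is a minimal forward-Euler sketch of the full single-output model, using rectification as the optional nonlinearity F; all parameter values are made up for illustration.

```python
import numpy as np

# Single-output firing-rate model (illustrative parameters):
#   tau_s * dIs/dt = -Is + w.u
#   tau_r * dv/dt  = -v + F(Is)
dt, tau_s, tau_r = 0.0001, 0.005, 0.020
w = np.array([1.0, 0.5])           # hypothetical weights
u = np.array([4.0, 2.0])           # hypothetical static input rates

F = lambda x: np.maximum(x, 0.0)   # optional nonlinearity: rectification

Is, v = 0.0, 0.0
for _ in range(int(0.2 / dt)):     # simulate 200 ms
    Is += (dt / tau_s) * (-Is + w @ u)
    v  += (dt / tau_r) * (-v + F(Is))

# With static input, v settles to F(w.u) = 5.0
```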

Now let's look at how this network behaves when you compare the magnitude

of Tau_s with the magnitude for Tau_r.

If Tau_s is much smaller than Tau_r, this means that the synaptic input converges

quickly, which means that we can set the input current equal to just w.u,

as opposed to the network dynamics, which take a long time to converge.

And so the equation that we have then would basically

replace the Is that we have here with w.u.

And so the equation we have then for the network is

just Tau_r dv/dt for the change of the firing rate as a function of time

is equal to minus V plus the potential non linear function and then just F of w.u.

On the other hand, suppose the synapses are slow compared to the output dynamics.

In other words, if Tau_r is much smaller than Tau_s, then we can set V equal to F of

Is(t), because the output dynamics is much

faster compared to how the synaptic current is changing over time.

And so we have an equation where the output firing rate is equal to

some potentially non-linear function of

the input synaptic current where

the synaptic current is given by this differential equation.

And finally, consider the case of static inputs, and by static we mean

that the input does not change as a function of time for long periods of time.

Then we can look at

the steady state output of the network so Vss denotes the steady state of the network.

So how do we get the steady state output of the network?

Well we can set dv/dt equal to zero as well as dIs/dt equal to zero.

So when we set dv/dt equal to zero, we get that V steady state is equal to F of

Is, and since Is is also not changing as a function of time, Is is equal to w.u.

So, those of you who are aficionados of

artificial neural networks should find this equation quite familiar.

This is in fact the equation that one uses in artificial neural networks to

model neurons, where we replace F with a threshold function or a sigmoid function.

And now you know that this equation that people in

the artificial intelligence community have been using for a very long time is in fact

a simplification of the rich dynamics that one has in

the synaptic current as well as the dynamics of the output firing rate.
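For concreteness, here is the steady-state equation V_ss = F(w.u) written as code, using a sigmoid for F as artificial neural networks often do; the weights and inputs are hypothetical.

```python
import numpy as np

# At steady state with static inputs: V_ss = F(w.u)
# This is the standard artificial-neural-network unit.
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))   # one common choice for F

w = np.array([0.8, -0.4, 0.1])     # hypothetical weights
u = np.array([1.0, 2.0, 3.0])      # hypothetical static input rates

V_ss = sigmoid(w @ u)              # a single number between 0 and 1
```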

Let's move to the case where there are multiple output neurons.

So in the case of a single output neuron as in the previous slide we use

an equation that looks like this to model how

the output firing rate changes as a function of time.

And if we make the assumption that the synapses are relatively fast then we can simply

set the input current due to the synapses to just w.u.

So weighted sum of the inputs.

And so this equation captures how the output

firing rate changes as a function of the inputs.

Now what do we do if we want to extend this to multiple output neurons?

So here's what the network looks like.

And in this case instead of having a single output we have an output vector.

And so the equation then looks something like this.

So instead of a weight vector,

w, we now have a weight matrix,

W. And the equation is now

a differential equation that includes a vector output so V is now

a vector and this product instead of being

w.u is now just the matrix W multiplied with the input vector.

The matrix W has the components W_ij.

W_ij denotes the synaptic weight from neuron j to neuron i,

so if this is neuron j and if this is neuron i,

then this connection here would have a weight that is given

by W_ij and the synaptic weight matrix

W would capture all of these connections from

the input layer of neurons to the output layer of neurons.
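A small sketch of the multiple-output case, with an illustrative two-by-three weight matrix W and rectification for F:

```python
import numpy as np

# W[i, j] is the weight from input neuron j to output neuron i
# (values are illustrative)
W = np.array([[1.0, 0.0, -1.0],
              [0.5, 0.5,  0.5]])
u = np.array([2.0, 4.0, 1.0])      # input firing-rate vector

# tau_r * dv/dt = -v + F(W @ u); at steady state, v -> F(W @ u)
F = lambda x: np.maximum(x, 0.0)
v_ss = F(W @ u)                    # one steady-state rate per output neuron
```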

We have so far been considering networks that take

inputs and feed them to a layer of output neurons.

These type of networks are called feedforward networks.

Now if our brain had only feedforward networks then we would be

constantly reacting to stimuli and saying things that we're not supposed to say.

Luckily the networks in our brains have recurrent feedback connections.

This means that for any particular layer of neurons or neurons in a brain area,

the neurons make connections with each other.

So for example, in this network we have

an output layer of neurons and we can allow connections between

the output layer neurons and we can characterize

the strength of these connections between the output layer of neurons using

a set of synaptic weights which we are going to call M. So

this matrix M denotes the strength of

the connections between the layer of output neurons.

So now we have a new equation that characterizes how

the firing rates of the output neurons change as a function of time.

We have as before a time constant and as before

we have a feedforward input given by w times u.

But now we also have the feedback given

by M times V. So the matrix M is multiplied by the vector V to give

you a new vector that characterizes the feedback

from past activities of the output neurons.

You'll notice that if you set the matrix M to a matrix of zeros,

which means that there are no feedback connections, then we

have the same equation but without

the M times V component, and that is equivalent to

the equation for a feedforward network that you saw in the previous slide.
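Here is a minimal sketch of the recurrent equation, with a hypothetical mutual-inhibition matrix M; setting M to zeros would recover the feedforward behavior.

```python
import numpy as np

# Recurrent network: tau_r * dv/dt = -v + F(W @ u + M @ v)
# (all values are illustrative)
dt, tau_r = 0.0001, 0.010
W = np.eye(2)                      # feedforward weights: identity for simplicity
M = np.array([[ 0.0, -0.5],
              [-0.5,  0.0]])       # mutual inhibition between the two outputs
u = np.array([2.0, 2.0])           # static input rates
F = lambda x: np.maximum(x, 0.0)

v = np.zeros(2)
for _ in range(int(0.5 / dt)):     # simulate 500 ms
    v += (dt / tau_r) * (-v + F(W @ u + M @ v))

# Fixed point satisfies v = F(W.u + M.v); here each rate settles to 4/3
```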

Okay let's begin a journey into the land of networks by

looking at a simple linear feedforward network.

So here is the equation for a linear feedforward network and for the sake of simplicity

let's assume that the input doesn't change as

a function of time so if we have a static input, u.

And so we can look at the steady state value of

the output and so that means we just set dv/dt equal to zero.

And so we get the equation for steady state output of

the network as Vss equal to w times u.

Now suppose the feedforward weights are given by this matrix W,

and suppose I give you the input u to be this vector here.

What is the output of this linear feedforward network?

I'll give you a couple of seconds to think about that.

Or maybe you want to pause the video at this point?

Okay, I hope you found the answer. Here it is.

So what we had here was a six by five matrix multiplied by a five by

one vector and as you might expect you'll get a six by one vector that looks like this.
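The slide's six-by-five matrix isn't reproduced here, but the same computation can be sketched with a four-by-five matrix whose rows are shifted copies of the first-difference filter [-1, 1] (boundary rows omitted):

```python
import numpy as np

# Hypothetical W: rows are shifted copies of the filter [-1, 1];
# the slide's actual matrix may differ, e.g. in boundary handling.
u = np.array([1.0, 2.0, 2.0, 2.0, 1.0])     # the input from the lecture

n = len(u)
W = np.zeros((n - 1, n))
for i in range(n - 1):
    W[i, i], W[i, i + 1] = -1.0, 1.0        # row i computes u[i+1] - u[i]

v_ss = W @ u
# Nonzero entries mark the transitions 1 -> 2 and 2 -> 1 in the input
```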

Now the question that I would like to ask you

is: what do you think this network is doing?

In other words how is it transforming the input u to the output Vss?

What is the function that this network is implementing?

As you might have guessed this network is performing linear

filtering in order to detect edges in the input.

So what do we mean by detect edges?

Well if you look carefully at the Matrix W,

you'll notice that the rows of the matrix W contain

shifted versions of this particular filter and the filter looks something like this.

And if you take the input,

so this was our input 1,

2, 2, 2, and 1.

Here is a picture of what the input looks like.

The operation of multiplying the matrix W with

the input gives you an output that looks like this.

And you'll notice that the output has detected wherever there are transitions from,

in this case, 1 to 2 and then back from 2 to 1.

And so it has detected this sudden change in the input.

And so if we imagine that we have an input image then

the filter is essentially detecting changes in the brightness values of the image.

In this case it's going from a brightness value,

or a pixel value, of one to a pixel value of two.

And then again from a pixel value of two back to a pixel value of one.

Now we can also apply the linear filtering network to

two dimensional images such as the one shown here.

So how would you transform a two dimensional image to a one

dimensional vector u that you feed to the linear filtering network?

One way you can do that is by collapsing each row of this two dimensional image.

And so if this is row number one,

row number two, row number three, and so on,

you would simply collapse all the rows to form

one very big long vector where the first part of the vector is row number one,

the second part of the vector is row number two, and so on.

And so there you have your input vector U.
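This row-by-row collapsing is exactly NumPy's default (row-major) reshape; here is a tiny illustrative example:

```python
import numpy as np

# A small two-row "image" used only for illustration
img = np.array([[1, 2, 3],
                [4, 5, 6]])

# Collapse the rows into one long vector: row 1 first, then row 2, ...
u = img.reshape(-1)
```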

And here is an example output.

So in this case we are not using the same filter as the one on

the previous slide, but rather one that is used on this particular website.

And so you can see how the input has been transformed to

enhance or detect the edges of this particular image.

At this point you might be wondering does the brain detect edges?

And you already know the answer to this because in week one

we discussed the concept of receptive fields and we noted

that cells in the primary visual cortex or

V1 have receptive fields that look something like this.

And this type of receptive field is indicative of edge detecting neurons.

For example this receptive field suggests that the neuron detects,

or gets excited by,

a transition from a dark to a bright edge in the visual image whereas

this neuron here would be sensitive to dark bars embedded within a bright background.

The brain is in fact not just detecting edges it is actually doing calculus.

The brain was doing calculus even before Newton and Leibniz invented calculus.

Sorry Newton and Leibniz.

Here's what I mean when I say that the brain is doing calculus.

Here is the receptive field of a V1 neuron, and this

as you recall is the linear filter that

we used in our feedforward matrix W. So if

this is some particular location X this would be the next location,

X plus 1, in the image.

And now if you look at the definition of a derivative of a function, f,

with respect to the value X,

then here is the definition of the derivative from your calculus textbook.

And if you look at a discrete approximation of this derivative you'll find that it

amounts to just a difference between f of X plus 1 minus f of X.

And if you look at our feedforward network, it's simply multiplying W with u.

And so if u is the image, that is, the f(x), then W, which contains essentially shifted versions of

this filter, performs exactly this particular operation of

subtracting neighboring pixel values at the locations X and X plus 1.

Now here's something that's even more interesting,

if we look at the other type of receptive field for

V1 neurons, which we call the bar-detector type of receptive field.

We can implement this type of receptive field by

using a linear filter that looks something like this.

So if this is location X this would be location X minus one.

This would be location X plus one in the image and we would use a matrix W in

our feedforward network that would have shifted versions of this linear filter.

Now let's look at the definition of the second-order derivative of a

function f. So d^2f/dx^2 is given by this expression.

If we approximate this expression

using a discrete approximation we would have something that looks like this.

Now you'll notice that this approximation,

so f of X plus one, minus two times f of X, plus f of X minus one,

has the same coefficients as our linear filter.

So one, minus two, and one; and here's one, minus two, and one.

And so when you multiply the feedforward matrix with

the input u using this linear filter what are you going to get?

Well you're going to get an approximation to

the second order derivative of the image if u represents an image.
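A small sketch of this second-derivative filtering, applied to the same illustrative input as before, using shifted copies of the filter [1, -2, 1]:

```python
import numpy as np

# Rows of W are shifted copies of the second-difference filter [1, -2, 1]
u = np.array([1.0, 2.0, 2.0, 2.0, 1.0])

n = len(u)
W = np.zeros((n - 2, n))
for i in range(n - 2):
    W[i, i:i + 3] = [1.0, -2.0, 1.0]   # row i computes u[i] - 2*u[i+1] + u[i+2]

v_ss = W @ u
# Approximates the second derivative; it responds at the edges of the plateau
```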

And since the image is two dimensional the brain computes derivatives along

multiple directions in the image using oriented receptive fields such as these.

That wraps up our first expedition into the world of networks.

In the next lecture we'll meet those wild and crazy creatures called recurrent

networks and we'll also enjoy the company of

our good old friends eigenvectors and eigenvalues.

Until then, ciao and alvida.