How would we write the change in uncertainty after you receive a message,

the change in entropy?

So let's say that we'll stay in neuroscience land for now.

S is your stimulus, and R is your response.

And the stimulus can be a scalar or a vector, and the response can be a firing

rate or a spike train, or a whole pattern of spikes, or whatever you like.

But S is the stimulus and R is the response.

And so at the start, you have some distribution,

the probability that your random variable is equal to a specific value, S.

And after you get the response, you have a conditional distribution.

So this is the same thing, but conditioned on the fact that your response was

equal to little r. And each one of these has an entropy.

Remember the entropy takes as input a distribution and

produces as output just a number.
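As a minimal sketch of "distribution in, number out", here is an entropy function for a discrete distribution. The name `entropy`, the list-of-probabilities representation, and the choice of log base 2 (so the answer is in bits) are assumptions made for illustration, not something fixed by the lecture.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a discrete distribution.

    `p` is a list of probabilities that sum to 1 (assumed, not checked).
    Zero-probability outcomes contribute nothing to the sum.
    """
    return sum(-q * math.log2(q) for q in p if q > 0)

# A fair coin is maximally uncertain for two outcomes; a certain outcome has no uncertainty.
fair_coin = entropy([0.5, 0.5])
certain = entropy([1.0, 0.0])
print(fair_coin, certain)
```

Whatever the distribution looks like, the function returns a single number, which is what lets us compare the uncertainty before and after a response.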

So there's an entropy of the original stimulus distribution, H(S).

And there's an entropy of the conditional stimulus distribution,

H(S | R = r), the entropy of S given the fact that you measured response little r.

And the amount that the entropy decreases is simply H(S) minus H(S | R = r).

However, as we saw with our Francious and Billy and Carol case,

different messages could have yielded different conditional distributions,

and therefore, different conditional entropies, different noise entropies.

So, in order to get a very general quantity that talks about

the entire distribution of S and R, rather than the distribution of S given

a single R, we are just going to take the average decrease in entropy.

Where the average is taken over all of the things that the message,

the response, could have been.

So we write that the information between S and

R is equal to the original entropy minus the average,

the expected value, of the conditional entropy, of the noise entropy.

And that average is taken with respect to all of the possible things

the response, the message, could have been.

And this value is called the mutual information.
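Written out, the definition just described might look like the following. The notation I(S; R), the sum over a discrete response, and the base-2 logarithm are assumptions for illustration; the spoken definition does not fix them.

```latex
\begin{align}
  H(S) &= -\sum_{s} P(S = s)\,\log_2 P(S = s) \\
  H(S \mid R = r) &= -\sum_{s} P(S = s \mid R = r)\,\log_2 P(S = s \mid R = r) \\
  I(S; R) &= H(S) - \sum_{r} P(R = r)\, H(S \mid R = r)
\end{align}
```

The last line is the original entropy minus the expected value of the conditional entropy, with the expectation taken over the responses r.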

And the information about S from R

takes as input the joint distribution over S and R and outputs just a number.
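To make "joint distribution in, number out" concrete, here is a small sketch that computes the mutual information as the original entropy minus the average conditional entropy. The function names, the nested-list layout `joint[s][r]` for P(S = s, R = r), the toy numbers, and the use of bits are all assumptions for illustration.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a list of probabilities."""
    return sum(-q * math.log2(q) for q in p if q > 0)

def mutual_information(joint):
    """Mutual information between S and R, in bits.

    `joint[s][r]` is assumed to hold P(S = s, R = r), with all entries summing to 1.
    Computes H(S) minus the average over r of H(S | R = r).
    """
    n_s, n_r = len(joint), len(joint[0])
    p_s = [sum(row) for row in joint]                                  # marginal P(S = s)
    p_r = [sum(joint[s][r] for s in range(n_s)) for r in range(n_r)]  # marginal P(R = r)

    avg_conditional = 0.0
    for r in range(n_r):
        if p_r[r] == 0:
            continue  # a response that never occurs contributes nothing to the average
        p_s_given_r = [joint[s][r] / p_r[r] for s in range(n_s)]
        avg_conditional += p_r[r] * entropy(p_s_given_r)

    return entropy(p_s) - avg_conditional

# Toy example: two stimuli, two responses, and the response usually matches the stimulus.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
print(mutual_information(joint))  # roughly 0.28 bits
```

If S and R are independent, every conditional distribution equals the original one and the result is zero; if the response identifies the stimulus perfectly, the conditional entropies vanish and the result equals H(S).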