The p-value is probably the most widely used statistic in the world,

both for inference and for nearly everything else.

It's so popular that if it were cited every time it was used, it would have at

least three million citations, making it the most highly cited paper ever.

So the p-value is a very important statistic, and since it's such an important

statistic, there are lots of people who hate the p-value because it's so popular.

And part of the reason why people hate it

is that people consistently misinterpret the p-value.

The p-value is defined as the probability of observing a statistic

as extreme as, or more extreme than,

the one you calculated, if the null hypothesis is true.

So a couple of things that the p-value is not, and

that will make statisticians see red if you say them: the p-value is not

the probability that the null hypothesis is true.

It's also not the probability that the alternative is true.

And in some sense it's not necessarily a measure of statistical evidence.

That's a philosophical term that people will worry about, but in this case

you need to interpret it very narrowly:

as the probability of observing a statistic as or more extreme than the one

you observed in the data, if the null hypothesis were true.

So here we're going to use that example again with the responders and

the non-responders to illustrate what's going on.

So again, we have responders and non-responders, and now we're looking at, say, for

gene one, a statistic that compares the responders to the non-responders.

So we might calculate the t-statistic: take the average expression level among

the responders, subtract the average expression level among the non-responders,

and then standardize that by some measure of the variability, in this case

the average variability in each of the two groups.
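As a rough sketch of the statistic just described, here is one way it could be computed for a single gene. The lecture only says the difference is standardized by "some measure of the variability", so the Welch-style standard error used here is an assumption, and the function name and expression values are made up for illustration.

```python
import math
import statistics

def t_statistic(responders, non_responders):
    """Difference in mean expression, scaled by a measure of variability."""
    mean_diff = statistics.mean(responders) - statistics.mean(non_responders)
    # Assumption: standardize with a Welch-style standard error, which
    # combines each group's sample variance divided by its group size.
    standard_error = math.sqrt(
        statistics.variance(responders) / len(responders)
        + statistics.variance(non_responders) / len(non_responders)
    )
    return mean_diff / standard_error

# Hypothetical expression levels for one gene (not real data).
responders = [5.1, 4.8, 5.6, 5.3]
non_responders = [3.9, 4.2, 4.0, 4.4]
t = t_statistic(responders, non_responders)
print(f"t-statistic: {t:.2f}")
```

A positive value means the responders have higher average expression; swapping the two groups flips the sign.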

So in a previous lecture we learned that one way to

quantify a null hypothesis,

the null hypothesis that the distributions are exactly the same among

the responders and the non-responders, is to permute the sample labels.

So when you permute the sample labels, you leave the relationship among the genes

unchanged, but you break the relationship between each

gene and the responder/non-responder label.

So if I recompute the statistic, after I do that, I get a distribution

under the permutations and then I have the original statistic that I calculated.

And so the p-value I calculate is based on the number of

permutation statistics that are as large as or larger than

the statistic that I originally calculated.

And I do that in absolute value, since in general the null hypothesis is

that the value is equal to zero,

that there's no difference between the two groups.

But the alternative could be that it's either positive or

negative.

And so I have to look in both directions, whether it's positive or negative.

And so I just count up the number of statistics that are more extreme in

each direction, and I divide by the total number of permutations.

So I basically take the fraction of permutations in which the statistic is as or

more extreme, under this null hypothesis, than the statistic I originally calculated, and

that gives me the p-value.
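The counting procedure described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names, the toy data, the number of permutations, and the Welch-style standardization inside `t_statistic` are all assumptions made for the example.

```python
import math
import random
import statistics

def t_statistic(group_a, group_b):
    """Mean difference scaled by a Welch-style standard error (an assumption)."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error

def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Two-sided permutation p-value for one gene.

    values: expression levels, one per sample.
    labels: True for responders, False for non-responders.
    """
    rng = random.Random(seed)

    def statistic_for(current_labels):
        group_a = [v for v, lab in zip(values, current_labels) if lab]
        group_b = [v for v, lab in zip(values, current_labels) if not lab]
        return t_statistic(group_a, group_b)

    # Absolute value, so that both directions count as extreme.
    observed = abs(statistic_for(labels))
    shuffled = list(labels)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between gene and label
        if abs(statistic_for(shuffled)) >= observed:
            n_extreme += 1
    # Count of as-or-more-extreme statistics divided by the total.
    return n_extreme / n_permutations

# Hypothetical expression levels for one gene (not real data).
values = [5.1, 4.8, 5.6, 5.3, 3.9, 4.2, 4.0, 4.4]
labels = [True] * 4 + [False] * 4
p = permutation_p_value(values, labels)
print(f"permutation p-value: {p}")
```

In this toy data every responder has higher expression than every non-responder, so very few shuffled labelings produce a statistic as extreme as the observed one, and the p-value comes out small.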

So this p-value is often used as a measure, but

in general it's used as a hypothesis testing tool:

if the p-value is small, you reject the null hypothesis,

because the statistic is very extreme

compared to the distribution that you would have gotten under the null.

So this is what p-value distributions look like for

genomic experiments that are done well.

So typically, you see a distribution like this where there's a spike near zero and

then there's a flat distribution as you move out here towards one.

So if you actually look at this and break it down into the different parts,

this part near zero, these p-values that are really small,

those are mostly the p-values that are coming from the alternative distribution.

Because remember,

the p-value is measuring the probability of observing a statistic more extreme

under the permutations than the statistic that you actually observed.

So if you observe a statistic that's very, very extreme, the number of null,

that is, permuted, statistics that will be larger than that is very small,

and you'll get a small p-value.
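To see why the histogram has a spike near zero on top of a flat part, here is a small simulation sketch. It swaps in a simpler setup than the permutation approach from the lecture: each gene gets a normally distributed z-statistic and a two-sided normal p-value. The number of genes, the fraction of genes with a real difference, and the effect size are all made-up assumptions.

```python
import math
import random

def simulate_p_values(n_genes=2000, frac_alternative=0.2, effect=3.0, seed=0):
    """Simulate two-sided p-values for a mix of null and alternative genes."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_genes):
        is_alternative = rng.random() < frac_alternative
        # Null genes have mean 0; alternative genes are shifted by `effect`.
        z = rng.gauss(effect if is_alternative else 0.0, 1.0)
        # Two-sided p-value under a standard normal null distribution.
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values

def histogram(p_values, n_bins=10):
    """Count p-values in equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts

counts = histogram(simulate_p_values())
for i, count in enumerate(counts):
    # One '#' per 20 genes gives a crude text histogram.
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * (count // 20)}")
```

The null genes spread roughly evenly across all the bins, which gives the flat part, while the genes with a real difference pile up in the first bin, which gives the spike near zero.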