Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From this lesson

Hypothesis Testing

In this module, you'll get an introduction to hypothesis testing, a core concept in statistics. We'll cover hypothesis testing for basic one and two group settings as well as power. After you've watched the videos and tried the homework, take a stab at the quiz.

- Brian Caffo, PhD, Professor, Biostatistics, Bloomberg School of Public Health

Okay. So the calculation when your alternative is less than the null value is very similar. If you're taking this class, you should be able to do it.

For the not-equal-to hypothesis, things change a little bit, and here's the procedure I'm going to recommend. Pick one of the two sides. In general, it's probably easier to do the bigger side, say, mu a being bigger than mu 0. Again, to do a power calculation you have to pick the value under the alternative that you're going to check, so let's assume you pick a value of mu a that's larger than mu 0.

Calculate the power for the one-sided test of a larger mu a, but use alpha over 2 as the error rate instead of alpha.

This is close enough to right; it just excludes a little bit of probability from the power calculation. What it omits is the probability, when your alternative mean is larger than the null mean, of randomly getting a test statistic so small that you reject in the other direction. That probability is usually so small that you can just omit it. Fancy power calculators might include it, but usually it's irrelevant, so I'll stop talking about it.
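The alpha over 2 shortcut can be sketched in R for a z test with known sigma. The numbers (mu 0 = 30, mu a = 32, sigma = 4, n = 16) are borrowed from the example used later in this lecture, alpha = 0.05 is assumed, and the variable names are made up for illustration.

```r
# Two-sided power for a z test with known sigma, via the alpha/2 shortcut.
mu0 <- 30; mua <- 32; sigma <- 4; n <- 16; alpha <- 0.05

shift <- sqrt(n) * (mua - mu0) / sigma   # mean of the z statistic when mu = mua
zcrit <- qnorm(1 - alpha / 2)            # cutoff using alpha/2 instead of alpha

# Shortcut: probability of rejecting in the upper tail only.
power_approx <- pnorm(zcrit - shift, lower.tail = FALSE)

# The piece the shortcut omits: rejecting in the wrong (lower) tail.
power_omitted <- pnorm(-zcrit - shift)

power_exact <- power_approx + power_omitted
print(c(approx = power_approx, omitted = power_omitted, exact = power_exact))
```

In this setting the omitted piece is far below one tenth of a percent, which is why leaving it out changes nothing that matters.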

As alpha gets larger, the power goes up, right? If you're requiring less evidence to reject the null hypothesis, then you're going to detect more alternative hypotheses. It's just like in a court of law: if you don't require much evidence to convict people, you're going to convict a lot of guilty people. You'll also convict more innocent ones, too, but power is only concerned with the probability under the alternative.

The power of a one-sided test is bigger than the power of a two-sided test, and you can see this pretty easily in the calculation the way I'm suggesting you do it. When you calculate the quantile with alpha over 2, you move further out into the normal tail, so the cutoff gets a little bit bigger and the probability of exceeding it gets a little bit smaller.

Then, the further your alternative mean gets from your null mean (larger if you're testing greater than, smaller if you're testing less than, or further from mu 0 in absolute value if you're testing not equal to), the greater the power is. That makes total sense: the stronger the alternative hypothesis is, the easier it should be to detect.

And then, of course, as your sample size goes up, your power goes up. This is why sample size calculations are so important: of all the things we're discussing, the only one that's potentially under your control is usually n.
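These properties can be checked numerically. The sketch below uses a one-sided z test with known sigma and assumes the values from this lecture's example (mu 0 = 30, sigma = 4); `power_z` is a hypothetical helper written for illustration, not a built-in R function.

```r
# One-sided (greater-than) power for a z test with known sigma.
power_z <- function(mua, n, alpha = 0.05, mu0 = 30, sigma = 4) {
  pnorm(qnorm(1 - alpha) - sqrt(n) * (mua - mu0) / sigma, lower.tail = FALSE)
}

by_alpha <- power_z(mua = 32, n = 16, alpha = c(0.01, 0.05, 0.10))  # larger alpha
by_mua   <- power_z(mua = c(31, 32, 33), n = 16)                    # mean further from mu0
by_n     <- power_z(mua = 32, n = c(8, 16, 32))                     # larger sample

# Same alternative, but with the alpha/2 cutoff used for a two-sided test.
one_sided <- power_z(mua = 32, n = 16, alpha = 0.05)
two_sided <- power_z(mua = 32, n = 16, alpha = 0.05 / 2)

print(rbind(by_alpha, by_mua, by_n))
print(c(one_sided = one_sided, two_sided = two_sided))
```

Each row of the printed table increases from left to right, and the one-sided power exceeds the two-sided approximation.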

So, let's talk about calculating power for t tests, because things get a little bit more complicated. Let's assume it's a greater-than alternative. Just like before, the power is the probability that our test statistic lies in the rejection region. So, in our RDI (Respiratory Disturbance Index) example, it's the probability that X bar minus 30, over S over square root n, is bigger than our t quantile. Now it's the 1 minus alpha t quantile with n minus 1 degrees of freedom, with this calculation being executed under the alternative hypothesis that mu equals mu a.

If we were to do the same trick we did before, where we add and subtract mu a and drag the term onto the other side, we'd still have an S, which is random, to contend with on the right-hand side. So that's a problem: it doesn't yield a collection of known quantities that we can solve for. So let's manipulate it and see what we can figure out.

This top probability is equal to the probability that square root n times X bar minus 30 is greater than the t quantile times S. Then we can divide both the left-hand side and the right-hand side by sigma.

Okay. Let's talk about how we can do this calculation given the tools that we have from the class. Here, I've added and subtracted a mu a, so we get a Z random variable: square root n times X bar minus mu a, over sigma. Next we get something that we at least need to plug in values for: square root n times mu a minus 30, over sigma. Again, remember that to do any power calculation you have to plug in the value under the alternative, so even though we don't technically know mu a, we've got to be able to plug in potential values of it. We, of course, know our t quantile and the square root of n minus 1. And on the right-hand side, in case you didn't notice it, I multiplied and divided by square root n minus 1.

So look at this rightmost quantity, n minus 1 times S squared over sigma squared. Remember, if the data are iid Gaussian, that works out to be a chi-squared random variable with n minus 1 degrees of freedom, so over here we have the square root of a chi-squared. I've rewritten this equation on the second line, replacing the Z random variable by Z and the chi-squared random variable by chi-squared notation. And we've stated, though never proved, that the Z and the chi-squared are independent.
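Written out, the manipulation on the slide appears to be the following. This is a reconstruction from the spoken description rather than a copy of the slide, with $t_{1-\alpha,\,n-1}$ denoting the t quantile and 30 the null mean from the RDI example:

```latex
\begin{align*}
P\left(\frac{\bar X - 30}{S/\sqrt{n}} > t_{1-\alpha,\,n-1} \;;\; \mu = \mu_a\right)
&= P\left(\sqrt{n}\,(\bar X - 30) > t_{1-\alpha,\,n-1}\, S \;;\; \mu = \mu_a\right) \\
&= P\left(\frac{\sqrt{n}\,(\bar X - \mu_a)}{\sigma} + \frac{\sqrt{n}\,(\mu_a - 30)}{\sigma}
   > \frac{t_{1-\alpha,\,n-1}}{\sqrt{n-1}} \sqrt{\frac{(n-1)\,S^2}{\sigma^2}} \;;\; \mu = \mu_a\right) \\
&= P\left(Z + \frac{\sqrt{n}\,(\mu_a - 30)}{\sigma}
   > \frac{t_{1-\alpha,\,n-1}}{\sqrt{n-1}} \sqrt{\chi^2_{n-1}}\right)
\end{align*}
```

Here $Z$ is standard normal and $\chi^2_{n-1}$ is an independent chi-squared random variable with $n-1$ degrees of freedom.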

This probability statement is a little bit hard because it involves the bivariate distribution of a Z and a chi-squared, which is a little bit tricky to work with. But we could actually do this via Monte Carlo very easily. We could simulate pairs of Z random variables and chi-squared random variables, and evaluate this inequality for every pair. So we'd take a simulated Z random variable and add square root n times mu a minus 30, over sigma. Then we'd take a chi-squared simulant, square root it, and multiply it by the t quantile divided by square root n minus 1. And we'd just check to see which is bigger. If the left-hand side was bigger, we would record a 1, and if the right-hand side was bigger, we would record a 0.

Having generated lots of pairs of independent chi-squareds and standard Gaussians, the proportion of ones from that simulation would be an approximation to this probability, and it would converge to this probability as we did more and more simulations.

If you wanted to do this more exactly, you would have to numerically work with the joint distribution of the normal and the chi-squared, which is not so hard. You can figure out how to do it, but R actually already has something for this; it's called power.t.test. Here I give the R code for specific settings, and the result in this case is a power of about 60%.

So the function is power.t.test. Of course, you have to enter the n. You have to enter the delta, which in this case is mu a minus the mu under the null hypothesis. So let's assume that that's 2.

Oh, and a little fact that I should have alluded to before. If you look back at the calculation, notice that the formula only depends on the difference between mu a and mu 0 divided by sigma. So I was actually incorrect when I said we needed to know mu a. What we need to know is how different mu a is from mu 0 in standard deviation units: mu a minus mu 0, over sigma. And notice that mu a, mu 0, and sigma are all in the same units. So this quantity, usually called an effect size, is what these problems are usually characterized in terms of, because it's a unit-free quantity that has some hope of being interpretable across experiments.

So here we don't actually have to plug in a sigma; we just have to know what mu a minus mu 0 over sigma works out to be. If we were to put in delta equals 2 and also add the argument that the standard deviation equals 4, we'd get the identical answer, and you can try it to make sure.
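The two calls described here might look like the sketch below. It assumes n = 16 and the default significance level of 0.05, which match the Monte Carlo settings used later in the lecture; the slide's exact code is not in the transcript. The third calculation is an added cross-check, not something from the lecture: it computes the same power directly from the noncentral t distribution.

```r
# Effect size form: delta is (mua - mu0) / sigma, and sd defaults to 1.
p_effect <- power.t.test(n = 16, delta = 2 / 4,
                         type = "one.sample", alternative = "one.sided")$power

# Raw-units form: delta is mua - mu0, with the standard deviation passed as sd.
p_raw <- power.t.test(n = 16, delta = 2, sd = 4,
                      type = "one.sample", alternative = "one.sided")$power

# Cross-check (not from the lecture): the same power from the noncentral t,
# with noncentrality sqrt(n) * (mua - mu0) / sigma.
p_nct <- pt(qt(0.95, df = 15), df = 15, ncp = sqrt(16) * 2 / 4,
            lower.tail = FALSE)

print(c(effect = p_effect, raw = p_raw, noncentral_t = p_nct))
```

All three values agree at roughly 0.604, the 60% quoted in the lecture.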

Okay. So let's do the same exact calculation, but using Monte Carlo. Here, I define the number of simulations to be 100,000. I set my n as 16, my sigma as 4, and mu a and mu 0 to be 32 and 30. Then I simulate 100,000 standard normals and 100,000 independent chi-squareds. So now I have 100,000 pairs of normals and chi-squareds, and of course it doesn't matter how I match them up; I could permute the normals or the chi-squareds and I'd still get the same thing. Here, my degrees of freedom is 15, because I have 16 subjects and my degrees of freedom is n minus 1. My t quantile is the 95th percentile of the t distribution with 15 degrees of freedom, and that's this qt function right there.

And then in this line right here, ignore the mean statement for a second. Imagine I had just started with the z and said: z plus square root n times mu a minus mu 0 divided by sigma, greater than t divided by square root n minus 1 times the square root of xsq, xsq being the random chi-squared. That expression returns a vector with a one every time the left-hand side is bigger than the right-hand side and a zero every time the left-hand side is smaller.

And the mean of a vector of zeros and ones is just the proportion of ones. You can type this in; the result is 60%, up to Monte Carlo error, so you get 60 point something.
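Based on the description above, the slide's code can be sketched roughly as follows. The names `z`, `xsq`, and `t` follow the transcript; `nosim`, `mu0`, `mua`, and `hits` are assumed names, and `set.seed` is added here only to make the run reproducible.

```r
set.seed(1)                        # added for reproducibility, not in the lecture

nosim <- 100000                    # number of simulated pairs
n <- 16; sigma <- 4; mu0 <- 30; mua <- 32

z   <- rnorm(nosim)                # standard normals
xsq <- rchisq(nosim, df = n - 1)   # independent chi-squareds, 15 degrees of freedom
t   <- qt(0.95, df = n - 1)        # 95th percentile of the t distribution

# TRUE whenever the left-hand side of the inequality exceeds the right-hand side.
hits <- z + sqrt(n) * (mua - mu0) / sigma > t / sqrt(n - 1) * sqrt(xsq)

mc_power <- mean(hits)             # proportion of TRUEs approximates the power
print(mc_power)
```

The printed value should land near 0.60, differing from the power.t.test answer only by Monte Carlo error.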

By the way, for these calculations, the error in the mean is on the order of 1 over the square root of the number of simulations. So our accuracy here is on the order of 1 over square root 100,000. If you want higher accuracy, you've got to run the computer longer.
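To put numbers on that rate, the sketch below compares the rough 1 over root n figure with the usual binomial standard error for a simulated proportion. The binomial formula is an addition here, not something stated in the lecture, and p = 0.604 is the rounded power.t.test answer.

```r
nosim <- 100000
p <- 0.604                           # power quoted in the lecture, rounded

rough_order <- 1 / sqrt(nosim)       # the "1 over root n" rate
mc_se <- sqrt(p * (1 - p) / nosim)   # binomial standard error of the simulated proportion

# Cutting the standard error by a factor of 10 takes 100 times as many simulations.
nosim_needed <- p * (1 - p) / (mc_se / 10)^2

print(c(rough_order = rough_order, mc_se = mc_se, nosim_needed = nosim_needed))
```

With 100,000 simulations the estimate is good to roughly two or three decimal places, and each extra decimal place costs about 100 times more computing.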

Okay. So let's just go through these calculations. I've got my RStudio window open here. I'm going to first run power.t.test. Here's the output: it returns the values that you input, and you get the power as 60%.

And then, to check, I'm going to do it the other way. There you go; the argument was sd, not sigma. So here I set sd to be 4 and delta to be 2, instead of setting delta to be 2 over 4 and letting it assume that sd is 1. And you can see you get identically the same answer: 60.40329%, same thing.

Okay, now let's go ahead and do our Monte Carlo calculation. We'll define the number of simulations, and then let me just enter this in to show you a couple of values from this vector. See, it's actually not zeros and ones. It's TRUEs and FALSEs, which are the Boolean values in R. But in R, if you, say, multiply them by 1, which otherwise does nothing, it converts them to numerical values. Whenever R needs numbers for an operation, it just does the conversion.
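A tiny illustration of that coercion, using a made-up logical vector in place of the simulation output:

```r
hits <- c(TRUE, FALSE, TRUE, TRUE)   # made-up values standing in for the comparison result

as_numbers <- hits * 1               # arithmetic coerces TRUE to 1 and FALSE to 0
prop_true <- mean(hits)              # mean() coerces too, giving the proportion of TRUEs

print(as_numbers)
print(prop_true)
```

This is why taking the mean of the comparison works directly, with no explicit conversion step.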

Okay, so let's take the mean of those values. It works out to be 60.608%, which is very close to the more precise calculation.

And then in these final slides, I just remind us that we have to have a true mean and a true standard deviation to do these calculations, but that really it's only the change in the means divided by the standard deviation that affects the calculation. I'm just reminding you of that point because it's incredibly useful.

All right, that's the end of Lecture 2, and we'll see you next time for Lecture 3.