Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From this lesson

Two Binomials

In this module we'll be covering some methods for looking at two binomials. This includes the odds ratio, relative risk and risk difference. We'll be discussing mostly confidence intervals in this module and will develop the delta method, the tool used to create these confidence intervals. After you've watched the videos and tried the homework, take a crack at the quiz!

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Hi, my name is Brian Caffo, and this is Mathematical Biostatistics Boot Camp, Lecture 4, on two-sample binomial tests.

Okay, in this lecture we're going to talk about the score statistic, which is a specific two-sample binomial test that will serve as motivation for creating a confidence interval as well. We'll talk about how you can do exact tests for two binomial proportions, and then we'll talk about comparing two binomial proportions. And then we'll go over a little bit about Bayesian and likelihood methods for comparing binomial proportions. We'll actually spend quite a bit of time on binomial proportions, because later on we'll also talk about relative risks and odds ratios and that sort of thing.

Okay, so let's put some context on this. Imagine a randomized trial where there were 40 subjects, and 20 each were randomized to two drugs. The two drugs have the same active ingredients, but let's assume they had different excipients, you know, the ways in which the drugs were delivered. Let's just say one was a capsule and one was a different kind of pill.

Okay, so consider counting the number of side effects for each drug among the 40 people. Here we have a table where there are 40 total people: 20 on Drug A, 20 on Drug B. So this margin is fixed.

[INAUDIBLE]

20 and 20, that margin is fixed. Then 11 from Drug A had side effects and 9 didn't, and 5 from Drug B had side effects and 15 didn't. So at face value, there seems to be a greater propensity for side effects from Drug A than from Drug B. What we'd like to do is test whether or not the propensity for side effects is the same for the two drugs.
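
As a rough sketch, the table just described can be written down directly in Python. The variable names (`table`, `row_totals`, `p_hat`) are illustrative choices, not anything from the lecture; the counts are the ones stated above.

```python
# Counts from the example: one entry per drug, split by outcome.
table = {
    "A": {"side_effects": 11, "none": 9},
    "B": {"side_effects": 5, "none": 15},
}

# Row totals are the margin fixed by the design: 20 subjects per drug.
row_totals = {drug: sum(cells.values()) for drug, cells in table.items()}

# Sample proportion of side effects within each drug.
p_hat = {drug: table[drug]["side_effects"] / row_totals[drug] for drug in table}

for drug in table:
    print(f"Drug {drug}: n = {row_totals[drug]}, p_hat = {p_hat[drug]:.2f}")
```

The two sample proportions are the raw material for every comparison that follows; only the row totals are fixed in advance by the randomization.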

And there are a lot of different ways we could formulate this problem, even from a simple data set, and we'll talk about some of the ways to do that. But for right now let's just start by talking about score tests. The score test should seem fairly familiar to you, because it's constructed in the same way as the ordinary Z tests that we've constructed before.

So let's consider a single binomial proportion, not two binomial proportions; later on in the lecture we'll consider two binomial proportions. Imagine only looking at Drug A and saying, well, let's test whether or not Drug A has a specific population proportion of side effects. So we want to test H0: p = p0, and our obvious estimate of p is p hat, the sample proportion of side effects. Our obvious metric of the discrepancy between p hat and p naught would be the difference, or maybe we could do a log ratio or something like that, but for right now let's do the difference.

Okay.

But in order to compare this to a statistical distribution, let's normalize it by the standard error of p hat. Well, the variance of p hat is p times 1 minus p over n, and under the null hypothesis that's p0 times 1 minus p0 over n, so the standard error is the square root of that. Notice that we're plugging in p0 under the null hypothesis, not plugging in p hat, which would give us the estimated standard error that we used, for example, in the construction of the confidence interval. At any rate, this test, where we plug in p0 in the denominator, performs a little bit better than the so-called Wald test, where in the denominator, rather than plugging in p0, we plug in p hat.
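
To make the difference between the two denominators concrete, here is a minimal Python sketch. The function names `score_z` and `wald_z` are made up for illustration, and the null value 0.3 is an arbitrary choice used only to show the two statistics side by side on the Drug A counts.

```python
from math import sqrt

def score_z(x, n, p0):
    """Z statistic with the null value p0 plugged into the standard error."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def wald_z(x, n, p0):
    """Z statistic with the estimate p_hat plugged into the standard error."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

# Drug A: 11 of 20 with side effects. p0 = 0.3 is an arbitrary illustrative null.
z_score = score_z(11, 20, 0.3)
z_wald = wald_z(11, 20, 0.3)
print(f"score z = {z_score:.3f}, Wald z = {z_wald:.3f}")
```

The numerators are identical; only the standard error changes, so the two statistics agree closely when p hat is near p0 and drift apart otherwise.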

So remember how we can invert this test statistic to create a confidence interval. And of course we can use the test statistic to just perform the test; I should have said this on the previous slide. That Z statistic you compare to quantiles from the standard normal distribution: the upper alpha over two quantile if you're doing a two-sided test, the upper alpha quantile if you're doing the one-sided test where the alternative is p greater than p0, and the alpha quantile if you're doing a one-sided test where the alternative is p less than p0. This should be obvious to you at this point in the class, how you would take a Z statistic and then use it to perform a test.
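
The three comparisons can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. The function name `reject` and its `alternative` labels are illustrative choices, not notation from the lecture.

```python
from statistics import NormalDist

def reject(z, alpha=0.05, alternative="two-sided"):
    """Return True if the Z statistic falls in the rejection region."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        # Compare |z| to the upper alpha/2 quantile.
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if alternative == "greater":
        # Alternative p > p0: compare z to the upper alpha quantile.
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        # Alternative p < p0: compare z to the (lower) alpha quantile.
        return z < std_normal.inv_cdf(alpha)
    raise ValueError(f"unknown alternative: {alternative!r}")

print(reject(1.7))                         # two-sided at 5%
print(reject(1.7, alternative="greater"))  # one-sided at 5%
```

A Z of 1.7 sits between the one-sided and two-sided 5% cutoffs, so the choice of alternative changes the decision in this illustration.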

And then, of course, we've already talked at length in other settings, for example in inverting the t test and the standard Z interval, about how we can invert a hypothesis test to create a confidence interval: namely, calculate those values of p0 for which we'd fail to reject.
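
Inverting a test can be done by brute force. The following is a sketch under stated assumptions: it uses the Drug A counts, a two-sided 5% score test, and a simple grid of candidate p0 values (the grid size is an arbitrary choice, and `invert_score_test` is a hypothetical name). It keeps every p0 that the test fails to reject and reports the smallest and largest.

```python
from math import sqrt
from statistics import NormalDist

def invert_score_test(x, n, alpha=0.05, grid_size=10_000):
    """Collect the p0 values on a grid that a two-sided score test fails to reject."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    kept = []
    for i in range(1, grid_size):  # skip p0 = 0 and p0 = 1, where the SE is zero
        p0 = i / grid_size
        z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
        if abs(z) <= z_crit:
            kept.append(p0)
    return min(kept), max(kept)

lower, upper = invert_score_test(11, 20)
print(f"p0 values not rejected: about {lower:.3f} to {upper:.3f}")
```

The grid search is only for intuition; the interval it traces out has a closed form, which is why nobody computes it this way in practice.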

If we invert the Wald test, we get the Wald confidence interval: p hat plus or minus Z one minus alpha over two times the square root of p hat times one minus p hat over n.
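
As a small sketch of that formula in Python (the function name `wald_interval` is an illustrative choice), applied to the Drug A counts of 11 side effects out of 20:

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, alpha=0.05):
    """p_hat plus or minus z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

wald_lower, wald_upper = wald_interval(11, 20)
print(f"Wald 95% interval: ({wald_lower:.3f}, {wald_upper:.3f})")
```

Note that this interval is always centered exactly at p hat, and nothing in the formula stops the endpoints from falling below 0 or above 1 when p hat is near the boundary.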

And if we invert the score test we get the so-called score interval, which is a lot more complicated. We get p hat times this quantity, plus one half times this quantity, plus or minus our upper alpha over 2 normal quantile times this standard error formula. I want to point out one thing: this is not a numerator here, by the way. It's this quantity, p hat times this thing plus one half times this thing, and then I ran out of space, so on the next line I put plus or minus the normal quantile times the standard error. So this is p hat times n over n plus Z squared, and then plus one half times Z squared over n plus Z squared.
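
Written out, with Z standing for the upper alpha over 2 normal quantile, the interval has the form below. The center is exactly as described in the transcript; the standard error term is not spelled out there, so the square-root expression is the usual textbook form of the score (Wilson) interval and should be read as an assumption about what the slide shows.

```latex
\hat{p}\left(\frac{n}{n + Z^2}\right)
  + \frac{1}{2}\left(\frac{Z^2}{n + Z^2}\right)
  \;\pm\;
  Z \sqrt{\frac{1}{n + Z^2}
    \left[\hat{p}\,(1 - \hat{p})\left(\frac{n}{n + Z^2}\right)
      + \frac{1}{4}\left(\frac{Z^2}{n + Z^2}\right)\right]}
```

The same two weights, n over n plus Z squared and Z squared over n plus Z squared, appear in both the center and the standard error.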

So look at these two factors, n over n plus Z squared and Z squared over n plus Z squared: they add up to one. They are two proportions that add up to one, in other words a point on the simplex in two dimensions, as you might say in mathematical parlance. But what's important is that as n gets very big, the first weight approaches one and p hat dominates. If n is small, then the term in front of the one half gets a little bit bigger. The one half hopefully doesn't dominate, but a greater fraction is placed on the one half.

And just to give you context, Z one minus alpha over 2 is usually going to be around 2, so Z squared is about 4. So the center is about p hat times n over n plus 4, plus one half times 4 over n plus 4. At any rate, as n gets very large, it just becomes very similar to the Wald interval.
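
A short Python sketch makes the weighting visible. The center follows the description above; the half-width uses the usual textbook form of the score (Wilson) interval, which is an assumption since the transcript does not spell it out. The function name `score_interval` and the larger sample sizes are illustrative only, chosen to keep p hat at 0.55.

```python
from math import sqrt
from statistics import NormalDist

def score_interval(x, n, alpha=0.05):
    """Score (Wilson) interval, also returning the two weights on p_hat and 1/2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    w_data = n / (n + z**2)     # weight on p_hat
    w_half = z**2 / (n + z**2)  # weight on 1/2; w_data + w_half == 1
    center = p_hat * w_data + 0.5 * w_half
    half_width = z * sqrt((p_hat * (1 - p_hat) * w_data + 0.25 * w_half) / (n + z**2))
    return center - half_width, center + half_width, w_data, w_half

# Same p_hat = 0.55 at increasing sample sizes (the larger n are made up).
for x, n in ((11, 20), (110, 200), (1100, 2000)):
    lo, hi, w_data, w_half = score_interval(x, n)
    print(f"n = {n:4d}: weight on p_hat = {w_data:.3f}, "
          f"weight on 1/2 = {w_half:.3f}, interval = ({lo:.3f}, {hi:.3f})")
```

As n grows, the weight on one half shrinks toward zero and the center moves back to p hat, which is the sense in which the interval becomes very similar to the Wald interval.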

So what this does is take p hat and shrink it towards one half. And that turns out to be a good thing to do, because you don't want the binomial confidence interval centered exactly at p hat: the binomial distribution gets more asymmetric, more skewed, as p gets further away from one half, and because of that you don't want that point right in the middle.

At any rate, we talked previously about confidence intervals for binomial proportions. Plugging in Z one minus alpha over two equal to two yields the so-called Agresti-Coull interval that we talked about before. This is actually the motivation for the Agresti-Coull interval: most people do 95% intervals, and if we take our 1.96, just round it up to two, and plug it into the score interval, we get the Agresti-Coull interval.
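
The connection can be checked numerically. This sketch assumes the "add two successes and two failures" form of the Agresti-Coull interval, with a Wald-style standard error around the adjusted proportion; the function names are illustrative. It compares that adjusted proportion with the center of the score interval when Z is set to 2.

```python
from math import sqrt

def agresti_coull(x, n, z=2.0):
    """Adjusted estimate (x + z^2/2) / (n + z^2) with a Wald-style interval around it."""
    n_tilde = n + z**2
    p_tilde = (x + z**2 / 2) / n_tilde
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde, p_tilde - half_width, p_tilde + half_width

def score_center(x, n, z=2.0):
    """Center of the score interval: weighted average of p_hat and 1/2."""
    p_hat = x / n
    return p_hat * n / (n + z**2) + 0.5 * z**2 / (n + z**2)

p_tilde, ac_lower, ac_upper = agresti_coull(11, 20)
print(f"adjusted estimate = {p_tilde:.4f}, score center = {score_center(11, 20):.4f}")
print(f"Agresti-Coull interval: ({ac_lower:.3f}, {ac_upper:.3f})")
```

With Z equal to 2 the two centers coincide at (11 + 2) / (20 + 4); the half-widths of the two intervals are close but not computed from the same expression in this sketch.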

[INAUDIBLE]

Okay, so let's do our example. In our previous example, consider testing whether or not Drug A's percentage of subjects with side effects is greater than 10%. I made up 10%; let's assume that the FDA gets really mad for this kind of drug if you have more than a 10% prevalence of side effects. So H0: pA = 0.1 versus HA: pA > 0.1, where pA is the population proportion of side effects for Drug A. Our p hat is 11 over 20, which is 0.55. Our test statistic is 0.55 minus 0.1 divided by the square root of (remember to plug in the p naught from the null hypothesis) 0.1 times 0.9 divided by 20, and you get 6.7.

We reject: our critical value for a one-sided test is going to be about 1.65, or for a two-sided test it would be about 2. Either way, 6.7 is going to be bigger than it. And then our P value, the probability of getting a Z bigger than 6.7, is of course essentially 0. Almost seven standard deviations away from zero, for a standard normal, is quite far out in the tails; remember that three standard deviations cover the vast majority of the distribution. And if we were doing a two-sided test, remember that we would double this P value.
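
The arithmetic of this example is easy to reproduce. A minimal sketch in Python, using the numbers stated above; the upper tail probability is computed with `math.erfc`, which stays accurate this far out in the tail.

```python
from math import erfc, sqrt

x, n, p0 = 11, 20, 0.1  # Drug A counts and the null value from the example
p_hat = x / n

# Score statistic: the null value p0 goes in the standard error.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# P(Z > z) for a standard normal, via the complementary error function.
p_one_sided = 0.5 * erfc(z / sqrt(2))
p_two_sided = 2 * p_one_sided

print(f"z = {z:.1f}")
print(f"one-sided P value = {p_one_sided:.2e}")
print(f"two-sided P value = {p_two_sided:.2e}")
```

Both P values are far below any conventional alpha, which matches the conclusion that the null is rejected whichever critical value is used.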

Okay.

Now let's discuss this problem a little bit. What do we have to do to get this to work? Mechanically, I hope you find this easy, but let's talk a little bit about the thinking behind it. We're postulating that the number of side effects out of 20 is a binomial trial.

Well, implicit in that is the idea that we have IID data: every person is an independent and identically distributed draw from a population. So we're using the IID assumptions to create the idea of a population, a superpopulation, that has a prevalence of side effects of pA. Of course, in general you can't know that unless you're actually going to great pains to sample independently from the population you're interested in, which is usually not the case.

In general, what you're doing is using a statistical model where you're hoping that the people are a representative sample. You're also hoping that there's no crossover of side effects. Suppose some of the people receiving the drug are friends, or in the same family, or in the same neighborhood, or something like that; you're hoping there's no reason that, if one person gets side effects, it's more likely for anyone else around them who also received the drug to get side effects. No interference is what they would call that.

So our IID model is what's giving us this idea of a population proportion, and then we're testing relative to that proportion. But it's good, I think, whenever you're doing these kinds of tests, to actually think about what you're modeling as random and what your population model is. The calculations here are very simple, right? I hope we all agree that the calculations are very simple. But the principles, the idea of how the modeling is going, are a lot more delicate, and actually on a lot more shaky ground. So that's why it's a good idea to always think through your model assumptions and what they imply for your actual hypothesis.

So just to give you an example, let's assume that our sample is all people who are professional drug takers for pharmaceutical companies, sort of professional guinea pigs. Whenever a drug company says, oh, well, we want to test out this headache medication, they sign up. And there are people who do this, of course. Maybe you can't make a tremendous living doing it, but the drug companies of course pay people for their time to test out these sorts of drugs. Then it would be hard to declare that sample an IID draw from the general population, because this is a very, very different sort of people. They probably adhere to their medication schedule very precisely, and other things like that; they're probably very good takers of medications. They probably follow the instructions quite well, and so on.

So anyway, my larger point is that when you get this number, 6.7, which is very easy to obtain, think about what it means in the context of the model that you've used. In this case the model is seemingly very benign, but the big part of the model is the IID draw from a population. Think about what that means in the context of the problem that you're studying.