A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.


A course from Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

237 ratings

From this lesson

Module 3B: Sampling Variability and Confidence Intervals

The concepts from the previous module (3A) will be extended to create 95% CIs for group comparison measures (mean differences, risk differences, etc.) based on the results from a single study.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

Okay, in this section we're going to take a look at some practice problems designed

to highlight some of the ideas that we covered in lecture eight.

And so, as usual, I'll go through the problem

set-ups first then advise you to pause the recording, work

on them at your leisure, and then resume whenever the

fancy strikes, and you can compare your solutions to mine.

So for the first couple of exercises, I'm going to give

you three scenarios. These are scenarios,

ostensibly, where we'll be comparing continuous outcomes between two groups.

I just want you to think about the study design:

whether it's an example of a paired or unpaired design,

and if it's paired, what the unit of pairing is.

So this first scenario: in Baltimore, a real

estate practice known as flipping has elicited concerns from local and

federal government officials.

Flipping occurs when a real estate investor buys

a property for a low price, makes little

or no improvement to this property, and then

resells it quickly at a higher price.

This practice has raised concern because the properties involved in

flipping are generally in disrepair and the victims are generally low income.

Fair housing advocates are launching a lawsuit against

three real estate corporations accused of this practice.

As part of this suit, the advocates have collected data on all houses purchased by

these three corporations which were sold in

less than one year after they were purchased.

Data were collected on the purchase price and

the resale price for each of these properties.

These data were collected to investigate whether the resale

prices were, on average, higher than the initial purchase prices.

And a confidence interval was constructed for

the average profit in these quick turnover sales.

So was this an example of a paired or unpaired study design?

In the second scenario, researchers are

testing a new blood pressure reducing drug.

Or a drug that potentially reduces blood pressure.

Participants in this study are randomized to

either the drug group or a placebo group.

And baseline,

meaning pre-randomization, measurements are taken on both groups,

on everyone who's going to participate in the study.

And another measurement is taken three months after

the administration of the drug, or the placebo.

Researchers are curious as to whether the drug is

more effective in lowering blood pressure than the placebo.

Then finally, in this third study, researchers are

interested in the impact of a vegan diet

on risk factors for coronary heart disease, CHD,

in subjects with a family history of coronary heart disease.

So researchers randomly select 100 such families with more than one child and

randomize two siblings from each family to either a vegan diet or an omnivorous diet.

These diets, prescribed by a nutritionist, are to last for six weeks.

And then what they do is, before

randomization, they take baseline coronary

heart disease risk factor measurements,

things like blood pressure,

cholesterol level, and percent body fat, on each participant.

And then they do this again after the six-week

diet period, whether it be the vegan or the omnivorous diet.

And then changes in the risk factor levels are

to be compared between those on the vegan diet and those on the omnivorous diet.

Okay.

In this next section, we'll practice doing some hand computations,

not because hand computations are that important in this computer age,

but because they help you think about which

aspects of the concepts get translated into the mathematics,

and help you think about what goes

into computing the uncertainty for sample mean differences,

differences in proportions, et cetera.

So this first question is set

at a high school in the United States.

A dietary counseling program is being tested to

measure the program's long-term impact on students' fat intake.

Of the 300 students at the school, 150 are

randomized to receive five one-hour sessions of dietary counseling.

And the other 150

students receive no counseling.

Six months after the last counseling session, all students

are asked to keep a food diary for one week.

And each student's average daily fat

intake in grams for the week is calculated.

And the results of this exercise are as follows.

So 146 of the 150 students who were

in the intervention group were followed up; four were lost to follow-up.

Their average daily fat intake was 54.8 grams, but there was

a fair amount of person-to-person variability in this intake:

the standard deviation was 28.1 grams. Among the 142 who were

followed up in the control group, the average daily fat intake was 62.8 grams,

with a standard deviation of 34.7 grams.

So I'd like you to estimate using these

sample results, a 95% confidence interval for the true

mean difference in average fat intake, between the group

that received counseling and the group that did not.

And interpret the observed mean difference and the 95% confidence interval.

And finally we'll talk about a study that was performed on

a representative sample of 258 intravenous drug users

from a larger population of such drug users.

Of particular interest to the researchers were factors which

may influence the risk of

contracting tuberculosis amongst intravenous drug users.

So 97 of the study subjects admitted to sharing needles to shoot

drugs, and of these 97, 24 had a positive tuberculin test result.

The other 161 subjects denied having shared needles, and of

these 161 subjects, 28 had a positive tuberculin test result.

So I'd first like you to use the study results to estimate the difference in

proportions contracting tuberculosis between those

who shared needles and those who didn't,

and construct a 95% confidence interval for the true difference in

the population of IVDUs from which the sample was taken.

And then interpret this estimated difference

in proportions and the 95% confidence interval.

Based on what you got in Part A, what, if

anything, can you say about the estimated relative risk and

odds ratio for comparing the tuberculosis outcomes between the needle

sharers and non-needle sharers, and the corresponding 95% confidence intervals?

And then finally, I'd like you to go

ahead and now estimate the relative risk of tuberculosis

for those who shared needles compared to those who

didn't, and its 95% confidence interval, using these data.

So I will patrol my own grammar here and it should be its without an apostrophe.

So now I'd advise you to turn off the tape,

solve these problems at your leisure, come back and compare your results to mine.

Okay, welcome back. I hope you enjoyed doing these exercises.

Let's go through my take on the solutions.

So which of the following examples involve the comparison of paired data?

Let's go through these three quickly and

we'll talk about the study design for each.

So the first was the flipping example, where the housing advocates collected

data on the purchase price and resale price of a group of properties.

And these data were collected to see if the resale

price was higher, on average, than the purchase price,

and what the degree of the difference was.

So the study design here is that the unit of observation for

each of the data points was a particular house, and

two measures were collected on each house: the original price,

and then the resale price.

So for each house we could investigate the difference in these.

So we can compute the difference for each of

the houses and look at the nature of the differences.

We could average them to see if there was an increase on average and how

large it was, and we could compute a

standard deviation of the individual changes in price.

So this was a paired study, and the unit of pairing was the house.

We had two measurements per house that we're comparing.
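As a rough illustration of that paired calculation, here is a minimal Python sketch. The five purchase and resale prices are made up for the example (the lecture gives no house-level data), and the "plus or minus 2 standard errors" multiplier follows the lecture's hand-computation convention; with so few houses a t-based multiplier would be more appropriate.

```python
import math
import statistics

# Hypothetical purchase and resale prices (dollars) for five houses.
# These numbers are invented for illustration only.
purchase = [30000, 42000, 25000, 38000, 51000]
resale = [61000, 75000, 49000, 80000, 90000]

# Paired design: one difference (profit) per house.
profits = [r - p for p, r in zip(purchase, resale)]

mean_profit = statistics.mean(profits)
sd_profit = statistics.stdev(profits)            # sample SD of the differences
se_profit = sd_profit / math.sqrt(len(profits))  # SE of the mean difference

# Approximate 95% CI: estimate plus or minus 2 standard errors.
ci = (mean_profit - 2 * se_profit, mean_profit + 2 * se_profit)
print(mean_profit, ci)
```

The point of the sketch is that everything is computed from the single list of per-house differences, which is what makes the analysis paired.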

So the second situation, researchers are

testing a new blood pressure reducing drug.

Participants in this study are

randomized to either the drug group or a placebo group.

And, they first took initial blood pressure measurements

prior to randomization on everyone enrolled in the study.

And, then another measurement was taken three months

after the administration of the drug or the placebo.

And researchers are interested as to whether the drug is

more effective in lowering blood pressure than the placebo.

So how was this set up?

So we initially had a large, or a single group of enrollees to the study,

and they were randomized to either receive the drug or the placebo.

So this should clue you in right away, that this is going to be

an unpaired analysis ultimately because the outcomes

of interest are compared across these two groups.

But what are the outcomes of interest?

Well, this may throw a little wrench into

your thinking, but let's just work through the logic.

So for each person randomized to the drug group,

we collected a pre-randomization measurement and a

post-randomization measurement of blood pressure.

And so we could compute the difference for each

person enrolled in the drug group, and we could

compute a mean difference, a mean change on average

for those individuals who are enrolled in the drug group.

We could do the same thing for

those in the placebo group:

compute the change in blood pressure for each individual after

three months, compared to baseline, for those in the placebo group.

So within the drug and the placebo groups these measurements are paired,

but we're not interested in comparing each person in the drug group to

himself, or each person in the placebo group to himself. We're interested in comparing these

paired changes, on average, between the drug and the placebo groups.

So the ultimate result is an unpaired comparison.

In this third scenario we said researchers are interested

in measuring the impact of a vegan diet on

risk factors for coronary heart disease in

subjects with a family history of coronary heart disease.

So the researchers randomly select 100 such families with more than one child and

randomize two siblings from each family to receive

either a vegan diet or an omnivorous diet.

One sibling is

randomized to one group and then the other is put in the other group.

So these diets, prescribed by a nutritionist, are to last for six weeks.

And then what they do is they measure things before

randomization and then after the six weeks on each participant.

And they compare the changes in the risk factor levels

over the six week period between the two diet groups.

What is the data setup here?

Well, to start, the unit of observation is really

at the family level, so for each family, they select two siblings.

So we'll call them Sibling A and Sibling B: Sibling A from Family One, Sibling B from Family One,

Sibling A from Family Two, Sibling B from Family Two.

And then

the first sibling is randomized to either be

put in the vegan diet or the omnivorous diet.

And so if the sibling ends up in the vegan

group the second sibling will be placed in the omnivorous group.

If the first sibling is randomized to the omnivorous group, then

the second sibling will effectively have been randomized to the vegan group.

And then what they do is, for each sibling in the vegan diet group, they

take a pre-diet and post-diet measure of

certain things like blood pressure, cholesterol level, et cetera.

And then they look at the change, and they

do the same for everyone

in the omnivorous diet group:

for each sibling randomized to that group, they look at the difference.

And they're ultimately interested in comparing the changes

between the vegan and the omnivorous groups.

This sounds very similar to the last study,

except the two diet groups are inherently linked,

because there's one person from each family

represented in each of the two groups.

And so this is an example, ultimately, of a paired comparison.

We are comparing the changes

between those who get the vegan and omnivorous diets,

but every person who

got the vegan diet is linked to

a specific person in the omnivorous diet group, his or her sibling.

So this is an example of a paired comparison, ultimately.
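To make the contrast between the last two designs concrete, here is a small Python sketch. The six-week cholesterol changes are invented for five hypothetical families (the lecture reports no data for this study). It computes the paired standard error from within-family differences and, for contrast only, the unpaired standard error that would apply if the two groups were unrelated people, as in the blood pressure trial.

```python
import math
import statistics

# Hypothetical six-week changes in cholesterol (mg/dL), one tuple per family:
# (change for the vegan sibling, change for the omnivorous sibling).
family_changes = [(-12, -3), (-8, 1), (-15, -6), (-5, -4), (-10, 2)]

# Paired analysis: the difference in change within each family.
within_family = [vegan - omni for vegan, omni in family_changes]
mean_diff = statistics.mean(within_family)
se_paired = statistics.stdev(within_family) / math.sqrt(len(within_family))

# Unpaired standard error, shown only for contrast; it ignores the
# sibling link and so is not the right choice for this design.
vegan = [v for v, _ in family_changes]
omni = [o for _, o in family_changes]
se_unpaired = math.sqrt(
    statistics.variance(vegan) / len(vegan)
    + statistics.variance(omni) / len(omni)
)
print(mean_diff, se_paired, se_unpaired)
```

Both approaches give the same estimated mean difference; what changes with the design is how the uncertainty around it is computed.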

All right, in the second question I asked

you to do some computations and interpret the results.

So we had a high school in the United States, where a dietary counseling program is

being tested to measure the potential long-term impact on students' fat intake.

Of the 300

students at the school, 150 were randomized to

receive five one hour sessions of dietary counseling.

The other 150 students received no such counseling.

Six months after the last counseling session, all students

are asked to keep a food diary for one week.

Each student's average fat intake, in grams, is calculated at

the end of the week; and the results are as follows.

So the average daily intake

for each student during that follow-up week,

in the intervention group, was 54.8 grams.

Standard deviation of the individual values is

28.1 and there were 146 people followed up.

In the control group the average daily fat intake was higher, 62.8 grams.

Here's the standard deviation. And there were 142 people in this group.

So

I wanted you to estimate, using the sample

results, a 95% confidence interval for the

true mean difference in average fat intake between

those that received counseling and those that did not.

And then interpret the results.

So let's see what we got.

Let's lay out the data. So we said in the counseling group, the mean

daily fat intake was 54.8 grams with a standard deviation of 28.1, and

there were 146 people. And then in the

no counseling group it was 62.8, with a standard

deviation of 34.7, and 142 people.

So the observed

mean difference between these two groups, the

mean fat intake for the counseling group

minus the mean intake for the non-counseling

group, was negative eight grams.

So they consumed eight grams less, on average per day, than

those who did not receive the counseling.

But that's just an estimate based on the sample, nearly 300 students across the two groups.

So we want to put uncertainty bounds on that.

So we have to estimate the standard error of this mean difference.

And we'll just apply our formula, the one we love so much, which reminds us

that the uncertainty of this mean difference

comes from the uncertainty in each mean combined.

So we take the standard deviation of the individual

daily fat intakes for those in the counseling group, square

it, and divide it by the number of people in that group.

So this essentially turns

out to be the standard error of that first sample mean, squared.

Then we add it to the same thing done for the group

that did not receive counseling, and take the square root. So the uncertainty in our difference

in means is basically an additive function of the uncertainty in each mean.

And if you do the math on this, this turns out

to be about 3.7 grams. So a 95% confidence interval for the

true mean difference among all high school students

(you can think of that as the population under study)

is negative 8 plus or minus 2 times

3.7, or from negative

15.4 grams to negative 0.6 grams.
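The hand computation just described can be checked with a few lines of Python. This is only a sketch using the summary statistics from the problem; the variable names are my own, and the multiplier of 2 follows the lecture's convention. Because the code carries the unrounded standard error (about 3.73 rather than 3.7), its endpoints can differ from the hand-computed ones in the first decimal place.

```python
import math

# Summary statistics from the problem statement.
mean_c, sd_c, n_c = 54.8, 28.1, 146   # counseling group
mean_n, sd_n, n_n = 62.8, 34.7, 142   # no-counseling group

# Observed mean difference (counseling minus no counseling).
diff = mean_c - mean_n

# Standard error of the difference in unpaired means:
# square root of the sum of the squared standard errors of each mean.
se = math.sqrt(sd_c**2 / n_c + sd_n**2 / n_n)

# Approximate 95% CI: estimate plus or minus 2 standard errors.
lower, upper = diff - 2 * se, diff + 2 * se
print(round(diff, 1), round(se, 2), round(lower, 1), round(upper, 1))
```

Either way, the whole interval sits below zero, which is the feature the interpretation relies on.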

So what do you notice about this confidence interval?

Well, you probably notice it does not

include 0, so this result is statistically significant.

Meaning that all possibilities in the interval show a reduced average

fat intake for those who got the dietary counseling.

But the range of values leaves a little bit to interpretation.

On the one hand, this would be a pretty impressive effect

if those who actually received the program

consumed over 15 fat grams less per day on average.

On the other end, we're talking about a

negligible effect, an average difference of negative 0.6 grams.

So there's a lot of uncertainty in this confidence interval.

I think, given that the sample result was so large, an average

reduced intake of eight grams, and the fact that this is statistically significant,

this shows credible evidence that this program was effective at lowering

the average daily fat intake in this population of high school students.

Okay, finally, this study that we looked at,

the representative sample of 258 intravenous drug users.

And I'll just jump to the problems here.

So the first thing I asked you to do

was look at the difference in proportions of those

who had tuberculosis between those who shared needles and those who didn't.

So we just do this straight up. The

proportion who had tuberculosis amongst those who shared needles:

there were 24 cases out of the 97 people who admitted to sharing needles,

and this is about 25%. In the group who didn't share needles,

or said they didn't share needles, this proportion was 28 out of

161. So that's 17%.

So the difference between

the proportions for those who shared and those who didn't

was 8%: an absolute 8% greater proportion of persons

with tuberculosis amongst those

who shared needles than those that didn't.

This is the difference in proportions, but of course this

estimate is based on relatively small to medium sized samples, so

we want to account for the uncertainty in it. So we

have to estimate the standard error of this difference in proportions.

And this formula is very similar

in spirit to the estimated standard error for a difference in

unpaired means. We take the uncertainty associated with the first proportion

and functionally square it, and then add it

to the uncertainty in the second estimated proportion,

which is also squared.

And then take the square root of that sum.

So, again, the uncertainty in this difference in proportions

is a function of the uncertainty in each proportion itself.

And if you do this, this turns out to be about 0.05, or

5%.

And then when all the dust settles

(sorry, I was trying to draw a barrier here, just for organization),

we observed an 8% greater proportion in the group

that shared needles, but then we account for the uncertainty.

And if we do this, we get a

confidence interval that goes from negative 2% to 18%.

So you'll notice that this interval includes zero.

So this result is not statistically significant.
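Here is a short Python sketch of that computation from the raw counts, with variable names of my own choosing and the lecture's multiplier of 2. Note that the lecture rounds the two proportions to 25% and 17% before subtracting; carrying full precision gives a difference closer to 7.4% and a lower bound closer to negative 3%, but the conclusion that the interval includes zero is unchanged.

```python
import math

# Counts from the problem statement.
tb_share, n_share = 24, 97        # TB-positive / total, shared needles
tb_noshare, n_noshare = 28, 161   # TB-positive / total, did not share

p1 = tb_share / n_share
p2 = tb_noshare / n_noshare
diff = p1 - p2

# Standard error of a difference in proportions:
# square root of the sum of the squared standard errors of each proportion.
se = math.sqrt(p1 * (1 - p1) / n_share + p2 * (1 - p2) / n_noshare)

# Approximate 95% CI: estimate plus or minus 2 standard errors.
lower, upper = diff - 2 * se, diff + 2 * se
print(round(p1, 3), round(p2, 3), round(diff, 3), round(se, 3))
print(round(lower, 3), round(upper, 3))
```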

After accounting for the uncertainty, it's not

clear what the direction of association is,

although most values in the confidence interval

show an increase amongst those who shared needles.

But the literal interpretation is that the absolute difference

in proportions between those who shared needles and those who

didn't in this population could be anywhere from a reduction

of up to 2% to an increase of up to 18%.

And I asked, before you do any estimation based

on these results, what can you say about the

estimated relative risk and odds ratio for comparing tuberculosis

outcomes between the two groups, and their corresponding confidence intervals?

So the risk difference between the sharers and those

who didn't share was positive, indicating a higher proportion.

Well, we know there was a higher proportion of tuberculosis

outcomes amongst those who shared than those who didn't.

I mean, we knew the proportion was higher in the top group.

But even if all we had was this risk

difference, we could still say that the

estimated relative risk of tuberculosis for those who

shared compared to those who didn't is greater than one.

And so would be the estimated odds ratio.

And then there's the fact that the 95% confidence interval for the

population-level difference in proportions included zero.

We know that the confidence intervals for the relative risk

and odds ratio would concur in terms of the null value.

And their respective null value is one, so their 95% CIs will include one.

We can ascertain that without actually computing them.
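The agreement in direction among the three point estimates is easy to verify numerically. The sketch below, with variable names of my own, covers only the point estimates; the claim about the confidence intervals is the lecture's and is not demonstrated here.

```python
# Proportions with a positive tuberculin test, from the problem counts.
p_share = 24 / 97
p_noshare = 28 / 161

risk_difference = p_share - p_noshare
relative_risk = p_share / p_noshare

# Odds = p / (1 - p); the odds ratio compares the odds in the two groups.
odds_ratio = (p_share / (1 - p_share)) / (p_noshare / (1 - p_noshare))

# A positive risk difference goes together with a relative risk and an
# odds ratio above their null value of 1.
print(round(risk_difference, 3), round(relative_risk, 2), round(odds_ratio, 2))
```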

But just to go full steam ahead and give this a full-on treatment, why don't

we go ahead and estimate the relative risk and

compute a 95% confidence interval from these data.

So the easiest way to get this started is to set up a

two-by-two table, and there's a certain arbitrariness about how you set these up.

But in order for the formulae that I've given you

for standard errors for a relative risk to work, it's advantageous

to set it up in a particular way.

It makes things easier to keep track of if you put the outcome

status in the rows, with those with the outcome in the first row

and those who don't have the outcome in the

second row, and in the columns you have the exposures,

with those with the exposure, those who shared needles, in

the first column and those who didn't share in the second.

We do that,

fill this table out, and we're almost good to go. So

there's the estimated relative risk to start.

Sorry, all of a sudden I'm having a lot more trouble with the pen.

And I'm used to having trouble with the pen,

but all of a sudden I'm really having trouble.

Here's the 25% we observed amongst the sharers,

divided by the 17% amongst those who didn't share.

This is approximately equal to 1.4. We know we're going to have to put

this on the log scale to start to estimate the confidence interval.

The natural log of 1.4 is roughly equal to 0.34.

And now to get the standard error of the log of this estimated relative risk, we

have to just run through our formula, which

involves some counts from the two by two table.

So we'd take the square root of the following: 1 over the number of outcomes, TB-positive results, in

the exposed group, the sharers, minus 1 over the number of people in that group,

plus 1 over

the number of TB-positive outcomes in the group who didn't share needles,

minus 1 over the total number in that group.

And if you do the math out on this you get something approximately equal

to 0.24. And so the 95% confidence

interval for the log of the true relative risk

is 0.34, the log of the estimated relative risk,

plus or minus 2 times 0.24. And when all the

dust settles we get a confidence interval on the log relative risk scale of

negative 0.14 to 0.82. So you see that,

as we knew ahead of time, this confidence interval for

the log relative risk includes zero, which means

when we exponentiate the results to get the confidence

interval for the actual relative risk, we get an interval that goes

from 0.87 to 2.27, and it includes the null value of one.

So this suggests that an individual's risk of

being tuberculosis positive is anywhere from

13% less to 127% greater for an individual who

shares needles versus an individual who does not share needles.

And so the results are statistically inconclusive about the association at the

population level between sharing needles and

increased or decreased risk of tuberculosis.
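For anyone who wants to check the relative risk calculation, here is a minimal Python sketch working from the two-by-two table counts. The variable names are mine and the multiplier of 2 is the lecture's. The hand computation above rounds at each step (1.4, 0.34, 0.24), so the unrounded endpoints from the code differ slightly in the second decimal place, but the interval still includes the null value of one.

```python
import math

# Two-by-two table counts (outcome in rows, exposure in columns).
tb_share, n_share = 24, 97        # TB-positive / total, shared needles
tb_noshare, n_noshare = 28, 161   # TB-positive / total, did not share

# Estimated relative risk and its natural log.
rr = (tb_share / n_share) / (tb_noshare / n_noshare)
log_rr = math.log(rr)

# Standard error of the log relative risk.
se_log_rr = math.sqrt(
    1 / tb_share - 1 / n_share + 1 / tb_noshare - 1 / n_noshare
)

# Approximate 95% CI on the log scale, then exponentiate the endpoints.
log_lower, log_upper = log_rr - 2 * se_log_rr, log_rr + 2 * se_log_rr
rr_lower, rr_upper = math.exp(log_lower), math.exp(log_upper)
print(round(rr, 2), round(se_log_rr, 3), round(rr_lower, 2), round(rr_upper, 2))
```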