Hi, my name is Brian Caffo. This is Mathematical Biostatistics
Bootcamp, lecture ten on T Confidence intervals.
So, in this lecture, we're going to go through group T intervals, whereas in the last lecture we did T intervals for a single mean, or for a group where the observations were paired. But now we're gonna talk about instances where we have two independent groups. We'll briefly talk about a method that constructs a likelihood, and then we'll talk about what you do if you have unequal
variances. Let me motivate the problem a little bit.
Suppose that we want to compare the mean blood pressure between two groups in a
randomized trial, those who received treatment to those who received placebo.
Unlike last week, where people would have had to have been matched, say, comparing
the same person before and after receiving a treatment, these groups are entirely
independent. The group that received the treatment and
the group that received the placebo. So we can't use the same procedure.
We can't take pairwise differences between measurements.
In fact, they might have different sample sizes in the two groups, and then we
definitely couldn't do it. So this lecture, we're going to talk about
ways for investigating the differences in the population means between groups when
we have independent samples. But we'll see that the methodology works
out to be very similar to what we did last week, the motivating ideas will be nearly
identical. So let's go through some assumptions that
we're going to use for our first variation of the t interval.
So our first assumption is that X1 to Xnx is a collection of IID normal random variables; they have some mean and they have some variance. And Y1 to Yny are IID normal, and they have a different mean but the same variance. Right now we're going to assume the variance between the two groups is the same. So we might think of X as the treated group and Y as the control group, or X is one group and Y is another group. So let's let X bar, Y bar, Sx, and Sy be the means and standard deviations for the two groups.
And our goal is to estimate, say, the difference, mu X - mu Y, or of course you could do mu Y - mu X and look at the negative of the answer. We would like to estimate that.
But we'd like to have a confidence interval to quantify our uncertainty in
estimating that parameter. So the obvious estimator of saying mu Y -
mu X is Y bar - X bar. I think everyone would agree that the
interval needs to be centered at that point, or that point has to be central in
the construction of the interval. But we also need to figure out some way to
create a confidence interval to incorporate our uncertainty.
Well let's think can we do something that's along the lines of estimate plus or
minus a T quantile times a standard deviation.
Well, we want a standard error of this estimator Y bar - X bar.
If you turn to the calculations, and I would hope that everyone in this class could do this calculation at this point, under the assumptions that we've made, the variance of Y bar - X bar works out to be sigma squared times (1/nx + 1/ny). And there's a really good estimator of that entity in this setting. In fact it's a maximum likelihood estimator, or close to it. And that's the pooled variance estimate, Sp^2. And that works out to be [(nx - 1) Sx^2 + (ny - 1) Sy^2] / (nx + ny - 2), and this works out to be a good estimator of sigma squared. Let's talk about this estimator really
quickly. If you take nx - one and you divide it by
nx + ny - two you get a number that's between zero and one.
And if you take ny - one and you divide it by nx + ny - two you get one minus that
number. You can check the calculations to make
sure that I'm right about that but I'm right.
So this estimator, Sp^2, is nothing other than a weighted average of the two group variances, right? It's a weighted average of the variance for group X and the variance for group Y.
If nx and ny are equal, if you have the same sample size in both groups, then you can calculate that (nx - 1) / (nx + ny - 2) works out to be 0.5, in which case the pooled variance estimate works out to be the arithmetic average of the two variances. On the other hand, suppose group X contains a lot more data, right? Nx - 1 is a lot larger than ny - 1. Then (nx - 1) over this denominator is going to be much bigger, and you'll get a much bigger weight on Sx^2 than on Sy^2. And in that case, the weighted average does exactly what you would hope: it takes whichever of the two groups has more measurements associated with it and weights the variance estimate from that group more heavily, which is exactly what you would hope.
There is more data, so this variance estimate is going to be estimated a little bit better. So it makes sense that a good estimator would place more weight on it. And that's basically what this pooled variance estimate is; it's nothing other than an average. It's just a weighted average rather than an arithmetic average. Okay, so just to reiterate some of these
points. The pooled estimator is a mixture of the
group variances placing bigger weight on whichever one has the larger sample sizes.
If the sample sizes are the same, it's really easy.
All you have to do is average the two variances.
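As a quick sketch of this weighted-average idea, here's how the pooled variance might be computed in R; the function name and the toy numbers are just made up for illustration.

    # Pooled variance: a weighted average of the two group variances,
    # with weights (nx - 1)/(nx + ny - 2) and (ny - 1)/(nx + ny - 2).
    pooled_var <- function(sx, sy, nx, ny) {
      ((nx - 1) * sx^2 + (ny - 1) * sy^2) / (nx + ny - 2)
    }
    # With equal sample sizes both weights are 0.5, so this is just the
    # simple average of the two variances:
    pooled_var(sx = 10, sy = 20, nx = 5, ny = 5)   # (100 + 400) / 2 = 250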
And then the pooled estimate is unbiased. We can show that really quickly.
If you take the expected value of Sp^2, you just use the fact that both of the individual group variance estimators are unbiased, and then you wind up with the result. I'm not going to show this, cuz it's kind
of complicated to do. But the pooled variance estimate turns out to be independent of Y bar - X bar. The reason is, if you stomach this fact that I didn't show before, that X bar is independent of Sx and Y bar is independent of Sy, well then X bar - Y bar is going to be independent of Sx and Sy, because all of those collections of things are independent. And because of that, it should be independent of any function of Sx and Sy, and Sp^2 is a function of Sx and Sy. So I'm not going to dwell on this point, but take it as given that Y bar - X bar is independent of the pooled variance estimate.
And hopefully you can kind of get a sense where I might be going with this
calculation, what I'd like to do is create a T confidence interval.
And remember, what did I need to create a T confidence interval?
I needed to figure out a way to get a standard normal and divide it by the square root of a chi-squared divided by its degrees of freedom, an independent chi-squared. So I'm hoping that some function of the pooled variance will be chi-squared, and I just stated without proof that it'll be independent of the difference in sample means. Well, it turns out, you know, another fact that I'm not going to prove, but one that you can certainly take to the bank, is that the sum of independent chi-squared random variables is again chi-squared, and the degrees of freedom just add up.
So let's take nx + ny - two times the pooled variance divided by sigma squared.
Well, that works out to just be (nx - 1) times the X group variance, divided by sigma squared, plus (ny - 1) times the Y group variance, divided by sigma squared,
and we know from before that this first term is Chi-squared with nx - one degrees
of freedom. The second term is Chi-squared with ny -
one degrees of freedom. And so if you believe my fact above, that the sum of two independent chi-squareds is again chi-squared with the degrees of freedom added, that would mean that when we add this chi-squared with nx - 1 degrees of freedom and this chi-squared with ny - 1 degrees of freedom, we get a chi-squared with nx + ny - 2 degrees of freedom. And of course we're happy assuming that the two chi-squareds are independent, because the entire presumption of
everything we're talking about is that the two groups we're looking at are
independent. This is sort of independent group
analysis. We're assuming that group X and group Y
are independent. Okay.
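If you want to convince yourself of that chi-squared fact without a proof, here's a small simulation sketch in R; the sample sizes, means, and sigma are made-up values, just for illustration.

    # Monte Carlo check of the stated fact: (nx + ny - 2) * Sp^2 / sigma^2
    # should behave like a chi-squared with nx + ny - 2 degrees of freedom.
    set.seed(1)
    nx <- 8; ny <- 21; sigma <- 2
    sims <- replicate(10000, {
      x <- rnorm(nx, mean = 0, sd = sigma)
      y <- rnorm(ny, mean = 5, sd = sigma)
      sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
      (nx + ny - 2) * sp2 / sigma^2
    })
    mean(sims)   # should be close to nx + ny - 2 = 27
    var(sims)    # should be close to 2 * (nx + ny - 2) = 54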
So now we can construct our T statistic. We take Y bar - X bar, subtract off its mean, mu Y - mu X, and divide by its standard error, sigma times the square root of (1/nx + 1/ny). And then we divide the whole thing by the square root of a chi-squared over its degrees of freedom: (nx + ny - 2) Sp^2 over sigma squared, divided by nx + ny - 2. So if you look at that, that top part is a
standard normal: the original data for the two groups are Gaussian, so we know that the sample means are Gaussian, so we know the difference in the sample means is Gaussian. And if we take a Gaussian, subtract off its mean, and divide by its standard deviation, we wind up with a standard normal. So the top is a standard normal.
We're stating that the top is independent of the bottom.
And then the bottom we know is the square root of a chi-squared divided by its degrees of freedom. So the whole thing has to be a T random
variable with nx + ny - two degrees of freedom.
And then if you collect terms and work with the arithmetic a little bit, you see that this left-hand side works out to be (Y bar - X bar) - (mu Y - mu X), the whole thing divided by Sp times the square root of (1/nx + 1/ny), which is basically just the statistic we'd like to use: the observed difference in means minus the population difference in means, divided by the standard error, but with sigma replaced by our data estimate of sigma, so sigma replaced by Sp.
And again, notice the form of this is estimator minus true value divided by
standard error, estimated standard error again.
And then I'm hoping that you should be able to use the same logic from the previous lecture, in how we constructed that confidence interval, to just say, okay, well, for the confidence interval for the difference in means we now just turn through these same calculations, and we get Y bar minus X bar, plus or minus the appropriate T quantile, times the standard error.
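Just as a sketch of that whole recipe in R, under the equal-variance assumption (the function name and arguments here are illustrative, not from the slides):

    # Equal-variance (pooled) t interval for mu_Y - mu_X:
    #   (Ybar - Xbar) +/- t_{0.975, nx + ny - 2} * Sp * sqrt(1/nx + 1/ny)
    pooled_t_interval <- function(xbar, ybar, sx, sy, nx, ny, conf = 0.95) {
      df <- nx + ny - 2
      sp <- sqrt(((nx - 1) * sx^2 + (ny - 1) * sy^2) / df)
      se <- sp * sqrt(1 / nx + 1 / ny)
      tq <- qt(1 - (1 - conf) / 2, df)
      (ybar - xbar) + c(-1, 1) * tq * se
    }

If you have the raw data rather than summary statistics, R's t.test function with var.equal = TRUE produces this same pooled interval.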
Okay let me repeat that again. The estimate plus or minus the appropriate
T quantile times the standard error. So again, the interval works out to be the
estimate plus or minus the appropriate quantile from the appropriate distribution
times the standard error. Okay. Remember, a big assumption in this is that the variances in the groups are equal. And we'll talk about that later. I guess one thing I'd like to mention about the equal variance assumption now, while we're on it, is that there are actual tests for equality of variances between independent groups.
They work out to be F tests. But those tests are kind of notoriously
bad. So I think some textbooks will do things
like suggest testing equality of the variances.
If the variances are equal, then do this confidence interval and if they're
unequal, do the confidence interval that we're going to talk about in a slide or
two. But I don't like that procedure at all.
I think you should look at graphs, look at the data, and make assessments as to whether or not the variances are equal or unequal, and use that to decide.
If you really must estimate the ratio of the variances in the groups, then bootstrapping would be my suggested technique for doing it, unless maybe the sample sizes are very small.
But another safe thing to do is just always assume that the variances are
unequal. So, if you're worried about this
assumption, you just always do the conservative thing and assume that the
variances are unequal. It's maybe a little above the discussion in this class how to get a likelihood for mu Y minus mu X. But it turns out that getting a likelihood for mu Y - mu X divided by sigma, which is still a single parameter, is very easy. And the reason is that this statistic, Y bar - X bar divided by its standard error, follows a distribution which is called a non-central T distribution, and the non-centrality parameter depends on mu Y - mu X over sigma and then something involving the n's that we know. And so you can use this fact to create a
likelihood, not a profile likelihood or anything like that, just an honest to
goodness likelihood for Mu Y - Mu X over Sigma.
I should say what mu Y - mu X over sigma is: I think of it as kind of an effect size type measurement; it's the difference in the means standardized relative to the intra-group standard deviation. So it's the difference in the means in standard deviation units, which is a very useful thing if you want to calibrate your difference in the means across studies.
Right? If you want to say oh, well this
difference in the means is kind of big. Well, what does big mean?
You know, in one case it's measured in inches, in other cases it's measured in tons or something, and in other cases it's measured in centimeters. So if you look at different experiments, the units are all different, the context is very different, and it's impossible to compare, say, mu Y - mu X across experiments. But there is some hope for comparing mu Y - mu X over sigma across experiments, cuz you've gotten rid of the units, and everything is expressed in intra-group standard deviation units.
So it's a meaningful parameter and it's easy to create a likelihood.
So I'll show you how to do it, but I'll say this is not a tremendously common technique. On the other hand, the comparison of the two groups using a standard T confidence interval with the pooled variance is an extremely common technique. So let's go through an example of actually constructing this interval. And by the way, this is just a special case of what's called ANOVA estimation, just where you happen to have two groups.
Rosner has this great book called Fundamentals of Biostatistics and I got
this example from page 304 of his textbook.
And he looked at an example where they were comparing systolic blood pressure for eight oral contraceptive users versus 21 controls. There was some concern, I guess, in the study, over whether or not oral contraceptive use increased the systolic blood pressure, measured in millimeters of mercury. So the X bar for the oral contraceptive
users was 132.86, the standard deviation for the oral contraceptive users was
15.34, the mean systolic blood pressure for the control subjects was 127.44 and,
and so on. So the pooled variance estimate takes the weighted average of the two variances, and so you'll see the formula right here; it works out to be 307.18. One of the easiest mistakes to make, if you
happen to be doing these calculations by hand is to forget and to pool these
standard deviations instead of the variances.
You should pool the variance, not the standard deviations.
So if I took out these squares, I'd get the wrong answer; I would get a pooled standard deviation instead of a pooled variance. And you'd really mess up if you treated that number as if it were a variance, because it would be on the wrong order of magnitude.
So the biggest mistake you can make is to not square those things.
One way to check, when you do this, by the way, is that it's an average, right? So in this case it's going to be an average of 15.34^2 and 18.23^2, so it has to be between those two numbers, 15.34^2 and 18.23^2. And then when you square root it, that number has to be between 15.34 and 18.23. So if that hasn't happened, you've really screwed up.
Okay, so we've got our pooled variance, and if we square root that, we get our
pooled standard deviation. And then we need the appropriate T
quantiles. So we need the 0.975 quantile if we want a 95% confidence interval, and we need 27 degrees of freedom, which is, if I'm doing my arithmetic correctly, 8 + 21 - 2. And in R, you can just get this number with qt, Q standing for quantile and T standing for the T distribution: qt(0.975, df = 27) will just return 2.052, maybe plus some other decimal places. And so the interval is 132.86 - 127.44, plus or minus our T quantile, times our pooled standard deviation, times the square root of 1/8 + 1/21. This works out to be -9.52 to 20.36.
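For what it's worth, here's that whole example spelled out in R; it should reproduce the numbers above up to rounding.

    # Rosner example: 8 oral contraceptive users versus 21 controls
    n_oc <- 8;  xbar_oc <- 132.86; s_oc <- 15.34
    n_c  <- 21; xbar_c  <- 127.44; s_c  <- 18.23

    sp2 <- ((n_oc - 1) * s_oc^2 + (n_c - 1) * s_c^2) / (n_oc + n_c - 2)
    sp  <- sqrt(sp2)                        # pooled SD, between 15.34 and 18.23
    tq  <- qt(0.975, df = n_oc + n_c - 2)   # about 2.052

    (xbar_oc - xbar_c) + c(-1, 1) * tq * sp * sqrt(1 / n_oc + 1 / n_c)
    # about -9.52 to 20.36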
One of the most important things to look for, because it's a difference in means, is whether or not the interval contains zero,
right? Because if the interval contains zero,
then that would say a reasonable estimate for the difference in blood pressures
between the two groups is that they're identical.
And in this case it does contain zero. So, you know, there's evidence to say that there is no difference; in other words, this oral contraceptive use doesn't appear to be presenting evidence that there's an associated increase in blood pressure. It turns out that evaluating whether or not zero is in this interval is equivalent to a two-sided hypothesis test, and we'll talk about hypothesis tests later on.
So you have to be careful in how you interpret hypothesis tests. Right now, we might as well just say -9.5 to 20.4 is a reasonable range, accounting for the uncertainty in the measurements, for comparing the average systolic blood pressure between oral contraceptive users and controls. Now, by the way, another thing to keep
track of whenever you create these intervals is what order you've subtracted
things in, right? In this case, we did contraceptive users
minus controls. I think you should pick a rule and stick
with it, you know? So I always use, say, treated minus
control. And it doesn't matter.
Of course, you just get the negative of the interval if you do it the other
direction. But in interpreting it, let's say this interval was entirely above zero; then you would be saying that oral contraceptive users had an estimated higher systolic blood pressure than controls.
But if you forgot and thought you had subtracted controls minus oral
contraceptives you would get the opposite interpretation.
So at any rate, my point being, just remember what order you subtract things in, because it's an easy mistake to make. Here on the next slide I just have a
likelihood plot for the effect size using the non-central T distribution.
And I got, you know, a rough idea of the range of values to plot, by the way: I took my confidence interval, -9.52 to 20.36, and I just divided it by the pooled standard deviation, and that gives me about -0.54 to 1.16. And so, you know, I plotted I think from
-1.5 to positive 1.5. So you know, I got like, at least a rough
idea of the range of things to plot from looking at the interval.
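Here's a rough sketch in R of how that likelihood could be computed, using the non-central T fact stated earlier; the grid of effect sizes and the plotting details are just illustrative choices, not anything from the slides.

    # Likelihood for the effect size (muY - muX) / sigma, using the fact that
    # T = (Ybar - Xbar) / (Sp * sqrt(1/nx + 1/ny)) is non-central t with
    # df = nx + ny - 2 and ncp = effect_size / sqrt(1/nx + 1/ny).
    n_oc <- 8; n_c <- 21
    sp    <- sqrt(((n_oc - 1) * 15.34^2 + (n_c - 1) * 18.23^2) / (n_oc + n_c - 2))
    tstat <- (132.86 - 127.44) / (sp * sqrt(1 / n_oc + 1 / n_c))

    effect_size <- seq(-1.5, 1.5, length.out = 1000)
    lik <- sapply(effect_size, function(es)
      dt(tstat, df = n_oc + n_c - 2, ncp = es / sqrt(1 / n_oc + 1 / n_c)))
    plot(effect_size, lik / max(lik), type = "l",
         xlab = "(muY - muX) / sigma", ylab = "relative likelihood")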
This, by the way, -0.54 to 1.16, is not a valid interval for the effect size, because we haven't accounted for the uncertainty in estimating this Sp here in the denominator. Later on we'll talk about an effective way for generating confidence intervals for nearly any statistic you can dream of, using bootstrapping. So finally, I just want to briefly mention what you do if you're not
willing to assume the variances are equal. Well, we can calculate the variance in the unequal case. Y bar - X bar is still, of course, normal. Its mean is still, of course, mu Y - mu X, but now its variance is changed. We can't factor out the sigma squared, so it works out to just be sigma squared X / nx + sigma squared Y / ny. And the statistic we'd like to use to generate our confidence interval would be (Y bar - X bar) - (mu Y - mu X) divided by this variance with the estimated variances plugged in, Sx^2 / nx plus Sy^2 / ny, all raised to the one-half power. Unfortunately, that doesn't exactly follow
a T distribution. But, there's a smart idea.
Right. And the idea was, well, it's maybe not a T distribution, but people could simulate and kind of figure out what its distribution looks like.
And they said, well, it looks an awful lot like a T distribution, but you know, we
can't seem to get the degrees of freedom exactly right to make it perfectly a T
distribution. And they said, well, why don't we find
like the best degrees of freedom to make this look like a T distribution.
And they said, well, you know, we could have that degrees of freedom depend on the data, on the variances for example. And we could have it be a fractional degrees of freedom, not have to be a whole number. And everyone said, that's great, that's a
great idea, why don't we do that? And, so they came up with this crazy
formula for the degrees of freedom by trying to figure out the best degrees of
freedom that makes it look like a T distribution.
And that's what you can do. You basically evaluate the statistic with the empirical variances plugged in, in the denominator, and then act like it's T distributed with this kind of impossible-to-remember but easy-to-plug-into degrees of freedom formula.
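For reference, the formula being described here is, I believe, the usual Welch-Satterthwaite approximation; here's a sketch of it in R (the function name is just illustrative):

    # Welch-Satterthwaite approximate degrees of freedom: it depends on the
    # data through the variances, and is typically fractional.
    welch_df <- function(sx, sy, nx, ny) {
      (sx^2 / nx + sy^2 / ny)^2 /
        ((sx^2 / nx)^2 / (nx - 1) + (sy^2 / ny)^2 / (ny - 1))
    }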
And that's all you have to do. And that confidence interval works really,
really well. So our confidence interval is just going
to be Y bar - X bar plus or minus the appropriate T quantile with these crazy
degrees of freedom, 0.975 quantile for a 95% interval, times the estimated standard
error and then we have a T interval. So let's go through that really quickly
for our example. We're gonna compare our eight oral contraceptive users versus our 21 controls.
I just re-put all the numbers here just to remind you.
In this case, you directly plug Soc and Sc into this formula from the previous page: Sx^2 in this case would be Soc^2, Sy^2 would be Sc^2, nx would be noc, and ny would be nc.
Just plug those directly into that formula, and you get 15.04 degrees of
freedom. And that's if I plugged into the formula correctly; maybe everyone can double-check me. I think, with 10,000 people double-checking me, we should get it right.
The T quantile for that turns out to be 2.13.
So you just construct the interval: the difference in the means, plus or minus the appropriate T quantile, times the standard error, 15.34^2 / 8 + 18.23^2 / 21, square root the whole thing, and you get the confidence interval right here.
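Here's that Welch calculation sketched in R; plugging in the summary statistics should give the 15.04 degrees of freedom and roughly the 2.13 quantile mentioned above.

    # Unequal-variance (Welch) interval for the same example
    n_oc <- 8;  xbar_oc <- 132.86; s_oc <- 15.34
    n_c  <- 21; xbar_c  <- 127.44; s_c  <- 18.23

    se <- sqrt(s_oc^2 / n_oc + s_c^2 / n_c)
    df <- (s_oc^2 / n_oc + s_c^2 / n_c)^2 /
          ((s_oc^2 / n_oc)^2 / (n_oc - 1) + (s_c^2 / n_c)^2 / (n_c - 1))
    df                    # about 15.04
    tq <- qt(0.975, df)   # about 2.13

    (xbar_oc - xbar_c) + c(-1, 1) * tq * se   # the interval on the slide

If you have the raw measurements rather than the summaries, R's t.test function with its default var.equal = FALSE does this same Welch calculation.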
So you interpret the confidence interval in the same way; you're obviously interested, in this case, in whether zero is in the confidence interval. And if you want kind of a safe thing to do, you would just always do this interval instead of assuming equal variances. Well, that was a quick lecture today.
And, I hope using the kind of thought process from this lecture and the previous
one, that you should be able to sort of create confidence intervals at a whim now.
If there's any case where you can figure out what the standard error of a statistic is, then you'd more or less think, well, I'll get a confidence interval by taking the estimate plus or minus some quantile from some distribution times the standard error. And that quantile will usually
either be a T quantile or a standard normal quantile.
And I think you'll notice that the vast majority of confidence intervals that we
cover in this class and the vast majority of confidence intervals that you encounter
in practice will be exactly of this form. So I look forward to seeing you next time,
and next time we are going to have a light lecture.
We're going to talk about plotting.