0:01
So the Z test that we're talking about requires the assumptions of
the CLT and for n to be large enough for it to be applicable.
If n is small, then you could just do Gosset's Student's t test
in the same way; you're just replacing
the normal quantiles with the Student's t quantiles.
The probability of rejecting the null
hypothesis when it's false is called power.
Remember, we set the type
one error rate, which is the probability of rejecting the null hypothesis when
it's true, so we force the type one error rate to be small.
The probability of failing to reject the
0:40
null hypothesis when in fact the null hypothesis
is false is called a type two error rate.
Power is 1 minus that, it's the probability
of rejecting the null hypothesis when it is false.
And so, power is a good thing.
You want to reject the null hypothesis when it's false.
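Just to pin those definitions down in symbols, here is how they could be written. The name beta for the type two error rate is notation I'm adding here, not something from the slide.

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ true})
  && \text{(type one error rate)} \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ false})
  && \text{(type two error rate)} \\
\text{power} &= 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ false})
\end{align*}
```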
And unfortunately, power is not typically under
our control after the experiment has been conducted.
So the way that people combat this is, prior to conducting
the study, they do a power calculation where they vary the sample
size or, if it's simple enough, just calculate the sample
size needed to obtain a certain level of power, using guesses for what they think
the standard error and the hypothesized
effect would be.
And that's what we'll talk about next lecture.
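As a rough preview, here is a minimal sketch of that kind of sample size calculation for a one sided Z test, using the usual normal approximation. The function name, the 80% power target, and the guesses of 10 for the standard deviation and 2 for the effect are all assumptions for illustration, not values fixed by the lecture.

```python
import math
from statistics import NormalDist


def sample_size_one_sided_z(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n for a one sided Z test (hypothetical helper).

    delta: guessed difference between the true mean and the null mean
    sigma: guessed population standard deviation
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # upper alpha quantile of the normal
    z_power = z.inv_cdf(power)      # quantile matching the desired power
    # Normal-approximation formula: n = ((z_alpha + z_power) * sigma / delta)^2
    return math.ceil(((z_alpha + z_power) * sigma / delta) ** 2)


# Guessed inputs: effect of 2 units, standard deviation of 10.
n_needed = sample_size_one_sided_z(delta=2.0, sigma=10.0)
print(n_needed)
```

Notice that a bigger guessed effect, or a smaller guessed standard deviation, brings the required sample size down.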
Okay.
So let's actually go through the t calculation for this example.
Suppose that n is 16 rather than 100, as we were considering
before, so we're going to use a t test.
Then, look at this equation right here. We want 5% to be the probability
that X bar minus 30, the value under the null hypothesis,
divided by the estimated standard error, now s over square root 16,
is larger than the t quantile now, instead of the z quantile.
Again, it's the 1 minus alpha quantile, but with 15 degrees of freedom.
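Written out, the condition being described looks something like the following. The subscript notation for the quantile is an assumption on my part; the probability is computed under the null hypothesis that mu equals 30.

```latex
\[
P\!\left( \frac{\bar{X} - 30}{s / \sqrt{16}} > t_{1-\alpha,\,15} \;\middle|\; \mu = 30 \right)
  = \alpha = 0.05
\]
```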
2:17
So our test statistic now is this standardized observed
mean: 32, our observed mean, minus the hypothesized value, 30,
divided by the standard error, 10 over square root 16.
Square root 16 then moves up into the numerator, and that works out to
be 0.8, and the t critical value is 1.75.
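Here is a minimal sketch that reproduces those two numbers using only the Python standard library. The helper functions are hypothetical ones written for this illustration: the CDF is a numerical integral of the t density and the quantile comes from bisection. In practice you would normally use a statistics package instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df, lo=-50.0, hi=50.0):
    """Invert the CDF by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Values from the example: observed mean 32, null value 30, s = 10, n = 16.
xbar, mu0, s, n = 32.0, 30.0, 10.0, 16
alpha = 0.05

test_stat = (xbar - mu0) / (s / math.sqrt(n))
t_crit = t_quantile(1 - alpha, n - 1)  # one sided, 15 degrees of freedom
reject = test_stat > t_crit

print(round(test_stat, 2), round(t_crit, 2), reject)
```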
And so now we fail to reject, and it shouldn't be surprising,
right? We're changing what used to be multiplication by square root
100 to now square root 16. And so, the test statistic went
down substantially, while the quantile that we're comparing it to went up.
Because remember, the t is a heavier-tailed distribution than the
normal, so it shouldn't be surprising that we now fail to reject.
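To see both effects side by side, here is a small sketch comparing the two cases. It assumes the n equals 100 case used the same observed mean of 32 and standard deviation of 10, and it takes the t critical value of 1.753 as a given number rather than computing it.

```python
import math
from statistics import NormalDist

mu0, xbar, s = 30.0, 32.0, 10.0

# Large sample: compare with the normal 95th percentile.
z_crit = NormalDist().inv_cdf(0.95)
stat_100 = (xbar - mu0) * math.sqrt(100) / s

# Small sample: compare with the t 95th percentile, 15 degrees of freedom.
t_crit_15 = 1.753  # quoted value, not computed here
stat_16 = (xbar - mu0) * math.sqrt(16) / s

print("n=100:", round(stat_100, 2), "vs", round(z_crit, 3), "reject:", stat_100 > z_crit)
print("n=16: ", round(stat_16, 2), "vs", t_crit_15, "reject:", stat_16 > t_crit_15)
```

The statistic shrinks by the ratio of the square roots, 4 over 10, while the cutoff grows from about 1.645 to about 1.753.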
Okay. So in the previous
slide, we did the one sided test.
Let's now do the two sided test, and we're
going to move through these things quickly because I'm hoping
at this point in the class that
you're getting used to these kinds of calculations.
So we want to now test whether mu is different from 30 as the alternative.
And maybe you could say that doesn't make a lot of sense in this case
because the way I framed the problem
was that we're looking at a particularly susceptible
population to having a high RDI, so
why don't we just test mu greater than 30?
Well, for the sake of argument,
just to show you the calculations, let's do different from 30.
But also, I would say that in
many journals and avenues of scientific inquiry, they
demand two sided tests even if the one
sided test is the natural direction to consider.
4:03
So let's do a two sided test.
So what we want is to test whether or not our observed mean X bar is significantly
different from our null hypothesized value for the population mean, 30.
So that would be if it's significantly larger than 30 or significantly smaller than 30.
So, we could just say, well, maybe we will look
at the absolute value of X bar minus 30, which would look
at whether it's too small, below 30, or too large, above 30.
And then of course, because we want to standardize our statistic,
we're going to divide by the standard error of the mean, s over square root 16.
And we know that X bar minus 30 over s over square
root 16, if the data are iid Gaussian, follows a t distribution.
4:55
And so, if we want alpha, the type one error rate, to be specified
so that the probability that this test statistic is too large or too small
is exactly
alpha, well, what we could then pick
is the t quantile, t 1 minus alpha over 2, with 15 degrees of freedom.
And what this says is that, for this random t statistic,
the probability of it being larger than
the t 1 minus alpha over 2 quantile is alpha over 2.
The probability that this test statistic, on the negative end, is less
than the t alpha over 2 quantile with 15 degrees
of freedom, which is a negative value, is also alpha over 2.
So we put alpha over 2 in the lower tail, alpha over
2 in the upper tail, and that yields a total probability of alpha.
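In symbols, the split across the two tails could be written as follows. Here T stands for the standardized statistic, X bar minus 30 over s over square root 16, which is shorthand I'm introducing; the second line uses the symmetry of the t distribution about zero.

```latex
\begin{align*}
P\!\left( |T| > t_{1-\alpha/2,\,15} \right)
  &= P\!\left( T > t_{1-\alpha/2,\,15} \right) + P\!\left( T < t_{\alpha/2,\,15} \right) \\
  &= \frac{\alpha}{2} + \frac{\alpha}{2} = \alpha,
  \qquad \text{since } t_{\alpha/2,\,15} = -\,t_{1-\alpha/2,\,15}.
\end{align*}
```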
5:58
And in the next slide, I'll describe that a little bit.
And this calculation is, of course, all done
under the null hypothesis that mu equals 30.
So our test statistic, which in this case is X bar minus 30
over s over square root 16, is 0.8.
So, when we take the absolute value, it remains 0.8.
And we're going to reject if it's either too large or too small.
But again, remember, the critical value is calculated now using
alpha over 2 rather than alpha, because we want alpha over
2 probability of rejecting for too large, and alpha over 2
probability of rejecting if the test statistic is too small, that is, too far
negative.
So, in this case, the critical value is 2.13, and notice, of
course, that's a larger value than when we just used alpha, because
we're going further out into the tail, so it's harder to reject
for the two sided test than it is for the one sided test.
So since we failed to reject for the one sided test,
we're, of course, going to fail to reject for the two sided test.
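Putting the numbers from this example together, the two sided comparison works out roughly as below. The critical values are the rounded ones quoted in the lecture, and the abbreviation TS for the test statistic is my own.

```latex
\begin{align*}
|TS| &= \left| \frac{32 - 30}{10 / \sqrt{16}} \right| = 0.8 \\
t_{0.95,\,15} &\approx 1.75 \quad \text{(one sided cutoff)}, \qquad
t_{0.975,\,15} \approx 2.13 \quad \text{(two sided cutoff)} \\
0.8 &< 1.75 < 2.13
  \;\Longrightarrow\; \text{fail to reject } H_0 \text{ in both tests.}
\end{align*}
```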
6:57
Okay.
Let's just briefly again show you
the two sided calculation and where the alpha over 2 comes from.
So here, I'm setting a sequence of x values from minus 4 to plus 4.
I'm evaluating the t density with 15 degrees of freedom at those
points, and then let me plot. And there's my t distribution.
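If you want to follow along, the setup just described might look like the sketch below in Python. The grid of 201 points is an assumed choice, the density function is written out by hand so nothing beyond the standard library is needed, and the plotting step itself is left to whatever plotting library you prefer.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


# Sequence of x values from -4 to +4 (201 evenly spaced points, an assumption).
xs = [-4 + 8 * i / 200 for i in range(201)]

# Evaluate the t density with 15 degrees of freedom at those points.
ys = [t_pdf(x, 15) for x in xs]

# Print a coarse table of the curve; pass xs and ys to a plotting library to draw it.
for x, y in list(zip(xs, ys))[::25]:
    print(f"{x:5.1f}  {y:.4f}")
```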
7:23
Okay, now I'm going to shade in that area right there.
That's 2.5%.
And let's say my alpha, my type one error rate that
I want is 5%, and that value right there is 2.13.
So for the t distribution, the 97.5th
percentile is 2.13 when you have 15 degrees of freedom.
Then let's do the same thing for the lower quantile.
Sorry about that.
There we go. That's
better. And that's 2.5%
right there, and that's negative 2.13, and
then that's 95% in the middle. So what we're saying is we calculate
our normalized test statistic, X bar minus 30
over s over square root 16, and look at the probability that the
absolute value of that statistic is bigger than 2.13.
In other words, the probability that that statistic is too
large, above 2.13, is 2.5%, and the probability that it is too
far negative, less than negative 2.13, is also 2.5%.
So the probability that its absolute value is bigger than 2.13
is 5%, including the upper tail 2.5% and the lower tail 2.5%.
So the probability that the test
statistic lies in the rejection region is 5%.
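As a quick numerical check of those shaded areas, here is a sketch that integrates the t density with Simpson's rule. The helper names are hypothetical, and the cutoff 2.131 is the critical value carried to one more decimal place than the 2.13 quoted above.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(c, df, steps=2000):
    """P(T > c) for c >= 0: one half minus the Simpson integral over [0, c]."""
    h = c / steps
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


cutoff, df = 2.131, 15
upper = upper_tail(cutoff, df)  # area above +2.131
lower = upper                   # area below -2.131, by symmetry
rejection_prob = upper + lower
middle = 1 - rejection_prob

print(round(upper, 3), round(lower, 3), round(rejection_prob, 3), round(middle, 3))
```

Each tail comes out close to 2.5%, so the rejection region has about 5% probability under the null and about 95% is left in the middle.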