0:01

Well, it can be described by its mean and its variance.

And this is true for at least a couple of reasons.

One is that we did make an assumption that the underlying population is normal, and

so we could guess that the derived

0:15

distribution of T bar minus S bar is normal as well.

We could also argue by the Central Limit Theorem that

no matter what the distribution is as long as we have enough samples,

it will tend to be described accurately by a normal distribution.

Which has these two parameters, mean and variance.

So what is the mean of the statistic that we're interested in?

Well, it's the mean of the T sample,

minus the mean of the S sample, so you can really write this as just T bar minus S bar.

That one's pretty straightforward.

The variance of t bar minus s bar is the variance of

our mean t bar plus the variance of our mean s bar.

I'm emphasizing that.

That is the variance of the mean of T, not the variance of the sample itself of T.

If you repeat the experiment over and over again,

say 10,000 times, what you'll end up with is 10,000

means for the patients who were subject to the new treatment.

And the variance of those estimates is what this refers to.

So it's not the variance of the population, nor

is it the variance of the sample itself.

It's the variance of the values of the mean over many, many experiments.
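To make the distinction concrete, here is a minimal simulation sketch. The population parameters and the sample size are made-up numbers chosen only for illustration, and the population is assumed to be normal, as in the lecture. It repeats a hypothetical experiment 10,000 times and compares the variance of one sample with the variance of the 10,000 sample means.

```python
import random
import statistics


def simulate_sample_means(pop_mean, pop_sd, n, experiments, seed=0):
    """Run the same experiment many times and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(experiments):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: mean 50, standard deviation 10 (variance 100).
rng = random.Random(1)
one_sample = [rng.gauss(50.0, 10.0) for _ in range(25)]
one_sample_variance = statistics.variance(one_sample)

means = simulate_sample_means(pop_mean=50.0, pop_sd=10.0, n=25, experiments=10_000)
variance_of_means = statistics.pvariance(means)

print("variance within one sample:", one_sample_variance)
print("variance of the 10,000 means:", variance_of_means)
```

The second number should come out far smaller than the first: the means vary much less from experiment to experiment than individual patients vary within one experiment.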

1:38

So why is the variance of the difference of means, bar T minus bar S, equal to

the variance of bar T plus the variance of bar S?

Well, it's derived from the definition of variance,

but it's somewhat intuitive if you think about it.

There are two populations, and we're taking two samples, and

as the variance of each mean increases, the interval in which each one falls gets wider,

so the difference between those two things is also going to get wider, right?

We could get the maximum number of green jelly beans and

the minimum number of red jelly beans.

And so the difference in means could get bigger, and bigger, and

bigger the more they vary, okay?

So whether we're adding the variables together or subtracting them,

you always add the variances.

I'll also point out that this is another place where we've had to assume that these

are independent samples, as we did assume.
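For reference, the rule follows from the general identity for the variance of a difference. This is a sketch in standard notation, where the covariance term is what the independence assumption removes:

```latex
\begin{align*}
\operatorname{Var}(\bar{T} - \bar{S})
  &= \operatorname{Var}(\bar{T}) + \operatorname{Var}(\bar{S})
     - 2\,\operatorname{Cov}(\bar{T}, \bar{S}) \\
  &= \operatorname{Var}(\bar{T}) + \operatorname{Var}(\bar{S})
     \qquad \text{when } \operatorname{Cov}(\bar{T}, \bar{S}) = 0 .
\end{align*}
```

For a sum, the sign of the covariance term flips to plus, but the two variance terms are added either way.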

2:31

Okay, so where are we?

Well, we're looking for the sampling distribution of bar T minus bar S, and

we know it's described by its mean and variance by the central limit theorem and

due to an assumption of normality for the underlying population.

And we've re-expressed the variance of this statistic that we're interested in,

in terms of this sum.

2:50

But we can't compute this value directly.

We don't know the variance of the means repeated over many experiments.

We don't have access to those many experiments

in which to compute this value.

So, we have to estimate it somehow.

Well, by the central limit theorem, we can rewrite this in terms of

the population variance divided by the sample size.

3:16

I'll refer you to the central limit theorem to see that.

Then we can estimate the population variance by the sample

variance, which I've denoted by this sigma hat, and we can do that for both

samples, and now we have this re-expressed in terms that we actually have access to.

We can compute the sample variance for s, and we know the size of the sample,

and we compute the sample variance for t and we know the size of that sample.
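Written out, the chain of substitutions looks roughly like this. The symbols $n_T$ and $n_S$ for the two sample sizes and $\sigma_T^2$, $\sigma_S^2$ for the population variances are notation assumed here. The sigma hats are the sample variances.

```latex
\begin{align*}
\operatorname{Var}(\bar{T} - \bar{S})
  &= \operatorname{Var}(\bar{T}) + \operatorname{Var}(\bar{S}) \\
  &= \frac{\sigma_T^2}{n_T} + \frac{\sigma_S^2}{n_S} \\
  &\approx \frac{\hat{\sigma}_T^2}{n_T} + \frac{\hat{\sigma}_S^2}{n_S}
\end{align*}
```

Every quantity on the last line can be computed from the two samples we actually collected.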

3:40

Okay.

So we need the sampling distribution of the difference in means.

We have the mean, and the variance

we've re-expressed in terms of quantities that we actually have access to.

I guess I shouldn't have the bar there anymore, it should be just the T and S.

3:58

So back to our original question.

What is the estimated standard error?

Remember that was the denominator of this t-statistic that we need in order to

see how significant this difference is.

All right, remember we're counting sort of in terms of the number of standard errors

away from normal.

So, remember the estimated standard error is the same thing as the estimated

standard deviation of this sampling distribution that we've now described in

terms of a mean and variance.

And so we just take the square root of this side and

the standard deviation of bar T minus bar S is the square root of this thing.

All right, so we're getting closer, almost there.

Now the t-statistic is the statistic we're interested in,

bar t minus bar s, minus the hypothesized value, which in this case is just zero, so

we drop it, divided by the standard error which we just derived.

And, do you remember the mean from a few slides back?

We can now also compute this value and we get this t value.
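As a minimal sketch of that calculation, the function below computes the estimated standard error and the t value from two samples. The data are made-up numbers, not the values from the slides, and the sample variance is assumed to use the usual n minus 1 denominator.

```python
import math
import statistics


def t_statistic(t_sample, s_sample, hypothesized_difference=0.0):
    """Difference in sample means divided by its estimated standard error."""
    difference = statistics.fmean(t_sample) - statistics.fmean(s_sample)
    # Estimated variance of (T bar - S bar): sample variance over sample size,
    # summed over the two samples. statistics.variance divides by n - 1.
    variance_of_difference = (
        statistics.variance(t_sample) / len(t_sample)
        + statistics.variance(s_sample) / len(s_sample)
    )
    standard_error = math.sqrt(variance_of_difference)
    return (difference - hypothesized_difference) / standard_error


# Hypothetical measurements for the two treatment groups.
t_sample = [4.0, 6.0, 8.0, 10.0]
s_sample = [1.0, 3.0, 5.0, 7.0, 9.0]

t_value = t_statistic(t_sample, s_sample)
print("t value:", t_value)
```

The hypothesized difference defaults to zero, which matches the null hypothesis used here, so the term drops out of the numerator.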

4:54

If you've done this kind of process before, you might see this as a fairly

low number of standard errors away from zero; this is a fairly low t value,

which means it's probably not particularly significant.

But we can't judge that right away, we have to go look it up.

We also need the notion of degrees of freedom, and I'm not going to walk through

that process, but I'll show an estimate for the number of degrees of freedom for

this T-distribution.
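The transcript does not name the formula on the slide. A common estimate when the two samples may have different variances is the Welch-Satterthwaite approximation, and the sketch below assumes that is the one meant. The data are again made-up numbers.

```python
import statistics


def welch_satterthwaite_df(t_sample, s_sample):
    """Approximate degrees of freedom for a two-sample t-test with unequal variances."""
    n_t, n_s = len(t_sample), len(s_sample)
    # Each term is an estimated variance of a sample mean.
    a = statistics.variance(t_sample) / n_t
    b = statistics.variance(s_sample) / n_s
    return (a + b) ** 2 / (a ** 2 / (n_t - 1) + b ** 2 / (n_s - 1))


# Hypothetical measurements for the two treatment groups.
t_sample = [4.0, 6.0, 8.0, 10.0]
s_sample = [1.0, 3.0, 5.0, 7.0, 9.0]

df = welch_satterthwaite_df(t_sample, s_sample)
print("estimated degrees of freedom:", df)
```

The result is generally not a whole number, and it always falls between the smaller of the two n minus 1 values and the combined n_t + n_s - 2.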

And the t distribution, if you're not familiar with it, basically looks

much like the Gaussian, but it's got heavier tails, and

the crucial thing is that it depends on the sample size, so it gets

closer and closer to the normal distribution as the sample size increases.

And it was designed to be sensitive to issues of sample size.
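To see the heavier tails and the convergence numerically, here is a small sketch that evaluates the standard density formula for the t distribution and compares it with the standard normal density at a point out in the tail. The choice of x = 3 and the list of degrees of freedom are arbitrary.

```python
import math


def student_t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


# Tail density at x = 3 for increasing degrees of freedom.
for df in (2, 10, 100, 1000):
    print(df, student_t_pdf(3.0, df), normal_pdf(3.0))
```

With few degrees of freedom the t density in the tail is well above the normal density, and the gap shrinks as the degrees of freedom grow.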

So, if we compute the degrees of freedom using this monstrosity and

we have our t value, we can then choose

a significance level that we're interested in and use one of the

tables in the back of a statistics textbook, or of course now online,

and the degrees of freedom is actually pretty high in this case.

So we're way down here and we look at this value of 2.660, and

we compare that to our t-statistic.

So we find that 2.660 is greater, so

we do not have evidence to reject the null hypothesis at significance level 0.05.

This result could have been by chance and

the difference between the two treatments does not seem to be significant.

So, my opinion of that process is that it's pretty painful.

And I want to show you a different way that is just as rigorous, but can be done

in ten lines of code by anyone who has a little bit of programming experience.