In this video we'll introduce the t-distribution and

discuss its origins and mechanics.

In a nutshell, the t-distribution is useful for

describing the distribution of the sample mean

when the population standard deviation, sigma, is unknown, which is almost always the case.

We'll start our discussion with a review of the conditions for

inference we've seen so far, as motivation for why we need this new distribution.

What purpose does a large sample serve?

As long as your observations are independent and

the population's distribution is not extremely skewed, a large sample ensures

that the sampling distribution of the mean is nearly normal,

and that the estimate of the standard error is reliable.

Remember, we estimate the standard error of the sampling distribution

as s over the square root of n, where s is the sample standard deviation.

That is the best estimate we have for

the unknown population standard deviation sigma.

If the sample size is large enough, chances are s is indeed a good estimate

for sigma, and therefore your overall standard error estimate is reliable.
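The standard error formula from the lecture, SE = s / √n, is easy to compute directly. As a quick sketch (the sample values below are made up for illustration, not from the lecture):

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements (illustrative values only).
sample = [4.2, 5.1, 3.8, 4.9, 5.5, 4.0, 4.7, 5.2, 4.4, 4.8]

n = len(sample)
s = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)          # estimated standard error of the sample mean

print(f"s  = {s:.3f}")
print(f"SE = {se:.3f}")
```

Note that `statistics.stdev` uses the n − 1 denominator, which is exactly the sample standard deviation s the lecture refers to.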

But what if the sample size is small?

You might be thinking: in the age of big data, why are we talking about small

samples?

It is true that in certain disciplines,

especially when data are recorded automatically, like webpage clicks or

a Twitter stream, concerns about small sample sizes might be irrelevant.

However, there are certainly disciplines where this is not the case.

Think, for example, about a lab experiment or

a study that follows a near-extinct mammal species.

So we need methods that work well for both large samples and small samples.

The uncertainty of the standard error estimate

is addressed by using the t-distribution.

This distribution also has a bell shape, so it's unimodal and symmetric, and

it looks a lot like the normal distribution.

However, its tails are thicker.

Comparing the normal and

t-distributions visually is the best way to understand what we mean by thick tails.

Notice that the peak of the t-distribution doesn't go as

high as the peak of the normal distribution.

In other words, the t-distribution is somewhat squished in the middle and

the additional area is added to the tails.

This means that under the t-distribution observations are more likely to fall

two standard deviations away from the mean than under the normal distribution.
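We can check this tail claim numerically. As a sketch using only Python's standard library (the t density and its tail integral are implemented by hand here rather than taken from scipy, and the trapezoidal integration is a rough approximation, not production code):

```python
import math
from statistics import NormalDist

def t_tail(cutoff, df, upper=60.0, steps=50_000):
    """P(T > cutoff) for a t-distribution with df degrees of freedom,
    by trapezoidal integration of the density (a rough sketch)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    h = (upper - cutoff) / steps
    area = 0.5 * (pdf(cutoff) + pdf(upper))
    for i in range(1, steps):
        area += pdf(cutoff + i * h)
    return area * h

df = 5
p_t = 2 * t_tail(2.0, df)              # P(|T| > 2) under t with 5 df
p_z = 2 * (1 - NormalDist().cdf(2.0))  # P(|Z| > 2) under the standard normal

print(f"P(|T| > 2), df = 5: {p_t:.4f}")
print(f"P(|Z| > 2):         {p_z:.4f}")
```

The two-tail probability beyond 2 standard deviations comes out noticeably larger under the t-distribution than under the normal, which is the "thick tails" property in action.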

This means that confidence intervals constructed using the t-distribution will

be wider, in other words more conservative,

than those constructed with the normal distribution.
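The wider intervals come from the critical value: a 95% interval uses t* × SE instead of z* × SE, and t* exceeds z* ≈ 1.96. A stdlib-only sketch that recovers t* by bisection on a hand-integrated t tail (an approximation for illustration, not how statistical software actually computes it):

```python
import math
from statistics import NormalDist

def t_upper_tail(cutoff, df, upper=60.0, steps=20_000):
    """P(T > cutoff), by trapezoidal integration of the t density (sketch only)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    h = (upper - cutoff) / steps
    area = 0.5 * (pdf(cutoff) + pdf(upper))
    for i in range(1, steps):
        area += pdf(cutoff + i * h)
    return area * h

def t_critical(df, conf=0.95):
    """Find t* with P(-t* < T < t*) = conf, by bisection on the upper tail."""
    target = (1 - conf) / 2
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_star = NormalDist().inv_cdf(0.975)  # normal critical value, about 1.96
t_star = t_critical(df=9)             # e.g. a sample of n = 10 gives 9 df

print(f"z* = {z_star:.3f}, t* (df = 9) = {t_star:.3f}")
```

Since t* > z*, the margin of error t* × SE is larger, so the t-based interval is wider, exactly the conservatism described above; as the degrees of freedom grow, t* shrinks toward z*.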