0:00

We're going to wrap up our discussion on working with one unknown

population proportion by talking about doing a hypothesis test for a proportion.

Let's go through the steps for doing a hypothesis test,

these are going to look very similar to what we've seen before.

First, we set our hypothesis.

In this case, our unknown population parameter is denoted with p,

as opposed to mu for means, so our null hypothesis sets that p

equal to some null value, and our alternative hypothesis says that p

can be less than, greater than, or not equal to that null value.

Next, we calculate our point estimate.

In this case, that's the sample proportion, p-hat.

Then we check our conditions.

The first condition is independence.

We want to make sure that the

sampled observations are independent of each other.

This could be ensured either

through random sampling or random assignment,

depending on whether you are doing an observational study or an experiment.

And if you're sampling without replacement, we want the

sample size to be less than 10% of the population.

In terms of sample size and skew, we want to make sure we

have at least ten expected successes and ten expected failures in our sample.

Note that here I've used p,

instead of p hat, and that is because in a hypothesis

test, we have to assume that the null hypothesis is true.

If you think about the definition of a p value, it says, probability

of observed or more extreme outcome, if the null hypothesis is true.

So, when going through the conditions, or any other portion of the

hypothesis test, we must assume that the null is true, and therefore,

wherever we see a p, we plug in whatever the null

value for that p is, that's set forth in the null hypothesis.

So, we could read this as not ten observed successes and ten observed failures,

but instead as ten expected successes and ten expected failures.
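
To make that concrete, here is a minimal sketch of the success-failure check for a hypothesis test. The function name `success_failure_ok` and the sample sizes and null value in the usage lines are made up for illustration; the only thing taken from the discussion above is the rule itself, at least ten expected successes and ten expected failures, computed with the null value of p.

```python
def success_failure_ok(n, p_null, minimum=10):
    """Check the success-failure condition for a hypothesis test.

    Uses the null value of p (not p-hat), so the counts are
    expected counts under the null hypothesis.
    """
    expected_successes = n * p_null
    expected_failures = n * (1 - p_null)
    return expected_successes >= minimum and expected_failures >= minimum


# Hypothetical numbers, for illustration only.
print(success_failure_ok(n=200, p_null=0.3))  # about 60 and 140 expected
print(success_failure_ok(n=20, p_null=0.3))   # about 6 and 14 expected
```

With the larger hypothetical sample both expected counts clear ten, and with the smaller one the expected successes fall short, so the check fails.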

Next step is to draw the sampling distribution.

Remember, we always,

always, always want to draw our curve before we calculate our p

value, and we want to shade where the p value belongs:

is it in one tail, and if so, is it the upper tail or

the lower tail, or is it a two-tailed test? Then we calculate our test statistic.

The test statistic is always of the form

observed minus null divided by the standard error.

That's observed sample proportion p hat minus

the null value p that comes from the

null hypothesis divided by the standard error, and

we calculate that standard error as the square root of p times 1 minus p over n.

Note again that I've said p and not p-hat,

because we are again assuming that the null hypothesis

is true and therefore we are using what the

null hypothesis has set forth as our true population parameter.

We don't know if that's the case, but we must assume

that the null is true as we proceed through the hypothesis test.
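
The test statistic just described can be sketched in a few lines. The function name `z_statistic` and the numbers in the usage line are assumptions for illustration; the formula is the one stated above, observed minus null over a standard error that uses the null value of p.

```python
import math


def z_statistic(p_hat, p_null, n):
    """Test statistic for one proportion: (observed - null) / SE.

    The standard error uses p_null, not p_hat, because the test
    assumes the null hypothesis is true.
    """
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error


# Hypothetical numbers, for illustration only.
z = z_statistic(p_hat=0.55, p_null=0.5, n=100)
print(round(z, 2))
```

In this made-up case the standard error is 0.05, so a sample proportion that sits 0.05 above the null value is about one standard error away.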

Lastly, we make a decision and interpret it in context of the research question.

If the p value is less than our significance level, we reject the

null hypothesis and decide that the data

provide convincing evidence for the alternative hypothesis.

If, in fact the p value

is greater than our significance level, we fail to reject the null hypothesis

and conclude that the data do not

provide convincing evidence for the alternative hypothesis.
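
Here is one way the p value and the decision step might look in code, as a sketch rather than a definitive implementation. The helper names (`normal_upper_tail`, `p_value`, `decide`) and the default significance level of 0.05 are assumptions; the lecture does not fix a significance level. The normal tail area comes from `math.erfc` in the Python standard library.

```python
import math


def normal_upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def p_value(z, alternative):
    """Shade the tail(s) that match the alternative hypothesis."""
    if alternative == "greater":
        return normal_upper_tail(z)
    if alternative == "less":
        return normal_upper_tail(-z)
    if alternative == "two-sided":
        return 2 * normal_upper_tail(abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


def decide(p, alpha=0.05):
    """Compare the p value to the significance level (alpha is an assumed default)."""
    return "reject H0" if p < alpha else "fail to reject H0"


# Hypothetical test statistic, for illustration only.
p = p_value(1.0, "greater")
print(round(p, 3), decide(p))
```

The string returned by `decide` is only the mechanical part; the interpretation in context of the research question still has to be written out in words.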

3:33

So, just to clarify this discussion about when do we use p, and when do we use

p-hat, the moral of the story is, we

use the sample proportion when there's nothing else known.

And we use the population proportion, or at least

the null hypothesized value of the

population proportion, when we're doing a hypothesis

test, which dictates that we must assume that the null hypothesis is true.

So, if I want to check the success-failure condition for our confidence

interval, I would use the

observed proportions, the observed sample proportions.

If, on the other hand, I'm checking the

success-failure condition for a hypothesis test, I use the

expected counts and plug in the p that comes from my null hypothesis.
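
A small sketch of that contrast, with made-up numbers: the same sample can fail the condition for a confidence interval and pass it for a hypothesis test, because the two checks plug in different proportions. The function name `success_failure_counts` and the values of n, p-hat, and the null p are hypothetical.

```python
def success_failure_counts(n, proportion):
    """Return (successes, failures) for a given sample size and proportion."""
    return n * proportion, n * (1 - proportion)


# Hypothetical sample: n = 50, observed p_hat = 0.9, null value p = 0.5.
n, p_hat, p_null = 50, 0.9, 0.5

# Confidence interval: use the observed sample proportion.
ci_counts = success_failure_counts(n, p_hat)

# Hypothesis test: use the null value, giving expected counts.
ht_counts = success_failure_counts(n, p_null)

print("CI condition met:", min(ci_counts) >= 10)
print("HT condition met:", min(ht_counts) >= 10)
```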

4:47

A poll found that 60% of 1,983

randomly sampled American adults believe in evolution.

Does this provide convincing evidence that

a majority of Americans believe in evolution?

And when we say majority, what we mean is more than 50%.

So if the question is, is the true proportion

of Americans who believe in evolution greater than 50%,

then our alternative hypothesis should state p is greater than 0.5.

And using this, we can easily figure out what the null hypothesis

can be, because we keep the same population parameter and the same null

value, except we simply set it equal to that number as opposed

to giving a direction one way or another or saying not equal to.

Remember, the null hypothesis always has an equal sign in it,

whereas the alternative could have a greater

than, less than, or not equal to sign,

depending on the research question that's being posed.

We are also given that the sample proportion is 0.6.

So, in this sample, definitely more than 50%

of the respondents believe in evolution, but what

we are checking to see is whether this

difference that we're observing between the sample proportion and

what we're hypothesizing is statistically significant.

In other words, does this particular sample yield

convincing evidence of a majority of Americans believing in evolution?

Another input that we're going to need is our sample size and that's 1,983.

Before we move on to actually doing

inference, remember, we must always check our conditions.

The first condition, as usual, is about independence.

1,983 is definitely less than 10% of all Americans and we have a random sample.

And therefore we can assume that whether one American

in the sample believes in evolution is independent of another.

The second condition is about the sample

size or the skew of the sampling distribution.

And remember, we check this

for proportions using the success-failure condition.

And because we're doing a hypothesis test and

because, within the hypothesis test we have to assume that the null is true, we

would use the p as set forth by the null hypothesis in checking this condition.

So, the total number of successes and failures that are expected in this sample

are both going to be 1,983 times 50%, or 0.5,

which gives us 991.5, which is obviously greater

than ten.

We didn't calculate both of them separately

because we are multiplying by the same 0.5

either way, whether you're calculating expected successes or expected failures.
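
The arithmetic for this example can be checked with a short sketch. The variable names are my own; the sample size of 1,983 and the null value of 0.5 come from the example.

```python
n = 1983      # sample size from the poll
p_null = 0.5  # null value: H0 says p = 0.5

# Expected counts under the null hypothesis.
expected_successes = n * p_null
expected_failures = n * (1 - p_null)

print(expected_successes, expected_failures)
print(expected_successes >= 10 and expected_failures >= 10)
```

Because the null value is 0.5, both expected counts are the same number, which is why only one multiplication was needed.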

7:33

Then, since the success-failure condition is met,

we can assume a nearly normal sampling distribution for our sample proportion.

Now that we've checked and got out of the

way our conditions and given a set of hypotheses and

characteristics on the sample, we can finally calculate our p value.

Before we get there, we need a test statistic.

Before we get there, we need to draw the sampling distribution.

So first, let's try to write it out.

p hat is distributed nearly normally according to our conditions

and according to the central limit theorem.

The center of that distribution should be at the true population parameter.

We don't know the true population parameter.

But, since we are doing a hypothesis test,

we are assuming the null hypothesis to be true.

Therefore, we can plug in the value of the

population parameter that we set forth in our null hypothesis,

and assume that that is indeed the true

population parameter for the purpose of this hypothesis test.

The standard error of the distribution can be calculated as

the square root of 0.5 times 0.5 divided by our

sample size, which comes out to be roughly 0.0112.
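
As a quick check of that standard error, here is a minimal sketch. The variable names are assumptions; the inputs, a null value of 0.5 and a sample size of 1,983, are the ones used in this example.

```python
import math

n = 1983      # sample size from the poll
p_null = 0.5  # null value, used because we assume H0 is true

# Standard error of p-hat under the null hypothesis.
standard_error = math.sqrt(p_null * (1 - p_null) / n)

print(round(standard_error, 4))
```

Note that 0.5 times 0.5 is really p times 1 minus p; the two factors only look identical because the null value happens to be 0.5.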