0:10

In the earlier video, we introduced the experiment from MythBusters, where they were testing the validity of the saying "to know something like the back of your hand."

They had 12 volunteers, and they showed each volunteer a set of 10 pictures of backs of hands, and each volunteer was asked to guess which one was the back of their own hand. And we had seen that 11 out of the 12 subjects were able to identify the backs of their hands correctly.

When they redid the experiment with the palms of the hands this time, only 7 out of the 12 subjects were able to identify the palm of their own hand correctly.

So while the success rate was 91.67% for the back of the hand experiment,

it was only 58.33% for the palm of the hand experiment.
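Those two success rates follow directly from the counts above, as a couple of lines of Python confirm:

```python
# Success rates from the counts given in the lecture.
back_rate = 11 / 12   # back-of-hand experiment: 11 of 12 correct
palm_rate = 7 / 12    # palm-of-hand experiment: 7 of 12 correct

print(round(back_rate * 100, 2))  # 91.67
print(round(palm_rate * 100, 2))  # 58.33
```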

1:03

We want to answer the question, do these

data provide convincing evidence that there is a

difference in how good people are at recognizing

the backs and the palms of their hands?

In this case our null hypothesis

is that there is no difference between the two, so p back minus p palm is 0,

and the alternative says that there is a difference,

so p back minus p palm is not 0.

So first, let's check our conditions. The first one is independence. When we're looking at within-group independence, within each group we can assume that the guess of one subject is independent of another. In other words, just because one subject might be good at it doesn't necessitate that another is going to be good at it as well.

When it comes to between-group independence, though, we have the same subjects doing the guessing for both the back and the palm of the hand. So this condition is actually not met.

For the time being, we're going to assume it to be met, simply for illustrative purposes, so that we can actually work through how we would simulate this scenario.

2:09

When it comes to the sample size and skew

condition, remember we need to check the success failure condition.

And here we're doing a hypothesis test comparing two proportions, and our null hypothesis states that these two proportions are equal to each other. So, in order to check the success-failure condition, we actually need our pooled proportion estimate.

Let's take a look back at the data that we were given. The overall success rate is what we mean by the pooled proportion: of the 24 guesses, 18 of them were correct, so the pooled proportion that we can use is 18 out of 24, which is 0.75.

Within each group there are 12 guesses, so the expected number of successes is 75% of 12, which is 9, and the expected number of failures is only 3. Since both of these are less than 10, we certainly cannot say that the success-failure condition is met.

And therefore we cannot rely on the sampling distribution to

be nearly normal, or use any methods that assume so.
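As a quick sketch, the pooled proportion and the success-failure check described above can be written out directly (the threshold of 10 is the usual rule of thumb for this condition):

```python
# Pooled proportion and success-failure check for the two-proportion test.
successes = 11 + 7    # correct guesses across both groups
n = 12 + 12           # total number of guesses
p_pool = successes / n                # 18 / 24 = 0.75

n_group = 12
exp_success = n_group * p_pool        # expected successes per group: 9.0
exp_failure = n_group * (1 - p_pool)  # expected failures per group: 3.0

# Both expected counts must be at least 10; here they are not,
# so we fall back on a simulation (randomization) test.
condition_met = exp_success >= 10 and exp_failure >= 10
print(p_pool, exp_success, exp_failure, condition_met)  # 0.75 9.0 3.0 False
```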

Therefore in this case, we're going to need to do a simulation test.

So how do we set up our simulation?

We have 24 trials, so we can use 24 index cards, where each card represents a single guess.

We mark 18 of the cards as correct and the remaining six as wrong.

Remember this is what had happened in our actual experiment.

3:32

We shuffle the cards and split them into two groups of size 12, one for back of the hand and one for palm of the hand. Then we calculate the difference between the proportions of "correct" cards in the back and palm decks, and record this number. Finally, we repeat the shuffling, splitting, and calculating many, many times to build our randomization distribution of differences in simulated proportions.
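The card-shuffling procedure just described can be sketched in Python; the function name and the seed are illustrative choices, not part of the lecture:

```python
import random

def simulate_diffs(reps=10_000, seed=None):
    """Shuffle 18 'correct' and 6 'wrong' cards into two decks of 12,
    recording the difference in proportion correct each time."""
    rng = random.Random(seed)
    cards = [1] * 18 + [0] * 6   # 1 = correct guess, 0 = wrong guess
    diffs = []
    for _ in range(reps):
        rng.shuffle(cards)
        back, palm = cards[:12], cards[12:]
        diffs.append(sum(back) / 12 - sum(palm) / 12)
    return diffs

diffs = simulate_diffs(seed=1)
```

Because the two decks always share 18 correct cards between them, each simulated difference lands on a multiple of 1/6, which is why the randomization distribution is made up of discrete bars.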

3:58

Remember, we simulate the experiment under the assumption of independence.

Or, in other words, leaving things up to chance.

If the results from the simulations look like the data, then the difference between the proportions of correct guesses can be said to be due to chance.

If, on the other hand, results from the simulation do not look like the data, we can conclude that the difference between the proportions of correct guesses in the two groups was not due to chance, but because people actually know the backs of their hands better.

So this is what our randomization distribution looks like.

The heights of the bars here represent what percent of the time, or how many times within these 10,000 simulations, a particular simulated difference in proportions was achieved.

Remember, the definition of the p-value is the probability of the observed or a more extreme outcome, given that the null hypothesis is true.

And when we think about the observed, we want to

think about what was the success rate in the

back of the hand group, and what was the

success rate in the palm of the hand group?

And we want to take the difference between these two, because that's going to be the point estimate corresponding to the parameter in our null hypothesis, but based on our sample data.

The difference between the two proportions comes out to be roughly 33%, so the p-value is calculated as the percentage of simulations that are more than 33% away from the center of the distribution.

And the center of the distribution is always at zero, because remember, we're assuming that the null hypothesis is true and we're leaving things up to chance when we shuffle the cards into the two decks.
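A minimal sketch of this two-sided p-value calculation, reusing the card-shuffling idea (variable names and the seed are illustrative):

```python
import random

# Two-sided p-value: the share of simulated differences at least as far
# from zero as the observed difference of 11/12 - 7/12 (about 0.333).
rng = random.Random(1)
cards = [1] * 18 + [0] * 6          # 18 correct guesses, 6 wrong guesses
obs_diff = 11 / 12 - 7 / 12         # observed difference in proportions

reps = 10_000
count = 0
for _ in range(reps):
    rng.shuffle(cards)
    sim_diff = sum(cards[:12]) / 12 - sum(cards[12:]) / 12
    if abs(sim_diff) >= abs(obs_diff):
        count += 1

p_value = count / reps
```

With 10,000 shuffles, the estimate lands close to the roughly 0.16 quoted in the lecture; the exact value can also be computed from the hypergeometric distribution, since the shuffling fixes the total number of correct cards.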

With a p-value of 0.16, or 16%, we would fail to reject the null hypothesis and say that, no, there isn't actually convincing evidence based on these data that people are better or worse at recognizing, or that there's any difference between how they recognize, the backs versus the palms of their hands.