Hypothesis testing is a key concept in public health medicine and in science as a whole, but what is it, and why is it so important? I'll first go through a worked example, and then summarise the step-by-step procedure you should apply each time you want to test a hypothesis.
Suppose you want to see whether people who have cancer are more likely, equally likely, or less likely to eat five portions of fruit and vegetables daily than people without cancer. Let's say you've randomly sampled 50 people with cancer, of whom 10 said they met the fruit and vegetable target, and you've got 100 randomly sampled people without cancer, of whom 30 said they met the target. That gives proportions of 20 percent for those with cancer and 30 percent for those without cancer. Now, that looks like a difference of 10 percentage points, but is it? The answer is, you don't know yet. You need to test whether it's just due to random variation.
So, how do you do that? You need to do a chi-squared test, as in the Greek letter chi. This is a standard statistical test for comparing two or more proportions. But before saying any more about it, I need to explain hypothesis testing, and therefore state what hypothesis you are actually testing with this chi-squared test.
It works like this. You want to know whether your sample estimates of 20 percent and 30 percent are genuinely different. To do this, you test whether they come from the same distribution. This is known as the null hypothesis: that there is no difference between the two proportions. You produce a test statistic, which is covered in the following reading, and you look up the value of that statistic in a table of the relevant distribution to see how extreme it is. As you're comparing proportions, the relevant distribution is the chi-squared distribution. How extreme the statistic is, is measured by the P-value. This is the proportion of the distribution that is equal to or greater than the test statistic. It's the shaded bit on this graph.
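As a rough illustration of the test statistic and the tail area just described, here is a minimal Python sketch using the counts from the example. It assumes the Pearson chi-squared statistic with no continuity correction, which is an assumption and may differ from the formula given in the reading. The function names are hypothetical, and the `math.erfc` shortcut for the tail area only holds for one degree of freedom, which is what a two-by-two table has.

```python
import math

def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the null hypothesis is true.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic

def p_value_one_df(statistic):
    """Upper tail area of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Rows: with cancer, without cancer. Columns: met the target, did not.
observed = [[10, 40], [30, 70]]
statistic = chi_squared_statistic(observed)
p_value = p_value_one_df(statistic)
print(f"chi-squared = {statistic:.2f}, P = {p_value:.2f}")
```

With these counts the sketch gives a statistic of about 1.70 and a P-value of about 0.19.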
The P-value is the probability that the difference in the proportions of people getting their five a day with and without cancer is 10 percentage points or more if, in fact, the true proportions are the same. It's the probability of getting the result you got or a more extreme result, that is, of getting a difference of 10 percentage points or more, given that the null hypothesis is true. In this example, let's say that you've got P equals 0.19, or 19 percent. That's not very high, but it's not that low either. In fact, convention says that in order to reject the null hypothesis, you need a P-value of 0.05 or less. With your P-value of 0.19, you can't be sure that people without cancer are more likely to get their five a day than those with cancer, and so you cannot reject the null hypothesis that the two proportions are the same.
According to this result, there's no strong evidence of an association between fruit and vegetable eating and risk of cancer. Note how I phrased that: "There is no strong evidence of an association." I did not say "there is no association", because there may in fact be an association, but based on these numbers, you couldn't find one. Maybe a bigger sample would have found it. We'll look at the effect of sample size and how to choose it later in the course. Also, note that I did not say "there is no evidence of an association." P equals 0.19 is some evidence of an association; it's just not very strong evidence. The conventional threshold for strong enough evidence is P less than 0.05. As I said, the smaller the P-value, the stronger the evidence for an association, and the less likely that chance is responsible for the result. If P equals 0.04, you could be unlucky and conclude that there's a difference when in fact there isn't. But when P equals 0.0001, so 1 in 10,000, you'd have to be seriously unlucky.
Hypothesis testing can be summarised as follows in five steps. One, set up a null hypothesis.
Here it's that the proportion of people eating five a day is the same in people with and without cancer. Two, choose the measure to be tested: the proportions getting their five a day. Three, decide what is an appropriate distribution for your measure. Here it's the chi-squared distribution for proportions getting their five a day. Four, choose an appropriate statistical test, and this is the chi-squared test for comparing proportions. Lastly, five, run the test and interpret the P-value. And that is hypothesis testing in a nutshell.
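The five steps can be sketched in Python for the worked example. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `compare_two_proportions` and its `alpha` parameter are hypothetical, the statistic is the uncorrected Pearson chi-squared for a two-by-two table, and the `math.erfc` shortcut for the P-value applies only because such a table has one degree of freedom.

```python
import math

def compare_two_proportions(yes_a, n_a, yes_b, n_b, alpha=0.05):
    """Chi-squared test of the null hypothesis that two proportions are equal.

    Hypothetical helper: uncorrected Pearson statistic for a 2x2 table.
    """
    # Step 2: the measure being tested is the proportion in each group.
    prop_a, prop_b = yes_a / n_a, yes_b / n_b
    # Steps 3 and 4: chi-squared statistic for the 2x2 table of counts.
    a, b = yes_a, n_a - yes_a
    c, d = yes_b, n_b - yes_b
    total = n_a + n_b
    statistic = total * (a * d - b * c) ** 2 / (n_a * n_b * (a + c) * (b + d))
    # Step 5: tail area of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return {
        "proportions": (prop_a, prop_b),
        "statistic": statistic,
        "p_value": p_value,
        "reject_null": p_value <= alpha,
    }

# Step 1: the null hypothesis is that both groups have the same proportion.
result = compare_two_proportions(10, 50, 30, 100)
print(result)

# Same proportions with ten times as many people (made-up counts).
larger = compare_two_proportions(100, 500, 300, 1000)
print(larger["p_value"], larger["reject_null"])
```

With the example counts the P-value is about 0.19, so `reject_null` is false. The second call uses made-up counts with the same 20 and 30 percent but ten times as many people, and there the P-value falls well below 0.05, which illustrates how the same difference can be strong evidence in a larger sample.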