Have you ever wondered why some studies in medicine are done on what seems like a handful of people, whereas other studies involve thousands? Probably one of the first reasons you'll think of is cost, and that's undoubtedly a key influence. In fact, the most common sample size in medicine is zero, because the researchers can't get the money to even begin the study. But there are scientific considerations too, and I'm now going to outline some of the main ones.

One of them is study type, or study design. There are many different study types in medicine and public health, from the case series, which describes a few people with something in common, to the clinical trial, in which patients are randomised to receive one treatment or another. If you want to evaluate some public health policy, you could use a trial design, in which collecting new data specifically on the patients recruited for that trial is the norm. But you could also use a non-trial design, in which administrative data or Electronic Health Records generally play at least some part. Collecting your own data in a trial is expensive and time consuming, but that way you'll get exactly the kind of information you need. In contrast, administrative data often cover whole regions or even whole countries, allowing much larger numbers of people to be studied. That saves time and money, but you might not get all the information on your patients that you need.

Even within a given study type, the sample size, that is, the number of patients whose data are analysed, varies greatly. One of the main determinants is the need to be able to show evidence of an important difference between the patient groups we are studying. So, for instance, if you want to, say, evaluate a new health promotion campaign to reduce smoking rates, which you plan to test in a few areas, you'll want a control group with which to compare the results. How many areas, and therefore how many people, do you need in each group? Well, if the intervention areas cut their rate of smoking from, say, 25 percent to 23 percent, and the control areas cut their rate from 25 percent to 24 percent, there's a net benefit of two minus one equals one percentage point in favor of your new campaign. Would you judge that campaign a success? It doesn't sound like much. So what size of difference would you consider enough of a success to make you want to roll out the intervention to other areas? That difference is called the minimum clinically important difference: it's big enough to get you excited and to make you change policy. You would want to tell your family and the media about it.

Now, there are formulae online to tell you how many patients you have to enroll in each group in order to be able to detect that difference - that is, for your results to be statistically significant as well as clinically significant. The more patients you enroll, the more likely you are to get a nice low P-value. The formula tells you how many patients you need to recruit, but this will always be adjusted to take account of the reality of cost and the reality that some people will refuse to take part and others will drop out during the course of the study.

So to sum up, the sample size is driven by a mixture of science and reality. On the science side, we have study design, the size of the minimum clinically important difference that's worth detecting, and how sure we want to be that there is a real difference between the groups.
On the reality side, we have the need to get funding and the logistics of recruiting the patients. No wonder people do thought experiments.
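To make the formula step above concrete, here is a minimal sketch in Python of the standard normal-approximation sample size calculation for comparing two proportions, which is one common way such online calculators work. The function name, and the choices of a 5 percent two-sided significance level and 80 percent power, are illustrative assumptions of mine rather than figures from the talk.

from math import ceil, sqrt
from scipy.stats import norm

def per_group_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of people needed in EACH group to detect a
    difference between two proportions p1 and p2 (two-sided test,
    normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for a 5% significance level
    z_beta = norm.ppf(power)            # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2               # average proportion across the two groups
    n = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p1 - p2) ** 2)
    return ceil(n)

# The talk's smoking example: 24% in control areas versus 23% in
# intervention areas at follow-up, a one-percentage-point difference.
print(per_group_sample_size(0.24, 0.23))   # roughly 28,000 people per group
# A larger minimum clinically important difference needs far fewer people.
print(per_group_sample_size(0.25, 0.20))   # roughly 1,100 people per group

The sketch simply illustrates the point made in the talk: the smaller the minimum clinically important difference, and the more certain you want to be of detecting it, the larger the sample you need, and the recruited number is then inflated further to allow for refusals and drop-outs.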