Hi. In this module we'll continue talking about the multiple comparisons problem. In the last module we talked about familywise error rate correction; here, we'll talk about false discovery rate correction. Methods that control the familywise error rate, such as Bonferroni correction, random field theory, and permutation tests, provide strong control over the number of false positives. While this is a really appealing property, the resulting thresholds tend to be very stringent and lead to tests that suffer from low power. Power is very important in fMRI applications because many interesting effects lie at the edge of detection. The false discovery rate, or FDR, is a more recent development in multiple comparisons due to Benjamini and Hochberg, from a paper published around 1995. While the familywise error rate controls the probability of any false positives, the false discovery rate controls the proportion of false positives among all rejected tests. So this is a slightly different criterion that we're controlling here. To get some notation down, let's suppose that we're performing tests on m different voxels. Now we can make the following little table: we separate voxels into those that are truly inactive and those that are truly active. Of course we're never privy to this information, but in general there are truly active voxels and truly inactive voxels in this setting. We can also declare voxels inactive or active, and that's something that we do have control over. So, for example, in this table V is the number of truly inactive voxels that were declared active. Those are voxels we should not have rejected but did reject; that's a false positive. U, V, T, and S are all unobservable random variables, because we don't know how many false positives we're making in practice.
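To make the table concrete, here is a small simulation, a sketch only: the voxel counts, the uncorrected 0.05 threshold, and the beta distribution for active-voxel p-values are all illustrative assumptions, not part of the lecture. Because we simulate the ground truth, we can tabulate the normally unobservable quantities U, V, T, and S:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000                                       # total number of voxels tested
truly_active = np.zeros(m, dtype=bool)
truly_active[:100] = True                      # assume 100 truly active voxels

# p-values: uniform under the null, concentrated near 0 under the alternative
# (beta(1, 50) is just an illustrative choice for the active voxels)
p = np.where(truly_active, rng.beta(1, 50, m), rng.uniform(0, 1, m))

declared = p < 0.05                            # naive uncorrected threshold

V = np.sum(declared & ~truly_active)           # truly inactive, declared active (false positives)
S = np.sum(declared & truly_active)            # truly active, declared active (true positives)
U = np.sum(~declared & ~truly_active)          # truly inactive, declared inactive
T = np.sum(~declared & truly_active)           # truly active, declared inactive
R = np.sum(declared)                           # total declared active: the only observable count

fdp = V / max(R, 1)                            # false discovery proportion for this data set
```

In practice only R is observed; the point of the simulation is just to show how V, S, T, and U partition the m tests.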
The only thing we do observe is R, the number of voxels that were declared active, because we know which ones were declared active and which weren't. But we don't know what proportion of those were truly inactive versus truly active, and that's what we want to be able to control; that's the basis behind FDR. In this notation, the familywise error rate is simply the probability that V is greater than or equal to one. That means we have one or more false positives, because V is the number of false positives. The false discovery rate is the expected proportion of false positives among all rejected tests, and it is defined to be 0 if R = 0: if we didn't declare any voxels active, then we can't have any false positives, so this is not a problem. A procedure controlling the FDR ensures that on average this proportion is no bigger than some pre-specified rate q, which lies between 0 and 1, let's say 0.05. However, for any given data set, the proportion of false positives need not be below the bound, because the FDR is an expectation: we just know that on average we're going to control it at a certain rate, not what's going to happen in any given situation. So basically an FDR-controlling technique guarantees control of the FDR in the sense that it's less than or equal to q; on average we control the FDR at the level q. So how do we do this? Well, the most popular way is the so-called Benjamini-Hochberg procedure. Here we begin by selecting a desired limit q on the FDR, let's say 0.05. Next we rank all the p-values over all the voxels from 1 to m, where m is the total number of voxels. So we order them from smallest to largest and plot them in that order. Then we let r be the largest i in this ranking such that p(i) is less than or equal to (i/m) × q. Here (i/m) × q is just a straight line that goes from q/m at i = 1 up to q at i = m.
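The ranking-and-threshold steps above can be sketched in a few lines of NumPy; the function name and the example p-values are illustrative, not part of the lecture.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Sketch of the Benjamini-Hochberg procedure: reject every
    hypothesis whose p-value is among p(1), ..., p(r), where r is
    the largest i with p(i) <= (i/m) * q."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)                 # rank p-values smallest to largest
    sorted_p = pvals[order]
    line = (np.arange(1, m + 1) / m) * q      # the straight line (i/m) * q
    below = sorted_p <= line
    reject = np.zeros(m, dtype=bool)
    if below.any():
        r = np.nonzero(below)[0].max() + 1    # largest i with p(i) <= (i/m) * q
        reject[order[:r]] = True              # reject hypotheses with p(1), ..., p(r)
    return reject

# With q = 0.05 and m = 4, the line is 0.0125, 0.025, 0.0375, 0.05,
# so the three smallest p-values fall below it and are rejected.
print(benjamini_hochberg([0.001, 0.01, 0.02, 0.5]))  # [ True  True  True False]
```

Note that every p-value up to p(r) is rejected, even if some individual p(i) with i < r sits above the line; r is defined by the largest crossing, not the first.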
Now, in this case, we get a straight line, and that's the black line we see in this little cartoon. Anything below that black line is deemed active and anything above is not active. So the active voxels are found by rejecting all the hypotheses whose corresponding p-values are p(1) through p(r). If all of the null hypotheses are true, then the FDR is equivalent to the familywise error rate, so any procedure that controls the familywise error rate will also control the FDR. A procedure that controls only the FDR can therefore only be less stringent, and this leads to a gain in power. And since FDR-controlling procedures work only on the p-values and not on the actual test statistics, they can be applied to any valid statistical test. In the last couple of years, FDR-controlling procedures have become increasingly popular, as they are less conservative than familywise error rate-controlling procedures. So they're getting a lot of use in the neuroimaging context and in other big-data contexts as well. In the next module, we'll talk a little bit about pitfalls with multiple comparisons. Okay, see you then. Bye.
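Because both procedures act only on p-values, we can compare them directly on the same set of tests. The p-values below are made up for illustration; with q = 0.05, Bonferroni uses the fixed cutoff q/m for every test, while Benjamini-Hochberg rejects everything up to p(r):

```python
import numpy as np

pvals = np.array([0.001, 0.01, 0.02, 0.5])     # illustrative p-values
m, q = len(pvals), 0.05

# Bonferroni: one fixed threshold q/m = 0.0125 for all tests
bonferroni = pvals <= q / m

# Benjamini-Hochberg: largest i with p(i) <= (i/m) * q, then reject all p <= p(r)
sorted_p = np.sort(pvals)
passed = sorted_p <= (np.arange(1, m + 1) / m) * q
r = passed.nonzero()[0].max() + 1 if passed.any() else 0
bh = pvals <= (sorted_p[r - 1] if r else -1.0)

print(bonferroni.sum(), bh.sum())              # 2 3  -- BH rejects more tests
```

Here BH rejects three tests where Bonferroni rejects only two, which is the gain in power the lecture describes; BH can never reject fewer tests than Bonferroni at the same level.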