Welcome back. In this module, we're going to continue the discussion of statistical efficiency and how to optimize designs. We talked about eight principles of fMRI design, including all of the factors you see here, and now we're going to unpack each of these principles one by one and explain it in a little more depth.

The first principle is sample size. The number of subjects is usually the rate-limiting factor, so larger sample sizes increase power, often dramatically, especially when samples are small, up to about 30 or 40 participants.

The second principle is scan time. More time on task is helpful. That can mean scanning for longer within a person, but it can also mean maximizing the time during which you're driving up the signal of interest for each task type, or each regressor of interest. You also have to balance this against psychological effects of fatigue, habituation, head movement, and other factors. So, in terms of within-person scan time, maximize the time on task.

What we'll see now is a simple simulation in which we're generating event-related designs with evenly spaced events, going from events that are very close together to events that are very far apart. So now the events are very close together, and now they're getting farther and farther apart: once every 20 seconds, and then once every 30 seconds. As you'll see, as the design gets sparser we gain efficiency, and there is an optimal level. Now we'll do this again with longer-duration blocks, five-second blocks, and then again with ten-second blocks, so ten seconds on and ten seconds off. As you'll see, the efficiency can reach an even higher level.

If we take a closer look at these designs, what we see at the bottom is very sparse events, one second on and 15 seconds off, and that's very low efficiency, because essentially we're not driving the signal up very often or very high. With five-second blocks, the efficiency is better, because we're driving the signal up higher and it's staying at that peak for longer. And with ten seconds on and ten seconds off, we have the best efficiency in this simulation. In general, equal amounts of time on task and rest are optimal for a task-versus-rest comparison: we want equal amounts of data sampled at the highs and the lows of whatever we're comparing. So in this case, ten on and ten off is the better strategy.

Now, we also talked before about efficiency in a multi-level setting. In a group analysis, efficiency and power depend on the within-person variance and the between-person variance. The within-person standard errors make up the first part of the equation you see here, plus the variance due to true individual differences across people, which is the between-person variance, all divided by the sample size. So how do we increase power? Increasing within-person efficiency helps up to a point, and increasing the sample size always helps. The greater the between-person variance, the more the sample size matters relative to how long you scan, which only affects the within-person variance. As we said before, even if efficiency at the first level were infinite, power at the group level would still be limited by the between-subjects variance divided by the sample size.

So how do we balance scan time and participants? That depends on the ratio of between- to within-person variance, which you'd have to estimate for any given task and any given brain area or areas.
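To make the spacing simulation concrete, here's a minimal sketch of that kind of efficiency calculation in Python. It assumes a simplified double-gamma HRF, a one-second sampling grid, and an eight-minute run, and it defines efficiency as 1 / (c (X'X)^-1 c') for a task-versus-rest contrast; these are illustrative choices, not necessarily the exact parameters behind the movie shown in the lecture.

```python
import numpy as np
from scipy.stats import gamma

def hrf(t):
    """Simplified double-gamma hemodynamic response function."""
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

def efficiency(on_sec, off_sec, total_sec=480, dt=1.0):
    """Efficiency of a task-versus-rest regressor for a boxcar design
    with on_sec seconds on and off_sec seconds off."""
    t = np.arange(0, total_sec, dt)
    box = ((t % (on_sec + off_sec)) < on_sec).astype(float)    # boxcar stimulus
    x = np.convolve(box, hrf(np.arange(0, 32, dt)))[:len(t)]   # predicted BOLD signal
    X = np.column_stack([x, np.ones_like(t)])                  # task regressor + intercept
    c = np.array([1.0, 0.0])                                   # task-versus-rest contrast
    return 1.0 / (c @ np.linalg.inv(X.T @ X) @ c)

# Sparse events (1 s on, 15 s off), 5-second blocks, and 10 s on / 10 s off
for on, off in [(1, 15), (5, 5), (10, 10)]:
    print(f"{on:2d} s on / {off:2d} s off: efficiency = {efficiency(on, off):.2f}")
```

Running this reproduces the qualitative ordering described above: one second on and 15 seconds off is the least efficient, five-second blocks do better, and ten seconds on, ten seconds off does best.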
As an empirical example of balancing scan time and participants, here's a study that we completed recently. It's an N-back versus rest comparison, and we calculated power in these voxels here, which are N-back related. What we're looking at is the optimal allocation of 40 scan hours. If I have just 40 hours to spend, I can scan 40 people for one hour each, minus setup time, so about 40 minutes each; or I can scan fewer people for longer, with more sessions each; or balance it out somewhere in between. The pie chart shows the breakdown of the within- and between-person variance that we estimated from the data, and given that, we get an answer: the optimal allocation is about 40 subjects at about 40 minutes each. We also see three thresholds here. The top curve is p < 0.05 uncorrected, which is very liberal; the middle curve is p < 0.001; and the third curve is the familywise error rate corrected threshold. What we also see is that with whole-brain familywise error rate correction, we still have only about 10% power with 40 subjects, given these effect sizes. So this, and the other power calculations we discussed earlier, can help you decide whether the subjects you can run are really enough.

Let's look at the next principle: the number of conditions that you test. For detection power, fewer conditions is better. Optimally, you want two conditions: task versus control, A versus B. For interpretability and inference, we might want to include more conditions, more kinds of control or comparison conditions. That's going to decrease our efficiency, and it will only help with interpretability. So I like to say: if you're doing your first study and you don't know what to expect, do a block design with two event types or two block types. If you already know you're getting good signal, because you've piloted the paradigm before, and you want to maximize interpretability, then it's fine to include more types of events for inferential purposes. But you're going to lose power.

Now let's talk about grouping of conditions, how you group the events. For detection power and robustness, block designs are the most efficient, given that the task is psychologically amenable to creating a block design with strong neural responses. For more specific inferences linked to particular events at particular times, event-related designs are better. Let's look at three types of designs. On the top is a block design; in the middle, a dense event-related design with lots of stimuli packed in; and on the bottom, a sparse event-related design. These are the efficiencies for the three types of designs. As you can see, the sparse event-related design has very low efficiency, the dense event-related design has higher efficiency, and the block design has the highest efficiency of all.

This leads us to a fundamental tradeoff between detecting activity in task A versus B, which we'll call contrast detection, and estimating the shape of the HRF response, which is what allows us to link activity to particular events at particular times. As a rule, blocks of the same trials provide greater power to detect differences among conditions, because we're building up giant mounds of activity by presenting lots of repeated trials. That also makes the design very robust to the HRF shape: it doesn't matter exactly what the shape is, because we're mushing everything together anyway.
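Before turning to the other side of the block-versus-event tradeoff, here's a rough sketch of the scan-time-versus-subjects calculation behind the 40-hour example above. It uses the group-level variance (within-person variance / scan time + between-person variance) / N and a simple normal approximation for power; the variance components, effect size, and the assumption that within-person variance shrinks in proportion to scan time are placeholders for illustration, not values estimated from the N-back data.

```python
import numpy as np
from scipy import stats

# Illustrative placeholders, NOT the values estimated in the N-back study.
sigma2_between = 1.0     # between-person variance (does not shrink with scan time)
sigma2_within_1h = 0.5   # within-person variance for one hour of usable scan time
effect = 0.4             # true group-mean effect, in the same units as the variances
total_hours = 40.0       # fixed scanning budget
alpha = 0.001            # e.g., the middle threshold in the lecture

for n in [10, 20, 40, 80]:
    hours_each = total_hours / n
    # Group-level variance: (within-person part + between-person part) / N,
    # where the within-person part shrinks roughly in proportion to scan time.
    var_mean = (sigma2_within_1h / hours_each + sigma2_between) / n
    z_crit = stats.norm.ppf(1 - alpha)
    power = 1 - stats.norm.cdf(z_crit - effect / np.sqrt(var_mean))  # normal approximation
    print(f"N = {n:3d}, {hours_each:4.1f} h each: power = {power:.2f}")
```

In this toy example the between-person variance dominates, so spreading the fixed budget over more, shorter sessions increases power, which is the same direction the real estimates pointed.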
The flip side of blocking is that we can't recover what the shape of the response actually is, and so we can't say that the activity was coming from those particular events; it might be coming from what happened in between the events, as a function of the general state one is in. Conversely, unpredictable sequences of trials, like random event-related designs or optimized designs, give us greater power to estimate the shape of the hemodynamic response. And these designs look very different: instead of big mounds of activity, they essentially orthogonalize the design with respect to itself shifted over in time.

What we'll see here is that increasing the number of conditions costs us power. This movie shows a simulation with blocks going from very short to very long, which is what you see now. First, we're looking at only two block types; that's the blue line. Now we'll add a third block type, so three block types alternating, and you can see the peak efficiency is lower. And now we'll add a fourth block type, and the efficiency for detecting differences across those block types is lower still. So every time we add conditions, we have more things to compare and less time on each task, and that's going to cost us power. The only gain is in interpretability and in the number of contrasts we can test.

This graph also lets us look at what happens with short versus very long blocks. What you can see is that the optimal block length in this simulation is about 18 to 20 seconds, and that's the guideline: use short blocks, around 18 to 20 seconds, to maximize power. With very short blocks, we have low predictor variance, because the blocks are alternating so fast that the hemodynamic response, which acts as a blurring filter, blurs them all together so everything looks the same. What about long blocks, what's wrong with those? With long blocks, the design frequencies overlap with the high-pass filter, so when we apply the high-pass filter, we remove variance from the design. Here we have a standard high-pass filter, and it cuts off a lot of the variability in the design and reduces our power dramatically. If we did not use a high-pass filter, the efficiency would level off as a function of block duration; it wouldn't matter to have longer blocks, assuming we didn't have a problem with low-frequency noise. But then we'd have a lot of noise to contend with from the scanner that we could otherwise have filtered out.
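Finally, here's a rough sketch of the block-length simulation, again assuming the simplified HRF used above, two alternating conditions, and a discrete-cosine high-pass filter with a 128-second cutoff standing in for a standard high-pass filter; the lecture's simulation may differ in its details.

```python
import numpy as np
from scipy.stats import gamma

def hrf(t):
    """Simplified double-gamma hemodynamic response function."""
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

def ab_efficiency(block_sec, total_sec=480, dt=1.0, hp_cutoff=None):
    """Efficiency of the A-minus-B contrast for alternating A/B blocks,
    optionally after removing low-frequency drift terms (high-pass filter)."""
    t = np.arange(0, total_sec, dt)
    phase = (t // block_sec) % 2                        # 0 = condition A, 1 = condition B
    signed = np.where(phase == 0, 1.0, -1.0)            # +1 during A, -1 during B
    x = np.convolve(signed, hrf(np.arange(0, 32, dt)))[:len(t)]  # predicted A-minus-B signal
    if hp_cutoff is not None:
        # Discrete-cosine drift basis up to the cutoff, roughly as in a standard high-pass filter
        n_basis = int(2 * total_sec // hp_cutoff)
        k = np.arange(1, n_basis + 1)
        dct = np.cos(np.pi * np.outer(t / total_sec, k))
        x = x - dct @ np.linalg.pinv(dct) @ x           # remove the low-frequency component
    X = np.column_stack([x, np.ones_like(t)])           # contrast regressor + intercept
    c = np.array([1.0, 0.0])
    return 1.0 / (c @ np.linalg.inv(X.T @ X) @ c)

for block in [4, 10, 20, 40, 80]:
    print(f"{block:3d} s blocks: efficiency {ab_efficiency(block):7.2f} unfiltered, "
          f"{ab_efficiency(block, hp_cutoff=128):7.2f} with 128 s high-pass")
```

With the filter applied, efficiency peaks at intermediate block lengths and falls off sharply for very long blocks, whose frequencies overlap with the filter; without it, efficiency simply levels off as the blocks get longer.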