So the problem of interpretability relates to the problem of specificity, and I'll illustrate this with this map. If you look at this map here, it's a typical brain imaging map, a group analysis from about 30 subjects. And the question is, what can you tell about that map, what does it mean in terms of psychology or function? In principle, if we're experts in the brain, we should be able to look at such maps and say something about at least what task is being done. So let's look at the map. Well, I see the anterior cingulate there. I see the insula, mid and anterior insula. I'm a pain researcher, in part, and so I can say, well, maybe this is pain, because pain activates the anterior cingulate and insula. There's the thalamus, also activated by pain. It's looking pretty good. And this is the secondary somatosensory cortex, which is another area that turns out to be pretty specifically activated by pain. So I'm doing a pretty good job of brain reading, right? There's some primary somatosensory cortex as well. And now let's look at a database of studies. This is a database that was built by Tal Yarkoni when he was in my lab a couple of years ago, and it now contains nearly 10,000 studies. Each of those blue dots is a reported coordinate from one published study. So what we can do with this is look across all of the different studies and ask, what are the top hits? If we feed in this brain map, which topics are associated with maps that look like this one? And the top hits we get out are noxious, heat, somatosensory, painful, sensation, stimulation, muscle, temperature. So this increases my confidence that this is really a pain map, right? Everybody on board? Great. The problem is, this is not a pain map. This map came from people looking at the faces of the people who rejected them. So this is a romantic rejection related map.
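To make that database lookup concrete, here is a toy sketch in Python. The studies, topic terms, and named regions below are entirely hypothetical, and the real database works with reported peak coordinates and text-mined term frequencies rather than region labels; this only illustrates the idea of ranking terms by how well their associated studies match a query map.

```python
# Toy sketch of a "top hits" lookup (hypothetical data, not the real database):
# score each topic term by the average overlap between our query map and the
# maps of studies tagged with that term.

from collections import Counter

# Hypothetical database: each study lists its topic terms and the regions
# where it reported activation.
studies = [
    {"terms": ["painful", "noxious"], "regions": {"ACC", "insula", "thalamus", "S2"}},
    {"terms": ["heat", "somatosensory"], "regions": {"S1", "S2", "insula"}},
    {"terms": ["memory"], "regions": {"hippocampus", "PFC"}},
    {"terms": ["rejection"], "regions": {"ACC", "insula", "thalamus", "S2"}},
]

def top_hits(query_regions, studies, n=3):
    """Rank terms by mean overlap between the query map and each study's map."""
    scores, counts = Counter(), Counter()
    for study in studies:
        overlap = len(query_regions & study["regions"]) / len(query_regions)
        for term in study["terms"]:
            scores[term] += overlap
            counts[term] += 1
    ranked = sorted(((scores[t] / counts[t], t) for t in scores), reverse=True)
    return ranked[:n]

# The "rejection" map's regions, fed in as a query.
query = {"ACC", "insula", "thalamus", "S2"}
print(top_hits(query, studies))
```

Note what happens: the pain terms score exactly as high as "rejection", because the two kinds of studies report the same regions. The lookup cannot distinguish them, which is the specificity problem in miniature.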
So if you looked at all this evidence, and you believed my brain reading, and you were fooled, then you're like many of us. The point is, it's very difficult to infer what somebody is doing or experiencing based on their brain map, and there are many cases where we can be confused. So this is the problem of specificity, in a nutshell. I started with the anterior cingulate and anterior insula. Those two areas are often used to infer the presence of pain or other emotions, depending on the purpose of the interpreter. But here's the base rate of activation across about 3,500 studies. The higher the value, the more likely it is that the area is activated across many tasks. And what you can see here is, no matter what kind of task people are doing across these thousands of studies, the anterior cingulate and the anterior insula are the most frequently activated areas in the brain. So just getting activation somewhere in those areas carries, arguably, the least amount of information about what somebody is actually experiencing. These areas are not really specific for pain or any other type of affect at this level of analysis. So what's happening is that when we interpret a brain image and say, the insula is active, that must be disgust, or that must be love, or something else, we're implicitly treating those brain images as a marker, a biomarker. A biomarker is an objectively measured indicator that serves as a measure of some other process, here a mental experience or process. So if a brain finding, like cingulate activity, is used as a biomarker, then activation of that pattern is assumed to imply the presence of that state, like pain, or decision conflict, or anything else. And if we look at the literature, what's happening is that people are using fMRI activity as a marker for many different processes. This happens in the popular press, as I showed you, and it also happens a lot in the scientific literature.
So we think that we have markers for reward, that's the nucleus accumbens, or for value, with the medial prefrontal cortex, or for memory, pain, etc. Here are a couple of common ones. Amygdala activity is often taken as an indicator of negative emotion. And activity in the pain-processing areas that you see here is often taken as an indicator of pain. So if a drug treatment or a psychological treatment influences those markers, I might infer that pain or emotion is influenced. But, as we just saw, this is not a valid inference. There are not yet biomarkers for any of these processes. So let's look a little more systematically at why not. Why are brain maps not biomarkers? We have a couple of problems. First, there's a problem of definition and replication. What this means is that every time you get an amygdala response, the activity can be a little bit different. The voxels could be different, and the relative levels of activity across the voxels could be different. The amygdala is a very small structure, but it actually contains hundreds and hundreds of standard-sized voxels. So there's tremendous flexibility from study to study in picking out results that seem to support the hypothesis. What we need is exact replication, not only of which voxels are active but of the relative magnitudes of activity across those voxels. We'll deal with that later when we talk about machine learning and multivariate pattern analysis. But for now, the current state of affairs reflects a lack of exact replication at the spatial pattern level, at the voxel level, and that causes a lot of problems. Number two is, most of the results that we get from studies are group maps. We don't apply them to individual cases, individual people, and they're not validated for that application. Before we can really start to use imaging to say something about a person, what a person is thinking, or feeling, or experiencing, we need to validate these maps at the level of the individual person.
We need to apply them to individuals. Three is the problem of diagnostic value. This relates to the specificity problem I talked about a moment ago, but you can think of diagnostic value, going back to the earlier lecture, in terms of Bayes' Rule. What we'd like to know is the probability of experiencing a psychological event, like pain, given the presence of a particular brain marker, like anterior cingulate activity. So that's the probability of psychology given brain, and that's called the positive predictive value of the brain image as a test. This breaks down into two related problems. One is a problem of sensitivity. We need to know how big the effects of our manipulations are and whether they are reliable. This relates to the probability of observing that brain marker, anterior cingulate activity, given that I'm in pain. If that effect is really strong, I have high sensitivity. But we don't typically quantify how strong those effects actually are. Secondly, there's the problem of specificity that I mentioned a moment ago. This relates to whether the observed patterns are actually specific enough to be used as biomarkers, and what they're specific for, what class of events. In Bayesian terms, this relates to the probability of observing that brain marker in the absence of the psychological event, like in the absence of pain. And as we saw from the earlier plot of the anterior cingulate and anterior insula, that specificity is extremely low, because the probability of activation in the absence of pain is just about as great as the probability in the presence of pain. So finally, we'll talk about the translation crisis. All of these previous problems feed into the problem of translation, because to have practical applications, we need things that are replicable, things that work, things that can be applied to individual cases and say something meaningful about the psychological or clinical status of a person.
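The Bayes' Rule argument above can be made concrete in a few lines of Python. The sensitivity, false-positive rate, and base rate below are hypothetical numbers chosen only to illustrate the logic, not estimates from the literature.

```python
# Positive predictive value via Bayes' Rule:
#   P(pain | marker) = P(marker | pain) P(pain) /
#                      [P(marker | pain) P(pain) + P(marker | no pain) P(no pain)]

def positive_predictive_value(sensitivity, p_marker_given_no_pain, base_rate):
    """P(pain | marker active), given sensitivity, false-positive rate, base rate."""
    p_true = sensitivity * base_rate                       # marker fires, pain present
    p_false = p_marker_given_no_pain * (1.0 - base_rate)   # marker fires, no pain
    return p_true / (p_true + p_false)

# Hypothetical numbers: the ACC activates in 90% of pain studies (high
# sensitivity), but also in 80% of non-pain tasks (low specificity), and
# suppose 10% of tasks in our database involve pain.
ppv = positive_predictive_value(0.9, 0.8, 0.10)
print(f"P(pain | ACC active) = {ppv:.2f}")
```

With these numbers the result is about 0.11: even at 90% sensitivity, seeing anterior cingulate activity barely moves us above the 10% base rate, because the region activates nearly as often without pain. Drop the false-positive rate to 0.1 and the same observation would push the probability to 0.5, which is why specificity, not sensitivity, is the bottleneck here.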
And so that's why this is coming last: we have to solve many of these other problems before we can really address the translation crisis. This is the article that influenced me quite a bit, by Kapur, Phillips, and Insel. The title is, "Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it?" And there's this feeling that in science there's lots of research; in fact, here is a plot of emotion studies that we gathered, 163 studies of emotion. Yellow points are from studies of perceived emotion, red points from experienced emotion. So if you want to know where the emotional brain is, there it is: it's everywhere in the brain. Of course, these are actually not random activations, but it's very difficult to sort through exactly which patterns of activation are related to which outcomes. So we have an accumulation of findings, but we don't have markers that we can apply to a person and say, this is how sad the person is, or whether the person is angry or not, or whether the person is in pain or not, and so forth. Some of the causes of this translation crisis are the lack of brain patterns sensitive and specific to clinical features or particular clinical outcomes; the lack of application at the individual-person level; and finally, when we develop these brain maps, we don't usually share them across sites and test how they work in new studies, new samples, new populations. That's another critical piece. Instead of treating publication as the end goal, we have to take these brain maps and treat publication as the starting point for developing, testing, and refining these measures for increasing clinical and translational utility, if they work. So there have been many positive responses, and I think that partly this series of debates has raised awareness about the issues and has led to a lot of positive responses from the community. I'll just highlight a few of them here at the end of this module.
So, first of all, the broad criticism has some good elements to it, some positive features. It improves awareness, and it promotes changes in practice that are being more widely implemented and talked about. But there are also some negative implications. There's a widespread criticism of science that really goes beyond the real problems and problematic studies and gets applied very broadly. And this is a very dangerous thing, because the truth is that it's very difficult, even in the best case, to get it right. It takes a lot of resources, and thinking, and expertise. We have to build on the successful cases and test them and refine them. And there are always going to be cases where things seem very promising and they don't pan out. But this doesn't mean that the scientific process is not working; we just need to make it work as well as we can. Let's look again at some of the issues of selection bias and some of the solutions to them. First, the file drawer problem. One solution that's been adopted is national registries and pre-registered trials, as well as outlets for publication of null findings. Flexibility in the conduct and publication of experiments has also been addressed, in part, by study pre-registration, and in some cases by data embargoes, which are standard in clinical trials: you can't look at the data until after you've collected a certain number of cases, and you have to stop when you said you were going to stop collecting. Also, prospective data sharing is becoming increasingly popular in neuroimaging, and that allows us to share data as it comes in and evaluate whether a particular experiment is panning out or not. Flexibility at the model level is handled, in part, by standardized pipelines that are applied consistently across studies. That doesn't mean it should be one size fits all, but having a standard pipeline is really helpful for avoiding flexibility.
That means making principled choices ahead of time and sticking to those choices, instead of looking backwards based on your results and trying to change your pipeline to get better results. Study pre-registration and blinding of experimenters to the research hypothesis are also helpful in that way. And finally, the voxel selection bias problem can be ameliorated by true a priori hypotheses, which can be derived from meta-analyses of the literature, for example, and by attempts at exact replication and the sharing of maps and findings in electronic form, so they can be tested precisely across laboratories. There are a number of positive responses by the community more broadly. There are homes and funding for replications and null results, like the PsychFileDrawer project, the Center for Open Science, and the Reproducibility Project. There's a new focus in many journals on replicability and Registered Reports: some of them are publishing null findings more often, and some of them are accepting Registered Reports. And there are open, online platforms for conducting replications and research, like the psiTurk platform from Todd Gureckis and colleagues. In the neuroimaging community, positive responses include collaborative efforts and consortium data sharing efforts like the OpenfMRI project, ADNI for Alzheimer's disease, the 1,000 Functional Connectomes project, the ABIDE project for autism, ADHD-200, the Human Connectome Project, and others that are coming down the pipeline. There's also a broad criticism of heuristic reverse inference, that is, making claims based on simply reading a map with your eyes, without formal assessment of positive predictive value, and there are increased efforts at more formalized, quantitative reverse inference. These efforts include the brain decoding, or machine learning, literature that we'll talk about later in the course, and other approaches as well.
And finally, an important thing is these efforts to develop and share biomarkers and patterns of activity across laboratories, which is critical for evaluating how they hold up in different populations. So that's the end of this module. Thank you.