[MUSIC] Well, today I want to talk a little bit about the role of electronic medical records in the delivery of personalized medicine information, and I want to start that discussion by showing you this cartoon. One of my favorite cartoons was published in the New Yorker in the year 2000, along with the announcement that the first draft of a human genome sequence had been completed. The woman hands the pharmacist her sequence and says, here's my sequence. Now you do something good with it, pick the right drug for me. And the things that I like about this cartoon are, first of all, the pharmacist doesn't look like he knows what to do with that information, and many of us in this personalized medicine space sometimes feel the same way. The second is that she's not handing it to her doctor, she's handing it to another part of the health care team, and I think it's going to take teamwork to use this kind of information to better people's health care. And the third thing that's interesting, and that I'm pretty sure is wrong, is that she has it on a piece of paper. Whatever way we're going to deliver sequence information to the healthcare system to allow the system to use it, to presumably improve our care, it is not going to be on a piece of paper. That seems pretty clear to me. So this is the way the medical record room looked at Vanderbilt University Medical Center in the early 1990s, and anyone who participated in the care of patients in that era or before is familiar with the large manila charts and manila folders, and with how often you couldn't find those charts in order to take care of a patient. Or you had a chart that was 14 volumes, which is probably two or three feet thick, and you had to find some piece of information in that chart somewhere.
So there are many things that having electronic medical records can enable, and I want to give you some examples of those. First of all, electronic medical records can help in the care of individual patients. Before there were electronic medical records, it was possible to have a call from the emergency room saying, I have a patient here who was discharged from your hospital 12 hours ago. They're semi-comatose, we don't know what the diagnosis is, we don't know anything about the patient, and we don't know what medications they take. And trying to find the chart of a patient like that would be next to impossible at 3 o'clock in the morning. It was probably locked up in some resident's desk waiting to have the discharge summary dictated. So having immediate access to that kind of information can really help improve the care of individual patients. It can also help the system. In 2004, the drug Vioxx, a very widely used painkiller, particularly in rheumatoid arthritis, was withdrawn from the market. And one important question is, which of our patients are taking Vioxx? Now, imagine trying to deal with a healthcare issue like that in a paper environment. In an electronic environment, you can ask the system and then develop ways to contact those patients, or ways in which the physicians' offices can contact those patients. A physician says, well, I'm starting the patient on one drug. Do they take any other drugs whose dose I should adjust? That's the question of drug interactions. Everybody's familiar with that idea, but the number of drug interactions is huge. Some of them are very, very important, and some of them are less important. So the question is, every time you add a drug to a complicated regimen, are you running the risk of a drug interaction? Do you need to know about that? Do you need to be alerted? On the other hand, I will say now that there are certain things electronic medical records don't do well.
And if you alert physicians 20 times a day, every time they change medications, for drug interactions that are probably trivial, they start to ignore everything. And if they start to ignore everything, they'll ignore the important stuff. Of course, electronic records have the problem that somebody has to do the input. There's a lot of pushback in the healthcare community around the need to develop those records. But I think once the growing pains are over, the healthcare system can be efficient enough that you don't have to slow down to get the information into the record, and you actually improve care. So a researcher will come to me and will say, I'm interested in studying variability in vitamin D levels. How many of our patients have vitamin D levels across this entire system? That's a question that you could never even ask in a paper environment. Now you can ask, and you can also ask, as I'll show in a minute, do any of them have DNA samples that are stored somewhere? That would be an important question for a researcher. And then a physician might say, well, a patient tells me he gets high every time he gets codeine for dental pain. I wonder why that is? I hope many of you will have guessed that that person is probably an ultrarapid metabolizer for CYP2D6, and they biotransform codeine into morphine very, very rapidly. Now it would be nice to know that information in a patient before they get codeine. You can do it by history, or you can do it by having the genetic information embedded in their electronic medical record against the day that they might receive a CYP2D6 substrate. And in the next module, I'll talk about initial efforts at our center and other places to actually execute that kind of vision. So at Vanderbilt, we have created large resources for using electronic medical records as a tool for research, for discovery, and that's what I want to talk about here.
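One way to picture genetic information sitting in the record "against the day" a patient receives a CYP2D6 substrate is a simple lookup at the moment a drug is ordered. The sketch below is only an illustration under stated assumptions, not a description of how any particular center does it: the `PatientRecord` class, the `check_order` function, the two lookup tables, and the phenotype label string are all hypothetical names, and only the codeine and CYP2D6 pairing comes from the lecture.

```python
from dataclasses import dataclass, field

# Hypothetical table: drug name -> gene whose metabolizer status matters.
# Only the codeine/CYP2D6 pair is taken from the lecture.
GENE_FOR_DRUG = {"codeine": "CYP2D6"}

# Hypothetical phenotype labels that should raise an alert for each gene.
ALERT_PHENOTYPES = {"CYP2D6": {"ultrarapid metabolizer"}}


@dataclass
class PatientRecord:
    patient_id: str
    # gene -> phenotype label, stored ahead of any prescription
    pgx_phenotypes: dict = field(default_factory=dict)


def check_order(record, drug):
    """Return an alert string if stored genetic data flags this drug, else None."""
    gene = GENE_FOR_DRUG.get(drug.lower())
    if gene is None:
        return None  # no pharmacogenetic rule for this drug
    phenotype = record.pgx_phenotypes.get(gene)
    if phenotype in ALERT_PHENOTYPES.get(gene, set()):
        return f"{drug}: patient is a {gene} {phenotype}; review before prescribing"
    return None  # no genotype on file, or phenotype not flagged


flagged = PatientRecord("p1", {"CYP2D6": "ultrarapid metabolizer"})
untested = PatientRecord("p2")
print(check_order(flagged, "codeine"))
print(check_order(untested, "codeine"))
```

Keeping the rule table this narrow is deliberate: alerting only on pairs that are known to matter is one way to avoid the alert fatigue described above.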
And in the next module, I'll talk about how we use that to also deliver information that we hope will improve healthcare. So we have created a de-identified image of our entire electronic medical record, and that is over 2.3 million subjects. Because it's de-identified, it's relatively easy to use for accruing large sets of patients and asking interesting questions about what they're like, what they have, and how their diseases or their conditions progressed. And notice I haven't used the word genetic yet, because the 2.3 million people don't have DNA samples. But you can still use that information to predict who, after admission to the hospital, is likely to get a bed sore. After admission to hospital, which patient is likely to be discharged and readmitted within 30 days? And if you can use information from examining large numbers of subjects to develop predictive tools like that, you can start to personalize care on the level of preventing complications in the hospital, on the level of preventing readmissions, on the level of individualizing or personalizing care in ways that have nothing to do with genetics. At Vanderbilt, we've created a large data bank that we call BioVU, which includes around 185,000 subjects who have DNA samples coupled to these de-identified electronic medical records. And we're in the process of doing a lot of genetic work on those, at the GWAS level and other kinds of levels, to use that information to do discovery: discovery of variants that drive variability in drug response, and variants that drive variability in susceptibility to disease. So I alluded to this idea before, that an investigator might have this question around vitamin D. And in fact one of our investigators had exactly that question. So we've created an interface at Vanderbilt that allows an investigator to ask that question and get an answer within 60 seconds. So they go to a web site, the web site looks like this, and it's a lot of dragging and dropping.
But they basically drag and drop the vitamin D levels and ask, how many patients do we have with vitamin D levels? That is something greater than 17,000 in our electronic medical record. And then they'll ask how many patients have genetic information, and it's something around 10,000. And then the intersection of those sets, right now, is around 1,795 subjects. So that's something that happens on a high-speed search engine, and it happens instantaneously. So an investigator can find out how many samples there are, and whether a study is feasible. One of the other things that we have done at Vanderbilt is we've developed a new kind of technology to interrogate the relationship between human diseases and genetic variation. The top of this slide shows the typical GWAS approach, or a typical approach across genetics: identify a phenotype. The phenotype could be high or low cholesterol. The phenotype could be breast cancer in your family. The phenotype could be macular degeneration. The phenotype could be red hair. And then you do a genome-wide scan across 500,000 or a million common SNPs in the genome, and you come up with signals using the Manhattan plot that's familiar to many of you. That's the GWAS, the genome-wide association study. We've created the phenome-wide association study, which we call PheWAS. And the PheWAS turns that question on its head. It says, here is a gene or here is a genetic variant that seems to be of biological interest, but I don't know exactly what human phenotype it associates with. So I take the particular genetic variant and I do exactly the same search strategy as with the GWAS. I take the genetic variant and I say, you either have the reference allele or you have a variant allele. And then for every diagnosis in the electronic medical record I say, you have the diagnosis or you don't have the diagnosis.
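A minimal sketch of that comparison, one diagnosis at a time, might look like the following. It is an illustration, not the published PheWAS implementation: the lecture does not say which statistical test is used, so Fisher's exact test is an assumption here, and the function names, patient IDs, and diagnosis labels (`dx_A`, `dx_B`) are all made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def phewas(carriers, diagnoses, all_patients):
    """Test one variant against every diagnosis; return results sorted by P value.

    carriers: set of patient IDs with the variant allele
    diagnoses: dict of diagnosis label -> set of patient IDs with that diagnosis
    """
    non_carriers = all_patients - carriers
    results = []
    for label, affected in diagnoses.items():
        a = len(carriers & affected)        # variant, diagnosis
        b = len(carriers - affected)        # variant, no diagnosis
        c = len(non_carriers & affected)    # reference, diagnosis
        d = len(non_carriers - affected)    # reference, no diagnosis
        results.append((label, (a, b, c, d), fisher_exact_two_sided(a, b, c, d)))
    return sorted(results, key=lambda r: r[2])


# Toy data: 20 patients, 8 of whom carry the variant allele.
all_patients = set(range(1, 21))
carriers = set(range(1, 9))
diagnoses = {"dx_A": {1, 2, 3, 4, 5, 6}, "dx_B": {2, 10, 12, 15}}

ranked = phewas(carriers, diagnoses, all_patients)
for label, table, p in ranked:
    print(label, table, f"P = {p:.4g}")
```

In this toy data `dx_A` is concentrated among carriers and `dx_B` is not, so `dx_A` sorts to the top. A real analysis would run over many thousands of patients and diagnoses and would need to account for multiple testing.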
So you create a two-by-two table, you attach a P value to that two-by-two table, and you display the data just like a Manhattan plot. And you can see on that particular PheWAS that I'm showing you here, there are some dots that seem to rise up, and I'll show you those on a subsequent slide in a moment. The idea is, to do this experiment you have to have many, many patients with many, many diagnoses. And the electronic medical record lends itself to that very, very nicely. So we actually developed the technology, and then searched across the GWAS catalog that is maintained at the NIH, which lists all the publications that have had a GWAS result since the beginning of that technology in the mid 2000s. So there are around 1,700 publications, with around 12,000 SNPs. Now, some of those are phenotypes that are not represented in the electronic record. Are you able to detect a funny smell in your urine after eating asparagus? Well, that's not something we record in the electronic records, so that's not something we can address. Hair color, baldness, those are the kinds of things that we can't address. But we can address things like susceptibility to diabetes and other diseases, and variable drug responses. Each one of these dots represents a particular GWAS result that has been published. And the position of the dot is the position on the chromosome with which the dot is associated. So we took the ones we could test and just asked the question, can you replicate those using PheWAS? And around two-thirds of the associations were replicated in the electronic medical record. And that's all published at a website whose URL is shown there. What was really interesting is this result, and that is the pleiotropy result. We take a genetic variant, in this particular case in a gene called IRF4, and that IRF4 variant is associated with hair and eye color.
But when we do an interrogation across the phenome we detect very strong signals, with P values of 10 to the -10th, 10 to the -12th, 10 to the -14th, incontrovertibly strong, with a variety of skin cancers and actinic keratosis. So that's a new association that is detected by PheWAS, and we're very excited by the idea that you can use this to start to dissect the genetic architecture of many complicated diseases. The PheWAS data set was actually done at Vanderbilt and a number of other collaborating institutions, as part of something called the Electronic Medical Records and Genomics Network, or eMERGE, sponsored by the National Human Genome Research Institute, one of the 27 institutes at the NIH. And there were five centers, shown on this slide as part of eMERGE-1, that participated in that PheWAS study. Now what eMERGE-1 did was actually just ask whether DNA collections coupled to electronic records can be used for discovery in genome science: discovering new associations using GWAS or PheWAS or whatever technology we can come up with. The carrot for us was to identify a phenotype of interest in around 3,000 subjects and do a genome-wide study at our center. It turned out that every center had its own GWAS, and they were all too small to come up with a reasonable signal. But when we aggregated them across the network, we came up with reasonable signals. We had other questions in eMERGE having to do with data privacy. Assessing consent mechanisms: how do you get people to consent, and what are they consenting for, what are they not consenting for? And then we wanted to develop methods to actually find cases and controls for target phenotypes like drug response or disease progression in electronic medical records. This is one additional example of work from eMERGE. This is a phenotype that we identified that we wanted to study across the network. It was hypothyroidism.
So we identified around 1,000 cases and around 5,000 controls using an electronic phenotyping algorithm that was developed at one site and validated at others. And you can look at the positive predictive values and the negative predictive values for the case and control definitions on this table. And you can see that the algorithms perform quite well across multiple sites with multiple electronic medical records and electronic medical record architectures. And that's a very important result, because it says that you can use the algorithms developed at one site with one EMR to find cases and controls at other sites, perhaps with a different EMR system. All the electronic algorithms are posted at a website called PheKB, whose title is shown there. The National Human Genome Research Institute is very interested in the idea of genomic medicine and personalizing medicine. They have a number of efforts besides eMERGE. They're shown here. The NSIGHT Network is looking at neonatal sequencing. The Undiagnosed Diseases Network is looking at rare cases and using exome sequencing to come up with answers. The Clinical Sequencing Exploratory Research consortium, CSER, is looking at using sequencing in targeted diseases like cancer or eye diseases, and there are other initiatives that are shown here. The overall questions that are being asked in these, and many other initiatives around the country and in fact around the world, are: which patients do we want to sequence, which do we want to apply new technologies to? What are we trying to accomplish when we do the sequencing? Are we trying to find susceptibility to cancer, or are we trying to find causes of disease? Are we doing sequencing in healthy people, and how do we deal with that information? How do you analyze genomes? How do you get these data sets across many, many ancestries, because I've alluded to the importance of ancestry before.
How do you actually figure out how to get that information, once you're sure that it's important, into electronic medical records so it can be used in the future? And how do you analyze the results of all those efforts in terms of healthcare outcomes, and in terms of how much it costs? Does it ultimately save the system money by creating a system in which there are fewer drug side effects, better targeted therapies, and better focus of healthcare resources on people who are at high risk for diseases? Now, we're doing that across the eMERGE network and across other networks in the United States. But there are many other efforts around the United States, and around the world, that are accumulating large numbers of subjects with DNA samples coupled to dense phenotypic measures. Sometimes that's the electronic medical record, sometimes that's large cohort studies, where there's dense systematic phenotyping. The UK Biobank is one example of the latter approach. And these are some of the examples around the world. And you can see that there are literally millions of patients whose DNA samples and dense phenotypes have been coupled, and are available for these kinds of studies. And I call this the paradox of personalized medicine. So, in order to treat 1 patient in 100 differently from the average, you have to have evidence, you have to have data. Where's that data going to come from? It's not going to come from studying 100 patients and finding the 1 outlier. That's an anecdote. It's going to come from studying 100,000 patients and finding the 1,000 that are outliers. Or it's going to come from the million, and finding the 10,000 that are outliers or the 1,000 that are outliers. So in order to be able to develop the evidence to personalize medicine, we actually have to start with very, very large numbers, very large denominators from which to draw these smaller subset numerators.
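The arithmetic behind that paradox can be written out in a few lines. This is just a worked restatement of the numbers above; the function names `expected_outliers` and `cohort_needed` are made up for the illustration, and the target of 1,000 outliers in the last line is an arbitrary example, not a threshold from the lecture.

```python
def expected_outliers(cohort_size, one_in):
    """Expected number of outliers when 1 patient in `one_in` differs from the average."""
    return cohort_size // one_in


def cohort_needed(target_outliers, one_in):
    """Cohort size needed to expect `target_outliers` patients at a 1-in-`one_in` rate."""
    return target_outliers * one_in


# The cases from the lecture: 1 in 100 and, for the last row, 1 in 1,000.
for cohort_size, one_in in [(100, 100), (100_000, 100), (1_000_000, 100), (1_000_000, 1_000)]:
    n = expected_outliers(cohort_size, one_in)
    print(f"{cohort_size:>9,} patients, 1 in {one_in:,}: {n:,} expected outliers")

# Turning the question around: how large a denominator for 1,000 outliers at 1 in 1,000?
print(f"{cohort_needed(1_000, 1_000):,} patients needed")
```

The rarer the outlier, the faster the required denominator grows, which is the point of starting from very large cohorts.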
And the kinds of things that we and many other people are thinking about in terms of using these resources is to study rare genotypes. So, the SLC30A8 or the PCSK9 variants that I've talked about in previous modules, how many people across the world have those? Are there other examples like that? In order to find the rare genotypes and convince yourself that those are actually real, you have to have very large denominators. Studying patients with extreme phenotypes, rare adverse drug reactions, is at the top of everybody's list, but there's the flip side. There are patients who get a single drug for a generally fatal condition and they're cured. What's that all about? We call those elite responders. It's a term that the cancer doctors like to use, but that happens in other areas of medicine. We would like to know what identifies those patients. Is there something specific about them, or their disease, that makes them elite responders? And then we're very excited about the idea of using phenome-wide scanning, but what we need is not a data set of 10,000 or 20,000, but a data set of 100,000 or 200,000, in which to start to really explore the relationships between genotype and phenotype across rare and uncommon genetic variants. Now, at the end, how is this going to work? We're using the healthcare system for discovery, and in order to use the healthcare system for discovery, we have to have a couple of infrastructure requirements. One is, there has to be basic science behind all this. What do these genetic variants do? What do they mean? Who's going to discover the mechanism whereby a genetic variant affects the human phenotype? That is a really, really important question. You also have to have an informatics engine that actually maintains the electronic medical record and maintains the ability to mine and interpret the data. Those are the requirements. And then you can use the healthcare system for discovery in the ways in which I've started to outline and in many other ways.
But once you use the healthcare system for discovery, what you have to do is implement that discovery. And as you implement that discovery, you change the way people's healthcare is delivered. You start to accumulate new information on what happens to patients. You start to make new discoveries, and the whole thing turns into a positive feedback loop. The more we implement, the more genetic data there are. The more information we have, the more new discoveries we'll make, and the more we'll be able to implement. So that's called a learning healthcare system. A healthcare system that sees, for example, what kinds of patients are at risk for bed sores after they're admitted to the hospital, that takes measures to prevent those bed sores, or pays special attention to those patients, and that in turn reduces the number of bed sores. So the healthcare system teaches us what to do, and in that way we learn, and we have better outcomes. I use the bed sore example because it specifically has nothing to do with genetics, but has everything to do with personalizing. But you can use exactly the same kind of logic that includes genetic and other markers. So that's the learning healthcare system, and that's the ultimate goal of the electronic medical records system, as a tool for discovery and implementation in personalized medicine. [SOUND] [SOUND] >> [APPLAUSE]