This is Maynard Olson here, offering, at Professor Wei's invitation, a few comments to students in the Peking University Bioinformatics MOOC about the history of bioinformatics, particularly in connection with the Human Genome Project. As perhaps you understand, genomics and information technology grew up together, with genomics piggybacking on the development of personal computers, powerful desktop workstations, database technology, networking, and the Internet. It will be hard for many MOOC students to appreciate that, with rare exceptions, such as protein crystallographers, experimental biologists barely used computers thirty years ago. When I was an assistant professor of genetics at Washington University in St. Louis in 1980, I made a deal with a crystallographer to gain access to his minicomputer, which was a single-processor machine and had less computing power than an iPhone. I ran a wire from my lab through a whole series of utility ducts over to his building and connected a dumb terminal to this computer. As I set it up, everyone in my department stopped by to look at it, and asked me what I planned to do with it. There were no other computers in the genetics department at that time, not even a word processor in the office. A few months later, I bought a second terminal, causing even more amazement. My goal was to develop a detailed map of the restriction sites across the 15-million-base-pair yeast genome. Sequencing was still out of the question, but we recognized that detailed physical mapping would be the central first step. Twenty years later, I described our method, which was essentially the same as the one ultimately applied in the Human Genome Project, in the 2001 Nature issue devoted to publication of the rough draft of the human reference sequence. You can read that description of the mapping technique in this Nature issue.
We've had an e-mail recently from a student asking me to explain how the map in the figure was built from the schematic data that was shown. I'm confident that you MOOC students can build this map on your own. You'll also be able to appreciate the computational complexity of building such maps on a huge scale. For yeast, we needed roughly ten thousand pieces of data; for the human, that number ran into the millions. Computational biology is full of problems such as building maps from this type of data. One generalization I would make about them is that, if you get involved in such projects, be sure to take into account the inevitable errors in the data. Computer scientists tend to over-abstract problems of this type and idealize them, whereas in the real world of experimental biology, inadequacies of the data, which cause logical inconsistencies in the final composite maps or sequences that are being produced, are the rule, not the exception. The purpose of our physical mapping was to guide the long-range assembly of the genome sequence. Although by 2001 it was clear that whole-genome assembly was adequate for capturing the basic long-range structure of the genome, clone-based maps were essential for the finishing phase of the Human Genome Project, since they allowed us to modularize the finishing problem. That is, local difficulties in the sequence, primarily due to repeats, could be solved at the level of individual clones, thereby avoiding entangling sequences in one part of the genome with those from another. The finishing effort proved immense, with hundreds of expert finishers working from about 2001 until 2003 to resolve residual problems in the sequence. Of course, this polishing of the human reference sequence is ongoing. Some highly repetitive regions, such as those surrounding centromeres and telomeres, remain out of reach even by current methods.
However, the large investment in finishing paid huge dividends in subsequent uses of the human reference sequence. The current build 37 of the sequence is employed every day, everywhere in the world that genomics is practiced. It is the highest-quality multi-gigabase-pair sequence in existence and the starting point for the study of human genetic variation, the primary application of next-gen sequencing in the human and the major focus of current bioinformatics research. As a word of encouragement to MOOC students, I will turn now to some of the high-level lessons of my forty-year career in genomics. Here are a few that I think remain relevant today. First, go deep, both with the biology and the computer science. Biology is rich, complex, and constantly changing. There's no substitute for spending time with real biologists: talking to them, reading their papers, going to their seminars, learning how they think. On the other hand, computer science and statistics offer a foundation of concrete, reliable theory that biology lacks. When I started writing simulation software for our physical mapping project, the only sorting algorithm I knew was interchange sort. It has n-squared computational complexity. I didn't get very far until I learned some computer science theory, not just practical programming skills. Don't worry that there will be too much data. Genomicists are small-time players in the IT revolution. The IT industry will not rest in its hardware, networking, and software innovations until everyone in China can stream a different high-definition video to his or her mobile device simultaneously. We can continue to piggyback on this tsunami of technological innovation as we have for the past 30 years. Try to use general solutions to your problems rather than getting involved in technologies that are overly specific to biological applications. Importantly, keep what you do tied to the genome sequence.
I've watched one area of biology after another, during the past 30 years, have a renaissance as it engages the sequence more deeply. Proteomics is a dramatic example of the way the availability of the DNA sequence transformed the biochemistry of proteins. The discovery that biological information is digital, encoded in a long base-four number, was the real message of the double helix. It was one of the greatest and most unexpected scientific discoveries of all time, made 60 years ago. Biologists a thousand years from now will gaze at the same sequence we see today. Unlike astronomers, they will not be able to build better telescopes to achieve higher resolution. The ultimate resolution is the DNA base pair. The sequence is what it is. We already know most of it. Our successors will know unimaginably more about how to read its hidden truths. My generation of genomicists was preoccupied with acquiring the reference sequence. The generation represented by most students in this MOOC confronts the richer problem of starting to figure out what it means. We are just putting our toes in that water. Good luck with your studies and research.