An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

A course from Johns Hopkins University

Statistics for Genomic Data Science

116 ratings

From this lesson

Module 4

In this week we will cover a lot of the general pipelines people use to analyze specific data types like RNA-seq, GWAS, ChIP-Seq, and DNA Methylation studies.

- Jeff Leek, PhD, Associate Professor, Biostatistics

Bloomberg School of Public Health

In this class, I've tried to cover the key topics of statistical genomics,

but as you might imagine, this is a really complicated topic.

And actually, a whole sequence of classes has been designed around just this one

topic that's being taught by my colleague Rafael Irizarry up at edX.

And I think that's a really useful set of classes. If you've

enjoyed what you've learned here and

you want to get a little more in depth, this is a great place to start.

But it turns out that genomics, and

statistical genomics in particular, is a huge area of research

that's actively ongoing, and so often you need to get some help.

And that help isn't always in the form of classes. In a class,

we can formalize the basic ideas, but

it's very hard to formalize the latest and greatest for any new technology.

I do suggest that you take more statistics classes if you're interested.

We also teach a set of statistics classes in data science

that are more general purpose, but will cover some of the topics we've learned

here in a little bit more depth.

So one thing to keep in mind is that

no matter how much you've learned, in these classes or

in other classes, there's always a lot more to learn.

And so there's this sort of default assumption that once you've

learned the basic concepts of statistics,

you can just apply them without thinking about it too much.

And I, for

one, am of the opinion that statistics is a very complicated and difficult topic

that deserves careful attention just like any other part of the scientific endeavor.

And so, this is a post I wrote kind of tongue-in-cheek talking about how we

require surgeons to be very well trained and to be very careful,

but sometimes we let data scientists kind of get away with whatever they want.

And so, I think it's important that you know when and where to look for help.

And so you can look for help locally.

If you're here at Johns Hopkins, we have a biostatistics consulting center which

can help out with statistical genomics experiments.

And at many research institutions, universities, or

colleges, there'll be a consulting center that can be a local resource for

you to talk to people about statistics.

And then you could also go to online resources like Stack Overflow

if you want to ask questions about coding or R programming.

If you have questions about Bioconductor packages,

you can go to the Bioconductor support site.

Or you can go to sequencing-specific support sites, like SEQanswers, which will

give you more answers to the questions that are specific to the latest and

greatest sequencing types that maybe people are seeing in their labs or

in their research groups.

So the key point, the key take-home message though, is that

if you're getting a little bit uncomfortable with the statistics or

the functions that you're using, it's worth looking into it more and

it's worth going and asking for help.