About this Specialization

Ask the right questions, manipulate data sets, and create visualizations to communicate results.

This Specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.

100% online courses

Start instantly and learn at your own schedule.

Beginner Level

You should have beginner-level experience in Python. Familiarity with regression is recommended.

Approx. 9 months to complete

Suggested 5 hours/week

English

Subtitles: English, French, Chinese (Simplified), Greek, Italian, Portuguese (Brazilian), Vietnamese, Russian, Turkish, Hebrew, Japanese

What you will learn

  • Build models based on new data types, experimental design, and statistical inference
  • Create products that can be used to tell stories about data to a mass audience
  • Formulate context-relevant questions and hypotheses to drive data scientific research
  • Utilize tools that transform and interpret large-scale datasets

Skills you will gain

  • R Programming
  • GitHub
  • Machine Learning
  • Data Cleansing

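How the Specialization works

Take courses

A Coursera Specialization is a series of courses that helps you master a skill. To begin, enroll in the Specialization directly, or review its courses and choose the one you'd like to start with. When you subscribe to a course that is part of a Specialization, you are automatically subscribed to the full Specialization. It is fine to complete just one course, and you can pause your learning or end your subscription at any time. Visit your learner dashboard to track your course enrollments and your progress.

Hands-on project

Every Specialization includes a hands-on project. You need to successfully finish the project(s) to complete the Specialization and earn your certificate. If the Specialization includes a separate course for the hands-on project, you need to finish all of the other courses before you can start it.

Earn a certificate

When you finish every course and complete the hands-on project, you will earn a certificate that you can show to prospective employers and share with your professional network.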

There are 10 courses in this Specialization

Course 1

The Data Scientist’s Toolbox

4.5
14,592 ratings
3,083 reviews
In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program like version control, markdown, git, GitHub, R, and RStudio....
Course 2

R Programming

4.6
11,146 ratings
2,400 reviews
In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, which include programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples....
Course 3

Getting and Cleaning Data

4.5
4,775 ratings
779 reviews
Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data....
Course 4

Exploratory Data Analysis

4.7
3,638 ratings
550 reviews
This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data....
Course 5

Reproducible Research

4.5
2,528 ratings
387 reviews
This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools which allow one to publish data analyses in a single document that allows others to easily execute the same analysis to obtain the same results....
Course 6

Statistical Inference

4.1
2,533 ratings
538 reviews
Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies, and nuance. This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data....
Course 7

Regression Models

4.4
2,011 ratings
360 reviews
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares, and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models, including scatterplot smoothing....
Course 8

Practical Machine Learning

4.5
1,926 ratings
386 reviews
Prediction and machine learning are among the most common tasks performed by data scientists and data analysts. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation....
Course 9

Developing Data Products

4.5
1,325 ratings
274 reviews
A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm, or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience....
Course 10

Data Science Capstone

4.5
639 ratings
177 reviews
The capstone project class will allow students to create a usable, public data product that they can use to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners....

Instructors

Jeff Leek, PhD

Associate Professor, Biostatistics

Roger D. Peng, PhD

Associate Professor, Biostatistics

Brian Caffo, PhD

Professor, Biostatistics

About Johns Hopkins University

The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world....

Frequently Asked Questions

  • What is the refund policy?

  • Can I just enroll in a single course?

  • Is financial aid available?

  • Can I take the course for free?

  • Is this course really 100% online? Do I need to attend any classes in person?

  • Will I earn university credit for completing the Specialization?

  • How long does it take to complete the Specialization?

  • How often is each course in the Specialization offered?

  • What background knowledge is necessary?

  • Do I need to take the courses in a specific order?

  • What will I be able to do upon completing the Specialization?

  • Can I sign up for the course without paying or applying for financial aid?

More questions? Visit the Learner Help Center.