This course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.

This session starts where the Data Analysis Tools course left off. This first set of videos provides you with some conceptual background about the major types of data you may work with, which will increase your competence in choosing the statistical analysis that’s most appropriate given the structure of your data, and in understanding the limitations of your data set. We also introduce you to the concept of confounding variables, which are variables that may be the reason for the association between your explanatory and response variable. Finally, you will gain experience in describing your data by writing about your sample, the study data collection procedures, and your measures and data management steps. ...

Some Guidance for Learners New to the Specialization (10 min)

Getting Set up for Assignments (10 min)

Tumblr Instructions (10 min)

How to Write About Data (10 min)

Writing About Your Data: Example Assignment (10 min)

Week 2

4 hours to complete

Basics of Linear Regression

In this session, we discuss the importance of testing for confounding in more depth, and provide examples of situations in which a confounding variable can explain the association between an explanatory and a response variable. In addition, now that you have statistically tested the association between an explanatory variable and your response variable, you will test and interpret this association using basic linear regression analysis for a quantitative response variable. You will also learn how the linear regression model can be used to predict your observed response variable. Finally, we will discuss the statistical assumptions underlying the linear regression model, and show you some best practices for coding your explanatory variables.
Note that if your research question does not include one quantitative response variable, you can use one from your data set just to get some practice with the tool.
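As a rough illustration of the basic linear regression described above, here is a minimal sketch in Python. It assumes NumPy is available and uses a small made-up data set; the variable names and the choice to center the explanatory variable are illustrative assumptions, not the course's own code.

```python
import numpy as np

# Hypothetical data: x is a quantitative explanatory variable,
# y is a quantitative response variable.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Centering x (an assumed coding choice) makes the intercept the
# predicted response at the mean of x.
x_c = x - x.mean()

# Ordinary least squares estimates for y = b0 + b1 * x_c.
b1 = np.sum(x_c * (y - y.mean())) / np.sum(x_c ** 2)
b0 = y.mean() - b1 * x_c.mean()

# Predicted values and the share of response variability explained.
y_hat = b0 + b1 * x_c
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"intercept={b0:.3f}, slope={b1:.3f}, R^2={r_squared:.3f}")
```

The slope is the expected change in the response for a one-unit increase in the explanatory variable, and `y_hat` holds the model's predictions for the observed cases.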
...

Multiple regression analysis is a tool that allows you to expand on your research question, and conduct a more rigorous test of the association between your explanatory and response variables by adding additional quantitative and/or categorical explanatory variables to your linear regression model. In this session, you will apply and interpret a multiple regression analysis for a quantitative response variable, and will learn how to use confidence intervals to take into account error in estimating a population parameter. You will also learn how to account for nonlinear associations in a linear regression model. Finally, you will develop experience using regression diagnostic techniques to evaluate how well your multiple regression model predicts your observed response variable.
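The ideas in this session (several explanatory variables, a term for a nonlinear association, confidence intervals, and residuals as a diagnostic) can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, NumPy is assumed to be available, the nonlinear association is modeled with a quadratic term, and the intervals use a large-sample normal approximation rather than the t distribution.

```python
import numpy as np
from statistics import NormalDist

# Simulated (hypothetical) data with a curved association between x1 and y.
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(-2, 2, n)   # quantitative explanatory variable
x2 = rng.integers(0, 2, n)   # binary categorical explanatory variable (0/1)
y = 1.0 + 0.5 * x1 + 2.0 * x1 ** 2 + 1.5 * x2 + rng.normal(0, 1, n)

# Design matrix: intercept, linear term, quadratic term, second predictor.
X = np.column_stack([np.ones(n), x1, x1 ** 2, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals are the raw material for diagnostic plots.
resid = y - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof

# Standard errors from the estimated coefficient covariance matrix.
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Approximate 95% confidence intervals (normal approximation, an assumption).
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = beta - z * se, beta + z * se

for name, b, lo, hi in zip(["intercept", "x1", "x1^2", "x2"], beta, ci_low, ci_high):
    print(f"{name:>9}: {b:6.3f}  [{lo:6.3f}, {hi:6.3f}]")
```

A confidence interval for a coefficient that excludes zero suggests the corresponding term contributes to the model; plotting `resid` against the fitted values is one common way to look for remaining curvature.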
Note that if you have not yet identified additional explanatory variables, you should choose at least one additional explanatory variable from your data set. When you go back to your codebooks, ask yourself a few questions like “What other variables might explain the association between my explanatory and response variable?”; “What other variables might explain more of the variability in my response variable?”, or even “What other explanatory variables might be interesting to explore?” Additional explanatory variables can be either quantitative, categorical, or both. Although you need only two explanatory variables to test a multiple regression model, we encourage you to identify more than one additional explanatory variable. Doing so will really allow you to experience the power of multiple regression analysis, and will increase your confidence in your ability to test and interpret more complex regression models. If your research question does not include one quantitative response variable, you can use the same quantitative response variable that you used in Module 2, or you may choose another one from your data set. ...
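One way to explore the question "What other variables might explain the association between my explanatory and response variable?" is to compare a coefficient before and after adding the candidate variable. The sketch below assumes NumPy and uses simulated data in which a hypothetical variable `z` drives both `x` and `y`; the helper `ols` is defined here for illustration only.

```python
import numpy as np

# Simulated (hypothetical) data: z influences both x and y,
# while x has no direct effect on y.
rng = np.random.default_rng(42)
n = 500
z = rng.normal(0, 1, n)             # candidate confounding variable
x = z + rng.normal(0, 1, n)         # explanatory variable, partly driven by z
y = 3.0 * z + rng.normal(0, 1, n)   # response depends on z only


def ols(columns, response):
    """Return least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(response)), *columns])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    return coef


unadjusted = ols([x], y)      # y regressed on x alone
adjusted = ols([x, z], y)     # y regressed on x and z together

print(f"slope for x, unadjusted:     {unadjusted[1]:.3f}")
print(f"slope for x, adjusted for z: {adjusted[1]:.3f}")
```

If the coefficient for `x` shrinks toward zero once `z` is in the model, as it should in this simulated setup, `z` is a plausible explanation for the original association.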

In this session, we will discuss some things that you should keep in mind as you continue to use data analysis in the future. We will also teach you how to test a categorical explanatory variable with more than two categories in a multiple regression analysis. Finally, we introduce you to logistic regression analysis for a binary response variable with multiple explanatory variables. Logistic regression is an extension of the linear regression model, so the basic idea is the same as a multiple regression analysis. Unlike the multiple regression model, however, the logistic regression model is designed for binary response variables. You will gain experience testing and interpreting a logistic regression model, including using odds ratios and confidence intervals to determine the magnitude of the association between your explanatory variables and response variable.
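For the categorical explanatory variable with more than two categories mentioned above, one common approach is indicator (dummy) coding against a reference category. The following is a minimal sketch, assuming NumPy and a tiny made-up data set; the category labels and the choice of reference group are hypothetical.

```python
import numpy as np

# Hypothetical data: a three-category explanatory variable and a
# quantitative response.
group = np.array(["low", "low", "mid", "mid", "high", "high"])
y = np.array([1.0, 3.0, 4.0, 6.0, 9.0, 11.0])

# Choose a reference category; every other category gets a 0/1 indicator.
reference = "low"
levels = [lvl for lvl in ["low", "mid", "high"] if lvl != reference]
dummies = [(group == lvl).astype(float) for lvl in levels]

# Regress the response on the intercept plus the indicator columns.
X = np.column_stack([np.ones(len(y)), *dummies])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"mean of reference group ({reference}): {coef[0]:.2f}")
for lvl, b in zip(levels, coef[1:]):
    print(f"{lvl} minus {reference}: {b:.2f}")
```

With this coding, the intercept is the mean response in the reference category and each remaining coefficient is the difference between that category's mean and the reference mean.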
You can use the same explanatory variables that you used to test your multiple regression model with a quantitative outcome, but your response variable needs to be binary (categorical with 2 categories). If you have a quantitative response variable, you will have to bin it into 2 categories. Alternatively, you can choose a different binary response variable from your data set that you can use to test a logistic regression model. If you have a categorical response variable with more than two categories, you will need to collapse it into two categories.
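The binning step and the logistic regression itself can be sketched as below. This is an illustration under assumptions, not the course's own code: the data are simulated, the quantitative response is split at its median (one possible cut point), the model is fitted with a hand-written Newton-Raphson loop using NumPy, and the confidence intervals rely on a large-sample normal approximation.

```python
import numpy as np
from statistics import NormalDist

# Simulated (hypothetical) data with a quantitative response.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(0, 1, n)
score = 0.8 * x + rng.normal(0, 1, n)

# Bin the quantitative response into 2 categories (median split, an assumption).
y = (score > np.median(score)).astype(float)

# Fit the logistic regression by Newton-Raphson on the log-likelihood.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
    w = p * (1.0 - p)                      # observation weights
    gradient = X.T @ (y - p)
    information = X.T @ (X * w[:, None])
    beta = beta + np.linalg.solve(information, gradient)

# Standard errors from the inverse information matrix at the final estimate.
p = 1.0 / (1.0 + np.exp(-X @ beta))
w = p * (1.0 - p)
se = np.sqrt(np.diag(np.linalg.inv(X.T @ (X * w[:, None]))))

# Odds ratios and approximate 95% confidence intervals.
z = NormalDist().inv_cdf(0.975)
odds_ratio = np.exp(beta)
or_low, or_high = np.exp(beta - z * se), np.exp(beta + z * se)

print(f"odds ratio for x: {odds_ratio[1]:.2f} [{or_low[1]:.2f}, {or_high[1]:.2f}]")
```

An odds ratio above 1 with a confidence interval that excludes 1 indicates that higher values of the explanatory variable are associated with higher odds of being in the category coded 1.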
...

Awesome course. More than regression generation, they have explained in details about how to interpret regression coefficients and results and how to make conclusions. 5 Stars

By PC • Nov 28th 2016

This was a great course. I've done a few in the area of stats, regression and machine learning now and the Wesleyan ones are the most well-rounded of all of them

At Wesleyan, distinguished scholar-teachers work closely with students, taking advantage of fluidity among disciplines to explore the world with a variety of tools. The university seeks to build a diverse, energetic community of students, faculty, and staff who think critically and creatively and who value independence of mind and generosity of spirit.
...

About the Data Analysis and Interpretation Specialization

Learn SAS or Python programming, expand your knowledge of analytical methods and applications, and conduct original research to inform complex decisions.
The Data Analysis and Interpretation Specialization takes you from data novice to data expert in just four project-based courses. You will apply basic data science tools, including data management and visualization, modeling, and machine learning using your choice of either SAS or Python, including pandas and Scikit-learn. Throughout the Specialization, you will analyze a research question of your choice and summarize your insights. In the Capstone Project, you will use real data to address an important issue in society, and report your findings in a professional-quality report. You will have the opportunity to work with our industry partners, DRIVENDATA and The Connection. Help DRIVENDATA solve some of the world's biggest social challenges by joining one of their competitions, or help The Connection better understand recidivism risk for people on parole in substance use treatment. Regular feedback from peers will provide you a chance to reshape your question. This Specialization is designed to help you whether you are considering a career in data, work in a context where supervisors are looking to you for data insights, or you just have some burning questions you want to explore. No prior experience is required. By the end you will have mastered statistical methods to conduct original research to inform complex decisions....