This course introduces you to one of the main types of modeling families of supervised Machine Learning: Classification. You will learn how to train predictive models to classify categorical outcomes and how to use error metrics to compare across different models. The hands-on section of this course focuses on using best practices for classification, including train and test splits, and handling data sets with unbalanced classes.
课程信息
您将获得的技能
提供方

IBM
IBM offers a wide range of technology and consulting services; a broad portfolio of middleware for collaboration, predictive analytics, software development and systems management; and the world's most advanced servers and supercomputers. Utilizing its business consulting, technology and R&D expertise, IBM helps clients become "smarter" as the planet becomes more digitally interconnected. IBM invests more than $6 billion a year in R&D, just completing its 21st year of patent leadership. IBM Research has received recognition beyond any commercial technology research organization and is home to 5 Nobel Laureates, 9 US National Medals of Technology, 5 US National Medals of Science, 6 Turing Awards, and 10 Inductees in US Inventors Hall of Fame.
教学大纲 - 您将从这门课程中学到什么
Logistic Regression
Logistic regression is one of the most studied and widely used classification algorithms, probably due to its popularity in regulated industries and financial settings. Although more modern classifiers might likely output models with higher accuracy, logistic regressions are great baseline models due to their high interpretability and parametric nature. This module will walk you through extending a linear regression example into a logistic regression, as well as the most common error metrics that you might want to use to compare several classifiers and select that best suits your business problem.
K Nearest Neighbors
K Nearest Neighbors is a popular classification method because they are easy computation and easy to interpret. This module walks you through the theory behind k nearest neighbors as well as a demo for you to practice building k nearest neighbors models with sklearn.
Support Vector Machines
This module will walk you through the main idea of how support vector machines construct hyperplanes to map your data into regions that concentrate a majority of data points of a certain class. Although support vector machines are widely used for regression, outlier detection, and classification, this module will focus on the latter.
Decision Trees
Decision tree methods are a common baseline model for classification tasks due to their visual appeal and high interpretability. This module walks you through the theory behind decision trees and a few hands-on examples of building decision tree models for classification. You will realize the main pros and cons of these techniques. This background will be useful when you are presented with decision tree ensembles in the next module.
Ensemble Models
Ensemble models are a very popular technique as they can assist your models be more resistant to outliers and have better chances at generalizing with future data. They also gained popularity after several ensembles helped people win prediction competitions. Recently, stochastic gradient boosting became a go-to candidate model for many data scientists.
Modeling Unbalanced Classes
Some classification models are better suited than others to outliers, low occurrence of a class, or rare events. The most common methods to add robustness to a classifier are related to stratified sampling to re-balance the training data. This module will walk you through both stratified sampling methods and more novel approaches to model data sets with unbalanced classes.
常见问题
我什么时候能够访问课程视频和作业?
我订阅此证书后会得到什么?
完成课程后,我会获得大学学分吗?
还有其他问题吗?请访问 学生帮助中心。