Offered by

Machine Learning Specialization

University of Washington

About this Course

4.7

2,413 ratings

•

415 reviews

Case Studies: Analyzing Sentiment & Loan Default Prediction
In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis, and image classification.
In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are the most widely used in practice, including logistic regression, decision trees, and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant challenges you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!
Learning Objectives: By the end of this course, you will be able to:
-Describe the input and output of a classification model.
-Tackle both binary and multiclass classification problems.
-Implement a logistic regression model for large-scale classification.
-Create a non-linear model using decision trees.
-Improve the performance of any model using boosting.
-Scale your methods with stochastic gradient ascent.
-Describe the underlying decision boundaries.
-Build a classification model to predict sentiment in a product review dataset.
-Analyze financial data to predict loan defaults.
-Use techniques for handling missing data.
-Evaluate your models using precision-recall metrics.
-Implement these techniques in Python (or in the language of your choice, though Python is highly recommended)....

Start instantly and learn at your own schedule.

Reset deadlines in accordance with your schedule.

Recommended: 7 weeks of study, 5-8 hours/week...

Subtitles: English...

Skills: Logistic Regression, Statistical Classification, Classification Algorithms, Decision Tree

Week 1

Classification is one of the most widely used techniques in machine learning, with a broad array of applications, including sentiment analysis, ad targeting, spam detection, risk assessment, medical diagnosis, and image classification. The core goal of classification is to predict a category or class y from some inputs x. Through this course, you will become familiar with the fundamental models and algorithms used in classification, as well as a number of core machine learning concepts. Rather than covering all aspects of classification, you will focus on a few core techniques, which are widely used in the real world to get state-of-the-art performance. By following our hands-on approach, you will implement your own algorithms on multiple real-world tasks and deeply grasp the core techniques needed to be successful with these approaches in practice. This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have....

8 videos (Total 27 min), 3 readings

What is this course about? (6 min)

Impact of classification (1 min)

Course overview (3 min)

Outline of first half of course (5 min)

Outline of second half of course (5 min)

Assumed background (3 min)

Let's get started!

Important Update regarding the Machine Learning Specialization (10 min)

Slides presented in this module (10 min)

Reading: Software tools you'll need (10 min)

Linear classifiers are amongst the most practical classification methods. For example, in our sentiment analysis case study, a linear classifier associates a coefficient with the count of each word in the sentence. In this module, you will become proficient in this type of representation. You will focus on a particularly useful type of linear classifier called logistic regression, which, in addition to allowing you to predict a class, provides a probability associated with the prediction. These probabilities are extremely useful, since they provide a degree of confidence in the predictions. In this module, you will also be able to construct features from categorical inputs, and to tackle classification problems with more than two classes (multiclass problems). You will examine the results of these techniques on a real-world product sentiment analysis task....
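
The linear classifier and its class probabilities described above can be sketched in a few lines of Python (the Specialization's recommended language). This is a minimal illustration, not the course's graded implementation; the word-count features and coefficient values are invented for the example.

```python
import math

def sigmoid(score):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_probability(coefficients, features):
    """P(y = +1 | x, w): sigmoid of the weighted sum of the features.
    coefficients[0] is the intercept, paired with a constant feature 1."""
    score = sum(w * x for w, x in zip(coefficients, features))
    return sigmoid(score)

def predict_class(coefficients, features):
    """Predict +1 when the probability of the positive class is >= 0.5."""
    return +1 if predict_probability(coefficients, features) >= 0.5 else -1

# Illustrative sentiment features: [1 (intercept), count("awesome"), count("awful")]
w = [0.0, 1.5, -2.0]
positive_review = [1, 2, 0]  # contains "awesome" twice
negative_review = [1, 0, 1]  # contains "awful" once
```

The probability output is what distinguishes logistic regression from a bare linear classifier: the same score that determines the predicted class also expresses the model's confidence.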

18 videos (Total 78 min), 2 readings, 2 quizzes

Intuition behind linear classifiers (3 min)

Decision boundaries (3 min)

Linear classifier model (5 min)

Effect of coefficient values on decision boundary (2 min)

Using features of the inputs (2 min)

Predicting class probabilities (1 min)

Review of basics of probabilities (6 min)

Review of basics of conditional probabilities (8 min)

Using probabilities in classification (2 min)

Predicting class probabilities with (generalized) linear models (5 min)

The sigmoid (or logistic) link function (4 min)

Logistic regression model (5 min)

Effect of coefficient values on predicted probabilities (7 min)

Overview of learning logistic regression models (2 min)

Encoding categorical inputs (4 min)

Multiclass classification with 1 versus all (7 min)

Recap of logistic regression classifier (1 min)

Slides presented in this module (10 min)

Predicting sentiment from product reviews (10 min)

Linear Classifiers & Logistic Regression (10 min)

Predicting sentiment from product reviews (24 min)

Week 2

Once familiar with linear classifiers and logistic regression, you can now dive in and write your first learning algorithm for classification. In particular, you will use gradient ascent to learn the coefficients of your classifier from data. You will first need to define the quality metric for these tasks using an approach called maximum likelihood estimation (MLE). You will also become familiar with a simple technique for selecting the step size for gradient ascent. An optional, advanced part of this module will cover the derivation of the gradient for logistic regression. You will implement your own learning algorithm for logistic regression from scratch, and use it to learn a sentiment analysis classifier....
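
As a rough sketch of what you will build, the gradient-ascent update on the log-likelihood can be written as below. This is a plain batch version (the stochastic variant is covered later in the course), with a toy dataset and a hand-picked step size that are illustrative assumptions, not the assignment's code.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def gradient_ascent_logistic(feature_matrix, labels, step_size, n_iterations):
    """Learn logistic regression coefficients by batch gradient ascent.

    Labels are +1/-1; each row of feature_matrix starts with a constant 1
    for the intercept. Each partial derivative of the log-likelihood is
        dll/dw_j = sum_i x_ij * (indicator(y_i = +1) - P(y_i = +1 | x_i, w)).
    """
    n_features = len(feature_matrix[0])
    w = [0.0] * n_features
    for _ in range(n_iterations):
        # errors_i = indicator(y_i == +1) minus the predicted probability
        errors = []
        for x, y in zip(feature_matrix, labels):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            errors.append((1.0 if y == 1 else 0.0) - p)
        for j in range(n_features):
            gradient_j = sum(e * x[j] for e, x in zip(errors, feature_matrix))
            w[j] += step_size * gradient_j
    return w

# Toy data: rows are [1 (intercept), count("awesome"), count("awful")]
X = [[1, 2, 0], [1, 0, 3], [1, 1, 0], [1, 0, 1]]
y = [1, -1, 1, -1]
w = gradient_ascent_logistic(X, y, step_size=0.1, n_iterations=200)
```

Note the sign: because we maximize the likelihood, the update adds the gradient (ascent) rather than subtracting it as in gradient descent.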

18 videos (Total 83 min), 2 readings, 2 quizzes

Intuition behind maximum likelihood estimation (4 min)

Data likelihood (8 min)

Finding best linear classifier with gradient ascent (3 min)

Review of gradient ascent (6 min)

Learning algorithm for logistic regression (3 min)

Example of computing derivative for logistic regression (5 min)

Interpreting derivative for logistic regression (5 min)

Summary of gradient ascent for logistic regression (2 min)

Choosing step size (5 min)

Careful with step sizes that are too large (4 min)

Rule of thumb for choosing step size (3 min)

(VERY OPTIONAL) Deriving gradient of logistic regression: Log trick (4 min)

(VERY OPTIONAL) Expressing the log-likelihood (3 min)

(VERY OPTIONAL) Deriving probability y=-1 given x (2 min)

(VERY OPTIONAL) Rewriting the log likelihood into a simpler form (8 min)

(VERY OPTIONAL) Deriving gradient of log likelihood (8 min)

Recap of learning logistic regression classifiers (1 min)

Slides presented in this module (10 min)

Implementing logistic regression from scratch (10 min)

Learning Linear Classifiers (12 min)

Implementing logistic regression from scratch (16 min)

As we saw in the regression course, overfitting is perhaps the most significant challenge you will face as you apply machine learning approaches in practice. This challenge can be particularly significant for logistic regression, as you will discover in this module, since you not only risk getting an overly complex decision boundary, but your classifier can also become overly confident about the probabilities it predicts. In this module, you will investigate overfitting in classification in significant detail, and obtain broad practical insights from some interesting visualizations of the classifiers' outputs. You will then add a regularization term to your optimization to mitigate overfitting. You will investigate both L2 regularization, to penalize large coefficient values, and L1 regularization, to obtain additional sparsity in the coefficients. Finally, you will modify your gradient ascent algorithm to learn regularized logistic regression classifiers. You will implement your own regularized logistic regression classifier from scratch, and investigate the impact of the L2 penalty on real-world sentiment analysis data....
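
The modification to gradient ascent that this module develops is small: each non-intercept partial derivative picks up an extra penalty term. The sketch below shows one way to write it; the toy data, step size, and penalty value are illustrative assumptions, and leaving the intercept unpenalized is a common convention rather than a course requirement.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def learn_l2_logistic(feature_matrix, labels, step_size, l2_penalty, n_iterations):
    """Gradient ascent on the L2-regularized log-likelihood
        ll(w) - l2_penalty * sum_{j>=1} w_j**2,
    so each non-intercept partial derivative gains -2 * l2_penalty * w_j.
    """
    n_features = len(feature_matrix[0])
    w = [0.0] * n_features
    for _ in range(n_iterations):
        errors = []  # indicator(y_i == +1) minus the predicted probability
        for x, y in zip(feature_matrix, labels):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            errors.append((1.0 if y == 1 else 0.0) - p)
        new_w = list(w)
        for j in range(n_features):
            gradient_j = sum(e * x[j] for e, x in zip(errors, feature_matrix))
            if j >= 1:  # convention: leave the intercept unpenalized
                gradient_j -= 2.0 * l2_penalty * w[j]
            new_w[j] = w[j] + step_size * gradient_j
        w = new_w
    return w

# Toy data: rows are [1 (intercept), count("awesome"), count("awful")]
X = [[1, 2, 0], [1, 0, 3], [1, 1, 0], [1, 0, 1]]
y = [1, -1, 1, -1]
w_plain = learn_l2_logistic(X, y, step_size=0.1, l2_penalty=0.0, n_iterations=200)
w_reg = learn_l2_logistic(X, y, step_size=0.1, l2_penalty=1.0, n_iterations=200)
```

On separable data like this toy set, the unregularized coefficients keep growing (and the predicted probabilities approach 0 or 1), while the penalty holds the regularized coefficients at a finite equilibrium, which is exactly the overconfidence effect this module visualizes.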

13 videos (Total 66 min), 2 readings, 2 quizzes

Review of overfitting in regression (3 min)

Overfitting in classification (5 min)

Visualizing overfitting with high-degree polynomial features (3 min)

Overfitting in classifiers leads to overconfident predictions (5 min)

Visualizing overconfident predictions (4 min)

(OPTIONAL) Another perspective on overfitting in logistic regression (8 min)

Penalizing large coefficients to mitigate overfitting (5 min)

L2 regularized logistic regression (4 min)

Visualizing effect of L2 regularization in logistic regression (5 min)

Learning L2 regularized logistic regression with gradient ascent (7 min)

Sparse logistic regression with L1 regularization (7 min)

Recap of overfitting & regularization in logistic regression

Slides presented in this module (10 min)

Logistic Regression with L2 regularization (10 min)

Overfitting & Regularization in Logistic Regression (16 min)

Logistic Regression with L2 regularization (16 min)

Week 3

Along with linear classifiers, decision trees are amongst the most widely used classification techniques in the real world. This method is extremely intuitive, simple to implement, and provides interpretable predictions. In this module, you will become familiar with the core decision tree representation. You will then design a simple, recursive greedy algorithm to learn decision trees from data. Finally, you will extend this approach to deal with continuous inputs, a fundamental requirement for practical problems. In this module, you will investigate a brand-new case study in the financial sector: predicting the risk associated with a bank loan. You will implement your own decision tree learning algorithm on real loan data....
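
The greedy step at the heart of the algorithm, picking which feature to split on by comparing classification error, might be sketched as follows. The loan features, labels, and use of classification error as the quality metric are illustrative choices for binary features, not the course's assignment code.

```python
def node_error(labels):
    """Classification error of a node that predicts its majority class."""
    n_positive = sum(1 for y in labels if y == 1)
    return min(n_positive, len(labels) - n_positive) / len(labels)

def best_feature_to_split(data, labels, features):
    """Greedy step: choose the binary feature whose split yields the
    lowest weighted classification error across the two child nodes."""
    best_feature, best_error = None, float("inf")
    for f in features:
        left = [y for x, y in zip(data, labels) if x[f] == 0]
        right = [y for x, y in zip(data, labels) if x[f] == 1]
        error = 0.0
        if left:
            error += (len(left) / len(labels)) * node_error(left)
        if right:
            error += (len(right) / len(labels)) * node_error(right)
        if error < best_error:
            best_feature, best_error = f, error
    return best_feature

# Invented loan records: +1 = safe loan, -1 = risky loan.
loans = [
    {"good_credit": 1, "high_debt": 0},
    {"good_credit": 0, "high_debt": 0},
    {"good_credit": 1, "high_debt": 1},
    {"good_credit": 0, "high_debt": 1},
]
outcomes = [1, 1, -1, -1]
```

Recursing on each child node with the remaining features, and stopping when a node is pure or no features remain, gives the full greedy tree-learning algorithm.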

13 videos (Total 47 min), 3 readings, 3 quizzes

Intuition behind decision trees (1 min)

Task of learning decision trees from data (3 min)

Recursive greedy algorithm (4 min)

Learning a decision stump (3 min)

Selecting best feature to split on (6 min)

When to stop recursing (4 min)

Making predictions with decision trees (1 min)

Multiclass classification with decision trees (2 min)

Threshold splits for continuous inputs (6 min)

(OPTIONAL) Picking the best threshold to split on (3 min)

Visualizing decision boundaries (5 min)

Recap of decision trees

Slides presented in this module (10 min)

Identifying safe loans with decision trees (10 min)

Implementing binary decision trees (10 min)

Decision Trees (22 min)

Identifying safe loans with decision trees (14 min)

Implementing binary decision trees (14 min)

Week 4

Out of all machine learning techniques, decision trees are amongst the most prone to overfitting, and no practical implementation is possible without approaches that mitigate this challenge. In this module, through various visualizations and investigations, you will examine why decision trees suffer from significant overfitting problems. Using the principle of Occam's razor, you will mitigate overfitting by learning simpler trees. First, you will design algorithms that stop the learning process before the decision trees become overly complex. In an optional segment, you will design a very practical approach that learns an overly complex tree, and then simplifies it with pruning. Your implementation will investigate the effect of these techniques on mitigating overfitting on our real-world loan data set. ...
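
The "stop before the tree becomes overly complex" idea amounts to extra checks at each node of the recursive algorithm. One way to sketch that logic, combining the ordinary stopping conditions with two early-stopping rules, is below; the threshold values and the function name are placeholders, not the course's defaults.

```python
def stopping_reason(labels, remaining_features, depth, max_depth, min_node_size):
    """Decide whether tree growing should stop at this node.

    The first two checks are the greedy algorithm's ordinary stopping
    conditions; the last two are the early-stopping rules that keep the
    learned tree simple (Occam's razor)."""
    if len(set(labels)) <= 1:
        return "node is pure"
    if not remaining_features:
        return "no features left to split on"
    if depth >= max_depth:
        return "maximum depth reached"        # early stopping rule 1
    if len(labels) < min_node_size:
        return "node too small to split"      # early stopping rule 2
    return None  # keep growing the tree
```

Pruning, covered in the optional videos, takes the opposite route: grow a deep tree first, then cut back subtrees whose complexity is not justified by their error reduction.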

8 videos (Total 40 min), 2 readings, 2 quizzes

Overfitting in decision trees (5 min)

Principle of Occam's razor: Learning simpler decision trees (5 min)

Early stopping in learning decision trees (6 min)

(OPTIONAL) Motivating pruning (8 min)

(OPTIONAL) Pruning decision trees to avoid overfitting (6 min)

(OPTIONAL) Tree pruning algorithm (3 min)

Recap of overfitting and regularization in decision trees (1 min)

Slides presented in this module (10 min)

Decision Trees in Practice (10 min)

Preventing Overfitting in Decision Trees (22 min)

Decision Trees in Practice (28 min)

Real-world machine learning problems are fraught with missing data; that is, very often some of the inputs are not observed for all data points. This challenge is significant, arises in most applications, and must be addressed carefully to obtain good performance, yet it is rarely discussed in machine learning courses. In this module, you will tackle the missing data challenge head on. You will start with the two most basic techniques to convert a dataset with missing data into a clean dataset, namely skipping missing values and imputing missing values. In an advanced section, you will also design a modification of the decision tree learning algorithm that builds decisions about missing data right into the model. You will also explore these techniques in your real-data implementation. ...
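
The two basic strategies named above, skipping and imputing, might look like this in Python. Representing missing values as `None`, and imputing with each feature's majority (mode) value, are simple illustrative choices; the toy records are invented, and this sketch assumes every feature is observed at least once.

```python
def skip_missing(data, labels):
    """Strategy 1: drop every data point with any missing (None) value."""
    kept = [(x, y) for x, y in zip(data, labels)
            if all(v is not None for v in x.values())]
    return [x for x, _ in kept], [y for _, y in kept]

def impute_missing(data):
    """Strategy 2: fill each missing value with that feature's mode
    (majority value) among the observed data points."""
    features = data[0].keys()
    modes = {}
    for f in features:
        observed = [x[f] for x in data if x[f] is not None]
        modes[f] = max(set(observed), key=observed.count)
    return [{f: (x[f] if x[f] is not None else modes[f]) for f in features}
            for x in data]

# Invented loan records with a missing value in the last row.
records = [{"debt": 1}, {"debt": 1}, {"debt": 0}, {"debt": None}]
outcomes = [1, 1, -1, -1]
```

Skipping is safe but can discard a large fraction of the data; imputing keeps every data point at the cost of injecting a guess, which is the trade-off this module examines before building missing-data handling into the tree itself.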

6 videos (Total 25 min), 1 reading, 1 quiz

Strategy 1: Purification by skipping missing data (4 min)

Strategy 2: Purification by imputing missing data (4 min)

Modifying decision trees to handle missing data (4 min)

Feature split selection with missing data (5 min)

Recap of handling missing data (1 min)

Slides presented in this module (10 min)

Handling Missing Data (14 min)

By SS • Oct 16th 2016

Hats off to the team who put the course together! Prof Guestrin is a great teacher. The course gave me in-depth knowledge regarding classification and the math and intuition behind it. It was fun!

By CJ • Jan 25th 2017

Very impressive course. I would recommend taking courses 1 and 2 in this specialization first, since this course skips over some things that are explained thoroughly in those courses.

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

This Specialization from leading researchers at the University of Washington introduces you to the exciting, high-demand field of Machine Learning. Through a series of practical case studies, you will gain applied experience in major areas of Machine Learning including Prediction, Classification, Clustering, and Information Retrieval. You will learn to analyze large and complex datasets, create systems that adapt and improve over time, and build intelligent applications that can make predictions from data....

When will I have access to the lectures and assignments?

Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

Is financial aid available?

Have more questions? Visit the Learner Help Center.