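Are you curious about what data can tell you? Do you want a deep understanding of the core ways machine learning can improve a business? Do you want to be able to talk with experts about anything from regression and classification to deep learning and recommender systems? In this course, you will gain hands-on experience through a series of practical case studies. By the end of this course,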


A course from University of Washington

Machine Learning Foundations: A Case Study Approach

7583 ratings


From this lesson

Classification: Analyzing Sentiment

How do you guess whether a person felt positively or negatively about an experience, just from a short review they wrote?

In our second case study, analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). This task is an example of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.

You will analyze the accuracy of your classifier, implement an actual classifier in an iPython notebook, and take a first stab at a core piece of the intelligent application you will build and deploy in your capstone.

- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

- Emily Fox, Amazon Professor of Machine Learning, Statistics

[MUSIC]

So classifiers are really trying to make decisions.

Decisions as to whether a sentence is positive or negative, or

whether a set of lab tests plus x-rays

plus measurements lead to a certain disease like flu or cold.

That's a decision that needs to be made.

So let's talk a little bit about how classifiers, especially linear classifiers,

make decisions.

To understand decision boundaries,

suppose you only had two words with non-zero weight.

You have awesome, with a positive weight of 1.0, and

awful, which is just awful, so it has a negative weight of -1.5.

If you have this situation then the score is gonna be 1 times the number of

awesomes in the sentence, minus 1.5 times the number of awfuls.
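
The score described here can be written out as a minimal sketch. The two weights come from the lecture; the names `WEIGHTS` and `score` are made up for illustration, and every other word is assumed to have weight zero.

```python
# Weights from the lecture; all other words are assumed to have weight 0.
WEIGHTS = {"awesome": 1.0, "awful": -1.5}

def score(num_awesome, num_awful):
    """Linear score: 1.0 * #awesome - 1.5 * #awful."""
    return WEIGHTS["awesome"] * num_awesome + WEIGHTS["awful"] * num_awful

print(score(2, 1))  # two awesomes, one awful -> 0.5
```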

So we can plot this on an axis, there's the awesome axis and

then there's the awful axis.

So for example, the sentence, the sushi was awesome,

the food was awesome, but the service was awful.

Then that has two awesomes and one awful.

It's plotted at the point (2,1) on the axes.

And similarly for something that has, say, three awfuls and one awesome, at the point (1,3),

and for something that is all awesome, so three awesomes is at a point (3,0),

and so on for other sentences.
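
Turning a sentence into a point on these axes amounts to counting the two words. The sketch below is one simple way to do that, assuming lowercase matching and a basic regular-expression tokenizer; the helper name `to_point` is hypothetical and not from the course.

```python
import re

def to_point(sentence):
    """Return (#awesome, #awful) for a sentence, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return (words.count("awesome"), words.count("awful"))

review = "The sushi was awesome, the food was awesome, but the service was awful."
print(to_point(review))  # (2, 1)
```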

Now, let's understand a little better how we scored the sentences and

what does that imply about our decisions.

So, for example, take the point (3,0): three awesomes, no awfuls.

Three awesomes gives you a positive

prediction because the score is greater than zero.

And that is true for every point on the bottom right of the plot.

While the points on the top left all have a score less than zero. So, for

example, the point with three awfuls and one awesome has a score less than zero.

So, those get labeled negative.
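
As a rough sketch, the labeling rule is just the sign of the score. The function names are illustrative. A score of exactly zero has not been discussed at this point in the lecture, so this sketch arbitrarily lumps it in with negative; that choice is an assumption.

```python
def score(num_awesome, num_awful):
    """Linear score with the lecture's weights: 1.0 and -1.5."""
    return 1.0 * num_awesome - 1.5 * num_awful

def predict(num_awesome, num_awful):
    """Positive if the score is above zero; otherwise negative (assumption for ties)."""
    return "positive" if score(num_awesome, num_awful) > 0 else "negative"

for point in [(3, 0), (2, 1), (1, 3)]:
    print(point, score(*point), predict(*point))
```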

And, in fact, what separates the negative predictions from the positive predictions

is the line that defines the places where I don't know what's positive and

what's negative, and that's the line where 1.0 #awesome - 1.5 #awful = 0.

And that's the line where the prediction is uncertain, and

so we call that the decision boundary.

Everything on one side we predict is positive,

everything on the other we predict is negative.
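
To make the boundary concrete, the sketch below solves 1.0 #awesome - 1.5 #awful = 0 for #awful and reports which side of the line a point falls on. The names `boundary_awful` and `side` are hypothetical, chosen only for this example.

```python
W_AWESOME, W_AWFUL = 1.0, -1.5  # weights from the lecture

def boundary_awful(num_awesome):
    """#awful value that puts the score at exactly zero for a given #awesome."""
    return -W_AWESOME * num_awesome / W_AWFUL

def side(num_awesome, num_awful):
    """Which side of the decision boundary a point falls on."""
    s = W_AWESOME * num_awesome + W_AWFUL * num_awful
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on the boundary"

print(boundary_awful(3))  # 2.0
print(side(3, 2))         # on the boundary
```

Rearranged, the boundary is #awful = (2/3) #awesome: a straight line through the origin, with positive predictions below it and negative predictions above it.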

Now notice that the decision boundary,

1.0 times #awesome minus 1.5 times #awful equals 0, is a line.

And so that's why it's called a linear classifier.

It's a linear decision boundary.

So decision boundaries are what separate the positive

predictions from the negative predictions.

So, in the case of just two features, we see this is just a line.

But that situation will differ as we increase the number of features.

So in two dimensions, a linear decision boundary is a line.

In three dimensions, we have, for example, three words that have non-zero weight and

everything else has zero weight, and we get a plane.

And so it's a little hard to draw in 3D, but

the positive predictions here are above the plane and the negative predictions

here are below the plane and the plane is somehow inclined in the space.

Now, you're not gonna have just three words with non-zero weight; in

a real application you're gonna have tens of thousands of words with non-zero weight.

And in that case we call the decision boundaries hyperplanes,

really high-dimensional separators.
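
With many words, the same score becomes a weighted sum over the whole vocabulary, and the hyperplane is the set of points where that sum equals zero. Here is a minimal sketch, assuming weights and word counts are stored in dictionaries; the weight for "great" is invented for illustration, and only "awesome" and "awful" come from the lecture.

```python
def score(weights, counts):
    """Weighted sum of word counts; words without a weight count as 0."""
    return sum(weights.get(word, 0.0) * n for word, n in counts.items())

def predict(weights, counts):
    """Sign of the score decides the side of the hyperplane."""
    s = score(weights, counts)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on the boundary"

# "great" is a hypothetical third weight; a real model would have thousands.
weights = {"awesome": 1.0, "awful": -1.5, "great": 0.5}
counts = {"awesome": 1, "awful": 2, "great": 1, "the": 4}

print(score(weights, counts))    # -1.5
print(predict(weights, counts))  # negative
```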

Now, of course you can use more than linear classifiers.

You can use more complex classifiers.

And those, instead of having lines or

hyperplanes, they have more complicated shapes or squiggly separations.

And we're gonna learn more about that in the classification course.

>> [MUSIC]