1:23

So quoting from Wikipedia,

machine learning is the construction and study of systems that can learn from data.

And as you know, we've got lots of data in the stock market.

Machine learning can be used for classification.

In other words, think about face recognition.

You show a machine learning system someone's face, can it recognize it?

Can it classify that face?

3:20

Now here's a real simple example.

I made this data up so don't go looking it up on the web and

discovering that my data here is wrong.

I understand that it's made up so bear with me, but

I thought it was a decent example.

So we know that change in barometric pressure affects whether it's gonna rain.

So here's a little chart of data, again, made-up data, where we

have along the x-axis there, different changes in barometric pressure.

So if the barometric pressure goes up a lot over to the right,

4:03

that means we're not very likely at all to see rain.

The vertical axis here is how much rain we get.

If barometric pressure goes down, we're much more likely to see rain.

Each dot there represents one day's observation, where the x position

represents what the change of barometric pressure was and

the y represents how much did it rain that day.

And I've colored the instances where it rained with green dots,

and where it didn't rain with red dots.

4:48

Now we can fit a model to this data; it's represented by that blue line there.

And it's a simple model which says, y, in other words,

a predicted amount of rain, is equal to mx + b.

So, remember, x is the change in barometric pressure, m is the slope, and

b is essentially how high up or down that line sits.

5:14

Now one thing to mention here, just as an aside,

the CAPM is also a linear model, one that relates market returns to expected returns.

So, y is an expected return, m is beta,

x is how much the market changed, and b is actually our alpha.

So just to draw a little bit of an analogy there, CAPM is a linear model.
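To make that analogy concrete, here's a tiny sketch of CAPM in the same y = mx + b form; the beta, alpha, and market-return numbers are hypothetical, not from the lecture:

```python
# CAPM in y = mx + b form: expected return = beta * market return + alpha
beta, alpha = 1.2, 0.001      # hypothetical values for some stock
market_return = 0.01          # say the market went up 1% today

expected_return = beta * market_return + alpha
```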

5:41

Anyways, here's a linear model for this rain problem.

And it models it okay, but notice over here,

it doesn't predict as much rain as actually occurs, and

perhaps here it predicts that there'll be more rain than typically occurs.

This is one of a group of models called parametric models.

And what we're learning is the parameters.

And we can learn these parameters, in fact,

a lot of people wouldn't even consider it learning.

We can just apply linear regression, the well-known algorithm, to these points.

And we can get this m parameter and this b parameter.

So that's learning a model.
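As a rough sketch of that step, with made-up numbers in the spirit of the lecture's made-up chart, learning the m and b parameters by least squares might look like this:

```python
import numpy as np

# Made-up daily observations: x = change in barometric pressure,
# y = amount of rain that day
x = np.array([-1.0, -0.8, -0.5, -0.2, 0.0, 0.3, 0.6, 0.9])
y = np.array([ 9.0,  8.1,  6.0,  4.2, 3.5, 2.0, 1.1, 0.4])

# Linear regression "learns" the two parameters of y = m*x + b
m, b = np.polyfit(x, y, deg=1)

def predict(pressure_change):
    return m * pressure_change + b
```

With this data, m comes out negative, so falling pressure (negative x) yields a higher rain prediction, matching the chart.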

Now we can learn more complex parametric models.

Here's an example of a polynomial, and it fits the data a little bit better,

doesn't it?

This is a second order polynomial and so we have to learn more parameters.

We have to learn this coefficient, that coefficient plus the b.

We have to discover three parameters.

Now we could add even more degrees,

we could learn up to x cubed, x to the fourth, and so on.

Each time, we add one more parameter that we have to learn, and each time

we add a little bit more flexibility in the shape of the curve that we learn.

Okay, those are parametric models.

Now there's another kind of machine learning, the kind that I like and

use a lot, which is called data-driven.

And here is an example of how it might work.

So let's say we have a query.

Oh and by the way before I delve into this,

let me explain a little bit more about how we use this model.

Suppose we have created this model, this polynomial model, and we wanna know, okay,

the pressure has gone down 0.5 millibars, or whatever the correct unit is.

So it's gone down 0.5 millibars, how much do we predict it'll rain?

Well, we just plug that into x here and x here, and

the answer we would get is this value here.

So that's how we query a polynomial model.
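A minimal sketch of fitting and then querying such a second-order model; the 0.5-millibar drop is the lecture's query point, but the observation values are made up for illustration:

```python
import numpy as np

# Made-up (pressure change, rainfall) observations
x = np.array([-1.0, -0.8, -0.5, -0.2, 0.0, 0.3, 0.6, 0.9])
y = np.array([ 8.0,  7.4,  6.0,  4.6, 3.8, 2.9, 2.3, 2.0])

# Learn the three parameters of a second-order polynomial
# (np.polyfit returns the coefficients with the highest power first)
coeffs = np.polyfit(x, y, deg=2)

# Query: the pressure has gone down 0.5 millibars -- how much rain?
prediction = np.polyval(coeffs, -0.5)
```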

So at any point in time, we can look at the barometric pressure change and

ask the model how much it thinks it's gonna rain.

It does pretty well except when we start getting here to significant

negative changes, it'll predict more rain than is likely to actually happen.

Okay, that's a query for a parametric model.

Let's now look at a query for a data-driven model at this same point.

9:05

We find the data points near the query, and then we can do something

as simple as taking the mean of those points' y values, and saying,

that's our forecast.

Now if we look at queries across the whole span of the data.

I've drawn this blue line to represent what the model would end up looking like.

Notice that it fits the data very nicely everywhere.

Now this is a made-up example, but this is the typical kind of thing you would see.
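One simple way to implement that data-driven query is a nearest-neighbor scheme. This is a sketch, assuming made-up observations and k = 3 neighbors (the lecture doesn't specify either):

```python
import numpy as np

def data_driven_forecast(x_train, y_train, x_query, k=3):
    # Find the k stored observations closest to the query...
    nearest = np.argsort(np.abs(x_train - x_query))[:k]
    # ...and return the mean of their y values as the forecast
    return y_train[nearest].mean()

# Made-up (pressure change, rainfall) observations
x_train = np.array([-1.0, -0.7, -0.4, 0.1, 0.4, 0.8])
y_train = np.array([ 9.0,  7.5,  5.8, 3.0, 1.6, 0.5])

forecast = data_driven_forecast(x_train, y_train, x_query=-0.5)
```

Note there's no fitting step at all; the data itself is the model.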

9:49

The model emerges very naturally and easily, and it's driven just by the data.

We don't impose a parametric model on top of it.

So if we compare these approaches, there's pros and cons to each approach.

Let me step through those really quick.

The pros of parametric models are that they don't overfit,

usually anyway, which means that they

tend to create a smooth line that doesn't zig and

zag with each little piece of data, and they generalize fairly well.

They're very, very fast at runtime.

You just have these, you know, three parameters; you do a quick calculation and

boom, you get the prediction.

A problem with them is, of course, non-linear data,

or non-polynomial data; parametric models aren't able to fit those very well.

And sometimes, to some extent, they sort of over simplify.

Now if we look at data-driven models, a pro for

data-driven is that they can model fairly complex data very well.

11:12

For instance, these dots might represent

the most recent few months of barometric pressure and rain data.

And if we wanna drop off the older data and

add new data, it's sort of a trivial thing to do with data driven models.

And you can do it continuously and

you don't have to re-learn, you just use the data directly.
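A sketch of that continuous update; the 90-day window length here is illustrative, not from the lecture:

```python
from collections import deque

# Keep only the most recent 90 (x, y) observations; appending a new
# day automatically drops the oldest one, with no re-training step
window = deque(maxlen=90)

window.append((-0.5, 6.1))   # today's (pressure change, rainfall)
```

Updating the "model" is nothing more than appending to this buffer; queries then run against whatever data is currently in the window.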

11:33

Some of the cons.

At run time, they're definitely slower than parametric models,

because you have to consult all the data and find the nearby data values.

It requires sorting and distance calculations and so on.

So it's slower at runtime.

And data-driven models are sometimes more susceptible

to over-fitting, which means that they model noise in the data,

perhaps more than the actual underlying relationship.

12:15

So far, the input has been one dimension, barometric pressure,

with y being how much rain we get.

We can extend this approach to multiple dimensions fairly easily.

Here again, a made-up example.

We have two components now of our x.

One is the barometric pressure just like before, but the other is humidity.

So again, when it rains I've colored the dots green.

You can kind of imagine that these y values are coming

at you out of the screen; y is the third dimension.

So a red dot means there was no rain, so y is low, and

12:57

a green dot means there was rain, so y is a high value.

And as you can see, again, made-up data, but over in this corner where

the barometric pressure has dropped a lot and it's humid we're likely to see rain.

But even if the barometric pressure goes down and

there's not much humidity, we might not see rain.

So that's the reason for these dots being colored red here.

Now, instead of the query being

just a single x, it's a two-dimensional x.

So we measure the humidity and the pressure change.

13:56

We find the nearby points, and our result is the mean of their y values.
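The same nearest-neighbor sketch extends to the two-dimensional query by using Euclidean distance; as before, the observations and k = 3 are made-up assumptions:

```python
import numpy as np

def forecast_2d(X_train, y_train, x_query, k=3):
    # Euclidean distance from the query to every stored observation
    dist = np.linalg.norm(X_train - x_query, axis=1)
    # Mean y of the k nearest observations is the forecast
    return y_train[np.argsort(dist)[:k]].mean()

# Columns: [pressure change, humidity]; made-up observations,
# y = rainfall that day
X_train = np.array([[-0.9, 0.8], [-0.7, 0.9], [-0.6, 0.3],
                    [ 0.2, 0.7], [ 0.5, 0.2], [ 0.8, 0.4]])
y_train = np.array([8.5, 9.0, 1.0, 2.0, 0.3, 0.1])

# Query: pressure dropped a lot and it's humid
rain_forecast = forecast_2d(X_train, y_train, np.array([-0.8, 0.85]))
```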

Now, with this data-driven approach that I've been showing you (and,

of course, parametric models are driven by data as well),

the idea is that you keep the data around, and

don't reduce it down to a few parameters as you do in parametric models.

14:44

Okay, so we can apply this to the markets as well.

So instead of weather we can think about stocks.

And instead of say, barometric pressure we might use factors of stocks like P/E

ratio, news, and instead of estimating rainfall we might estimate future price.

That's a way that we map from these examples I've been

showing you to a stock market example.

Now this is just a taste of what we can do here.

Here's one example.

Each dot here represents one stock in the S&P 500.

And we've plotted them in a three-dimensional space, so

we went from two dimensions there, barometric pressure and humidity.

Now we're using three dimensions and

we're using technical indicators to place each dot in this space.

And we've colored them according to their future return.

So you can see some patterns here, like up in this corner.

We tend to have a reduction in future price.

Blue is low price.

Whereas, the further we get out this way, the warmer the color and

actually the more it went up in the future.

So we can apply these same principles to this kind of data to

estimate future prices.

16:12

Now, like I said,

this is just sort of a taste of what machine learning is about.

And there's lots of questions to answer.

These are a few of those questions.

Which features should you use?

You can use machine learning to find those features as well, or

you can use human insight.