In the previous slides,

I drew the mathematical model in a specific form.

The model consists of many layers arranged one after the other.

The input passes through the first layer,

and then the second,

and then the third, et cetera,

with each of the layers themselves being a simple mathematical function.

So, the entire model consists of a function,

of a function, of a function, you get the idea.
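
To make the function-of-a-function idea concrete, here is a minimal sketch in Python. The names `make_layer` and `compose`, the weights, and the choice of clipping negatives to zero are all made up for illustration; they are not taken from the slides, and real layers work on vectors rather than single numbers.

```python
from functools import reduce


def make_layer(weight, bias):
    """Return a simple layer: scale, shift, then clip negatives to zero."""
    def layer(x):
        return max(0.0, weight * x + bias)
    return layer


def compose(layers):
    """Chain layers so the output of one becomes the input of the next."""
    def model(x):
        return reduce(lambda value, layer: layer(value), layers, x)
    return model


# Three illustrative layers (the weights and biases are arbitrary).
layers = [make_layer(2.0, 1.0), make_layer(0.5, -1.0), make_layer(3.0, 0.0)]
model = compose(layers)

# The same computation written out as a function of a function of a function.
nested = layers[2](layers[1](layers[0](4.0)))
```

Calling `model(4.0)` gives the same value as `nested`, because the model is nothing more than the layers applied in order.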

The diagram depicts a mathematical model called a neural network.

There are other common mathematical models used in machine learning,

linear methods, decision trees,

radial basis functions, ensembles of trees,

radial basis functions, followed by linear methods, the list goes on.

But we're talking about neural networks.

Traditionally, neural network models didn't have this many layers.

Neural networks date back to the 1970s,

but they used to have only one hidden layer.

The first reason was computational power:

training deep neural networks, that is,

neural networks with lots of layers, takes a lot of computing power.

The second reason they had only one hidden layer was the availability of data.

As you add more layers,

there are more and more weights to adjust,

so you need lots more data.
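
As a rough illustration of how the number of weights grows with depth, the sketch below counts the parameters of a fully connected network. The function name `count_parameters` and the layer sizes are assumptions chosen for the example, and it assumes every layer has one weight per input-output pair plus one bias per output.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per output unit
    return total


# One hidden layer versus three hidden layers of the same width.
shallow = count_parameters([10, 32, 1])
deep = count_parameters([10, 32, 32, 32, 1])
```

With these sizes the deeper network has several times as many adjustable values as the shallow one, and each of them has to be learned from data.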

And the third reason they had only one hidden layer was the lack of computational tricks.

It turns out that if you just add layers,

you will run into some issues,

the neural networks will take a long time to train,

some of the layers will become all zero or they'll blow up,

and become all NaN, or not a number.

So, the research community had to develop a number of tricks and

techniques to get deep neural networks to work.
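
Here is a toy sketch of why simply stacking layers can make values vanish or blow up. It is a deliberate oversimplification: each "layer" just multiplies a single number by the same weight, and the function name `push_through` and the depth of 2000 are assumptions made for the example. Real networks use matrices and nonlinearities, but the same repeated-multiplication effect is at work.

```python
import math


def push_through(depth, weight, x=1.0):
    """Pass x through `depth` layers that each multiply by `weight`."""
    for _ in range(depth):
        x = weight * x
    return x


# Weights slightly too small: the value shrinks until it underflows to zero.
shrunk = push_through(depth=2000, weight=0.5)

# Weights slightly too large: the value grows until it overflows to infinity.
blown_up = push_through(depth=2000, weight=2.0)

# Arithmetic on infinities then produces NaN, "not a number".
not_a_number = blown_up - blown_up
```

Once a NaN appears, every later calculation that touches it is also NaN, which is why the whole layer ends up unusable.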

So, in the last few years neural networks have proven themselves to

be the best or near best in a wide variety of tasks,

even tasks that used to be thought to be unsolvable with machine learning.

Neural networks have enabled dramatic improvements in

really hard problems like language translation,

image classification, speech understanding, et cetera.

And they work just as well or better on structured data problems

than traditional machine learning methods such as

support vector machines or boosted or bagged decision trees,

and you can see this at Google.

The use of deep learning at Google has accelerated rapidly.

We had pretty much no deep learning models four years ago,

and now we have more than 4,000 deep learning models within Google.

So, in this specialization,

we will use neural networks almost exclusively.

We will start off with structured data problems,

and once we know how to build an end-to-end pipeline,

we will take that knowledge,

and show you how to do image problems,

and sequence problems, and recommendation systems.

But look again at this graph, 4,000-plus models.

How can there be so many ML models?

Well, ML is part of pretty much every Google product out there,

whether it's YouTube or Play or Chrome or Gmail or Hangouts,

they all use machine learning.

It's not that there is just one ML model at YouTube.

There are dozens of ML models per product.

In my experience, this is something that takes some getting used to.

You might look at a business problem, say,

how to forecast whether an item will go out of stock

and think of it as a single Machine Learning model that you have to build.

But in practice, to forecast whether an item will go out of stock,

you will have to build many Machine Learning models to solve the problem.

You may have to break this problem down into

smaller problems based on your knowledge of the business.

For example, your first model might be to predict

the demand for the product at the store location,

and your second model might predict the inventory of

this item at your supplier's warehouse and at nearby stores.

You might need a third model to predict how long it's

going to take them to stock your product,

and use this to predict which supplier you will ask to refill the shelf, and when.
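
A minimal sketch of that decomposition might look like the following. Everything in it is hypothetical: the function names, the inputs, and the simple formulas stand in for trained models and are not how any real system computes these values. The point is only the shape of the solution, several small predictions combined into one business answer.

```python
def predict_demand(units_sold_last_week):
    """Hypothetical model 1: expected units sold per day at this store."""
    return units_sold_last_week / 7.0


def predict_restock_days(supplier_distance_km):
    """Hypothetical model 2: days until the supplier can refill the shelf."""
    return 1.0 + supplier_distance_km / 500.0


def will_go_out_of_stock(units_on_shelf, units_sold_last_week,
                         supplier_distance_km):
    """Combine both predictions: does demand outrun the shelf before restock?"""
    daily_demand = predict_demand(units_sold_last_week)
    restock_days = predict_restock_days(supplier_distance_km)
    return daily_demand * restock_days > units_on_shelf


# Same store and supplier, different amounts left on the shelf.
low_stock = will_go_out_of_stock(10, 70, 500)
high_stock = will_go_out_of_stock(50, 70, 500)
```

In a real pipeline each of these stand-in functions would be its own trained model, with its own data and its own evaluation.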

And of course, all these models themselves might be more complex.

The model to predict the demand for milk is going to be very

different from the model to predict the demand for dry noodles.

And the model for restocking electronics is very

different from the model for restocking furniture.

There is not one ML model.

There are dozens of ML models per product.

This being a teaching course,

we will show you how to train, deploy,

and predict with a single model.

In practice though, you'll be building

many machine learning models to solve the use case.

Avoid the trap of thinking in terms of building

monolithic, one-model-solves-the-whole-problem solutions.