How do robots determine their state in real time and extract information about their surroundings from noisy sensor measurements? In this module, you will learn how robots incorporate uncertainty into their estimates and learn from a dynamic, changing world. Special topics include probabilistic generative models and Bayesian filtering for localization and mapping.

A course from the University of Pennsylvania

Robotics: Estimation and Learning

254 ratings

From the lesson

Gaussian Model Learning

We will learn about the Gaussian distribution for parametric modeling in robotics. The Gaussian distribution is the most widely used continuous distribution and provides a useful way to estimate uncertainty and predict in the world. We will start by discussing the one-dimensional Gaussian distribution, and then move on to the multivariate Gaussian distribution. Finally, we will extend the concept to models that use Mixtures of Gaussians.

- Daniel Lee, Professor of Electrical and Systems Engineering

School of Engineering and Applied Science

In this lecture, we are going to learn how we can compute an estimate of

the Gaussian model parameters from observed data.

Remember that a Gaussian model has two parameters, mean and variance.

We are going to use the term likelihood often throughout the course.

Let's talk about its definition.

Likelihood is the probability of

an observation given the model parameters.

The subscript i indicates one particular observation

xi among multiple observations of x.
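The slide equation is not reproduced in this transcript, but for a one-dimensional Gaussian the likelihood of a single observation takes the standard form:

```latex
p(x_i \mid \mu, \sigma)
  = \mathcal{N}(x_i;\, \mu, \sigma^2)
  = \frac{1}{\sqrt{2\pi}\,\sigma}
    \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
```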

The important thing I want to point out is that we have the data;

what is to be determined are the parameters.

In our case, we are using a Gaussian model.

Thus, the parameters are mu and sigma.

We are interested in obtaining the parameters of our model

that maximize the likelihood of a given set of observations.

If we express what I just said mathematically, we can write it like this.

We are using our hat accent to indicate an estimate for mu and sigma.

The likelihood function we are going to maximize is the joint

probability of all the data, which can be intractable if each

instance of x depends on the other observations.

However, if we assume that each observation

xi is independent of the others,

the joint likelihood can be simply expressed

as a product of the individual likelihoods.
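Written out, the independence assumption turns the estimation problem into the standard maximum likelihood form:

```latex
\hat{\mu}, \hat{\sigma}
  = \arg\max_{\mu, \sigma} \; p(x_1, \ldots, x_N \mid \mu, \sigma)
  = \arg\max_{\mu, \sigma} \prod_{i=1}^{N} p(x_i \mid \mu, \sigma)
```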

If the concept of independence does not sound familiar, please review

the supplementary material on basic probability, which you can download.

With this notation,

we now look to compute the maximum likelihood estimate of mu and sigma.

Fortunately, there is an analytic solution for Gaussians,

which is another reason to choose the Gaussian distribution.

The full derivation of the solution can be found

in the supplementary material, but here are a few points.

For the estimate, we are going to apply properties of the logarithmic function.

Let me visualize a log function for you.

It is a monotonically increasing function.

So if a function attains its maximum at some value, x star,

then the log of the function also attains its maximum at x star.

Using this property, instead of maximizing the likelihood,

we can try to find the parameters that maximize the log likelihood.

The objective functions are different,

but the arguments that maximize them are the same.

It will turn out that maximizing log likelihood is simpler in many cases.

Another property of the log function we are going to use is

the log of the products equals the sum of the logs and

we can write the problem as finding the mu and sigma that maximize

the sum of all the log likelihoods of the individual measurements.
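In symbols, using log(ab) = log a + log b, the problem described above becomes:

```latex
\hat{\mu}, \hat{\sigma}
  = \arg\max_{\mu, \sigma} \sum_{i=1}^{N} \log p(x_i \mid \mu, \sigma)
```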

The next thing I want to remind you is that we are dealing with a Gaussian,

which has this specific form.

Using the properties of the logarithmic function,

we can write the log likelihood exactly like this.
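Substituting the Gaussian density from before, each log-likelihood term expands to:

```latex
\log p(x_i \mid \mu, \sigma)
  = -\log\sqrt{2\pi} \;-\; \log\sigma \;-\; \frac{(x_i - \mu)^2}{2\sigma^2}
```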

With the expanded notation of the log likelihood,

we may rewrite the problem like this.

We're going to further simplify this.

We can ignore the constant term, log of the square root of 2 pi,

in this case, because it does not vary with the parameters.

So, it does not affect the solution.

And we can change the formula into a minimization problem by

changing max to min and taking the negatives of all the terms.
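After dropping the constant term and negating, the standard minimization form of the problem is:

```latex
\hat{\mu}, \hat{\sigma}
  = \arg\min_{\mu, \sigma} \sum_{i=1}^{N}
    \left[ \log\sigma + \frac{(x_i - \mu)^2}{2\sigma^2} \right]
```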

Why switch to minimization?

Well, the two problems are equivalent, but

writing an optimization problem as a minimization is the standard form, and

we are following that convention to be consistent in notation.

Let's call this whole part J.

This is a common symbol to represent a cost function that we want to minimize.

If we apply the optimality condition for convex optimization,

the first order derivative of J with respect to mu should be zero.

From this, we can compute the maximum likelihood estimate of mu and

we are going to write it as mu hat.

Once again,

we apply the same optimality condition to compute the estimate of sigma.

For this, we can use the value of mu hat in place of mu as a parameter.
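Setting the two partial derivatives of J to zero gives the closed-form estimates described next:

```latex
\frac{\partial J}{\partial \mu} = 0
  \;\Rightarrow\; \hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\frac{\partial J}{\partial \sigma} = 0
  \;\Rightarrow\; \hat{\sigma}^{2} = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \hat{\mu}\right)^{2}
```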

The final solution we get for computation is relatively simple.

Mu hat is exactly the sample mean, the average of the data,

which is a natural estimate.

Also, sigma hat squared is just the sample variance.

You'll find the derivation of the solution in the supplementary materials.

Try to understand the principles behind what we have obtained here.
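To make the result concrete, here is a small sketch in plain Python; the function name `gaussian_mle` and the sample data are illustrative, not part of the course materials:

```python
import math
import random

def gaussian_mle(xs):
    """Maximum likelihood estimates of a 1-D Gaussian's parameters.

    mu_hat is the sample mean; sigma2_hat is the sample variance
    normalized by N (the MLE), not by N - 1.
    """
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat

# Draw samples from a known Gaussian and recover its parameters.
random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(100_000)]
mu_hat, sigma2_hat = gaussian_mle(data)
print(mu_hat, math.sqrt(sigma2_hat))  # should be close to 5.0 and 2.0
```

With many samples, the estimates converge to the true mean and standard deviation used to generate the data.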

Now that we have seen how to compute the MLE of the parameters,

let's get back to our ball color distribution.

Using our results, the maximum likelihood estimate

of the ball color model is computed like this.

Based on what we've learned so far, we'll start talking about using Gaussian

models for more than one feature in our next lecture.