Case Study - Predicting Housing Prices
In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.
In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data -- such as outliers -- on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets.
Learning Outcomes: By the end of this course, you will be able to:
-Describe the input and output of a regression model.
-Compare and contrast bias and variance when modeling data.
-Estimate model parameters using optimization algorithms.
-Tune parameters with cross validation.
-Analyze the performance of the model.
-Describe the notion of sparsity and how LASSO leads to sparse solutions.
-Deploy methods to select between models.
-Exploit the model to form predictions.
-Build a regression model to predict prices using a housing dataset.
-Implement these techniques in Python....
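
As a rough illustration of the sparsity outcome listed above, the following NumPy sketch fits LASSO by cyclic coordinate descent on synthetic housing data. It is not taken from the course notebooks: the function names, the synthetic price formula, and the penalty values are all assumptions made for illustration. It also assumes the objective is the residual sum of squares plus l1_penalty times the L1 norm of the weights, with every feature column scaled to unit length and an unpenalized intercept in column 0.

```python
import numpy as np


def soft_threshold(rho, half_penalty):
    """Shrink rho toward zero by half_penalty; return exactly 0 inside the band."""
    if rho > half_penalty:
        return rho - half_penalty
    if rho < -half_penalty:
        return rho + half_penalty
    return 0.0


def lasso_coordinate_descent(X, y, l1_penalty, n_sweeps=200):
    """Minimise ||y - Xw||^2 + l1_penalty * ||w||_1 by cyclic coordinate descent.

    Assumes every column of X has unit 2-norm. Column 0 is treated as the
    intercept and is not penalised (hypothetical convention for this sketch).
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            # Residual computed as if feature j were removed from the model.
            residual_without_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual_without_j
            w[j] = rho if j == 0 else soft_threshold(rho, l1_penalty / 2.0)
    return w


# Synthetic data (assumption): price depends on square footage and bedrooms,
# while noise_feature is irrelevant by construction.
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(500, 4000, n)
bedrooms = rng.integers(1, 6, n).astype(float)
noise_feature = rng.normal(size=n)
price = 50_000 + 200 * sqft + 10_000 * bedrooms + rng.normal(scale=20_000, size=n)

# Scale each column to unit length, as the update rule above requires.
features = np.column_stack([np.ones(n), sqft, bedrooms, noise_feature])
norms = np.linalg.norm(features, axis=0)
X = features / norms

for l1_penalty in (0.0, 1e5, 1e6, 1e7):
    w = lasso_coordinate_descent(X, price, l1_penalty)
    # Dividing by norms expresses the weights in the original feature units.
    print(f"l1_penalty={l1_penalty:.0e}  "
          f"nonzero weights={np.count_nonzero(w)}  "
          f"weights={np.round(w / norms, 2)}")
```

Larger penalties widen the band in which soft_threshold returns exactly zero, which is why weights drop out of the model entirely instead of merely shrinking.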

PD

Mar 16, 2016

I really enjoyed all the concepts and implementations I did along this course....except during the Lasso module. I found this module harder than the others but very interesting as well. Great course!

KM

May 4, 2020

Excellent professor. Fundamentals and math are provided as well. Very good notebooks for the assignments...it’s just that turicreate library that caused some issues, however the course deserves a 5/5

By Ernie M

•Sep 25, 2017

I enrolled in this specialization to learn machine learning using GraphLab Create. Halfway through the specialization the creators sold Turi, GraphLab's parent company, making it unavailable to the general public (not even by paying), and all that knowledge was devalued. I wish I had known this, as I would have enrolled in a different specialization. The creators still give you the possibility of using numpy, scikit learn and pandas, but I had already done a lot with GraphLab Create. The time I invested in my nights after work became a waste. I was trying to convince the company I worked for to buy licenses for GraphLab Create.

Coursera should not allow folks to create courses that promote privately licensed software, because people waste their time and money if the owners decide to privatize the software.

Don't take this course, and if you take it then only use GraphLab create when the authors give you no other option.

Teaching style: Carlos was good, Emily is not very clear and loses focus of the topics and often rambles. She seems very knowledgeable but she lacks clarity of exposition when compared to Carlos or Andrew Ng.

By Kelsey H

•Mar 1, 2020

Be aware that this course is from 2015. The videos are a good foundation, but they are old. The homework assignments use a proprietary python library (graphlabcreate/Turicreate) that is not useful outside this course. The more recent TuriCreate library only works on Mac. A Windows user needs to use older software. There is also very little activity on the forum - I see people asking for help, but no one replies.

By Rohan G L

•Aug 29, 2020

I leave 2 stars as I learned a lot of new information and methods, and the theory and math behind them.

You will learn about Data Science and Machine Learning, but not much about Python.

The course is pretty much abandoned and outdated. Sframes and Turicreate packages (instructor's creations) are used instead of more universal packages. Installation in the beginning took some time and research. Many of the assignments have errors and bugs in the code that have not been updated. Forum assistance is abysmal for clarification or deeper questions. Many links are dead.

There are many times in the lectures where the instructors are writing several sentences in their handwriting on their notes instead of having the text ready to appear.

I would suggest using this course and series as a supplement to other information one has learned, not as an introduction for initial understanding. I found myself frustrated too many times.

By Pantelis H

•Apr 7, 2016

This is an excellent course. The presentation is clear, the graphs are very informative, the homework is well-structured and it does not beat around the bush with unnecessary theoretical tangents.

By Chase M

•Jan 26, 2016

I really like the top-down approach of this specialization. The iPython code assignments are very well structured. They are presented in a step-by-step manner while still being challenging and fun!

By Ferenc F P

•Jan 10, 2018

This is a very good introductory class to regression. Even though I had already taken other classes in regression, like Statistical Inference or Machine Learning from Stanford, this course gave me a much better understanding of the variance and bias of a model, as well as how the true error and test error are related. For some quizzes the result is different with scikit-learn than with Graphlab, while the quiz is prepared for Graphlab results. What really helps is the notebooks provided for each programming assignment, so basically one needs to write only a few lines of code when using Graphlab in order to pass the quiz. I spent much more time writing programs from scratch with scikit-learn (due to the different results I gave it up in the last 3 weeks and used only the notebook with Graphlab). Learning to use Graphlab is not so difficult, so I had no problem with that.

By Gowtham A B

•Oct 8, 2020

Very good course to understand the regression concepts like simple regression, multiple regression, lasso, ridge, kNN and kernel regressions. On top of that the course explained about the gradient descent and coordinate descent algorithm really well. The course is designed very well maintaining the continuity. The lecturer's pace and the explanations are very good and easy to follow. I recommend this course to anyone who wants to start learning regression.

By Leonardo D

•Oct 28, 2018

Excellent course, the professors made it very easy to learn quite powerful techniques like gradient descent and coordinate descent. I always saw them as black boxes, but now, thanks to this course, I not only understand how they really work, but I have also learned how to apply them to real data. This course was simply awesome.

By Jafed E G

•Jul 6, 2019

I enjoy the lectures. The professor has a good speaking and teaching style which keeps me interested. Lots of concrete math examples which make it easier to understand. Very good slides which are well formulated and easy to understand

By Konduri V

•Dec 25, 2018

I really enjoyed learning throughout this course. I did struggle a little bit with Python, but now I am a bit more confident to take on advanced programming in Python.

Thank you very much for offering this course.

By Pau D

•Mar 17, 2016

I really enjoyed all the concepts and implementations I did along this course....except during the Lasso module. I found this module harder than the others but very interesting as well. Great course!

By Jeyanthi T

•Aug 11, 2018

Very informative and technical course... A lot of the mathematical derivations were too long, but they were very patiently explained.

By Hiral P

•Oct 9, 2018

I loved this course because of the detailed understanding of the concepts. I was looking for a course which provides a detailed understanding of algorithms, and here I am. I am giving four stars for what has been given in detail, not five because something is left ;) interpretation..

By Gabriele P

•Apr 16, 2019

The program is well structured, the lessons are interesting and the hands-on work is nice. However, the instructors should really consider updating their material to python 3 + turicreate. Python 2 is reaching EOL in 2020 and should be avoided for teaching/training. I did most of my notebooks with python 3 and turicreate; it is really worth the effort to update the material. The tests are ok, but some looked somewhat buggy (as reported in the forum by many users) and could use a revision.

By Robert K

•Aug 14, 2020

Decent course with some good challenges. I would have rated it higher if it was tailored to more widely used packages (e.g. scikit learn): even though there was an option to submit using other packages, I would have preferred it if these were in the primary jupyter-notebooks.

By Rajib D

•Sep 5, 2019

I think the instructor sometimes jumps to a concept without explaining why.

By Prasad B D

•Jan 15, 2016

To start with, I have been dedicating time to improve my understanding and depth in statistics and calculus. The reason is that I am totally impressed by the lecture videos. The videos are detailed and precise, and cover enough depth. They have helped me to correlate the statistical concepts and application areas and rekindled my interest in going back to learn and strengthen my basics.

I like that the content is not an overdose of mathematical equations. I appreciate the pains taken by the expert professors in explaining every detail possible to make the course more interesting.

Appreciate the mention about the topics not covered in the course.

My suggestion is to include a separate section for each course with a list of reference books, topics and weblinks of material to increase our depth and breadth of technical understanding of the statistical methods.

Many thanks to Emily Fox and Carlos Guestrin for their time and efforts in making this specialization course, and of course providing us with the free license to use Dato for learning purposes.

By Theo L

•Jan 5, 2016

This course was well structured and well executed. I thoroughly enjoyed and was challenged by the material in the course. I appreciated the assignment/quiz approach to deal with such dense topics. I can see where people who have backgrounds in a number of the topics discussed throughout the course could feel there was too much hand holding, but I found the level of hints/help in the assignments was right for me to work through and gain a deeper understanding of the material presented.

My one criticism of the course stems from the denseness of the material. I believe there is an opportunity to introduce more quizzes after various sections within each module. It would be best to make these quizzes optional in order not to turn off more advanced students, but I believe it would be beneficial for those students who do not have much, or any, experience in these topics to have more opportunities to test and gain a deeper understanding of the material just covered.

Overall, solid course!

By Patrick M

•Feb 1, 2016

A great course that will take you way past what you may remember of linear regression from high school or college days. This course is part math, part algorithms and part application (in Python). I loved it. The instructors are good and the material is generally well presented (I took the course the first time through, so there seemed to be a few gaps / rough edges.)

This course may be intimidating if you don't like mathematical notation, or if you have never used Python before. It may also be challenging if your high school / college freshman calculus is rusty. The concepts aren't super hard (basic statistics, integration, differentiation, matrix math but with multi-variate twists), but you will need to think carefully through some lessons to appreciate them.

The online tests are good - and the instructions for each week's problems are detailed. There is enough guidance to clearly show what needs to be done, but enough gaps to bridge that you're made to think about the problem at hand.

By Carlos D M

•Jan 18, 2016

The topics are presented in a meaningful and understandable way. With enough detail, clarity, and fun. The instructors are super sweet and their dynamics in front of the camera are very inspiring.

The assignments are amazingly well designed. I get to practice the theory I learn from the lectures which truly reinforces what we review.

Even though I don't use the alternative tools (like Pandas), I appreciate that the organizers of the class prepare files and data sets for people who use those tools.

Another thing that's really valuable to me is the fact that assignments (data, instructions, Jupyter, etc.) can be worked on completely offline and only need Internet connectivity to post results. Because all we do is enter numbers and select a few options, I have successfully submitted my assignments 10 minutes away from boarding a plane. I have had the chance to work while riding in a car (not driving it, LOL) or in an airplane. Because I have a full-time job, this is a HUGE advantage.

By Havan A

•Mar 13, 2016

This is an amazing and brilliant course for machine learning. If you've done Andrew Ng's course, most of this material will feel familiar, but definitely has a lot more detail. Each sub-topic under regression is taken with a decent level of detail, with sufficient quiz and assignment questions to drill important concepts into your head. The lectures are lucid and concise, even the optional ones that cover more advanced concepts of the underlying math.

As an aside, I would like to clarify to any reader that, when they say you can use other tools, they aren't being 100% honest. After a few assignments of using Scala and R, I quickly realized that using their iPython notebooks is the simplest and most straightforward way of clearing this course. Eventually, the assignments are such that using any other tool can cause a lot of strife.

Brilliant course. Looking forward to the next one.

By Phil B

•Jan 29, 2018

This was the deep dive into regression that I was looking for, learning how and why to implement the various different algorithms that are used without being tied to a specific software package. Some of the other reviews complain about the use of graphlab but really it has no impact on the value of the course, because you can literally write the functions from scratch yourself using standard python and Numpy. The use of graphlab is just to speed things up in some of the programming assignments. One or 2 of the quizzes had some incorrect values in the notebooks but a quick search of the forums showed the correct ones and the ability to reattempt the quizzes means it's not a big issue. Emily is an excellent lecturer and the constant use of graphical aids and annotations makes it very easy to follow even with some of the fairly advanced maths.

By David M

•Sep 8, 2017

I enjoyed this course. I took Ng's original ML coursera course, and it was good, but this one was much more involved and helped me better understand essential concepts in machine learning and data science. I feel confident that I can apply the skills I have learned in this course to future applications. While the lecturer sometimes repeated herself, she did well to explain some of the more difficult concepts. I would recommend this to anyone who wants a better grasp of statistics and regression analysis. The only thing I found lacking was that there was no exploration of forecasting, extrapolation, or otherwise making predictions beyond the boundaries of the training data. I feel like this is an important skill, and believe it could have been included among what was covered here.

By Sean S

•Feb 18, 2018

I really enjoyed this course. Emily is an excellent instructor and the material was well planned and straightforward to follow. The programming assignments were useful and I got a lot out of implementing the algorithms from (near) scratch. I would have liked to see SVR and ensemble methods as part of this class but I understand they will be covered in another course. I used graphlab for all of the assignments but I also used numpy and pandas when I couldn't find the functions I was looking for in graphlab. I was not a fan of the coursera hosted notebooks with graphlab for the first course but running it off my own machine was a different experience and I could definitely be sold on a single solution in place of numpy, pandas, and scikit learn.

By Craig B

•Nov 29, 2016

A well thought out and nicely paced introduction to Regression following on from the equally good foundation course. I particularly like the way that the assignments assume an improving knowledge and familiarity with Python as the course progresses. It will be interesting to see if the subsequent courses in the specialisation continue in this vein - I hope so. I note the concerns that some have expressed about the use of graphlab.create for examples and assignments, but tend to think there is benefit from gaining familiarity with a number of different data science ML tools and libraries. Also additional code and instructions are available for those determined to use other tools such as Pandas and Scikit Learn.