Recommended for all 21st-century students who might be interested in working with data in the future; anyone whose work involves making predictions systematically should have a good knowledge of this course.
Issues at every stage of constructing a machine learning model, as well as issues with several different machine learning methods, are explained in fine yet very understandable detail.
By Fernando L B d M
By Fábio A C
By Artem A
By Peter T
By Johan J
By Jeff L J D
By Pedro M
By Olawunmi G
By Sabeur M
By Md. R Q S
By boulealam c
By Tanmay S
By Sai P G
By Khairul I K
By Yi-Yang L
By Larry G
By Kidpea L
By Amit K R
By Reinhard S
By Rudolf N
Thank you for inviting me to be a beta tester for Practical Machine Learning. I completed this course at the beginning of October of this year. When I was asked to be a "beta tester" I thought that I would be presented with new materials. However, the only thing that has changed is the look and layout of the Coursera web pages. The video lectures, quizzes, and assignment are the same as they have been for quite some time. Here are some specific comments:
1. The video lectures: To me, these are clear and easy to follow. However, like those in the other courses in the Data Science Specialization, this course covers a wide range of subjects but tends not to have much depth. When I compare this and other courses in the specialization to other MOOCs that I have taken, including Machine Learning with Andrew Ng and the Stanford Online EdX Course Statistical Learning with Trevor Hastie and Rob Tibshirani, the somewhat cursory treatment of the topics in the Data Science Specialization becomes more noticeable. Perhaps in the interest of "truth in advertising" this course should be called "A Brief Introduction to Practical Machine Learning." In the interest of full disclosure, I should note that I have an undergraduate degree in economics and an MS and PhD in psychology with a quantitative bent. I have had lots of statistics courses, especially those related to ANOVA, MANOVA, nonparametric statistics, correlation and regression methods, and structural equation modeling. The latter is important in psychology because researchers in this field like to measure latent variables. I had been an analyst using SPSS for several decades and the courses in this specialization helped me to migrate to R. Also, there have been many new developments that have become more accessible through R packages (like the fancier tree methods) that were not available when I completed my PhD. Thus these courses (and others such as the ones by Ng and Hastie and Tibshirani) have helped me to keep abreast of these developments. So they are good for me, but I wonder to what degree the courses in the Data Science Specialization actually make a person a "data scientist."
2. The quizzes: I think these items are good practice and are at a reasonable level of difficulty. However, these items are the same ones that you have been giving for quite some time, with perhaps a few new ones added. A little googling will lead you to the answers to these quizzes posted online. I recommend that you put a little time and effort into writing all new items.
3. The final project: Again, this project is good practice and seems to be at a reasonable level of difficulty. And again, this is the same project that appears to have been given at the end of numerous iterations of this course. And again, numerous write-ups for this project can be found online. And again, I would recommend that you put a little time and effort into finding a new data set for people to analyze. This would help minimize some of the rampant cheating that I found in this and in other classes in the specialization.
On the subject of cheating, when I was doing the peer grading for the courses in the Specialization, I would enter the code of the students that I was grading into the Google search box, and all too often I found links to submissions for the project by students who had taken earlier sessions of the class. That is, students were copying these earlier submissions by other students and submitting them as their own. And I don't mean that they were similar: students were copying other people's work line by line, character by character. I found that to be quite irritating and I always reported it to Coursera. Of course, if the instructors would change their assignments once in a while, then this sort of copying would be impossible. As it is, it appears that the good professors put a lot of time and effort into creating what is indeed a worthwhile set of classes. However, after they created the classes, they seem to have pushed the "autopilot" button and gone off to do their day jobs. I would suggest that re-engaging with these courses and reading some of the comments that other students have made would be helpful.
Overall, I appreciate the courses in the Data Science Specialization and specifically this course. I know that these class materials took considerable time and effort to create. I wish the instructors continued success with these classes.
By Robert O
The course subject matter was great, but as with courses 6 and 7, I found the lectures didn't reiterate or reinforce key takeaways that are easily confused. For example, is cross validation when you split the data into training and testing sets and keep a separate set with unknown results to test the final trained model on? Or does it require making folds and then breaking each of those up into training and testing chunks? And why is it not okay to use a model training function that does cross validation internally, as the randomForest documentation suggests? Also, what does the prediction accuracy imply in contrast to the model's OOB [in] sample error estimate, and is that estimate akin to 1 - prediction accuracy on the test data set, i.e. the out-of-sample error estimate? It seems little coverage was given to whether there are well-known training models to use, or whether you literally need to try and compare the half dozen or so common ones every time to find out which one to use for a given dataset. I was also left confused about the overlapping use of the words classification, model, and training, i.e. are they synonyms for the machine-learning-based functions we use to try to fit models to data?
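Editor's sketch, not part of the review: one common way to separate the two ideas the reviewer contrasts is to hold out a test set once, and then run k-fold cross validation only inside the remaining training rows. The function names `holdout_split` and `k_fold_splits`, the 20-row example, and the 25% test fraction are all illustrative assumptions, and this is not presented as the course's own procedure.

```python
import random


def holdout_split(n_rows, test_fraction=0.25, seed=0):
    """Single hold-out split: shuffle row indices once and set a test set aside."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    # The test rows are only touched once, to estimate out-of-sample error.
    return indices[n_test:], indices[:n_test]


def k_fold_splits(indices, k=5):
    """k-fold cross validation: every fold is the validation set exactly once."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        # Training rows for this round are all rows outside the current fold.
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        splits.append((training, validation))
    return splits


# Hypothetical data set of 20 rows, identified only by row index.
train_rows, test_rows = holdout_split(20)
cv_splits = k_fold_splits(train_rows, k=5)

for round_number, (training, validation) in enumerate(cv_splits, start=1):
    print(round_number, len(training), len(validation))
```

Under this reading, the folds are used to compare or tune models, while the held-out test rows give the final error estimate. Whether that matches the lectures' intended definitions is an assumption.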
By Jason M C
Of all the JHU Data Science specialization courses I've had, this was by far the most enjoyable. I really liked how the class was more in the style of 'here's some techniques, now do whatever you want on the project.' Prior courses are, and understandably so, more constrained in the assignments. It's not until here that the student really has the tools to be able to flex their analytical muscles, and it pays off.
Also, of the three instructors, I most favor Jeff Leek, who teaches this class. He communicates much more clearly than Roger Peng or Brian Caffo. I find I learn more from his content than from the others'.
Lastly, I will say that this class doesn't hold a candle to University of Washington's Machine Learning specialization. That's expected, since this is one class and that's a whole series of classes. If you're hungry for more after this one, I highly recommend UWash's Machine Learning specialization.
By Sheila B
I've been working my way through the whole track, and this was by far the most complex material--but it was easy to understand because the videos were so clear.
I do have one bone to pick, though: the quiz material relies on very old packages. Again and again I had to finagle something so I could answer a quiz question. That makes you guys look like you are lazily sitting back collecting money but not really doing your job as far as teaching goes. It's time for an update. How hard is it to run your quizzes on updated packages and offer answers that are current?
Aside from that, I find that you explain material very clearly and you are my first choice for picking up a new data science skill.
By Andrew K
So why four stars instead of five, of all the Data Science Certification courses that I have taken? i) Some of the examples and quiz challenges don't work as they should. ii) Machine learning is a rapidly changing area; the course should be updated to reflect this, perhaps with a high-level taste of deep learning. iii) Posting the final project is overly complicated relative to the methods of the other courses, and this should be cleaned up. It is still not clear how to point to a GitHub repo link and also have a rendered HTML page working from that same link; it requires two links to present the materials, and you must use default names like index rather than a project name.
By Terry L J
Lots of good material; however, across all of these courses, it would be very helpful if they were better organized for learning.
Overview of learning objectives in a step sequence for a more organized approach to learning (maybe even a process roadmap sequencing the activities to follow, which you can refer back to).
Detailed information for each step in the learning process, mapped back to the roadmap.
A summary of the learning objectives in the roadmap sequence.
Basically, just like writing a paper: overview/objectives > main topics > subtopics, etc. > summary.