Learner Reviews & Feedback for Getting and Cleaning Data by Johns Hopkins University

4.5 stars (8,047 ratings)

About the Course

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data....

Top reviews

HS

May 2, 2020

This course provides an introduction to some important concepts and tools for a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.

DH

Feb 1, 2016

Easy, mostly instructive course. The assignments and quizzes are quite good and illustrate the lessons very well.

Watch the videos for the general presentation, but spend your energy on the exercises.

801 - 825 of 1,306 Reviews for Getting and Cleaning Data

By Udom A

Dec 30, 2019

Good

By Pitak P

Oct 4, 2019

Good

By Santi M

Apr 21, 2019

good

By Sangdon C

Oct 11, 2018

good

By Anup K M

Sep 15, 2018

good

By Jay B

Aug 8, 2017

good

By Oleksandr F

Nov 24, 2016

Nice

By Abhinav S

May 29, 2016

Nice

By Tian A

Apr 5, 2016

cool

By James M

Mar 13, 2016

good

By 朱荣荣

Mar 9, 2016

good

By Isura N

Dec 8, 2017

yes

By Amit K R

Nov 21, 2017

ok

By Naveen J

Aug 9, 2019

0

By Ловягина Ю А

May 16, 2019

C

By Anil G

Oct 23, 2017

E

By Jihee Y

Nov 12, 2016

g

By Klipped K

Jan 2, 2017

The last course in this specialization had me second-guessing whether or not I could do this (based on how difficult the quizzes and course project were). I spent tons of time researching and looking at the forums and programming documentation. Going into this course, I was surprised by how much I knew when it came to the quizzes; the time I spent looking at all the resources above decreased, and I actually knew where to begin! The course project was still VERY difficult for a beginning specialization compared to the information given in the lectures (I spent over 16 hours on it and still didn't meet ALL criteria). However, it does challenge you to think and reflect on your knowledge. You probably won't understand 100% of what is in the course (especially the project), but you will LEARN, especially if you are totally NEW to programming like me! My knowledge has increased, and that is what I was hoping for. If you have those same hopes and do not mind not getting 100% on all quizzes and projects, this course is for you. Do be mindful that I personally feel this course requires more than 4-6 hours per week, especially for beginners. I spent more than 24 hours on week 4 alone.

By Marcelo A M

May 2, 2020

Although the course is very good and I recommend it to anyone who wants to pursue a career as a Data Scientist, it has some completely solvable problems. For example, some of the classes show the use of problematic packages that are completely dispensable in 99% of cases. Also, there is no guidance for Linux systems, and for a course on an open-source tool that seems to me a sad omission. That said, the course is very good, and it will teach you how to deal with messy data and different data formats. I would not recommend it for someone who has no previous experience in R.

By Luc R

Dec 18, 2015

I was pleased to be selected as a beta tester, but it turned out to be a little boring for a course I had already passed. The slides are unchanged, so my concentration drops after a while, even with some goodwill. It seems there are new swirl lessons, but they don't seem to be available yet at beta-test time; bad luck. I like the new overall presentation (provided by Coursera?), where we can pick transcripts without leaving the normal course flow. I tend to click the "finger up" icon for each lecture, because I think the general organization of this specific course is quite good, and the contents are consistent and do not overlap needlessly (unlike some other courses of the track, such as "Reproducible research"). Also, the theme seems to fit the track's standard 4-week slot perfectly. I have been busy these days, so I haven't fully reviewed the course yet: I'll try to review as many lectures as possible before the end of the countdown. I'll try to do better for my next beta test, if any. Anyway, thanks a lot for this very interesting track (I plan to go up to the capstone project, certified).

By Edgardo G

Sep 2, 2023

The content of the course is very good, and the explanations are clear and well organized. However, it has the same problem as other Johns Hopkins courses: in some cases it is impossible to find the data files used in the course, because the websites they were downloaded from are no longer available or have changed the structure of the data. It would be good practice to make the data files available in the lessons.

By Miguell M

Jun 30, 2018

This course was pretty useful for learning the various ways to acquire, clean, and manipulate data, which I think is an awesome real-world skill. The course project in week 4 was a good way to exercise some of these skills, but I have some qualms about the delivery of the course project, primarily the instructions. The course project involves getting and cleaning a dataset, but the instructions are rather vague in some key areas, which I believe could lead to great variety in submissions. I'm not sure if the vagueness was intentional (perhaps to mimic real-world scenarios?), but it certainly led to a lot of confusion in interpreting the instructions, a sentiment reflected in the discussion forums. That being said, the course was useful!

By Haonan J

Feb 4, 2018

The content about getting data is too difficult for me, as I'm a student who just completed the R Programming course. It's hard for me to learn data mining from APIs, websites, and Excel in only one week. So I don't recommend this course for a beginner like me.

However, the content on cleaning data is great. The dplyr package is more convenient than what I learned in the last course, and the mentor is still great.

All in all, this is a nice course and it helped me a lot. Thanks a lot to the mentor. Maybe some day, when I have a better foundation in programming, I will review the knowledge and skills in this course again.

By Christian B

Nov 4, 2016

The course content is important. I found the final assignment quite hard and struggled a lot with R on it. Interestingly enough, when I looked at the solutions during the peer reviews, others seemed to have found far easier solutions than I had. I am not sure why. I got the same result, but my code looks far more complicated. Also, the description of the final assignment was a bit unclear. For example, were we supposed to rename the features or not? Were we supposed to calculate the mean per activity, per subject, or per activity-subject combination? Were we supposed to select the mean() only or also the FreqMean()? Etc.

By Cristobal M

Jul 5, 2022

Good course. English is very necessary, since the lectures fall short on content and you need to check forums and external sites to solve the quizzes and projects.

The course and some of its data sources need updating. Also, I think some topics are not well explained; I am thinking of web scraping, where I ultimately had to look for outside information to understand it at the level the tests required.

All that aside, it is a good course. I think I learned to load most of the more common data types, and to handle very useful tools for data manipulation.