Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but also the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.
Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.
I like the breadth of coverage of this class. Each of the exercises is a gem in that I get to learn something new. I would highly recommend this even to experienced practitioners.
Good! I like the final (optional) project on running on a large dataset through EC2. The lectures aren't as polished and compact as they could be, but it is certainly a very valuable course.
The assignments are pretty tough, especially when there are mistakes in the given descriptions, but I did learn the basic concepts of relational algebra and MapReduce from them.
You definitely need some background in R or Python, and the lectures are a bit old. They seem to be from around 2013, when this first came out, but most of the info is still relevant.
Comprehensive and clear explanation of the theory and of how up-to-date tools, languages, and trends interlink. Kudos and thanks to Bill Howe. Highly recommended.
It's pretty decent. I liked the assignments. However, there were some typos in the lecture slides, and the grader output is not very friendly.
Very good course, but the lectures could be better aligned with the homework assignments. A lot of independent work, for me at least. The teacher is very good.
About the Data Science at Scale Specialization