About this course
3.9
71 ratings
14 reviews
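Specialization

Course 2 of 5

100% online

Start instantly and learn on your own schedule.
Flexible deadlines

Reset deadlines in accordance with your schedule.
Advanced Level

Hours to complete

Approx. 40 hours to complete

Suggested: 6 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English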

Skills you will gain

Graphs, Hive, Apache Hive, Apache Spark
Syllabus - What you will learn from this course

1
Hours to complete
12 minutes to complete

Welcome to the Second Course: Big Data Analysis

8 videos (Total 12 min)
Video: 8 videos
What is BigData Analysis? (1 min)
Tools For BigData Analysis (1 min)
Graph Data Analysis (2 min)
Meet Alexey Dral (2 min)
Meet Pavel Mezentsev (37 sec)
Meet Natalia Pritykovskaya (40 sec)
Meet Pavel Klemenkov (40 sec)
Hours to complete
3 hours to complete

Big Data SQL: Hive

15 videos (Total 105 min), 1 reading, 3 quizzes
Video: 15 videos
HTTP Web Service: Access Log Format (4 min)
Business Use Cases: Solution with Hive (6 min)
(optional) SQL: likbez (10 min)
Hive Data Definition Language (DDL) (11 min)
Hive Data Manipulation Language (DML) (6 min)
Hive Analytics: RegexSerDe, Views (7 min)
(optional) Regular Expressions, Likbez (9 min)
Hive Analytics: UDF, UDAF, UDTF (7 min)
Hive Streaming (4 min)
Hive PTF (Window Functions) (5 min)
Hive Optimization: Partitioning, Bucketing and Sampling (8 min)
Hive Map-Side Joins: Plain, Bucket, Sort-Merge (5 min)
Hive Optimization: Data Skew (4 min)
Hive Optimization: Row-Columnar File Formats, Compression (8 min)
Reading: 1 reading
Slack Channel is the quickest way to get answers to your questions (10 min)
Quiz: 3 practice exercises
Hive: SQL over Hadoop MapReduce (20 min)
Hive Analytics with UDF and Streaming (20 min)
Hive final (20 min)
2
Hours to complete
7 hours to complete

Big Data SQL: Hive (practice week)

3 videos (Total 11 min), 6 readings, 5 quizzes
Video: 3 videos
How to Install Docker on Windows 7, 8, 10 (4 min)
How to submit your first Hadoop assignment (3 min)
Reading: 6 readings
Assignments. General requirements (10 min)
Hive assignment. Intro and instructions (10 min)
Grading System: Instructions and Common Problems (10 min)
Docker Installation Guide (10 min)
Copy of Assignments. General requirements (10 min)
Copy of Assignments. General requirements (10 min)
3
Hours to complete
2 hours to complete

Spark SQL and Spark Dataframe

14 videos (Total 82 min), 2 quizzes
Video: 14 videos
What is Pandas DataFrame and how to create it (4 min)
How to process a DataFrame as SQL (4 min)
Working with Hive (4 min)
Reading and Writing Files (7 min)
RDD vs. DF vs. SQL (3 min)
Projection and Filtering (5 min)
Functions (5 min)
Aggregates (6 min)
Join (8 min)
User Defined Functions (8 min)
Time Processing (4 min)
Window Functions (7 min)
Two-Dimensional Distributions (4 min)
Quiz: 2 practice exercises
Introducing DataFrame and SQL (16 min)
Spark SQL and Spark Dataframe (18 min)
4
Hours to complete
4 hours to complete

Graph Analysis from Big Data Perspective

13 videos (Total 83 min), 5 quizzes
Video: 13 videos
Graph representation (7 min)
Counting common friends. Part I (2 min)
Counting common friends. Part II (10 min)
Counting common friends. Part III (5 min)
GraphFrames: Introduction (6 min)
Motif Finding: DSL (6 min)
Motif Finding: Counting Mutual Friends (6 min)
Motif Finding: Under The Hood. Part 1 (14 min)
Motif Finding: Under The Hood. Part 2 (4 min)
Triangles Count: Introduction (3 min)
Triangles Count: Edge Lists (6 min)
Triangles Count: GraphFrame (6 min)
Quiz: 4 practice exercises
Graph Representations (10 min)
Motif Finding (18 min)
Triangles Count (8 min)
Graph Analysis from Big Data Perspective (20 min)
3.9
14 reviews
Career direction

50%

started a new career after completing these courses
Career benefit

50%

got a tangible career benefit from this course

Top reviews

By SM, Nov 13th 2018

content of the course is remarkable and the way they explained concepts is very lucid. I just want to give suggestions please give link to the data set they are using for illustrating the concepts.

By SS, Feb 3rd 2018

I wish I could give more rating than 5 :). Excellent course. Thanks so much for such an excellent course. All the instructors are great.

Instructors

Pavel Klemenkov

Chief Data Scientist
NVIDIA

Pavel Mezentsev

Senior Data Scientist
PulsePoint inc

Alexey A. Dral

Founder and Chief Executive Officer
BigData Team

About Yandex

Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world....

About the Big Data for Data Engineers Specialization

This specialization is made for people working with data (either small or big). If you are a Data Analyst, Data Scientist, Data Engineer or Data Architect (or you want to become one), don’t miss the opportunity to expand your knowledge and skills in data engineering and data analysis at large scale. In four concise courses you will learn the basics of Hadoop, MapReduce, Spark, methods of offline data processing for warehousing, real-time data processing and large-scale machine learning. A Capstone project then lets you build and deploy your own Big Data Service (making your portfolio even more competitive). Over the course of the specialization, you will complete progressively harder programming assignments (mostly in Python), so make sure you have some experience with it. This course will help you master designing solutions for common Big Data tasks: creating batch and real-time data processing pipelines, doing machine learning at scale, deploying machine learning models into a production environment, and much more! Join some of the best hands-on big data professionals, who know their job inside out, to learn the basics, as well as some tricks of the trade, from them. Special thanks to Prof. Mikhail Roytberg (APT dept., MIPT), Oleg Sukhoroslov (PhD, Senior Researcher, IITP RAS), Oleg Ivchenko (APT dept., MIPT), Pavel Akhtyamov (APT dept., MIPT), Vladimir Kuznetsov, Asya Roitberg, Eugene Baulin, Marina Sudarikova....

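Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page; from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.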