About this course
33,102 recent views

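Course 2 of 5

100% online

Start instantly and learn at your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Advanced level

Approx. 72 hours to complete

Suggested: 6 weeks of study, 6-8 hours/week...

English

Subtitles: English, Korean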

Skills you will gain

Graphs, Hive, Apache Hive, Apache Spark

Syllabus - What you will learn from this course

Week 1
22 minutes to complete

Welcome to the Second Course: Big Data Analysis

8 videos (total 12 min), 1 reading
8 videos
What is BigData Analysis? (1 min)
Tools For BigData Analysis (1 min)
Graph Data Analysis (2 min)
Meet Alexey Dral (2 min)
Meet Pavel Mezentsev (37 sec)
Meet Natalia Pritykovskaya (40 sec)
Meet Pavel Klemenkov (40 sec)
1 reading
Slack Channel is the quickest way to get answers to your questions (10 min)
3 hours to complete

Big Data SQL: Hive

15 videos (total 105 min), 3 quizzes
15 videos
HTTP Web Service: Access Log Format (4 min)
Business Use Cases: Solution with Hive (6 min)
(optional) SQL: likbez (10 min)
Hive Data Definition Language (DDL) (11 min)
Hive Data Manipulation Language (DML) (6 min)
Hive Analytics: RegexSerDe, Views (7 min)
(optional) Regular Expressions, Likbez (9 min)
Hive Analytics: UDF, UDAF, UDTF (7 min)
Hive Streaming (4 min)
Hive PTF (Window Functions) (5 min)
Hive Optimization: Partitioning, Bucketing and Sampling (8 min)
Hive Map-Side Joins: Plain, Bucket, Sort-Merge (5 min)
Hive Optimization: Data Skew (4 min)
Hive Optimization: Row-Columnar File Formats, Compression (8 min)
3 practice exercises
Hive: SQL over Hadoop MapReduce (20 min)
Hive Analytics with UDF and Streaming (20 min)
Hive final (20 min)
Week 2
6 hours to complete

Big Data SQL: Hive (practice week)

3 videos (total 11 min), 4 readings, 5 quizzes
3 videos
How to Install Docker on Windows 7, 8, 10 (4 min)
How to submit your first Hadoop assignment (3 min)
4 readings
Assignments. General requirements (10 min)
Hive assignment. Intro and instructions (10 min)
Grading System: Instructions and Common Problems (10 min)
Docker Installation Guide (10 min)
Week 3
2 hours to complete

Spark SQL and Spark Dataframe

14 videos (total 82 min), 2 quizzes
14 videos
What is Pandas DataFrame and how to create it (4 min)
How to process a DataFrame as SQL (4 min)
Working with Hive (4 min)
Reading and Writing Files (7 min)
RDD vs. DF vs. SQL (3 min)
Projection and Filtering (5 min)
Functions (5 min)
Aggregates (6 min)
Join (8 min)
User Defined Functions (8 min)
Time Processing (4 min)
Window Functions (7 min)
Two-Dimensional Distributions (4 min)
2 practice exercises
Introducing DataFrame and SQL (16 min)
Spark SQL and Spark Dataframe (18 min)
Week 4
4 hours to complete

Graph Analysis from Big Data Perspective

13 videos (total 83 min), 5 quizzes
13 videos
Graph representation (7 min)
Counting common friends. Part I (2 min)
Counting common friends. Part II (10 min)
Counting common friends. Part III (5 min)
GraphFrames: Introduction (6 min)
Motif Finding: DSL (6 min)
Motif Finding: Counting Mutual Friends (6 min)
Motif Finding: Under The Hood. Part 1 (14 min)
Motif Finding: Under The Hood. Part 2 (4 min)
Triangles Count: Introduction (3 min)
Triangles Count: Edge Lists (6 min)
Triangles Count: GraphFrame (6 min)
4 practice exercises
Graph Representations (10 min)
Motif Finding (18 min)
Triangles Count (8 min)
Graph Analysis from Big Data Perspective (20 min)
4.1
29 reviews

33%

started a new career after completing these courses

25%

got a tangible career benefit from this course

Top reviews from Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames

by SM, Nov 13th 2018

content of the course is remarkable and the way they explained concepts is very lucid. I just want to give suggestions please give link to the data set they are using for illustrating the concepts.

by SS, Feb 3rd 2018

I wish I could give more rating than 5 :). Excellent course. Thanks so much for such an excellent course. All the instructors are great.

Instructors

Alexey A. Dral

Founder and Chief Executive Officer
BigData Team

Pavel Klemenkov

Chief Data Scientist
NVIDIA

About Yandex

Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world....

About the Big Data for Data Engineers Specialization

This specialization is made for people working with data (either small or big). If you are a Data Analyst, Data Scientist, Data Engineer or Data Architect (or you want to become one), don't miss the opportunity to expand your knowledge and skills in data engineering and large-scale data analysis. In four concise courses you will learn the basics of Hadoop, MapReduce, Spark, methods of offline data processing for warehousing, real-time data processing and large-scale machine learning. A Capstone project then lets you build and deploy your own Big Data Service (and make your portfolio even more competitive). Over the course of the specialization, you will complete progressively harder programming assignments (mostly in Python), so make sure you have some experience in it. This course will help you master designing solutions for common Big Data tasks: creating batch and real-time data processing pipelines, doing machine learning at scale, deploying machine learning models into a production environment, and much more! Join some of the best hands-on big data professionals, who know their job inside out, to learn the basics, as well as some tricks of the trade, from them. Special thanks to Prof. Mikhail Roytberg (APT dept., MIPT), Oleg Sukhoroslov (PhD, Senior Researcher, IITP RAS), Oleg Ivchenko (APT dept., MIPT), Pavel Akhtyamov (APT dept., MIPT), Vladimir Kuznetsov, Asya Roitberg, Eugene Baulin, Marina Sudarikova....

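Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page; from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.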