About this course
4.0
208 ratings
61 reviews
Specialization

Course 1 of 1 in the specialization

100% online
Start instantly and learn at your own schedule.

Flexible deadlines
Reset deadlines to fit your schedule.

Intermediate level

Approx. 43 hours to complete
Suggested: 6 weeks of study, 6-8 hours/week...

Available languages
English
Subtitles: English...

Skills you will gain

Python Programming, Apache Hadoop, MapReduce, Apache Spark

Syllabus - What you will learn from this course

Week 1
14 minutes to complete

Welcome

8 videos (14 min total)
Videos (8)
Issues BigData can solve (1 min)
BigData Applications (1 min)
What is BigData Essentials? (2 min)
Course Structure (2 min)
Meet Emeli (1 min)
Meet Alexey (2 min)
Meet Ivan (1 min)
8 hours to complete

What are BigData and distributed file systems (e.g. HDFS)?

18 videos (136 min total), 10 readings, 5 quizzes
Videos (18)
File system managing (6 min)
File content exploration 1 (5 min)
File content exploration 2 (13 min)
Processes (4 min)
Scaling Distributed File System (9 min)
Block and Replica States, Recovery Process 1 (6 min)
Block and Replica States, Recovery Process 2 (7 min)
HDFS Client (9 min)
Web UI, REST API (4 min)
Namenode Architecture (8 min)
Introduction (10 min)
Text formats (9 min)
Binary formats 1 (8 min)
Binary formats 2 (8 min)
Compression (7 min)
How to submit your first assignment (3 min)
How to Install Docker on Windows 7, 8, 10 (4 min)
Readings (10)
Basic Bash Commands (10 min)
Slack Channel is the quickest way to get answers to your questions (10 min)
HDFS Lesson Introduction (10 min)
Gentle Introduction into "curl" (10 min)
File formats extra (optional) (10 min)
Grading System: Instructions and Common Problems (10 min)
Docker Installation Guide (10 min)
Programming Assignment: Instructions and Common Problems (10 min)
FAQ How to show your code to teaching staff (10 min)
Slack channel "Bigdata-coursera" - the quickest to solve technical problems. (10 min)
Practice exercises (2)
Distributed File Systems (16 min)
Big Data and Distributed File Systems (25 min)
Week 2
3 hours to complete

Solving Problems with MapReduce

17 videos (94 min total), 1 reading, 3 quizzes
Videos (17)
Unreliable Components 2 (8 min)
MapReduce (4 min)
Distributed Shell (8 min)
Fault Tolerance (7 min)
Fault Tolerance. Live Demo (3 min)
Streaming (7 min)
Streaming in Python (3 min)
WordCount in Python (5 min)
Distributed Cache (4 min)
Environment, Counters (4 min)
Testing (5 min)
Combiner (5 min)
Partitioner (7 min)
Comparator (1 min)
Speculative Execution / Backup Tasks (3 min)
Compression (4 min)
Readings (1)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Practice exercises (3)
Hadoop MapReduce Intro (26 min)
MapReduce Streaming (26 min)
Hadoop Streaming Final (30 min)
Week 3
4 hours to complete

Solving Problems with MapReduce (practice week)

1 video (3 min total), 5 readings, 5 quizzes
Readings (5)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Hints to Debug Hadoop Streaming Applications (10 min)
Grading System and Grading System Sandbox User Guide (10 min)
Hadoop Streaming Assignments: Instructions (10 min)
Hint to the "Stop words" programming assignment (10 min)
Week 4
3 hours to complete

Introduction to Apache Spark

16 videos (95 min total), 2 readings, 2 quizzes
Videos (16)
Welcome (6 min)
RDDs (8 min)
Transformations 1 (6 min)
Transformations 2 (7 min)
Actions (5 min)
Resiliency (6 min)
Execution & Scheduling (6 min)
Caching & Persistence (5 min)
Broadcast variables (5 min)
Accumulator variables (5 min)
Getting started with Spark & Python (6 min)
Working with text files (6 min)
Joins (4 min)
Broadcast & Accumulator variables (5 min)
Spark UI (4 min)
Cluster mode (3 min)
Readings (2)
Spark Assignments Intro (10 min)
Instructions for Spark programming assignment (10 min)
Practice exercises (2)
Lesson 1 Quiz (20 min)
Lesson 2 Quiz (24 min)

Top reviews

By SD, Jun 28th 2018

Absolutely essential for everyone who wants a proper introduction to HDFS, MapReduce and Spark. Brought to you by a great team of geniuses of their time ;)

By MG, Oct 31st 2018

Interesting, useful, informative, accessible (and sometimes funny!) lectures. Stimulating assignments. Fast responses from instructors/mentors.

Instructors

Ivan Puzyrevskiy

Technical Team Lead

Alexey A. Dral

Founder and Chief Executive Officer
BigData Team

About Yandex

Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world....

About the Big Data for Data Engineers Specialization

This specialization is made for people working with data (either small or big). If you are a Data Analyst, Data Scientist, Data Engineer or Data Architect (or you want to become one), don’t miss the opportunity to expand your knowledge and skills in data engineering and data analysis at large scale. In four concise courses you will learn the basics of Hadoop, MapReduce, Spark, methods of offline data processing for warehousing, real-time data processing and large-scale machine learning. A Capstone project lets you build and deploy your own Big Data Service and make your portfolio even more competitive. Over the course of the specialization, you will complete progressively harder programming assignments (mostly in Python), so make sure you have some experience with it. This course will sharpen your skills in designing solutions for common Big Data tasks: creating batch and real-time data processing pipelines, doing machine learning at scale, deploying machine learning models into a production environment, and much more. Join some of the best hands-on big data professionals, who know their job inside out, and learn the basics as well as some tricks of the trade from them. Special thanks to Prof. Mikhail Roytberg (APT dept., MIPT), Oleg Sukhoroslov (PhD, Senior Researcher, IITP RAS), Oleg Ivchenko (APT dept., MIPT), Pavel Akhtyamov (APT dept., MIPT), Vladimir Kuznetsov, Asya Roitberg, Eugene Baulin, Marina Sudarikova....

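Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.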