About this course

100% online

Start right away and learn on your own schedule.

Flexible deadlines

Reset deadlines to fit your schedule.

Beginner level

Approx. 8 hours to complete

Suggested: 5 hours/week...

English

Subtitles: English, Spanish

Syllabus - What you will learn from this course

1
1 hour to complete

Big Data Rankings & Products

The first module, “Big Data Rankings & Products,” focuses on the relationships among, and the market shares of, big data hardware, software, and professional services. This information provides insight into how future industries, products, services, schools, and government organizations will be influenced by big data technology. To give a deeper view of the world’s top big data product lines and service types, the lecture provides an overview of the major big data companies, which include IBM, SAP, Oracle, HPE, Splunk, Dell, Teradata, Microsoft, Cisco, and AWS. To convey the power of big data technology, the lecture explains how big data analysis differs from traditional data analysis. This is followed by a lecture on the four Vs, the major challenges of big data technology, which concern the volume, variety, velocity, and veracity of massive data. Building on this introduction, the module describes how Wal-Mart, Amazon, and Citibank use big data technology to add global insight to investments, help locate new stores and factories, and run real-time recommendation systems....
6 videos (total 28 min), 2 quizzes
6 videos
1.1 Big Data Market Analysis (1 min)
1.2 IBM / 1.3 SAP (8 min)
1.4 Oracle / 1.5 Splunk / 1.6 Accenture / 1.7 Dell / 1.8 Teradata (6 min)
1.9 Microsoft / 1.10 Cisco / 1.11 AWS (3 min)
1.12 Big Data Landscape (1 min)
2 practice exercises
Ungraded Quiz (8 min)
Graded Quiz
2
1 hour to complete

Big Data & Hadoop

The second module, “Big Data & Hadoop,” focuses on the characteristics and operations of Hadoop, which is the original big data system that was used by Google. The lectures explain the functionality of MapReduce, HDFS (Hadoop Distributed FileSystem), and the processing of data blocks. These functions are executed on a cluster of nodes that are assigned the role of NameNode or DataNode, and the data processing is carried out by the JobTracker and TaskTrackers, which are explained in the lectures. In addition, the lectures explain the characteristics of metadata types and how the data analysis processes of Hadoop and SQL (Structured Query Language) differ. The Hadoop Release Series is then introduced, including descriptions of Hadoop YARN (Yet Another Resource Negotiator), HDFS Federation, and HDFS HA (High Availability) big data technology....
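As a rough illustration of the MapReduce idea named in this module, the sketch below counts words across a few in-memory "data blocks" in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample blocks are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(block):
    """Map: emit a (word, 1) pair for every word in one data block."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    """Run map, shuffle, and reduce in sequence over all blocks."""
    pairs = []
    for block in blocks:
        # In a real cluster each block would be mapped on a separate node;
        # here the blocks are simply processed one after another.
        pairs.extend(map_phase(block))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample data standing in for blocks stored in a file system.
blocks = ["big data hadoop", "hadoop spark storm", "big data spark"]
counts = word_count(blocks)
```

Here counts maps each word to the number of times it appears across all blocks. A real framework runs the map and reduce steps in parallel on many nodes; this sketch only shows the data flow.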
8 videos (total 68 min), 2 quizzes
8 videos
2.3 Big Data's 4 Vs / 2.4 How is Big Data being Used? (8 min)
2.5 HADOOP (11 min)
2.6 MapReduce vs. RDBMS (6 min)
2.7 MapReduce (9 min)
2.8 Hadoop vs. SQL (RDBMS & RDSMS) (12 min)
2.9 HDFS Enhancements (4 min)
2.10 Hadoop vs. Hadoop YARN (6 min)
2 practice exercises
Ungraded Quiz (12 min)
Graded Quiz
3
2 hours to complete

Spark

The third module, “Spark,” focuses on the operations and characteristics of Spark, which is currently the most popular big data technology in the world. The lecture first covers how the data analysis characteristics of Spark and Hadoop differ, then goes into the features of Spark big data processing based on the RDD (Resilient Distributed Dataset), Spark Core, Spark SQL, Spark Streaming, MLlib (Machine Learning Library), and GraphX core units. The lectures detail the features of Spark DAG (Directed Acyclic Graph) stages and the pipeline processes that are formed from Spark transformations and actions. In particular, the definition and advantages of lazy transformations and DAG operations are described, along with the characteristics of Spark variables and serialization. In addition, Spark cluster operations based on Mesos, Standalone, and YARN are introduced....
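The lazy transformations mentioned in this module can be pictured with a small plain-Python analogy. The LazyDataset class below is a hypothetical stand-in for an RDD, not the Spark API: map and filter only record the requested step, and nothing is computed until the collect action is called.

```python
class LazyDataset:
    """Hypothetical stand-in for an RDD: records transformations lazily."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: return a new dataset with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded, not executed.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self):
        # Action: build the pipeline from the recorded steps and run it.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []


def square(x):
    calls.append(x)  # track when the function is really invoked
    return x * x


dataset = LazyDataset(range(1, 6)).map(square).filter(lambda v: v % 2 == 1)
calls_before_action = len(calls)  # no work has been done yet
result = dataset.collect()        # the action triggers the whole pipeline
```

Because the chain of steps is known before any work starts, an engine can plan the whole pipeline at once; in this toy version the benefit is only that square is never called unless an action asks for results.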
11 videos (total 101 min), 2 quizzes
11 videos
3.2 Spark Architecture / 3.3 Spark Family (9 min)
3.4 Spark vs. Hadoop (11 min)
3.5 Spark RDD (6 min)
3.6 Spark Transformations / 3.7 Spark Actions / 3.8 Spark DAG (12 min)
3.9 Spark Programming (7 min)
3.10 Spark Core / 3.11 Spark Variables & Serialization (7 min)
3.12 Spark Cluster Operations / 3.13 Spark Standalone / 3.14 Spark Mesos (14 min)
3.15 Spark YARN (9 min)
3.16 Spark SQL / 3.17 Spark GraphX (5 min)
3.18 Relational DB & Graph DB (12 min)
2 practice exercises
Ungraded Quiz
Graded Quiz
4
1 hour to complete

Spark ML & Streaming

The fourth module, “Spark ML & Streaming,” focuses on how Spark ML (Machine Learning) works and how Spark streaming operations are conducted. The Spark ML algorithms include featurization, pipelines, persistence, and utilities, which operate on RDDs (Resilient Distributed Datasets) to extract information from massive datasets. The lectures explain the characteristics of the DataFrame-based API, which is the primary ML API in the spark.ml package. Spark ML basic statistics algorithms based on correlation and hypothesis testing (p-value) are introduced first, followed by the Spark ML classification and regression algorithms based on linear models, naive Bayes, and decision tree techniques. The lectures then explain the characteristics of Spark streaming, streaming input and output, and the streaming receiver types (basic, custom, and advanced), followed by how the Spark Streaming process and DStream (Discretized Stream) enable big data streaming operations for real-time and near-real-time applications....
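To make the correlation statistic mentioned in this module concrete, here is a minimal plain-Python sketch of the Pearson correlation coefficient, one common correlation measure. It does not use Spark; the function name pearson and the sample values are assumptions chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample data: a perfectly increasing and a perfectly decreasing pair.
r_up = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson([1, 2, 3], [3, 2, 1])
```

A value near 1 indicates a strong positive linear relationship and a value near -1 a strong negative one. A distributed library computes the same sums across partitions of a much larger dataset.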
4 videos (total 31 min), 2 quizzes
4 videos
4.2 Spark ML Algorithms part 1 (8 min)
4.2 Spark ML Algorithms part 2 (9 min)
4.3 Spark Streaming (10 min)
2 practice exercises
Ungraded Quiz
Graded Quiz
5
1 hour to complete

Storm

The fifth module, “Storm,” focuses on the characteristics and operations of Storm big data systems. The lecture first covers how the data analysis characteristics of Storm, Spark, and Hadoop technology differ. The features of Storm big data processing based on the nimbus, spouts, and bolts are then described, followed by details of Storm streams, the supervisor, and ZooKeeper. Further details on Storm reliable and unreliable spouts and bolts are provided, followed by the advantages of the Storm DAG (Directed Acyclic Graph) and data stream queue management. In addition, the advantages of fast, Storm-based real-time applications are introduced, including real-time analytics, online ML (Machine Learning), continuous computation, DRPC (Distributed Remote Procedure Call), and ETL (Extract, Transform, Load)....
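The spout and bolt roles described in this module can be sketched with Python generators. This is only an analogy, not the Storm API: the names sentence_spout, split_bolt, and count_bolt and the sample sentences are hypothetical, and the whole "topology" runs in a single process.

```python
def sentence_spout(sentences):
    """Spout: the source of the stream; emits one tuple per sentence."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) for each word seen."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


# Wire the spout and bolts into a linear pipeline, a simple case of a DAG.
topology = count_bolt(split_bolt(sentence_spout(["storm spark", "storm hadoop"])))
emitted = list(topology)
```

Each tuple flows through the pipeline as soon as it is produced, which mirrors the record-at-a-time processing style of a streaming system. A real deployment would distribute spouts and bolts across worker nodes.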
5 videos (total 36 min), 2 quizzes
5 videos
5.2 Storm Applications (14 min)
5.3 Storm Spouts & Streams (6 min)
5.4 Storm Topology & Management (5 min)
5.5 Storm Trident (2 min)
2 practice exercises
Ungraded Quiz
Graded Quiz
6
24 hours to complete

IBM SPSS Statistics Project

The sixth and last module, “IBM SPSS Statistics Project,” focuses on providing experience with one of the most famous and widely used big data statistical analysis systems in the world. The lecture starts with how to set up and use IBM SPSS Statistics, and goes on to describe how IBM SPSS Statistics can be used to gain corporate data analysis experience. Two projects are then carried out in which data is processed and statistical results are produced with the IBM SPSS Statistics big data system. The projects are designed so that the student can discover new ways to use, analyze, and chart the relationships between datasets, and also compare the statistical results using IBM SPSS Statistics....
1 video (total 9 min), 1 quiz
1 video

Instructor

Jong-Moon Chung

Professor, School of Electrical & Electronic Engineering
Director, Communications & Networking Laboratory

About Yonsei University

Yonsei University was established in 1885 and is the oldest private university in Korea. Yonsei’s main campus is situated minutes away from the economic, political, and cultural centers of Seoul’s metropolitan downtown. Yonsei has 3,500 eminent faculty members who are conducting cutting-edge research across all academic disciplines. There are 18 graduate schools, 22 colleges and 133 subsidiary institutions hosting a selective pool of students from around the world. Yonsei is proud of its history and reputation as a leading institution of higher education and research in Asia....

About the Emerging Technologies: Smartphones, IoT, and Big Data Specialization

This Specialization is intended for researchers and business experts seeking state-of-the-art knowledge in advanced science and technology. The 4 courses cover details on Big Data (Hadoop, Spark, Storm), Smartphones, Smart Watches, Android, iOS, CPU/GPU/SoC, Mobile Communications (1G to 5G), Sensors, IoT, Wi-Fi, Bluetooth, LP-WAN, Cloud Computing, AR (Augmented Reality), Skype, YouTube, H.264/MPEG-4 AVC, MPEG-DASH, CDN, and Video Streaming Services. The Specialization includes projects on Big Data using IBM SPSS Statistics, AR applications, Cloud Computing using AWS (Amazon Web Services) EC2 (Elastic Compute Cloud), and Smartphone applications to analyze mobile communication, Wi-Fi, and Bluetooth networks. The course contents are intended for expert-level research, design, development, industrial strategic planning, business, administration, and management....

Frequently Asked Questions

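  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.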

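  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the course. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.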

More questions? Visit the Learner Help Center.