Explore stock prices with Spark SQL

4.6
16 ratings
Offered by
Coursera Project Network
In this Guided Project, you will:

Create an application that runs on a Spark cluster

Derive knowledge from data using Spark RDD and DataFrames

Store results in Parquet tables

2 hours
Intermediate
No download needed
Split-screen video
English
Desktop only

In this 1-hour-long project-based course, you will learn how to interact with a Spark cluster from a Jupyter notebook and how to start a Spark application. You will learn how to use Spark Resilient Distributed Datasets (RDDs) and Spark DataFrames to explore a dataset. We will load a dataset into our Spark program and analyze it using actions, transformations, the Spark DataFrame API, and Spark SQL. You will learn how to choose the best tool for each scenario. Finally, you will learn to save your results in Parquet tables.

Skills you will develop

Spark SQL, Data Analysis, Big Data, Apache Spark, Distributed Computing

Learn step by step

In a video that plays in a split screen alongside your workspace, your instructor will walk you through each step:

  1. By the end of Task 1, you will become familiar with the Jupyter notebook environment

  2. By the end of Task 2, you will be able to initialize a Spark application

  3. By the end of Task 3, you will be able to create Spark Resilient Distributed Datasets

  4. By the end of Task 4, you will be able to create Spark DataFrames in several ways

  5. By the end of Task 5, you will be able to explore data sets with Spark SQL

  6. By the end of Task 6, you will be able to write statistical queries and compare Spark DataFrames

  7. By the end of Task 7, you will be able to store DataFrames in Parquet tables
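
The workflow in Tasks 2 through 7 can be sketched roughly as follows in PySpark. This is only an illustrative sketch, not the course's actual notebook: the sample rows, the column names (`symbol`, `day`, `close`), the `prices` view name, and the helper functions `parse_line` and `run` are all assumptions made up for this example, and it runs against a local Spark session rather than a cluster.

```python
from datetime import date
import tempfile

# Hypothetical sample rows (symbol, day, closing price); not the course dataset.
SAMPLE_LINES = [
    "AAA,2020-01-02,10.0",
    "AAA,2020-01-03,12.0",
    "BBB,2020-01-02,20.0",
    "BBB,2020-01-03,19.0",
]


def parse_line(line):
    """Split one CSV line into a (symbol, day, close) tuple."""
    symbol, day, close = line.split(",")
    return symbol, date.fromisoformat(day), float(close)


def run(output_dir):
    """Run the RDD -> DataFrame -> Spark SQL -> Parquet steps on the sample rows."""
    # Imported here so parse_line can be used without Spark installed.
    from pyspark.sql import SparkSession

    # Task 2: start a Spark application (local mode is assumed here).
    spark = (
        SparkSession.builder.master("local[*]").appName("stock-sketch").getOrCreate()
    )
    try:
        # Task 3: build an RDD; map is a transformation, count is an action.
        rdd = spark.sparkContext.parallelize(SAMPLE_LINES).map(parse_line)
        print("rows:", rdd.count())

        # Task 4: one way to create a DataFrame is from an RDD plus column names.
        df = spark.createDataFrame(rdd, ["symbol", "day", "close"])

        # Tasks 5 and 6: register a temporary view and query it with Spark SQL.
        df.createOrReplaceTempView("prices")
        stats = spark.sql(
            "SELECT symbol, AVG(close) AS avg_close, MAX(close) AS max_close "
            "FROM prices GROUP BY symbol ORDER BY symbol"
        )
        stats.show()

        # Task 7: store the result as Parquet, then read it back to check it.
        stats.write.mode("overwrite").parquet(output_dir)
        return spark.read.parquet(output_dir).count()
    finally:
        spark.stop()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run(tmp)
```

The same aggregation could also be written with the DataFrame API (`df.groupBy("symbol").agg(...)`); which form reads better is largely a matter of preference and of how the rest of the analysis is written.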

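How Guided Projects work

Your workspace is a cloud desktop right in your browser, no download required

In a split-screen video, your instructor guides you step by step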

Frequently asked questions

More questions? Visit the Learner Help Center