About this course
4.3
666 ratings
142 reviews
Specialization

Course 1 of 4

100% online

Start right away and learn on your own schedule.
Flexible deadlines

Reset deadlines to fit your schedule.
Hours to complete

Approximately 21 hours to complete

Suggested: 4 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Relational Algebra, Python Programming, MapReduce, SQL
Syllabus - What you will learn from this course

1
Hours to complete
6 hours to complete

Data Science Context and Concepts

Understand the terminology and recurring principles associated with data science, and understand the structure of data science projects and emerging methodologies to approach them. Why does this emerging field exist? How does it relate to other fields? How does this course distinguish itself? What do data science projects look like, and how should they be approached? What are some examples of data science projects? ...
22 videos (125 min total), 4 readings, 1 quiz
22 videos
Appetite Whetting: Extreme Weather (2 min)
Appetite Whetting: Digital Humanities (8 min)
Appetite Whetting: Bibliometrics (4 min)
Appetite Whetting: Food, Music, Public Health (5 min)
Appetite Whetting: Public Health cont'd, Earthquakes, Legal (4 min)
Characterizing Data Science (5 min)
Characterizing Data Science, cont'd (5 min)
Distinguishing Data Science from Related Topics (4 min)
Four Dimensions of Data Science (6 min)
Tools vs. Abstractions (7 min)
Desktop Scale vs. Cloud Scale (5 min)
Hackers vs. Analysts (2 min)
Structs vs. Stats (5 min)
Structs vs. Stats cont'd (5 min)
A Fourth Paradigm of Science (3 min)
Data-Intensive Science Examples (6 min)
Big Data and the 3 Vs (5 min)
Big Data Definitions (4 min)
Big Data Sources (6 min)
Course Logistics (7 min)
Twitter Assignment: Getting Started (14 min)
4 readings
Supplementary: Three-Course Reading List (10 min)
Supplementary: Resources for Learning Python (10 min)
Supplementary: Class Virtual Machine (10 min)
Supplementary: Github Instructions (10 min)
2
Hours to complete
5 hours to complete

Relational Databases and the Relational Algebra

Relational databases are the workhorse of large-scale data management. Although originally motivated by problems in enterprise operations, they have proven remarkably capable for analytics as well. But most importantly, the principles underlying relational databases are universal in managing, manipulating, and analyzing data at scale. Even as the landscape of large-scale data systems has expanded dramatically in the last decade, relational models and languages have remained a unifying concept. For working with large-scale data, there is no more important programming model to learn....
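As a rough illustration of the relational programming model this module covers, the sketch below implements three relational algebra operators (selection, projection, and equi-join) over plain Python lists of dictionaries. It is not taken from the course materials: the function names and the `students` / `enrolled` sample relations are hypothetical, chosen only to show how operators compose into a query.

```python
def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only the named attributes, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


def join(left, right, attr):
    """Equi-join: pair rows from both relations that agree on `attr`."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]


# Hypothetical sample relations, for illustration only.
students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Bob"}]
enrolled = [
    {"sid": 1, "course": "db"},
    {"sid": 2, "course": "ml"},
    {"sid": 1, "course": "ml"},
]

# Names of students enrolled in "ml": project(select(join(...))).
ml_students = project(
    select(join(students, enrolled, "sid"), lambda r: r["course"] == "ml"),
    ["name"],
)
```

Because each operator takes relations in and returns a relation, the calls nest freely; that closure property is what lets a query optimizer reorder operators without changing the result.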
24 videos (122 min total), 1 quiz
24 videos
From Data Models to Databases (4 min)
Pre-Relational Databases (5 min)
Motivating Relational Databases (3 min)
Relational Databases: Key Ideas (4 min)
Algebraic Optimization Overview (6 min)
Relational Algebra Overview (4 min)
Relational Algebra Operators: Union, Difference, Selection (6 min)
Relational Algebra Operators: Projection, Cross Product (4 min)
Relational Algebra Operators: Cross Product cont'd, Join (6 min)
Relational Algebra Operators: Outer Join (4 min)
Relational Algebra Operators: Theta-Join (4 min)
From SQL to RA (6 min)
Thinking in RA: Logical Query Plans (4 min)
Practical SQL: Binning Timeseries (5 min)
Practical SQL: Genomic Intervals (6 min)
User-Defined Functions (3 min)
Support for User-Defined Functions (4 min)
Optimization: Physical Query Plans (5 min)
Optimization: Choosing Physical Plans (4 min)
Declarative Languages (5 min)
Declarative Languages: More Examples (4 min)
Views: Logical Data Independence (5 min)
Indexes (6 min)
3
Hours to complete
5 hours to complete

MapReduce and Parallel Dataflow Programming

The MapReduce programming model (as distinct from its implementations) was proposed as a simplifying abstraction for parallel manipulation of massive datasets, and remains an important concept to know when using and evaluating modern big data platforms. ...
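To make the distinction between the programming model and its implementations concrete, here is a minimal single-process sketch of the model, using word counting as the example. It is an illustration under stated assumptions, not the course's code or any framework's API: `run_mapreduce`, `map_fn`, `reduce_fn`, and the sample documents are hypothetical names, and a real system would distribute the map, shuffle, and reduce steps across machines.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: combine all values emitted for a single key."""
    return word, sum(counts)


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for a MapReduce runtime (illustrative only)."""
    groups = defaultdict(list)
    # Map phase plus shuffle: group every emitted value by its key.
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: one reduce call per distinct key.
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


# Hypothetical input: (document id, text) pairs.
docs = [("d1", "big data big ideas"), ("d2", "data at scale")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

The user writes only `map_fn` and `reduce_fn`; everything inside `run_mapreduce` is what an implementation is free to parallelize, which is why the abstraction can be discussed separately from any one platform.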
26 videos (122 min total), 1 quiz
26 videos
A Sketch of Algorithmic Complexity (5 min)
A Sketch of Data-Parallel Algorithms (5 min)
"Pleasingly Parallel" Algorithms (4 min)
More General Distributed Algorithms (4 min)
MapReduce Abstraction (4 min)
MapReduce Data Model (3 min)
Map and Reduce Functions (2 min)
MapReduce Simple Example (3 min)
MapReduce Simple Example cont'd (3 min)
MapReduce Example: Word Length Histogram (2 min)
MapReduce Examples: Inverted Index, Join (6 min)
Relational Join: Map Phase (4 min)
Relational Join: Reduce Phase (4 min)
Simple Social Network Analysis: Counting Friends (3 min)
Matrix Multiply Overview (5 min)
Matrix Multiply Illustrated (4 min)
Shared Nothing Computing (4 min)
MapReduce Implementation (5 min)
MapReduce Phases (6 min)
A Design Space for Large-Scale Data Systems (4 min)
Parallel and Distributed Query Processing (5 min)
Teradata Example, MR Extensions (5 min)
RDBMS vs. MapReduce: Features (6 min)
RDBMS vs. Hadoop: Grep (5 min)
RDBMS vs. Hadoop: Select, Aggregate, Join (3 min)
4
Hours to complete
3 hours to complete

NoSQL: Systems and Concepts

NoSQL systems are purely about scale rather than analytics, and are arguably less relevant for the practicing data scientist. However, they occupy an important place in many practical big data platform architectures, and data scientists need to understand their limitations and strengths to use them effectively....
36 videos (166 min total)
36 videos
NoSQL Roundup (4 min)
Relaxing Consistency Guarantees (3 min)
Two-Phase Commit and Consensus Protocols (5 min)
Eventual Consistency (4 min)
CAP Theorem (4 min)
Types of NoSQL Systems (4 min)
ACID, Major Impact Systems (4 min)
Memcached: Consistent Hashing (2 min)
Consistent Hashing, cont'd (4 min)
DynamoDB: Vector Clocks (5 min)
Vector Clocks, cont'd (5 min)
CouchDB Overview (4 min)
CouchDB Views (3 min)
BigTable Overview (5 min)
BigTable Implementation (5 min)
HBase, Megastore (3 min)
Spanner (5 min)
Spanner cont'd, Google Systems (6 min)
MapReduce-based Systems (5 min)
Bringing Back Joins (4 min)
NoSQL Rebuttal (4 min)
Almost SQL: Pig (4 min)
Pig Architecture and Performance (3 min)
Data Model (3 min)
Load, Filter, Group (5 min)
Group, Distinct, Foreach, Flatten (5 min)
CoGroup, Join (3 min)
Join Algorithms (3 min)
Skew (5 min)
Other Commands (3 min)
Evaluation Walkthrough (3 min)
Review (6 min)
Context (3 min)
Spark Examples (5 min)
RDDs, Benefits (6 min)
Hours to complete
2 hours to complete

Graph Analytics

Graph-structured data are increasingly common in data science contexts due to their ubiquity in modeling the communication between entities: people (social networks), computers (Internet communication), cities and countries (transportation networks), or corporations (financial transactions). Learn the common algorithms for extracting information from graph data and how to scale them up. ...
21 videos (91 min total)
21 videos
Structural Analysis (4 min)
Degree Histograms, Structure of the Web (4 min)
Connectivity and Centrality (4 min)
PageRank (3 min)
PageRank in More Detail (3 min)
Traversal Tasks: Spanning Trees and Circuits (5 min)
Traversal Tasks: Maximum Flow (1 min)
Pattern Matching (6 min)
Querying Edge Tables (4 min)
Relational Algebra and Datalog for Graphs (4 min)
Querying Hybrid Graph/Relational Data (3 min)
Graph Query Example: NSA (6 min)
Graph Query Example: Recursion (4 min)
Evaluation of Recursive Programs (3 min)
Recursive Queries in MapReduce (4 min)
The End-Game Problem (3 min)
Representation: Edge Table, Adjacency List (4 min)
Representation: Adjacency Matrix (2 min)
PageRank in MapReduce (5 min)
PageRank in Pregel (5 min)
4.3
142 reviews

Top reviews

By HA, Jan 11th 2016

Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.

By SL, May 28th 2016

I like the breadth of coverage of this class. Each of the exercise is a gem in that I get to learn something new also. I would highly recommend this even to experience practitioner also.

Instructor

Bill Howe

Director of Research
Scalable Data Analytics

About University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Data Science at Scale Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....
Data Science at Scale

Frequently Asked Questions

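  • Once you enroll for a certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can be submitted and reviewed only after your session has started. If you choose to explore the course without purchasing, you may not be able to access certain assignments.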

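  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic course certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.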

More questions? Visit the Learner Help Center.