0:00

In this video I'll talk about various aspects of the course, the topics that

we'll cover, the kinds of skills you can expect to acquire, the kind of background

that I expect, the supporting materials and the available tools for self

assessment. Let's start with the specific topics that this course is going to cover.

The course material corresponds to the first half of the ten week Stanford

course. It's taken by all computer science undergraduates, as well as many of our

graduate students. There will be five high level topics, and at times these will

overlap. The five topics are first of all, the vocabulary for reasoning about

algorithm performance, the design and conquer algorithm design paradigm,

randomization and algorithm design, primitives for reasoning about graphs, and

the use and implementation of basic data structures. The goal is to provide an

introduction to and basic literacy in each of these topics. Much, much more could be

said about each of them, than we'll have time for here. The first topic is the

shortest, and probably also the driest. But it's a prerequisite for thinking

seriously about the design and analysis of algorithms. The key concept here is big-O

notation, which, conceptually, is a modeling choice about the granularity with

which we measure a performance metric like the running time of an algorithm. It turns

out that the sweet spot for clear high level thinking about algorithm design, is

to ignore constant factors and lower-order terms. And to concentrate on how well

algorithm performance scales with large input sizes. Big O notation is the way to

mathematize this sweet spot. Now, there's no one silver bullet in algorithm design.

No single problem solving method that's guaranteed to unlock all of the

computational problems that you're likely to face. That said, there are a few

general algorithm design techniques. High level approaches to algorithm design that

find successful application across a range of different domains. These relatively

widely applicable techniques are the backbone of a general algorithms course

like this one. In this course, we'll only have time to deeply explore one such

algorithm design paradigm, namely that of the divide and conquer algorithms. In the

sequel course as we'll discuss, there's two other major algorithms on paradigms to

get covered. But for now, divide and conquer algorithm, the idea is to first

break the problem into smaller problems which then gets solved recursively, and

then to somehow quickly combine the solutions to the sub problems into one for

the original problem that you actually care about. So for example, in the last

video. We saw two algorithms of this sort, two divide and conquer algorithms from

multiplying two large integers. In later videos we will see a number of different

applications. We'll see how to design fast divide and conquer algorithms for problems

ranging from sorting to matrix multiplication to nearest neighbor-type

problems and computation of geometry. In addition, we'll cover some powerful

methods for reasoning about the running time of recursive algorithms like these.

2:37

As for the third topic. A randomized algorithm is one that, in some sense,

flips coins while it executes. That is, a randomized algorithm will actually have

different executions if you run it over and over again on a fixed input. It turns

out, and this is definitely not intuitive, that allowing randomization internal to an

algorithm, often leads to simple, elegant, and practical solution to various

computational problems. The canonical example is randomized quick sort, and that

algorithm and analysis we will cover in detail in a few lectures. Randomized

primality testing is another killer application that we'll touch on. And we'll

also discuss a randomized approach to graph partitioning. And finally

we'll discuss how randomization is used to reason about hash functions and hash maps.

One of the themes of this course, and one of the concrete skills that I hope you

take away from the course, is, literacy with a number of computational primitives

for operating on data, that are so fast, that they're, in some sense, essentially

free. That is, the amount of time it take to invoke one of these computational

primitives is barely more than the amount of time you're already spending just

examining or reading the input. When you have a primitive which is so fast, that

the running time is barely more than what it takes to read the input, you should be

ready to apply it. For example, in a preprocessing step, whenever it seems like

it might be helpful. It should just be there on the shelf waiting to be applied

at will. Sorting is one canonical example of a very fast, almost for-free

primitive of this form. But there are ones that operate on more complex data as well.

So recall that a graph is a data structure that has, on the one hand, vertices, and

on the other hand, edges. Which connects pair of vertices. Graphs model, among any

other things, different types of networks. So even though graphs are much more

complicated than mere arrays, there's still a number of blazingly fast primitives

for reasoning about their structure. In this class we'll focus on primitives for

competing connectivity information and also shortest paths. We'll also touch on how some primitives have been

used to investigate the structure of information in social networks. Finally,

data structures are often a crucial ingredient in the design of fast

algorithms. A data structure's responsible for organizing data in a way that supports

fast queries. Different data structures support different types of queries. I'll

assume that you're familiar with the structures that you typically encounter in

a basic programming class including arrays and vectors. Lists, stacks, and queues.

Hopefully, you've seen at some point both trees and heaps, or you're willing to read

a bit about them outside of the course, but we'll also include a brief review of

each of those data structures as we go along. There's two extremely useful data

structures that we'll discuss in detail. The first is balanced binary search trees.

These data structures dynamically maintain an ordering on a set of elements, while

supporting a large number of queries that run in time logarithmic in the size of

the set. The second data structure we'll talk a fair bit about is hash tables or

hash maps, which keep track of a dynamic set, while supporting extremely fast

insert and lookup queries. We'll talk about some canonical uses of such data

structures, as well as what's going on under the hood in a typical

implementation of such a data structure. >> There's a number of

important concepts in the design and analysis of algorithms that we won't have

time to cover in this five week course. Some of these will be covered in the

sequel course, Design and Analysis of Algorithms II, which corresponds to the

second half of Stanford's ten week course on this topic. The first part of this

sequel course focuses on two more algorithm design paradigms. First of

all, the design analysis of greedy algorithms with applications to minimum

spanning trees, scheduling, and information theoretic coding. And

secondly, the design analysis of dynamic programming algorithms with example

applications being in genome sequence alignment and the shortest path protocols

in communication networks. The second part of the sequel course concerns NP complete

problems, and what to do about them. Now, NP complete problems are problems that,

assuming a famous mathematical conjecture you might have heard of, which is called the

"P not equal to NP" conjecture, are problems that cannot be solved under this

conjecture by any computationally efficient algorithm. We'll discuss the

theory of NP completeness, and, with a focus on what it means for you as an

algorithm designer. We'll also talk about several ways to approach NP complete

problems, including: fast algorithms that correctly solve special cases; fast

heuristics with provable performance guarantees; and exponential time

algorithms that are qualitatively faster than brute force search. Of course there

are plenty of important topics that can't be fit into either of these two five-week

courses. Depending on the demand, there might well be further courses on more

advanced topics. Following this course is going to involve a fair amount of time and

effort on your part. So it's only reasonable to ask: What can you hope to get

out of it? What skills will you learn? Well. Primarily, you know, even though

this isn't a programming class per se, it should make you a better programmer.

You'll get lots of practice describing and reasoning about algorithms, you'll learn

algorithm design paradigms, so really high level problem-solving strategies that are

relevant for many different problems across different domains, and tools for

predicting the performance of such algorithms. You'll learn several extremely

fast subroutines for processing data and several useful data structures for

organizing data that can be deployed directly in your own programs. Second,

while this is not a math class per se, we'll wind up doing a fair amount of

mathematical analysis. And this in turn will sharpen your mathematical analytical

skills. You might ask, why is mathematics relevant for a class in the design and

analysis of algorithms, seemingly more of a programming class. Well let me be clear.

I am totally uninterested in merely telling you facts or regurgitating code

that you can already find on the web or in any number of good programming books. My

goal here in this class, and the way I think I can best supplement the resources

that you probably already have access to is to explain why things are

the way they are. Why we analyze the algorithms in the way that we do, why

various super fast algorithms are in fact super fast, and so on. And it turns out

that good algorithmic ideas usually require nontrivial mathematical analysis

to understand properly. You'll acquire fundamental insights into the specific

algorithms and data structures that we discuss in the course. And hopefully, many

of these insights will prove useful, more generally, in your other work. Third, and

perhaps the most relevant for those of you who work in some other discipline: this

course should help you learn how to think algorithmically. Indeed after studying

algorithms it's hard enough not to see them pretty much everywhere, whether you

are riding an elevator, watching a flock of birds, buying and selling stocks out of

your portfolio, even watching an infant learn. As I said in the previous video

algorithm thinking is becoming increasingly useful and prevalent if you

are outside of computer science and technology like in biology, statistics and

economics. Fourth, if you're interested in feeling like a card carrying computer

scientist, in some sense, then you'll definitely want basic literacy in all of

the topics that we'll be covering. Indeed, one of the things that makes studying

algorithms so fun, is, it really feels like you're studying a lot of the greatest

hits from the last 50 years of computer science. So, after this class, no longer

will you feel excluded at that computer science cocktail party when someone cracks

a joke about Dijkstra's Algorithm. Now you'll know exactly what they mean.

Finally, there's no question that studying this material is helpful for technical

interview questions. To be clear, my sole goal here is to teach you algorithms, not

to prepare you for interviews, per se. But over the years, countless students of mine

have regaled me with stories about how mastering the concepts in this class

enabled them to ace every technical question they were ever asked. I told you,

this is fundamental stuff. So, what do I expect from you? Well, honestly, the

answer is nothing. After all isn't the whole point of a free online class like

this one that anyone can take it and devote as much effort to it as they like.

So that said, as a teacher it's still useful to have one or more canonical

students in mind. And I thought I'd go ahead and be transparent with you about

how I'm thinking about these lectures. Who I have in mind that I'm teaching to. So

again, please don't feel discouraged if you don't conform to this canonical

student template. I'm happy to have the opportunity to teach you about algorithms

no matter who you are. So first, I have in mind someone who knows at least some

programming. For example, consider the previous lecture. We talked about a

recursive approach to multiplying two numbers and I mentioned how in certain

mathematical expression, back then we labeled it star and circled it in green.

How that expression naturally translated into a recursive algorithm. In particular,

I was certainly assuming that you had some familiarity with recursive programs.

If you feel comfortable with my statement in that lecture, if you feel like you

could code up a recursive integer multiplication algorithm based on the high

level outline that I gave you, then you should be in good shape for this course.

You should be good to go. If you weren't comfortable with that statement, well, you

might not be comfortable with the relatively high conceptual level at which

we discuss program in this course. But I encourage to watch the next several videos

anyway, to see if you get enough out of them to make it worth your while. [sound].

11:24

Now, while I'm aiming these lectures at people who know some programming, I'm not

making any assumptions whatsoever about exactly which programming languages you know. Any

standard imperative language you know, something like C, Java or Python, is

totally fine for this course. Now, to make these lectures accessible to as many

programmers as possible, and to be honest, you know, also to promote thinking about

programming at a relatively abstract conceptual level, I won't be describing

algorithms in any particular programming language. Rather, when I discuss the

algorithms, I'll use only high-level pseudo-code, or often simply English. My

inductive hypothesis is that you are capable of translating such a high level

description into a working program in your favorite programming language. In fact, I

strongly encourage everyone watching these lectures to do such a translation of all

of the algorithms that we discussed. This will ensure your comprehension, and

appreciation of them. Indeed, many professional computer scientists and

programmers don't feel that they really understand an algorithm until they've

coded it up. Many of the course's assignments will have a problem in which

we ask you to do precisely this. Put another way, if you're looking for a sort

of coding cookbook, code that you can copy and paste directly into your own programs.

Without necessarily understanding how it works, then this is definitely not the

course for you. There are several books out there that cater to programmers

looking for such coding cook books. Second, for these lectures I have in mind

someone who has at least a modest amount of mathematical experience though perhaps

with a fair bit of accumulated rust. Concretely I expect you to be able to

recognize a logical argument that is a proof. In addition, two methods of proof

that I hope you've seen before are proofs by induction and proofs by contradiction.

I also need you to be familiar with basic mathematical notation, like the standard

quantifier and summation symbols. A few of the lectures on randomized algorithms

and hashing will go down much easier for you if you've seen discrete probability

at some point in your life. But beyond these basics, the lectures will be self

contained. You don't even need to know any calculus, save for a single simple

integral that magically pops up in the analys of the randomized quick sort

algorithm. I imagine that many of you have studied math in the past, but you could

use a refresher, you're a bit rusty. And there's plenty of free resources out there

on the web, and I encourage you to explore and find some that you like. But one that

I want to particularly recommend is a great set of free lecture notes. It's

called Mathematics for Computer Science. It's authored by Eric Lehman and Tom

Layden, and it's quite easy to find on the web if you just do a web search. And those

notes cover all of the prerequisites that we'll need, in addition to tons of other

stuff. In the spirit of keeping this course as widely accessible as possible,

we're keeping the required supporting materials to an absolute minimum. Lectures

are meant to be self-contained and we'll always provide you with the lecture notes

in PowerPoint and PDF format. Once in a while, we'll also provide some additional

lecture notes. No textbook is required for this class. But that said, most of the

material that we'll study is well covered in a number of excellent algorithms books

that are out there. So I'll single out four such books here. The first three I

mention because they all had a significant influence on the way that I both think

about and teach algorithms. So it's natural to acknowledge that debt here. One

very cool thing about the second book, the one by Dasgupta, Papadimitriou and Vazirani, is that the authors

have made a version of it available online for free. And again, if you search on the

authors' names and the textbook title, you should have no trouble coming up with it

with a web search. Similarly, that's the reason I've listed the fourth book because

those authors have likewise made essentially a complete version of that

book available online and it's a good match for the material that we're going to

cover here. If you're looking for more details about something covered in this

class, or simply a different explanation than the one that I give you, all of these

books are gonna be good resources for you. There are also a number of excellent

algorithm textbooks that I haven't put on this list. I encourage to explore

and find you own favorite. >> In our assignments, we'll sometimes ask you to

code up an algorithm and use it to solve a concrete problem that is too large to

solve by hand. Now, we don't care what program and language and development

environment you use to do this as we're only going to be asking you for the final

answer. Thus, we're not requiring anything specific, just that you are able to write

and execute programs. If you need help or advice about how to get set up with a

suitable coding environment, we suggest that you ask other students for help via

the course discussion forum. Finally, let's talk a bit more about assessment.

Now this course doesn't have official grades per se, but we will be assigning

weekly homeworks. Now we're going to assign homeworks for three different

reasons. The first is just for self-assessment. It's to give you the

opportunity to test your understanding of the material so that you can figure out

which topics you've mastered and which ones that you haven't. The second reason

we do it is to impose some structure on the course, including deadlines, to

provide you with some additional motivation to work through all the topics.

Deadlines also have a very important side effect that synchronizes a lot of the

students in the class. And this of course makes the course discussion forum a far

more effective tool for students to seek and provide help in understanding the

course material. The final reason that we give homeworks is to satisfy those of you

who, on top of learning the course material, are looking to challenge

yourself intellectually. [sound]. Now, this class has tens of thousands of

students. So it's obviously essential that the assignments can be graded

automatically. Now, we're currently only in the 1.0 generation of free online

courses such as this one. So the available tools for auto graded assessment are

currently rather primitive. So, we'll do the best we can, but I have to be honest

with you. It's difficult, or maybe even impossible to test deep understanding of

the design and analysis of algorithms, using the current set of tools. Thus,

while the lecture content in this online course is in no way watered down from the

original Stanford version. The required assignments and exams we'll give you, are

not as demanding as those that are given in the on campus version of the course. To

make up for this fact, we'll occasionally propose optional algorithm design

problems, either in a video or via supplementary assignment. We don't have

the ability to grade these, but we hope that you'll find them interesting and

challenging, and that you'll discuss possible solutions with other students via

the course discussion forum. So I hope this discussion answered most of the

questions you have about the course. Lets move on to the real reason that we're all

here, to learn more about algorithms.