0:00

In this session, we will introduce streams as a new data structure.

Streams are like lists, but their tail is evaluated only on demand.

We will see how this lets you write very elegant solutions to search problems.

In the previous sessions, you've seen a number of operations on immutable

collections that provide powerful tools for combinatorial search.

For instance, if you wanted to find the second prime number between 1,000 and

10,000, you could write an expression strictly according to this specification.

Go from 1000 to 10000, filter with the isPrime predicate,

and take the second element. This is much shorter than the recursive

alternative, which you see down here, where there's a function secondPrime,

which finds the second prime number in a given interval between from and to, and

that in turn calls a more general function nthPrime, which takes the nth prime number

in, again, an interval between from and to, and that nthPrime has the usual recursive

set up to iterate through the interval between from and to.
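
As a minimal sketch of the two versions just described: the names secondPrimeShort and SecondPrimeDemo and the trial-division isPrime helper are assumptions made for illustration, not necessarily what is on the slide.

```scala
object SecondPrimeDemo {
  // Assumed helper: naive trial division, good enough for a demo.
  def isPrime(n: Int): Boolean =
    n >= 2 && (2 until n).forall(n % _ != 0)

  // Short version: build all primes in the range, then pick index 1.
  def secondPrimeShort(from: Int, to: Int): Int =
    ((from to to) filter isPrime)(1)

  // Recursive version: secondPrime delegates to the more general nthPrime.
  def secondPrime(from: Int, to: Int): Int = nthPrime(from, to, 2)

  def nthPrime(from: Int, to: Int, n: Int): Int = {
    if (from >= to) throw new Error("no prime")
    else if (isPrime(from)) {
      if (n == 1) from else nthPrime(from + 1, to, n - 1)
    } else nthPrime(from + 1, to, n)
  }
}
```

Both SecondPrimeDemo.secondPrimeShort(1000, 10000) and SecondPrimeDemo.secondPrime(1000, 10000) give 1013, but only the recursive one stops searching as soon as it has found it.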

Feasible, but definitely much more arcane and less elegant than the simple expression

here. However, the shorter expression also has a

serious problem. Its evaluation is very, very

inefficient, because what we do here is we construct all the prime numbers between

1000 and 10000, only ever to take the second element.

Presumably there are many more prime numbers between 1000 and 10000.

So you could say, well, maybe my bound, 10,000, is too high, I

should reduce that. But without knowing a priori where the

prime numbers are, I would always risk that the bound is too low and I would miss

finding the prime number. So we are in the uncomfortable position of

either being really bad in performance because the bound is too high, or risking

not finding the prime number at all because the bound is too low.

2:03

However, there's a special trick that can make the short code also efficient.

The idea is that we want to avoid computing the tail of a sequence until

that tail is needed for the evaluation result, and of course, that might be

never. That idea is implemented in a new class

which is called Stream. Streams are similar to lists, but their

tail is evaluated only on demand. So, here's how you can define streams.

They can be built from a constant, Stream.empty, which is the empty value in the

Stream object, and a constructor, Stream.cons.

So to build a stream that consists of the numbers one and two, you could write

Stream.cons(1, Stream.cons(2, Stream.empty)).
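
A small sketch of that construction, using the standard library Stream. The names BuildStreamDemo, xs, ys and tailEvaluated are made up for this illustration. Note that Stream is deprecated since Scala 2.13 in favor of LazyList; the sketch keeps Stream to match the lecture, so newer compilers will print deprecation warnings.

```scala
object BuildStreamDemo {
  // The stream 1, 2 built from Stream.cons and Stream.empty.
  val xs: Stream[Int] = Stream.cons(1, Stream.cons(2, Stream.empty))

  // Flag used only to observe when the tail expression actually runs.
  var tailEvaluated = false

  // Same stream, but the tail expression records its own evaluation.
  val ys: Stream[Int] =
    Stream.cons(1, { tailEvaluated = true; Stream.cons(2, Stream.empty) })
}
```

Right after construction, tailEvaluated is still false; it only becomes true once something asks for ys.tail.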

And, of course, streams can also be defined like the other collections by

using the Stream object as a factory, so you could simply write

Stream(1, 2, 3). A third way to produce streams is with the

toStream method, which can be applied to any collection to produce a stream.

So in the example you see here, we have a range 1 to 1000 and turn it into a

stream with the toStream method. It's interesting to see what the result of

that call is. What we see here is a stream of Int, which is written like this:

it's a Stream(1, ?). What that means is that a stream is

essentially a recursive structure like a list, so we have a one here, but the tail

is not yet evaluated, and that's why the worksheet interpreter has printed a

question mark here. The tail will be evaluated if somebody

wants to know it explicitly. Let's look at stream evaluation in a

little bit more detail using ranges as an example.

Instead of the usual range and then toStream expression, I have decided that I

wanted to work from first principles and I wrote a streamRange function directly

here. So that's the usual recursive function: if

the lower bound is greater than or equal to the upper bound, I return the empty stream.

Otherwise, it's a cons of the lower bound and a recursive call of

streamRange(lo + 1, hi). If you compare that to the function that

does the same thing for lists, here is listRange, and it turns out that the two

functions are actually completely isomorphic.

They have exactly the same structure. Only one returns a stream, the other

returns a list. The empty stream here corresponds to

the Nil case for the lists, and the cons case for the streams corresponds to the

cons operator for the lists. Yet their operational behavior is
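
The two functions, written side by side as a sketch that follows the description above (the wrapper object RangeDemo is only there to make the snippet self-contained; Stream is deprecated since Scala 2.13 but still available):

```scala
object RangeDemo {
  // Stream version: the recursive call sits in the by-name tail position
  // of Stream.cons, so it is not run until the tail is requested.
  def streamRange(lo: Int, hi: Int): Stream[Int] =
    if (lo >= hi) Stream.empty
    else Stream.cons(lo, streamRange(lo + 1, hi))

  // List version: same shape, but :: evaluates its tail immediately,
  // so the whole list is built in one go.
  def listRange(lo: Int, hi: Int): List[Int] =
    if (lo >= hi) Nil
    else lo :: listRange(lo + 1, hi)
}
```

With the test lo >= hi as written, the upper bound hi itself is not included in the result.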

completely different. If we look at listRange first, what would

that function do if we have listRange(1, 10)?

What would the thing do? Well, it would create a list with the one

here and so on, until I reach the upper bound and I have a

Nil. So it would generate the complete list in

one go. Whereas for a streamRange, what would

happen is it would create a first cons pair with a one here and then the rest

would be a ?. So it wouldn't be generated; instead,

there would be an object that would know how to reconstitute the stream if somebody

demands it. If I then take the tail of this

streamRange result, so somebody wants to know, I would

create the second element of the stream, and the third element would be a ?,

and so on, until potentially somebody wants to know all the elements in

this stream, in which case, in the end, I would have the same structure as for the

lists. But before that I would have partially constructed streams that always end in,

essentially, the ?, which stands for an unevaluated stream.

In most respects, streams are actually like lists. In particular, streams support

almost all the methods of a List. So for instance, to find the second prime

number between 1000 and 10,000, the problem we started with, you could simply

write it this way: instead of writing the range directly, we convert the range

to a stream, then we apply the filter method on the stream, and we apply the apply

method on the stream with one as the argument.
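
A sketch of that expression, with an assumed trial-division isPrime that also counts how often it is called. The counter tested and the object name LazySearchDemo are additions for illustration only.

```scala
object LazySearchDemo {
  // Counts how many candidates the predicate has been asked about.
  var tested = 0

  // Assumed helper: naive trial division, instrumented with the counter.
  def isPrime(n: Int): Boolean = {
    tested += 1
    n >= 2 && (2 until n).forall(n % _ != 0)
  }

  // Convert the range to a stream, filter it, take the element at index 1.
  // (Stream is deprecated since Scala 2.13 in favor of LazyList.)
  def secondPrime: Int =
    ((1000 to 10000).toStream filter isPrime)(1)
}
```

After evaluating LazySearchDemo.secondPrime, the counter should be far smaller than the 9001 numbers in the range, because the filter is only driven as far as the second prime.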

There's only one exception where streams don't follow lists, and that's the cons

operator. So if you write x :: xs, that always

produces a list, never a stream. But there is an alternative operator, which

is written #::, which produces a stream.

So x #:: xs is actually the same as Stream.cons(x, xs).

That operator can be used in expressions, as you see here, but also in

patterns. So let's look at the implementation of

Streams. It's actually quite close to the

implementation of lists. So let's start with a base trait: there's

a trait Stream[+A], and it extends Seq[A] just like lists do, and it has the same

fundamental operations as lists, namely isEmpty, head and tail.

And again, as for lists, all other methods can be defined in terms of these three

fundamental ones. So if we look at the concrete implementations

of streams, then these also follow closely the ones for lists.

There's one difference, however: for streams, the concrete implementations

are defined as members of the Stream object, so that's why we write

Stream.empty, which corresponds to the list Nil, and Stream.cons,

8:07

which corresponds to the :: class. The implementations of empty and cons

are actually very close to the ones in List. So cons would have the following

implementations of isEmpty, head and tail: isEmpty is false,

head is the first parameter you pass, and tail is the second parameter you pass in.

For empty, it's the same thing as for lists again: isEmpty is true, and head and

tail throw exceptions because, of course, they're not defined for empty

streams. So the one thing where the cons class and

the cons method here differ fundamentally is this here.

So for the cons method for streams, the tail parameter is a by-name parameter,

as is shown by the leading arrow here.

Whereas for the list cons class, the tail parameter is a normal call by value

parameter. That's the only difference that matters

between streams and lists, and that's the only thing that explains this dramatic

difference in runtime behavior. So because tail is a call-by-name

parameter, it means that when I first construct a cons cell for a stream, the

tail is not evaluated. It will be evaluated the first time

somebody dereferences the name tail, and that's here.

So if somebody calls the tail method, the tail parameter will be dereferenced

and the rest of the stream will be evaluated.
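
A minimal sketch of such an implementation. It is named MyStream here, an assumed name chosen to avoid clashing with the library Stream; it leaves out the Seq parent and all derived methods, and it is not the actual library source.

```scala
object MyStreamDemo {
  // Base trait with the three fundamental operations.
  trait MyStream[+A] {
    def isEmpty: Boolean
    def head: A
    def tail: MyStream[A]
  }

  object MyStream {
    // The leading arrow makes tl a by-name parameter: the argument
    // expression is not evaluated when cons is called.
    def cons[T](hd: T, tl: => MyStream[T]): MyStream[T] = new MyStream[T] {
      def isEmpty = false
      def head = hd
      def tail = tl // evaluates the argument expression when tail is called
    }

    // The empty stream: head and tail are undefined and throw.
    val empty: MyStream[Nothing] = new MyStream[Nothing] {
      def isEmpty = true
      def head = throw new NoSuchElementException("empty.head")
      def tail = throw new NoSuchElementException("empty.tail")
    }
  }
}
```

In this sketch tail is a plain def, so the by-name argument is re-evaluated on every call to tail; nothing is cached.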

That's it. The other stream methods are then

implemented analogously to the list counterparts.

So for instance here, you see the filter method, it does the usual thing.

If the stream is empty, it returns it. If the head satisfies the predicate p, then

we do a cons(head, tail.filter(p)), and otherwise we do a tail.filter(p).
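
The same logic can be sketched as a standalone function over the library Stream. In the lecture filter is a method of the stream class; the free function filterStream and the object FilterDemo are assumptions made so that the snippet is self-contained.

```scala
object FilterDemo {
  // Filter written with isEmpty, head, tail and Stream.cons only.
  // (Stream is deprecated since Scala 2.13 in favor of LazyList.)
  def filterStream[T](s: Stream[T], p: T => Boolean): Stream[T] =
    if (s.isEmpty) s
    // The recursive call is the by-name tail argument of cons,
    // so filtering the rest is postponed until the tail is requested.
    else if (p(s.head)) Stream.cons(s.head, filterStream(s.tail, p))
    // A rejected head is skipped right away to find the next candidate.
    else filterStream(s.tail, p)
}
```

Calling filterStream on the stream of 1 to 10 with an "is even" predicate tests only 1 and 2 before returning; the remaining elements are tested as the result is traversed.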

So what in particular happens here is that if I do a filter on a stream whose head

satisfies the predicate, then I do a computation tail.filter(p) here, but that

computation is the second parameter, the tail parameter, of a cons construction.

So that means that the evaluation of filter down the spine of the stream will

be delayed again, until somebody wants to find out what the result of taking the

tail of the result stream is. So here's a little quiz where you can test

whether you understood how streams evaluate.

I have modified the streamRange method by adding a print statement here that prints

out the lower bound every time streamRange is called.

So, when you now write streamRange(1, 10) with this method and then take(3) and then

toList, what would you expect gets printed?

Nothing, or one of these results here? So let's see how we would reason about it.

As you have seen, when we take streamRange(1, 10), we just evaluate the

first element here and the rest is as yet unknown.

The take method on streams, if we look at its definition and evaluate it,

would do nothing special; it would essentially again return a stream where

the only known node is the first element. But then if we finally convert the stream

to a list, then of course we need to force it, because a list can't be left

unevaluated. So by the time we do that, we create a

list with three elements, one, two, three, and the rest is Nil.

That's the result. And to produce that list, we have to go

down three elements in the original argument stream, the streamRange.

So what I expect is that we would print one, two, three. And you can test that

yourself by submitting the streamRange program and this expression to a worksheet.
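
One way to try this outside a worksheet is sketched below. Instead of printing, this variant records each lower bound in a buffer so the trace can be inspected afterwards; the buffer visited and the object QuizDemo are assumptions added for this illustration.

```scala
import scala.collection.mutable.ListBuffer

object QuizDemo {
  // Stands in for the print statement: remembers every lo that was seen.
  val visited = ListBuffer.empty[Int]

  // streamRange with tracing added at the start of every call.
  // (Stream is deprecated since Scala 2.13 in favor of LazyList.)
  def streamRange(lo: Int, hi: Int): Stream[Int] = {
    visited += lo
    if (lo >= hi) Stream.empty
    else Stream.cons(lo, streamRange(lo + 1, hi))
  }

  // The quiz expression: converting to a list forces the three elements.
  val result: List[Int] = streamRange(1, 10).take(3).toList
}
```

Here QuizDemo.result is List(1, 2, 3), and the recorded trace in QuizDemo.visited matches the reasoning above: 1, 2, 3.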