0:00

In the next series of lectures, we'll take a close look at collections, in

particular, immutable collections that are used in purely functional programs.

They are an important tool in the functional programmer's toolbox, because they enable

the expression of algorithms in a very high level and concise way that, in

addition, has a high chance of being correct.

In this session, we'll start with lists. You've already seen lists as a fundamental

data structure last week, where we discussed the core concepts.

This week, we are going to show you how lists are defined in the Scala

standard library, and what kind of operations they support.

So let's see how lists are defined in the Standard Library and what you can do with

them. First, to construct a list having

elements x1 to xn, you simply write List followed by the elements in parentheses: List(x1, ..., xn).

So here you see some examples. A list of fruit consisting of apples,

oranges, and pears. A list of numbers consisting of the

numbers one, to four. And here's the diagonal of a three by

three matrix, represented as a list of lists.

And finally, the last example is the empty list, which is simply written List with open

parens, close parens: List(). Lists are sequences just like arrays are,

and you might know arrays from Java or C but there are two important differences.

The first is that lists are immutable, so you can't change an element of a list.

And the second is that lists are recursive, while arrays are flat.

In fact, the lists you see in the Scala standard library are very much like

the lists that we have constructed from scratch over the last week.

The basic construction is exactly the same.

The data structures are the same but the lists in the Scala library carry many more

operations that you can do with them. In this session and the next,

we are going to find out what these operations are.

3:33

So, let's have a quick look at the list type.

It is essentially the same as what we constructed last week.

So, a list with string elements is written as List[String], a parameterized type.

So, fruit would be a List[String], numbers would be a List[Int], diagonal

three would be a List[List[Int]]. And finally, empty would be a List[Nothing]:

a list that does not contain any elements and therefore has the

bottom type, Nothing, as its element type. Now, we've seen lists constructed

homogeneously, by writing List followed by all the elements of the list, but in fact,

that's syntactic sugar for something more fundamental.
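The four example lists and their types can be sketched as follows. The value names nums and diag3 and the wrapping object ListExamples are assumptions made for illustration; the transcript only names the lists informally.

```scala
// A minimal sketch of the example lists with explicit type annotations.
// The object name and the value names nums and diag3 are illustrative.
object ListExamples {
  val fruit: List[String] = List("apples", "oranges", "pears")
  val nums: List[Int] = List(1, 2, 3, 4)
  // The diagonal of a 3x3 matrix, represented as a list of rows.
  val diag3: List[List[Int]] = List(List(1, 0, 0), List(0, 1, 0), List(0, 0, 1))
  // The empty list has the bottom type Nothing as its element type.
  val empty: List[Nothing] = List()
}
```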

All lists in Scala are constructed from the empty list Nil and the construction

operation, which is written as a double colon, ::, and is pronounced cons.

So the operation x :: xs gives you a new list with first element x,

followed by the elements of xs. So that means that our previous lists of fruit

and numbers and empty can also be written like this.

So fruit would be "apples" :: ("oranges" :: ("pears" :: Nil)).

That corresponds exactly to what we've drawn in the slide before.

Numbers would be written similarly, and finally empty,

the empty list, would just be Nil. So again, note the similarity to what we

did last week with our simple list hierarchy.

It's just that where we were writing new Cons(x, xs),

we now simply write x :: xs. In fact, there's a convention in Scala

syntax which makes the cons operation look nicer.

We have the universal convention that all operators that end in a colon associate to

the right. Whereas all other operators would

associate to the left, as usual. So that means that if you have two double

colons, like here in A :: B :: C, then that's really interpreted as A :: (B :: C),

so the parentheses go to the right. And that means when you write a list like

this, you can omit the parentheses because they would be redundant.

So this list really means the same as what we've written before, with the parentheses

grouping to the right.
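As a small sketch of this, the lists can be built from Nil and :: with and without explicit parentheses. The object name ConsExamples and the value names are assumptions made for illustration.

```scala
// Building lists from Nil and ::, a minimal sketch.
object ConsExamples {
  // Fully parenthesized, grouping to the right.
  val fruit = "apples" :: ("oranges" :: ("pears" :: Nil))
  val nums = 1 :: (2 :: (3 :: (4 :: Nil)))
  val empty = Nil

  // Because :: associates to the right, the parentheses can be omitted.
  val numsNoParens = 1 :: 2 :: 3 :: 4 :: Nil
}
```

Both spellings of nums denote the same list as List(1, 2, 3, 4).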

The second difference for operators that end in a colon

concerns the way they are seen as method calls.

In fact, an operator that ends in a colon is seen as a method call on its right-hand

operand. Remember, all other infix operators are

seen as method calls on their left-hand operand.

So the left-hand operand is the receiver. For operators ending in a colon, this is

reversed, so it's the right hand operand. And that makes a lot of sense, because it

means that an expression like this will really be expanded by the compiler to this

sequence of method calls. So it starts with Nil, followed by the ::

operation, that's the method :: called on Nil, and we pass 4 as the argument to it.

Then we have another call to ::, where we pass 3, then another one

where we pass 2, and finally, we pass 1. So each application of the double colon

cons operation prepends its argument to the list that was constructed so far.
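The expansion described here can be written out by hand. This is a sketch with an illustrative object name, showing the infix form next to the explicit method calls it stands for.

```scala
// The infix form and the explicit method-call form of the same list.
object ConsAsMethodCall {
  val infix = 1 :: 2 :: 3 :: 4 :: Nil
  // Each :: is a method call on its right-hand operand, starting from Nil.
  val explicit = Nil.::(4).::(3).::(2).::(1)
}
```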

So, with this convention, double colon is really the same as the prepend

operation we defined last week. You also know from last week that there

are three fundamental operations on lists. And that all other operations can be

expressed in terms of these three. They are: isEmpty, which finds out whether

a list is empty. It returns true if it's empty, false

otherwise. head, which returns the first element of

the list, or throws an exception if the list is empty. And tail, which returns the list composed of all

the elements except the first. So these operations are as you've seen

last week defined as methods on objects of type list.

So for instance you would write fruit.head and get apples, or you would write

fruit.tail.head and get oranges, the second element in the fruit list.

If you take the diagonal of size three then its head element would be the first

row. So that would be the list of one, zero,

zero. And if you take head of the empty list,

then you would get an exception, a NoSuchElementException, which would

tell you that you have tried to take the head of the empty list.
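These calls can be tried out as follows. The sketch wraps the failing call in scala.util.Try so that the exception can be inspected instead of aborting the program; the object and value names are assumptions made for illustration.

```scala
import scala.util.Try

// head, tail and isEmpty on the example lists, a minimal sketch.
object BasicOps {
  val fruit = List("apples", "oranges", "pears")
  val diag3 = List(List(1, 0, 0), List(0, 1, 0), List(0, 0, 1))

  val first = fruit.head        // first element of fruit
  val second = fruit.tail.head  // second element of fruit
  val firstRow = diag3.head     // first row of the matrix

  // Taking the head of an empty list throws; Try captures the exception.
  val emptyHead = Try(List.empty[Int].head)
}
```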

It's also possible, and in fact often preferred, to decompose lists with pattern

matching. The patterns that you can apply to lists

are exactly the same as the construction methods of lists.

So there is a Nil pattern that corresponds to the Nil constant, and

there's a cons pattern that has a pattern

in front of the double colon and a

pattern afterwards: p :: ps. The idea is that this

pattern matches any list whose head matches p and whose tail matches the

second pattern, ps. And finally there is the shorthand List(p1, ..., pn),

and that's, as usual, the same as the pattern p1 :: ... ::

pn :: Nil. So, let's see some examples.

The first pattern here would match lists whose first element is a 1,

whose second element is a 2, and whose rest is arbitrary and is bound to

the variable xs. The second pattern

here would match lists of length one. The first element can be arbitrary and

is bound to the variable x. The pattern List(x) is exactly the same

as x :: Nil. So again, it matches lists of length one whose

element is bound to the variable x. The pattern List(), open parens, closed parens,

that's the empty list, the same as Nil. And the pattern

List(2 :: xs), what would that be? Well, that would match a list with a

single element, which is another list that starts with 2 and whose tail is bound to

the variable xs. So let's do an exercise.
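The patterns just described can be sketched as small functions, one per pattern. The function names and the object ListPatterns are hypothetical; the patterns themselves are reconstructed from the spoken description.

```scala
// One small function per pattern, a sketch for experimentation.
object ListPatterns {
  // 1 :: 2 :: xs: lists starting with 1 and 2; xs is bound to the rest.
  def restAfterOneTwo(l: List[Int]): Option[List[Int]] = l match {
    case 1 :: 2 :: xs => Some(xs)
    case _            => None
  }

  // x :: Nil: lists of length one.
  def singletonCons(l: List[Int]): Option[Int] = l match {
    case x :: Nil => Some(x)
    case _        => None
  }

  // List(x): the same as x :: Nil.
  def singletonShorthand(l: List[Int]): Option[Int] = l match {
    case List(x) => Some(x)
    case _       => None
  }

  // List(): the empty list, the same as Nil.
  def isEmptyList(l: List[Int]): Boolean = l match {
    case List() => true
    case _      => false
  }

  // List(2 :: xs): a single element that is itself a list starting with 2.
  def tailOfSingleListStartingWithTwo(l: List[List[Int]]): Option[List[Int]] = l match {
    case List(2 :: xs) => Some(xs)
    case _             => None
  }
}
```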

Consider this list pattern here: x :: y :: List(xs, ys) :: zs.

What is the condition that describes most accurately the length L of the lists it

matches? Does it match lists of length three or

four or five? Or any list whose length is greater than

or equal to three, or greater than or equal to four, or greater than or equal to five?

Well, let's have a look at the pattern again.

What we see here is a list with at least three

elements: the first is named x, the second is named y, and the third is a list by itself.

And then, the rest of the list is captured by the variable zs, so that means that the

list must have a length that is greater than or

equal to three. The variable zs might be empty, Nil, in

which case the list would have a length of exactly three, or it might be non-empty, in

which case the length of the list would be greater than three.
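To check this reasoning, the pattern can be tried on lists of different lengths. The sketch below assumes the element type List[Int], so that the third element can itself be a two-element list; the names PatternLength and matches are hypothetical.

```scala
// Testing the exercise pattern against lists of lists.
object PatternLength {
  def matches(l: List[List[Int]]): Boolean = l match {
    // x and y are arbitrary, the third element has exactly two elements,
    // and zs captures whatever follows (possibly nothing).
    case x :: y :: List(xs, ys) :: zs => true
    case _                            => false
  }
}
```

A list of length three or four matches, provided its third element has exactly two elements; a list of length two does not.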

So in any case, the condition that we're looking for would be length >=

3. As another example, let's suppose we want

to sort a list of numbers in ascending order.

In fact, the standard class List in the Scala library has a sort function, but

let's pretend it hasn't and we have to do it ourselves.

So one way to sort the list, say, List(7, 3, 9, 2), would be to first sort the tail of the

list, List(3, 9, 2), which would give us List(2, 3, 9), and

then to insert the head of the list, 7, at the right place.

At the right place means that all elements that precede the inserted

element are smaller or equal, and all elements that follow are larger or equal.

That idea is insertion sort.

We would write it as follows. We would say, well, we have a function

isort, for insertion sort. It takes a list of ints, it gives us back

a list of ints which is the sorted version of xs.

And we would say, okay, if xs is the empty list, then, let's return the empty list.

If xs is a list that consists of at least one element, call that y, and a rest ys,

then what we would do is sort the rest, the tail ys, recursively, and

insert y into the sorted tail. What you've seen here is by far the most

standard way to decompose a list. You would typically ask first, is the list

empty, and if it's not empty, you would ask, well, what is its head and what is

its tail? And all of these questions are expressed

in the two patterns List(), open parens, close parens, and the cons pattern that you

see here. So the definition of insertion sort is not

quite done yet because we still have to define the function insert, which inserts

an element x at the right place in a list xs, which is already sorted.

I'll leave that to you as an exercise. As a hint, we would apply the same

decomposition of lists that we've seen before.
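The state of the definition at this point can be sketched as a skeleton, with isort as described and the body of insert still open. The wrapping object name is an assumption; ??? is the standard library placeholder that throws NotImplementedError when evaluated.

```scala
// Insertion sort with insert left as an exercise.
object InsertionSortSkeleton {
  def isort(xs: List[Int]): List[Int] = xs match {
    case List() => List()
    case y :: ys => insert(y, isort(ys))
  }

  // Same decomposition: one case for the empty list, one for head and tail.
  def insert(x: Int, xs: List[Int]): List[Int] = xs match {
    case List() => ???
    case y :: ys => ???
  }
}
```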

That is the standard decomposition into a case where the list is empty, and a case where it's a

cons of a head and a tail. So all that remains is to fill in the triple

question marks here. Once you've done that, I'd like you to

answer the following question. What is the worst case complexity of

insertion sort relative to the length of the input list N?

What I mean by that is what is the number of steps insertion sort performs in our

substitution model as a function of the length of the input list N.

Does sort always take the same amount of time no matter how large N is?

If so, we would say that sort takes constant time. Or does sort take a number of steps

that's proportional to the length N of the input list? Or is it maybe proportional to N times

logarithm of N or proportional to N squared?

So let's see how we would answer this. Let's first fill in the insert function.

So, inserting an element into an empty list: what would we expect to get back?

Well, that would be the list that contains just the element to be inserted.

Inserting x into a non-empty list y :: ys, well, there would be two cases.

The first case would be that the element to be inserted is, in fact, less

than or equal to the first element of the list.

In that case, we can simply return x followed by xs.

So we know that the element X will be the head element of the new list.

Otherwise, what do we need to do?

Well, in the other case, where x is greater than y, the first element of the result

would be y, because that's the smallest element of all the elements that we've

seen. And that would be followed by

a call of insert of x into the remainder

of the list, into ys. So let's look at the complexity.
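Putting the pieces just described together, the complete insertion sort could be sketched like this. The wrapping object name is an assumption; insert and isort follow the cases discussed above.

```scala
// Insertion sort on lists of Int, following the two-case decomposition.
object InsertionSort {
  // Inserts x into the already sorted list xs, keeping it sorted.
  def insert(x: Int, xs: List[Int]): List[Int] = xs match {
    case List() => List(x)
    case y :: ys =>
      if (x <= y) x :: xs        // x belongs in front of the whole list
      else y :: insert(x, ys)    // y stays first, x goes somewhere in the tail
  }

  // Sorts the tail recursively, then inserts the head at the right place.
  def isort(xs: List[Int]): List[Int] = xs match {
    case List() => List()
    case y :: ys => insert(y, isort(ys))
  }
}
```

For example, InsertionSort.isort(List(7, 3, 9, 2)) first sorts List(3, 9, 2) to List(2, 3, 9) and then inserts 7, giving List(2, 3, 7, 9).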

Now, looking at insert alone first, we see that the worst case would be that the element

x is greater than all the elements of the list,

in which case we would need N recursive calls of the insert function.

So the number of steps of insert would be proportional to N where N is the length of

the list. Going back to insertion sort here.

We ask ourselves how many calls to insert we would expect for a list of length N.

Well the answer is there would be one call for each element that we have in the list.

So that would be another factor of N. So what we would get at the end is that

the complexity of insertion sort is proportional to N squared,

which is actually not very good. So we will see in a couple of sessions

another way to sort lists whose complexity is better.
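The quadratic bound can be made slightly more precise with a short calculation. It is a sketch under one assumption: inserting into a sorted list of length k costs, in the worst case, a number of steps proportional to k + 1 (k recursive calls plus the empty-list case). The worst case arises, for example, when the input is sorted in descending order, since every inserted element is then greater than everything in the sorted tail.

```latex
T_{\mathrm{insert}}(k) \propto k + 1
\qquad\Longrightarrow\qquad
T_{\mathrm{isort}}(N) \propto \sum_{k=0}^{N-1} (k + 1) = \frac{N(N+1)}{2}
```

The sum grows like N squared, which matches the argument of N calls to insert with up to N steps each.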