0:00
[MUSIC]
Okay, welcome to one of my favorite segments in the entire course,
whereby extending our definition of pattern-matching just a little bit,
we're gonna learn the truth about a lot of what we've been doing.
So what we're about to find out is that we've been using a lot more syntactic
sugar in our ML programs than we even realized.
That in fact, every value binding and
every function binding can use pattern-matching, and
that in ML, as a result, every function takes exactly one argument.
Not zero, not two, not three.
And this is a very flexible, elegant way to design your language.
But before we get to that great punch line,
I need to extend our definition of pattern-matching a little bit.
So so far, we used pattern-matching only on our one of types, on our data type
bindings, and that's because we needed some way to access the values.
The only way we could take something that had one of our data type types,
see which constructor it was built from, and
get the underlying pieces out, was with pattern-matching.
And that's still the case, but
it turns out that pattern-matching also works for each-of types.
It works for records, and since tuples are always just syntactic sugar for
records, it works for tuples as well.
So let me tell you how that works, and then I'll show you a number of examples
which we'll use in increasingly good style as we go along.
So here's a new kind of pattern.
What you can do, is in a place where you put a pattern,
you can put a tuple pattern, which looks a little bit like a tuple expression,
except between the commas we have variables.
And what that's going to do, is if you pattern match one of those expressions
against a tuple value, it will bind.
It will introduce a local variable where x1 is the first component of the tuple,
x2 is the second component of the tuple and so on,
up to xn being matched against the last component of the tuple.
And the type checker will insist that the number of pieces of your tuple correspond
to the number of variables in your pattern.
You're not allowed to mess that up.
Similarly for records, we have a new kind of pattern where you write field name
equal variable name, separated by commas inside of braces, and
that matches a record value that has those same field names.
As usual the order of the field names doesn't matter.
And what happens when you do this match, is the variable x1 will be bound to
the value in the f1 field, x2 in the f2 field, and so on, all right?
So that was a little abstract, let's see some examples.
This is poor style, but it helps explain how the pattern-matching works, and
we'll get to the better style in just a second.
2:48
So first, suppose I had a triple, a threetuple, where each component
was an integer, and I just want a little function to add those three numbers up.
So what I could do, is I could have a function sum_triple,
that took in some argument, call it triple.
Then I could have a case expression where my pattern
matches against tuples that have three parts.
And in fact, since I use x, y, and z to bind to the corresponding three pieces
of data inside that triple, I can use those variables on the right-hand side,
and since I add them together, those pieces must all have type int.
And so the type of sum_triple is int star int star int, the type of three,
in tuples holding three integer things, and the return type is int, all right?
And similarly here's a function full_name.
It takes a record where you have fields first, middle, and last.
Here I'm concatenating them all.
This caret character is ML's operator for concatenating strings, so
in fact, the three fields are going to have to hold data of type string.
And this pattern here is extracting, it's binding to x whatever is in the first
field, binding to y whatever is in the middle field,
and binding to z whatever is in the last field.
So we're using pattern-matching to get the pieces out,
we're just doing it on each-of types, instead of one of types.
Okay, so
now let me tell you another feature that gives us a better way to write this down.
So it turns out, ever since the very beginning of section one,
I told you that val-bindings were written by val variable name equals expression.
And it turns out that's true, but it's not the whole truth.
That what you can actually write is val pattern equals expression,
and variables are just a special kind of pattern, where the variable matches
against the entire result of evaluating the expression e.
But if you put a different pattern there,
then it will match against the expression e as well, and extract the various pieces.
This is great for extracting the pieces of an each-of type,
you can even only get some of the pieces out that you need.
It's bad style to do this where that e
is one of the constructors of a data type binding.
And the reason is, this is gonna be like a one-arm case expression,
a case expression with only one branch.
That's great for each-of types, cuz you only want one branch, but for
data types you really want all the different branches.
If you just put one here, you will get an error at run time if you're wrong,
and you're giving up the advantage of using case expressions where the type
checker makes sure you cover all the possibilities.
Okay, so here's how our examples would look now.
Instead of using a one-arm case expression,
which is bad style, let's use a let expression instead.
So now my function sum_triple is gonna take in this triple, and
what I'm gonna do is pattern match it with this val binding here.
So that will indeed evaluate triple.
It will look up triple in a dynamic environment, it's gonna get some value,
and it's gonna make sure that x is bound to the first piece, y to the second piece,
z to the third piece.
Then x, y and z will all be local variables that are in scope in this let
expression, and I can add them together.
So the type checker will make sure that indeed triple is a tuple with three parts,
all of type int,
because otherwise the rest of this expression wouldn't type check.
And my full_name function works similarly, except now I have a record pattern, and
since I'm concatenating x, y, and z, r has to be a record that has
exactly the fields first, middle, and last, all holding strings.
So this is actually quite nice style, but I'm gonna show you something even better.
So here's the final version of how I wanna write these things, and
I have to tell you one more truth about ML.
6:47
It turns out that a function argument can be a pattern and not just a variable.
So the same way I used to tell you that function bindings looked like
name of function variable equals body,
it turns out what goes right there is actually a pattern.
Again if it's just a variable,
it's gonna be matched against the entire argument to the function, but
if it's a pattern, we'll go ahead and extract pieces of the argument.
So now what I have are these beautiful functions.
Sum_triple takes an argument that will then be pattern-matched
against this tuple pattern, binding x to the first piece, y to the second piece, z
to the third piece, and then the body can do whatever it wants, add them together.
Full names similarly,
will have to be passed something that can match against this pattern.
Only record expressions with first, middle and last fields will match appropriately,
and since x, y and z are used as strings in the body,
the type checker will require full_name to take a record where first, middle,
and last all hold values of type string, okay?
So now that we can do this, and now that we have pattern-matching for
extracting pieces of tuples and
records, we can write this much more concise, elegant, simple code.
And to make sure you get practice with this, we're gonna have a rule on Homework
2, which says that you're not allowed to use anything with the # character,
this pound character on your keyboard, Shift+3 for me.
That's how we used to get pieces of tuples out, with hash one and hash two,
how we used to get pieces of records out with hash fieldname, but
on Homework 2 we want you to instead always use pattern-matching,
cuz that's what we're working on.
And one of the nice benefits, is because you're writing down these patterns,
the type checker will always be able to figure out the type of thing you're
matching against, so you will no longer need to write down any explicit types for
the arguments to your functions or any other variables that you're using, okay?
So now that we know how to do this, let me switch over to the code and
show you something really interesting.
So here in the file I have the three versions of both full_name and
sum_triple we've been working on.
So this first version is bad style cuz it uses one-armed branches which don't make
a lot of sense.
The second version is okay because it uses a val binding instead,
and then this third version is the simple one where in fact,
we put the pattern directly in the function argument.
Now these are just three ways of implementing the same thing, so
one is really syntactic sugar if you like, for another.
So if I go over here and run this, or
at least load all these things in, you'll see that
all three versions of sum_triple have type give me a tuple that takes three ints and
returns an int, and that's the type of all three versions of sum_triple you see here.
And similarly for full_name, they always take in a record of first, last,
middle, string and return a string.
And indeed, these are three ways of writing the same thing, so
you get a 12 if you do it that way, and I'll try one other here.
You get a 12 if you do it this way.
So in all cases take a triple, return an int.
But, I hope some of you are almost ready to jump out of your chair saying,
wait a minute, wait a minute, wait a minute,
let's look at this third possibility here, this sum_triple.
That looks to me like a function that takes three arguments.
That's how we used to write things that take three arguments, and
now you're telling me that it's a function that takes one triple as an argument, so
how can ML tell which of it I'm trying to do?
When I define the function there and also, oh sorry,
when I call it as I meant to here with the third version, not the first one,
how does it know to pass in one triple and pattern match against it,
instead of passing in three arguments, a 3, a 4, and a 5, okay?
So let me put this another way.
Here on the top of the slide is a function that takes one triple of type int*int*int,
and returns an int, and
it does it via pattern-matching, as I've just taught you in this segment.
Here on the bottom, is a function that like we learned a long time ago,
takes three arguments, x, y, and z, and adds them together and returns it.
So if you see some difference between the top half and
the bottom half, you're seeing something I don't, because they're exactly the same.
And in fact, Every function in ML takes one argument, not three, and
we've actually been using pattern-matching all along, I just didn't tell you.
So here is the big truth.
In ML, every function takes exactly one argument, and
what we have been calling multi-argument functions are really
just functions that take one argument that is a tuple.
And those functions are implemented by using pattern-matching on that tuple so
that you get the different pieces of that tuple out.
This is extremely elegant and flexible, and I'm not gonna kid you.
Everyone when they talk about ML code, talks about a multi-argument function,
a three argument function, a four argument function, and so on.
But we know that what's really going on in terms of the language,
is that that's just syntactic sugar.
We're passing one argument that is a tuple, and then we're using
pattern matching to get the pieces out, and this is actually a useful thing to do.
So I have a short example here on the slide that I'll use to finish up
this segment.
Here is a very simple program to rotate the pieces of a tuple.
So how about I just implement that real fast over here, rotate_left
(x,y,z) = (y,z,x), all right?
So now if I call rotate_left (3,4,5),
and I actually spell it correctly, I get (4,5,3), all right?
So I took in some tuple, and I returned a tuple, and that's all great, but
what I could do is I could take that thing that was returned, and pass it into
another function, like rotate_left, in which case I would get (5,3,4),
or directly into say sum_triple3, in which case I should still get
12 because the order doesn't matter when you're summing things up.
This is something you can't do in a lot of programming languages.
It's usually very awkward to return multiple things from one function, and
immediately pass them to another, but in ML every function takes and
returns one argument.
Rotate_left takes one triple as an argument, so
if I have some expression that returns a triple, like rotate_left does,
I can immediately take that thing and pass it to another function.
So I have another example here on the slide which is,
I could write a rotate_right function that takes in some triple t,
and just implements it by rotate_left(rotate_left t),
cuz it turns out if you have a triple, two left rotations is one right rotation.
So it's a silly example, but it's nonetheless an elegant example of how
having this policy of every function takes and returns one argument, it becomes much
easier to take the results of one function and pass them directly to another.
So now we know the truth about ML functions.
Everything takes one argument.
This is one of those great segments where we made our language bigger by adding new
kinds of patterns, pattern-matching over each-of types, but
as a result we made our language smaller, because we now know that there's no need
to have a special concept of multi-argument functions.