0:00
[MUSIC] In this segment we're going to port our ML list library to C, which is a
language that does not have objects or closures.
We're going to use some fairly advance C, so if you're not an expert C programmer,
then maybe some little parts that don't make sense to you, but perhaps you can
still get the high level idea. If your'e more novice C programmer, you
might want to leave this segment for another day.
But give it a try and see if you get the sense of it.
So as I mentioned, closures and objects during the programming objects have
something that's seed is not built in by default.
And that's that you can have parts of your things, your closures your objects,
that do not show up in their type. The fields of an object that implement an
interface, not relevant to the interface you're implementing, and for everyone,
the ML function types do not mention the types of the private fields, the things
that happen to be in the environment. Now in C, in case you didn't know this,
functions really are first class in the sense that you can pass them around, you
can put them in data structures, you can pass one function to another.
But they're just function pointers. Closures in ML have code and environment.
Function pointers in C are only the code. So if you've just passed the function
pointer with the normal sort of arguments that you expect, it will only be able to
use those arguments and any global variables.
It has no access to any other environment, certainly not the func the
environment where the function was defined.
And in fact functions have to be defined at top level.
Right, you can't define one function inside of another.
Although some C compilers support some extensions to allow that.
and so we have to do this ourselves, and so there are various ways to do this, but
the common technique, the technique I usually see in idiomatic C code, is that
whenever you're using function pointers, have them take an extra argument and that
extra argument should be the environment and it can be some, whatever is needed
and it should be passed along with the function pointer and then passed back to
the function pointer when the function pointer is used.
Now I'm going to show you an example of that.
The other high level idea in our code, is that we don't have type variables.
We don't have quote A, quote B. We don't have generics like in java, or C
sharp, or Skala. And so, we're going to have to just use
this type void star, which is going to lead to lots of casts between types.
Because we don't really have a better choice.
Okay? So with that, let's look at the code.
here I'm just defining a little list type.
It's just something with two fields. A head field and a tail field.
My head field, there's no good type for it.
because I want it to be generic, so I'll just use void star.
My tail will be a pointer to another list.
I have a little helper function here for constructing lists.
It just takes in the field the head and the tail.
mallocs allocates room, initializes the field and returns a pointer to the
allocated thing. I am using null here, or some sort of
sentinel. Zero, if you will for representing the
empty list. Okay? So that's the end of the simple
stuff. Now our library has to implement map
filter and length. So, here is map.
And the syntax for function pointers in C is not something I've ever been
particularly fond of. But it gets the job done.
I want to take in a function argument. And the list argument.
We're used to that, okay? What we're generally used to is that the, we have
this list, and this function would take one argument, which would be the element
of the list, and return one argument, which is the thing to put in the new
list. But in, if you just do it that way, if
you just add one argument here, and you don't have the second argument I'm about
to explain, that function pointer cannot be a full closure.
It can only refer to its argument. That will work for some things.
Like our example where we want to double every element to the list.
because we can multiply by 2, if nothing else.
But it won't work for other clients. Like, our account ends, because there's
no way to get that value env back to f. So, the idiom in C is to always have an
extra argument in 2 places. f takes an extra argument and map takes
an extra argument. And what map is going to do is, every time it calls f, it's
just going to pass env back, and so whatever the caller to map needs f to
know about, those private fields from our closure aren't so private anymore.
They're now in this env argument, and then we can pass it to f, and that makes
map, much more useful. Now again, we're using the type void
star, because we have no idea what the type of env should be, and that's going
to lead to a lot of annoying type casts when we go to use map.
And in C that's unavoidable if you want map to work for lists and functions of
any type. All right, so given all that, the actual
body of map is pretty easy if you write it in a recursive way.
It's not particularly conventional to write a lot of recursive functions in C.
it's considered less efficient. But it's much more elegant.
I'm much more confident this is correct than if I wrote a complicated version
with while loops and pointers and all that sort of stuff.
So, if xs is null, the empty list is passed in.
Return the empty list. Otherwise, make a new list.
Out of calling f not just with the head of the list, but also with env and then
the tail of the list is mapping f with the same env across xs arrow tail, so
that is map. Filter is similar, the function we want
pass in should now return a bool or an int if you preferred.
Bool is now a standard in C. again I want this have to take not just
the list element which I will put right here but also an environment which is the
second argument to filter. And then of course filter needs to take
in a list. And, if given the empty list, it returns
the empty list. Here, I call this predecet function f
with the environment passing back whatever data it needs and the head of
the list. If that returns true, then make a list
that includes the head where the tail is filtering f across the tail.
Otherwise, just filter with f across the tail, alright? for length, there's really
no reason for recursion, no function pointers, this is pretty easy.
I went ahead and allowed myself to use a mutable variable and a while loop and
just walk down the list, xs = xs->tail, incrementing ans and then, returning it.
mutating local variables is a, it's a fairly reasonable thing to do in
C, and actually in any programming language.
Okay? So that is our library, and the thing I really want to emphasize before
we get to the clients of the library, is that whenever you're using function
pointers in C, I don't want you to just take in the
argument like the head of the list that you think the function needs.
Always add one extra argument, like you saw in the code and you see in this
second yellow box on the slide, and have the caller take in that extra argument.
Now, if you don't want that extra argument, you can code this up with strux
and other things, but always have the function pointer take the extra argument.
Now why am I emphasizing this? It turns out list libraries like this are not
common in C, okay? People don't write list libraries like this, they just kind
of live without map and filter to be honest with you.
But they do have callbacks, a different closure idiom that I think is just as
important. And when you're writing interfaces and
libraries that use callbacks, please do the same thing.
Let the functions that are registered as callbacks have access to an environment,
so they have the private state they need. Conversely, if you're using these
libraries, and you see these extra void* arguments, now you know why they are
there, and hopefully you can use them effectively to get the benefit of
closures, with the tiny detail that you have to cast to and from void* all over
the place, because the library implementer can't predict what type you
need for your private data. So let's see that typecasting issue by
coming back here and seeing some clients. here is my client of map.
So this is a function that takes in a list, presumably of integers that should
really be documented somewhere, that we assume xs here holds ints and returns a
list of ints where every element is doubled and what I want to do is I want
to call map with this function double int I defined right here.
So that's how you pass a function pointer in C.
Just like ML, we've defined a function. Now we can pass it to another function.
We don't need an environment here. So, double int can just ignore it.
So let's just pass a null for that argument, map takes requires that
argument there. We just don't care what it is and, of
course, the list. And so double int to be the type that map expects, has to take in
a void star for the argument. I've called it ignore, because I'm not
going to use it, and a void star for the list element, which we know will be an
int, but that's not the type that C expects, so now, in the body, we have to
take that i, cast it to an inpointer, then multiply it by 2.
The cast it back to avoid star. If you do that, you'll avoid all compiler
warnings related to the types. and then that was double all.
Now we need to count ns. So, the way we did it in ML was we
filtered our list, so that it only held the elements that were equal to n to
begin with. And then we took the length of that, so
I'm doing the same thing here, length of filter of something, so we just need to
understand the argument to filter. Clearly, we pass in the list, we pass in
this function pointer, is N, that just decides if, The argument is given, is the
same as aha, this n that were passing in the
environment. The private data we need here is n.
That was what we use from our environment in ml, and now we're going to pass it as
this extra argument. And so now, here's our function, is n.
It takes an environment, which will be the n we're looking for.
And the next element of the list, which is i.
Those both, those, have to have type void* to appease the type checker.
So we cast them both to ints. This is the type you're supposed to use
when casting to and from pointers. We see if they're equal.
And we return the corresponding. In bool.
Alright. Bottom line.
Extra argument to your function pointers. Get used to casting to and from void
star. And you can actually use closures in C by
manually constructing the environments as you need them.
If you think that's painful, then you agree with me that you want a language
that helps you automatically construct your closures so that you can use them
without all this pain.