0:00

[music] When we're writing an interpreter for some language B, an important question

is, what kind of errors, programs in B might have, and how our interpreter should

check for them. So I want to cover that in this segment.

This will help us distinguish between the language B that we are implementing, and

the language A in which we're implementing the interpreter and it's also a

distinction that's important for your homework to know what kind of errors you

need to check for. So here's what we know so far.

We know that if we're going to implement some language B, we're going to define the

syntax for that language abstract syntax using Racket structs.

Then, the language B programmer will write directly in Racket using constructors to

write their language B programs. And, we'll implement our interpreter as

some sort of recursive function like the eval X that we've see for our arithmatic

expression. So now, the important distinction I want

to make is that interpreter can go ahead and assume That the abstract syntax tree

it gets is a legal B program. That it makes sense as a syntax tree and

if it doesn't, then it's fine to just sort of crash and give a strange message.

But the interpreter must check, must not assume that the types of different data

used in the language B program are correct.

In particular, when it recursively evaluates a sub-expression, it needs to

check the result for what kind of thing it got.

So in this segment I'm going to explain what I mean by this.

So what is a legal AST? So these are the sets of trees that the

interpreter must handle and it's a subset of all possible trees that Racket would

let us make. So Racket is a dynamically type language.

So when I declare these four structs for const, negate, add and multiply, there's

nothing except comments and documentation that indicate that this int field should

hold a number. This e field should hold another

expression, which is another legal AST in this language.

E1 and e2 should be legal ASTs in this language, and e1 and e2 for multiply,

should be legal ASTs in this language. We, in implementing our interpreter, can

go ahead and assume that we are given a legal AST.

And if we're not, then we'll just crash and that will be fine.

So, we can assume that const holds a number, negate holds a legal AST and add

and multiply hold 2 legal ASTs. Some examples were then, is not the case

over here at the bottom of the slide. In this first example, I have a multiply

expression, allegedly, but if you look inside the second argument to add here is

a string. That does not make sense, this is not a

programming language B and we don't have to worry about detecting this error.

Similarly, a much more common error that language b programmers tend to make is

writing negate of minus 7 instead of negate of const of minus 7, right?

Negate of minus 7 is not a legal program. Negate is supposed to have an expression

inside of it. So, we should say negate of constant minus

7. And again, our interpreter can go ahead

and assume that this sort of thing doesn't happen and if it does happen, it can just

give a bad strange message. What's going to happen and what I'm about

to show you in this segment is that our interpreter does return expressions, but

not just any expressions. If you think about the recursive result of

any call to our interpreter of evaluating an expression, it's always a, something we

call a value. A value is a kind of expression that

evaluates to itself. If your interpreter ever returns something

that is still an addition, or a negation, or a multiplication, your interpreter has

a bug. For our simple language, our interpreter

would always return, from any call, any recursive call, a constant, like constant

17. Where life gets more interesting is when

our language has more than one kind of value.

So far, the only kind of value you have are constants, but in the language you

implement in your homework we're also going to have pairs of values.

So not all pairs are values, but if the 2 components of a pair are themselves values

then the result is a value. You could also have Booleans, you could

have strings. Closures are values.

We'll have closures in our homework. So we recursively evaluate a sub

expressions, we could get one of a number of values.

Any kind of expression that is something that's finished evaluating, and it doesn't

just have to be a number. So, the code I'm about to show you in this

segment is for a language that doesn't just have numbers.

It has, booleans, it has numerical comparisons, and it has conditionals.

So we're going to add to our language bool, which has a b field which should

always be true or false, an eq-num field which has two sub expressions, these

should be expressions that evaluate to numbers we see if they're the same, and

then an if-then-else in the language we're implementing.

So what if the program is a legal AST? So b is true or false.

E1 and e2 are legal, AST's. E1, e2 and e3 are legal AST's.

But when you run this program, you try to apply the wrong construct to the wrong

kind of value. For example, suppose we try to add an, a

const, like const 17, to a boolean, like bool of false.

This is something that your interpreter should detect and should give an

appropriate error message explaining, this is what happened.

Not in terms of tried to take car or something, or didn't find something or

whatever. You shouldn't be exposing your interpreter

details, you should give a good message, this kind of problem.

Okay? So, let's just switch over to the code and

see what I'm talking about. So right here I have the definition of my

language and it's exactly what I told you on the slides.

So I have constants, negations, additions, multiples, bools, whether two numbers are

equal and an if-then-else. Notice on the two numbers are equal.

These aren't numbers. They're not even constants.

They're expressions, sub-expressions that we evaluate and then we compare the

results. Just like for add, we have two expressions

here that we evaluate and then we add the results.

Okay. So here are 2 test programs I want to

consider. This first test program right here, is

perfectly reasonable, it multiplies the negation of adding 2 and 2 with adding 7

and this should return minus 28, if you follow at all.

And notice, it's completely legal AST, so you know add takes in 2 smaller

expressions. Negate takes in 1 smaller expression.

Const always has a number and so on. Here's the second test that is also a

legal AST but when you run it you should get an error because it tryies to use the

wrong kind of value. So it starts the same it multiplies

something that will eventually evaluate to the constant negative 4.

But then in this if-then-else, I end up returning from the if-then-else true,

because when I evaluate this I'm going to get true because I don't have a false here

so I'll take the false branch, I'll get true and so I'm going to try to end up

multiplying minus 4 and true. Now, I haven't done that yet.

If I just run this, test1 is this abstract syntax tree, test2 is this abstract syntax

tree, these are both legal abstract syntax trees.

But if I have this version of eval-exp, if I call test1, I should get minus 28.

If I call test2, I should get an error message that's appropriate, like you tried

to apply something that wasn't a number. Now, going back to the examples, I have

one other thing here. This is not a test.

This is not something we have to handle gracefully or handle well.

In this program, if you actually look at it, I have const of true.

That's just not a legal AST, so we don't have to deal with this sort of thing.

And so, if I called eval-exp with non-test, it's fine to just get a terrible

message like expects number as 1st argument given true, blah, blah, blah.

This is just something that's basically the underlying implementation are getting

stuck and reporting an error, okay. So that is the distinction.

Now let me show you two interpreters one that gets it right and one that gets it

wrong. So this first version the wrong one Is a

perfectly fine interpreter for things like test1, okay?

It just does the wrong thing in terms of producing bad error messages for test 2.

So, here's how it works, it gets a little bigger because our language is bigger,

takes in some expression, just has a big con with one branch for each kind of

expression. If e is itself a constant, we just return

the entire thing. Whenever you have any kind of value, you

just return it. That's the result of interpreting a value.

If instead we have a negation, then what it does is it gets out the sub expression,

recursively evaluates it, gets the integer out, negates it, so now we have a Racket

number. And then we build up a value in our

language by putting the const consructor around that.

So what is wrong about this, is right here, the call the const-int.

We are assuming that the result of the recursive call will be a const.

That is not the only kind of value in our language, this is the assumption we should

not make, we should check this recursive result to see if we get a internum.

The other cases are all similar. If you have an add, we get out the first

field, recursively evaluate, wrongly assume that it will be a const, that will

be the integer i1. Again, wrongly assume we'll get a const

back from e2, that will be the integer i2. Add those two things together, that will

give a racket number, put it in the const constructor so that we have a value in our

language and that's the result. Multiply is the same.

The exact same procedure except with star instead of plus.

If we have a boolean, that is a value so just like for cost we return the entire

expression. For e we return the entire expression.

Eq-num is exactly like add and multiply assuming wrongly that it's sub expressions

are constants. Then it gets out those two racket numbers,

compares them with rackets equal, that will give back true or false.

Racket's true or false is not an appropriate thing for our interpreter to

return, so we use the bool constructor because that's the appropriate result for

an eq-num expression. And then for if-then-else, we grab the

first sum expression, recursively evaluate it, wrongly assume it's a boolean.

If we get back true from reading out the b field of the boolean, then we recursively

the second sum expression, otherwise we recursively evaluate the third sum

expression. So, there's a lot to like here.

This interpreter is not terrible, it just doesn't do the error checking that we

want, okay. So, let me now show you the fixed version

this is better. Some cases don't need error checking and

some do. So, if you have a constant no problem just

return it. We know we have a legal AST we're allowed

to assume that. So, this should be a perfectly fine value

to return. If we have a negation, okay, read out the

negate field, recursively evaluate it. But now, let that be some local variable

v, and now check. If it is a constant, then read out the n

field, negate it. Build up a new const, return that.

Otherwise give an appropriate error message.

Negate applied to non-number. Similarly for add, read out the e1 field,

recursive evaluate it. Leave out the e2 field, recursive evaluate

it. Make sure they're both numbers.

If they are, get out the corresponding numbers, const v2, const v1, add them

together, build up in your const, return it, otherwise appropriate error message.

Similarly for multiply. For bool, no error checking to be done,

it's a value just return it. For eq-num it's just like add and

multiply, we need those things to be cost. Then it's appropriate to read out the

underlying numbers compare them with equal and return bool of that, which is the

appropriate value in our language. And for if-then-else, read out e1,

recursively evaluate it. And now check that it's a bool, right?

We don't want a const there, that's an error.

If it's a const or any kind of value except a bool, then give this error

message. Otherwise it, we did recursively get the

right kind of result. So read out the boolean and then either

recursively evaluate e2 or recursively evaluate e3.

Okay? And so that is why eval-exp, on our first

test, which was legal, gave const 28. But eval-exp on our second test gave a

good error message. And eval x on our thing that wasn't even a

legal AST, can just give a bad message. Whereas our wrong version, eval exp wrong

does just fine with the program that doesn't have an error.

But if I give this one, I get a bad error message and it's fine that on the one

where I'm allowed to give a bad error message, that I do.

So that's the kind of error checking we need to do our interpreter.

It all boils down to When you get a value as a recursive result back from your

interpreter, you may need to check what kind of value you got.