Let's now discuss what static checking is as part of a programming language. We absolutely need to do this before we discuss the advantages and disadvantages of having static checking. After all, nothing is more frustrating than arguing about whether something is good without understanding what that something actually is. So I will define static checking as anything that is done to reject some set of programs after they successfully parse but before you start running them. So the parsing is done, we have the abstract syntax, and our language says: even though it parses, this program is rejected for some reason. That rejection process is the static checking. And which programs are rejected and which ones are allowed to run is absolutely part of the programming language's definition. It's an essential part of the definition, so that people know what the legal programs are.

Now, you could also do static checking in addition to a programming language definition. You could have tools on the side that also detect bugs and warn you about them. That's a wonderful thing to do that has become very popular, and I'm very much in favor of it. But here, let's focus on static checking as part of a programming language definition.

Now, probably the most common way to have static checking as part of your language definition is via a type system. And for type systems, I want to distinguish in this segment the approach they take from their purpose. The approach most type systems take is that every variable gets a type, we use that to type-check each expression, maybe functions have a type signature that we're interested in, and so on. That's all fine, and then some programs type-check and some don't. But what is the purpose of this? We should be very clear about what a particular type system is trying to prevent. For example, maybe it's trying to prevent you from passing a string to an arithmetic operation like division. That is one thing that a type system prevents. It also prevents violating the abstraction of a module, like in ML's module system. It prevents having to do run-time checks like the number? primitive in Racket. It prevents using undefined variables, and so forth. So a dynamically typed language is just a language that doesn't do this sort of static checking. Now, it's not an absolute line; Racket actually checks a couple of things. When you click the Run button, it turns out you're not allowed to have an undefined variable. But there are very, very few things that it checks statically, and so we tend to still call it a dynamically typed language.

So to understand this idea of the purpose of a type system, let's look at ML a bit more specifically. Here I have a long list of things that ML's type checker has the purpose of preventing. We know that if an ML program runs, it will never have any of the following errors. It will never have a primitive operation used on a value of the wrong type. For example, you'll never have e1 applied to e2 where e1 does not evaluate to some closure. You'll never have an if-then-else expression where the expression between the if and the then doesn't evaluate to a boolean, and so on. You'll never have a variable used that's not defined in the environment; when the interpreter looks it up in the environment, it always finds it. You'll never have a pattern match with a redundant pattern, and so on and so forth. A lot of these are standard across type systems, but they really are part of each programming language's definition.
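To make a couple of those concrete, here is a small sketch of ML bindings that the type checker would reject before any of them ever runs. The exact error messages are implementation-specific; I'm assuming something like an SML/NJ-style REPL, and the names bad1 through bad3 are just for illustration.

  (* rejected: a primitive operation used on a value of the wrong type;
     + expects two ints, not an int and a string *)
  val bad1 = 3 + "hello"

  (* rejected: the expression between if and then must evaluate to a bool *)
  val bad2 = if 1 then "yes" else "no"

  (* rejected: y is not defined in the environment *)
  val bad3 = y + 1

None of these produces a run-time error, because none of them ever gets to run: the program is rejected statically.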
And different languages can use their type systems to prevent different things. Something prevented by one type system might not be prevented by another, and that's okay. It is also the case that no type system prevents everything. So here are a number of errors, a number of bugs, that ML's type system does not prevent. It does not prevent calling the head function on the empty list. It does not prevent array bounds errors. We haven't used arrays in ML, but it has them, and the type system makes sure that subscripting into an array is always done with an int. But it does not make sure that that int is small enough to refer to an element that is actually in the array. And ML's type system does not prevent division by zero, and so forth. There are a bunch of errors like that which you can imagine a type system preventing, but most type systems in today's languages do not.

Now, in general, even beyond those sorts of things, you can never really expect the type system to find all of your bugs, because it doesn't know what your program is trying to do. Unless a static checker is given a full specification of exactly how your program is supposed to behave on all inputs, it cannot possibly read your mind about what the program is supposed to do. So if you have a perfectly good if-then-else expression, except you have the then branch and the else branch backwards, you put each one in the wrong place, how is the type checker to know that you weren't supposed to do that? Similarly, if you have two functions f and g, both of type int -> int, and you call one when you meant to call the other, again, that's just not the sort of thing that a type system is ever designed to prevent. So type systems and static checkers are pretty much not substitutes for the sort of testing and dynamic evaluation of programs that we have come to expect, unless you had a full specification of exactly what your program is supposed to do.

So I'm really emphasizing a particular view of type systems here, which is technically accurate and an important perspective: the purpose of a type system is to prevent certain behaviors from happening at run time. And that's a separate issue from how the type checker is defined, in terms of expressions having types, and how it's implemented, in terms of some recursive procedure that runs over your program, does type checking and type inference, and all the rest of it. So language design includes two issues. First, what are you checking, what is the purpose of your type system? And second, how are you going to enforce these things, how are you going to make sure that you never pass a string to the plus operator? And the hard part of designing type systems is to make sure that they achieve their purpose, that while being useful and flexible, the sort of language people want to use, the how correctly accomplishes the what. What we'll study in the next segment is the exact definition of correctness and how to think about that.

But before we move on, I'd like to point out that this idea that you either have static checking or you don't is not the only choice. There's really a continuum of options here. And fundamentally, there is an inherent tension in catching a bug early, such as giving a type error before you've ever run the code in question. Maybe I shouldn't even call it a bug, because maybe it's never going to matter; the programmer may know there's some reason that error is never actually going to occur, and so on.
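To contrast with the earlier sketch, here is a rough sketch of ML bindings that type-check just fine but still fail when they are evaluated, again assuming an SML/NJ-style implementation and trying each binding separately:

  (* type-checks, but raises the Empty exception at run time:
     tl [1] is the empty list, and hd of the empty list fails *)
  val a = hd (tl [1])

  (* type-checks because the index is an int, but raises Subscript at run time:
     index 7 is past the end of a 3-element array *)
  val b = Array.sub (Array.fromList [1, 2, 3], 7)

Division by zero is the same kind of story, and we'll use it as the running example in a moment.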
And so static checking, a type system like you're used to, and dynamic checking, where you hit that bad operation at run time, are only two points on a continuum. To emphasize this, let me pick a strange example, just so it's not something you're used to. Suppose the bad thing we're trying to prevent is division by zero, so you would never have a program get to a point where it tries to take 3 and divide it by 0. Let me give a range of ways you might prevent this.

The extreme static end would be even before the normal compile time we've been talking about. Every time you typed even a little bit of your program, before it showed those characters on the screen, the system would run some analysis to make sure that the program as currently constructed could never divide by zero. You can see how that catches the bug even earlier than we're used to, where you type the program, then you invoke the compiler, and it gives you an error message.

There are also options that are before run time but less eager than the type systems we're used to. Maybe the system lets each individual file type-check, but then it waits until right before you go to run the program, when you have a particular main with a particular set of arguments, perhaps. Then it does some checking and says, you know, I think you might divide by zero, so I'm not going to let the program run.

That would still be more eager than the run-time checking we're used to, which is roughly how Racket would handle this: if you evaluate (/ 3 0), you get some sort of error. And you would think that's as dynamic as you can go. You can't go past the division by 0; we learned in school that you're not allowed to divide by 0, it's an illegal operation. And yet we could go even later than that. We could somehow return some answer that indicates we divided by 0, return that to callers, and have them deal with it. That might sound strange, but it turns out it's exactly how floating-point numbers work in pretty much every programming language, because the people who do scientific computation decided that dividing by 0.0 was actually a useful thing to do. If you return infinity, then callers might not even mind: it might be that this computation, in the case where the denominator is 0.0, ends up getting cancelled out somewhere else, or you take some conditional that causes you never to use this variable. So it turns out programming languages typically do this, and they could do the same thing with integer 0 if they wanted to.

So this issue of eagerness, of when you detect bugs and how you turn them into errors, is a language design choice, and you have a full continuum. It just so happens that, based on our experience designing languages over the years, compile time and run time are the two most common points on that continuum at which to indicate an error.
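Here is a small sketch of those last two points in ML, assuming an SML/NJ-style implementation where arithmetic on reals follows the usual IEEE floating-point rules, and trying each binding separately:

  (* integer division by zero is checked dynamically:
     evaluating this raises the Div exception *)
  val checked = 3 div 0

  (* floating-point division by zero is not treated as an error at all:
     it just returns infinity and lets the caller decide what to do with it *)
  val unchecked = 3.0 / 0.0        (* positive infinity *)
  val stillFine = 1.0 / unchecked  (* 1.0 / infinity is 0.0, so the "bad" value cancels out *)

Whether you get an exception, a special value like infinity, or a rejection long before the program ever runs is exactly the continuum of eagerness we've been talking about.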