**Introduction**

**Adam:**
Hello, this is Adam Gordon Bell. Join me as I learn about building software. This is called CoRecursive.

**Niki:**
The thing is that languages like Haskell adopt 20 features. Maybe in five years, two of these features will become mainstream and will be ported to JavaScript.

**Adam:**
That was Niki Vazou. She wants to find ways to bring formal verification to mainstream programming. Today, she teaches me about refinement types. Refinement types are available in some form in several languages. Haskell with Liquid Haskell, Scala with the refined library, and perhaps even Ruby. Niki is an assistant professor at IMDEA Software Institute and the creator of Liquid Haskell. If you like to hear me ask simple questions to very smart people, you’re going to love this interview. We talk about type systems, formal verification, SMT solvers, working with GHC, and dependent types versus refinement types versus liquid types, and so much more. Super fun interview, I hope you enjoy it.

Niki Vazou, welcome to CoRecursive.

**Niki:**
Thank you for having me.

**Adam:**
So I have a lot of questions to ask you about refinement types, proofs, and software verification, but I wanted to start with this quote I found on your website. It says, “My goal is to make theorem proving a useful part of mainstream programming.” So what does that mean to you?

**Making Theorem Proving Practical**

**Niki:**
To me, it means that when I say all this theorem proving, refinement types, and verification in general, the goal is that in some years, some developer is going to write code, and some verification will run in the background that will not only detect errors but can also be used to suggest code or to make coding easier, faster, and safer. But what is important is that the developers should not need to know exactly what is going on in the background, because my impression is that right now people, especially mainstream developers, are always scared of this formal verification and type checking and refinement types.

**What Is Formal Verification?**

**Adam:**
So what is formal verification?

**Niki:**
That’s a very broad research area, I assume, but the high-level idea is that you have code and you have some specifications. So you start with some specification that you have in your mind that you want your code to satisfy, like the program will not crash, or the most simple one is that you will not divide something by zero. And then you find a way, like refinement types, to express these specifications formally, using some formal syntax. And then formal verification is the process of trying to prove that the code satisfies the specifications that you wrote.

**Adam:**
Okay. You mentioned types. So I’ve had a number of episodes about types before. In episode six, I talked to Edwin Brady about Idris; in episode 15, I talked to Stephanie Weirich about dependent types; and in episode 23, I talked about the Pie language. So listeners should definitely check those out, but just to get some basics so that we understand: what is a type?

**Niki:**
What is a type? For me, a type is basically a set of expressions in the language. The language I mean when I say programming is Haskell, if that’s okay. So for example, in Haskell we have the type of integers that basically expresses all the potential integers: zero, one, minus one, et cetera. So you group them all together and you call them integers, and then you have, for example, functions that are typed as a function from integer to integer. So you write your code and then you write the types for the code to satisfy, and then if the program satisfies this type, or type checks, then you know that certain errors are not going to happen. You cannot, for example, apply an integer to an integer, because the type checking performs some basic checks for you.

**Adam:**
So types are then a form of software verification?

**Niki:**
Definitely.

**Adam:**
So types give us some limited form of verification, right? So I’m trying to make sure I understand this. If a function can only handle integers, then the static typing that happens during compile time in the language verifies that nothing’s calling it with something that isn’t an integer. Is that the sense of it?

**Niki:**
Yes. So this is a very good example. And also, what is interesting is that in what we call strongly typed languages like Haskell, which make sure that all your types are satisfied at compile time, the compiler can use all this information that it knows about the types to, for example, allocate the proper memory when it is time to run your code. Or it knows that your code satisfies the types, and then it can use this information to optimize the runtime. And also to make sure that certain errors will not happen.

**What Are Refinement Types**

**Adam:**
So what then are Refinement types?

**Niki:**
The standard types only depend on the structure of your expressions. As we said before, zero has the type int. So refinement types basically take the standard types and refine them with some logical predicates, and then we can have more refined sets that define types. So for example, the standard example is division by zero, and the standard Haskell type for the division operator is: give me two integers and I’m going to give you back an integer. So we can go in and refine this type and say: give me an integer, then give me an integer that is provably different than zero, and I’m going to give you back an integer. And then if you give this refinement type to your program, and if you run a refinement type checker and make sure that all your programs satisfy this type, then you have a proof that at runtime there is going to be no division by zero [inaudible 00:06:43].
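A sketch of that refined division type, assuming Liquid Haskell's annotation syntax. To plain GHC the `{-@ ... @-}` pragma is just a comment, so this file compiles either way; only the refinement checker reads it:

```haskell
-- Sketch: the second argument must be provably non-zero.
{-@ safeDiv :: Int -> {v:Int | v /= 0} -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv x y = x `div` y

-- Accepted: the literal 7 is provably non-zero.
ok :: Int
ok = safeDiv 42 7

-- Rejected by the refinement type checker (plain GHC would accept it):
-- bad = safeDiv 42 0
```

The function body is ordinary Haskell; only the commented signature carries the specification.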

**Adam:**
So if I have this divide function, it takes two integers. I want to make sure the bottom one’s not zero, right?

**Niki:**
The second argument.

**Adam:**
The second argument is not zero. So how does refinement types solve that?

**Niki:**
So you can refine the standard types of your language with logical predicates. And in this case, the logical predicate says that V, where V is the value that I’m refining, is different than zero. And you give this refinement to the second argument of the division operator, and now you have a specification for this division function. And then you run the refinement type checking which, like the standard type checking, checks that all the types, now the refinement types [inaudible 00:07:30], are satisfied by your program. So for the refinement type checking to pass, there needs to be a proof that every time you apply the division operator, the second argument is provably different than zero.

**Refinements vs Functional Programming**

**Adam:**
So, how I would solve this problem without refinement types, right, is I would use something to wrap the result and maybe put a check in place, is how I’m thinking, right? So in my divide function, I would be like: if the second value is zero, return none or some sentinel value maybe. So how do refinement types differ from that solution?

**Niki:**
So the refinement type checking happens at compile time. For soundness, in the sense that division by zero will never happen, these solutions are exactly the same, but your solution says that in the definition of the division operator, I run a runtime check that the second argument satisfies a predicate. And what refinement types do is basically, at compile time, try to prove that this check always succeeds. So they basically make your program more efficient, faster, because you get rid of one runtime check.
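The wrap-and-check alternative Adam describes could be sketched like this in plain Haskell, with no refinements involved:

```haskell
-- Option type: the divisor is tested on every call at runtime,
-- and every caller then pays a pattern match to open the result.
divMaybe :: Int -> Int -> Maybe Int
divMaybe _ 0 = Nothing          -- sentinel for the bad case
divMaybe x y = Just (x `div` y)
```

The refinement-typed version keeps the bare `Int -> Int -> Int` shape and moves this check to compile time.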

**Performance Considerations**

**Adam:**
That’s interesting. So there can be performance ramifications because we’re moving things to compile time.

**Niki:**
Exactly. And one of the standard examples that I use is basically a simplification of the [inaudible 00:08:56] example. Basically, you have memory and you want to index it, and you need a runtime check to say that whenever you are indexing, you are only indexing the part of the memory that you are allowed to index, and you don’t go out of memory. And this check, as you suggested with the division operator, could happen at runtime with an assertion, or it could be lifted to the type system by giving the indexing operator a refinement type that says that the index will always be in bounds. And then at compile time, again, a refinement type checker tries to prove that this indexing constraint is actually satisfied.
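As a sketch, assuming Liquid Haskell's built-in `len` measure for lists, the in-bounds constraint on an indexing function might be written like this (the annotation is a comment to GHC):

```haskell
-- Hypothetical safe-indexing sketch: the refinement demands that
-- the index i is within the bounds of the list xs.
{-@ at :: xs:[a] -> {i:Int | 0 <= i && i < len xs} -> a @-}
at :: [a] -> Int -> a
at xs i = xs !! i
```

With the refinement checked, a call like `at [1,2,3] 5` would be rejected at compile time instead of crashing at runtime.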

**Adam:**
It seems like I could come up with a way to do this with normal types so I’m thinking of wrapping them in something. So what I’m thinking is, for your divide example, all I need is a type for numbers that aren’t zero, right? Like a natural number type or non-negative or something, right?

**Niki:**
Yeah. You need a refinement of an integer. If you see types as sets, you need a subset, right? And this is what refinement types give you. They take the types and they add logical predicates to define a subset of the initial type that satisfies some predicates. So what is cool is that these logical predicates are expressed in a logic, and then a refinement type checker basically uses SMT solvers to decide that all this logic that you are using to define types is actually satisfied. SMT stands for Satisfiability Modulo Theories, and in my work I treat the solvers as black boxes. I ask them questions like, can you prove that in this case, the second argument of the division is different than zero? And hopefully the SMT solver will come back and tell me yes, and then my program will type check.

**SMT Solver**

**Adam:**
So how do they do that? Let me narrow it down. What does the SMT solver do?

**Niki:**
So again, SMT solver stands for Satisfiability Modulo Theories solver. Basically what they do is they take logic as input and they answer whether the logic you give them is satisfiable. We are in formal verification, so we are using a lot of math in the back, some of which I don’t even know exactly, but that’s okay, right? There is a lot of work on SMT solvers. Me, as a user, I know how to use them. I don’t know how to develop them exactly, but there is a lot of engineering work to make this thing work. To briefly go back to your first question, this is how I see it, right? Me, as a researcher in type theory, I want to build formal verification into type systems that users will use without exactly knowing what happens underneath. So this SMT solver takes a logical predicate as input and checks if it is satisfiable. For example, if I ask it, can you check if “a value V is greater than zero” is satisfiable, it will come back to me and it will tell me yes.

If you replace V with one, then you have one greater than zero, and so your initial formula is satisfiable. Now, if you negate what you ask: you can ask it, can you check if “not (V greater than zero)” is satisfiable? And if it says no, the negation is not satisfiable, then, going back to the field of logic, we know that when the negation of a formula is not satisfiable, the formula is valid, which means that if the formula has variables, then whatever value you substitute for the variables, the formula holds. So we use SMT solvers as black boxes to ask them: can you prove that this formula is valid? And they can say yes or no.
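The satisfiable-versus-valid distinction can be illustrated with a toy sketch. This is a brute-force search over a small finite domain, not an SMT solver, which reasons symbolically over all integers:

```haskell
-- Toy only: search a finite range of Ints instead of symbolic reasoning.
satisfiable :: (Int -> Bool) -> Bool
satisfiable p = any p [-100 .. 100]

-- A formula is valid when its negation is unsatisfiable.
validOnDomain :: (Int -> Bool) -> Bool
validOnDomain p = not (satisfiable (not . p))
```

Here `(> 0)` is satisfiable (take V = 1) but not valid (V = 0 falsifies it), while `\v -> v * v >= 0` is valid on the domain.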

**Adam:**
The one usage of SMT solvers I’m familiar with is probably like a toy but it’s like Sudoku puzzles if you give it all the rules. Do you know these puzzles?

**Niki:**
Mm-hmm (affirmative). Yes.

**Adam:**
If you give it all the rules like I have a nine here and a seven there or whatever, it will infer what all the other values must be.

**Niki:**
Yes, because it guesses an assignment of values that satisfies the constraints. So the idea is that you take the Sudoku rules and translate them into an expression in the logic that is satisfiable if and only if the initial Sudoku has a valid solution. And then the SMT solver will give you the assignment from the free variables to values. And we do something like this in refinement types. You take a program, you take some specifications, and you can see the specifications as saying when the Sudoku is true. And then you translate this combination of program and specification into logical formulas. And then we give them to the SMT solver and we ask: is the formula valid? And if it’s valid, then there are no refinement type errors. Otherwise, we say that we couldn’t prove that this application is correct, so maybe something is wrong with your program.

**Adam:**
So in the example of divide by zero, if I say this has to be greater than zero, then the SMT solver has to look at all the places this is called and see if it can figure out the value of those?

**Niki:**
Kind of. So this is the job of the type system. For example, when you type check pure Haskell, or any programming language with typed code, there is a written algorithm that decides if your code type checks or not. And we call this algorithm a type system, and it has some rules. So in the refinement type system there are some rules, and this is exactly what they do. They check the specifications where they’re called, and they basically reduce or translate them to SMT queries. So it’s not that we give the SMT solver the whole thing. We take the whole thing, we apply the refinement typing rules, and this is exactly the part that I’m working on, and then this refinement type system asks SMT queries, asks questions, to check validity.

**Adam:**
If I call the divide function and I give it a seven, so I want to divide by seven as the second argument, then it sees that seven and decides that it fits the constraint?

**Niki:**
Yes. And it will tell you yes, everything is good.

**Adam:**
So what if the value is passed through several layers of calls?

**Niki:**
Then it needs to have a type that implies that it’s different from zero. So for example, I get an input from the user and I transform it, I add to it, I multiply it, I pass it through various checks, and finally at some point, I divide by it, right?

**Adam:**
Yes.

**Niki:**
So all these transformations that I’m applying to the input need to have proper types, so that the type of the value that I pass to the divisor implies that it is different than zero. And during the process I can, for example, add some runtime checks to make sure that the final value that I pass to the division operator is greater than zero. So for example, I take my input, I check if it is greater than 10, so at runtime I make this check, and then I subtract nine, and then I pass it to the division operator, and the minus operator has a proper type ensuring that the end result is greater than zero. And the idea is that instead of having this runtime check at the division operator, I have much fewer runtime checks at the beginning of my code, where I take the input from the user, about which I know nothing. So at the beginning I make sufficient runtime checks to ensure that at compile time, I can prove that all the end constraints are satisfied.

**Runtime Checks**

And also this is cool because I make one runtime check at the beginning, so if my program is going to crash at some division or at some indexing, then I know early on, and not after I have done all the work at runtime.
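Niki's subtract-nine example could be sketched like this (the function name and the concrete numbers are illustrative, not from Liquid Haskell's documentation):

```haskell
-- One runtime check at the input boundary; everything after it is
-- statically safe, because n > 10 implies n - 9 > 1 > 0.
process :: Int -> Maybe Int
process n
  | n > 10    = Just (100 `div` (n - 9))  -- divisor provably non-zero here
  | otherwise = Nothing                    -- reject early, before any work
```

A refinement checker can verify that the guarded branch never divides by zero, so no check is needed at the division itself.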

**Adam:**
So yeah, for a value that’s supplied at runtime, there’s no way to get around the fact that you need a runtime check, but the refinement types will detect this runtime check and make sure that that restriction, that constraint, travels via the type system to where you need the restriction in place?

**Niki:**
They will not tell you which runtime check to add. It is more that, by the refinement error messages that you get, maybe you will be helped to understand what runtime check you need. It is like type checking in Haskell. If there is an error, you will get a message, and then the fix is up to you: to understand the message and what you need to fix. And it’s the same for refinement types. You can get a refinement type error. It will not tell you to add this runtime check at the beginning, but the type error will give you some information that might help you to understand what runtime checks you need to put in.

**Adam:**
But I’m wondering. If I add a runtime check, can it detect it or do I have to tell it somehow that, oh I’ve covered this condition?

**Niki:**
No, it will detect the runtime check.

**Adam:**
Can I also feed it my own assumptions? Don’t worry…

**Niki:**
Yes.

**Adam:**
Yes?

**Niki:**
Yes. If you branch on a Boolean, in the one alternative and in the other alternative, it will understand that the code that you run inside the [inaudible 00:19:13], like the if condition, is satisfied. So inside the ifs you can add arbitrary expressions, and then the then and the else branches will be checked assuming that the condition, and respectively its negation, holds.
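That path-sensitivity can be sketched like this (illustrative names; the branch condition is what makes the division acceptable to a refinement checker):

```haskell
-- In the "then" branch the checker may assume d /= 0, so the division
-- is safe; in the "else" branch it assumes the negation, d == 0.
guardedDiv :: Int -> Int -> Int
guardedDiv x d = if d /= 0 then x `div` d else 0
```

No annotation is needed on `guardedDiv` itself: the condition supplies the proof obligation for the division.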

**Refinements vs Dependent Types**

**Adam:**
I had a couple of interviews about dependent types, and they have some of the same use cases that we talked about, right? Like using a list of unknown length, so that if I take the head of it, then I know that its length is greater than zero, for instance. So how do refinement types compare?

**Niki:**
So this is a very complicated question, because I think you can have many different opinions on it. I can tell you my opinion. So the first thing is that dependent types basically allow your type system to be more expressive. They allow you to express properties, like the safe indexing or the division by zero, that some other type systems don’t let you express. On the other hand, what refinement types do is refine an existing type system. So you have a type system and you have these logical predicates. You do not change the language or the existing type system. You start from the real thing and you refine the types. So basically you don’t make the types more expressive; on the other hand, you catch more errors. So we have this rule that when you take a language and you add refinement types, every program that passes the refinement types also passes the standard type checking of that language. So dependent types may extend the type system to accept more programs, while refinement types constrain the type system so that more programs are rejected.

This is one thing, that all refinement type systems start from an existing language and refine it. And the other very important difference is that refinement types use an external solver. The most common thing is to use an SMT solver. But in dependent types, you write the proofs yourself, or you have tactics where you have some automation, but the proofs are things that exist in your language, while in refinement types, the proof happens externally in the solver. So you cannot have proof terms inside refinement types, which is good and bad, the way I see it. Having this external solver is good because many things are proven automatically, so the user doesn’t need to prove things. There is an opinion that it is bad, because trusting the refinement type system makes you trust the way that refinement types are checked, and also the SMT solver or whatever external solver you use. But if, for example, you do a proof in Coq, you have far fewer things to trust, but you have more things to prove yourself.

**Adam:**
So it seems to relate to what you were talking about, practicality and mainstream programming. Do you think that refinement types can be added onto existing mainstream languages, like you said, Haskell? I think that you even had something on your website about Ruby. Is this a technology that we can add, like a linter type of process, on top of an existing language?

**Niki:**
I do believe that. Yes, I do have work on refinement types for Ruby, and I am actively working on Liquid Haskell, which is a refinement type checker for Haskell programs. So I think that in the Haskell community, refinement types are quite accepted, because Haskell users have realized that types are good. So I think that if you go through all the type checking of Haskell, then spending a little bit more time on refinement typing with Liquid Haskell is not bad. But for a dynamic language, it is very difficult to persuade somebody who doesn’t use types. So that is what I’m saying: the goal is to make all this typing and refinement typing so easy that you don’t actually have to persuade anybody. I see it going there, but I don’t know how long it will take.

**Refinements in Ruby**

**Adam:**
So tell me, how was it working with Ruby and adding refinements to it?

**Niki:**
As I said before, I believe that this is where all of this should go. This is where verification should go, this is where refinements should go, because if you want verification to actually have impact and become mainstream, then it is the popular programming languages that you need to target. But the problem is that, for me, I find it more pleasing to work in Haskell. It is not actually the programming language, it is the community. I feel that I know category theory without having read category theory. So it is challenging because it is a very nice language, and it has a very enthusiastic community. So for me, I would like to stay there and collaborate maybe with people from other languages. Languages like Haskell adopt 20 features. Maybe in five years, two of these features will become mainstream and will be ported to JavaScript. So hopefully, verification could be one of these features that in the next years will be adopted by more mainstream programming languages. I don’t know if that makes sense.

**Adam:**
It does make sense. I get what you’re saying. You don’t want to work with JavaScript. I mean, I don’t want to, either, but refinement types, because they sit outside of the language, seem particularly well suited to some of those problems.

**Niki:**
Definitely.

**Adam:**
So what languages have some sort of support for refinement types? Obviously, you primarily work with Liquid Haskell but is there support in other languages that you know of?

**Niki:**
So interestingly, all of this started from OCaml. Then we went to Haskell, and there is some work for JavaScript that I know of. And, as you saw on my website, we have support for Ruby. And I don’t know of any other languages, but this doesn’t mean that they don’t exist.

**Refinement in Scala**

**Adam:**
So I know that there is a refined library that maybe is slightly different because it’s in…

**Niki:**
For Scala.

**Adam:**
For Scala, yeah.

**Niki:**
Yes. That is true.

**Adam:**
And also for Haskell, I think. If I understand correctly, they work differently, because they use macros rather than something that’s outside the language.

**Niki:**
So Scala is an interesting case. I guess this is happening in Haskell too, but Scala’s type system is already semi-dependent. You can express a lot of properties, because you have a lot of typing features, and the type checking by itself uses a lot of reasoning to decide if a program type checks or not. If I have understood it correctly, the type checking depends on the refinements, or vice versa. So it’s not as cleanly separated as in the literature, where you say that refinement types refine the existing types and it is done in two separate steps: first the type checking and then the refinement type checking. But there is some support and a lot of automation in Scala too.

**Adam:**
Yeah, and there’s some overlap. As you were saying with Haskell, I think in Scala some of these issues people take a different approach to, right? So, taking the head of an empty list: the default solutions, I think, are that either people will use some sort of option type, or maybe you’d use a non-empty list, where the list cannot be empty. It always has at least one element. It’s interesting that you’re adding refinement types to these languages with such rich type systems that some of the scenarios can be covered in different ways.

**Niki:**
But again, think of the alternative of option types. Every time you want to access the value, you need to basically open it, so you’re spending a lot of runtime on it.
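That trade-off could be sketched like this (the refinement annotation is a comment to GHC, so the file compiles as plain Haskell; names are illustrative):

```haskell
-- Option type: safe, but every caller pays a pattern match at runtime.
headMaybe :: [a] -> Maybe a
headMaybe []      = Nothing
headMaybe (x : _) = Just x

-- Refinement sketch: callers must prove non-emptiness at compile time,
-- so no wrapper needs to be opened at runtime.
{-@ headRefined :: {xs:[a] | len xs > 0} -> a @-}
headRefined :: [a] -> a
headRefined (x : _) = x
headRefined []      = error "unreachable when the refinement is checked"
```

With the refinement checked, the `[]` equation of `headRefined` is provably dead code.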

**Adam:**
That’s a good point.

**Niki:**
Yeah, and you gain nothing; again, you’ll need to check it. Basically, the concept behind refinement types is: perform the most checks you can at compile time, using an SMT solver as a black box that is there to do the reasoning for you. But the other point is that it’s not exactly a coincidence that I am extending Haskell with refinement types, because first of all, Haskell is pure. I don’t know if I need to explain.

**Adam:**
It doesn’t have side effects?

**Niki:**
Yeah. You can easily map a Haskell program into this SMT logic that we said before, without having to worry about side effects. So I don’t have to reason about memory, I don’t have to do a lot of reasoning. [inaudible 00:28:31] logic is much easier. That’s one thing. These strongly typed languages offer a very nice background so that [inaudible 00:28:41]. And then, as I said before, Haskell has a group of people that already like all these [inaudible 00:28:50] logic type systems, so it is easier to try and experiment with refinement types too.

**Adam:**
They’re eager for it, whereas when you go to Ruby, they’re like, we don’t even want static typing at all. Why would we want refinements on static typing?

**Niki:**
Yeah. That’s true. If you need to persuade somebody to even use types, then it’s more difficult to persuade them to use refinement types.

**Adam:**
So a previous guest I had, John De Goes, said that he thought dependent types were very challenging to use for a professional programmer, as opposed to an academic like you, and that refinement types were more approachable. Do you agree with that?

**Niki:**
So refinement types are awesome, specifically when your program is correct. So again, from the beginning: you’re writing your program, you have some specifications in your mind, and you write down the specifications, and there are a few cases there. You can have an error in your program, you can have an error in the specifications that you wrote, or you can have neither. Most of the time, because there is this automation, if both the program and the specification are correct, and because we use this SMT solver, everything works perfectly. Now, if there is an error, it is more or less like standard type checking. You’ll get an error, you’ll get a location, but then how you go from there to fixing the code or the specification is not a very easy task. For me, refinement types are definitely easier than dependent types, because there is a lot of automation and in most of the cases you don’t need to explicitly prove things, but it is a little bit more difficult than type checking in a strongly typed language like Haskell, for example.

**Adam:**
Yeah. Because you’re putting these predicates right into your type.

**Niki:**
Yeah, and the type checking algorithm in the back is a little bit more difficult. Basically, error reporting is the process of trying to explain to a human why the algorithm failed. And the more difficult the algorithm is, the messier the error message is, because it is trying to abstract from everything that it did and explain to you what went wrong. I think that to make refinement types even more approachable, error reporting is the way to go. And the other thing, which you have mentioned I guess already many times, is to give more motivation. So we said how refinement types can be used to speed up your code, but again, you can do that manually. But if, for example, the compiler could see the refinements and use this information to speed up your code by removing runtime checks, then I think that this would be a very good motivation for people to use such systems.

**Adam:**
You lost me there because I thought the compiler isn’t aware of these refinements because they’re sitting outside…

**Niki:**
Yeah. Right now, it is not aware. What I said is that one future direction is for the compiler to know the refinement types and use them to optimize the runtime or the memory of the code.

**Adam:**
Like, for instance, if your integer is always greater than zero, then maybe the memory doesn’t need to be as large, because it’s actually just a positive value that could take up half as much space, or at least…

**Niki:**
Yes.

**Adam:**
I don’t know how it ends up represented in memory, but some smaller amount of space.

**Proofs**

**Niki:**
Something like that but yeah, we can stay in the present.

**Adam:**
So the other thing that I found on the Liquid Haskell website is something mentioning proving laws by writing code. So what does that mean, and why would I want to do that?

**Niki:**
Okay. So I think that why you would want to do that is easier, so we can start from that. The laws are basically some properties that class methods satisfy. For example, when you have a monoid, you can say that its operation is associative. And you can use these properties to reason about why your code is correct, and also maybe you could use these properties to optimize your code. For example, a common optimization is map fusion, which is actually expressed in one of Haskell’s class laws. So the answer to why prove this is, first, functional correctness: that your function is doing more or less what you have in your mind. And second, maybe you want to apply, manually or even in the compiler, optimizations that use these laws.
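Map fusion is the law that two list traversals collapse into one, `map f . map g == map (f . g)`. A quick sketch of the two sides, with illustrative functions:

```haskell
-- The two sides of the map-fusion law; a compiler rewrite rule that
-- turns the left into the right relies on this equality holding.
unfused, fused :: [Int] -> [Int]
unfused = map (+ 1) . map (* 2)   -- walks the list twice
fused   = map ((+ 1) . (* 2))     -- walks it once
```

Proving the two sides equal for all inputs is exactly the kind of law Niki is describing.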

**Adam:**
Yes. If you have a type class and it says okay, this operation needs to be associative so that you can apply it one way or the other. Right now, at least in Scala, that might be a comment or something that somebody puts in but you’re saying we could use refinement types to actually express that as a verifiable statement?

**Niki:**
Mm-hmm (affirmative)

**Adam:**
And is that statement in the refinement code comments or where does it live? What does it look like?

**Niki:**
Basically what we do, and I think it’s been one year since we added this feature to Liquid Haskell, is we use Haskell functions that return just the unit value and refine them with a logical predicate to express theorems, like that list append is associative. And the way we do it is we say, okay, I’m going to define a function that takes as input a list xs, a list ys, and a list zs, and returns Haskell’s unit value, and this unit value is refined, in my refinement logic, to satisfy associativity of the inputs xs, ys, and zs. So it would say xs append (ys append zs) is equal to (xs append ys) append zs. We use Haskell functions that return unit because these functions are never going to be used at runtime, so we don’t care about the value. And we define them to express theorems about their inputs and other Haskell functions. And we do this because, for example, [inaudible 00:35:25] associativity is not a property that [inaudible 00:35:28] can automatically prove.

But to prove this, we need to tell the prover some information about how the functions that we want to prove properties about behave. So for example, to prove associativity, the proof is basically an inductive proof. It has a base case, and it has an inductive case that uses the inductive hypothesis. And these things about induction, the SMT solver cannot prove automatically, so we use some Haskell functions to provide the structure of the inductive proof and some information that the [inaudible 00:36:08] doesn’t know. And then again, we give all this information to the SMT solver and we let it figure out the details and conclude the proof. So in short, we use refined Haskell types to express theorems about functions, and then the body of this function basically provides a template for the proof of the theorem, and combined with the SMT solver and the theory of refinement types, we prove these theorems.
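A sketch of that style, with the theorem stated in the refinement comment (a real Liquid Haskell proof also needs `++` reflected into the logic, which is omitted here). To GHC this is just a unit-returning function whose recursion mirrors the induction:

```haskell
-- Theorem sketch: append is associative. The body is the proof
-- template: a base case, and a recursive call playing the role of
-- the inductive hypothesis.
{-@ appendAssoc :: xs:[a] -> ys:[a] -> zs:[a]
                -> {v:() | xs ++ (ys ++ zs) == (xs ++ ys) ++ zs} @-}
appendAssoc :: [a] -> [a] -> [a] -> ()
appendAssoc []       _  _  = ()                    -- base case
appendAssoc (_ : xs) ys zs = appendAssoc xs ys zs  -- inductive hypothesis
```

At runtime the function only ever produces `()`; all the content lives in the refinement that the checker verifies.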

**Adam:**
Because I've done a bunch of episodes about types, I know I have some listeners who are interested in proofs and how they relate to types. Is Liquid Haskell with refinement types a way you would recommend to approach learning about proofs? Or is that more a side effect of its implementation?

**Learning Formal Proofs**

**Niki:**
You mean about formal proofs?

**Adam:**
Yeah. Is Liquid Haskell a tool that people could use to learn about formal proofs?

**Niki:**
Yes. Actually, my latest Haskell paper was named Theorem Proving for All, and in this paper we basically claimed that, especially if you know Haskell, it is very easy to learn, using Liquid Haskell, how to express formal reasoning and formally prove properties about your Haskell programs. But again, as we already mentioned, Haskell and the logic are very close. The only things that exist in Haskell and do not exist in the logic are errors and divergence. If you exclude those, then basically you can treat Haskell as your logic and develop proofs, like [inaudible 00:37:50], about math properties.

**Adam:**
What was the thing you said is missing?

**Niki:**
So what exists in Haskell and doesn't exist in the logic is divergence and runtime errors. These are the only two effects that Haskell has: you can create divergent programs, and you can crash, you can call error. So when we use Liquid Haskell to type proofs, we make sure that these two effects are excluded. Again, as I said, you can use refined Haskell functions to express theorems, and then the proof is basically a Haskell function, and Liquid Haskell will check that this Haskell function cannot diverge and cannot call error. Otherwise, you could state the associativity property, the function that takes the three arguments and returns a unit refined to satisfy associativity, and then develop a proof that basically says: give me three inputs and I'm going to return an error.

Error inhabits every type, so this would be a valid proof. This is why we need to exclude errors, and the same holds for divergence. In Haskell, we use undefined a lot; normally we use it to say, I'm going to define this later. And the type of undefined is forall a. a, which translates to false in the refinement logic.

And its definition basically says undefined is equal to undefined. So divergence, like errors, inhabits all types and would satisfy all theorems. If you allow divergence, you cannot have a theorem prover, a formal tool that gives mathematical proofs. But Liquid Haskell excludes it: it has the termination checker, it has the totality checker, and it checks that your proofs are actually mathematical proofs.
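To illustrate the danger (a hypothetical sketch; the names here are not from the interview): in plain Haskell both `error` and a diverging definition inhabit every type, so without the totality and termination checks they would typecheck as "proofs" of anything.

```haskell
-- Claims the associativity theorem, but its body just calls error: error
-- inhabits every type, so Liquid Haskell's totality check must reject it.
{-@ bogusProof :: xs:[a] -> ys:[a] -> zs:[a]
               -> { xs ++ (ys ++ zs) == (xs ++ ys) ++ zs } @-}
bogusProof :: [a] -> [a] -> [a] -> ()
bogusProof _ _ _ = error "trust me"

-- Likewise a diverging "proof": it never returns, yet has the right type,
-- so the termination checker must reject it too.
bogusProof2 :: [a] -> [a] -> [a] -> ()
bogusProof2 xs ys zs = bogusProof2 xs ys zs
```

Plain GHC accepts both definitions without complaint, which is exactly why a separate checker is needed.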

**Totality Checking**

**Adam:**
So yeah. No error, no underscore for undefined. So you’ve mentioned totality checking. Back when I interviewed Stephanie Weirich, I asked her if there should be mainstream programming languages that do totality checking. So why does Liquid Haskell bring that to the table?

**Niki:**
Why do we have totality checking?

**Adam:**
Yeah, and how is it useful? Not just for proofs, but for verifying that your code behaves correctly. What is totality checking? What is a total function?

**Niki:**
So a total function is basically, as we said before, a mathematical function: it cannot diverge, it doesn't call error, and all its cases are complete. It is defined for all of its inputs.

**Adam:**
The example we used of taking the head of a list, that is not total.

**Niki:**
Yeah, this is where I was going. Say I'm writing a module, and after some calls I call head, right? It is a very natural specification to say that head must be called on a non-empty list. But the question is, where did this specification come from? Checking that all your functions are total is basically what gives you the specifications that get propagated. So head is not a total function, because it is not defined for empty lists, but you can use refinements to say that the input of head is a list whose length is greater than zero, for example. And then that specification will be propagated.
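In Liquid Haskell syntax that looks roughly like this (a sketch; `head'` is a hypothetical name, and `len` is Liquid Haskell's list-length measure):

```haskell
-- Partial in plain Haskell, but total under the refinement: every caller
-- must prove its argument is non-empty, so the obligation propagates
-- outward through the program. Liquid Haskell checks that, given the
-- refinement, this single equation covers all reachable inputs.
{-@ head' :: {v:[a] | len v > 0} -> a @-}
head' :: [a] -> a
head' (x:_) = x
```

Calling `head' []` would then be a refinement type error at verification time rather than a crash at runtime.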

**Adam:**
This is how totality checking lets us do better verification, right?

**Niki:**
Because it is tracking the potential sources of error.

**Adam:**
And then for the proofs that we were talking about, totality’s also required?

**Niki:**
Yes, because otherwise you could define an inductive proof without the base case or without the inductive case. If I were not checking totality, I could have the associativity proof defined only for the base case. The thing that precludes me from that is that I need to check that the proof is defined for all potential inputs, both for the empty list and for non-empty lists.

**Infinite Loops**

**Adam:**
The other thing: what about infinite loops? What if I just write something that loops forever and claims to return something but never returns?

**Niki:**
So that is a very interesting question. You can reason about that, and I think many formal systems reason about infinite loops as long as they are what we call productive: they may diverge, they may never finish, but they should always produce something. Refinement types, or I should say liquid types and Liquid Haskell, use the SMT solver, so they map Haskell into the logic, and in the logic you cannot have divergence; the termination checks make sure you don't have infinite loops. So what Liquid Haskell does is check that all your code is not diverging, and if it cannot prove that, it will tell you: this function, I cannot prove that it terminates, so maybe something will go wrong in your logic. In practice, divergence is something we don't want in programs anyway. You can formally reason about it, but it is difficult.

**Adam:**
So yeah, we've talked a lot about refinement types and where they're useful, but your project is Liquid Haskell. Usually before I interview somebody, if they have a paper or something, I like to print it out. I guess for you that's your dissertation, but it was too big, so I didn't print it out. It's like 240 pages.

**Niki:**
Yeah it is a bunch of Liquid Haskell [inaudible 00:43:39].

**Adam:**
So how big is the Liquid Haskell code base? How long did it take to construct? How big of a project is this?

**Niki:**
It's a big project. It's been some years now. It was the first and only project of my PhD. I started my PhD in 2011, and I think the project started in 2012 or something like that. I have to say that liquid types were exclusive to OCaml at the beginning, and when I joined my PhD, my supervisor [inaudible 00:44:11] was at the point of porting liquid types from OCaml to Haskell. This is where we started, and it's been, how many, eight years now, and we're still [inaudible 00:44:21]

**Adam:**
That’s awesome.

**Niki:**
But that's the thing. I think if we had stayed in OCaml, maybe we wouldn't have continued it. I don't know.

**ASTs and GHC**

**Adam:**
Yeah. You’re getting uptake from the community which is awesome.

**Niki:**
Yeah.

**Adam:**
I’m doing an interview soon about building a compiler. Do you have to do compiler-like stuff in Liquid Haskell? Build an AST and somehow feed it to the solver? Does it end up having components like that?

**Niki:**
Yes. We have an AST, but only for the types. Again, what Liquid Haskell mostly does (it has a lot of engineering) is take Haskell code and some refinement types and translate them to the SMT solver's language; then, if the SMT solver finds that something is not valid, it maps the result back. So we have an AST for the types, and what is really cool is that for expressions we use GHC's AST. This removed a lot of work, because instead of parsing the code ourselves, we compile it with GHC, with GHC's [inaudible 00:45:35] simplifications, and then we ask GHC: give me the AST of the program that you compiled. What is amazing is that the intermediate language of GHC has about seven alternatives: it has variables, literals, and so on; in total, I think, seven cases. That simplified the analysis of Haskell programs a lot.

And the cool thing is that if you ask GHC, it gives you this very small intermediate representation; you analyze that and you can target the whole of Haskell. All the type classes and syntax that you write, everything, reduces to this very tiny intermediate representation.
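The intermediate language Niki refers to is GHC Core. A simplified picture of its expression type might look like this (a sketch, not GHC's actual definition; the real `Expr` type carries a few more constructors and annotations, and the helper types here are placeholders):

```haskell
type Id      = String                      -- placeholder for GHC's Var
data Literal = LitInt Integer | LitStr String
type Ty      = String                      -- placeholder for types
data Bind    = NonRec Id CoreExpr | Rec [(Id, CoreExpr)]
type Alt     = (String, [Id], CoreExpr)    -- constructor, binders, body

-- Every Haskell program, whatever extensions and sugar it uses, is
-- desugared by GHC into a handful of cases like these, which is what
-- makes analyzing "all of Haskell" tractable for a tool.
data CoreExpr
  = Var  Id                -- variables
  | Lit  Literal           -- literals
  | App  CoreExpr CoreExpr -- application
  | Lam  Id CoreExpr       -- lambda abstraction
  | Let  Bind CoreExpr     -- let bindings
  | Case CoreExpr [Alt]    -- case analysis
  | Type Ty                -- type arguments
```

A verifier only has to handle this small set of constructors rather than the full surface syntax of Haskell.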

**Adam:**
That's very cool. Yeah, because it's such a large language with all the extensions and everything. If you had to build your own tokenizer and all of that, it would be a lot of work, right?

**Niki:**
GHC has this API that you can use and you can call it and you can ask it questions.

**Adam:**
Great. That sounds like it would make your project a lot easier, or that it does.

**Niki:**
Yes, it does, because you can rest assured that you target the whole of Haskell. It's not that there is some new syntax or something you haven't thought about; you really don't care what the user writes, because as long as you can analyze the small intermediate language, you're fine.

**Adam:**
So do you think other people should be trying to build verification tools that sit outside the language? Maybe not refinement types, but just in general, like a tool to validate that my team's Javascript doesn't hit certain conditions. Is that something that is underexplored in the world of professional software development?

**Niki:**
I'm not sure I understand what you're asking. Is it whether we should use verification, or whether it should be part of the language or external to the language?

**Adam:**
My question is, should people try to build verification tools on their own? Should professional software developers try to build tools to verify code? Should I try to build a verification tool that sits outside my language?

**Niki:**
Yeah, I think so, and actually they are doing it. I know Facebook has a tool called Infer, right? Big companies especially are moving towards verification. Even the fact that people are already using languages like Haskell or Scala in production [inaudible 00:48:03] means that people do care about verification, and they are willing to invest more time to learn and understand a complicated but safer language. So I think that adding verification to industrial products is definitely something that can happen, and I think we are going there. You spend a lot of time writing tests anyway. I don't know if it's a small step to go from the specifications and the tests, from what you have in your mind, to actual verification, but this is our goal, for, I guess, us academics: to make it a small step from tests to actual verification.

**Adam:**
A test can only show the presence of a problem, not its absence, right? It checks one specific case. What you're talking about is verifying for all inputs, right?

**Niki:**
Yes.

**Adam:**
So the power is there.

**Niki:**
It is.

**Adam:**
Well I think that’s all my questions Niki. Thank you so much for your time. Is there anything else you wanted to mention?

**Niki:**
No. Just, thank you very much for this interview.

**Adam:**
Awesome. It’s been great fun.

**Niki:**
Thank you.

**Adam:**
That was the show. I hope you enjoyed it. I think it ties in with all the conversations we've been having on this show's Slack channel: dependent types, constraint propagation, and type-level programming. If you have any feedback, let me know. Someone named [inaudible 00:49:26] gave a good review on iTunes, and [inaudible 00:49:31] gave the show a great mention on Twitter. Mark Weiss, the host of Using Reflection, also wrote a great review for the podcast, and [inaudible 00:49:39] mentioned the show on Twitter. So thank you so much, everybody; I'm sure I missed a whole bunch of other people. Podcasting is a somewhat solitary endeavor, and my enthusiasm for podcasting lives and dies by the little dopamine rush I get every time someone leaves an iTunes review, mentions me on Twitter or LinkedIn, or I get a Google alert saying that somebody mentioned me on some list of podcasts. So I really appreciate it, everybody. Thank you very much. Until next time. Take care.