Justin: For me, it’s like truly a language that actually becomes better and better the more you use it. Whereas I don’t think I can say the same about most I’ve used.
Adam: Before we get into the interview, if you are a Scala developer, my team at Tenable is hiring. We are a distributed team working on static analysis of Docker containers for security vulnerabilities. Tenable itself is a great place to work and we are looking for smart people to take on interesting projects. I will put a link with some details in the show notes.
Justin Woo, you are a bit of an evangelist for PureScript, so welcome to the show. I keep hearing about PureScript, so what is it? At a high level.
Adam: What brought you to PureScript?
Adam: Is PureScript strictly Haskell for the client side, or could you use it with Node.js?
Adam: Is PureScript just Haskell in JS, or?
Justin: I mean, it’s definitely easier, and the IDE tooling being built into the compiler is a big factor in why it’s easier. But being able to work with [inaudible 00:03:01] also just makes it much easier to write than having to learn about weird Haskell product type records and such. So I don’t know, I guess you could say that it basically is Haskell in JS, but I find that the row type features just make everything a lot easier to use.
Adam: So let’s dig into that. Record types are kind of hard to do in Haskell, but something that’s very easy, I guess, in like an OO world where you can have a person and an employee and so on and so forth. How does PureScript handle records?
Justin: In Haskell, if you want to work with records, you have to learn about how you have to have a data constructor associated with it and the position of the arguments actually matters, and then about how the labels generate these weird static functions within the module. Whereas in PureScript when you work with records, the fields can be in any order because they’re row-typed. The ordering of the fields is completely ignored and only the structure, coming together and unifying, is the important part. I think this is the big thing that makes it feel more familiar to work with. I was coming from [inaudible 00:04:35] background, so I wanted to work with basically hash maps that were heterogeneous and guaranteed for equality. If that makes any kind of sense.
Adam: Yeah, I think bringing up hash maps puts some meat on it. Maybe we could go through an example. So I was mentioning like, let’s say you have a person record and an employee record, and they both have first name last name but employee has a start date. So what does the row polymorphism give us here?
Justin: In this case, you could model it such that any functions where you work with these… Sorry, the employee was the subtype of person, right? The structural subtype. Or was it the other way round?
Adam: Do they need to be subtypes?
Justin: Well, not necessarily. So you could write a function that says, “Okay, I can take any record which has this given label name with the type string, and then it’s extensible by any other rows.” So you could define functions that work like this, and it’s kind of what you really wish you had when you work with nominally typed languages. When you say, “Oh, I wish I could work with functions.” I could say, “I just need a class that has two strings in it, instead of this whole weird object hierarchy.”
And then say if you’re working with a person record and then you have an employee record that’s a supertype of that structurally, so it has all the same fields but then more, you could also have functions where you define it as, “You want to work with any kind of type where there is some kind of union between the fields that are in person and some kind of open rows that can be defined by the concrete context and this is the type you want to work with.”
I don’t know if this necessarily makes sense when I put it in words, but with a crayon diagram it makes more sense.
Adam: The part I understand is that I could have this employee which has first and last name, person which has first and last name, and I can have a function that operates on anything that has a first and last name, right? And that’s the structural typing?
Justin: Yeah. So it’s like, if you had functions that work on crayon boxes, you could define them as, “You need any kind of crayon box that has blue in it”, or “work with a crayon box where blue and red are in it”, and the complementary colors could be like an empty set of crayons, and all kinds of fun stuff you could come up with.
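To make the person and employee example concrete, here is a minimal sketch in TypeScript, used only as a stand-in: its bounded generics over structural types approximate the row-polymorphic functions described above, but they are not the same mechanism as PureScript rows. The record shapes and the names `fullName` and `withInitials` are hypothetical, invented for illustration.

```typescript
// Hypothetical record shapes taken from the conversation.
type PersonRec = { firstName: string; lastName: string };
type EmployeeRec = { firstName: string; lastName: string; startDate: string };

// Accepts any record that has at least these two string fields.
function fullName<R extends { firstName: string; lastName: string }>(r: R): string {
  return `${r.firstName} ${r.lastName}`;
}

// Returns the record type it was given plus one new field,
// so the caller's extra fields are not forgotten by the type checker.
function withInitials<R extends { firstName: string; lastName: string }>(
  r: R
): R & { initials: string } {
  return { ...r, initials: r.firstName.charAt(0) + r.lastName.charAt(0) };
}

const personExample: PersonRec = { firstName: "Ann", lastName: "Lee" };
const employeeExample: EmployeeRec = {
  firstName: "Bob",
  lastName: "Ray",
  startDate: "2018-01-15",
};

const taggedEmployee = withInitials(employeeExample);
// taggedEmployee.startDate is still known to the type checker here.
```

Both `personExample` and `employeeExample` can be passed to `fullName` without declaring any subtype relationship between them; only the presence of the two fields matters.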
Adam: You could work with these properly statically typed records, but you can use them sort of like they were a hash map?
Justin: Yeah. I mean, for me it’s like what I’ve always dreamed about, being able to use hash maps as… I don’t know. If you define component properties, there’s like a total set of maybe a hundred different attributes you could find, but I want to be able to pass in records that have like four of them. After I pass that in, I want to have that concrete type information still, so I can do other things. Say if, for whatever reason, I wanted to take the same row type and use it to decode [inaudible 00:08:45] or get the keys out or compare between two different records, like the fields that have the same type, I can do that. And if you have some noble text and it’s a whole bunch of runtime checking, you can actually implement a lot of these as being concretely typed. You have a whole bunch of maybe something everywhere in your record if you count on this.
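The "pass in four of a hundred attributes and keep the concrete type" idea can be sketched roughly as follows. This is again TypeScript standing in for PureScript, and the attribute set `AllAttrs` and the function `describeAttrs` are made up for the example.

```typescript
// Hypothetical full attribute set for some UI component.
type AllAttrs = {
  id: string;
  title: string;
  disabled: boolean;
  tabIndex: number;
};

// Accepts any subset of AllAttrs; P remembers exactly which fields were passed.
function describeAttrs<P extends Partial<AllAttrs>>(
  attrs: P
): { attrs: P; keys: string[] } {
  return { attrs, keys: Object.keys(attrs) };
}

const described = describeAttrs({ id: "save-button", disabled: false });

// The concrete type survives the call: id is a plain string here,
// not "string or undefined", because P records that it was supplied.
const describedId: string = described.attrs.id;
```

Because the result type mentions `P` rather than `Partial<AllAttrs>`, later code can still rely on exactly the fields that were supplied.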
Adam: So how is PureScript implemented? Like the language.
Justin: I’m the wrong person to ask about that, right? Because I’m like an enthusiastic consumer. It’s written in Haskell and it’s a fairly small code base as far as I know, I think there’s only a couple dozen thousand lines. It’s not like a very big project. As a result, there’s a lot of features from Haskell that it doesn’t have, and then some things that Haskell has that it’s kind of changed in some ways.
And the second case, that’s the debugging that comes up for me often, or not too often but sometimes, is if I’m writing a parser and I accidentally blow the stack or I accidentally write some kind of loop or something. Because PureScript is a strictly evaluated language, you just have these situations sometimes and you have to figure out, “Okay, where did I mess up?” And “Do I need MonadRec or something to prevent blowing the stack?” Or “What kind of horrible design did I come up with that needs to be fixed?” Or something. So in this case I debug the output. But even then it’s, I don’t know. How often do you write parsers is the question. Sure, some people have the philosophical position where any time you do anything with strings, that’s something that’s parsing. But if you’re writing code using purescript-string-parsers, which is kind of like Parsec for PureScript, if you’re writing Parsec-style combinators, yeah, you’re going to run into that more often than others. But usually you only need to write a parser a couple times a month or something at most, I feel like. But I don’t know.
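The stack problem and the usual way around it can be sketched like this. The code is TypeScript, not PureScript, and `tailRecLoop` is a hypothetical helper that is only loosely modelled on the loop-or-done style of stack-safe recursion; it is not the API of any PureScript library.

```typescript
// A step either asks for another iteration or finishes with a result.
type Step<A, B> =
  | { tag: "loop"; state: A }
  | { tag: "done"; result: B };

// Runs the step function in a plain loop, so the call stack does not grow.
function tailRecLoop<A, B>(step: (state: A) => Step<A, B>, initial: A): B {
  let current = step(initial);
  for (;;) {
    if (current.tag === "done") return current.result;
    current = step(current.state);
  }
}

// Naive recursion: one stack frame per number, so a large n may overflow.
function sumToNaive(n: number): number {
  return n === 0 ? 0 : n + sumToNaive(n - 1);
}

// The same computation expressed as explicit steps.
function sumToSafe(n: number): number {
  return tailRecLoop<{ n: number; acc: number }, number>(
    ({ n, acc }) =>
      n === 0
        ? { tag: "done", result: acc }
        : { tag: "loop", state: { n: n - 1, acc: acc + n } },
    { n, acc: 0 }
  );
}
```

The design choice is to turn the recursion into data (a `Step` value) that a single loop interprets, which is why the depth of the computation no longer depends on the depth of the call stack.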
Adam: You mentioned stack overflows. Does that just happen with recursion or is there some sort of abstraction difficulties with going from PureScript to JS when evaluated?
Adam: PureScript is strictly evaluated. Is that a good decision, are you happy that it’s strictly evaluated? It sounds like you’re missing laziness.
Justin: Yeah. But I miss it every time I have these alternative cases, and both branches get evaluated because of my own laziness. So I have to find workarounds for that. It’s not very costly in the end, it’s usually quite cheap, but it still feels like a bummer.
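A small sketch of what "both branches get evaluated" means under strict evaluation, and of the cheap workaround of deferring one branch. It is written in TypeScript, which is also strict; the names `orElseStrict` and `orElseLazy` are hypothetical.

```typescript
let evaluationCount = 0;

function expensiveFallback(): number {
  evaluationCount += 1;
  return 42;
}

// Strict version: the fallback argument is computed before the call,
// even when the first value is present.
function orElseStrict(first: number | null, fallback: number): number {
  return first !== null ? first : fallback;
}

// Deferred version: the fallback is a thunk that only runs when needed.
function orElseLazy(first: number | null, fallback: () => number): number {
  return first !== null ? first : fallback();
}

const strictResult = orElseStrict(1, expensiveFallback()); // fallback evaluated anyway
const countAfterStrict = evaluationCount;

const lazyResult = orElseLazy(1, expensiveFallback); // fallback never called
const countAfterLazy = evaluationCount;
```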
Adam: I see. So you mentioned the effects system. Haskell has an IO monad and everything that side effects is wrapped in that. How does PureScript work?
Justin: Yeah, I mean it’s about the same thing. There’s this Eff type, and right now the Eff type is a parameterized type with a row of your effects and then your actual type coming last. So the idea is that it’s a [inaudible 00:18:34] type where you can actually declare what kind of effects happen when you run a function. In practice, this hasn’t been so nice and it’s been more of a nuisance for many. So in the next version of PureScript we’re planning on going to this Effect type that gets rid of this row type parameter. At least this part will be gone, but overall, PureScript will still be a purely functional language, just like Haskell. Well, minus the other bits.
Adam: Well, it has “pure” right in the name. It’s putting down some stakes right there. So I had a previous interview with John de Goes, he was talking about Scala, I think, but he was saying that rather than having everything, like a function that returns an Int with some side effect, be IO Int, he prefers to separate this a lot more. So he might have a type class for Random, and it might return a random Int, kind of splitting the IO up into a bunch of different types of side effects that could happen. Is this what Eff, the E-F-F effect type, is about?
Justin: I mean, Eff is just a phantom type, so it doesn’t do anything different. Mechanically, there’s nothing different and there’s just this type parameter that you use for some simple checking at compile time. The approach of using type classes and other things to actually slice down what specific operations happen, that kind of MTL approach does exist in PureScript, and there are various practitioners of this approach, and there are some helper libraries for that. But the main Eff itself is just this phantom type, and you can coerce the Eff type’s row into anything else, like unsafely coerce it, and representationally it’s the same thing. There’s no difference.
Adam: I thought that the Eff type let you subdivide the types of side effects that you were doing. So that’s wrong?
Justin: Well, it basically lets you write notes in Sharpie on masking tape and tape it over something, but it doesn’t actually mean anything, right? So unsafely coercing, you just tear off all the tape and it’s still the same thing. Whereas the really cool thing about MTL-based approaches is that you can have the compiler synthesize the code for you, that gives you the code for running effects.
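The masking-tape picture can be sketched as a phantom type parameter. This is a TypeScript approximation under stated assumptions: `EffLike`, the string labels, and `unsafeRelabel` are all hypothetical names for illustration, not the PureScript Eff API.

```typescript
// The Effects parameter is a phantom: it appears only in the type.
// The optional marker field is never populated at runtime.
type EffLike<Effects extends string, A> = {
  run: () => A;
  readonly _effects?: Effects;
};

// Hypothetical effect labels, written as string literal types.
type ConsoleLabel = "console";
type RandomLabel = "random";

const logLines: string[] = [];

function logLine(msg: string): EffLike<ConsoleLabel, void> {
  return {
    run: () => {
      logLines.push(msg);
    },
  };
}

// "Tearing off the tape": relabel the effects without touching the value.
function unsafeRelabel<E1 extends string, E2 extends string, A>(
  eff: EffLike<E1, A>
): EffLike<E2, A> {
  return eff as unknown as EffLike<E2, A>;
}

const labelled = logLine("hello");
const relabelled: EffLike<RandomLabel, void> =
  unsafeRelabel<ConsoleLabel, RandomLabel, void>(labelled);
relabelled.run();
```

The relabelled value is the very same object as the original one; only the note written on it for the type checker has changed, while the result type `A` is still checked as before.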
Adam: Okay yeah, I understand. I guess if you were doing unsafe coerce though, you could do anything right?
Justin: unsafeCoerceEff will just coerce the effect rows, and then the type of the actual item inside will still be checked correctly. But this will go away with 0.12, where we’ll get rid of the Eff type and it’ll be named Effect and there’ll be no more of this weird row type parameter. If you do want to look for something that hooks up the correct effect handlers with these effect rows, like say if you wanted to have console effects that you could send messages of what kind of console effects you want to have, instructions for what should happen, and have the interpreter for actually realizing these messages into actual actions, you might look into purescript-run, which is a library that Nate Faubion made for doing this kind of stuff. There are actual extensible algebraic effects in PureScript. But personally I don’t have any experience with that. I usually only use MTL approaches or I use a free monad to do whatever. Or often I just write normal programs in Eff, and it’s like writing a program with only IO, but with specific functions that are pure that I test or I care about, and the rest I’m just doing some point to make it all work.
Adam: That makes sense, so you’re just trying to limit the amount of your code that actually does side effects, and you can kind of test the rest because it’s all pure.
Justin: And generally I’m also a believer in actually clicking through things and doing integration tests, so I would rather have a headless browser on my application and click around than to have unit tests. Not that unit tests aren’t good, it’s just that oftentimes I end up writing these meaningless tautologies in unit tests, and I really need to write either property tests for these or have some kind of proof-based system that generates routines and can correctly do these or [inaudible 00:24:14] more on the types. So do I need these three things in addition to unit tests?
Adam: So you think unit tests are a horrible idea, I got that correct, right?
Adam: So in this case, the avocado and pumpkin are two different types?
Justin: Or two different values, right?
Adam: Oh, okay.
Justin: Avocado is like something that’s trivial to cut, then a pumpkin, depending on what it is, it could be quite hard. Japanese kabocha isn’t the easiest thing to cut in the world.
Adam: The avocado pit though, I don’t know if you’re going to get through that. No, I see what you’re saying, yeah. That’s funny, that example kind of threw me off.
So PureScript does have some sort of property-based testing library, I think. Does it, yeah?
Justin: Yeah, there are various ones. I mean, I usually write code at home, so I’m not using them, but I probably will end up using them for work at some point. I’m also a big believer in writing something really nice at first, and then whenever I notice problems or I run into some problems, if they are things that should never happen, then I want to model that in my types instead of writing some kind of test. So the problem for me with writing tests is that it’s more work to write them than to write more correct types.
So I guess one case would be like, I have this vid checker thing where I keep track of what shows I’ve been watching, and I mark shows as being watched or whatever, and at first I wrote this really naively, each handler was whatever effect type, and it just returned a string. That string I would send back as a response. But the problem became if I only returned a string then it could be any string, so I would often, when changing code around, make the mistake of applying the same handler to multiple occasions or returning the wrong type altogether.
And then my front end would make these requests and think they were supposed to be this autotype and it would parse these, and then I would get these code batches that would always fail with the wrong type. It would always say, “We failed to decode this because it had the wrong type” and whatever, and vice versa, the same thing would happen with my front end code where I would call for resources but then use the wrong URL, so of course I’m never going to get the right thing back from the server when I call the wrong URL.
So to solve these problems, I’ve been gradually upgrading my vid checker program so that I have more type-level evidence about what kind of things I’m looking for or not. So over time I went from a model that said, “Okay, I’m just going to request at this string URL, then I’m going to pretend the output should be this, then parse to that, and then handle the success or error” and then moving all the way to, “Okay, I have types where at the type level I know what request it needs, what response it needs and I know statically what the URL string will be”. I like moving more stuff to the type level because it’s both documentation about what I want, and I can write more supplementary documentation on top of that, whereas if I write it as a test, then it’s like something I have to update and I have to know more about in the future. If this long story makes any sense.
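One way to picture the move from "string URL plus hope" to routes that carry their request and response types is the sketch below. It is TypeScript rather than PureScript, it uses an in-memory handler table instead of a real server, and every name in it (`RouteSpec`, `markWatchedRoute`, the payload shapes, the URL) is hypothetical rather than taken from Justin's program.

```typescript
// A route description carries its URL plus phantom request/response types.
type RouteSpec<Req, Res> = {
  readonly url: string;
  readonly _req?: Req;
  readonly _res?: Res;
};

// Hypothetical payload shapes for a "mark as watched" endpoint.
type WatchedReq = { path: string };
type WatchedRes = { path: string; watched: boolean };

const markWatchedRoute: RouteSpec<WatchedReq, WatchedRes> = { url: "/api/watched" };

// A tiny in-memory stand-in for a server: handlers are registered per URL.
const handlerTable = new Map<string, (body: unknown) => unknown>();

// The handler's types are dictated by the route, not chosen by the caller.
function registerHandler<Req, Res>(
  route: RouteSpec<Req, Res>,
  handler: (req: Req) => Res
): void {
  handlerTable.set(route.url, (body) => handler(body as Req));
}

// The client side uses the same route value, so URL and types cannot drift apart.
function callRoute<Req, Res>(route: RouteSpec<Req, Res>, req: Req): Res {
  const handler = handlerTable.get(route.url);
  if (handler === undefined) throw new Error(`no handler for ${route.url}`);
  return handler(req) as Res;
}

registerHandler(markWatchedRoute, (req) => ({ path: req.path, watched: true }));
const watchedReply = callRoute(markWatchedRoute, { path: "show/ep1.mkv" });
```

Since both sides refer to `markWatchedRoute`, a handler that returns a bare string or a client that expects a different response shape would be rejected by the type checker instead of failing to decode at runtime.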
Adam: No, it does, right. I forget who said this, but “Make invalid states unrepresentable”. So it’s like, it doesn’t matter what language you’re using if everything just takes inputs of strings and returns inputs of strings.
Justin: Exactly. Do you know about my Twitter meme where I had the guy bicycling and he puts a pipe through his spokes, and it’s like string to string to string to string. And he’s lying on the ground and he’s holding his knee, and he’s saying, “Types are a lie”. If you have very imprecise types then it’s like, of course it can’t help you, you haven’t tried to help yourself. Or you purposefully hurt yourself.
Adam: That makes a lot of sense. There’s also the documentation part you were mentioning. Also a unit test can only show the absence of a specific bug, right? Or a type… If this is a type integer, it just can’t be a string coming in here. The types prevent a whole class of problems where a unit test can just check that one thing doesn’t happen.
Justin: Yeah. Continuing on that, it’s like, if you don’t have the refinement in there, if you don’t say that it’s restricted to a non-zero or non-negative number, then if you run into those kinds of problems, then it’s like of course it’s going to happen. There was no guarantee that this wasn’t going to happen. Or rather, it’s not like “Of course it’s going to happen,” but it’s “Very well, it could happen, and if it does happen, you can’t be too surprised.”
Adam: Another case, say you have some sort of sum type, so things can be A or B, but if you have code that only handles the A case, it’s still typed, but at runtime that could explode if it hits B.
Justin: Yeah. So I consider this a total anti-pattern. You see sometimes people using languages that aren’t as powerful. They write these cases where the other branch just gives you back junk or just says, “Debug crashed.” This shouldn’t happen. If you write code that type checks with this, then it’s like, mechanically it can happen, and at some point it very well might and it just might crash. Of course there’s different ways of then augmenting it with more types and being able to coerce certain values in certain situations because you know that you’ve provided evidence for it, but it’s like… I think this very much gets to the thing where, I usually say that the problem with programming languages, typed ones especially, isn’t necessarily that they don’t have the features, it’s that they don’t have the right culture for it. If you don’t have a culture of caring about figuring out these things, finding out about these things and taking inspiration from other places, then it’s like, of course you’re going to end up getting a lot of incorrect modeling everywhere.
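The non-zero refinement mentioned here can be approximated with a branded type and a smart constructor. This is a TypeScript sketch, not a PureScript refinement library, and the names `NonZero`, `toNonZero`, and `safeDivide` are hypothetical.

```typescript
// A branded number: only values produced by toNonZero carry the brand.
type NonZero = number & { readonly __brand: "NonZero" };

// Smart constructor: the single place where the runtime check happens.
function toNonZero(n: number): NonZero | null {
  return n !== 0 ? (n as NonZero) : null;
}

// Division that cannot be called with a possibly-zero divisor.
function safeDivide(numerator: number, divisor: NonZero): number {
  return numerator / divisor;
}

// Callers are forced to decide what happens when the check fails.
function divideOrDefault(numerator: number, rawDivisor: number, fallback: number): number {
  const divisor = toNonZero(rawDivisor);
  return divisor === null ? fallback : safeDivide(numerator, divisor);
}
```

Passing a plain `number` to `safeDivide` does not type-check, so the "it could very well happen" case has to be handled where the value enters the program.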
Adam: This kind of ties into pattern matching and totality checking. Very few languages do actually enforce totality so that you have to check all cases. How about PureScript, does it have any sort of functionality in that area?
Justin: I mean, I’d have to split that up and say that generally as a culture, PureScript programmers care a lot about total functions, and probably even more than Haskellers care about total functions. Haskellers accept that asynchronous exceptions might happen, and then there’s different thoughts about how to handle these. But in PureScript, almost everybody believes that you should always only have these total functions. Totality here meaning that, yes, all exhaustiveness checks are done, and all the obvious runtime errors are avoided, and there’s purposefully no triggering of exceptions and runtime errors. But you’ll still have cases, it’s the halting problem where you literally can’t just do everything on earth, right? So you’re going to run into these sometime. Say when you’re writing recursive parsers, if you don’t use MonadRec or something, then you might get [inaudible 00:34:14] and such, and it’s going to happen. There’s only a handful of completely total languages that are used out there.
Adam: PureScript has some totality checking, then, but not all of it? Is that what you’re saying, or?
Justin: It does the standard exhaustiveness checking, but it’s just that it can’t check the totality stuff that isn’t in the type system, right?
Adam: Yeah. What were the stumbling blocks for you when you came to PureScript?
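Exhaustiveness checking over a sum type with an A case and a B case can be sketched as follows. The example is TypeScript, where the check has to be requested through a `never`-typed helper; the type `Outcome` and the helper `assertNever` are hypothetical names.

```typescript
// A sum type with two cases, matching the "A or B" example above.
type Outcome =
  | { tag: "A"; value: number }
  | { tag: "B"; reason: string };

// Only callable when the compiler has proven no cases remain.
function assertNever(x: never): never {
  throw new Error(`unhandled case: ${JSON.stringify(x)}`);
}

function describeOutcome(o: Outcome): string {
  switch (o.tag) {
    case "A":
      return `A with ${o.value}`;
    case "B":
      return `B because ${o.reason}`;
    default:
      // If a new variant is added to Outcome, this line stops type-checking.
      return assertNever(o);
  }
}
```

This covers the "all cases handled" half of totality only; as noted in the conversation, non-termination and stack exhaustion are outside what this kind of check can see.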
Justin: So, when I picked up PureScript, even though I’d used [inaudible 00:34:57] before, I literally did not know what an ADT was. So I didn’t know what a sum type was, I didn’t know what a product type was, I didn’t know what a data type constructor was. There were a lot of things that people will probably laugh at me when they hear this, but I just didn’t know them. I don’t know. Part of it is my own fault, but also I was never exposed to the right terminology to even learn these things, so I kind of saw it as inevitable that I wouldn’t know these things. Just slowly learning what an ADT actually is, what constructors actually are. Those kind of things took time, mostly because I didn’t ask anyone.
Other than that, for the everyday coding things… I think it’s just experience, really. The base language, when you use it to write applications, isn’t too difficult. It’s just being more patient with yourself, thinking through it and not being too afraid to think about the symbol substitution that goes on when you apply functions like map. Map takes a function from A to B, you give it some functor of A, and it gives you the same functor type back with B, having gone through the transformation. And doing that symbol substitution in your head or just writing it out even, I sometimes write it out in that buffer, these kinds of things. It’s like, I don’t know, it feels kind of silly but it’s actually quite useful to further your own understanding of it, so.
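Writing the substitution out by hand might look like the sketch below. TypeScript has no direct way to abstract over the functor itself, so `map` is specialised manually to two containers; `MaybeBox`, `mapArray`, and `mapOptional` are hypothetical names for this illustration.

```typescript
// General shape being substituted into: map : (A -> B) -> F<A> -> F<B>

// A minimal optional container, used as a second "functor".
type MaybeBox<A> = { tag: "nothing" } | { tag: "just"; value: A };

// F = Array:  (A -> B) -> Array<A> -> Array<B>
function mapArray<A, B>(f: (a: A) => B, xs: A[]): B[] {
  return xs.map(f);
}

// F = MaybeBox:  (A -> B) -> MaybeBox<A> -> MaybeBox<B>
function mapOptional<A, B>(f: (a: A) => B, m: MaybeBox<A>): MaybeBox<B> {
  return m.tag === "just" ? { tag: "just", value: f(m.value) } : { tag: "nothing" };
}

// Substituting A = string, B = number for the same function f:
const wordLength = (s: string): number => s.length;

// (string -> number) -> Array<string> -> Array<number>
const lengthsInArray: number[] = mapArray(wordLength, ["ab", "cde"]);

// (string -> number) -> MaybeBox<string> -> MaybeBox<number>
const lengthInBox: MaybeBox<number> = mapOptional(wordLength, { tag: "just", value: "abcd" });
```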
Adam: That makes sense I guess.
Justin: I don’t know if I have a good answer.
Justin: I don’t know, I don’t think that’s happened to me.
Justin: Yeah, I mean, the individual statements, you can tell where they are and how they map, right? But then that doesn’t mean that the actual function applications make a lot of sense.
Justin: Yeah. And then if you want to write normal algorithms and functions out of it, then you can convert it from Array to List and back. And there are functions in the foldable package, where you say, “Okay, I know my array is Foldable, and I know that lists are Foldable,” so I can say, “fromFoldable, convert this to that,” and it converts the array to a list, and it can do that vice versa. So conversions are quite easy, because most instances that make sense already use this.
Adam: Are people writing things in PureScript and releasing them to the greater Node.js community? It seems to me that you could, right? That you could use these guarantees to write something that’s really solid and then just release it as a JS library.
Adam: So what’s Aff?
Justin: It’s an asynchronous effect library in PureScript. That’s about the gist of it.
Adam: Okay. And you think it would be a good solution for just general node use, but it’s not going to happen, is what you say?
Justin: Yeah, I mean it’s quite nice and you can spin off fibers from it and manage and kill fibers. It’s a nice way to be able to work with asynchronous data. But I think it will never be popular. Understandably so.
Adam: How about the other way? Is it a benefit to you as a PureScript developer that there’s this rich Node.js ecosystem that you can pull things from and kind of do your FFI stuff?
Adam: So who should use PureScript?
Justin: Who should use PureScript?
Adam: Yeah, you want to grow the PureScript ecosystem, who should be checking it out? What kind of developers? JS developers, Java people, anybody?
Adam: So PureScript has this functionality called type holes, correct?
Adam: And so previously I had an episode about Idris, and we were talking about in Idris you can have a hole, you know, you say, “I don’t know what goes here,” and then you’ll be told the type. And also they have an expression search, so it could tell you, “Hey, what goes here is actually this function that exists, and I’m just going to write in the definition for you.” Does PureScript go in this direction, or?
Justin: It can’t be as powerful in that there’s not as much information to work with, but roughly speaking, I do use the type hole feature a lot. I just write a question mark, I don’t know what this is, and then the compiler can tell you, after searching through your environment, “Okay, these functions actually meet the requirements, actually are the same type as that hole is, and it could use these functions properly.” And then usually it’s like some suggestion, traverse or sequence. So I use it quite a bit. But yeah, it’s not going to be anywhere near as powerful as Idris, but also I don’t have very much experience writing Idris.
Adam: And also, I think PureScript has type class deriving, is that right?
Justin: Yeah, some of the type classes can be derived. There are some obviously nice ones, like being able to derive the Newtype type class, so you get these operations that work on any generic newtype, or being able to derive Eq or [inaudible 00:48:04]. For a lot of very simple cases, it’s quite nice to be able to just derive these and throw your data into a Set or something. For me, the most useful is the ability to derive generics-rep.
Adam: Is this like serialization or no, something else?
Justin: I mean, it’s kind of like serialization, but to an actual data type that’s derived by the compiler. It’s the whole topic of datatype generics, where you can generically work with the representation of a data type and just write generic functions for those representations, and the compiler will give you the stuff to translate to and from the representations and your concrete data types.
Adam: Oh I see, this is like generic programming, like getting the first element of some product type, for instance, and writing it in such a way that it’s generic across all product types, or?
Justin: Yeah, basically. Any product where you derive the generic rep, then you use the functions accordingly to convert to and from. There’s GHC Generics, generics-rep, and I guess in Scala land there’s a few libraries that do this.
Adam: Like Shapeless I think is in a similar vein. So it could be used for serialization, right? Because once you have that generic representation, you could use that then to derive some sort of serialization code?
Justin: Yep. And in Haskell I’ve used it to work with record types and create the deserializers for INI files and also to do type-level routes, so I would have the correct types for registering handlers for scotty server routes. Just a whole bunch, anything you want to do, really.
Adam: Yeah. Very cool. That’s about all I got for questions here, is there anything else you’d like to mention, or?
Justin: I guess I should shout out that I live in Helsinki and it’s nice here, I’d like to meet anyone who actually listens to this show from Helsinki, because there’s just so few people in Helsinki, it feels like.
Adam: Perfect! So this is a call out for anyone listening to this in Helsinki.
All right, well it’s been great having you on, Justin. Thanks so much for your time.
Justin: Thanks for having me.