Open Source Health and Diversity

With Heather C Miller

Heather Miller is an Assistant Professor at CMU. She is concerned that key open source projects are at risk of failure and no one is paying attention. Adam talks to her about open source, how it grows, the diversity problems it has and much more.

Heather also shares some interesting stories about the early days of Scala and her ideas for increasing diversity in tech.


Note: This podcast is designed to be heard. If you are able, we strongly encourage you to listen to the audio, which includes emphasis that’s not on the page.


Heather Miller: And we’ve got some teenage girls who show up once and turns out that they blew the socks off these JavaScript developers who kept on trying to mutate state, but these 16 year old girls were like, “Why do you keep trying to change things? Just make a new one.” It was obvious to them. The JavaScript dudes were like, “Oh, sorry, teenage girls.”

Adam Gordon B.: Hello. This is Adam Gordon Bell. Join me as I learn about building software. This is CoRecursive. Today’s guest is Heather Miller. She’s an assistant professor at CMU. She is also writing a book about distributed programming. And I wrote down all these questions about distributed programming. But she’s also an expert on building communities in open source. And we end up talking about problems with open source and contributor burnout, and increasing diversity, and also about how we’re all just humans and we should get together and talk sometimes. So I think you’re going to enjoy this interview. And I never got to any of my questions, so I will save those for another time. Oh, yeah, if you like the podcast, likely you know someone else who might like the podcast. So yeah, maybe let them know about it. All right, this is the beginning. So-

Heather Miller: All right. That’s the beginning.

Adam Gordon B.: … you mentioned in [inaudible 00:01:19] watch of yours that you felt that open source is digital infrastructure. What did you mean by that?

Digital Infrastructure

Heather Miller: So I adopted this term. The term digital infrastructure was actually coined by a woman named Nadia Eghbal, when she put together this report called Roads and Bridges. And she actually made the argument in that report that open source software is, in many ways, infrastructure that we just expect to be there and that we depend upon. And unlike actual roads and bridges, which we yell at our local government to fix when there’s a pothole, we can’t yell at anybody to fix something if there’s a problem in some open source thing. It’s just kind of… If somebody has time, they might do it or you should do it, which is kind of a different model, yet it’s still similar in that we expect these things to exist and to be working to do our jobs.

Nowadays, it’s really hard to find any piece of software that we use for anything that is not either consisting of many open source pieces or entirely open source itself. And we just kind of expect a lot of these things to be there and ready for us to use. And that people are maintaining that thing and that there are releases of the thing, right? And it’s infrastructure in that sense. It’s like to do our jobs, we expect this thing to be there.

Adam Gordon B.: Yeah. So Nadia, she worked at GitHub.

Heather Miller: Yeah.

Adam Gordon B.: I tried to get her on the podcast. If you’re listening, Nadia, please [inaudible 00:02:35].

Heather Miller: Get on the podcast, [inaudible 00:02:37].

Adam Gordon B.: So do you feel like the digital infrastructure isn’t being maintained?

Heather Miller: I would argue that to a large extent, much of the digital infrastructure that we care about is maintained and there are companies throwing money behind stuff. And in some cases, you’ve got companies fighting with other companies for control over something, right? They’re all putting energy into it, but the main issue obviously, is that this is not something easy to see. It’s easy to see if there’s a pothole. It’s not easy to see if something is not maintained or the person who’s maintaining it is totally stressed out and quitting all open source. Right? You can’t see that very easily. So there are obviously cases here and there where there are projects that a lot of people care about or that are important for some reason that are at risk that we don’t realize are at risk.

And I think that maybe what people should carry away from this observation is that as developers, we should perhaps A, try to be a little bit more careful and conscious of the fact that a lot of this is still volunteer effort that people invest in things. They do this on their free time, which doesn’t give people the right to show up on an issue tracker and be insulting and all of that, right? It’s as much on us to kind of observe when there is an open source project that people need or care about or that is important, that is struggling. It’s something that we just can’t pretend is not a problem. I mean, obviously, there’s a whole bunch of hilarious examples that we can cite. Right? Like the Equifax breach, right? They tried to blame it on… was it Apache Struts or something? They tried to blame it-

Equifax and OpenSSL

Adam Gordon B.: Oh, really?

Heather Miller: Yeah. The whole Equifax thing is blamed on an Apache project, right? And then of course, Equifax got into trouble and they had to pay up, but they didn’t apply a security fix in time. And they were like, “Well, it’s not our problem because it’s open-source.” Well, it is your problem because you lost a lot of people’s data. But then obviously, there was OpenSSL. That whole thing with everybody, just the whole internet depending on it and then nobody realizing that it was just one guy who was super stressed out and doing contracting and getting paid minimum wage to keep the thing alive.

Adam Gordon B.: He was getting paid minimum wage?

Heather Miller: The discussion that I have read about what he was getting paid was something shockingly, embarrassingly low. And the guy who was commenting on how much he was getting paid said that he would not even characterize it as getting paid. It’s not enough to support one person’s full-time work, is what he described it as.

Adam Gordon B.: Oh, wow.

Heather Miller: So I don’t know what the number is, but it’s embarrassingly low, is all I can say.

Adam Gordon B.: The Struts guy, whoever’s maintaining Struts, I feel bad for them because that seems like it’s a very unsexy maintenance project to like-

Heather Miller: Oh, yeah. Absolutely. But let’s just think about all of the terribly unsexy things, right? Let’s think about YAML or something. Everybody complains about it, right? These are hard things to maintain too, right?

Unsexy Open Source Projects and sbt

Adam Gordon B.: Yeah.

Heather Miller: But everybody needs it, but there’s things that people like to complain about on the internet a lot that people need and must be maintained. And think about how unsexy that is, right? Because then you have all of these people complaining on the internet about how much they hate the thing that you are trying to keep alive because if it disappeared, people would have nothing, right? I know that the folks who maintain Scala’s build tool have dealt with… I mean, first of all, nobody likes build tools. Everybody hates every build tool, right? They get beat up all the time about sbt and it’s hard, right? To have everybody hating on the thing that you’re building or keep maintaining, but everybody needs it. People aren’t using something else. So what do we do? Do I stop maintaining it?

Adam Gordon B.: I’ve never thought of that. Yeah. Nobody likes sbt-

Heather Miller: No.

Adam Gordon B.: … but somebody… Yeah.

Heather Miller: But people maintain it. Yeah.

Adam Gordon B.: Yeah.

Heather Miller: You have to. I mean, what happens if… I mean, well, we can stop maintaining sbt. People are going to get even more angry. There’s no wins, right? It’s hard.

Adam Gordon B.: This specific example is interesting. Are these people volunteers or paid or-

Heather Miller: Of course, sbt began as a volunteer project. It was actually a PhD student. So his name is Mark Harrah. He was a PhD student at Boston University, I believe. And he was doing chemistry and then he just made sbt. It was like 2010 or something. I don’t remember. It was a while ago. Then everybody just started using it, they loved it. And it passed hands a few times. He ended up… eventually got hired by Typesafe, now Lightbend. And then there was a team of people who worked on sbt, but the team was always two, maybe three max people. So at one point it was Josh Suereth, now it’s Eugene Yokota and Dale Wijnand. They’re both engineers at Lightbend. So this is one example of something that is funded. And people, this is their job, is to maintain this tool at this company. But it’s still hard because everybody hates on it.

Scala Center

Adam Gordon B.: Yeah. I guess you’re in an interesting place to perceive this problem with your relationship with the Scala Center, right? Because I have to assume that there’s somebody angry at Twitter because some standard library list thing is not performant. And then somebody else is like, “I just made this for a graduate project. I didn’t ask you to run Twitter on it. That’s not my fault.”

Heather Miller: Yeah. Right. It’s your fault. You built Twitter on it. He’s like, “I didn’t.” Yeah. There’s all kinds of stuff like that, right? I mean, so I think we’ve gotten past this because I think people have stopped obsessing over what is in the standard library because I think people have accepted at this point that the standard library beyond collections, is not really being updated or changed or evolved in any way. And we should just build our own libraries because the standard library, it was exactly what you just described. It was like, huh, we’re at a university and we’re developing a programming language and we’re using the programming language to teach our classes with, so we need some library things. Who’s going to write the IO?

And so, people just did this stuff because we needed these things to be able to write simple programs with. And clearly, Scala, all of a sudden got popular and we couldn’t change a bunch of things now that a bunch of people are depending on it. The philosophy at some point, I don’t know, probably like three or four years ago, became like… People are building better things than we have and they’re maintaining it and publishing it. And we should not try to compete with them or anything. We should just give them all of the props in the world and let them go forward with their library for… I don’t know, parsing or this or that. Right?

Life for Contributors

Adam Gordon B.: So how do we make this world a better place? Be nicer to contributors? Is that the solution or-

Heather Miller: No, I don’t think we have an answer either. So I’m at Carnegie Mellon University now. I’m a professor here. And there is another professor at Carnegie Mellon who was doing research on how do you build trust between people on a remote team? He’s been sort of studying how people interact on developing software for a really, really long time. And I was just having a talk with him today actually, about some of these things that nobody is looking at, right? There’s not researchers looking at this. Maybe companies are looking at these things internally and developing opinions about collaboration and how open source works internally. But with the exception of a couple of research papers and Nadia’s report, there’s just not a lot of people looking at this stuff and not a lot of answers about how to make all of this stuff better.

So we have all kinds of observations that seem to have done something or caused something to be different in the context of Scala, but it really seems that interacting in person and just hanging out with people, like having lunch with somebody, things like this, they do amazing wonders for establishing camaraderie. I’ve watched a whole bunch of situations where there was an us versus them situation happening in pull requests and all kinds of other… whatever on Twitter, via all of the usual internet channels. But when people are stuck together in the same room for a couple of hours and they’re having a beer or they’re having lunch or something together, the whole them, us thing flips upside down and it’s just us. And people are a little bit less adversarial and they work together towards something more readily. And that’s really the opposite thing that I think everybody wants to hear because the internet was supposed to save everything, right?

Meet People in Person

The internet, we should just do everything remotely and online. And this little one-on-one interaction creates trust and empathy and all kinds of things that you just don’t get with pull requests or even Twitter or Slack. Just chatting with people, talking about your dog and finding out that this person hates onions.

Adam Gordon B.: Do you think video can work? So we literally did talk about your dog before we started.

Heather Miller: Yeah, that’s true. So I have no silver bullet example or answer for you. It’s just that… At least I have found in the last couple of years, making people sit down and talk to each other, discover that one person’s a vegetarian or something. Maybe this ends up getting into diversity and other things. Well, I think especially, it’s the most extremely pronounced in open source, but it’s a thing in software in general, right? There is a culture to being a software developer and it caters very much to introverts and detail-oriented people that like math and all kinds of other things. And I think what we really need is some diversity from… I mean, just other cultures of doing things. Like I said, it could be different colors and creeds and ethnicities of people, but also just put some artists in the room with us because they’re going to be super confused about why we would rather chat on Slack than just ask them a question across the room. They’ll be like, “Just ask her.”

Diversity in Open Source

Adam Gordon B.: Yeah.

Heather Miller: There’s something that bringing in these other ways of doing things, these other ways of relating to one another seems to do wonders for just getting things done. There’s a point here, I swear. It seems like if you look at development in general, we observe all kinds of trends. Let’s just look at whoever identifies as male and whoever identifies as female. In companies, it’s like 17 or 20 or circa that percent of the workforce will be female or identifying as female and the other will be identifying as male. But if you try to apply that same logic to open source, it’s like 5% versus 95%, right?

Adam Gordon B.: Oh, it’s even worse.

Heather Miller: Yeah. It’s much worse. I mean worse, as in less diverse in terms of at least being able to say something about somebody’s suspected gender identity, right?

Adam Gordon B.: I would have thought the reverse. That like, oh, there’s no barriers to contributing, so-

Heather Miller: [inaudible 00:12:32]. And the things that we kind of value in open source and in software development, it really caters to like I said, the introverts that like math and don’t like talking to other people, right? And we’re finding that we’re not good at getting things done unless we actually interact with other people or figure out ways to do this better. So this is when I say, bring in people that are extremely different from those introverts that like to hide and hack on stuff because that shakes everything up. In some cases, it forces people to empathize with one another, makes people try to understand each other better and people build trust and whatnot. Right? This is just the thing that I was talking about, where people in the same room with each other develop some sort of empathy and trust really quickly, right? Because they realize that this is a different way of communicating that they didn’t have before. That this is a person that I’m interacting with. I see that you’re a human and having human experiences, right?

Adam Gordon B.: We’re surely not quite as stunted as you portray us though, are we?

Heather Miller: No, but I think we are as a community of people, markedly different than just hairdressers and police officers and people who do other jobs. Like markedly different. And so, if I told a police officer that things got better when we met in person and talked, they would be like-

Adam Gordon B.: Of course.

Heather Miller: Yeah. What did you do before? Because the world has got its silos. And I think software developers especially, live in a silo, where we forget that there’s just all of these other ways of interacting and ways of sharing experiences. I think when you step out of that silo for a second… The thing with the teachers or the police officers or whatever, my point where I’m like, “Hey, we have this weird anecdotal evidence of people hanging out in one place, making it easier to… or sort of encouraging people to keep contributing to open-source because we built a little community where we have some sort of trust and empathy for one another. And people keep coming back because they’re part of that community.” Making that statement, this is obvious to other people, but to us, it’s like a new piece of wisdom. Right?

Adam Gordon B.: No, that’s true. So you mentioned bringing in people who don’t fit this stereotype, how do we do that?

Getting People From Different Backgrounds Interested

Heather Miller: That’s a really good question. I actually came from the art world somehow. I thought I was going to be an artist for a long time, then I went to art school and stuff.

Adam Gordon B.: Nice.

Heather Miller: And so, I realized that I always come at problems in a different way than other people do. And I realized that I think it’s because I didn’t go through all the same mathematical textbooks that other people went through in the same progression that they did. So I didn’t learn the standard Scrooge methodology when everybody else learned it, which meant that I did not follow that methodology when trying to solve some problem the first time. So this was super embarrassing for a long time because I would sometimes ask really stupid questions or really good questions, but never anything in the middle. Right?

There’s an embarrassing question, or it was a good question. Right? And I think just having that different sort of experience, having a different educational background and not sort of following the same steps that other people learn because this is sort of the curriculum that we all follow, letting people ask these sideways questions sometimes is really helpful. And a lot of universities in recent years have started developing master’s programs for people who don’t have a CS bachelor’s degree. You lack some years of experience that perhaps other people have because you’ve not been doing it as long, but you have now the same sort of coursework that somebody who has a master’s degree in computer science has. Like I said, it takes a little bit longer because you end up having to do some of this foundational stuff.

But I think that’s one thing that could take people who have completely different sort of ways of approaching problems and interacting with people and empathizing with people and just inject them into our software development teams. The fundamental idea is that somebody who has a completely different background and completely different experiences, who decided they want to do software development or get into computer stuff, having a way for them, like a path for them to be taken up into this, is actually extremely useful because it brings these weird sideways questions that I think are sometimes useful, right? I think that this is also another thing that we don’t realize yet is valuable. And I think it’s another definition of diversity that we don’t recognize and carry around and wave on a flag.

Adam Gordon B.: Yeah. Because it’s less visible. But yeah, coming at things from a different perspective. You mentioned these sprees. Is that right?

Heather Miller: Yeah.

Adam Gordon B.: Is that a diversity tool or-

Heather Miller: Kind of. I’d say in general, it’s not about diversity. It’s about establishing relationships and sort of empathy and trust between people that have never met each other, but maybe saw a screen name or something. And then showing people how easy it is actually to contribute to open source stuff, once somebody is standing next to you and letting you know that the process isn’t that scary. I know we have lots of documents that say contributing, and read these twenty-five pages before making a pull request and all that. Let’s talk about it, right? Oh, yeah, you should group everything into one commit. Just these little bits of helpful advice, and a person delivering that to you rather than you spending your weekend on something and then your pull request getting rejected because you didn’t squash everything into a single commit. And then being depressed about it and not reopening the pull request, right?

Scala Bridge and Minorities

We have other efforts that are more aimed at diversity. And actually, what’s interesting about those is that getting participants is really hard and that’s because of our funny silos that we live in, right? But Scala Bridge is all about taking underrepresented minorities, just teaching them how to program. And of course, because it’s Scala Bridge, we do it in Scala. They have them for other programming languages, right? Like Ruby and Go and all these other ones. And I’ve done this before in Switzerland and I’ve done this in the US and in all cases, it’s been super hard to get people to show up because the target audience is people who just are interested in learning how to program, they have half a day to spare. And it’s free and you just show up and we’ll have some mentors walking you through kind of an interactive curriculum that teaches you kind of programming concepts, but visually, and it’s fun.

And you kind of draw pictures and stuff. And so when I’ve done this before, we’ve gotten women who were doing business stuff. I don’t know, and they were good at spreadsheets. And they’re like, “I think I can program.” Right? That seems… but I don’t know how to start, right? And then they show up. And we’ve got some teenage girls who show up once and turns out that they blew the socks off these JavaScript developers who kept on trying to mutate state, but these 16 year old girls were like, “Why do you keep trying to change things? Just make a new one.” It was obvious to them. The JavaScript dudes were like, “Oh, sorry, teenage girls.”

Adam Gordon B.: That’s awesome.

Heather Miller: The thing about that which also I think that none of us fully appreciate, we all… Because there’s a lot of people interested in doing diversity things. Yeah, let’s teach diverse people how to do stuff, right? But where everybody fails again and again and again, is figuring out how to reach out to those people. Where do those people hang out? You want to get moms that have taken a career break and they want to pivot into programming after having some kids or something? Go to Facebook, right? It’s a different channel than you’re on. I think there’s a lot of people that would take this kind of stuff up and just say, what we do especially when we want to do like a… I don’t know. Like a Scala Bridge or something like this, teach people how to program.

We’ll go on Twitter to our developer friends and be like, “Hey, I’m doing a diversity thing.” And you’re reaching out to everybody who already knows how to program, everybody. And then we’ll all retweet it or something, but we’re retweeting it to each other. And it’s like, what’s more helpful is if you’re like, “Oh man, my cousin might be interested in this.” And you get your phone and you send a text message to your cousin. That’s way more useful than retweeting stuff or upvoting something.

Adam Gordon B.: It’s like colocated with some conference that’s on something completely different, like hairdresser conference or something.

Heather Miller: Exactly. Right. I mean, it’s just all these silly things that I think are obvious in retrospect, we were not actively realizing even though we have all these great intentions and we have… Companies are putting up… they’re getting somebody to do logistics. [inaudible 00:20:20] a paid admin to do logistics for these events and they get food and stuff. We’re putting money into it and time, but we’re not finding people because we’re asking the wrong places.

Adam Gordon B.: Yeah. So we talked about two kinds of things, right? One was like-

Heather Miller: Yeah. We talked about lots of random stuff. Sorry.

Adam Gordon B.: No, it’s not random. It’s like open source, there’s lots of projects where there’s just not enough people I guess, or… And then this diversity and… Do these relate or are they disconnected?

Heather Miller: So yeah, I think they do relate, but not in the immediate term. What I imagine is going to happen like 10 years from now-

Adam Gordon B.: The 16 year old girls will maintain OpenSSL.

Heather Miller: Yeah, exactly. And then we all can just sit back and relax because they’re bad-asses. No, but I think what’s going to happen is one way or another, either via some painful means or some sort of regulatory way, we will realize that developing open source things is required and either contributing paid time while you’re at work. Like company subsidizing your time to work on an open source thing is just kind of how we have to do things because that’s the only way we can pay the open source tax, right? Taken together with the fact that there’s this hilarious dearth of software developers. Everybody’s trying to hire all the time, right? And the number of people coming out of CS programs is not quite enough to fill the need. And that’s why all these coding bootcamps and all sorts of other things are interesting and desirable. And that’s why there’s all of these master’s programs for people who come from other degree backgrounds and whatever.

So I think these things are eventually going to meet up and that we need to suddenly create a lot more people who have the ability to hack on software things. And the only way that we can get the number of people that we need to work on these things is to actually stop failing at diversity so much. And then these people I think naturally will end up getting involved in maintaining open source things because I think ultimately, like I said, yeah, either a painful way or some more OpenSSL-like disasters, or we have some sort of enlightening realization. But I think more and more companies are going to start dictating that engineering time is spent on maintaining something in the open [inaudible 00:22:29]. Something that the company is built on and requires to continue existing or if you need more developers in the first place because there’s not enough. And we have to start solving the diversity problem to fix that.

Ultimately, some of these diverse people are going to end up having to deal with open source things because I think there’s nothing that companies can do other than subsidize some of the open source development that they need and depend on. Right? So I think these things are ultimately going to meet. And then of course, it’s a problem that we have such a diversity issue in open source right now, but I think that once we start solving the diversity problem more broadly, diversity in open source is going to start being more and more important because there’s hardly any diverse developers in the first place. And currently the experience as an open source developer, especially one who volunteers their time, it’s like a hazing ritual, right? You already have a hard time as a minority, let’s be hazed too.

Adam Gordon B.: Yeah.

Heather Miller: So I think that these things eventually will meet and we will do a better job, but it’s down the road. I think we have to independently first all develop a better cultural understanding of what’s required. People are writing glue code between things that already exist, right? And putting an interface on it and stuff. And following this trend, we’ve got to somehow pay for those guts, right? We have to somehow invest development time and energy into the guts of the things that we’re clicking together. This is a problem that needs to be solved somehow. And I think that the most realistic solution is that companies subsidize this somehow. It’s the cheapest, fastest way. And 20% time working on various open source projects that are important to your product team’s thing.

Adam Gordon B.: I think that’s a good point though. The solution could be corporations paying basically to maintain open source.

Heather Miller: But paying by… I mean, ultimately developing, investing [inaudible 00:24:08] time, which is not what you want to do typically, because everybody is on a tight deadline and everybody is super bad at estimating how long things take and we’re always behind, right? I don’t know, we have to solve that problem and then… I don’t know. There’s a hack week or something. And everybody works on an open source project for one week, every quarter or something.

Adam Gordon B.: There’s also a problem with skills, right? I’m sure that my company used OpenSSL, but I don’t know if I would have been able to help them out with the project.

Heather Miller: For sure. You have to have some crypto person, being super good at those sorts of things. Yeah. That’s true. And not just anybody can contribute to any project. Also the Scala compiler, it’s like I have contributed a few small things to it. I’m like an idiot compared to a lot of the other people that work on the Scala compiler and I have a PhD, right? So I’m sure if I tried really hard, I could get back into it and meaningfully contribute again. Because you can’t just have somebody who’s like, “Okay, Joe, your 20% project now is contribute to the Scala compiler.” Obvious examples are when there’s a project or a piece of software that the company makes heavy use of, and if you’re opening tickets about it, you should invest time into trying to close those tickets. Right? Or like understand the people who are maintaining the things well enough to maybe even…

Help OS Without Code

So this is another thing that’s a common misconception about open source. Not all contributions are code contributions. Something like 70% of people think, “Well, if it’s not a code contribution, it must be documentation.” And that’s also not a super… I mean, documentation is very helpful. I used to be the person in charge of documentation for a while, trust me. [inaudible 00:25:35] documentation. But my point is just helping people filter and curate and figure out what is in a ticket and if it’s really important. I can’t tell you how useful this is.

If somebody shows up and is helping manage the issue tracker or something, and then realizes that there’s a fundamental issue in this one that I did not notice. I just want to buy that person flowers and beer and mail it to them. It’s like, thank you. I can’t express my happiness through the computer, right? Well, thank you so much, right? There’s anxiety when you look at how many new issues have been opened in the last 24 hours, but if there’s somebody there helping you, it’s like you’re less anxious and less stressed out. Right?

Adam Gordon B.: Yeah.

Heather Miller: So it’s like all of these things that you just don’t really see as a person who’s thinking, “How am I going to contribute to open source?” By helping people figure out what the hell is in the issue tracker and what’s important, you’re reducing people’s anxiety in ways that you just cannot appreciate.

Adam Gordon B.: I kind of assume sometimes that the people who created the open source project that I work on, should fix my problem. And that me pointing it out to them is like a gift.

Heather Miller: You’re very helpful, yes.

Adam Gordon B.: Yeah.

Heather Miller: Yeah. But it is helpful. So on the one hand it’s helpful, but on the other, it’s also anxiety and stress inducing. Right? Because it’s like, “Oh, no, more things I have to do.” This is why people make jokes in academia, especially in the programming languages community. The Scala people will show up to a conference and will be like, “Yeah. So we got some user feedback and people said that this was a bad design decision or something.” And then somebody will make some snarky remark in the back and then some professor or grad student or something who’s like, they made like a toy language or something. And they’re like, “That’s why we don’t want users.” Right? Because it’s stressful, right? I mean, it’s wonderful. It’s great. It’s like, look, Scala’s successful. We have lots of users, but also it’s stressful. So there’s some people who like the fact that they don’t have apparently, a handful of people.

How Did Scala Succeed As Open Source

Adam Gordon B.: So how does Scala not become just like angry Martin working on a compiler on the weekend? How did it become a real… How did it get users and become popular?

Heather Miller: I have a tangent that’s cute, but it’s a visual that’s fun. And you can imagine Twitter doing this and other companies. So when penguins decide that they want to go hunting, they all kind of run to the edge of the ice and they’re all pushing each other. And nobody wants to jump in because there could be one of these sea lions or something swimming around that eat penguins, right? And they’ll just eat all the penguins. So they all push up to the edge and they’re all pushing each other. They all want to, but nobody’s doing it. And then eventually, the pressure is so hard, it builds up and one slips and falls in and then everybody stops and they watch. Right? And if the penguin swims around and comes back up and if there’s no blood, it seems to be safe to go fishing.

Then they all jump in and go swimming and they catch fish, right? And then that’s a great experience. That silly analogy. It’s kind of like, I think Twitter was the penguin that fell in, right? And then a whole bunch of other people were like, “Look, Twitter’s fine. Look, things are great. Oh my God. Things are working somehow.” [inaudible 00:28:24], “What’s your secret sauce? Oh, wow. You have this programming language. Okay. Oh, it’s on the JVM.” And then a bunch of startups were like, “I’m going to do this.” And so a bunch of people started using it, I think after they saw that it was working out for Twitter.

Adam Gordon B.: How did the language survive that? Because if I had an open source library, which is obviously different than a language, and a large company started using it, that might scare me.

Heather Miller: Yeah. No, it’s terrible, right? Because you’re all alone. So fortunately, Martin was not all alone. He’s a professor and he has tenure, which means that his paycheck is guaranteed. He can do whatever he wants, the university’s not going to fire him. So he can take a risk and just work on a compiler for some years. And that’s technically okay. And of course, professors will get grants or get various money to do research projects and whatnot. EPFL provides base funding to have a couple of PhD students, which is unique. Most universities don’t do that. So Martin had a base team size of himself and at least like… I don’t know, probably two other grad students or something. So it was him and these other people and worst case scenario, we spend a lot of engineering time as a small group.

I like keeping these things going. I don’t think Martin ever had a plan for Scala taking off. Martin has a super cool history. He’s basically a compiler maker. He has been since he was very young. And he sold a Pascal compiler to a company before he started his PhD. And he was trying to decide whether he was going to go work on compilers in industry or something, or go do a PhD with Niklaus Wirth, who is the Pascal guy, right? He ultimately decided to do a PhD, but he spent his PhD writing compilers. He ended up finishing his PhD and then just writing a lot more compilers. He’s just been writing compilers his whole life. And I mean, he invented a whole bunch of languages. One’s called Pizza, one is called Funnel. They’re precursors to Scala. And he worked on this thing called GJ, which is Generic Java.

Martin Odersky and Building Compilers

He basically wrote that compiler that became the reference compiler for the Java compiler. And they were using this in the Java compiler. The compiler was using generics before people had access to generics in Java, right? But he just keeps making compilers. He has been forever. I don’t think that Martin… every time he picked up a compiler, except for the last version of Dotty, I don’t think he ever had a plan for the compiler taking off and this open source problem and all of this, right? He was going to just do a really nice compiler for a really nice, beautiful, perfect language that he thought was right. And he knew that he could continue working on it because he was a professor and he had tenure and they were going to keep paying him to do this. So worst case scenario, he works on it.

And I think that was his contingency plan, right? Fortunately, there were a bunch of grad students. The grad students were helping with triaging and bug fixing and all of this for some years. I joined the research group when our weekly meetings were really just triaging and assigning bugs to people to fix, right? So the whole research group was… Our research meeting, our group meeting or whatever, was talking about the bugs that we would each fix or trying to determine which bugs are the most important. And we did this for a while and ultimately, it was just too much. It was just some random research group. And he did a sabbatical, but his sabbatical was creating this company called Scala Solutions at the Startup Park next to EPFL. It’s like this little startup building, you can have an office, right? It’s affiliated with the university, so he made a startup over there.

It was supposed to be just like a Scala consultancy that did training and stuff. And apparently, Jonas Bonér had a company in Sweden that had almost the same name. It was Scalable Solutions or something and Martin’s was Scala Solutions. It was the same name, almost. And then they decided that they were going to get together and get some venture capital and found Typesafe, which became Lightbend. And then the idea at the time was that, well, the company will do a lot of the engineering because it doesn’t make sense… I can’t do it all by myself, I’m a professor. I have grad students that are transient and they can do a bunch of engineering, but they can’t do only engineering because they’re supposed to do PhDs.

So the idea was like, well, the company can deal with it, right? And the company is VC funded. It took care of the language for a while. It still pays a number of people to work on the compiler and the build tools. But the company largely does not do open source Scala, making sure that the language is fine. They pay three people who work on the compiler and two people to do the build tool. Aside from that, they don’t have a lot of engineers maintaining Scala. This is when the Scala Center appeared again. The Scala Center was like, okay, well, let’s add some more engineers to the pot to try and keep Scala alive because it’s still not a solved problem. I think the Scala Center helps, right? And obviously, Lightbend paying engineers to maintain the production compiler also helps, right?

But there’s not like a silver bullet great solution or anything. It’s like, this is the way it’s done. It’s perfect. It’s still like every year it’s a problem. We’re like, “Okay, I hope this keeps working the way that it’s working.” Because how do you maintain an open source compiler? Think about the open source compilers that you know. Who maintains them? It’s usually a company that stands to benefit somehow from you using their compiler and they can invest engineering resources on it. It’s really hard to have a compiler because a compiler is a multi-year project and you have to get a group of people working on it. And these are some special people that are really hard to hire, right? Because there’s not a lot of people who are just A, into it and B, good at it and love it. Right?

Rust and Open Source

Adam Gordon B.: Yeah.

Heather Miller: And so it’s super expensive and it’s a long-term thing. And so even Rust has had a few scares here and there. I mean, with… maybe Mozilla will cut us. Oh my God.

Adam Gordon B.: Yeah.

Heather Miller: I mean, they blew up in community land and everybody loves them and everybody’s starting to build stuff in Rust and that fear is gone now. But Mozilla is not going to really increase its investment in Rust or anything. We should all be thankful as a group of humans that use programming languages on the internet that Mozilla continues to pay the handful of core team members that it does. It’s hoping that other companies pick up some of the core developers. Even bigger companies like Mozilla don’t have it perfectly worked out. It’s better probably than in Scala’s case because it’s maybe a bigger company or something, but it’s not worked out anywhere. Open source compilers are hard to maintain because they’re fundamentally not profitable. You’re not going to get explosive profit from working on it.

Adam Gordon B.: It’s kind of crazy. It’s like the most base of this infrastructure that you’re talking about.

Heather Miller: I know.

Adam Gordon B.: Right? It’s like-

Heather Miller: And for everybody it’s uncertain. I mean, hanging out with the Mozilla guys four years ago or something, there was some fear that maybe this project is taking too long and it’s too… whatever. There was legitimate concern that Rust is not going to exist at several points because Mozilla didn’t want to keep paying for it.

Adam Gordon B.: Yeah. Clearly, there’s some sort of blockchain solution where your compiler [inaudible 00:34:50]… I’m just joking.

Heather Miller: I’m so happy that… Actually, that’s right. Blockchain is really the answer for everything. It’s true. I should have thought of this.

Adam Gordon B.: Every time you compile, it gives money to authors or something. I’m not sure.

Heather Miller: Exactly. Yeah. And the more you compile, clearly the more productive you are. Or at least the more reliant on the type system you are, which means we have to pay the engineers to develop the type system. Yeah.

Adam Gordon B.: There we go.

Heather Miller: That incentivizes people to not use type systems then. You’re shooting yourself in the foot, you’re into type systems. This is not what we want. We have to roll back.

Adam Gordon B.: Also, just disincentivizes compiling. They’ll be like, “Cool, let’s not even try to compile it until we want to push it to prod.”

Heather Miller: Yeah. Just let’s interpret everything in prod.

Adam Gordon B.: Yeah.

Heather Miller: Anything, we can just receive any code and just run it. Who cares? Seems totally safe.

Work at CMU

Adam Gordon B.: So how does all this relate to your current role?

Heather Miller: Yeah. I’m an assistant professor at CMU. And CMU is an interesting place. So it’s a really huge school, and they have done hugely foundational things in programming languages and robotics and systems and file systems. And so, I came in to CMU with the idea that we would try to apply techniques for programming languages to try to make building distributed systems a little bit easier. And when you’re a professor, it’s really a negotiation. You don’t show up like a führer, being like, “Now, you will work on this and you will work on that.” Right? It’s like it’s a negotiation. You’re collaborating with a PhD student, usually. I mean, we usually have a couple of them and that means that there are a couple of projects underway. And they’re usually kind of matched up with the interest of the PhD students.

So I’m working with Chris Meiklejohn right now, and we’re doing stuff a little bit more related to distributed runtimes, which is part of programming languages world. It’s the runtime piece. And it sort of solved a bunch of problems with actor runtimes that people didn’t realize were really a problem, which actually artificially limited how scalable they could be and how fast they could be. And it’s one of those things where in retrospect, it seems obvious. The changes that we’ve proposed basically, we observed… Again, this sounds really obvious in retrospect, but implementations of the actor model or actor runtimes would just assume full mesh connectivity for distributed actors. So if you had one actor, the runtime assumes that you should have knowledge of all of the other actors that exist in the system at any given moment, right?
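A rough way to see why assuming full mesh connectivity limits scale is to count links. The sketch below is only an illustration, not the project's code: it treats each participant as a node, and the two function names are made up for this example. It compares a full mesh with a hypothetical layout in which every node talks only to one hub.

```python
def full_mesh_links(n: int) -> int:
    """Links needed when every node connects directly to every other node."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links needed when every node connects only to a single hub.

    The hub is counted as one of the n nodes (an assumption of this sketch).
    """
    return max(n - 1, 0)


# Full mesh grows quadratically with n; the hub layout grows linearly.
for n in (10, 100, 1000):
    print(n, full_mesh_links(n), hub_links(n))
```

With 1,000 nodes the full mesh needs 499,500 links while the hub layout needs 999, which is the kind of gap that makes the choice of topology worth exposing.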

And so that artificially limits sort of what you can do with actors. And that also sort of gives the hardcore systems people who build everything in just [inaudible 00:37:22]. They have some credence when they say actors are slow. You can’t do things with actors because there are all of these artificial bottlenecks that are not really necessary. You can still support the actor model while having a more scalable way to interact with other actors. So Chris had this wonderful idea, which was basically, “Hey, why don’t we at runtime let the people who are going to just run some actor program say how connected you want all these actors to be?” Maybe it should be like a client server situation or maybe it should be peer to peer. And then we have these other projects where we’re trying to… but with this kind of fundamental question about programming languages research and what is the interface to the programming language?

Everybody thinks it’s the compiler, but for the last year or so, we’re like, but it could also be CI, right? Because we submit things to CI anyway, and we want it to pass before we push into production. So it’s basically type checking between services.

Adam Gordon B.: That’s cool.

Heather Miller: So if a service is a method, it’s got a signature, right? And I can figure out what the signature is for your service, and I type check it against the other services that I know are currently in production. So I can at least tell you, don’t deploy this because you’re going to break these other people’s thing because they assumed that that number is always going to be zero or greater, right? And then you suddenly start passing a negative number, now they can’t handle it. So their stuff’s going to break and PagerDuty is going to wake somebody up, right?

Adam Gordon B.: Yeah.

Heather Miller: So don’t deploy it, right?
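The idea of checking one service against the assumptions of the others can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the research prototype: the `IntRange` type, the `breaking_consumers` function, and the service names `billing` and `reports` are all hypothetical. Each consumer records the range of values it assumes it will receive, and a CI step flags any consumer whose assumption the new producer would violate.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IntRange:
    """Inclusive range of integers a service produces or is prepared to accept."""
    lo: int
    hi: int

    def contains(self, other: "IntRange") -> bool:
        # True when every value in `other` also lies inside this range.
        return self.lo <= other.lo and other.hi <= self.hi


def breaking_consumers(produced: IntRange, consumers: Dict[str, IntRange]) -> List[str]:
    """Names of consumers whose accepted range does not cover what is produced."""
    return sorted(
        name for name, accepted in consumers.items()
        if not accepted.contains(produced)
    )


# Hypothetical services already in production and what they assume.
consumers = {
    "billing": IntRange(0, 10**9),        # assumes zero or greater
    "reports": IntRange(-10**9, 10**9),   # tolerates negative numbers
}

old_version = IntRange(0, 100)
new_version = IntRange(-5, 100)  # suddenly starts passing negative numbers

print(breaking_consumers(old_version, consumers))  # nothing to flag
print(breaking_consumers(new_version, consumers))  # flags "billing"
```

A CI job built along these lines would fail the build whenever the returned list is non-empty, which is the "don't deploy it" signal described above.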

Adam Gordon B.: Are you familiar with GraphQL at all?

Heather Miller: Yeah. I’m familiar with it, but am I an expert? No. But go ahead, tell me and I’ll let you know if I get lost.

Adam Gordon B.: So because it has a schema, you can validate whether the schema has changed in the build process, right?

Heather Miller: Yeah. But this is also kind of the same thing with Protobuf and all of these other things, right?

Adam Gordon B.: Yeah.

Heather Miller: So it’s already not so bad because you have teams kind of agreeing on schemas. And then schema evolution is something that’s pretty well understood in systems land, right? We can never remove a field, we can only add fields, for example, or [inaudible 00:39:13] them or something, right? So this is already very codified. We have standards around this, but our programming languages and compilers don’t know what schema evolution is, right?
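The rule mentioned here, that fields may be added but never removed, is simple enough to check mechanically. The sketch below is an assumption-laden illustration and not how Protobuf or GraphQL tooling actually works: schemas are modeled as plain dictionaries from field name to type name, and the function names are invented for this example.

```python
from typing import Dict, List

Schema = Dict[str, str]  # field name -> type name (a simplification)


def removed_fields(old: Schema, new: Schema) -> List[str]:
    """Fields present in the old schema but missing from the new one."""
    return sorted(set(old) - set(new))


def added_fields(old: Schema, new: Schema) -> List[str]:
    """Fields that appear only in the new schema."""
    return sorted(set(new) - set(old))


def is_allowed_evolution(old: Schema, new: Schema) -> bool:
    """Allow a change only if no existing field was removed."""
    return not removed_fields(old, new)


v1 = {"id": "int", "name": "string"}
v2 = {"id": "int", "name": "string", "email": "string"}  # adds a field
v3 = {"id": "int", "email": "string"}                    # drops "name"

print(is_allowed_evolution(v1, v2), added_fields(v1, v2))
print(is_allowed_evolution(v2, v3), removed_fields(v2, v3))
```

Going from `v1` to `v2` only adds `email`, so it passes; going from `v2` to `v3` drops `name`, so it is rejected.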

Adam Gordon B.: Yeah.

Heather Miller: And it’s like all kinds of reasoning that maybe we don’t have to do ourselves and can offload to something else to think about for us.

Adam Gordon B.: That’s very cool.

Heather Miller: It’s like a dumb question, which is basically, why does the programming languages stuff have to be limited to a compiler? Why can’t CI be a compiler? I mean, that’s sort of a funny way to say it, but in some sense it’s kind of semi accurate for what we’re doing.

Adam Gordon B.: Yeah. That’s a neat project. I thought you were going to say you were going to build your own language and repeat this whole structure that Martin does.

Heather Miller: I am doing it, though in a tricky way. Right? I’m just telling everybody that the CI is a compiler.

Adam Gordon B.: Yeah. If people start using it then-

Heather Miller: Yeah. Well, I hope that there will be like a startup or something where that I get tenure. So I get back on it. We have a better idea nowadays of how to support things that are useful. At least I see companies a little bit more willing to help maintain stuff, right? So all of my pessimism about open source in general, I don’t think that that should scare people away from taking risks and starting new interesting open source things. I think that the current situation that we are in together as a community of people who develop software, is one that is ultimately going to be solved. But I think that we just have to change… We have to realize some things before we can solve it, right? And I think that realizing that we all have to be a little bit more responsible than we have been in the last couple of years for the software that we’re using, is a thing that we will slowly realize in the next four or five years, I promise. Like, otherwise we’re going to have more Equifax or OpenSSL or all kinds of things, right?

Adam Gordon B.: Nice. I think that’s a great place to end it. You kind of wrapped it all together.

Heather Miller: How did I?

Adam Gordon B.: Thank you so much-

Heather Miller: Thank you.

Adam Gordon B.: … for your time.

Heather Miller: Yeah. And thanks for thinking to reach out to me.

Adam Gordon B.: That was the show. I hope you liked it. Thank you for listening to the CoRecursive Podcast. I’m Adam Gordon Bell, your host. If you like the show, yeah, tell a friend, join the Slack channel. Until next time. Thank you for listening.

Support CoRecursive

I make CoRecursive because I love it when someone shares the details behind some project, some bug, or some incident with me.

No other podcast was telling stories quite like I wanted to hear.

Right now this is all done by just me and I love doing it, but it's also exhausting.

Recommending the show to others and contributing to this Patreon are the biggest things you can do to help out.

Whatever you can do to help, I truly appreciate it!

Thanks! Adam Gordon Bell
