Adam: Hello, and welcome to CoRecursive, the stories behind the code. I’m Adam Gordon Bell, the host. Welcome to the year end episode. Today is all bonus questions. Oftentimes, I have questions I want to ask guests, but they don’t quite fit the overall theme of the episode. So today, we’re going to do a whole episode of those extra questions. I have questions for Brian Kernighan, the creator of AWK among many other things, I have questions for Sean Allen, who works at Microsoft Research, and a couple of other people. We’re going to start with Jim Blandy. Jim Blandy was the guest on Episode 54. One thing you might not know about Jim is he’s been working on open-source software for his entire career. I think that’s pretty cool. It means that all the code that he’s written is out there somewhere in the world. It’s visible. It’s not locked up in proprietary software. It’s like Jim’s personal career portfolio. So I asked Jim what happened, and it turned out he started working on the Emacs project as soon as he finished university. The story behind it is super interesting.
Jim Blandy and Emacs
Jim: So basically what happened is I had been working for project GNU on Emacs as the Emacs maintainer for several years. And then Richard Stallman was my boss and I was working remotely from Ohio. And Stallman had been really very helpful to me. I had some difficulties learning to work remotely. I was just out of college. And in his own difficult way, he was actually very supportive and helped me get into the pattern. But after I’d done that for two years, I quit. And it was funny, I had just finished a project, doing a survey of the copyrights on all of the Emacs Lisp modules that are distributed with Emacs and making sure that everything was on the up and up. And so I called him up one night and said, “Okay, Richard, I finished the copyright survey,” and he says, “Oh good.” And I said, “And I’m going to quit,” and he said, “Oh no,” and then he hangs up the phone on me. And that was our last conversation as an employee.
Adam: Jim quit the Emacs job so that he could do a bit of world traveling. But when he got back, he had to find a new job.
Jim: My friend, Karl Fogel, had gotten a job working at the University of Illinois Urbana-Champaign on a gene editing mode for Emacs. And this is in Emacs: you visit a file that’s full of genetic data, and Emacs will bring it up and it’ll add colors to it. And you can select rectangles and stuff like that. And so basically, the lab that my friend was working for, their big triumph was that they were able to build trees that had hundreds of organisms in them, right? And nobody else had ever been able… People had said, “Oh yeah, that technique is too difficult. You’d never do it.” Basically, they took every kind of organism you could find. And it turns out that since all life, all cellular life, synthesizes proteins from their genes, right? You have a genetic sequence that describes a protein, you’ve got to make the protein. That’s done by an enzyme called a ribosome. And everybody’s got ribosomes. Bacteria have ribosomes, humans have ribosomes, oak trees have ribosomes, weird things that you find living in hot springs and at the bottom of the ocean, next to volcanic vents, everybody’s got ribosomes.
And so if you use the ribosome sequence as your point of comparison, you can actually compare all of life, any kind of life, anywhere you find it. Everybody’s got ribosomes, except viruses. Viruses use somebody else’s ribosomes. But they are doing the same analysis on COVID strains actually. When they talk about the European strain and the Asian strain, or they talked about how they figured out everybody in the U.S. got infected from somebody in New York or from people from New York, that was using the same analysis. It’s differential analysis of the sequences. So yeah. And so basically what they did, their output was that they put it on a T-shirt. It was this great thing. It was this circular plot of basically every kind of life that there was, right? And it really showed the literal distances between these different kinds of organisms. And it was really funny because it had this whole big… There was a hundred organisms arranged around the circumference of this circle with the common ancestor at the center, and then things branch out from the center out towards the edge of the circle.
And basically, almost all of these organisms were bacteria or single-celled organisms. And then off in one little corner of the circle, there was human beings, corn, oak trees and yeast. They’re all so close. They’re so similar. If you would see them walking down the street, you can’t tell the difference between one and the other. They’re basically all the same, right? Human beings, corn, oak trees and yeast, identical. No distinction between them at all. So the project really gave you a sense of perspective. It’s like it’s not really about us. We like us, and we should take good care of us, but it’s not about us.
Adam: It’s not about us. Interesting thought. I’m not really sure what the takeaways are from Jim’s story, but now I know there really is an Emacs mode for everything, including gene editing.
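For readers curious what "differential analysis of the sequences" might look like in its simplest form, here is a minimal sketch. It is not the lab's method: it assumes the sequences are already aligned to the same length, it uses the plain fraction of differing positions as the distance, and the three sequences and the names organism_a, organism_b and organism_c are invented toy data. Real tree-building adds alignment and evolutionary models on top of distances like these.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences / len(a)


def pairwise_distances(seqs: dict) -> dict:
    """Distance for every unordered pair of named sequences."""
    return {
        (m, n): p_distance(seqs[m], seqs[n])
        for m, n in combinations(sorted(seqs), 2)
    }


# Invented toy data, not real ribosomal sequences.
toy_sequences = {
    "organism_a": "ACGTACGTAC",
    "organism_b": "ACGTACGTAT",
    "organism_c": "TCGAACGGAT",
}

if __name__ == "__main__":
    for pair, distance in pairwise_distances(toy_sequences).items():
        print(pair, distance)
```

Pairs with small distances end up close together on a tree, which is how humans, corn, oak trees and yeast can all land in one corner of the plot.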
Brian Kernighan and Unix
In Episode 58, I interviewed Brian Kernighan who coined the term Unix and is the K in K&R, the famous C book. In the interview, we focused a lot on how Unix was created in the early days of Bell Labs. But in this clip, I asked him about the success of Unix. Why did it end up becoming so popular?
Brian: Some of it is right place at the right time. Unix came along at a time when computers started to become affordable, not for individuals yet, but for smallish groups. So if you and half a dozen people like you working at a company, let’s say, or maybe a handful of people at a university department or something, they could afford a computer. We talked earlier about 50,000 bucks. So call it something like that. That’s not pocket change for either of us individually, but we get a small group of us together, we can actually do that. And so the hardware itself was feasible, manageable for smallish groups, and small groups have more control over their destiny than perhaps a giant group. And so in that sense, useful. And then the other thing is that the service that Unix provided, the computing environment it provided at the time was just so much better than what you would get from the operating systems and software provided by computer manufacturers, because that was at a time when manufacturers, like IBM or DEC or HP or whatever, made their own operating systems, that was part of the service.
They built a computer, but they also provided an operating system. And the operating system probably was even free as part of the purchase price, I don’t know. And those operating systems tended to be, roughly speaking, pretty awful. They weren’t very nice to use. And you could imagine why, because it wasn’t the fundamental focus of the company that was making the computer. The focus was the hardware, and the software was just something that you had to provide. And so Unix provided quite a decent alternative to that, in fact, a very much more pleasant alternative for lots of people. And so as a result, people found Unix appealing and they could get things done. And so there was this wave of people following Unix onto more and more powerful machines. So the [inaudible] was powerful for its time, but limited. But then the next generation, the VAX, went from a 16-bit architecture to a 32-bit architecture, and that opened up a lot more possibilities. And so that was the machine for a long time.
I think another thing that worked was the portability of the operating system and supporting code, which was written in a high-level language, so that it meant that you could write something once and it would run on lots of different things. And so Unix just as a software system made it possible for smallish companies, like Sun Microsystems in particular, to design hardware. And then they didn’t have to write software, they could just use Unix. And so that meant that you had hardware companies like Sun and Apollo, and scattered others, who were making workstation type computers and using Unix as the standard operating system. And so roughly speaking, you didn’t have to do anything. If you had a bright idea for a new hardware system, you could get it off the ground, provide Unix. You might have to write some kind of compiler, but that was fundamentally it and you were up and running. And so all of these things helped spread the system. And obviously, this is the system itself, but combined with the hardware getting better, our understanding of software getting better, our ability to port things from one place to another getting better.
And so all of these compound, if you like, to make it easy for this particular system to spread. And another thing that happened probably in the, call it the ’80s, was the development of networking and particularly the internet, which although the internet wasn’t widespread or commercial or anything, it was absolutely there for universities and industrial research operations like Bell Labs, IBM, Xerox PARC, so on. And so the networking, again, made it possible for people to do interesting things and the networking fairly quickly converged on Unix. It was just the easiest way to do networking as well. And then I guess-
Adam: Is this because Bell was just very permissive in terms of licensing it out to these universities and stuff?
Brian: In a weird way, yes. Part of the deal with AT&T being this regulated public monopoly was that they weren’t allowed to make money off stuff that wasn’t their fundamental telephone business.
Brian: Because that would be cross-subsidizing and that would be bad. So what they did, and I don’t know whether this was conscious or just nobody was awake, but what they did was to license Unix for academic use for practically free. It was just a nuisance fee and they licensed it for commercial use for not too expensive, $20,000 or $30,000 kind of thing, because they couldn’t sell it at a profit. So they had to basically just say, “Okay, here it is,” nuisance fee, media fee, something like that. And the license for universities and commercial was very permissive. You got the source code.
Brian: So you could do anything you want with it. The only thing you couldn’t do was of course distribute that code to anybody else, unless they were a Unix licensee. And so if I had a Unix license and you had one, I could show you all the stuff I did and vice versa. And so we would have in effect swap meets that very quickly became Unix user groups. And so the community spread like that very quickly. And I think AT&T didn’t even know what to make of this, and they tried sporadically to make money from this with not a huge amount of success, although some. But the licensing was also… I don’t know what the right word is, but there were issues. And what happened particularly at Berkeley is that they decided they would start rewriting the AT&T code so it wasn’t AT&T code anymore. And that way, they would be able to distribute it on their terms, which was fundamentally academic, free, here you are.
And there were certainly legal fights that went on into the early ’90s about what are the properties of code that was rewritten at Berkeley? And does it violate AT&T’s intellectual property rights and license terms and all the rest of this stuff? And it was probably helped along by things like Linux. I mean it was 1991 that Torvalds created his version of a lookalike. And that was based metaphorically, not literally, I think on MINIX, which Andy Tanenbaum had done at the Vrije University in Amsterdam. And obviously at this point, when you say Unix, for many, many things, it actually is Linux.
Adam: I feel like there could be a whole episode, or many episodes, about the spread of Unix and the eventual rise of Linux. Super interesting topic.
The Importance of Communicating
Another thing that came up when talking to Brian was communicating and writing. In this clip from our first interview, I asked Brian if writing and documenting and communicating ideas was part of the reason why he had such a large impact.
Brian: I think that’s actually highly relevant in some sense. I mean as a programmer, I’m at best average, I think, certainly not even remotely in the same league as somebody like Ken or lots of others who I’ve known. But I probably write better than average. And so in a sense, that’s an ecological niche for me that I can do something. And again, it’s an example of, “Gee, I find this thing and it’s not very well explained or there’s no explanation of it at all,” or, “How the heck do you do this?” And so let me try to figure it out well enough that I understand it myself. And then in some way, write that down so that other people can understand it better as well, an impedance match between the people who wrote the stuff and they’re not interested in describing it very much, and the people who have to use it or want to be able to use it more effectively.
And so I think writing in that sense has for me, personally, been a good thing and a chance to do something where I can compete in a way that I couldn’t compete if it was just writing raw code or something like that. The other thing, and I think this is one of the things that probably Hamming said as well, when you do stuff, it’s important to be able to talk about it to people who are not experts in your field. Can you explain what it is you’re doing or what’s going on in your field or what’s important? Or all these kinds of things. Can you explain that to people who are not specialists in it in a way that they get the essence of the idea and as a result are better educated or informed or whatever? And so I think that’s a useful thing. And I think everybody should be able to do that. Even if you’re a programmer deep in the internals of something or other, you should be able to explain to someone else what it is you’re doing, where it fits in, why it’s important.
And it’s helpful to be able to both write that way and also talk that way, be able to stand up in front of an audience, literally or metaphorically, and explain it as well. And I think that’s another example of a place where if you do more of it, you get better at it, perhaps. So there’s a compounding effect.
Adam: Writing and communicating is important, and that’s something I’ve been thinking about a lot since talking to Brian. I mean he was impactful both as a developer and as a writer, and I’m not sure how you could measure one against the other. But a lot of people I looked up to over the years, they were actually people I just knew primarily through their writing or through their speaking. If the guests on this podcast had never written things or given conference talks, I would never have known they existed. So I would encourage you in the coming year to do some communicating, to do some writing about technology. I mean don’t feel obligated to, but if you have written something or created a YouTube video or whatever, share it on Twitter, join our Slack channel and share it there. Maybe we can help spread the word. And also, I just love to see what you’re making.
Sean Allen and Open Source Supply Chain Attacks
Sean: Almost every program is probably pretty awful on the security front right now.
Adam: Oh, yeah.
Adam: Mm-hmm (affirmative).
Sean: Effectively, you’re granting complete trust to this code, right?
Adam: But what could solve it?
Sean: Certainly, you probably need a type system of some kind that will allow you to represent, within the type system, what activities something is allowed to do and what it’s not allowed to do, right?
Sean: In object-capability languages, and Pony has that, and it’s what [inaudible] did, the idea is you provide an unforgeable token, right? So here’s a token that allows you to listen on a TCP socket, right?
Adam: Mm-hmm (affirmative).
Sean: And any module that you use cannot open a TCP socket without having this authorization. This is how object-capability solves it. The issue that comes from that then is people come in and they’re used to working in languages that give them ambient authority, right? And what this leads to then in general is you have a token at the top level of your program. The top level of your program usually has ambient authority. It can do anything, and it can segment that off so that it’s like, “Oh, okay. I can create the ability to open a TCP socket because I have ambient authority. And I’m only going to give you the ability to open a TCP socket.” And then you can hand that off to anything else as well, right? You’re able to do that. But this means that for example, at the call site, I have to explicitly pass this thing along.
Sean: And now, if I want to allow you to do other things in the future, this is dependency injection, not-
Adam: Dependency injection, yeah.
Sean: Personally, I love open-source software for whatever that means and everything. I’m just definitely worried about these ecosystems we’re building up that are so vibrant, and yet we can’t trust them at all.
Adam: It’s an interesting observation, right? Why do I have to give a function call in a library I pulled from the internet full authority to do everything that I can do in my code? Stated that way, it does seem like an area that needs improvement.
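The token-passing idea Sean describes can be sketched roughly as follows. This is an illustration only, not Pony's API: the names AmbientAuth, TCPListenAuth, listen and third_party_library are all hypothetical, and Python cannot actually make a token unforgeable the way a capability-secure language can. The runtime checks here stand in for what a type system would enforce at compile time, and listen returns a string instead of opening a real socket.

```python
class AmbientAuth:
    """Hypothetical root capability, created only at the top level of the program."""


class TCPListenAuth:
    """Hypothetical narrower capability: permission to listen on TCP and nothing else."""

    def __init__(self, parent: AmbientAuth):
        # A narrower token can only be derived from the root token.
        if not isinstance(parent, AmbientAuth):
            raise PermissionError("TCPListenAuth must be derived from AmbientAuth")


def listen(auth, port: int) -> str:
    """Stand-in for opening a listening socket; refuses to run without the token."""
    if not isinstance(auth, TCPListenAuth):
        raise PermissionError("listening requires a TCPListenAuth token")
    return f"listening on port {port}"


def third_party_library(port: int, auth=None) -> str:
    """Pretend dependency: it can only listen if the caller hands it a token."""
    return listen(auth, port)


def main() -> str:
    root = AmbientAuth()            # the top level holds ambient authority
    tcp_only = TCPListenAuth(root)  # segment off just the TCP-listen ability
    return third_party_library(8080, tcp_only)  # passed explicitly at the call site


if __name__ == "__main__":
    print(main())
```

Calling third_party_library(8080) without a token raises PermissionError, which is the point: the library gets only the authority the call site chooses to pass along, at the cost of threading that token through every call.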
Going to Grad School with Krystal Maughan
A question that comes up in the Slack channel from time to time concerns doing computer science research or doing graduate degrees. I suppose that shouldn’t be surprising considering that we sometimes drift into more academic computer science topics. In Episode 52, I interviewed Krystal Maughan, who’s a PhD candidate. I was going to ask you about grad school, who should consider going?
Krystal: I think if you’re one of those persons who is looking for mentorship in a real way, and you found that maybe software engineering, you just can’t find that. You’re doing the thing and you feel like, “Yeah, I can create a database. I can do this.” But I need to know this, but everybody’s just telling me that it doesn’t matter. A lot of software is just very like, “Oh, just get it done.” And I really struggle with this because I want to understand a lot of things bit by bit. So if you’re one of those persons who needs to take things apart and really understand them, I think you’d be a really good fit. I also think that if you’re a bit of a troublemaker, you’d be a really good fit for grad school. William Byrd, he’s a professor at the University of Alabama. But he told me that people who are the troublemakers in undergrad and people say, “Oh, they need to get their act together,” he said something happens in grad school and those people shine because they have to know but why? They question things.
So if you’re one of those persons, grad school might also be a really good fit. And if you’re one of those persons who is fearless about failing especially, I would say that you’d probably be a good fit for grad school because you have to be okay with just failing, and being okay with the fact that through failing and being broken down in that process of failing and being built back up, you become a better person from it. Grad school’s also a really great place for you to say, “You know what? I’m going to find a professor in the music department and write a paper with them,” or collaborate with people from all different departments. I’m doing research with somebody in the econ department of my school. I met them in a machine learning class. So we’re building a machine learning model that’s related to an economics concept.
I absolutely love that I can do that because it’s also made me better in terms of understanding that problems are not just computer science problems, they’re tied to other things. One of the examples is bias. Bias is not just, “Oh, it’s just data and it’s a model,” and whatever, it’s often tied to other socioeconomic concepts or issues in the larger world. And there’s also balancing that with the public’s or human beings’ skepticism about technology. And you see this in all areas of AI right now, like the autonomous cars and its use of algorithmic bias.
Adam: What’s algorithmic bias?
Krystal: Oh, okay, cool. Yes. So algorithmic bias is a concept where we think of an algorithm as fair and we think that there’s no bias in it. And there was a lady, she’s at MIT Media Lab I believe, who found that when she was doing research, she’s doing robotics research, that the robot would not see her face because she’s black. And I’ve experienced this too where I would go up to a sensor in a restroom, the towel dispenser, and it won’t see my hand because it just was not trained on data that looked like mine. And so there’s a huge issue right now where a lot of these machine learning models are just producing horrible unintended results. Another example is, I believe it was Amazon that found that their hiring algorithm, their hiring AI, was unintentionally rejecting females. So if you had anything in your resume that was woman’s chess or women in STEM, it would reject you because the algorithm had figured or predicted that you would make a worse hire if you were female based on their details.
Adam: Those two issues are different, but the same. If your machine learning is trying to build a function that estimates somebody, and they have some bias, then you’re just going to reproduce that, right?
Krystal: That’s true. Yeah. And I think historically too, there’s just not… If you have a limited data set or a sparse data set, and you’re training based on… So you’re already starting with a skewed or unbalanced data set, then of course your model is going to have some degree of bias. And so that’s pretty terrible if you’re going to use those kinds of systems to do things like predict whether people are criminals or whether you should let them drive a car or not.
Adam: Yeah. Do you think that there needs to be more diversity in the AI field? Or is the issue more related to data than the people?
Krystal: I think it’s both. I think that perspectives are important, especially in research. And actually this morning, I was involved in a group that is doing long-term research to this capacity. It’s a group based on doing machine learning for social good. And my particular group is focused on education. And so here’s a good example. I feel like there’s so many, especially in the high school community, programmers who would make amazing researchers. But researchers, I think it’s a bit unfortunate, and my advisors, they’re aware of this, you’re at a huge disadvantage if you just didn’t happen to go to a school with a strong research program. If you’re pipelining these people who already go to strong research schools to get into the best schools, or to get into any PhD program at all, then you are limiting the pool of people who could be researchers.
So this is a good example, too. One of the first hackerspaces I ever went to, and this is not a makerspace, it’s a hackerspace, it’s a grungy hackerspace and I love it to death, but they just didn’t have a female restroom because… And their female restroom had chemicals in it. They’re just putting a bunch of chemicals because it was just dudes. And I mean I love them to death and I spent years there, but there’s an elderly woman and her son who’d show up. The first time, she said, “Where do I use the restroom? There’s no female restroom,” was the first time that I saw the group of them just scramble to try to accommodate her. And I just thought, “Well this is interesting.” Maybe if they had thought about this, then so many things would be different. I think one of the first conferences, PLDI, I went to PLDI one year, which is Programming Language Design and Implementation, as a volunteer.
And I remember one of the other volunteers next to me, she was in [inaudible] already, and she was shaking her head because she said that… I think it was one or three out of 300 people who were speakers at the conference were women. And it’s like you can’t tell me that they couldn’t find… I’m not overly political about that stuff, but it’s a little bit scary, because it can go down a really bad road if you’re just not considering other points of view. So yeah. If you’re one of those people, then yeah. Come to grad school.
Adam: So who shouldn’t go into grad school?
Krystal: I have personal opinions. I think first of all, if you just want to make a lot of money, there’s no guarantee that when you come out of grad school… You certainly won’t make up for the time in terms of money that you missed. Another person I would say is the, “I am very smart,” type person, because you feel so stupid. Every single day, you’re starting from ground zero and just building yourself up, building your workup. And then when you get to the point where you finally understand something, you move on to different projects. And then you’re like, “Oh, great. I guess I feel stupid again.” So a lot of it has been like that for me, especially working in an interdisciplinary team.
Adam: Mm-hmm (affirmative).
Krystal: Depending on the project, you have people with different strengths. And before I went to grad school too, I went to Google I/O, which is Google’s big conference, and Jeff Dean was there. And so people are asking different questions. And I asked him if he had any advice, and I asked him specifically about finding mentorship outside of grad school. And he said that if you surround yourself with people who know more than you, or people who are from different fields, that you’ll always be learning. And I really like that. I like the fact that I feel like you should struggle intellectually because that’s how you grow as a person. And I think especially in software engineering, it can be easy to get pretty comfortable with just being like, “Oh, I’m the whatever person. And I know this library really well. I know this stuff really well.” And you can legitimately, until you got laid off, you could get very comfortable just doing the same stuff for 10 years of software engineering. And for grad school, certainly, you just do not do that.
You’re just struggling all the time and then just getting afloat, and then publishing and then struggling again, and then just getting afloat. And so you’re doing that over and over until you become comfortable with that whole idea of struggling and gaining the information and figuring things out. So if you’re one of those persons who finds comfort in just always being the smartest person and being super proud of that, and condescending and smug, then… Not to say that there aren’t those people in grad school, because there are. But if you’re one of those persons, it’s probably not going to go as well as you think it is. I think that the last person’s the checking box person. If you’re one of those persons who checks boxes, grad school can take longer than you think. So you might think that, “Oh, I’m just going to get my PhD and leave.”
And I have a cousin, I think she’s been doing her PhD for 10 years now. So you never know how it’s going to go. Or you might just end up not making it all the way through, is it 50% of people drop out? So you have to be flexible in that way of just I’m here for the journey and however it works out, it works out, I’m going to give it my best shot. And if you’re not one of those people, then it’s going to be rough.
Adam: If you’re thinking about going to grad school, I recommend just talking directly to Krystal. You can find her on the Slack channel for the podcast. If you go to corecursive.com, there is a link for Slack, or you can hit her up on Twitter or email her. She has lots of helpful advice.
Happy New Year
So that was the episode. That was also 2020.
I’d like to thank my feedback advisors, Jeremy Jung, John Walker, Bob Therriault, Brandon Brown and Robert Mason. If the episodes this year have gotten any better, then a lot of the credit goes to them. And if they’ve gotten worse, then that’s my fault. Also, thanks to Courtney, my wife, who puts up with me spending far too much time on podcasting.
And to end, here is some Jim Blandy archival footage. Ben Collins-Sussman dug this up. I guess the pronunciation of GIF versus GIF made people nervous that Subversion would be pronounced in a strange way. So they decided to make an audio file of the canonical pronunciation, and I’m going to end with that. Until next time, thank you so much for listening.
Jim: Hi. I’m Jim Blandy, and I pronounce subversion, “Subversion,” and I don’t know whether it has a capital V or not, but I’ve never written it that way. And I don’t believe in forcing one person’s interpretation of the world over another. I believe that everybody should have a chance to define their own reality.
Speaker: No you don’t.
Jim: But I pronounce subversion, “Subversion.” Hi. I’m Jim Blandy, and I don’t believe in the supremacy of Western culture over other cultures, but I pronounce subversion, “Subversion.” Hi. I’m Jim Blandy, and I know we’re all cultural relativists here, but I pronounce subversion, “Subversion.” Hi. I’m Jim Blandy, and one of the things I like about free software is that anybody can do whatever they damn well please, but I pronounce subversion, “Subversion.”
Speaker: Okay. One last one and then we’ll shut off.
Jim: Hi. I’m Jim Blandy, and I don’t really understand why anybody should care how I pronounce it, but I pronounce subversion, “Subversion.” [inaudible]