The Science of Learning to Code: Debunking Myths and Exploring the Science

Adam: Hi. This is CoRecursive and I’m Adam Gordon Bell. Today, something different.

How did you learn to code? For me, it was in grade 10, the second year of high school. I took a class taught by Mr. Omar, who was mainly a math teacher. A great math teacher. I really liked him. But he also taught this programming class. The class was on the programming language, Turing, which I don’t think is used anywhere outside of Ontario, Canada in the ’90s probably. But the programming language was like Python. In the class, we split up into groups of two and we built a game, the Yahtzee game. My partner was Jason Solomon, but for me, this class was so much fun.

I don’t know. It was just the feeling of programming, the feeling of building something and seeing it change, making another change, seeing it change, the quick feedback cycles, the figuring things out. It was just amazing. I decided at that point that whatever this was, this computer programming thing, that’s what I wanted to do and I did do it. I went to university for computer science and I never looked back. So that class was very fortuitous for me.

But how do other people learn to code? If somebody asks me because they have a project they want to build or because they just heard it’s a good racket, or they’re just curious, if somebody asks me how to learn to code, I can’t tell them, go to Leamington District Secondary School and take Mr. Omar’s class, Intro to Turing. That advice doesn’t fly. I’m pretty sure he is retired and the programming language probably doesn’t even exist anymore.

But today I want to answer that question. I want to take you through some of the education research, some of the computer science research on how to teach someone how to code. Research on how to learn and how to build a skill. And by the end of it, I think you’ll have a solid answer for that question. How should you approach learning a challenging task like coding? How should you specifically approach learning coding and how should you approach your own learning goals? So that’s the episode. Let’s do it.

The Camel Has Two Humps

There’s this story that goes back years in the world of computer science education. A story that’s a bit polarizing. At its center are these two researchers, one is a professor, one is a graduate student, and this paper that they wrote with a famous title. The paper is called The Camel Has Two Humps.

So picture this. It’s 2006 and it’s a world that’s growing more and more digital by the day, a world that’s desperate for computer programmers and people who can build the future. In the midst of all this, a special test was found by Saeed Dehnadi. I imagine Saeed and his advisor, Richard Bornat, sitting in Richard’s office at Middlesex University near London, England, and they’re poring over the results of Saeed’s test looking for patterns to this questionnaire, to this quiz that they have been putting students through. They were looking for an answer to a simple question. Is there such a thing as a natural born programmer?

And they found something, or so they thought. A simple test that they claimed could predict with significant accuracy who would excel at programming and who wouldn’t. It was, in their own words, like separating programming sheep from non-programming goats.

But before I get into what the test was, here’s what happened:

They wrote their findings up and they started to distribute them and the test that they had developed. And the idea caught on. It was a much simpler world to believe that people were divided into those who would get programming and those who wouldn’t. Instead of confronting the complexities of teaching methods and learning styles and the myriad of factors that influence whether somebody would succeed in a programming class, this was just a simple test. Take this test. You’re either in or you’re out.

Remember, this paper was making its way around the computer science world in 2006 and somewhere around the late, late ’90s, computer science departments had started to get huge.

The Grade Curve

Computer science departments were at the coalface of getting more people into technology, and that wasn’t always easy. My classes at the University of Windsor year 2000, they were huge that first year. I guess the dotcom boom had led to lots of people going into computer science and to the departments expanding a couple years before I got there. And so all of a sudden it wasn’t just computer nerds like me, it was all kinds of students. But I had a class that first semester, that first year that was C programming, and I had done some programming before, but it was the Turbo Pascal, it was the Turing I mentioned, Visual Basic. And I did struggle with the pointers in C. Pointers were still not my thing, but here’s what I remember. I remember one day after a rough midterm, this very large lecture hall, and I remember the professor saying … Let’s call him John.

He shows a graph of the marks for the class. It’s projected on the board from his computer, and it isn’t like a normal bell curve. It’s not the hill that centers around one spot, but it’s kind of this messy blob that spreads out all over the place. This is exactly what the paper was about because my professor said there’s lots of theories about what’s going on here. So one theory is that this doesn’t look like a bell curve because there’s multiple bell curves here. There’s two of them. And really this is the theory I would’ve gone with.

I was not a very studious person, but I knew how to program. I mean, I struggled with the pointer math, but yeah, I did good in the class. I knew what I was doing. Others didn’t. Others really didn’t and really struggled. I had two friends, Matt and Joel, they were on my floor in residence and they were both in computer science first year. And both didn’t end up in computer science second year. Both ended up transferring to the school of business.

Matt never got past calculus class that was a prereq for so much of second year that he pretty much had to switch programs. And this happened for a lot of people, which meant that come second year the classes were much smaller. Some of those people on that graph, they didn’t make it.

The Interpretation

So that’s what the paper was about, The Camel Has Two Humps. Instead of university comp sci classes having a normal curve, they had two curves and those are the camel’s two humps and one of those humps is a dud. They should not be there. They do not have what it takes to become programmers. It was very reassuring for first year student university me to hear this professor say, “Listen, this is the graph. Some of you don’t have what it takes and others do.” Because it’s nice to be one of the chosen ones.

And think of my professor. John had been teaching for nearly two decades. He had been trying his best to instill the intricacies of coding into his students. And some, they catch on quickly while others, they just don’t. And picture John late one night and he’s staring at these disappointing midterm exam results and he’s questioning his teaching methods and his patience and his career choices. And then somebody distributes this paper to him, The Camel Has Two Humps, and it says this.

We have found a test for programming aptitude of which we give details. Remarkably, we can predict success or failure even before students have had any contact with a programming language and with total accuracy. We present statistical analyses to prove the latter point. We speculate that programming teaching is therefore ineffective for those who are bound to fail and pointless for those who are bound to succeed.

Adam: In this theory, of course, John can find solace.

It’s not his fault, he can tell himself. These students, the ones failing his class, they simply can’t code. It’s not a matter of effort or of teaching methods or patience. It’s just nature drawing a line between those who can and those who can’t. They were born that way.

For John and countless other educators, this paper was life affirming. It validated their struggles and their frustrations and it gave them an out, a chance to wash their hands of their responsibility. And this allure of a simple explanation, it made this paper a hit. It made this paper famous or infamous.

The Test

Adam: So this test that can decide if you’re a programmer or not, what’s in it? Well, you can find the test on Richard Bornat’s website and taking it gave me pause. What if I failed? What if it’s all the pointer math that I couldn’t handle in that C class?

What if I should have washed out of comp sci like 23 years ago and I should have transferred with Matt to be in business school and be a business consultant somewhere. But I took the test anyways. The test is like 12 pages long. It’s designed to be printed out in landscape mode. Imagine a bunch of A4 papers turned the long way and stapled together and handed out to students at their desk. And the first question goes like this.

Read the following statements and tick the box next to the correct statement. A equals 10. B equals 20. A equals B. The new values of A and B are …

Adam: Then you have a whole bunch of answers, including A equals 20, B equals 20, which is what I would go with. It’s assigning the value of B to A, which is not hard.

There are 12 questions, and here’s the weird thing. They’re all like this. The 12th one is A, B, and C, but all they seem to test is that you understand how variable assignment works in a programming language that’s like C.

So I passed the test. I got all 12 answers right. But here’s the thing. What an absurd test. Are they saying that some people are born naturally understanding C-based variable assignment? That can’t be right. Someone has to explain to you that equals means assignment. That’s not something from birth. It seems like there’s a lot of assumptions embedded in this quiz.

But then it’s time to grade and the grading is a little bit different than I thought it would be. It turns out that there are not right answers per se. The test was looking for consistency. If you thought assignment A equals B caused the values to swap and you use that logic throughout all 12 answers, you would pass the test.

The pass fail was based on whether or not you applied the same rule across all of the test answers, which is a bit better. So it’s not actually C assignment, it’s consistency. The whole thing is testing whether you have a theory for how a computer program might work. That the computer would have some specific rule that would execute repeatedly question after question.
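The consistency idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not spell out the real grading scheme, so the three "mental models" below, the `consistent_model` function, and the sample questions are all hypothetical. Only the first model matches how a C-like language behaves.

```python
# Hypothetical mental models of what the statement `a = b` might do.
# These names and the grading rule are illustrative assumptions, not the
# actual scheme used by the test.

def copy_right_to_left(a, b):
    """C-style assignment: a takes b's value, b is unchanged."""
    return b, b

def copy_left_to_right(a, b):
    """Reversed reading: b takes a's value, a is unchanged."""
    return a, a

def swap(a, b):
    """The two variables exchange values."""
    return b, a

MODELS = {
    "copy_right_to_left": copy_right_to_left,
    "copy_left_to_right": copy_left_to_right,
    "swap": swap,
}

def consistent_model(questions, answers):
    """Return the name of the one model that explains every answer, else None."""
    for name, model in MODELS.items():
        if all(model(a, b) == ans for (a, b), ans in zip(questions, answers)):
            return name
    return None

# Each question gives starting values (a, b); each answer is the (a, b)
# the student believes results from `a = b`.
questions = [(10, 20), (3, 7), (5, 1)]
swapper = [(20, 10), (7, 3), (1, 5)]   # always swaps: wrong for C, but consistent
wobbler = [(20, 20), (7, 3), (5, 5)]   # switches rules partway through

print(consistent_model(questions, swapper))  # swap
print(consistent_model(questions, wobbler))  # None
```

Under this sketch the student who always swaps "passes" even though every answer is wrong for C, while the student who changes rules midway does not.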

( Of course, if you change your mind halfway because it’s a test and you’re nervous, then I guess you lose.)

But I guess what they’re saying here is if you don’t have a model for how computers work, even if you just make it up, then computer science is not for you. That was the test that they claimed separated students into these two groups. Those who are naturally born and will be programmers and those who can just never cut it. But the story doesn’t end there.

Retraction

The paper circulated and its conclusions came under scrutiny. Researchers tried to replicate the results and they came up short. The sample size was kind of small. The methodology was a little unclear, especially how they determined whether a student had applied a consistent rule. It also just doesn’t match how we think about learning a skill. With effort and practice, people get better at things, including programming. There’s no impossible bar that nobody can get beyond, especially something as simple as variable assignment. That can’t be what’s happening here. But then there’s a twist. Richard Bornat came forward and he said the paper was a mistake. He wanted to retract it.

Though it’s embarrassing, I feel it’s necessary to explain how and why I came to write The Camel Has Two Humps. It’s in part a mental health story. In autumn 2005, I became clinically depressed. My physician put me on a then standard treatment for depression, an SSRI. She wasn’t aware that for some people an SSRI doesn’t gently treat depression. It puts them on the ceiling.

Adam: I think what he means is the depression caused him to have mania. Perhaps his depression was bipolar depression.

I took the SSRI for three months, by which time I was grandiose, extremely self-righteous and very combative. Myself turned up to 111. I did a number of very silly things while on the SSRI and some in the immediate aftermath, amongst them writing The Camel Has Two Humps. I’m fairly sure that I believed at the time that there are people who couldn’t learn to program and that we had proved it.

I also claimed in an email that Dehnadi had discovered a 100% accurate aptitude test. Perhaps I wanted to believe it because it would explain why I had so often failed to teach students. It was an absurd claim because I didn’t have extraordinary evidence. I no longer believe it’s true.

Adam: It turns out Bornat had been struggling. Struggling to teach students had been something that weighed on him. And just like the people who grabbed onto the result he found, this finding had been a savior because it showed him that he didn’t have to worry because these people were unteachable. Speaking of those people, by the way, this test had some slight overtones as well.

In the early days of this paper, Bornat may have made some comments about gender being a factor. Basically, women were in the dud hump.

You see the paper had a cost. A cost that was paid by the students labeled as non-programmers, the students who were written off before they even had a fair shot. It was paid by the underrepresented groups, so the women, the people of color, the non-traditional backgrounds who were already fighting uphill battles to carve out spaces for themselves in the tech field and now here’s a paper that says, oh, you don’t make it.

The Impact Of the Camel Humps

Bornat, of course, wishes the paper was never published. But I think it’s valuable because it serves as a reminder that science is a messy process and there’s missteps along the way and not everything is rigorous or peer reviewed and that we should question bold claims.

But most importantly, and why I wanted to cover this paper is that we should resist easy explanations when thinking about human abilities and learning. It’s easy to dismiss yourself as not being capable of learning something, and that is by far the least likely explanation.

More likely there might be some basic grounding knowledge that’s missing, steps that haven’t been explained well, problems in pedagogy. We need to have empathy for the challenges that people face when learning.

There’s another thing about The Camel Has Two Humps paper that bothers me that seems less mentioned. If you accept the claim that students can be divided into those who code and those who can’t, that might be immediately reassuring if you’re struggling to teach a class or something.

But the next thing that follows from that is there’s no way to teach people computer programming. They either can or they can’t. And isn’t this, if true, the most tragic thing for a computer science educator to learn? To learn that your vocation has no point. That all the time you’ve devoted to it has not been useful.

How tragic would that be? It’s the perspective of somebody who doesn’t want to be teaching. What happens if we consider teaching computer programming from a different perspective? From the perspective of someone who loves and is obsessed with learning? That’s the next expert. One who would change not just computer science education, but attempt to change education in general.

Seymour Papert

Adam: Part two, Seymour Papert.

Seymour Papert had a passion for learning from a very young age. He didn’t discover this passion in school though. Instead, he started with his love of gears.

Quote:

Before I was two years old, I had developed an intense involvement in automobiles.

The names of car parts made up a very substantial portion of my vocabulary. I was particularly proud of knowing about the parts of the transmission system, the gearbox, and most especially the differential. It was many years later before I understood how gears worked, but once I did, playing with gears became a favorite pastime.

I loved rotating circular objects against one another, gear like motions, and naturally my first erector set project was a crude gear system. I became adept at turning wheels in my head and making chains of cause and effect. This one turns this way, so this one turns that way. This one moves this many gears, this one moves that many.

Gears serving as models in my head carried many otherwise abstract ideas into my understanding. I clearly remember seeing multiplication tables as gears and my first brush with equations immediately evoked the car differential.

Adam: So even from a young age, Seymour was passionate about learning, but this passion did not extend to math class.

I was lucky to have been born a mathematician, but I went to school like other children and hated every minute of math.

In particular, I loathed and feared mathematics class. The subject as it was taught in school seemed to have no relation at all to the gears that interested me.

I could not understand why the teacher would stand at the blackboard and scribble strange marks and talk in a weird language.

Adam: You see, Papert could do math in his head using gears. How many teeth were turned here? This gear multiplies the effect of this movement and so on. What he loved about math was its concrete, hands-on form. Math that he could play with and take apart and understand intuitively.

But math class was all abstract symbols and rote memorization. It was disconnected from the math in gears and machines that he had played with, the math that lit up his young mind. This early experience, though, planted an idea in Seymour’s head: that learning should be active and engaging and connected to real world interests.

MIT

You see, years later, Seymour became a pioneer in artificial intelligence and he was working at the artificial intelligence lab at MIT. He wanted to create a new way for kids to learn math and programming. This was the 1960s, and most of his colleagues saw computers as these giant calculators, perfect for crunching numbers and data, but not so relevant to more creative pursuits.

But Papert believed that by commanding computers to draw shapes and patterns, kids could learn geometry and problem solving skills in an intuitive and playful way. So he puts together this scrappy team of idealistic young researchers and they get to work thinking about new ways that computers can teach kids. The first thing that he wanted to change about education was math class.

The way they introduced probability is some ridiculous calculating of fractions.

It’s not useful for anybody.

You’d never suspect from that that probabilistic thinking was one of the most powerful and dramatic far reaching change agents in the history of science.

Adam: So instead, his idea was maybe we could teach probability by having kids build things with randomness built into them.

We can have five year old kids making art on the computer. Introduce randomness and probability into that to produce wonderful effects. We’re going to have seven or eight year old kids making robotic devices that have probabilistic elements built into them so they can get around obstacles.

Adam: This gives you a flavor of Papert’s thinking. To learn probability, you need to actively construct a probabilistic system, wrestle with core principles by building robots or building art that incorporate chance.
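As a rough illustration of what building a system with randomness in it might look like, here is a small Python sketch of a pen that turns by a random amount before each step. The function name `random_walk` and its parameters are assumptions made for this example; nothing here comes from Papert’s own materials.

```python
import math
import random

def random_walk(steps, step_length=10.0, max_turn=45.0, seed=None):
    """Return the points visited by a pen that turns a random amount each step."""
    rng = random.Random(seed)          # seeded so a drawing can be reproduced
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for _ in range(steps):
        # Pick a turn anywhere between -max_turn and +max_turn degrees.
        heading += rng.uniform(-max_turn, max_turn)
        x += step_length * math.cos(math.radians(heading))
        y += step_length * math.sin(math.radians(heading))
        points.append((x, y))
    return points

path = random_walk(steps=200, seed=1)
print(len(path))  # 201 points: the start plus one per step
```

With a small `max_turn` the line wanders gently; with a large one it knots up on itself. Changing that one number and watching the picture change is a hands-on way to feel what the spread of a random quantity does.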

He wanted probability to be hard fun. His term for immersing kids in big ideas and letting them playfully build systems around those ideas. Not passively absorbing, but actively creating. That’s where he believed the deepest learning lived.

And this all was because he remembered his own childhood frustrations with the boring abstract math class. That really got to him.

Logo

Adam: So in 1967, Papert and his team at MIT created Logo, the programming language for children. Logo was among the first programming languages with graphic capabilities, but the key was its simplicity, and to make the programming even more concrete, they also built a little robot called the Turtle.

The turtle robot has three key attributes. It has a position and a heading, so children can map its movements to their own body, and it can be controlled with commands like forward, back, left and right.

And it leaves the trace of its path with a pen so kids can see the shapes they tell it to make.

The Turtle builds on a child’s knowledge of space and movement. By playing Turtle and acting out the movements themselves, kids can learn geometric concepts intuitively, and by programming the Turtle, they actively build knowledge structures.

Adam: The Turtle allowed free yet systematic exploration of mathematical ideas. It was a Cartesian plane, but it was also a turtle you could play with. In 1980, Papert published a book, Mindstorms: Children, Computers, and Powerful Ideas, and it shared his vision for learning via making things and tinkering. He called his method of learning constructionism. It stood in stark contrast to the instructionist model used in most classrooms.

Constructionism meant learning by building new knowledge with concrete objects and experiences. Yes, it was built on his idea of gears, but it was also built on time he spent observing children learning at schools. He illustrated his theory through one such observation at a junior high school.

I would pass an art class every day. For a while, I dropped in periodically to watch students working on soap sculptures.

This was not like a math class. In math class, students are given little problems to solve on the fly. But in this art class, they were all carving soap over many weeks. It allowed them time to think, to dream, to get to know a new idea and try it out, to talk, to see each other’s work.

Not unlike mathematics as it is for the mathematician, but quite unlike mathematics as it is for junior high school.

Adam: Papert saw that these students were learning art in a constructionist way. They were iterating on long-term projects that embodied their interests and ideas. They could touch and feel their work. And Papert was at MIT, and in the MIT lab, he knew the work he and his colleagues did was much closer to the process of these soap sculptures. They were sitting around thinking and playing with ideas. They weren’t sitting around drilling math equations. They were dreaming and thinking and building.

So what he wanted, what he believed in was that math could be just as hands-on and personally meaningful. Students should be able to construct geometric forms and see how equations work with gears and robots and whatever else. This type of hard, gritty play with mathematical concepts was active learning at its best. He wanted children to experience the beauty and power of mathematics for themselves, not just to be handed down instruction on it.

Constructionism opens up a new way of thinking about learning and knowledge. It tries to make learning a form of play, one where students build up their own knowledge rather than having it handed down. The idea of the Logo programming language, it wasn’t there to teach computer programming. In a way Papert cared little about actual computer programming. What he cared about was using it to find a way to make learning fun.

Make Learning Hard Fun

I have learned many new skills in the spirit of hard fun, including flying airplanes, cooking, juggling, and living with distorted spectacles that alter my vision.

When I learned to cook chicken curry from my friend Sanjay, at first I had to carefully follow the recipe, but after making it a few times, I could improvise and adapt the recipe as I wanted.

It was the same when I was learning to juggle. At first, I was dropping the balls constantly, but then I started getting the rhythm and I had my first exhilarating moments of keeping them up in the air.

Adam: So Seymour was a computer science educator, but only because computers and robots could be learning tools.

In fact, side note, Hal Abelson, the Structure and Interpretation of Computer Programs author, he has a book that teaches advanced geometrical and mathematical concepts all using Logo. The whole thing using an exploratory constructionist Papert approach.

That was Seymour’s vision, right? All classes should become art class or auto class or woodworking class where you explore and make things and touch things.

At its core, this constructionism is about joy and the power of learning by making. Seymour Papert never lost his sense of wonder and discovery, neither for himself nor for the many children he was trying to inspire. His vision was that a graphical computer environment or even a robotic device being controlled by a computer could transport the student to a magical place that he called math land.

MathLand

Adam: Math land sounds fanciful and magical and maybe challenging, I guess, but what it was to him and his team was a place where kids could understand and learn mathematical concepts the same way that a child in Paris might pick up French.

A place where they could learn geometry and algebra and everything else just by interacting with the world. They would just absorb it because it would be part of the background, it would be natural, and it would be fun.

There were a lot of constraints on building this idea of math land. Consider that computers at the time were just teletype terminals. The computers didn’t even have screens. Even with those constraints, Seymour and his team made it work.

You can imagine them going to a school, a school that has a teletype terminal, and bringing the Logo robot, the Logo turtle, which was quite large at the time, phoning in to the MIT computer with the teletype machine and hooking up the Logo robot.

The kids gather round, they set up a wood barrier with paper underneath it and then into the teletype machine somebody types forward 100 and it moves. It leaves a mark on the paper behind it with a pen, and then somebody puts right 90, forward 100, repeat four, and the kids gasp as the robot slowly draws out a square on the paper.
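The commands typed into the teletype can be imitated with a tiny simulated turtle. The sketch below is an assumption-laden stand-in, not the original Logo implementation: the `Turtle` class, its starting heading, and the rounding of trace points are all choices made for this example.

```python
import math

class Turtle:
    """A tiny stand-in for the Logo turtle: a position, a heading, and a pen."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.heading = 90.0                 # degrees; 90 means facing up the page
        self.trace = [(self.x, self.y)]     # every point the pen has touched

    def forward(self, distance):
        """Move along the current heading and record the new pen position."""
        self.x += distance * math.cos(math.radians(self.heading))
        self.y += distance * math.sin(math.radians(self.heading))
        self.trace.append((round(self.x, 6), round(self.y, 6)))

    def right(self, degrees):
        """Turn clockwise without moving."""
        self.heading = (self.heading - degrees) % 360

    def left(self, degrees):
        """Turn counterclockwise without moving."""
        self.right(-degrees)

t = Turtle()
for _ in range(4):          # Logo: REPEAT 4 [FORWARD 100 RIGHT 90]
    t.forward(100)
    t.right(90)
print(t.trace)              # five points: the start, three corners, and the start again
```

After four sides and four quarter turns the turtle is back where it began, facing the way it started. Python’s standard library also ships a `turtle` module with `forward`, `right`, and `left` commands that draws on screen, for anyone who wants the picture rather than the coordinates.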

This made learning geometry and Cartesian coordinates exciting. Now, this was all in the late ’60s, but I had my own experience with math land.

Personal MathLand

When I was in grade three, far before Turing and Mr. Omar, we had these two Commodore 64 computers in the classroom. And when it was raining, we stayed inside and we’d play on the computer. Sometimes we played Carmen Sandiego. But Logo was on there and my friend Ben showed me how it worked. And we figured out, or his older brother had told him, how to do a triangle.

You could draw a triangle by going forward and then turning, going forward. You had to get the degrees so that the triangle would end up catching back up with itself. Once we figured that out, how the angles of the triangle had to add up, we figured that we could repeat it. We could draw a triangle. You go back to start, you turn a couple degrees, draw a triangle again. And if you repeated that enough times, now you had a spiral slowly being built of triangles going all around like a spirograph thing.

And that was so amazing to us, but it took so long. We’re slowly watching it draw. So we’d do it where you draw the triangle and you turn 90 degrees and you do it again and you get four triangles in a spiral, and then if you do less than 90, the spiral gets tighter.

And so we thought, well, what is the biggest one we can do? So we tried it: draw a triangle, turn one degree, draw a triangle, turn one degree, but it was so slow. It was cool to watch as this triangle shape slowly morphed, but recess was over and we had to go back to working on whatever was going on in class.

So we turned off the screen on the Commodore, I’m pretty sure, and just left it running with the idea that at the end of the school day we could go see and see what was drawn. And we were so excited about what this creation would be. But when we got back at the end of the day and looked, it was just a circle.

If you draw a triangle, turn one degree, and keep doing it, you basically fill in every pixel on the screen, very slowly, with this Turtle drawing triangles.
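Why it came out as a circle can be checked with a short Python sketch. It assumes an equilateral triangle with a 120-degree turn at each corner, which the story does not specify, and the function names `triangle_vertices` and `rosette` are made up for this example.

```python
import math

def triangle_vertices(side, start_heading):
    """Vertices of an equilateral triangle drawn from the origin, Logo style."""
    x, y, heading = 0.0, 0.0, start_heading
    points = [(x, y)]
    for _ in range(3):
        x += side * math.cos(math.radians(heading))
        y += side * math.sin(math.radians(heading))
        points.append((x, y))
        heading -= 120      # turn right 120 degrees; three turns total 360
    return points

def rosette(side, step_degrees):
    """Repeat the triangle, turning step_degrees between copies, for one full turn."""
    copies = 360 // step_degrees
    return [triangle_vertices(side, i * step_degrees) for i in range(copies)]

for step in (90, 30, 1):
    shapes = rosette(100, step)
    # Distance from the shared starting corner to the farthest vertex drawn.
    farthest = max(math.hypot(x, y) for tri in shapes for x, y in tri)
    print(step, len(shapes), round(farthest, 3))
```

However small the turn between copies, no corner ever lands farther from the starting point than one side length, so 360 overlapping triangles can only ever fill in a circle of that radius. The three 120-degree turns also add up to one full rotation, which is what brings the turtle back to where it started.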

But it was fun. I still feel like I have a good understanding for why the degrees of a triangle have to add up to 180 degrees. Because I felt it exploring with Ben, and it was just a game we were playing at recess. That’s the thing. The thing about Logo was that it had what Seymour called a low floor, but a high ceiling.

Me and Ben could mess around with it, but it was also rich enough that Hal Abelson could teach an MIT geometry class using it.

Lego Mindstorms

Adam: Seymour wanted a method to extend that concept, and so they started using Lego blocks to build things. It turned out that the Lego company had very similar goals. Lego company cares a lot about children learning, and they wanted people to build things as well so they sponsored Seymour at MIT. They started collaborating together.

This was decades after the Mindstorms book, but out of this collaboration came Lego Mindstorms. I never had one, but they always looked so cool. They were like a robot construction kit that you could program, made out of Legos. And the name Mindstorms was of course in honor of Seymour’s book. The thing that happened is that Lego Mindstorms, instead of taking off with kids, actually found a lot of popularity with hobbyist adult tinkerers and hackers. According to Wikipedia, Lego wasn’t quite sure what to do about that. Maybe this was too advanced a toy for kids, but adults used it and loved using it. But because it didn’t quite meet, I guess, Lego’s focus, in 2022, they canceled Lego Mindstorms, which is seriously tragic.

School Today

Adam: But when we think about technology in schools today, we think about Chromebooks and iPads and educational apps. Devices and content delivered from big companies. This is very different than the Commodore 64 that started up into a BASIC prompt that you could launch Logo from.

Very different from what Papert imagined. For him, computers were not about content delivery or standardized curriculum. They were toys. You could take them apart, you could play with them, you could use them to design and create things. You experiment. You take risks. Like playing with gears as a child or taking a car apart, turning the wheels and imagining how they connect. You work hard to figure things out because you’re absorbed in your own creative process. This was Papert’s vision for how he could transform education, not with these top-down Chromebook instruction devices, but with creative toys.

Papert’s Lesson

Adam: This is why I think this is all relevant. Some of Papert’s ideas have been put into question, but I think that one thing that is very true is that if you’re teaching somebody to code, they should have a project. They should have something that they want to build that’s fun and exciting for them that makes it feel like play. Something that drives their energy forward.

I think that’s the important lesson he gave us, is that the way that you enjoy learning the most is with a self-directed project. Here’s why this matters for learning. If somebody wants to learn to program, there’s no better way than having as part of it, a self-directed project that they want to build. Nothing will be as fun and provide as much motivation. And when you’re learning to code on your own or taking on a learning goal that’s challenging of any sort, motivation is important.

So that was Papert. He is very famous, but our next stop on this learning tour is the mythical man himself, Fred Brooks.

Fred Brooks

Adam: Okay, it’s the 1950s. America and the world are captivated by the rise of computers, these giant hulking machines that promise to transform everything. But software is still a novel concept. Nobody really knows systematically how to build it.

And that question perplexed a bright graduate student named Fred Brooks. He was working on his PhD at Harvard and he took a summer job at IBM. And they gave him a job. Write the software for the IBM 704 mainframe. There you go, summer job.

So Fred’s alone in a room with this hulking computer. I don’t know what a 704 mainframe looks like, but I assume it’s huge. And he struggled. The work was slow and painstaking, and after just a few weeks, he had only produced 200 lines of working code. And he knew that this wasn’t fast enough. That he would never get his summer project done.

I don’t know what writing assembly for a mainframe is like, but I think cognitive load is probably very high. Research today will show that one of the biggest challenges in learning to program or learning in general is managing cognitive load.

There’s just a lot to hold in your head at one time, especially if you’re new to it. And if you don’t have previous experience, you don’t have chunks of knowledge in long-term memory that you can hold onto. You have to try to shove it all in your working memory.

And this was Fred’s problem, but he had an idea. Get help. Why not just recruit another programmer to work at his side?

So Fred convinced his friend Henrik to join forces. So they sat together at a terminal sharing ideas back and forth, and they began to build out the software for the mainframe. And the software, according to Fred, took shape rapidly, as if their creativity was amplified. And in just 10 weeks, they got the summer project finished and it was free of defects, which sounds remarkable to me. I don’t know if I buy it.

But here’s what I buy. Fred realized that two minds are greater than one. When they collaborated, sharing knowledge and catching each other’s mistakes, their potential blossomed. It went from something impossible to something that could be done.

And Fred knew he had experienced something special. A new way to write software was being born. Fred recognized the power of teamwork even in those early days. Spread the cognitive load out across two people and you can raise the bar on what’s possible.

This was pair programming, although I’m not sure it really had a name then. It was more just getting help from a buddy. If I’m stuck on something, I can go talk it out with someone else, and then if that doesn’t help, then odds are the two of us will gather around my machine as I walk them through the problem.

Pair Programming Gurus Of the 90s

Adam: But in the 1990s, some people had an idea to extend this pair programming concept to its logical conclusion: always work in pairs. They were touting benefits like higher code quality and knowledge sharing, but they couldn’t convince everybody.

Most programmers were still working solo alone in their cubicles, headphones on, trying to stay in flow, totally focused, cranking out line after line of code, maybe rocking out to some metal or some EDM. Who knows?

But now advocates of this newfangled pair programming appear and they tell them, break your flow. Take your headphones off. Stop listening to Metallica and instead, constantly talk out loud to a partner. Tell them what you’re doing. Share a screen and a keyboard with them and work through the problem that way.

It sounded nuts. It sounded counterintuitive to a lot of seasoned developers. It seemed to question the very nature of programming as an inherently individual activity. Because heads-down coding is like a personal, introverted, meditative thing. Who wants a constant partner? That’s not why people got into it.

And if you’re me, there’s also a whole realm of self-doubt.

I’ve had to debug things before with some people watching over my shoulder who I thought were judging me, who I didn’t totally trust, and I found it quite the opposite of increasing my brain power. I found it like a coding interview. The cognitive load was higher because I was worried about what they were thinking. I was worried about their eyes on me.

So reactions to pair programming were skeptical. There was audible groaning and eye rolling at every place I’ve worked whenever somebody brought up pair programming as being a magical solution.

All this talk of knowledge sharing and pair flow, it sounds a bit vague and a bit touchy feely. Where’s all the data to back it up? By the way, this is all going to come together about learning.

Pair Pioneers in Utah

But software pioneers exploring this practice, they could feel just from doing it that they were onto something, and so they pushed on this all throughout the ’90s. At pioneering companies like Chrysler, they reported that pairs got more done in less time. They produced fewer bugs and programmers actually enjoyed their work more. The energy and pace when two developers worked in sync was palpable they said. The pleasure in creating something neither could quite do on their own, it was amazing. It drew them in.

Still, there were many reasonable questions and doubts in these early days. Productivity and quality improvements were unproven. So in 1999, researchers at the University of Utah designed an experiment to settle the debate. I’m pretty sure they were on team pair programming to begin with, but the experiment setup was pretty solid. They divided the student programmers into two groups, solo developers and pair programmers, and they observed them complete real world coding assignments. The results of the experiment were clear.

The pairs completed their programs faster and with higher quality. Now, the pair programming advocates could go to industry and say, look, we proved it, this works.

But there are some caveats. First of all, these are students, not professionals, and they were working on toy problems, not real work, necessarily. They self-selected into their groups. They got to choose if they wanted to be in a pair or alone. I would’ve picked alone. But it’s also clear that they got the problems done almost twice as fast. So if it took two hours for someone to solve a problem on their own, the pair might get it done in a little bit over an hour. Slightly more people time, but far less wall clock time. And the programming things were … They were not just hello world. These were real projects.
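To make that trade-off concrete, here is a rough sketch of the arithmetic. The two-hour solo figure comes from the example above; the 1.1 hours for the pair is only an illustrative stand-in for "a little bit over an hour", not a number from the study, and the function name is made up for this sketch.

```python
def compare_solo_and_pair(solo_hours: float, pair_hours: float) -> dict:
    """Compare wall-clock time and total person-hours for a single task."""
    solo_person_hours = solo_hours * 1  # one developer on the clock
    pair_person_hours = pair_hours * 2  # two developers share the same clock time
    return {
        "wall_clock_speedup": solo_hours / pair_hours,
        "solo_person_hours": solo_person_hours,
        "pair_person_hours": pair_person_hours,
        "extra_person_hours_pct": (pair_person_hours / solo_person_hours - 1) * 100,
    }


# Assumed numbers: 2.0 hours solo, 1.1 hours for the pair (illustrative only).
result = compare_solo_and_pair(solo_hours=2.0, pair_hours=1.1)
print(f"Wall-clock speedup: {result['wall_clock_speedup']:.2f}x")
print(f"Person-hours: solo {result['solo_person_hours']:.1f}, pair {result['pair_person_hours']:.1f}")
print(f"Extra people time for the pair: {result['extra_person_hours_pct']:.0f}%")
```

With these assumed inputs, the pair finishes in a bit more than half the wall clock time while spending about 10% more total person-hours, which is the shape of the result described above.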

Pairs Spread

From there, in the early 2000s, in the aughts I guess as they call it, pair programming finds some uptake in industry. At NASA, people claimed pairing reduces defects in their complex software, which is mission critical, so they have to get it right.

Some outsourcing firms in India come forward bragging about how pair programming allows them to deliver very high quality code in short timeframes. And Sabre, a major player in the travel industry, sees increases in both productivity and quality after adopting pair programming company-wide. This is what they say.

So I have some strong suspicions about this always pair programming in industry idea. The pair programming advocates sometimes make it sound like this is a panacea, that it fixes all problems.

And I question that.

The existing research shows that two devs in front of a computer get work done almost twice as fast. And that’s astounding. That breaks Fred Brooks’s famous law that you can’t speed up a project by adding more people to it.

And there’s sort of an efficient market hypothesis argument against pair programming working. If it were true that you could nearly linearly improve the wall clock time of development by just adding more people but keeping them in pairs, you’d expect industries such as startups where I work, where such a huge premium is put on execution speed, you’d expect pair programming to be pervasive.

Pair Learning

Anyways, that doesn’t matter because today’s topic is learning. And so far we’ve talked about how it’s not a natural innate ability. Everybody can learn to program. And we’ve talked about how making an experience fun and exploratory helps.

But also pair programming is an amazing way to learn. There’s no question.

Imagine that you were trying to ramp up on a new system at work. Maybe it’s been around forever and you’ve never had to touch it, and there’s just so much tribal knowledge around it. All of those tricks of the trade that nobody bothered writing down. They’re out of date in some wiki somewhere. The experts on the team never seem to have the time to properly document things. They’re too heads down on their own tasks. But if you can get one of those experts and you can pair program with them, it’s like having your own personal mentor right there at your side.

Those unwritten rules, they’ll explain them to you as you go. The shortcuts and the tools that took them years to build up, you’ll master them quickly or quicker just by watching. When you hit some gnarly bug or some design problem, you get the benefit of their experience in working through it. And it’s not just having someone point you to the right area or throw an outdated wiki at you. This is learning in action, in context. It’s exactly where you need to learn and where you will retain so much more of the knowledge because it’s directly relevant to the problem you’re trying to solve.

And you may be thinking, won’t I slow them down with all my questions? Don’t they have things to do? But here’s the thing, by explaining out loud what they’re working through, they’ll actually gain new insights into their own expertise and strengthen their own knowledge. Teaching someone else sharpens your own mastery so everybody wins.

Pair programming accelerates the learning curve and it closes knowledge gaps. There’s actual research on pair mentoring that backs this up, which is really wild. But before we get to that, I probably should have done this earlier. Let’s connect the dots here.

Summary Of Learning Approach So Far

Adam: So let’s say you want to teach someone to code. First thing is it’s totally possible. Never mind the camel humps. Second, the environment for play is very important. In the early stages of programming, cognitive load is very high, and it’s mainly because of all the stuff surrounding programming. How do I install Python? What IDE or text editor do I use? How do I install some package? What’s a terminal? How do I run something? I think learning calculus is probably harder than learning to code. Even if you’re trying to learn something unusual like Lisp or APL or Zig, there are much harder subjects to learn.

But what makes learning coding hard isn’t the inherent challenges, but all the struggles of just getting an environment set up right and learning the tools. So probably if you’re teaching someone to program, one of the big wins is just getting them past all of that nonsense. Just saying, okay, you’re learning Python or JavaScript or whatever and you’re using VS Code or this online REPL or whatever. Prescribe the environment and help them use it. Getting them there gets you to that Logo-like place where they can play, where they can type something and run it. Same with when you’re trying to learn something yourself. If you’re trying to learn a new skill or get familiar with a new area, getting everything set up is a great time to pair. You’re not going to want to because you won’t even be able to compile a project or have the right version of something installed.

You’ll have a bunch of setup stuff to figure out, but that’s the best time to get input from someone who knows the area. It’s the best time to get the lay of the land. It’s like Seymour going to some school and setting up the Logo turtle on the floor and the paper underneath it and some wood blocks around it. He’s setting up the environment in which that learning can take place. You get all that out of the way and programming seems less hard and less impossibly far away. But let’s stay focused on this one case, teaching somebody to program. I’m going to extend this later. So next thing is let’s follow Papert again. What does the person you’re teaching want to build? What motivates them to play and create? That can center the learning so they don’t run off in a million directions trying to learn a million things or get lost in some corridor of knowledge.

Building a project is fun and motivation is important to keep them going. But next though, as we learned, project-based learning that is purely discovery based is slow. And the learning path for Python or JavaScript, or whatever language is chosen for the project, is well developed. Wherever there’s a smooth, well-trodden learning path, it makes sense to follow that even if it doesn’t totally track with the learning project. This is hard advice for me to give. Find a Python tutorial and have them do that is basically what I’m saying. But I love lots of weird technology. I’d love to say like, oh, if you’re building a web app, you should learn Kotlin or Rust or something like that. And of course you need to learn my vim key bindings and whatever, but that’s a mistake. If you want to build a web app project, learning JavaScript or maybe Python is the way to go.

Any given beginning learning path using a standard editor and some online tools is the way to go. This is the lesson of instructive teaching research. A well laid out learning plan with explained examples is the fastest way to learn. If you think of learning as exploring a new territory, then picking a path that’s well trodden like Python or JavaScript beginner tutorials is like taking a highway, whereas choosing your own path that hasn’t been done before is like bushwhacking through the undergrowth. It’s much harder.

So that’s my overall learning advice so far. You need a project, you need a pedagogical path, and you need to pair with the person and make sure that they get their environment set up. But we still have to discuss another piece of learning research, maybe the most important, and that’s mastery learning.

Mastery Learning

Adam: So there’s this guy, Benjamin Bloom. He’s a famous educational psychologist, not a computer programmer at all, but he did an experiment that shocked the world. Shocked the world of education that is. I’m not sure how much anyone else cared. But it became known as the Two Sigma problem.

So picture it, it’s Chicago in the 1980s, and Bloom and his team of grad students have set up an experiment with students at different Chicago public schools. They took a group of average students randomly selected, and they had them learn a curriculum through conventional teaching methods, which is like lecturing, testing, some group work, et cetera. Standard classroom stuff.

Another randomly selected group from the same pool of students learned the exact same curriculum, but with one-on-one tutoring. Each student worked through problems on their own, but with a tutor who gave them immediate feedback and then advanced them to the next problem set when they had demonstrated that they had mastered that concept. The results, the students with tutoring vastly outperformed the others by two standard deviations, hence Two Sigma.

It was like taking an average class and transforming them from the 50th percentile to the 98th percentile, making every kid in the class the best kid in the class. This is as far as I can tell, the most important result from education research. It’s the biggest effect size found. Practically, it’s like making everybody the very best student in the class. And it caused quite a stir, as you can imagine. The notion that personalized tutoring could boost grades so dramatically, it seems too good to be true, but also hard because you can’t give everybody a tutor. It’s the complete opposite of The Camel Has Two Humps paper. It says that you can get anyone to the very top level if you sit them with a mentor who helps them master every process step-by-step. The study design came under a lot of scrutiny. Follow-up studies have been somewhat mixed.

Some showed similar major two sigma gains from one-on-one tutoring and mastery learning. Some found that, to get the two sigmas, students and tutors sometimes had to spend more time really slowing down and working through difficult areas. So yes, students became the absolute top of the curve, but it took a lot more time investment than a standard learning approach. Other studies showed more modest improvements, only one standard deviation. But one sigma, one standard deviation is still huge, and the core idea has held up. Guided practice with a tutor working you through works so well. The camel has zero humps or the hump is really compressed to one side of the graph. I don’t really know. I need a better term. The metaphor is breaking down here.
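To put those sigma numbers in percentile terms, here is a small sketch. It assumes test scores follow a normal distribution, which is a simplifying assumption made for this sketch, and the function name is made up for illustration.

```python
from statistics import NormalDist


def percentile_after_shift(sigmas: float) -> float:
    """Percentile (0-100) reached by a median student who moves up
    `sigmas` standard deviations, assuming normally distributed scores."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(sigmas) * 100


for shift in (0, 1, 2):
    print(f"{shift} sigma -> {percentile_after_shift(shift):.1f}th percentile")
# 0 sigma -> 50.0th percentile
# 1 sigma -> 84.1th percentile
# 2 sigma -> 97.7th percentile
```

Under that assumption, a two sigma gain moves the average student to roughly the 98th percentile, and even a one sigma gain moves them to about the 84th.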

We see this play out every day in the world of coding education. We see when successful bootcamps pair students with instructor mentors who get people through material they need to pass an interview very quickly.

We see it with competitive coding teams where a team of three students has a coach who intensively provides them feedback and gets them up to an astounding level of coding proficiency.

And I’ve seen it most often when a senior team member spends a lot of time with somebody new to a team, somebody junior. A team can be working at a really high level doing advanced stuff and bring in somebody junior, work with them directly, assisting them, working them through things and catapult them up to a level that took people decades to get to.

The Mastery Approach

Adam: And so that’s your answer for how to teach someone to code. All the stuff I mentioned. A project that excites them, a tutorial series they can work through, but most importantly, you pair with them through the hard parts, work it through. The mentors in Bloom’s studies knew the material and made sure the students understood it step by step, but they weren’t the most amazing mentors in the world.

You could do this. You could teach someone better than the best instructor-led class out there because you’d be a personal mentor. This isn’t available for a teacher teaching a class. This is personalized learning. Think about that: If somebody wants to learn, you can teach them to a level higher than they could get in a class if the two of you can take the time. That’s my big lesson today. I don’t know how to underline that. It’s taken me a long time to get here, but that’s what I wanted to say.

If somebody has something they want to build, you can find learning materials, you can help them set up their environment and you can pair with them and help them work through the hard problems and they’ll get there. That’s it. And here’s the trick. It’s taken me a while to get here, but this doesn’t just apply to teaching someone something you know.

Teaching Yourself

Adam: It doesn’t just apply to teaching someone how to learn to code Python or JavaScript. The same process applies to learning in general. For helping others, but also for helping yourself. The thing I said about the camel study is that it’s wrong. That people can learn things that they want to learn, and that’s something you need to tell yourself when you want to learn something and you get stuck. You can do it. It might just mean slowing down on the hard parts or backtracking, finding the piece of knowledge that you’re missing. You can do all this on your own.

The hardest part to do on your own is the mentoring, obviously. You could try to find someone who knows what you want to do and maybe ask them for help, ask them for environment setup. You can seek out someone who knows the area you’re learning, ask them to pair with you when you get stuck.

Peer Tutoring

Adam: But there’s one more way. You don’t need to necessarily find an expert, but if you can, that’s amazing. But Bloom’s Two Sigma approach was extended to peer tutoring.

In peer tutoring, instead of a mentor, the learners each taught each other back and forth. You pair up with a peer. Basically, if you want to learn something hard, you find somebody else who also wants to learn it, and you guys work through it together. You put together a combination study group and pair programming group. One of you learns a topic and then teaches it to the other and then back and forth and you pair with each other on it. Education research says if you do that, you won’t get a two sigma increase, but you’ll get one standard deviation better than traditional learning instruction. Still an amazingly effective research result, and you don’t need an instructor.

The two of you can hill climb through the toughest material given time. Not to say that learning’s not hard. I just want to say that it’s possible, and the best way to do it is like that. Be motivated, have a project, but have a peer that can pull you up and that you can pull up. You each learn by teaching each other.

So this is what I found looking into the research on how to learn. How to learn coding, how to learn skills in general. And it’s wild how fast you can learn when the conditions are right. Learning’s hard, but when you’re doing it, the progress you make is astounding. When the conditions aren’t right, many people stagnate. If you’re good enough to do your job, you’ll just stay at that level for decades.

Grade 10 Turing

Adam: If I think back to that class, my grade 10 class on Turing, it’s hard for me to know whether I loved it so much because of the programming or because of the learning structure.

Me and Jason were paired up. We were teaching each other how to build back and forth. We had Mr. Omar there to help us set up our environment, and we had other kids in the class that we could learn from. We had a project that we were working on and we had a learning curriculum. If we needed to learn sorting or something like that, then Mr. Omar might jump in and explain bubble sort, but mainly we were learning from each other and it was exhilarating and hard.

Did I get into programming because of how much I enjoyed that learning method in that class? Yeah, probably. If there’s one thing all this research has taught me, it’s that learning is fun. Frustrating and hard, but just an exhilarating experience.

Outro

Adam: So that was the show. Thank you to Bob and Brandon and the supporters who helped me with feedback on parts of this episode, and thank you if you made it listening this far.

I know it is a bit of a journey to get to where I wanted to get, but I hope you found this interesting, and I hope it can inspire you to tackle some learning challenges or to teach some others what you know. If you want to get more CoRecursive content, including occasional sneak peeks of what I’m working on, join the supporters group. Link in the show notes, but it’s corecursive.com/supporters.

Also, check out my newsletter because in this month’s newsletter, I cover a related question: When should you start a learning project and when should you stop one? These are tricky questions that I often get stuck on.

And if you want to form a learning group, the supporters group might have people like you who might be able to pair up and work on something or just join the podcast Slack channel. I know there’s at least one learning group that’s happening there. Maybe others will form.

And until next time, thank you so much for listening.

CORECURSIVE #091
