CORECURSIVE #094

Platform Takes The Pain

The Inside Story of Spotify's Engineering Growth

How did Spotify scale from 10 engineers to 100s to 1000s… without slowing down? Without becoming corporate?

Facing an IPO deadline, Pia Nilsson worked with 300 teams to transform how Spotify built software. She spearheaded a movement that led them from working in silos to a unified developer platform.

Hear the inside story of how Spotify’s Platform teams embraced transparency and customer focus to create Backstage — now used by companies worldwide.

It’s an unbelievable tale of ingenuity and perseverance. Hear Spotify’s secret to scaling engineering without losing speed and independence. Don’t miss it!

Transcript

Note: This podcast is designed to be heard. If you are able, we strongly encourage you to listen to the audio, which includes emphasis that’s not on the page.

Adam: Hi, this is CoRecursive, and I’m Adam Gordon Bell. Each episode is the story of a piece of software being built.

Today, we’re peeling back the layers on Spotify and a monumental challenge they faced several years ago.

6 Months To IPO

Here’s the problem. It was 2016 and Spotify needed to IPO, but IPOs, they take years to plan. You have to hit the market timing, right? You’ve got to have everything in order internally. And you’ve got to look exciting to potential investors.

To get ready, you have to pay investment bankers. You have to pay auditors to get feedback on the gaps in your business. And then they found a problem:

Pia: They had learned that from the external auditors, that we had too many vulnerabilities in our CI infrastructure. I.e., we had over 200, um, bespoke Jenkinses running at various stages of cleanliness, sort of.

Because, of course, that was not only 200 teams maintaining each of these, but actually it was 300 teams on these 200 plus Jenkinses. So these Jenkinses were maintained at best to some kind of degree.

Adam: So yeah, each team was responsible for getting its own code built and shipped to prod. Each one owned their own CI instance.

You could use the tool that best suited how your team worked and you would set it up and you would manage it yourself.

Pia: Brilliant, right? And very empowering and motivating for teams. The challenge, of course, with this is this led to a lot of complexity and duplication.

Adam: That’s Pia Nilsson. She’s fresh from a smaller company, and she’s a force of nature. She’s an engineer turned passionate manager, and she thrives on technical challenges.

Spotify’s situation, though, it wasn’t just a technical challenge. It was about culture, and it was about business stakes. Imagine Ernst & Young, the IPO auditor, sifting through Spotify’s operations, you know, looking for secure practices.

They find this maze. 300 individual paths to production, all in varying states. Alarm bells went off, right? How can you guarantee security with 300 distinct paths? The risk was evident. So Spotify needed to address this. If not, it would have to show up in the audit, which is an official IPO SEC document.

And it would hurt the stock price, potentially. It might hurt the IPO. That’s a big problem.

Pia: So when I joined, the CI project was to sort of consolidate all of this. Build one CI solution for everyone and do that rather quickly. We had a timeline of six months.

Adam: That seems fast, right? Join an organization and lead an organization-wide migration at a company ten times the size of the one you just came from. Migrate across 300 teams, across 2000 engineers. And do it in six months with the IPO on the line.

But Pia signed up and really when it was all over she and her teams at Spotify didn’t just consolidate CIs, they led a transformation in the way Spotify did things, a change that is now spreading to other large engineering orgs. That’s today’s story and it all starts on day one when Pia joins Spotify.

Because on her first day, she stepped into a world unlike any she had ever seen.

First Day at Spotify

Pia: It was super cool, like, you entered into this building, and for each floor, you walked from one room to the next, and they were so unlike each other. Like, it was like a different universe, every room. It was real, the identity of the team was on the walls, basically.

Adam: Pia’s desk was in the squad room of a team that was called Pipe Dream.

Pia: It still is, actually. Yeah, so, and they were very happy with that name, as they are the CI team, so Pipe Dream sounds fantastic. Good name. It was just a mess in their squad room, and a lovely mess, sort of, because over the years, there were posters upon posters on, like, very music oriented and sort of, yeah, it was very alive.

Adam: What music posters?

Pia: Oh my god, everything from Bob Marley to Daft Punk to anything else, sort of. It was, it was sort of the love for music and expression. It wasn’t a certain like, oh, we are into rock or we are into R&B. Not at all. It’s like love of music, I think, was the overall theme. And very many musical instruments.

You could run into guitars just everywhere and lots of music enthusiasts.

Adam: If each squad room was unique to the team, each desk was distinct to the individual.

Pia: You could never sit at the wrong desk, basically, because that was like living in someone else’s house, almost. It was so, yeah, a huge identity, physically. That was very, very clear. And, of course, super exciting.

People were very, sort of, proud of their team and their team names and belonged to the team.

Adam: Pia took it all in on that first day. People are tied to their squads. And the squads and the people have strong identities. It’s great, it’s cool, it’s amazing.

Each team is unique. Pia’s there for a reason, right? She’s there to consolidate things, which is sort of removing uniqueness. At least when it comes to CI, so this might be a struggle.

And also part of this culture just didn’t fit with who Pia was:

Pia’s Desk

Pia: I never loved to sort of try to look cool, because these rooms looked really cool.

And that has never sort of appealed to me, for good or worse. So I’ve never been one of the cool kids. And I guess it’s, it’s something that does not attract me to try to look cool, sort of, I feel I’m hiding. And I feel like insincere.

Adam: What do you mean you’re hiding?

Pia: I’m not the customizer. No, I have never sort of, I’ve never been a homey person like that.

I kind of feel stuck almost, actually, when I sort of… try to identify with some physical thing. It makes me feel stuck rather than expressing myself.

Adam: So Pia’s desk stayed bare in this personalized squad room with everybody’s identity on display with trinkets and musical instruments. She just had an empty desk.

And the others customized things for good reason, because the most hardcore Spotify engineers, they were at their desks for 12 hours a day. It practically was their home. But Pia, you’d never find her at her desk, because she was a chapter lead, Spotify’s version of an engineering manager.

The Chaos of Hyper Growth

Pia: Which meant that I didn’t have one team, I had five, but I didn’t have five teams either, I actually had a few people here and there, in five teams, and that seems like a very strange setup.

It was. However, there was a very practical reason for it. During the hypergrowth, which I sort of entered into after the fact, hypergrowth as in quadrupling their size every year, then having people, leaders like that, focusing on people growth and, and people health while new teams were constantly forming, it made more sense to have chapter leads because the people could keep their manager, which meant they would be more likely to be retained while moving to the next team, to the next team, to the next team.

So that was sort of the real reason why we had chapter leads and that had been working out really well.

Adam: It’s a smart idea in a way. If a company is growing so fast, the new squads are sprouting constantly. Maybe you can minimize the chaos a bit by having your engineering managers stay constant. That gives the individual contributors some stability.

But imagine being a new engineering manager, a new chapter lead, the company’s growing fast and you’re trying to connect with people, but they’re in five different rooms. It’s chaos. Pia was confused and so were many of her reports. Imagine working at a place that quadruples in size every year. A team of 4 becomes 16 and then 64.

Those original 4 are now spread across 8 teams, where no one but those original 4 had more than 2 years with the codebase. And meanwhile the codebase itself is changing at 16 times the rate it was 2 years earlier. How do you handle that? How do you keep up? You could take those original 4, you could have them get the other 60 up to speed.

Right? Full time, just work on that. But then short term velocity would go way down and those original four wouldn’t be doing the work that they loved anymore. They might even leave, and plus you’re getting slower, and Spotify couldn’t get slower. This is when they needed to be moving faster. So Spotify had its own way. It was all about independence and speed. The newbies will catch up.

Pia: It was a growth ache of a teenager, you know, the company had been so small and we had had these wonderful, brilliant engineers that had coded all day, all night, basically, which had been incredibly important for the company and successful.

And now we were so big, several people who were like this employee number 10, coding all day long, all night long, obviously knew everything there was ever to know about the infrastructure at Spotify, uh, brilliant thinkers, writing code faster than I ever saw before.

It was challenging to just keep up with the code that had been written when I had been home eating dinner. And there were like hundreds of lines of code that I needed to understand in order to even understand what, what is going on here. 60 percent of the team were not up to speed on the last pull requests that these two people maybe had created during a 14 hours workday.

And that just kept going. You can imagine, like, it’s just challenging for people who work normal hours to ever catch up.

Adam: This is another problem. Pia has her six-month CI project, but now she also has to unite these groups. She’s the chapter lead, both for employee 10 and employee 1100. She needs to find a way to cross this divide.

Pia: That was something I was struggling with because it was on my responsibility to have team health, to have pull request reviews that actually were meaningful. But then to discuss tech debt, one has to understand, well, what does this system actually look like? And also, love of speed. I mean, who doesn’t love the speed of iteration in development?

Do you slow down in order to get everyone on board, to have a conversation? And then, you know, some arrogance might come in for people who feel like they are way ahead of everyone else. Are we going to spend their hours discussing with the team and slowing down and pair programming? So I’m leaning always towards, like, we need to get everyone on the same page.

And we think better as a team than as just one person here, one person there. Because then we can’t build on each other’s ideas. And the team culture also is way nicer. It’s also way easier to onboard. I am a strong believer in like this collaborative, let’s work together approach, and I sort of was challenged myself, like, how do I actually make that happen?

Because these other folks are brilliant, they know everything by heart. If I ask that engineer to do it, it’s going to take them 15 minutes. If I ask the team to work on it, it’s going to take two hours. But then of course, five people will know the solution instead of just one.

Chapter Lead Group

Adam: This was a problem Pia could see as an outsider. The things that have made Spotify nimble and move fast when they were small were now working against them as they grew.

Pia: We had a fantastic community within our chapter lead group, and I think that’s the reason I, I kept fighting with myself on this problem sort of, because they were seeing it and I had sort of this feeling like these are among the best engineers I’ve ever worked with, but we aren’t as impactful and as effective as we could be.

And we’re a bit fragile too, because if someone here goes on vacation, um, who knows those five systems, but they were coming out of a culture where it was always possible to just reach out to that someone they would always pick up the phone and, and walk to the office basically and fix it. And, and I don’t think that is a sustainable way of living and working, and it certainly wasn’t a sustainable way of growing the company.

Adam: So is that literally true? Like everybody lived in the area and you could just walk into the office if something went wrong with that system you knew?

Pia: Absolutely.

Adam: But there is an advantage to that, I guess, in the early days, you know, everybody’s in Stockholm and you could just walk in if your service falls over and kick it, or…

Pia: Yes, absolutely. And I think it’s a great way to start. As is working on a monolith in the beginning, usually. Like, you need to start somewhere and have people actually understanding the full picture. But if you’re successful, you will run into a place where that isn’t gonna be effective any longer.

Because many managers were seeing these challenges. They were just like me, running in between these rooms and seeing, well, this team asked this question, and that team just had that question two weeks ago. It’s about challenging the status quo a bit. And I think I, I did that a little bit more than maybe others because I have this strong belief that we can work better together.

Yeah, I, I have a hard time handling when, when I see ineffectiveness.

Adam: This is why I wanted to do the episode. This is Pia’s mission, right? And it’s, it’s even bigger than the CI thing. I mean, she absolutely needs to hit the CI goal, but there’s a bigger challenge, scaling an engineering org, keeping execution speed up and both chaos and bureaucracy at bay.

As you go from 10 engineers to a hundred engineers to thousands of engineers, how do you do that? It’s a big challenge to tackle, but luckily Pia does have some leverage because she has the CI project and it absolutely must get done.

Pia: And leaning on that, since that was in a way aligning with my values, I could use that as an example for the other teams on, well, this is actually possible.

Autonomy and Adoption

Adam: So she’s got a countdown, six months to get everyone on a secure, standardized build pipeline. But there was another problem.

Pia: The platform team, because of the autonomy, the platform teams did not own the problem of adoption of their own products. The team I led, we did not have the mandate to tell anyone because we had this autonomous culture.

Adam: This seems like a problem, but I think this is actually pretty cool.

If a platform team can just make everybody use their stuff, they might force teams to use tools that don’t fit. I’ve seen this before.

Spotify avoided this trap. They said, we’ll build the tools, but we won’t force it. If there’s a better way to solve the problem, go, go use that. But that meant that the platform team didn’t worry about adoption.

Pia: Yeah. It wasn’t about being lazy. It was about being respectful of the autonomous culture. As in, we should build something that is so good that they just want to adopt it and they will plan for that adoption themselves. But as you can imagine, very, very busy feature teams would not maybe have the chance to even know about some of these tools sometimes. Nor plan migration.

Platform Takes the Pain

Adam: So this old way of not pushing your tool out of respect for the other teams, it just wasn’t going to cut it, not with the six-month deadline, not with an IPO. The platform teams needed to switch gears. They had to drive adoption, not just hope that it would happen. But the platform teams resisted this.

Pia: In the very beginning with the CI team, when we were like, okay, we’re going to actually go out there to squads and help them migrate, which was entirely new. And the sentiment in, in the CI team was: but they are not, they’re not going to want this. They, they will basically not let us into the squad room.

And we’re going to be looked upon as this sort of corporate folks that want to centralize. And that’s like saying that like the devil being the devil almost like they’re going to hate us basically.

Adam: This is a problem, right? Who would want to go team by feature team, one by one, walk into their custom squad rooms and say, we’re going to take over your CI. It’s gonna go our way from now on.

But Pia had a different idea. She told the Pipe Dream team, “Maybe you’ve got it wrong. We’re not here to control the other teams. We’re here to help them, help them get ready for the IPO. And to do that, we need to change who we are as a team. We need to worry about adoption.”

Pia: The platform teams did not think they were accountable for the adoption of their products. So it was like both starting to take accountability for adoption, and that would lead to going out there to the customers, actually sitting there, onboarding them, migrating them. And we had this mantra that we still have.

And it’s still a part of our, our main engineering practices, uh, for the platform mission, which we called the platform takes the pain. It really helped us actually, because it’s short and snappy and everyone knew what that really means, because it’s tedious to migrate someone off a Jenkins to another CI engine.

This is not sort of the lovely ivory tower work that maybe some platform teams had loved doing. Deep, deeply thinking about sort of orchestration and creating some fantastic product. But this is actually sort of going out there to some feature teams. 34, 50 feature teams. And sitting with them, and helping them, because they have different challenges, all of them.

Adam: No two teams were alike. These weren’t cookie cutter migrations. This was custom, hands on build work. But the Pipe Dream team leaned into the grind. They took pride in taking on the pain of each migration.

Pia: I remember this one team we went to, and we actually sat there with these engineers. It was only two from this other squad.

They were like… you start asking some really great questions, like, okay, how would that work, and how are we gonna do this, and okay, so if you do that, then I can do here, and like, okay, just get it done. Let’s get it done. Sounds good. Okay, good. Let’s go.

Migrating CI

Adam: This was the total opposite of the pushback they’d expected, and it kept happening as they met with more and more teams. It turned out the big resistance wasn’t coming from the feature teams.

Those teams were open to change as long as it helped them ship features faster.

Pia: Because the platform teams are the teams that really care about the infra. So they wouldn’t have wanted to have an infrastructure solution, uh, being sort of asked of them to, to comply with, because they really care about these things, but a team in e-commerce platform, in user platform, in the playlist platform, they care about other things way more than what CI infrastructure they are running on.

So it was like a lack of customer awareness or understanding. Had we known this earlier, I think this move would have been enabled way faster and way earlier, much longer before I joined.

Adam: This was a big realization. The autonomy that teams valued, it wasn’t about specific technologies or workflows. It was about owning their work, having the freedom to build the stuff that mattered.

Not owning CI was fine, as long as it didn’t mess with their ability to ship things. Once Pipe Dream got this, it was a lightbulb moment, and they realized that most engineering teams would embrace any changes that remove friction from their lives. This gave Pia hope that with the right approach, she could rally the teams around a shared mission to work better together.

But first, they had the six-month deadline. Lots of Jenkins files to write. And at Spotify, you don’t want to be a bottleneck. You can’t slow people down. So they had to make sure that all their changes rolled out smoothly.

Pia: So the toughest thing with this migration was that there were so many Jenkins instances.

And you can imagine around 300 teams that were impacted. So the scale of reaching all of these critical pipelines, actually migrating them. That was one of the biggest challenges. The second one was the build templates differed so much. There was no standardization on the build templates, which makes it very hard to actually build them.

However, we, we figured out we’re going to just containerize the builds. So that was the solution for building very custom build templates.

The Transformation of Pia’s Team

Adam: Through all this work, Pipe Dream transformed their thinking. Platform takes the pain? Yes. But Platform also cares about their customers. Platform understands their customers.

Platform cares about impact. These ideas started to spread.

Pia: There was no moment, I can remember at least, where someone said, like, right now we’re gonna move into, uh, team ownership of, uh, adoption for all your products, Platform folks. It just happened gradually, as I think people were… seeing more and more that it actually works to own adoption.

It’s also a matter of, like, how do you define success in your organization, right? We started celebrating when, when teams reached high adoption of their products, and we also tried to move the flywheel by celebrating successes.

Adam: This let Platform rethink some things. Yes, they didn’t want to be top down, they didn’t want to be corporate, but maybe they were wrong about autonomy.

Pia: Where is the autonomy really? We started to learn how to nuance that understanding because autonomy has to do with impact. We’re not interested in being all alone in a room being autonomous. That’s useless. When we say autonomy in the engineering industry, right, we mean actually impact.

So I think for the platform teams, this autonomy word started to be equal to, okay, if I actually own the adoption of my tool, I have more impact. And if I don’t own adoption of my tool, I don’t actually have much impact. And that doesn’t, it doesn’t really matter if I am autonomous.

Adam: This change shifted the team. They went from ivory tower architects, dreaming up theoretical solutions, to customer-focused engineers.

Pia: Because we really weren’t customer oriented at all when I joined. Nobody had asked the engineers to be customer oriented. It wasn’t that they didn’t want to, they, nobody had thought about it basically. But this owning your own adoption, they have to become and they will become customer oriented because they’re gonna go out there and fix that adoption.

The Deadline and The Small Popes

Adam: So did you hit your six-month deadline to get everybody?

Pia: We did.

Adam: You did?

Pia: Look. Yes. It was a success story. And the IPO happened in 2018, I think, April or so. It was a, it was a massive success. And the CI challenge was solved through this containerization of the build templates, which really helped us control all CI where we had to comply.

Adam: But Pia still saw problems to solve. She had a success. A big success. Yeah. But Spotify still had silos, still had tribal knowledge. She’d seen firsthand how not knowing where things lived led to people reinventing the wheel, led to productivity bottlenecks. And this was the hardest thing for her to handle working at Spotify.

Pia: Because there were so many folks that were not open to this change. They called it like, oh, we’re going corporate. I mean, they were fantastic engineers, many of them. And they had built so many useful things for Spotify. But not being able to reach them was, you know, taxing and just, it was sad. It was sad.

And one gets a little angry as well, like upset. How can you not care about the bigger picture? How can you, who are so incredibly smart, not see that this is wasteful? Why is your pet project more important? Because there were, there were so many pet projects. There were so many, in Swedish we call them small popes, running around like, and these were great, highly skilled people that had built really valuable stuff. Uh, so not being able to reach them, I think was my hardest learning in the beginning.

Rumor Driven Development

Adam: Here’s the thing about autonomy. It can sometimes work against transparency. Even in a place full of talent, communication gaps are gonna happen.

But reaching everyone and breaking down silos is a big problem. So Pia decided to focus on something smaller. What frictions are getting in the way of work at Spotify?

Pia: We were doing this service where we basically sent out an email people could reply to on, like, what doesn’t work with us, basically, what doesn’t work with back end right now, and people were just filling out a form, and it was also a little bunch of internal jokes, so we had a very good response rate on those.

And we were seeing a lot of times that people were struggling with being interrupted all the time. And that people couldn’t find things. That seemed to be the top problems over, overall, quarter after quarter. And this speaks to the challenge of silos, right? When, when one is in an organization like that where we have a lot of siloed orgs and teams.

I wouldn’t know really exactly, uh, how to integrate with another system. There will be several APIs that I may find myself through digging around in infra. But I wouldn’t know exactly maybe how to integrate and which documentation that is up to date, etc. So I would be tapping someone on the shoulder that I know, Oh, this person probably knows something about this system.

So we had a name for it even, and it was called rumor driven development. What we meant was you had to know who actually had once upon a time written something in a system. And then you tap them on the shoulder and ask, are you the right person to ask about this API? And then they would forward you to the next one, to the next one, to the next one, and finally you would find your team.

And of course sometimes people don’t have time to go through that rumor chain, but instead they had to build it themselves, and there we end up with fragmentation. So we saw this need of, like, oh, people just want some place where they can find everything. And since we don’t have that place, they are tapping each other on the shoulder all the time, which leads to the second problem of being interrupted.

Measuring Onboarding

Adam: They decided they were going to fix this. Just like they were fixing CI, they would roll out a centralized developer portal. With the data that everyone needed. A place for all the services, who owns them, their APIs, their documentation. Maybe if they had all that, this would fix rumor driven development.

It would fix transparency. They needed a way to test this, though. They needed a metric to aim for. A way to measure success.

Pia: The metric we decided to track was the onboarding metric. We borrowed that from Meta. We used the number of days it takes for, to make the 10th pull request, which is a very crude metric, and of course one has to follow up with a bunch of other things, because developers do not only code.

That’s not the only way to get onboarded. But at the time, that was the metric we were using, and it was spread all across the organization, so everyone understood it. It’s a simple metric. And we had over 60 days when we started, and this was sort of the rallying cry for, well, why do we actually need this?

Well, we pointed to this one metric where everyone could look at it. Over 60 days to actually do their tenth pull request for all these newcomers that were joining every single week.
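Editor's sketch: the metric Pia describes, days until a newcomer's tenth pull request, can be illustrated in a few lines of Python. The function names, the record layout, and the choice of the median as the summary statistic are all assumptions made for illustration; the transcript does not say how Spotify stored or aggregated this data.

```python
from datetime import date, timedelta
from statistics import median


def days_to_nth_pr(start, pr_dates, n=10):
    """Return days from an engineer's start date to their n-th pull request.

    Returns None if the engineer has made fewer than n pull requests.
    """
    if len(pr_dates) < n:
        return None
    nth_pr = sorted(pr_dates)[n - 1]
    return (nth_pr - start).days


def onboarding_metric(engineers, n=10):
    """Median days-to-n-th-PR over engineers who have reached n pull requests.

    Using the median is an assumption; a mean or percentile would also work.
    """
    reached = [
        d
        for d in (days_to_nth_pr(e["start"], e["prs"], n) for e in engineers)
        if d is not None
    ]
    return median(reached) if reached else None


def sample_engineer(start, gap, count):
    """Hypothetical record: one pull request every `gap` days after joining."""
    prs = [start + timedelta(days=gap * (i + 1)) for i in range(count)]
    return {"start": start, "prs": prs}


# Made-up data, not Spotify's.
engineers = [
    sample_engineer(date(2017, 1, 2), gap=7, count=12),  # 10th PR on day 70
    sample_engineer(date(2017, 2, 6), gap=5, count=10),  # 10th PR on day 50
    sample_engineer(date(2017, 3, 6), gap=9, count=4),   # not yet at 10 PRs
]
print(onboarding_metric(engineers))
```

Newcomers who have not yet reached ten pull requests are left out of the summary here, which makes the number look better than reality while a cohort is still ramping up. That is one reason the metric is, as Pia says, crude.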

Adam: It’s a tough metric, right? It’s not just measuring whether existing people will adopt the developer portal, but whether new people can figure out how to contribute to the codebase faster.

They could end rumor driven development, but that might not be enough. They needed to add enough transparency that it would speed up onboarding. And besides that, even getting feature engineers to use the service, if they built it, it wasn’t a given.

Building Backstage

Pia: We were still a very autonomous engineering culture and we’re, we still are.

So we needed to build buy-in for this idea that there would be one developer portal, one catalog holding everything.

Adam: In fact, a solution like this already existed. And one of Pia’s teams owned it.

Pia: It had never been sort of seen as like a core to our infrastructure at all. It was like just a catalog of the back end services basically.

Adam: In the past, because they had never worried about adoption, the service directory never took off. But why didn’t it take off? Well, it lacked features. It didn’t cover enough things. It didn’t cover front end things. It didn’t cover data things or infrastructure. So they could fix that. They would call this new catalog, this project, Backstage.

And they would use it to try to end rumor driven development.

Pia: Exactly. The core idea of Backstage was to visualize the connection between all components and all owners. So that you would be able to solve, for example, an incident way more autonomously than you had before. You, you did, you should not have to know exactly anyone, actually, in order to figure out, like, who owns this data pipeline, and when was it last run, and who’s on call right now, and is this incident actually already logged there, and who’s working on that, blah, blah, blah.

You could get that through a few clicks.
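Editor's sketch: the core idea Pia describes, a single place that connects every component to an owning team, can be pictured as a small lookup structure. This is not Backstage's actual data model or API. The `Component` and `Catalog` classes, the field names, and the team and component names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    kind: str    # e.g. "service", "data-pipeline", "website"
    owner: str   # an owning team, never an individual
    depends_on: list = field(default_factory=list)


class Catalog:
    """Toy catalog mapping components to owning teams (illustrative only)."""

    def __init__(self):
        self._components = {}

    def register(self, component):
        self._components[component.name] = component

    def owner_of(self, name):
        return self._components[name].owner

    def teams_to_contact(self, name):
        """Owners of a component and of everything it depends on, transitively."""
        seen, stack, teams = set(), [name], []
        while stack:
            current = stack.pop()
            if current in seen or current not in self._components:
                continue
            seen.add(current)
            component = self._components[current]
            if component.owner not in teams:
                teams.append(component.owner)
            stack.extend(component.depends_on)
        return teams


# Hypothetical entries, not real Spotify systems.
catalog = Catalog()
catalog.register(
    Component("playlist-api", "service", "team-playlists", ["profile-sync"])
)
catalog.register(Component("profile-sync", "data-pipeline", "team-data-infra"))

print(catalog.owner_of("profile-sync"))
print(catalog.teams_to_contact("playlist-api"))
```

During an incident on the hypothetical `playlist-api`, the lookup replaces the rumor chain: nobody has to remember who once wrote the pipeline it depends on.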

Plugins and Decentralization

Adam: But there’s another issue, right? It’s a centralized service trying to solve a decentralized problem. One team trying to reflect the autonomous, decentralized nature of all the teams at Spotify.

Pia: We were trying to sort of work out the architecture of Backstage so that it fit our decentralized engineering org.

And also engineering culture sentiment, as in autonomous ownership. We also saw it as sort of, we needed to decentralize ownership of the plugins in Backstage to increase speed and never become a bottleneck. Because the main focus for Spotify all the time and still is speed of development. So whatever the platform folks like myself come up with can never impede speed.

So that was one of the core ideas as well. Like whatever we do here, if we are going to centralize the developer portal, it may not slow down the feature teams. And the way to slow down is to create the bottleneck, right? Of a central team owning everything.

Adam: So they built Backstage around a plugin model that allowed for expansion.

The idea was that if Backstage was adopted but it had a gap, then instead of the rumors, any of the thousands of devs on various feature teams could jump in. They could add a plugin, they could push things forward. There would be no bottleneck at the platform team. If all the backend services were in Backstage but the web components or the software libraries or whatever weren’t there, then the plugins were a way around that bottleneck.

And the plugin that really showed the way that this could work was the data plugin. You see, the data teams needed different types of data. They needed to know things like retention policy.

Pia: We have some datasets that are, uh, the root datasets, and then there are, uh, a myriad of datasets that build off of these, and sometimes they are, uh, necessary to be maintained, obviously, super critical going forward as well.

And sometimes that was sort of for this one campaign, for this one initiative, and then that data set isn’t as important any longer. The, the data folks were just able to create this beautiful plugin for, for all the data engineers at Spotify, several thousands of them, and so that they can sort of interact with their data sets through Backstage instead of other, uh, portals that were already existing.

Adam: Of course, this decentralized approach kept Backstage flexible and fast moving, and adoption spread internally. Backstage became an important part of how engineering was done. More plugins were developed, more information appeared, but Pia knew that a portal alone couldn’t transform the culture.

Autonomy’s Refresh

Pia: It’s not possible that code shifts culture. I think it has to sort of come from a need that was already identified across the organization. And then this technology speaks to that need. And I think we were at this breaking point when I joined 2016, and 2017 even more, where the scale of Spotify had become what was slowing us down, and our former ways of working did not scale.

Adam: Everybody wanted the change anyways, they just didn’t know how to accomplish it. Does that sound true?

Pia: It sounds very true. And also I think that the autonomy needed a refresh, this idea of… of complete autonomy, team autonomy, needed a refresh on, like, well, what about the impact? It used to be the case when we were a small company that complete team autonomy had the right impact because the company wasn’t as big.

But in a large company, you won’t have the impact that you’re looking for. And then autonomy becomes almost meaningless. That’s sort of the cultural change that we went through, at least as a company. I think we didn’t have this kind of transparency because we had this employee number 10 kind of, uh, archetype.

Who were incredibly fast in, in producing valuable code and systems. And then the rest of the team kind of struggling to keep up or maintaining the stuff that was produced. And with Backstage, the sentiment behind Backstage is team ownership. There is no individual ownership. It’s the team owning everything.

I think the shift is very gradual towards team ownership, but Backstage sort of softly moved the organization towards it.

Adam: How does it softly move you towards team ownership?

Pia: Because the rumor driven development fades, you don’t need it as much. Because people are, like myself, I could just look up the team to connect to in order to understand some certain API.

I wouldn’t know that it was, uh, this person called Karin that actually had built it all from the beginning, knew everything there was to know about this thing. I wouldn’t have any idea. I would just go to the person that was called a goalie in the Slack channel, in the team support channel and speak to them.

And I would have no idea that there was this person who actually built the whole thing from scratch five years ago. I think it’s sort of put an interface in between these people.

Adam: That’s simple, but it makes sense, right? In rumor driven development, the individual who built this is the most important thing.

But now, with service goalies, and teams acting as the unit, the individual fades. The team, the squad, is the main thing. Team identity isn’t harmed by transparency. If anything, it’s increased.

Knowing Where Things Are

Pia: Absolutely. It’s also a gradual change. But looking back at before Backstage, and looking now at a mature Backstage adopter, I think the big challenge, uh, that at the time seemed so big, that we just don’t know where things are and what’s going on, hardly exists, because you can see everything, everything is available. There’s like no need to actually know anyone, for me, in the commerce platform.

Recently, I tried to look up a few APIs, understanding them and how they’re integrating, to answer some questions from some other teams, and I don’t work in commerce platform. But I could just go and find them and trust that the APIs that they were surfacing on the Backstage page were the right ones, obviously, because they were there.

And that’s the place to go. I could get very far on my own, basically. And then I had some really relevant questions to ask. And I knew exactly who to ask, because it was like obvious which team was owning this. Slack channels were just a click away. I think it’s this empowered feeling. It’s something I lacked in the beginning. I felt like a junior when I joined, which was very odd to me after having spent like one and a half decades in engineering. I felt I hardly knew anything. I had to ask everyone for everything all the time, and everything was different. Also, people were giving me five different answers, which were all, to some extent, true.

Adam: You guys centralized information, but you left the autonomy, you left the, the team structure still, like, is it still silos? I guess you guys can, it’s easy to find information, but are the people still in their pods, or?

Pia: But we never wanted to move away from strong team identity and belonging. It’s been the core to our culture and still is.

So we wanted to find this balance where standards set you free. That was another one of these mantras that we invented to help people understand: no, standards and centralization in some places actually empower me. So I think today teams are quite on board with, with that. And there’s, they’re seeing that this did not lead to us becoming very corporate or a top down kind of engineering culture.

Not at all. It just removed a bunch of toil.

Adam: The whole process took time: building out Backstage, adding plugins for data, plugins for documentation, so that people could learn more about your service. Plugins for seeing builds, plugins for seeing Kubernetes stuff, and then templates for building new services, and so on and so forth.

Along with this, the culture shifted. Not a complete 180, but towards a place where autonomy and team identity also meant embracing standards, embracing transparency. Spotify embraced Backstage as the central portal. And two years in, they hit their goal. It was now taking 20 days for 10 PRs for new developers.

They had cut their onboarding time in half, and this is even though they were much larger than when Pia had started.

Open Sourcing

Pia: We, we celebrated a lot, and one of the ways we celebrated was to open source it in 2020. Backstage had become this super critical part of our infrastructure, and we were super proud of it.

And when something becomes that critical, you got to protect it as well. And one of the fears that was starting to arise was: hold on, this is now super important to us. What if someone builds an internal developer portal and it becomes the industry standard? At the time, there were no developer portals, basically, in the industry.

But we were sort of thinking we would have to migrate off of Backstage because we are not going to sit there with a bespoke system. And that’s going to be so painful for us because now we have every single team on Backstage and so many plugins, over 200 plugins actually, internally built. So this was a real threat to sort of our speed again.

And so we, we made the decision, like, we’re going to actually donate this. We’re going to give it away and make sure to keep investing in open source Backstage because we need it. We need it to be the industry leader for ourselves, for our speed. So that was why we open sourced back in 2020. That we’re really celebrating.

The Future of Engineering Orgs

Adam: As companies grow, they can do more, but each person can do less. Ten people can move fast. A thousand can’t. But a thousand people can accomplish a lot if they can work together. The problem is coordination, right? Some big companies, they’re all administration. They’re all top down control and nothing happens fast.

Others, they just can’t pull off big projects. There’s too much internal politics. It’s like Game of Thrones in there. But it seems like Spotify maybe found another way with their focus on speed of development, speed of iteration. They’ve kept that small company feel, they’ve kept the autonomy, while still being able to work together.

Pia: I really want to believe that. I really hope so.

I think if one really wants to prioritize speed, then one has to figure out the solution because the top down company isn’t fast. And usually from what I have seen, at least it’s difficult to make the right decisions. Because you lack so much information, uh, at the top. And engineering industry is moving incredibly fast.

So, actually, the devil is in the detail. One has to sort of understand a problem space to the lowest level, almost, in order to have this sort of creative, brilliant idea that afterwards sounds like, oh, obviously we should have done that. I also think the real speed happens when one deals with ambiguity really great.

Because honestly, I think many companies, we don’t know exactly what will be most beneficial to build because the space is so ambiguous. The customers usually aren’t exactly sure exactly what they need. So one has to be fast at failing. And in order to have a fast failure culture, one has to be sort of empowered to make quick decisions, try something out, it failed, great, then we learn.

Next, next experiment, next experiment, next experiment. That’s how you actually win. You out-experiment your competition. So if one is serious about speed and success, I think one has to figure out, like, how do we enable and empower teams, because they are going to be the ones out-experimenting.

Adam: So here’s the thing about how companies run.

Even as companies get larger as they scale, it seems like it’s possible to maintain some independence and some speed. How do you make this happen? If you’re in a platform role, I think we know some of the answer, right? Platform takes the pain. Take ownership of adoption. Have a customer centric mindset.

Understand your users’ real needs and what’s slowing them down. And just be willing to take on the tedious work involved in aligning teams. But there’s a broader lesson here, one that applies to all of us. Don’t underestimate your power. Even the smallest change can make a difference. Be the connective tissue that a growing organization needs. Be Pia.

Listening to People

Pia: I’ve always had this huge interest in understanding how people collaborate and thrive together. I always ended up being the one responsible to sort of arrange the parties and arrange the… the get-togethers, and that was kind of my informal role of trying to make people come together in a sense.

Adam: So you’re an organizer of people?

Pia: I hope I could say that. I’m, maybe I’m even more so a listener. My go to is to hear people out by, by being that friendly ear. So that has always been something very dear to my heart. I, I feel like that’s been a red thread, actually, throughout my life.

Outro

Adam: A big thanks to Pia. What an amazing person, right? She… and the people she worked with at Spotify. I’m sure she’d be the first to say that this wasn’t all her. They created so much value. And they, they did it while keeping the things that Spotify valued.

It’s amazing. And it’s funny how hearing some of the internal struggles at Spotify, it just, it makes me interested in working there.

I’m not saying that because Spotify is paying me. Uh, trust me, they’re not. I just love hearing stories like this. I love hearing about these changes big and small inside these big tech organizations. So if you’ve got one, let me know because I, I find this stuff fascinating.

Also, if you’re new here, sign up for my newsletter and you’ll get some behind the scenes details about the episode. And for a truly behind the scenes experience, join as a podcast supporter.

The, the next two episodes that are coming out, that are, that are amazing, I have to say. Um, they were very much shaped by some of the discussions I had with the supporters, both on the Patreon channel and on Slack. Uh, so yeah, thank you so much to the people who support this effort, the people who helped create this podcast with their financial contributions.

And until next time, thank you so much for listening.

Support CoRecursive

Hello,
I make CoRecursive because I love it when someone shares the details behind some project, some bug, or some incident with me.

No other podcast was telling stories quite like I wanted to hear.

Right now this is all done by just me and I love doing it, but it's also exhausting.

Recommending the show to others and contributing to this Patreon are the biggest things you can do to help out.

Whatever you can do to help, I truly appreciate it!

Thanks! Adam Gordon Bell
