CoRecursive #119

From Hacker News to TikTok

How Algorithms Learned to Hook Us


Corey told me about his AI cat reel problem. He found these AI-generated cat videos hilarious. Who makes these? He kept sending them to his wife. Then he tried to stop watching and he couldn’t.

So I went down the rabbit hole of how social media algorithms actually work. It starts simple. Upvote, downvote, sort by time. But by 2017 Facebook has a metric that quietly reshapes what two billion people see. Then a leaked playbook lands, and a CEO takes the stand in Los Angeles.

Today is an investigation into what happens when the algorithm knows you better than you know yourself.

Transcript

Note: This podcast is designed to be heard. If you are able, we strongly encourage you to listen to the audio, which includes emphasis that's not on the page.

A Developer Who Can’t Outsmart His Own Feed

Adam: Hi, this is CoRecursive. I’m Adam Gordon Bell. And today I have, uh, Liam with me here. Hello, Liam?

Liam: Hello. Good to be here.

Adam: So Liam, I got this message from Corey, a developer I used to work with, and he ran into this weird problem on Instagram and he wanted me to look into it.

Corey: We're cat people. So finding, you know, funny cats on Instagram Reels feels like a supernatural thing. But once, uh, was it Sora? When that showed up, there were so many AI-generated cat videos. Cat videos were already cracked on, you know, these social media algorithms. And my feed just turned into nothing but AI slop cats.

Adam: So the thing that happened, Liam, is he thought they were hilarious. He started sending them to his wife and then eventually his wife told him to stop. But he couldn't stop, 'cause now the algorithm had learned he liked AI cats.

Liam: Yeah, it knows. Now he can't get away from them. That's the problem. Can't he just not watch them, or stop going on that social media platform? What's he trying to solve here?

Adam: Yeah. I asked him that myself.

Corey: It does affect me sometimes in daily life where, it’s time to put the kids to bed and you can’t do much when you’re putting your kids to bed, right? You gotta keep ’em on track, but you can’t also like do it for them. So you’re just kind of standing there in this awkward, you know, traffic control situation.

And sometimes that goes for like an hour and you’ve been scrolling the whole time and you’ve spent an hour just on, you know, five second videos watching. Who knows how many of ’em? And it, you come out of that and you don’t, you don’t feel great.

Liam: So he is a developer and he can’t beat the algorithm, and if he can’t beat the algorithm, then who’s gonna be able to beat the algorithm? And what are you trying to solve here? What are you trying to figure out for him?

Adam: Yeah, so that’s a good question. Yeah, I mean, I feel like it’s a very identifiable problem. I, I struggle with this sometimes as well. It’s like he’s losing his time to scrolling on the app and he can’t understand why he can’t stop. We both have a developer background and I was kind of interested just in how these social media algorithms work.

So this was a great prompt, but I thought this would be very quick, and it was anything but.

So I started Googling around, right? Like how does the Instagram algorithm work? But that’s not really the type of answer I was looking for. That’s like influencer tips on how to make your various posts spread. So I had to go further back.

Every Algorithm Starts With a Space Bar

Adam: Before any of these platforms existed, you mostly chose yourself what you saw online. There were blogs you subscribed to, you bookmarked things. But then came Slashdot and then Digg. And then sites like Reddit and Hacker News changed the model from editors deciding what mattered to crowds voting. People post links, they get upvotes or downvotes, and popular stuff rises to the top. And the algorithms behind this were remarkably simple.

The original Hacker News ranking algorithm, it's public, it's written in, uh, Arc, a Lisp dialect. And honestly, the best way to understand it, and all the similar algorithms, is, uh, Flappy Bird.

Liam: What do you mean by flappy?

Adam: Have you played? Do you remember Flappy Bird, that game? Do you remember—

Liam: Vaguely. I think I only sort of saw it on the internet. I don’t think I really got into it.

Adam: So Flappy Bird, you’re a bird in like a Mario type world that’s a side scroller. And all you can do is press the space bar, right? And when you press the space bar, the bird goes up. And then it slowly starts to fall. And so there’s various obstacles and you’re just pressing up to like give it enough up to get through the various things as the, the landscape scrolls through.

So it became this sensation briefly like, like Wordle, but it is very simple. You just have one thing that goes up and then the gravity like slowly pulls it down. And so this is exactly how all of these early ranking sites worked. Hacker News, in fact called theirs gravity.

So you, you go to a, a website, and there’s a bunch of links listed, and when you hit up vote, that’s like pressing the space bar on that item. So like, the ranking of the first page is like a whole bunch of parallel Flappy Birds and all the people on the internet are like pressing the space bar, raising them up or down. And the gravity is, is pulling them down over time.

But the space bar of people is pushing them up. The way that these algorithms can be tuned, you can, you have like a time decay. So basically you can imagine each story gets heavier over time. So when an article first appears, it only takes a couple up votes to push it to the top.

But the longer it appears, the heavier and larger that bird gets and the more people it takes to raise it up. It’s the simplest algorithm you could describe. Like it’s kind of neat to see, but really it’s just simulating gravity and the way it does that, with that decay over time, they can tune that in terms of how often the stories turn over, right?

How often do you have new things? How often do things stay so that people can comment and, and react.

So that’s it, that’s like the simplest version of what eventually becomes Instagram and TikTok and et cetera.
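
To make the gravity idea concrete, here's a rough sketch in Python of the kind of scoring rule Hacker News has described publicly. The constants, the +2 and the 1.8 gravity exponent, are the commonly cited approximation, not necessarily what runs in production today.

```python
def rank_score(upvotes: int, age_hours: float, gravity: float = 1.8) -> float:
    """Hacker News-style score: votes push a story up, age drags it down."""
    # Subtract the submitter's own vote; the +2 keeps brand-new stories
    # from dividing by a tiny number.
    return (upvotes - 1) / pow(age_hours + 2, gravity)

stories = [
    {"title": "Fresh story, 5 votes", "upvotes": 5, "age_hours": 1},
    {"title": "Old story, 200 votes", "upvotes": 200, "age_hours": 24},
]

# The front page is just every live story scored and sorted, highest first.
stories.sort(key=lambda s: rank_score(s["upvotes"], s["age_hours"]), reverse=True)
for s in stories:
    print(s["title"], round(rank_score(s["upvotes"], s["age_hours"]), 3))
```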

Liam: Alright, so every post is like a bird in Flappy Bird. It gets boosted up by votes, and then over time gravity makes it go down.

Adam: Yeah. But like the crazy thing to me, Liam, is like, I, I spend a lot of time on Hacker News, sometimes on Reddit. There was a while I was really into custom keyboards, like for my computer, and I spent a lot of time on the, uh, mechanical keyboard subreddits. Whatever the size of the community, there were always interesting things to find. But it's so small, right? And so simple. There's no machine learning, there's no personalization, no tracking, it's just votes and time.

This stuff came out around 2005. It's nothing fancy, but it creates this unified front page. It creates this experience where all the people in a certain group, you know, have their own front page of things that are interesting. It gives the community something common to talk about by bringing everybody together and voting on something.

Liam: So it is a little bit like a shared homepage, which is different to a typical feed where it’s entirely bespoke and personalized to you.

Adam: Yeah. This is like the early versions, right? And yeah, it has that advantage. On Reddit, if you're into woodworking, the woodworking subreddit surfaces interesting things, right? If you're into politics, you get the most talked about debates. What it's doing is it's searching through the interests of the people paying attention.

You can think of this simple algorithm as putting forth a bunch of interesting woodworking ideas, and then you vote on one. And then that's probably interesting to somebody else in the community. So it's kind of searching through everybody's interests to find the things that are most interesting and common to all the people who are there.

The Button Nobody Should Have Pressed

Adam: But here's where it gets interesting, right? On Reddit there's another button and it's called Sort by Controversial. So instead of showing the things that get upvoted the most, Sort by Controversial shows the things that get heavily upvoted and downvoted at the same time.
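
Reddit has open-sourced its sort functions, and the controversy score is tiny. This is a close paraphrase in Python rather than the exact production code, so treat it as a sketch: a post only scores high when it gets a lot of votes and a near-even split.

```python
def controversy(ups: int, downs: int) -> float:
    """Controversy sort: big vote totals AND a near-even split score highest."""
    if ups <= 0 or downs <= 0:
        return 0.0                    # one-sided posts aren't controversial
    magnitude = ups + downs           # how much total attention it got
    balance = downs / ups if ups > downs else ups / downs  # 1.0 = perfect split
    return magnitude ** balance

print(controversy(500, 10))   # popular but agreeable: ~1.1
print(controversy(260, 250))  # hotly contested: ~400
```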

The interesting thing about that is a lot of times the conversations around something that’s controversial are more in depth, right? There’s more to talk about if there’s a controversial woodworking technique, uh, and some people don’t even want it on there.

It's likely to be interesting to talk about, although people may get inflamed and it may cause aggression. And so that sent me down another rabbit hole. Part of this, Liam, is just, I started looking into this and I, I ended up going in a lot of different directions.

There's this short story by Scott Alexander and it's called Sort by Controversial. In it, a startup of some sort wants to find things that do well on Reddit. And so they train like a machine learning model on these controversial posts, right? They get this sort by controversial information.

They try to build something that will recreate it. And so what they end up doing is building a machine for generating very divisive statements. A statement that, you know, if you show it to two people, they may have very different takes on it.

This sort by controversial is actually just finding things that divide us, that cause us to battle it out.

And the crazy thing, the reason I think this is interesting, is, you know, you don't need a fancy model or a bad actor or some crazy machine learning to get this result, to stir things up. Because it's still just this simple gravity, but you're measuring, instead of what people like the most, what causes them to interact the most.

All you’re doing is measuring that. It’s not complicated, crazy AI math, but you can find these stories. You can find these things that just cause such visceral reactions that people can’t help but interact with it. That’s just how we’re wired, like certain things, if you care about it a lot, you can’t not respond. Oh, I can’t go to bed. Someone’s wrong on the internet. You’re like, I gotta tell people that that’s incorrect because everybody has an opinion.

Liam: And this can happen to any community, right? Whether it’s on Reddit or Facebook or YouTube. It can happen anywhere where there’s these divisive comments that are made with the intention of splitting the readers.

Adam: It’s not even an intention, right? It’s just, it’s almost just innate to who we are, that there will be points of discussion that are very divisive and they grab people’s attention. Luckily though, Reddit and Hacker News, Digg, and these early sites, like they weren’t made to surface these.

They did get surfaced. And on Reddit you could go and see, oh, what, what’s the things that’s dividing the woodworking community? But they didn’t build the system to stir up this type of drama.

Liam: Yeah. So how does this explain Corey’s cats?

Adam: Yeah, exactly right. It doesn't, right? Like not yet, but I had to start somewhere to understand how these social algorithms work. Reddit explains how communities find what matters to them. You know, people are upvoting stories, stories slowly fall off the bottom, new ones pop up. But Corey is on his own there.

There’s no group. He’s just scrolling lost in his feed. So I had to keep digging, right? And next up was Facebook, and that’s where things start to get a little unsettling. I was up late and I was reading, I’m up on my phone in bed, and I think like, oh, this is what Corey’s talking about. I’m stuck. But before I get into what went wrong with Facebook, I want to talk about just how Facebook worked in the early days,

The News Feed Was Just Gossip All Along

Adam: So you go into Facebook. You would post a photo and other people would see like, oh, there's an update from Liam. He posted a new photo. This is the newsfeed. And it was an innovation. Everywhere has this now, but this was a big improvement at the time, because it hit on something deep. We're, we're social animals, right? And we're wired to keep tabs on people around us. It's kind of like the office lunch room or cafeteria in high school.

You know, you're not there for the coffee or for lunch. You're there to hear what went down in that meeting you missed. Somebody maybe has a comment about somebody else under their breath. That whole thing, the newsfeed automated it.

It’s like, here’s all the people you’re following and here’s the things that they’re saying.

Liam: So the newsfeed is just gossip. That’s it.

Adam: People always talk about gossip like it's bad, but we like gossip because it's about something important: social standing. It's important for us to know where we stand in the tribe.

And we will want things because we see other people want them. We compare ourselves constantly to other people in our group. Um, if your neighbor posts that they just did a new kitchen renovation, suddenly your kitchen feels inadequate. Like nothing has changed in your world, just your comparison group has changed.

There's actually research around this. It's not celebrities who get under your skin. If Kim Kardashian has an amazing, crazy vacation, you can roll your eyes at that. It probably doesn't make you as jealous as when your college roommate, the one you didn't think was that smart, gets a promotion. The social standing of those we perceive as close to us is something we deeply care about.

Liam: What I’m curious about is what’s the algorithm actually doing here?

Adam: Mm-hmm. The algorithm was called EdgeRank and it's super simple. It just looked at how close someone was to you, right? Like, is it your friend or is it your friend's friend? How often do you interact with them? It looked at what type of post it was.

You know, if you posted a picture, that is more important than just a text update. Um, and how recent it is. So it's kind of like just old school Twitter, right? It is a list of the recent things from the people that you have friended. And it can rank them a little bit. Like if you comment on somebody's posts a lot, it's more likely to show theirs.
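
The shape people usually give for EdgeRank is affinity times edge weight times time decay, summed over a post's edges (likes, comments, tags). Here's a minimal sketch of that idea; the weights and decay rate are made-up assumptions, since Facebook never published the real values.

```python
import math

# Hypothetical edge-type weights; Facebook never published the real values.
EDGE_WEIGHTS = {"photo": 5.0, "link": 3.0, "status": 1.0, "comment": 4.0, "like": 1.0}

def edgerank(edges: list[dict]) -> float:
    """EdgeRank-style score: sum of affinity * edge-type weight * time decay."""
    score = 0.0
    for e in edges:
        decay = math.exp(-0.1 * e["age_hours"])           # recent edges count more
        score += e["affinity"] * EDGE_WEIGHTS[e["type"]] * decay
    return score

# A close friend's fresh photo with a comment outranks an acquaintance's old link.
close_friend_photo = [{"type": "photo", "affinity": 0.9, "age_hours": 1.0},
                      {"type": "comment", "affinity": 0.9, "age_hours": 0.5}]
acquaintance_link = [{"type": "link", "affinity": 0.2, "age_hours": 20.0}]
print(edgerank(close_friend_photo), edgerank(acquaintance_link))
```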

So I’m sure that’s hard to do at Facebook scale. But again, it’s so simple, Liam. The thing that made it so powerful isn’t this great insight that the algorithm has. It’s actually, it’s in us, Reddit was finding consensus in a group. What does everyone think is interesting in woodworking?

Facebook taps into something deeper. It’s just our, our social standing. Envy, jealousy, gossip. The, the code isn’t complex. What, what’s interesting about it is that your friends from high school are on there. The, the thing that makes it powerful isn’t an algorithm, actually. It is human nature.

Liam: But how, how does this relate to Corey? Because he is not jealous of a cat.

Adam: I guess what happened is, I wanted to understand what these algorithms do. And honestly, I spent a lot of time on this and this was one of my early light bulbs.

Like yes, it doesn’t totally connect to Corey’s thing, but it connects to this idea that, oh, like Facebook newsfeed wasn’t this great all-seeing eye that understood Adam. It, it’s just working because I care about these people that are my friends.

And I, I spent some time on this, but I couldn’t figure out how it exactly connected to Corey’s thing. Right. He’s not in the community. He’s not envious. He’s not even angry or jealous. He’s watching what sounds very wholesome to me, just like cats doing silly things.

Um, so I almost called him back to say that I didn’t have a good answer. But then I found these Facebook data leaks, they’re called the Haugen documents.

Tens of Thousands of Files Walk Out the Door

Liam: So what are these? What are these leaks?

Adam: Yeah. So this data scientist, Frances Haugen, walked outta Facebook with tens of thousands of files, and you can find a lot of these files online. She shared them with journalists. They became part of the public record, and they told this whole crazy story: that around 2017, Facebook broke its algorithm.

So this kind of does relate to what happened to Corey. I don't know why this topic is so interesting to me, Liam, but I've gone pretty deep on it. It's 2017, Facebook's engagement numbers are dropping, and it's a big deal because there's this feedback loop on Facebook.

You post a photo, you know, of your birthday party and then your friend gets a notification ‘cause they’re in it and they log in and they leave a comment and then you like that, right? And then they post something and they tag you and you go in and it’s this circle, right? It’s like the people in your community interacting with each other.

Activity causes more activity. Comments feel good. Likes feel good. Everybody keeps contributing and the cycle repeats. But then something happened. The loop was slowing down, people were getting less out of it. Or so it seemed. And so they were like, we gotta find a fix, to address what they perceived as a real threat to the heart of their business.

They rolled out this new metric, they called it meaningful social interactions, or MSI, and instead of, you know, measuring how long people stayed on the site, they wanted to boost engagement. They wanted to boost comments and shares and reactions, and the thinking was simple, right? If people are talking to each other, people are interacting, then the platform is working as intended.
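
Reporting on the leaked documents describes MSI as a point system, where a comment or a reshare is worth many times a plain like. The point values below are illustrative assumptions, not the real constants, which reportedly shifted over time; the sketch just shows how that kind of weighting changes which posts win distribution.

```python
# Illustrative point values only; reporting on the leaks describes comments and
# reshares being worth many times a like, and the real numbers changed over time.
MSI_WEIGHTS = {"like": 1, "reaction": 5, "comment": 15, "reshare": 30}

def msi_score(interactions: dict) -> int:
    """Sum each interaction count times how 'meaningful' that type is deemed."""
    return sum(MSI_WEIGHTS[kind] * count for kind, count in interactions.items())

vacation_photo = {"like": 120, "reaction": 10, "comment": 4, "reshare": 0}
outrage_bait = {"like": 30, "reaction": 60, "comment": 45, "reshare": 20}

# The post people argue about wins distribution despite getting fewer plain likes.
print(msi_score(vacation_photo), msi_score(outrage_bait))  # 230 vs 1605
```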

Liam: Alright, so to clarify, the fix was optimizing for these meaningful social interactions, which was comments, shares, and reactions.

Adam: Exactly.

Liam: Yeah, it sounds reasonable.

Adam: Yeah. Right. Like, I mean, we work at tech companies, it sounds like a perfect strategy. You have all these metrics, you’re like, oh, this one’s falling. Let’s put together something that optimizes for raising this up. ‘cause this is the most important thing.

They didn't have a way to tell what a reaction was reacting to. They see the numbers moving, but they don't see the posts behind them. They're, they're watching these aggregate numbers and things are looking better.

Liam: Alright, so they’re watching the forest and not the trees.

Adam: Exactly. From a distance, everything looked great. Engagement was up, people were commenting, and the metric was doing its job. But if you zoomed in post by post, you saw something else. The posts getting the most reactions, uh, were the ones making people angry. You don’t rush to comment on your friend’s vacation photo.

You comment when you can’t believe what someone just said, or what they just shared. That’s when you feel the need to jump in. You set a metric and you optimize it, and the number goes up, but there’s side effects and you can’t see the side effects on the dashboard.

But they built this system, Liam, where you would get rewarded for posting something that would anger people, you would get more distribution on Facebook, more likes, more comments, whatever.

For saying something that was divisive. That was how it ended up working.

The Metric That Optimized for Rage

Liam: Shit. So what they’ve actually done is they’ve accidentally built Sort by Controversial here.

Adam: Yeah, looking back, it's wild, right? They obviously didn't read the Scott Alexander story. By optimizing for engagement, they had built this system that just finds the most divisive and angry things across 2 billion people who use Facebook. They took it from a woodworking subreddit, where people might fight about glue or whether or not you can use it, to a worldwide thing where the messages that get the most distribution on Facebook are the ones that anger people, that divide them.

They had this small group in Facebook that was supposed to be stopping misinformation from spreading and after this fix rolled out, they sounded the alarm, right? That this is just making people fight.

Like we built a system that makes people fight and they, they brought it to the newsfeed people. And the newsfeed people said like, yeah, this is risky. But you know, like we were worried our, our site was spinning down. Less people were using Facebook and we found this thing to flip and all of a sudden everybody’s on Facebook.

The newsfeed people said to the integrity people like, Hey, we’ll look into it. We got the metrics in the right direction. There must be a way to kind of tune this where it doesn’t cause quite so much anger.

And so they actually did hire some people. They got a guy from Netflix who worked on Netflix recommendation system. They asked him like, Hey, you guys recommend movies to each other? Like, what can we do here? We need to keep the growth up, but we don’t wanna cause the world to split into wars or, or whatever, right?

And, and he spent a lot of time on it, and he said, it’s actually tricky because you’ve actually found the, the best growth metric. Like this is really working. I don’t know how we turn it down without hurting our numbers.

Facebook was kind of stuck. What was happening, Liam, is the turn towards engagement brought growth back up. It brought growth to exactly where they needed it, maybe beyond. But it was causing all these problems, and they could see it, even all over the world, some really horrible things happening that some would say were related to this change, like wars breaking out.

But turning it down would turn the growth down, and turning the growth down would hurt their revenue, hurt their stock price, hurt the stock options that everybody there gets paid in. It was really hard for them to address this.

Liam: So the fix existed, but it just cost too much for everyone in the company to go and implement it.

Adam: Yeah. And like, in fairness to them, I get it right, like companies are often predicated on growth. There’s other social networks, they want people to use the platform. They’re not forcing people to respond in angry ways, but yet, they sort of are.

Liam: Yeah, I mean, this has been going on for years, like, newspapers publishing front page articles about a war or about a murder or something like that, and this whole thing of if it bleeds, it leads.

Adam: So you're right. Like you could say, hey, they're just replicating in their distribution how editors choose stories for newspapers, like the most salacious thing goes first. But with newspapers there are editors, and the things they print are generally true. That isn't true of Facebook, right? There's no editors, there's no limits. There's instant feedback, there's angry people talking to each other.

So honestly, this is super interesting. Like I, I spent a lot of time reading into all this because it's such a Sophie's choice for a company, deciding what to do here. And in the end, about Corey's question, I kind of wish it had been, hey, why is my dad always fighting with other people on Facebook or Twitter or even in YouTube comments? Because I think that now I actually have a really strong understanding, uh, to answer that question.

It’s all about metric optimization and how engagement causes divisiveness. But actually that wasn’t Corey’s question.

I mean, Instagram is part of Meta, part of Facebook now, but Reels is not a copy of the, the News Feed. It doesn’t sort your friend’s posts by who’s got the most angry reactions, right?

It, it’s more like lining up an endless stream of videos from strangers really, and trying to find the ones you can’t stop watching. It’s kind of a whole different thing.

Liam: Yeah. So we’re still not at Corey’s answer yet.

Adam: I know, I know. It's actually frustrating for me. Every time I found an answer, like these leaks, I thought it would explain it, and they did explain why Facebook got toxic, but they didn't explain why Corey spends all day watching cats.

YouTube Trades Clicks for Watch Time

Adam: That’s when I landed on YouTube actually. ‘cause YouTube cracked a different code and, and it is video based. It doesn’t care about reactions or anger or even a sense of community. They’re just trying to find things to keep you watching.

Early on, YouTube cared about clicks, right? You’d upload a video onto YouTube and YouTube counted how many people clicked. And that was already like way more data than Facebook had because people click on things way more than they comment. So it’s just like, Hey, show the videos that people click on the most. But that led to a problem. It just led to clickbait and junk where the videos didn’t actually deliver. So YouTube had to solve this.

Instead of just counting clicks, they started caring about how long you would actually stick around. So instead of Facebook's engagement system, they built their whole system around: how do we keep you on the site? How long are you watching this video?

That way we can move past flashy thumbnails and we can focus more on, if people are on our site, they must like it. So they pulled it off by taking an idea from Amazon and others, which is collaborative filtering. You go on Amazon and it's like, people who bought this product also bought this one.

YouTube was able to really unlock this algorithm and say like, if you watch this video, then you might like this other video, because somebody else watched both of them together. And then every second you watch a video is a signal. Every time you skip a video, that's a signal that it's not a fit. And it's not just about you, Liam, right? It's about the people who watch the type of stuff that you do.

So when you stop watching a video at the 30-second mark, YouTube doesn't just learn that you didn't like it, right? It learns that people with viewing habits like yours didn't like it.

And so I wanted to see how YouTube actually does this under the hood, and they actually published a paper on it. The paper is by Covington, Adams, and Sargin. And it’s about how they trained YouTube to find what videos to show you. And you might think this is really complex and it’s got a lot of crazy math in there.

And it does because YouTube is massive and there’s so many videos and so many people visiting. But the core of it is actually pretty clear. Here’s what it comes down to, right? They take the last 50 videos that you watched for more than a few minutes. And then they use like a vector space. So they’re, they come up with a representation of those 50 videos and they combine them all together and they, they put it in some sort of space, like you can imagine a, a graph.

And they're like, Liam is the last 50 videos he watched, and that puts him here in that space. All they're doing for everybody is finding the spot they're at, then they find other people near that spot and they're like, well, what did they like? And they show them the next one. That's the recommendation.
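
Here's a toy version of that "spot in the space" idea, leaving out the deep network the Covington paper actually trains: average the embeddings of your recent watches, then take the nearest videos to that point as candidates. The embeddings below are random stand-ins; in the real system they are learned from watch data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for learned video embeddings, one 32-dimensional vector per video.
catalog = {f"video_{i}": rng.normal(size=32) for i in range(1000)}

def user_vector(watch_history: list[str]) -> np.ndarray:
    """Represent the user as the average embedding of their last ~50 watches."""
    return np.mean([catalog[v] for v in watch_history[-50:]], axis=0)

def recommend(watch_history: list[str], k: int = 5) -> list[str]:
    """Candidate generation: the videos whose embeddings sit nearest the user's spot."""
    u = user_vector(watch_history)
    sims = {
        vid: float(np.dot(u, emb) / (np.linalg.norm(u) * np.linalg.norm(emb)))
        for vid, emb in catalog.items()
        if vid not in watch_history
    }
    return sorted(sims, key=sims.get, reverse=True)[:k]

print(recommend(["video_1", "video_7", "video_42"]))
```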

Liam: So they’re building a model of you specifically based on your behavior, on their platform.

But all of these videos on YouTube back in the day were all long videos. Right? Because I’m curious, what I’m hearing from you regarding Corey, is that he is watching five second clips, not long form videos.

Adam: Yeah. And so is everybody, like so many people are addicted to these short videos. What makes short videos different? That question kind of changed how I saw all this. I used to think of oh, there’s this algorithm, it’s this one trick. You know, whether it’s Facebook or, you know, even the simple one on Reddit or whatever is on YouTube, but it’s a little different than that. Every platform has built their own little algorithm and they’re each kind of optimizing for a specific thing.

Reddit is great at finding this consensus for a small community. Facebook is good, you know, originally for stirring up envy, and then they found a way to oh, stir up rage and divisiveness. And YouTube, right? They found this momentum loop like, oh, let’s keep you watching, right? One video after another. Let’s focus on the time. Let’s learn and predict. But five second videos, it’s a whole new game. That’s TikTok. They’re the people who innovated here. And once I looked into what TikTok does, things started to really click.

Liam: So this is the algorithm that we’re talking about with Corey, right?

The Leaked Document TikTok Confirmed Was Real

Adam: Yeah, man, I, I'm getting there. There is like an arc here, I promise. But yes. After reading the YouTube paper, I wanted to know how TikTok pulls this off. The reason that TikTok's important here is TikTok is the pioneer of this addictive short form video. But TikTok, they keep a lot of things under wraps. Then in 2021, an internal document from ByteDance, their parent company, was leaked to the New York Times.

It came straight from their engineering team, and TikTok actually confirmed it was a real document. It was kind of a cheat sheet to explain to non-technical staff how everything worked.

It broke down the algorithm, explained what it did. So YouTube looks at your last 50 videos and makes a model of you so that when you go to the YouTube homepage, it will recommend things and then you’ll click on them to watch them. And it does this kind of offline, you can imagine nightly.

It's updating its model of Liam based on what you've seen. But watching 50 videos, that takes a lot of time. If your interests change, it takes YouTube a while to notice, because there's just a lot to watch. But with short videos, the pace is totally different, right? In the time that you could watch a 10 minute YouTube video, you might watch, or just flip through and skip, like 30 or 40 TikToks.

Every swipe that you do, every loop, is a signal. If you watch a video twice, that's a really strong signal. If you swipe past in half a second, that's a really strong signal. If you pause mid video or go back a little bit, you're interested, right? If you tap the creator's profile, you're interested. Very quickly, you're generating all these signals that they learn about you from.

And, like YouTube, right? TikTok has to narrow down all the videos on the platform to pick the ones just for you. But once they do that, the system runs the numbers on the fly, not waiting till the night to rebuild the model of you.

They are figuring out constantly:

Thirty Minutes to Build a Model of You

Adam: How likely are you to watch the whole thing? How likely are you to just abandon TikTok at this point? They weigh each of those, and the strongest signals, unlike Facebook and its interaction system, have nothing to do with likes or comments. It's like YouTube. It's just watch time and re-watch time.

Did you watch the whole thing? Did you watch the thing after it? Did you watch twice?
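
The leaked explainer reportedly reduces ranking to a weighted sum of predictions per video: how likely you are to like it, comment on it, play it at all, and how long you'll watch, each multiplied by a tunable weight. Here's a sketch of that shape; the weights and the predictions are invented for illustration, not values from the document.

```python
# The weights are knobs the platform tunes; these values are invented for illustration.
WEIGHTS = {"like": 1.0, "comment": 2.0, "play": 0.5, "playtime_sec": 0.1}

def video_score(pred: dict) -> float:
    """Weighted sum of per-video predictions, the rough shape the leaked
    explainer describes: score = sum(predicted behavior * its weight)."""
    return (pred["p_like"] * WEIGHTS["like"]
            + pred["p_comment"] * WEIGHTS["comment"]
            + pred["p_play"] * WEIGHTS["play"]
            + pred["expected_playtime_sec"] * WEIGHTS["playtime_sec"])

candidates = {
    "ai_cat_reel":  {"p_like": 0.30, "p_comment": 0.02, "p_play": 0.95, "expected_playtime_sec": 14.0},
    "cooking_clip": {"p_like": 0.10, "p_comment": 0.01, "p_play": 0.60, "expected_playtime_sec": 4.0},
}

# Whatever the model predicts you'll watch longest tends to win the next slot.
print(max(candidates, key=lambda v: video_score(candidates[v])))  # ai_cat_reel
```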

Liam: So they’re reading your unconscious behavior and then they’re adapting the videos they show you on the fly.

Adam: It's actually quite a technical feat. Where YouTube did this huge thing at night to determine what's best, TikTok is doing it on the fly. As you're going through, they're learning about you. They're constantly updating that. Their first breakthrough was signal density. That's what they call the fact that they're getting like 40 data points from you in 10 minutes.

YouTube just gets one, that you watched something for 10 minutes, right? The second breakthrough is that training that I mentioned, right? YouTube does this batch at night. But TikTok ditched that approach. They published a paper about their system, Monolith, and it's a streaming pipeline. Every time you're interacting, they're updating that model of you, of what you might want to watch next, in real time.

Every swipe or pause goes into a Kafka queue, and then they have this Flink job that grabs those events and adds context. How you watched this one video can affect what shows up, like, two videos later. It's learning very quickly.
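
Monolith, per the published paper, really does use Kafka for the event log, Flink to join events with features, and online training of the ranking model. The sketch below is only a toy stand-in for the shape of that loop: consume each interaction as it arrives and nudge a per-topic model immediately, instead of waiting for a nightly batch.

```python
from collections import deque

# Toy stand-in: in the real pipeline the queue is Kafka, the join/feature step
# is a Flink job, and the model is a large neural ranker trained online.
event_queue = deque([
    {"video_id": "ai_cat_reel", "topic": "cats", "watch_fraction": 1.0, "rewatched": True},
    {"video_id": "news_clip", "topic": "news", "watch_fraction": 0.1, "rewatched": False},
])

user_interest: dict[str, float] = {}   # the "model": per-topic weights, updated online

def online_update(event: dict, lr: float = 0.5) -> None:
    """Nudge the topic weight toward the engagement signal the moment it arrives."""
    signal = event["watch_fraction"] + (1.0 if event["rewatched"] else 0.0)
    prev = user_interest.get(event["topic"], 0.0)
    user_interest[event["topic"]] = prev + lr * (signal - prev)

while event_queue:                      # no nightly batch; the stream drives updates
    online_update(event_queue.popleft())

print(user_interest)                    # {'cats': 1.0, 'news': 0.05}
```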

And then what they tried to do was figure out, oh, what period of your watch history should I learn over? Should I look at what Liam has watched in the past year, in the past week, et cetera? Do you want to guess what they chose?

Liam: What do they choose?

Adam: 30 minutes. So the most important information about Liam is in the last 30 minutes. It’s very keyed in to what you’re currently doing. The reason they could do this 30 minutes is ‘cause they have way more data per minute.

The model is learning all the time. So in the last 30 minutes they may actually have a lot of information, but then there’s the real engineering hurdle that, you know, using the last 30 minutes of your time on TikTok means they need to be able to constantly update this, right?

So they, they came up with a complicated system to do that. That’s really a great feat of engineering. Basically what all this means is they have a lot more data per video and a lot more video per period of time.

Liam: I've got this image in my head of a massive burger and chips meal versus a meal at a sushi train. At the restaurant that serves the massive burger and chips, maybe the chef is out back, maybe they're even watching you to see whether you like it, and they're looking to learn how you behave or how you think about the massive burger and chips that you've eaten. But they've gotta sit through the entire meal.

And maybe at the end of that meal, maybe you hated it and you only ate half of it. They were learning once. From one big meal.

Whereas at a sushi train, someone sits down and there's small bite-sized meals going around and around and around, and the sushi chef might be there watching every single bite that you take. They can see that you really like the chicken teriyaki sushi, and maybe you don't get any of the raw fish. So all of a sudden they start adapting their sushi train: they take off the raw fish sushi and start piling on more of the other things.

Kentucky 96 Watched One Sad Clip

Adam: Yeah, exactly. The, the Wall Street Journal, uh, did this test to see how fast TikTok could figure out your interests. So they built these bots basically, and they gave them each a personality that they would only like certain videos.

They have this one account called Kentucky96. It was set up to be a 24-year-old interested in sadness. In just three minutes, 15 videos in, it came across a sad clip.

And it watched it several times, right? And that's all it took. By the time that 30-minute window had passed, Kentucky96 had seen 224 videos, and 93 of those were about people who were depressed or thinking of suicide or self-harm. What happened is every time Kentucky96 lingered on something sad, the next one was likely to be even sadder.

And the algorithm doesn’t know what it’s doing right? It’s still very simple.

Liam: Yeah, so this algorithm can send someone from sort of the mainstream into fringe in 36 minutes from a relatively okay state with a couple of depressive thoughts, and then all of a sudden they’re entirely depressed in 36 minutes only. That’s all it takes.

Adam: Yeah. And this Wall Street Journal test, they tested a lot of different user profiles. One was interested in politics, and it quickly got pulled towards, you know, QAnon and conspiracies. I don't think there was a cat owner one, maybe there should have been. Um, but the pattern was always the same. As long as you interact with something, the algorithm can find more adjacent things that'll grab you and pull you deeper.

A system like YouTube takes weeks to learn your habits, and so it sees a kind of larger range of what you're into. TikTok locks onto your narrow, in-the-moment interest and keeps feeding it. So the rabbit hole isn't really intentional. It's what happens when a system is learning your preferences faster than you realize, and this connects back to a problem that Corey is actually really worried about, and it's kind of a bigger one than AI cats.

Roblox on the Right, YouTube Shorts on the Left

Adam: It’s about how people and especially young people are able to cope with the world we’ve built. Basically, he was worried about his son.

Corey: He’s old enough to where he sits down at the computer, he cracks open Roblox and he has it fill the right three quarters, the right four fifths of the screen, and then he puts a browser in the other 20% and scrolls YouTube Shorts while he is playing Roblox.

That’s not behavior that I want.

And I’m worried that I’m raising kids who don’t understand how to actually focus on something for longer than four or five seconds because their attention’s just hopping everywhere.

Adam: So I think that’s the real problem, right? He’s actually worried, I mean, for himself, but for his kid.

Liam: Is it a technical problem here or is it a bad habit that is enabled by tech?

Adam: Yeah, that’s the hard question. So, I will answer that ‘cause I think that’s important. I called Corey, I walked him through the whole story, right? There’s like Reddit and Facebook and YouTube and TikTok and how each found its way to like, get more people interested in it.

Corey: Is there ever a point where this becomes unwillingly like self-reinforcing? Oh, you know, we showed him a bunch of these and so we’re just gonna keep flooding his algorithm. But then, because we’re flooding his algorithm, it’s just reinforcing the signal that we’ve already decided. Did I just like DDoS my algorithm to the point that it just thinks this is mostly what I wanna watch?

Adam: Yeah, yeah. I mean, possibly.

Yeah, so Meta has, uh, what is this called? It says, “We wanna make sure everyone on Instagram, especially teens, has positive and age appropriate experiences. We’ve built in protections, but we also wanna give you a way to shape your own experience. That’s why we’re testing the ability for everyone on Instagram to reset the recommendations. In just a few taps, you can clear your recommended content across Explore, Reels, and Feeds.”

So that’s your key. It’s just under content preferences, reset suggested content.

Corey: Oh, really. I was gonna ask, is there, do you have any tips to dig myself out of the hole I’ve created for myself here?

Adam: But, but the interesting thing is, this part you probably won’t like as much, right? Because like

Social Media Is Cheesecake

Adam: The thing that makes TikTok and Reels and whatever work is not necessarily some magic in the algorithm. It's your interests. Social media is like cheesecake, right? Like we like things that are bad for us but taste good. We like things that are sweet because that's just our evolutionary background, right?

We wanted fat and sugar, and cheesecake is just a purified form of that. So the thing about cheesecake isn't that it's magically addictive because of the ingredients. It's addictive because it's made of things that we like and enjoy.

So I mean, I think that you like these cats,

Corey: I guess, yeah. I love annoying my wife by sending tons.

Adam: Yeah, right. Maybe it’s just the absurdity of sharing it with her. But like, if the novelty of them wears off and the Reels algorithm is working properly, I mean, it should be able to start showing you other stuff, although, as you said, maybe it’s over indexed on this.

Corey: Thanks for giving me the benefit of the doubt there.

Adam: Yeah. I don’t know, but I don’t know if this is helpful. Like, the thing I found is that like actually learning about the algorithms doesn’t necessarily make them not work. Now that I know how they work, it’s not gonna make me less likely to scroll on my phone.

Corey: Yeah. And I’m hoping I can scroll less, but we’ll see.

Adam: Like I feel like, you know, much like cheesecake, the key is just to like, not have it in the house or like on the kitchen counter, that’s all I got. Like I didn’t solve your problem necessarily.

Corey: Uh, you did. I’m gonna go give this reset Preferences a shot. We’ll see if I can get back to normal.

Liam: I feel like cheesecake now, but I mean, so, the answer here is that we just don’t have social media in the house.

Adam: Well, the point isn’t that you should never eat cheesecake, right? Cheesecake is delicious. Everybody has their weak spots,

Liam: The craving’s always gonna be there, for us as humans, but the concentration of the, the thing that we’re craving for, that’s what matters.

Adam: Yeah. Like the craving is us, like our, our desires and, and the things that we want in the world, whether food, or weird videos, it’s part of who we are. The algorithms didn’t build it. They just got very good at finding it and concentrating it.

Liam: Yeah, so, how about his kid?

Adam: Yeah. I think in a way that is both simpler and more profound.

Adam: The one thing that I found disturbing, I guess, was the TikTok thing. Because that window is so small, it can exacerbate mental health things. Like if you're feeling sad, it will learn, oh, you're interacting with sad videos.

I mean, that’s concerning for teens and children, I think more, more so than anything else.

Corey: Yeah, that I didn’t even think about that. That’s, that’s tough. Are they doing anything guardrail wise around that, or is it just the algorithm’s gonna algorithm?

Adam: I think it’s algorithms can algorithm. Like they don’t know what a cat video is, right? The algorithm’s not that smart. It’s just pattern matching. And if it’s pattern matching and showing you things that you think are funny to share with your wife, that’s cool. But if it’s like a depressed teenage girl and they find that she likes depressed things, and just kept feeding her depressing stuff, like, yeah, man, you go to a bad place.

But I just think like, people shouldn’t panic about kids’ use of social media as much as just try to have constraints.

Corey: Yeah. It’s kinda where we’ve fallen between my older daughter and my son. We just opt for, you know, reasonable monitoring without getting too much into their business, but enough that we’re aware, you know, like, I have a link to every short my son’s watched and it, have I watched all of them? No. But I go randomly pick some and it’s like, okay, it’s 12-year-old boy stuff like, it’s what I would’ve expected. We’re good.

1,700 Lawsuits and a CEO on the Stand

Adam: Yeah, so that’s the show. Or at least that’s where I thought it would end. Keep the cheesecake outta the house. Simple enough. But while I was putting this episode together, Mark Zuckerberg took the stand in Los Angeles, the first time he’s ever testified before a jury because a young woman said Instagram was designed to hook her and she was suing, uh, Meta, and there are 1,700 similar cases behind her.

And those same Haugen documents that I spent weeks reading through, they’re now evidence in this trial. Meta’s own researchers, uh, found, as was in the documents, that turning off middle of the night notifications helped kids sleep better, but it hurt growth, so they didn’t do it. And it’s easy to hear things like that, Mark Zuckerberg keeping kids up at night, and think that like, “Hey, there’s your villain. That’s the answer here,” but I don’t think it’s the right lesson.

We spent this whole episode talking about how every platform is kind of tapping into something innate in us. That cheesecake is always gonna get made because we want it, right?

That’s why maybe regulation matters. You know, we do it for alcohol, we do it for cigarettes.

You know, we don’t regulate cheesecake, but maybe, maybe we should.

So until next time. Thank you, Liam. Thank you, Corey. Thank you everybody, and thank you so much for listening.

Support CoRecursive

Hello,
I make CoRecursive because I love it when someone shares the details behind some project, some bug, or some incident with me.

No other podcast was telling stories quite like I wanted to hear.

Right now this is all done by just me and I love doing it, but it's also exhausting.

Recommending the show to others and contributing to this patreon are the biggest things you can do to help out.

Whatever you can do to help, I truly appreciate it!

Thanks! Adam Gordon Bell

Support The Podcast