14 Twitter code insights in 3 mins!

Hey friends, Happy Wednesday!

Thanks for voting! Let’s look at some interesting insights about the Twitter Recommendation algorithm this week. I aim to write my newsletter issues so that you can follow them while traveling on a bus, having a coffee, waiting for food, etc. Let's jump in!

Shoutout!

Thank you, Umang and Ruthrash, for submitting testimonials for my newsletter.

Tweet of the week

I’m sharing a cool fact about gadgets every day on Twitter. And here is one of them for you!

Twitter Recommendation Algorithm Code insights

The following reflects the state of the Twitter repository in April 2023, and it might change later.

  1. About 50% of the Twitter Recommendation algorithm repository is in Scala, and 30% is in Java.

  2. About 500 million tweets are posted daily.

  3. There are 145k communities on Twitter, with Pop, News, Soccer, Bollywood, and NBA among the largest by number of users.

  4. The “For You” timeline on Twitter consists of 50% In-Network Tweets (people you follow) and 50% Out-of-Network Tweets (people you don’t follow) on average.

  5. Out-of-Network Tweets are excluded from your feed when you have no second-degree connection to them, that is, when nobody you follow has engaged with the Tweet or follows its author.

  6. Getting blocked or muted, or having a tweet reported as spam or abuse, will reduce your tweets' visibility and reach.

  7. PageRank is a graph algorithm that was originally developed by Google to determine the importance of web pages in search results. Twitter constructs a similar directed graph by treating Twitter users as nodes, and their interactions (mentions, retweets, etc.) as edges.
    Source code: TweepCred readme

  8. Your PageRank on Twitter is reduced by a factor based on the ratio (1.0 + numFollowings) / (1.0 + numFollowers). So, the more followers you have compared to the number of people you follow, the better it is. But this reduction is applied only if you’re following at least 2500 accounts.
    Source code: TweepCred readme, Reputation.scala

  9. While selecting tweets to be shown on your timeline, tweets from Twitter Blue verified (blue tick) users that you follow get a 4x boost, and tweets from Blue verified users that you don’t follow get a 2x boost.
    Source code: VerifiedAuthorScalingScore.scala, HomeGlobalParams.scala

  10. You have 30 minutes to edit a tweet after it’s posted.
    Source code: EditedTweetsCandidatePipelineQueryTransformer.scala
    // The time window for which a tweet remains editable after creation.
    private val EditTimeWindow = 30.minutes

  11. A machine learning model is used to rank tweets before displaying them on the "For You" timeline. It is one of the final stages of the funnel.

    Each tweet gets a score, which is a weighted sum of the predicted engagement probabilities. Each probability is multiplied by a weight (0.5, 1, 13.5, etc.), as you see in the figure below, and these weights are adjustable.

    The figure shows where the weights stood as of April 5, 2023. All the products are added together to generate the score.

    P(X) = probability that X will happen.
    Source: twitter/the-algorithm-ml/blob/main/projects/home/recap/README.md

Figure: Tweet score using weighted sum of probabilities for different possible actions

  12. Notice the 75x weight on the probability that you reply to a Tweet on your timeline and the Tweet author engages with that reply.

  13. There is a -369x weight (a penalty) on the probability of you reporting the tweet if it’s displayed on your timeline. The algorithm predicts all of these probabilities and uses the weighted sum as a factor in deciding which tweets to display on your timeline.

  14. Twitter's code identified 4 different user groups: power users, Democrats, Republicans, and Elon Musk. Twitter says this was stats collection code for measuring how often Tweets from specific user groups are served. It was removed last week!
    Source code: HomeTweetTypePredicates.scala, commit id ec83d01dcaebf369444d75ed04b3625a0a645eb9
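
If you like seeing ideas in code, here is a minimal sketch of insight 7: a plain power-iteration PageRank over a tiny interaction graph. This is not Twitter's TweepCred code. The usernames, the function name, and the damping value of 0.85 (the commonly used PageRank default) are my own assumptions for illustration.

```python
def page_rank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # Start with the same rank for every user.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every user keeps a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A user with no outgoing edges spreads rank evenly to everyone.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical interactions: (user who mentioned or retweeted, user on the receiving end).
interactions = [("ann", "cat"), ("bob", "cat"), ("dan", "cat"), ("cat", "ann")]
ranks = page_rank(interactions)
top_user = max(ranks, key=ranks.get)
print(top_user)
```

In this toy graph, "cat" receives interactions from three users, so it ends up with the highest rank.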

Sources: Twitter’s blog post on their Recommendation Algorithm, Twitter’s open source code on GitHub.

Twitter's blog post on the recommendation algorithm explains the overall high-level architecture of how things are done. It's an amazing place to get an overview of everything.
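
Here is a rough sketch of insight 8, the follow-ratio reduction. It is deliberately simplified: I assume the PageRank is divided by the ratio itself (never by less than 1), while the actual Reputation.scala may compute the division factor differently. The function and constant names are mine, not Twitter's.

```python
FOLLOWING_THRESHOLD = 2500  # the reduction applies only from this many followings


def adjusted_page_rank(page_rank, num_followers, num_followings):
    """Reduce a PageRank score for accounts that follow far more than follow them."""
    if num_followings < FOLLOWING_THRESHOLD:
        return page_rank  # below the threshold, nothing changes

    ratio = (1.0 + num_followings) / (1.0 + num_followers)
    # Assumption: divide by the ratio, but never boost the score.
    division_factor = max(1.0, ratio)
    return page_rank / division_factor


# Follows 3999 accounts but has 999 followers: ratio = 4000 / 1000 = 4.0
print(adjusted_page_rank(1.0, num_followers=999, num_followings=3999))
```

Under these assumptions, an account with many followers relative to its followings keeps its full score, and so does any account that follows fewer than 2500 people.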
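
To make insights 11 to 13 concrete, here is a small sketch of the weighted sum. It uses only the five weights mentioned in this issue, so it is not the full formula from the figure. The action names are my own labels, and matching 0.5, 1, and 13.5 to like, retweet, and reply is my reading of the README, so treat it as an assumption.

```python
# Weights mentioned in this issue (as of April 5, 2023). They are adjustable.
WEIGHTS = {
    "like": 0.5,                      # assumed label for the 0.5 weight
    "retweet": 1.0,                   # assumed label for the 1 weight
    "reply": 13.5,                    # assumed label for the 13.5 weight
    "reply_engaged_by_author": 75.0,  # insight 12
    "report": -369.0,                 # insight 13
}


def tweet_score(probabilities):
    """Weighted sum of predicted probabilities; missing actions count as 0."""
    return sum(
        weight * probabilities.get(action, 0.0)
        for action, weight in WEIGHTS.items()
    )


# Made-up predictions for two tweets.
friendly = {"like": 0.2, "retweet": 0.05, "reply": 0.02, "reply_engaged_by_author": 0.01}
risky = dict(friendly, report=0.01)

print(tweet_score(friendly) > tweet_score(risky))
```

Even a 1% predicted chance of a report costs 3.69 points here, which wipes out everything the other actions add in this made-up example.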

Question of the week

After reading these, can you look back and relate to how one of your tweets had a great response? Or why you see what you see on your Timeline? Reply to this email with your thoughts, and we can have a discussion. As someone who is new to Twitter, I would be interested in learning more about it.

Blog posts

My blog post S2E7 is coming out next week. I’ll notify you about it next Wednesday :)
Also, by replying to this email, let me know if you’d like to read about a specific gadget/device. And I can write about it.

Gadget of the week

Health: The U-Scan by Withings is a sensor that provides you with significant biometric information by analyzing your pee. You'll need to take a leak on this gadget placed in your toilet bowl. Shaped like a puck, this device performs a daily analysis, determining your pH balance, carbohydrates, and hydration levels. It can even identify any deficiencies in essential vitamins.

Thank you for reading!

Have a nice rest of the week, and take care!
Until next Wednesday,
Chendur
