DEV Community

Cover image for 🤯Twitter Recommendation Algorithm is now open sourced
Hung Vu
Hung Vu

Posted on

🤯Twitter Recommendation Algorithm is now open sourced

This requires a recommendation algorithm to distill the roughly 500 million Tweets posted daily down to a handful of top Tweets that ultimately show up on your device’s For You timeline.

The pipeline above runs approximately 5 billion times per day and completes in under 1.5 seconds on average. A single pipeline execution requires 220 seconds of CPU time, nearly 150x the latency you perceive on the app.

Along side with OpenAI, I personally think this is one of an important moment in the computing community as no one would ever guess a global-scale algorithm such as Twitter's Recommendation becomes open-sourced. Based on their engineer blog post, it is not out of reach to say the code base literally costs hundred thousands if not millions a day to run. How do you feel about this moment?


Twitter's engineer blog

Twitter's Recommendation Algorithm

Twitter Apache Thrift is an open-source, standalone, lightweight, data encoding library. In this blog post, we share the library we built so iOS developers outside Twitter can start using Thrift data.

favicon blog.twitter.com

GitHub repository

GitHub logo twitter / the-algorithm

Source code for Twitter's Recommendation Algorithm

Twitter Recommendation Algorithm

The Twitter Recommendation Algorithm is a set of services and jobs that are responsible for constructing and serving the Home Timeline. For an introduction to how the algorithm works, please refer to our engineering blog. The diagram below illustrates how major services and jobs interconnect.

These are the main components of the Recommendation Algorithm included in this repository:

Type Component Description
Feature SimClusters Community detection and sparse embeddings into those communities.
TwHIN Dense knowledge graph embeddings for Users and Tweets.
trust-and-safety-models Models for detecting NSFW or abusive content.
real-graph Model to predict likelihood of a Twitter User interacting with another User.
tweepcred Page-Rank algorithm for calculating Twitter User reputation.
recos-injector Streaming event processor for building input streams for GraphJet based services.
graph-feature-service Serves graph features for a directed pair of Users (e.g. how many of User A's following liked Tweets from User B).
Candidate Source search-index

Top comments (0)