DEV Community

Hung Vu

Posted on Apr 1, 2023

🤯 Twitter's Recommendation Algorithm is now open source

#webdev #cloud #discuss #watercooler

This requires a recommendation algorithm to distill the roughly 500 million Tweets posted daily down to a handful of top Tweets that ultimately show up on your device’s For You timeline.

The pipeline above runs approximately 5 billion times per day and completes in under 1.5 seconds on average. A single pipeline execution requires 220 seconds of CPU time, nearly 150x the latency you perceive on the app.
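The figures in the quote are easy to sanity-check. Below is a minimal sketch that uses only the numbers quoted above. The variable names are my own, and the "average busy cores" figure is a derived estimate that assumes load is spread evenly across the day, which the blog post does not claim.

```python
# Figures quoted from Twitter's engineering blog post.
RUNS_PER_DAY = 5_000_000_000   # "approximately 5 billion times per day"
LATENCY_S = 1.5                # "under 1.5 seconds on average"
CPU_S_PER_RUN = 220            # "220 seconds of CPU time"

SECONDS_PER_DAY = 24 * 60 * 60

# CPU time relative to perceived latency: the "nearly 150x" in the quote.
cpu_to_latency_ratio = CPU_S_PER_RUN / LATENCY_S

# Total CPU time consumed per day across all pipeline executions.
cpu_seconds_per_day = RUNS_PER_DAY * CPU_S_PER_RUN

# Assumption: load is perfectly even over the day, so this is an average,
# not a peak. Read it as "CPU cores kept busy around the clock".
average_busy_cores = cpu_seconds_per_day / SECONDS_PER_DAY

print(f"CPU time vs. latency: {cpu_to_latency_ratio:.1f}x")
print(f"CPU-seconds per day:  {cpu_seconds_per_day:.2e}")
print(f"Average busy cores:   {average_busy_cores:,.0f}")
```

The ratio works out to about 147, which matches the "nearly 150x" in the quote. Under the even-load assumption, the daily total is equivalent to roughly 12.7 million CPU cores running non-stop.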

Alongside OpenAI, I personally think this is one of the important moments in the computing community, as no one would have guessed that a global-scale algorithm such as Twitter's recommendation system would ever be open sourced. Based on their engineering blog post, it is not a stretch to say the code base costs hundreds of thousands, if not millions, a day to run. How do you feel about this moment?
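To show where a cost guess like that could come from, here is a back-of-envelope sketch. Only the run count and the CPU time per run come from the blog post. The price per CPU-hour is a hypothetical placeholder, not a real Twitter or cloud-provider figure, and `estimate_daily_cost` is a name made up for this example.

```python
# Figures quoted from Twitter's engineering blog post.
RUNS_PER_DAY = 5_000_000_000   # pipeline executions per day
CPU_S_PER_RUN = 220            # CPU-seconds per execution


def estimate_daily_cost(price_per_cpu_hour: float) -> float:
    """Return a rough daily compute cost, in the same currency as the price.

    price_per_cpu_hour is a HYPOTHETICAL input, not a figure from Twitter.
    """
    cpu_hours_per_day = RUNS_PER_DAY * CPU_S_PER_RUN / 3600
    return cpu_hours_per_day * price_per_cpu_hour


# Placeholder prices, chosen only to show how the estimate scales.
for price in (0.001, 0.01):
    print(f"{price:.3f} per CPU-hour -> {estimate_daily_cost(price):,.0f} per day")
```

With these placeholder prices the estimate lands at roughly 0.3 million and 3 million per day. The result scales linearly with whatever price you plug in, so treat it as an order-of-magnitude illustration rather than a real bill.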

Twitter's engineering blog

Twitter's Recommendation Algorithm

blog.twitter.com

GitHub repository

twitter / the-algorithm

Source code for Twitter's Recommendation Algorithm

Twitter Recommendation Algorithm

The Twitter Recommendation Algorithm is a set of services and jobs that are responsible for constructing and serving the Home Timeline. For an introduction to how the algorithm works, please refer to our engineering blog. The diagram below illustrates how major services and jobs interconnect.

These are the main components of the Recommendation Algorithm included in this repository:

Type	Component	Description
Feature	SimClusters	Community detection and sparse embeddings into those communities.
Feature	TwHIN	Dense knowledge graph embeddings for Users and Tweets.
Feature	trust-and-safety-models	Models for detecting NSFW or abusive content.
Feature	real-graph	Model to predict likelihood of a Twitter User interacting with another User.
Feature	tweepcred	Page-Rank algorithm for calculating Twitter User reputation.
Feature	recos-injector	Streaming event processor for building input streams for GraphJet based services.
Feature	graph-feature-service	Serves graph features for a directed pair of Users (e.g. how many of User A's following liked Tweets from User B).
Candidate Source	search-index

…
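The README excerpt describes tweepcred as a PageRank-based reputation score. To give a feel for the idea, here is a textbook power-iteration PageRank on a tiny follow graph. This is a generic sketch, not Twitter's implementation: the function name, the damping factor of 0.85, the iteration count, and the example users are all assumptions made for illustration.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank (not Twitter's tweepcred code).

    follows maps each user to the list of users they follow; an edge
    passes reputation from the follower to the followed account.
    """
    users = sorted(set(follows) | {v for targets in follows.values() for v in targets})
    n = len(users)
    rank = {u: 1.0 / n for u in users}

    for _ in range(iterations):
        # Every user starts each round with the "random jump" share.
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                # Split this user's rank evenly among the accounts they follow.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads rank evenly to everyone.
                for v in users:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


# Hypothetical follow graph, invented for illustration.
graph = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"], "dave": ["carol"]}
scores = pagerank(graph)
for user, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{user}: {score:.3f}")
```

In this toy graph, carol is followed by everyone else and ends up with the highest score. alice comes second because her only follower is the highly ranked carol, which is the core intuition behind using PageRank for reputation.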
