I used to develop apps. I still do, but I used to, too
Back in 2007/2008, I learned Ruby on Rails and developed two prototype sites that didn't end up in production. Since then, I have done extensive work on non-Ruby, non-Rails server applications and learned enough about Android and iOS apps to manage the development of mobile apps in my current role.
I never touched Ruby on Rails again...until @anshbansal asked a question that I had asked myself a few times before.
The following is my deep dive into the dev.to codebase to answer this question. There are probably a few things wrong; please point them out in the comments so I can correct them. Thank you.
Start at the beginning
And it doesn't get much earlier than the root route
root "stories#index"
Taking control
Rails follows a Model View Controller (MVC) architecture. When you ask dev.to to show you the root page, it will ask the stories controller to run the index action.
There, it sets up a bunch of state and then renders the articles/index template:
render template: "articles/index"
Show me the stories
If you inspect your dev.to home screen, you'll notice all the articles/stories are listed within an articles-list div. You can find it in the articles/index view as expected.
And here's where we start to see how the feed is populated.
OK, first show me the featured story
The first story in the article list is a featured story.
The algorithm to get the featured story for a logged in user comes from the stories controller and the articles/index view. I've simplified it by substituting some variables and reorganizing some statements.
@stories = Article.published.limited_column_select.page(1).per(35)
@stories = @stories.
  where("score > ? OR featured = ?", 9, true).
  order("hotness_score DESC")
offset = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          1, 1, 1, 1, 2, 2, 2, 3, 3, 4,
          5, 6, 7, 8, 9, 10, 11].sample # random offset, weighted more towards zero
@stories = @stories.offset(offset)
@featured_story = @stories.where.not(main_image: nil).first&.decorate || Article.new
In English:
- Fetch a collection of stories that score above 9 or are featured
- Order them, starting with the "hottest" one
- Randomly skip the first 0 to 11 stories, weighted more towards 0
- The featured story is the first story that has a main image
Leaving how score, featured, and hotness are determined as an exercise for the reader
Notice the featured article has nothing to do with which people, organizations, or tags you follow.
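To put numbers on "weighted more towards zero", here is a small sketch that counts how often each offset appears in the array from the controller snippet. It assumes only that Ruby's sample picks each array element with equal probability; the function name offsetProbabilities is made up for this illustration and is not part of the dev.to codebase.

```javascript
// Offsets copied from the controller snippet above.
const offsets = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                 1, 1, 1, 1, 2, 2, 2, 3, 3, 4,
                 5, 6, 7, 8, 9, 10, 11];

// Hypothetical helper: when sampling uniformly from the array, the
// probability of each offset is (times it appears) / (array length).
function offsetProbabilities(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  const probabilities = new Map();
  for (const [offset, count] of counts) {
    probabilities.set(offset, count / values.length);
  }
  return probabilities;
}

const probabilities = offsetProbabilities(offsets);
for (const [offset, p] of probabilities) {
  console.log(`skip ${offset}: ${(p * 100).toFixed(1)}%`);
}
```

The array has 27 entries and 10 of them are zero, so roughly 37% of the time nothing is skipped, while each offset from 4 to 11 comes up only 1 time in 27.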
Now show me the rest of the stories?
After rendering the featured story, the articles/index view creates a substories div and then renders the stories/main_stories_feed partial
<%= render "stories/main_stories_feed" %>
These are not the divs you are looking for
I was scratching my head while reading through the _main_stories_feed partial
It populates the data attributes of a new-articles-object div and a home-articles-object div, then a bunch of other divs that have no contents. And the divs I do see when inspecting the home screen have the single-article single-article-small-pic class, but don't look like what's in this file.
Evil action-at-a-distance like this can only mean one thing: JavaScript
Nobody expects the Spanish Inquisition
Searching the repo for new-articles-object and home-articles-object, we find them both in initializeFetchFollowedArticles, called very early when a page is initialized.
And there is a lot of logic here which I did not expect.
The new stories are not the old stories
The stories controller populated the @stories collection used for the featured story. It is also used to populate the data attributes of the home-articles-object div. But that comes next, not now.
Instead, the first stories we see after the featured article are populated from a query directly in the view.
@new_stories = Article.published.
  where("published_at > ? AND score > ?", rand(2..6).hours.ago, -15).
  limited_column_select.
  order("published_at DESC").
  limit(rand(15..80))
In English:
- Fetch a collection of stories that have been published some time in the last 2 to 6 hours and score above -15
- Order them by most recent first
- Return the first 15 to 80 of them
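The steps above can be sketched against a plain array instead of the database. This is only an illustration: pickNewStories, randomIntInclusive, and the publishedAt and score fields are hypothetical names, every article in the array is assumed to be published already, and the current time is passed in so the result is repeatable.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical in-memory stand-in for the query in the view.
// `now` is a timestamp in milliseconds; `publishedAt` uses the same unit.
function pickNewStories(articles, now, hoursBack, maxCount) {
  const cutoff = now - hoursBack * HOUR_MS;
  return articles
    .filter((article) => article.publishedAt > cutoff && article.score > -15)
    .sort((a, b) => b.publishedAt - a.publishedAt) // most recent first
    .slice(0, maxCount);
}

// Mirrors Ruby's rand(min..max) for integer ranges: both ends are included.
function randomIntInclusive(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

const now = Date.UTC(2019, 8, 20, 12, 0, 0);
const articles = [
  { id: 1, publishedAt: now - 1 * HOUR_MS, score: 0 },
  { id: 2, publishedAt: now - 3 * HOUR_MS, score: 5 },
  { id: 3, publishedAt: now - 5 * HOUR_MS, score: 20 },   // too old for a 4 hour window
  { id: 4, publishedAt: now - 0.5 * HOUR_MS, score: -20 }, // score too low
  { id: 5, publishedAt: now - 2 * HOUR_MS, score: -15 },   // must be strictly above -15
];

const fixedWindow = pickNewStories(articles, now, 4, 15);
console.log(fixedWindow.map((article) => article.id));
```

To imitate the randomness of the real query, the last two arguments could be randomIntInclusive(2, 6) and randomIntInclusive(15, 80), which means two page loads a few seconds apart can start from quite different candidate lists.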
Then the JavaScript function insertNewArticles takes over:
articlesJSON.forEach(function(article){
  var articlePoints = 0
  var containsUserID = findOne([article.user_id], user.followed_user_ids || [])
  var containsOrganizationID = findOne([article.organization_id], user.followed_organization_ids || [])
  var intersectedTags = intersect_arrays(user.followed_tag_names, article.cached_tag_list_array)
  var followedPoints = 1
  var experienceDifference = Math.abs(article['experience_level_rating'] - user.experience_level || 5)
  var containsPreferredLanguage = findOne([article.language || 'en'], user.preferred_languages_array || ['en']);
  JSON.parse(user.followed_tags).map(function(tag) {
    if (intersectedTags.includes(tag.name)) {
      followedPoints = followedPoints + tag.points
    }
  })
  articlePoints = articlePoints + (followedPoints*2) + article.positive_reactions_count
  if (containsUserID || article.user_id === user.id) {
    articlePoints = articlePoints + 16
  }
  if (containsOrganizationID) {
    articlePoints = articlePoints + 16
  }
  if (containsPreferredLanguage) {
    articlePoints = articlePoints + 1
  } else {
    articlePoints = articlePoints - 10
  }
  var rand = Math.random();
  if (rand < 0.3) {
    articlePoints = articlePoints + 3
  } else if (rand < 0.6) {
    articlePoints = articlePoints + 6
  }
  articlePoints = articlePoints - (experienceDifference/2);
  article['points'] = articlePoints
});
var sortedArticles = articlesJSON.sort(function(a, b) {
  return b.points - a.points;
});
sortedArticles.forEach(function(article){
  var parent = insertPlace.parentNode;
  if ( article.points > 12 && !document.getElementById("article-link-"+article.id) ) {
    insertArticle(article,parent,insertPlace);
  }
});
In English:
- Give each article 0 points to start off with
- Start from 1, add the weight of each tag (which can also be negative) that the user follows and the article is tagged with, then double the result
- Add to that the number of positive reactions the article currently has
- If the user follows the article's author, or is the article's author, add 16 points
- If the user follows the article's organization, add 16 points
- If the article is written in one of the user's preferred languages, add 1 point; otherwise, subtract 10 points
- Randomly give the article an extra 3 points (30% chance), 6 points (30% chance), or nothing (40% chance)
- Subtract half the absolute difference between the article's experience level and the user's experience level
- Order the articles by most points first
- If the article has more than 12 points, show it to the user
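As a worked example, here is a condensed sketch of the scoring for a single article. It is a simplification under stated assumptions, not the dev.to function itself: scoreArticle, the camelCase field names, and the sample user and articles are all made up, and the random value is passed in as an argument so the arithmetic is repeatable. One detail it deliberately mirrors: in JavaScript, subtraction binds tighter than ||, so the quoted expression `a - b || 5` is evaluated as `(a - b) || 5`, which turns a difference of zero into 5.

```javascript
const SHOW_THRESHOLD = 12; // the quoted code shows articles with points > 12

// Hypothetical helper following the quoted scoring steps for one article.
function scoreArticle(article, user, randomValue) {
  // The quoted code starts followedPoints at 1, not 0.
  let followedPoints = 1;
  for (const tag of user.followedTags || []) {
    if (article.tags.includes(tag.name)) {
      followedPoints += tag.points; // tag weights may be negative
    }
  }
  let points = followedPoints * 2 + article.positiveReactionsCount;

  const followsAuthor = (user.followedUserIds || []).includes(article.userId);
  if (followsAuthor || article.userId === user.id) points += 16;

  const followsOrg = (user.followedOrganizationIds || []).includes(article.organizationId);
  if (followsOrg) points += 16;

  const languages = user.preferredLanguages || ['en'];
  points += languages.includes(article.language || 'en') ? 1 : -10;

  // 30% chance of +3, 30% chance of +6, 40% chance of no bonus.
  if (randomValue < 0.3) points += 3;
  else if (randomValue < 0.6) points += 6;

  // Mirrors the quoted precedence: (a - b) || 5, so a zero difference becomes 5.
  const experienceDifference =
    Math.abs((article.experienceLevelRating - user.experienceLevel) || 5);
  return points - experienceDifference / 2;
}

const user = {
  id: 1,
  followedTags: [{ name: 'rails', points: 2 }, { name: 'php', points: -3 }],
  followedUserIds: [7],
  followedOrganizationIds: [],
  preferredLanguages: ['en'],
  experienceLevel: 5,
};
const followedAuthorPost = {
  userId: 7, organizationId: null, tags: ['rails', 'ruby'],
  positiveReactionsCount: 4, language: 'en', experienceLevelRating: 3,
};
const dislikedTagPost = {
  userId: 9, organizationId: null, tags: ['php'],
  positiveReactionsCount: 10, language: 'en', experienceLevelRating: 5,
};

const followedScore = scoreArticle(followedAuthorPost, user, 0.9);
const dislikedScore = scoreArticle(dislikedTagPost, user, 0.1);
console.log(followedScore, followedScore > SHOW_THRESHOLD);
console.log(dislikedScore, dislikedScore > SHOW_THRESHOLD);
```

The first article works out as (1 + 2) * 2 + 4 = 10, plus 16 for the followed author, plus 1 for language, no random bonus, minus 2 / 2 for experience: 26 points, so it is shown. The second works out as (1 - 3) * 2 + 10 = 6, plus 1 for language, plus 3 from the random bonus, minus 5 / 2 because the zero experience difference falls back to 5: 7.5 points, so it stays hidden even with ten positive reactions.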
What about the rest?
The next batch of initialized articles comes from the same collection we got the featured article from and is processed by a new (but familiar) algorithm in insertTopArticles.
When you get to the bottom of that list, articles are populated from an algoliasearch index of ordered articles. The definition of that index is found in the Article model.
Finally, scrolling kicks in (you can find it in initScrolling.js.erb) and populates more articles from the algoliasearch index.
Leaving the details of these as an exercise for the reader
TL;DR
For the first article in the list:
- Fetch a collection of stories that score above 9 or are featured
- Order them, starting with the "hottest" one
- Randomly skip the first 0 to 11 stories, weighted more towards 0
- The featured story is the first story that has a main image
For the next batch of articles:
- Fetch a collection of stories that have been published some time in the last 2 to 6 hours and score above -15
- Order them by most recent first
- Return the first 15 to 80 of them
- Give each article 0 points to start off with
- Start from 1, add the weight of each tag (which can also be negative) that the user follows and the article is tagged with, then double the result
- Add to that the number of positive reactions the article currently has
- If the user follows the article's author, or is the article's author, add 16 points
- If the user follows the article's organization, add 16 points
- If the article is written in one of the user's preferred languages, add 1 point; otherwise, subtract 10 points
- Randomly give the article an extra 3 points (30% chance), 6 points (30% chance), or nothing (40% chance)
- Subtract half the absolute difference between the article's experience level and the user's experience level
- Order the articles by most points first
- If the article has more than 12 points, show it to the user
If you've scrolled past all of those,
- Using the same collection the featured article came from
- Process with a similar but different algorithm as the previous batch
And, finally, articles are populated from the algoliasearch index, with more loaded as you scroll.
Closing remarks
This could change at any time. For example, on 2019-09-19, @ben merged a PR to add more variation to home feed. All links to GitHub are to the commit that I saw which was in master at the time of writing but, by the time you read this, master has probably moved on.
Top comments (24)
This is really timely because we've just begun the phase of overhauling this. @nickytonline and @joshpuetz should check this out 😄
Ha, really glad I added the disclaimer
Really happy you all keep iterating on every aspect, with community involvement 🙌, to keep improving
Reminds me of Jose Aguinaga's famous article about JS development.
First line: No new frameworks were made during the writing.
Top comment below: I highly doubt that.
Thanks for sharing Justin. 🔥
Someone should do this about YouTube recommendations algo.
Someone should make an article about proprietary software :p
I set up an RSS feed from my website and was using that to publish to DEV, but since I don't have timestamps set up properly on my RSS feed when it published to DEV it would immediately show as published "20 hours ago". Once I started going in and manually modifying the timestamps it seems like more people have been viewing my posts. I guess this is why!
Definitely. This reminds me of a PR I saw not too long ago
Ability to backdate a post #3455
Is your feature request related to a problem? Please describe. Unable to change publish date. I personally wish to back date a post, but there is no way to set the publish date (or time) for a post.
Describe the solution you'd like Add a custom variable for publish_date
Describe alternatives you've considered Time travel?
Additional context In lieu of being able to delete/edit comments on an old post I have duplicated and republished it as a new post, but the date does not/cannot reflect the origin publish date.
Semi related to #3274 and #1363
And @jess posed a great question
Just wanted to add another thanks for this deep dive @piannaf as I've been referencing it over the past day. First up is getting all of these pieces in the same place: while technically someone could change the feed algorithm right now, one would need to change code in multiple places. That's part of what we're trying to improve!
Wow, thanks! That's an unintended side-effect I'm really glad has been beneficial.
This is awesome Justin! Thank you for the deep dive!
Thank you for taking the time to answer the question in such detail. I believe many users were expecting to see only the tags and users they follow, in chronological order, perhaps like an RSS feed.
Yeah, when I first joined, that's what I expected "feed" to mean. Pretty quickly discovered that wasn't the case. But I've been happy with the recommendations because I like seeing things outside my chosen bubble from time to time.
I can understand people getting upset, though, if they put -100 on a tag and still saw it anywhere
Thanks for sharing, @justin
Thanks for sharing this post. I have a better understanding of how posts are selected for display.
This is interesting!
Really cool thank you for posting this