There's one major drawback to a "feed-oriented" platform like DEV: posts have an incredibly short social lifespan. That is, posts drop out of the awareness of the majority of users after only a few hours (maybe a day or two if they go viral) and then fade into relative obscurity unless manually promoted.
"Duplicate" Content
Understand, the thing I don't want is the Dupe Hammer Of Dread™ that we all loathe on StackOverflow to ever become a thing here on DEV. "Duplicate" is such a weird term anyway; there may be a hundred articles about Angular vs. React, but each one is going to provide a different take. "Canonical content" is the enemy of diversity.
However, when you have a hundred articles explaining the difference between `==` and `is` in Python, you have to question how much of that is a genuinely unique take, and how much is posted because the author is led to believe it hasn't been covered before.
Discussions seem to be particularly vulnerable to duplication: yet another conversation about music while coding or your favorite VS Code extension. Yet this duplication is presently justified on one point: there aren't likely to be new answers to old #discuss posts!
A "Discoverability" Problem
This isn't so much a social problem as a UX problem. As I've established, there's plenty of "duplicate" content that is perfectly justified as a high-value contribution to the community. I am only concerned that some "duplicate" content exists purely because prior identical posts weren't discoverable!
That is, an author should post because she or he wants to post an article or discussion, not purely because they can't find what they're looking for, or because the platform led them to passively assume they were the first to think about writing on the topic of `is` vs `==`.
Similarly, we shouldn't need nineteen practically identical "what music are you listening to while coding" discussions, because everyone should be able to find the one that was originally there!
In other words, the platform should encourage so-called duplicate content to actually contribute value, and not merely echo what's been said a hundred times. A new take, an article written to reinforce the author's learning, new or additional data, a rebuttal to a prior post, an explicitly renewed discussion (what's your favorite extension in 2020?)...all of these contribute value.
Interestingly, I think the author is usually the best judge of whether their post contributes value, at least assuming said author is aware of similar content on the platform. To accomplish this, we need to fix discoverability.
Possible Solutions?
Similar Articles
One feature we could borrow from StackOverflow's playbook might be to list "similar posts" on the compose page. This could update periodically based primarily on the frontmatter, and secondarily on the words in the article. This gives the author the information they need to decide if their post really contributes value, with the positive side-effect of directing them to posts and authors that may interest them.
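To make the "primarily frontmatter, secondarily words" idea concrete, here is a minimal sketch of how such a ranking might work. Everything in it is an assumption for illustration: the weights, the dictionary shape of a post, and the function names are hypothetical, and this is not how DEV or StackOverflow actually compute similarity.

```python
import re

# Assumed weights: frontmatter tags count most, body words count less.
TAG_WEIGHT = 0.7
WORD_WEIGHT = 0.3


def jaccard(a: set, b: set) -> float:
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def words(text: str) -> set:
    """Lowercased word set. A real system would also drop stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(draft: dict, post: dict) -> float:
    """Weighted score: tag overlap first, word overlap second."""
    tag_score = jaccard(set(draft["tags"]), set(post["tags"]))
    word_score = jaccard(words(draft["body"]), words(post["body"]))
    return TAG_WEIGHT * tag_score + WORD_WEIGHT * word_score


def similar_posts(draft: dict, posts: list, limit: int = 3) -> list:
    """Titles of the most similar existing posts, best match first."""
    scored = [(similarity(draft, p), p["title"]) for p in posts]
    scored = [s for s in scored if s[0] > 0]  # hide unrelated posts
    scored.sort(key=lambda s: s[0], reverse=True)
    return [title for _, title in scored[:limit]]


# Hypothetical sample data for the compose page.
draft = {"tags": ["python", "beginners"], "body": "is vs equals in python"}
posts = [
    {"title": "A", "tags": ["python", "beginners"],
     "body": "identity and equality in python"},
    {"title": "B", "tags": ["javascript"], "body": "array methods again"},
    {"title": "C", "tags": ["python"], "body": "decorators"},
]
suggestions = similar_posts(draft, posts)
```

The list is only a suggestion shown to the author; nothing here blocks a post, which keeps the decision about value with the author.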
Bumping
Another useful feature might be to allow "bumping" posts. If an author realizes "Oh, I don't need to post another 'What music are you listening to' discussion", she should still be able to renew the conversation by going to the post she likes and bumping it, pushing it to the top of the feed just as if she had posted new content. Then, that post would get attention from new users who may never have seen it before.
This would also give users the ability to give posts they like another shot at getting noticed. There are some real gems buried on DEV that I would deeply love to push into the spotlight, even for a few minutes. At the moment, though, I'm forced to use more "informal" means, such as linking to the content in a comment or a Twitter post. There's nothing wrong with that, of course, but my reach is limited to only those who follow me or happen to see my comment.
It would be necessary to prevent abuse of this feature, of course, so there should be a limit on how many "bumps" a user can perform in a given period of time, and perhaps how many times a user may bump any one post. It may also be necessary to make bumping your own content more "expensive" than bumping someone else's.
This would also have no effect on "Latest".
(Mod Note: Moderator downvotes should disqualify a post from bumping.)
EDIT: It would also be excellent if bumping a post shared it with your followers, in the same way your own posts are. This would allow popular authors with thousands of followers to easily "boost" more obscure authors and posts, something which is difficult at present.
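The abuse-prevention rules above could be sketched roughly like this. All the numbers, the `BumpPolicy` class, and the post fields (`id`, `author`, `mod_downvoted`) are hypothetical choices for illustration; nothing like this exists on DEV as far as I know.

```python
from collections import defaultdict

# All limits below are assumptions, picked only to make the rules concrete.
WINDOW_SECONDS = 7 * 24 * 3600  # period: one week
BUDGET_PER_WINDOW = 5           # bump "points" per user per period
MAX_BUMPS_PER_POST = 1          # times one user may bump any one post
COST_OTHERS = 1                 # cost of bumping someone else's post
COST_OWN = 3                    # bumping your own post is more "expensive"


class BumpPolicy:
    def __init__(self):
        self.spent = defaultdict(list)    # user -> [(timestamp, cost)]
        self.per_post = defaultdict(int)  # (user, post id) -> bump count

    def try_bump(self, user: str, post: dict, now: float) -> bool:
        """Record the bump and return True if the rules allow it."""
        if post["mod_downvoted"]:
            return False  # moderator downvotes disqualify a post
        if self.per_post[(user, post["id"])] >= MAX_BUMPS_PER_POST:
            return False  # this user already bumped this post
        cost = COST_OWN if post["author"] == user else COST_OTHERS
        # Only bumps inside the current window count against the budget.
        recent = [(t, c) for t, c in self.spent[user]
                  if now - t < WINDOW_SECONDS]
        if sum(c for _, c in recent) + cost > BUDGET_PER_WINDOW:
            return False  # not enough budget left this period
        recent.append((now, cost))
        self.spent[user] = recent
        self.per_post[(user, post["id"])] += 1
        return True


# Hypothetical usage: "ana" bumps a few posts over time.
policy = BumpPolicy()
own = {"id": 1, "author": "ana", "mod_downvoted": False}
other = {"id": 2, "author": "ben", "mod_downvoted": False}
flagged = {"id": 3, "author": "ben", "mod_downvoted": True}
results = [
    policy.try_bump("ana", other, now=0),     # allowed, costs 1
    policy.try_bump("ana", other, now=10),    # refused: same post twice
    policy.try_bump("ana", flagged, now=20),  # refused: mod downvoted
    policy.try_bump("ana", own, now=30),      # allowed, costs 3
]
```

Using a point budget instead of a flat count is just one way to make self-bumps cost more; a separate, lower cap for your own posts would work too.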
What Do You Think?
That's the problem and some solutions as I see it. What are your thoughts?
Top comments (28)
If I comment on a post that has substantially the same message - or a disagreement with - a post I've read before and can hunt down, then I sometimes reference that post in my comment.
In fact:
How should we handle duplicate content on dev.to?
Ben Sinclair ・ Jun 7 '18 ・ 1 min read
I'm conscious that people might take it as me criticising them for duplicating content, but I really mean it to help spread the discussion. If one post about, say, the Pythonic `is` garners a lot of interaction, while an existing one with more to say doesn't, then everyone's losing out.
It's a social problem in that popular (heavily followed, veteran) people can have their posts read a lot more than newcomers simply because they pop up in more people's feeds, and in turn they get more clicky action.
It's a time-of-day problem for some as well, where people post at a time convenient for them that might not coincide with a lot of readers being online.
And it's a popularity problem, too. I'm sure you've seen it happen, where people who have a lot of regular readers post something run-of-the-mill like, "what's your favourite text editor?" that ends up dominating the feed because it's easy to answer, never mind that it's asked multiple times per day and most people don't get any response.
There's some luck involved, some momentum of popularity, and some algorithms that we can influence.
I like the idea of bumping. I like the idea of related posts.
Something I also like is that perhaps the related posts could be partially algorithmic and also partly influenced by moderators. Where you get to rate the experience level of a post or suggest a tweet, you could also get to suggest a related post. Curation of content could really work!
That "popularity bias" may actually lend itself to a further feature of bumping: if an author bumps a post, it's shared to his or her followers in the same manner as posting! That way, if I really like an article by a newcomer, and I bump it, it automatically shows up in the notification area for my 15K followers.
I suppose that would make bumping not entirely dissimilar to "retweets", but that seems like a good thing. New and less popular authors get promoted, and old posts are resurrected periodically. Continued activity (comments, hearts/unicorns, bumping) by other users would then have a juggernaut effect, increasing the reach of otherwise obscure posts via notifications, feed ranking, and organic sharing.
StackOverflow wants to create a repository of knowledge; that's why they are so annoyingly eager to close as duplicate. But DEV seems more interested in building community and fostering discussion, so closing duplicates is indeed a bad fit here.
In a 'normal' social group, if someone new to the group brings up a topic that the regulars have already discussed multiple times, I think the group will entertain the question for some time, someone will point out the summary from past discussions, and most will not engage. So, um, just like what happens in DEV now :)
But yes, agree 100% with you: if the new person is actually looking for information instead of just wanting to strike up a conversation, a good search engine, 'similar to' widget, bumping etc. will be very useful.
I think the best solution(s) depend on what we think is important for readers. This problem isn't really that important for authors: being made aware of duplicate content won't stop those who don't care, and you might already be aware of older posts as a reader.
The reason I say any solution should focus on the reader is that there are always going to be more readers than authors, and expanding the platform relies more on getting more readers, with a few turning into authors, than on having too many authors and only a few readers, as then things turn into a "shouting match".
I see the current feed as being focused on the new, at the cost of the old: newer articles are promoted over finding older articles. This incentivizes more "new" articles, causing more or less what we see today.
So the "bump solution" relies on authors being active in maintaining their posts (up to a point), before they are "lost to the void" just like before. It will also incentivize the most active posters who maintain their posts, which may or may not be good for the reader.
The feed could take an approach similar to a forum/Reddit where active posts are promoted higher, but this has the side-effect of inherently promoting topics that are more active and/or "heated".
The feed could also take a more "active" approach in finding older content that is more relevant to the reader, but this requires the feed to know what the reader is interested in, and what topics they would want to read. (hello big data parsing or big $$$) It already kind of does that with "Another post you may like" features, and "classics" that are promoted on the bottom of the page in most cases.
Maybe the solution is as simple as leveraging existing promotions to older popular posts more heavily, or stronger "reader" filters?
It's an interesting question, without a simple answer, that is for sure tho haha :)
I do agree with you that the quality of content on the main feed here is a little disappointing. A lot of repetitive articles.
I think any attempts to manufacture a 'good' feed are bound to end up in tragedy for the whole community. It's your job as the feed consumer to filter out things you aren't interested in, and it's the job of the website developer to give you the tools to do that effectively. To surface good content you either have to have a moderator that removes the bad content, or a curator that picks and presents good content. You can either have a person do that, or have the user do that. Both take work.
And, see, that's why I think the authors and readers literally should be a part of that process (ergo my two suggestions.) "Enforced" quality doesn't lend to true diversity or inclusiveness, but a lack of quality isn't much better. If we're all empowered to push quality material to the front, then that quality is according to the community as a whole. As it stands, it's more of a "visibility by default" thing.
Slightly unrelated: I often get people not fully reading my posts and racing down to the comments to post something off topic. Is there a solution, or should I just write better content? 🤷‍♂️ That and spelling mistake corrections: I'm dyslexic, and I don't actually feel spelling is all that important for the style of content I write, stream of consciousness posts.
I'm dyslexic too, but I find that spelling and grammar matter! For one, by leaving those in, you're making it harder for other readers, especially other users who are dyslexic, ESL, or have reading difficulties. Quality matters.
Proofreading need not be at odds with 'stream of consciousness'. You simply need to form the habit of fixing errors as you go. The LanguageTool add-on can help with that.
Damn, you're right. Still, I wouldn't mind a "nice article, but you made a mistake"; usually it's quite a blunt comment.
Some (many?) people could stand to be more tactful, but at the risk of sounding like I'm excusing them (I'm not), we are programmers. Code has to be exact in spelling, grammar, and syntax, or it won't work as desired. Thus, proficiency with coding usually comes with a knee-jerk reaction towards typos and errors, a la code review "this is wrong. fix it."
I've never found a twitchier mob of English pedants than programmers, even compared to professional authors. Late nights of debugging have entrained us to fear errors, and even more, to distrust coders who regularly produce errors: "If you can't resolve a basic their/there/they're collision, how can I expect you to comprehend link resolution order?"
It's sound logic to be sure, but it could use more than a tablespoon of basic human decency to go along with it.
(P.S. If you, dear reader, are one of the folks ripping into people over typos, this isn't an excuse. Learn tact.)
Array methods ... Again and again and again. That's what's in my feed quite often. Jason, I couldn't agree more; more like this would be a good idea.
DAE optional chaining?!
Maybe a button "not interested in Optional Chaining" would solve all of this
Yep?
Well, the current voting features already cover things like that, although maybe we could add the improved "related" side-pane to the post itself, in addition to the compose page.
I wouldn't want to violate individual articles by "merging" however. For example, I'm working hard on my Dead Simple Python series, so I wouldn't want someone else to be able to mix in other content. If I want to cite another article myself, I can include the quote manually.
Like I said, the "dupe hammer" itself is dangerous here. It can quickly devolve into silencing one voice over another.
Both are valuable: bumping everything new, as well as focusing on adding value and respecting the original posts. I think, however, the strict uniqueness thing is kind of flawed, which you can see on SO questions like this: stackoverflow.com/questions/576732...
Allowing newcomers to write about the very same thing someone else wrote on their own journey before gives them at least the possibility to learn and reflect and at the same time reach out to people and get honest feedback. So I am pro bumping :-)
Really great stuff, Jason. I actually had the same sort of complaint a while back about this platform, and I even mentioned it as an issue on GitHub:
Recycle Old Content #3453
Is your feature request related to a problem? Please describe.
This is mostly a selfish ask, but I think it would be cool if there was an option to discover older content. Right now, the main homepage allows you to sort content by latest and greatest, but there don't seem to be any opportunities for good content to shine if it never gets seen in the first place.
In other words, unless an article gets overwhelmingly popular, I feel like it sort of disappears into the void like an old Tweet. It would be nice if articles had an opportunity to resurface as they do take a lot longer to write than a Tweet.
Describe the solution you'd like
I know that dev.to displays related content under articles, but I think I'd be interested in a tab or widget that allowed you to discover older articles. For instance, GitHub has this little discover repositories section on the main dashboard. I don't know how it works (probably based on trending repos), but I'd love something like that as both a reader and an author—especially if those recommendations were based on your interests.
Describe alternatives you've considered
Honestly, I haven't thought a lot about how this would work, and I couldn't find any similar suggestions in the current list of issues. If anyone has any ideas, I'd welcome them!
Additional context
Here's the GitHub feature I was referencing:
But, I never had a solution for it. I like some of your ideas!
I like the chaos of chronological posts, much like Tumblr, but yes, I agree there's a lot of duplicate content, and I wish there were a feature to assist better discoverability of pre-existing articles while still allowing frequent opinion ones ("How I got into tech"/"My fave extensions") to be given space.
I would like to point out this community is much more friendly than StackOverflow and I'd rather it remain so. I'm into the related-content idea, or even curated lists like on Twitter.