DEV Community

Chun Fei Lung

How do you hide low-quality tweets from Twitter searches?

I created my current Twitter account in 2009. Back then, the service was still relatively new and no one really knew what to use it for. Consequently, nearly all of the “content” produced by individuals was crap: most people probably used it to share dumb status updates about literally everything, as if they were trying to implement some sort of real-life version of event sourcing.

Things have improved a lot since then. There’s a lot more interesting content on Twitter, especially for developers… provided that you can find it.

You can, of course, follow individual accounts or lists created by other users. Or you could subscribe to certain topics of interest.

Tweets about specific topics from casual users (whose tweets tend to get very few likes and retweets) can be found using Twitter’s search functionality. It works for the most part, but it sorely lacks an easy way to filter out (what I think are) uninteresting tweets.

Because everyone probably has their own preferences and definitions of “good” and “bad” tweets, here’s what I mean by uninteresting tweets.

Tweets that only show up in search results because they contain an obscene number of hashtags that aren’t relevant at all:

As far as I know, there’s no easy way to filter out tweets that contain more hashtags than actual content.
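Since the search itself can't do this, any such check would have to run on tweets that have already been fetched, for example in a userscript or a custom client. Here is a minimal sketch of what that check could look like. The function names and the 0.5 threshold are my own assumptions for illustration, not anything Twitter or TweetDeck provides:

```python
import re

# Hypothetical helpers: not part of Twitter, TweetDeck, or any library.
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def hashtag_ratio(text: str) -> float:
    """Return the share of whitespace-separated tokens that are hashtags."""
    # Links are dropped first so that they count as neither content nor hashtags.
    tokens = URL.sub("", text).split()
    if not tokens:
        return 0.0
    # Trailing punctuation is stripped so that "#php," still counts as a hashtag.
    hashtags = [t for t in tokens if HASHTAG.fullmatch(t.strip(".,!?:;"))]
    return len(hashtags) / len(tokens)


def is_hashtag_stuffed(text: str, threshold: float = 0.5) -> bool:
    """Flag tweets in which more than `threshold` of the tokens are hashtags."""
    return hashtag_ratio(text) > threshold
```

With this sketch, a tweet such as "Sale! #php #java #python #ml #iot" would be flagged, while "Learning #php today" would not. The threshold is arbitrary and would need tuning against real search results.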

Keywords don’t have to be hashtags, however. Most people who regularly search for tweets about a popular programming language will likely have seen something like this:

It gets worse when a programming language uses a fairly generic name that has other more common meanings, like PHP.

I wish there were an easy way to filter out accounts from the Philippines (especially those with Korean avatars 😅), as the Philippine peso is also abbreviated as PHP:

While you can filter by tweet location, it doesn’t seem possible to exclude locations. Also, very few tweets actually have location data.

Then there are also very personal tweets about partial hospitalisation programmes (also PHP) for patients with mental illnesses. I’m not providing an example of such a tweet here for reasons that should be fairly obvious…

PHP is far from the only language or concept that has this problem. java, go, nlp, and (to a lesser extent) python also have other, more common uses.

How I (try to) filter out bad tweets

I mostly use Twitter’s search for personal (rather than commercial or professional) reasons. I therefore simply “subscribe” to search queries using Twitter’s own freely available TweetDeck web app.

My queries usually look like this:

  • I start with the keywords that I’m interested in.
  • I append lang:en so that most of the search results will be in English.
  • Virtually all advertisements include links, so I sometimes (but not always) exclude tweets with links using -filter:links.
  • Because there are no easy ways to filter out spam, I then add a list of words that I want to exclude:
Keyword           Why
hxxps             (Automated?) tweets about security vulnerabilities on specific URLs
threat            (Automated?) tweets about security vulnerabilities on specific URLs
hiring            Recruitment ads
jobs              Recruitment ads
remote            Recruitment ads
talent            Recruitment ads
talents           Recruitment ads
vacancies         Recruitment ads
vacancy           Recruitment ads
retweet           Promotional tweet
rt                Promotional tweet
blockchain        Keyword stuffing
coinbase          Keyword stuffing
crypto            Keyword stuffing
cryptocurrency    Keyword stuffing
digitalmarketing  Keyword stuffing
iot               Keyword stuffing
ml                Keyword stuffing
nft               Keyword stuffing
pytorch           Keyword stuffing
anatomy           Homework/thesis services
assignment        Homework/thesis services
biology           Homework/thesis services
chemistry         Homework/thesis services
course            Homework/thesis services
essay             Homework/thesis services
essayhelp         Homework/thesis services
essaypay          Homework/thesis services
essays            Homework/thesis services
essaysdue         Homework/thesis services
exam              Homework/thesis services
exams             Homework/thesis services
grade             Homework/thesis services
grades            Homework/thesis services
homework          Homework/thesis services
paper             Homework/thesis services
codingpics        Low-quality tweets
meme              Low-quality tweets
programmingjoke   Low-quality tweets
programmingjokes  Low-quality tweets
programmingmemes  Low-quality tweets
doctor            Is likely about hospitalisation
hospital          Is likely about hospitalisation
album             People trying to sell stuff
buy               People trying to sell stuff
buyer             People trying to sell stuff
cost              People trying to sell stuff
costs             People trying to sell stuff
currency          People trying to sell stuff
deal              People trying to sell stuff
discounted        People trying to sell stuff
dm                People trying to sell stuff
dropship          People trying to sell stuff
dropshipping      People trying to sell stuff
fashion           People trying to sell stuff
fee               People trying to sell stuff
free              People trying to sell stuff
gcash             People trying to sell stuff
gift              People trying to sell stuff
items             People trying to sell stuff
kpop              People trying to sell stuff
payment           People trying to sell stuff
paypal            People trying to sell stuff
peso              People trying to sell stuff
pesos             People trying to sell stuff
pm                People trying to sell stuff
price             People trying to sell stuff
prices            People trying to sell stuff
rewards           People trying to sell stuff
sale              People trying to sell stuff
sales             People trying to sell stuff
sell              People trying to sell stuff
shop              People trying to sell stuff
sold              People trying to sell stuff
spend             People trying to sell stuff
unlock            People trying to sell stuff
usd               People trying to sell stuff
wts               People trying to sell stuff
"how much"        People trying to buy stuff
wtb               People trying to buy stuff

As you can see, the list is pretty long. This means that there are inevitably going to be interesting tweets that I never see because they are excluded by my queries.
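Typing such a query by hand gets tedious quickly, so here is a rough sketch of how the steps above could be assembled into a single search string. The `build_query` function and its parameters are hypothetical names of my own. The only search syntax it relies on is lang:en and -filter:links from the steps above, plus a leading minus sign to exclude a word. I'm assuming that a quoted phrase can be excluded in the same way:

```python
def build_query(keywords, excluded, lang="en", exclude_links=False):
    """Assemble a Twitter search query string (hypothetical helper).

    keywords:      the search terms of interest, e.g. "php"
    excluded:      words or phrases that should not appear in the results
    lang:          language code appended as lang:<code>
    exclude_links: when True, add -filter:links to drop tweets with links
    """
    parts = [keywords, f"lang:{lang}"]
    if exclude_links:
        parts.append("-filter:links")
    for word in excluded:
        # Assumption: multi-word phrases must be quoted so that the minus
        # sign applies to the whole phrase rather than only its first word.
        term = f'"{word}"' if " " in word else word
        parts.append(f"-{term}")
    return " ".join(parts)


if __name__ == "__main__":
    # Prints: php lang:en -filter:links -hiring -jobs -"how much"
    print(build_query("php", ["hiring", "jobs", "how much"], exclude_links=True))
```

Keeping the exclusion list in code (or in a plain text file) also makes it easier to reuse the same list across several keyword searches, and to switch -filter:links on or off per query.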

What about you?

This approach mostly works for me, but I have the feeling there are much better ways to do this.

How do you separate the (t)wheat from the chaff? Do you use extensions? Alternative Twitter clients? Share your strategies in the comments!
