DEV Community

AI Sapiens

Finding trending products with data science

In the last couple of years the speed of change has gone up a notch in many areas, with both individuals and companies having to cope with it.

These times have increased the need to better understand the changes/trends and possibly predict them.

The first step is to gather data on trends, for which many potential sources are available.

One approach is to use social media platforms, as many of them offer decent APIs for gathering data.

To get into more detail, let us first narrow the scope: among a large set of e-commerce products, we want to find those with the most positive trend (i.e. the highest growth in some predefined time interval), preferably grouped into categories using a taxonomy such as the Google product taxonomy or the Facebook product categories.
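As a minimal sketch of what "highest growth in some predefined time interval" could mean in code, the example below compares the average of the most recent window with the average of the window before it. The function names, the window length and the weekly counts are illustrative assumptions, not part of any particular platform.

```python
from statistics import mean


def growth(series, window):
    """Relative change between the last `window` points and the `window` before them."""
    if len(series) < 2 * window:
        raise ValueError("need at least 2 * window data points")
    previous = mean(series[-2 * window:-window])
    recent = mean(series[-window:])
    if previous == 0:
        # Growth from zero is undefined; treat any rise as unbounded.
        return float("inf") if recent > 0 else 0.0
    return (recent - previous) / previous


def top_trending(counts_by_product, window, n=3):
    """Return the n (product, growth) pairs with the highest growth, best first."""
    scored = {name: growth(series, window) for name, series in counts_by_product.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical weekly mention counts, oldest first
weekly_mentions = {
    "tripod": [10, 12, 11, 13, 20, 24, 26, 30],
    "phone case": [50, 52, 49, 51, 50, 48, 53, 51],
    "printed costume": [2, 3, 2, 1, 6, 8, 9, 9],
}
ranking = top_trending(weekly_mentions, window=4, n=2)
```

A relative measure like this favours small products that grow quickly, so in practice one may want to combine it with a minimum volume threshold.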

Trending products with Social Media (Twitter)

Given a defined list of products and an initial focus on social media, a viable approach is to use the Twitter API (you can find a lot of information about it here: https://developer.twitter.com/en/docs/twitter-api).

There are different API endpoints that one could employ. The first instinct would be to go for the Twitter Firehose, but the problem is that the amount of data streamed on Firehose is enormous and one would need to deploy a lot of servers to parse for products from our list of products in this stream.

The other downside is that the access to Firehose is also extremely expensive, which would burden our potential product or service with high costs even before the actual start.

A better approach, both in terms of server resources and cost, is to use the Twitter Search API and send our products to the search endpoint at regular intervals.

This will work for most products except those with really high volume (e.g. iPhone), where fetching all the tweets that mention the item may take too long or require a large number of servers. In such cases it may make sense to cap the number of tweets fetched per item.
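The count cap can be sketched as a small pagination loop. Here `fetch_page` is a hypothetical callable standing in for the actual HTTP request, and the response shape (a `data` list plus a `meta.next_token` field) is an assumption modelled on typical paginated search responses; adapt both to the endpoint you actually use.

```python
def collect_capped(fetch_page, query, cap, page_size=100):
    """Collect at most `cap` tweets for `query`, following pagination tokens.

    `fetch_page(query, max_results, next_token)` is a hypothetical callable that
    returns a dict shaped like {"data": [...], "meta": {"next_token": "..."}}.
    """
    tweets, token = [], None
    while len(tweets) < cap:
        page = fetch_page(query, min(page_size, cap - len(tweets)), token)
        tweets.extend(page.get("data", []))
        token = page.get("meta", {}).get("next_token")
        if not token:  # no more pages available
            break
    return tweets[:cap]


def fake_fetch(query, max_results, next_token):
    """Stand-in for a real API call: pretends 250 matching tweets exist."""
    start = int(next_token or 0)
    end = min(start + max_results, 250)
    page = {"data": [f"{query} tweet {i}" for i in range(start, end)], "meta": {}}
    if end < 250:
        page["meta"]["next_token"] = str(end)
    return page


capped = collect_capped(fake_fetch, "iPhone", cap=120)
```

Passing the fetch function in as an argument keeps the cap logic independent of the HTTP client and makes it easy to test without network access.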

Another decent social media API for finding trending products is that of YouTube; you can learn more about it here: https://developers.google.com/youtube/v3. Other platforms, like Facebook and Instagram, are more restrictive about data availability, but may also be worth a try.

Determining trending products using search engines - PyTrends

Another way to estimate users' interest in particular products is to use statistics on keyword trends from search engines, primarily Google.

For this purpose, one can use keyword data from Google Ads (formerly Google AdWords) or from Google Trends. In practice, trends data is most easily obtained with a library built for this purpose. One of the best is https://github.com/GeneralMills/pytrends.

Pytrends is an unofficial API for Google Trends with many features for downloading Google Trends data. It is especially suitable if you want to write your own code for obtaining the data and quickly build an automated solution.

The data obtained with Pytrends is scaled from 0 to 100, with 100 indicating the highest popularity. A value of 0 means that there are not enough data points for the particular keyword.
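A minimal sketch of fetching such data is shown below. It assumes the third-party `pytrends` package is installed and that Google Trends is reachable; the helper names, the default timeframe and the batch size of five (the number of terms Google Trends compares in one request) are choices made for this example.

```python
def chunk_keywords(keywords, size=5):
    """Split keywords into batches of at most `size` terms per request."""
    return [keywords[i:i + size] for i in range(0, len(keywords), size)]


def fetch_interest(keywords, timeframe="today 3-m", geo=""):
    """Return one interest-over-time DataFrame per batch of keywords."""
    # Imported lazily so the chunking helper works without pytrends installed.
    from pytrends.request import TrendReq  # third-party: pip install pytrends

    client = TrendReq(hl="en-US", tz=360)
    frames = []
    for batch in chunk_keywords(keywords):
        client.build_payload(batch, timeframe=timeframe, geo=geo)
        frames.append(client.interest_over_time())
    return frames
```

Because the 0 to 100 scale is relative to each request, values from different batches are not directly comparable. One common workaround is to include the same reference keyword in every batch and rescale against it.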

Pytrends offers many different API methods:

Interest Over Time
Historical Hourly Interest
Interest by Region
Related Topics
Related Queries
Trending Searches
Realtime Search Trends
Top Charts
Suggestions

Let us discuss some of them in more detail.

Interest Over Time: returns a historical, indexed time series showing when the keyword was searched most.

Historical Hourly Interest: returns historical, indexed, hourly data showing when the keyword was searched most. It works by sending several requests to Google, each one fetching one week of hourly data.

Interest by Region: the same data as Interest Over Time, but broken down by region.

Related Topics: quite useful, as it returns topics related to the keyword.

Trending Searches: these are the latest trending searches shown on Google Trends' Trending Searches segment.

Top Charts: data for given theme as shown in Google Trends' Top Charts segment.

Suggestions: another useful set of data, returning suggested keywords that can be used for further refinement.
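The week-by-week fetching described for Historical Hourly Interest can be illustrated by splitting a date range into consecutive windows of at most seven days. This is only a sketch of the idea, with a hypothetical helper name; it does not claim to reproduce how Pytrends splits the range internally.

```python
from datetime import datetime, timedelta


def weekly_windows(start, end):
    """Split [start, end) into consecutive windows of at most seven days."""
    windows = []
    cursor = start
    while cursor < end:
        stop = min(cursor + timedelta(days=7), end)
        windows.append((cursor, stop))
        cursor = stop
    return windows


windows = weekly_windows(datetime(2022, 1, 1), datetime(2022, 1, 20))
```

Each window would then correspond to one request, which is why hourly data for long periods takes many requests to collect.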

Practical examples of trending products in various verticals

As part of our data science platform, we determined trending searches for a large number of products and then categorized them into verticals (Tier 1, Tier 2 and Tier 3, the last of which has over 1000 categories).

Below are some of the popular products in various categories.

Apparel & Accessories

Mirabel dress
Image description

Printed costume, perhaps related to the recent Spider-Man movie:
Image description

Camera & Optics Vertical

Tripod Studio
Image description

Electronics
Increase in searches for UHD movies:

Image description

Product classification

When looking for trending products, it is very useful to have product classification in place, so that each product is assigned to a specific category.

Then one can ask questions such as: what are the most trending products in the category of aquarium supplies?

Or one can ask which niche category currently has the most upward-trending keywords, and would thus be worth examining as a whole category or vertical rather than only at the level of specific keywords.
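Both questions reduce to simple grouping once every product carries a category and a growth value. The sketch below assumes a list of dictionaries with hypothetical `name`, `category` and `growth` fields; the sample numbers are made up for illustration.

```python
from collections import defaultdict


def top_in_category(products, category, n=3):
    """Products of one category ordered by growth, highest first."""
    matching = [p for p in products if p["category"] == category]
    return sorted(matching, key=lambda p: p["growth"], reverse=True)[:n]


def rising_share_by_category(products, threshold=0.0):
    """Share of products in each category whose growth exceeds `threshold`."""
    totals, rising = defaultdict(int), defaultdict(int)
    for p in products:
        totals[p["category"]] += 1
        if p["growth"] > threshold:
            rising[p["category"]] += 1
    return {cat: rising[cat] / totals[cat] for cat in totals}


# Hypothetical products with precomputed growth values
products = [
    {"name": "aquarium filter", "category": "Aquarium Supplies", "growth": 0.40},
    {"name": "fish net", "category": "Aquarium Supplies", "growth": -0.10},
    {"name": "aquarium heater", "category": "Aquarium Supplies", "growth": 0.75},
    {"name": "tripod", "category": "Camera & Optics", "growth": 0.20},
]
best = top_in_category(products, "Aquarium Supplies", n=2)
shares = rising_share_by_category(products)
```

Using the share of rising products rather than a raw count avoids favouring categories simply because they contain more products.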

A product classification API can thus bring many benefits. Classification is usually performed with supervised machine learning models trained on data sets prepared specifically for this purpose.
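To make the supervised approach concrete, here is a toy sketch of a multinomial Naive Bayes classifier over product titles. The choice of Naive Bayes, the class name and the six training titles are assumptions made for illustration; a production classifier would use a much larger labelled data set and typically a stronger model.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, titles, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocabulary = set()
        for title, label in zip(titles, labels):
            words = title.lower().split()
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, title):
        # Words never seen in training carry no information, so skip them.
        words = [w for w in title.lower().split() if w in self.vocabulary]
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            score = math.log(count / total)  # log prior
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled training data
titles = ["led aquarium light", "aquarium filter pump", "camera tripod stand",
          "studio tripod light", "cotton summer dress", "printed party dress"]
labels = ["Aquarium Supplies", "Aquarium Supplies", "Camera & Optics",
          "Camera & Optics", "Apparel & Accessories", "Apparel & Accessories"]
model = NaiveBayesClassifier().fit(titles, labels)
predicted = model.predict("aquarium pump")
```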

Optimization of portfolio of products

Product trends can serve as one of the variables for optimizing the portfolio of products on sale, including what share (e.g. of prominent places on the website) should be given to each product, based on recent trends, price, margin and potential sales volume.

One could use a general portfolio optimiser for this purpose, as the task is a linear programming or quadratic programming problem, depending on how the objective and the constraints are formulated (they are most often linear or quadratic).
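As a simplified illustration of the linear case, consider maximizing the total score of the allocation when the shares must sum to 1 and no product may exceed a maximum share. For this special linear program, filling the best-scoring products up to the cap is optimal, so no solver is needed. The scores and the cap below are hypothetical, and richer formulations (several constraints, quadratic risk terms) would call for a proper LP or QP solver.

```python
def allocate_shares(scores, max_share):
    """Maximize sum(score_i * share_i) subject to
    sum(share_i) == 1 and 0 <= share_i <= max_share.

    With a single budget constraint and a uniform cap, assigning the cap
    to products in descending score order is the optimal solution.
    """
    if max_share * len(scores) < 1:
        raise ValueError("caps too tight: shares cannot sum to 1")
    shares = {name: 0.0 for name in scores}
    remaining = 1.0
    for name in sorted(scores, key=scores.get, reverse=True):
        shares[name] = min(max_share, remaining)
        remaining -= shares[name]
        if remaining <= 0:
            break
    return shares


# Hypothetical scores, e.g. trend growth multiplied by margin
scores = {"mirabel dress": 0.9, "tripod": 0.5, "uhd movies": 0.7, "fish net": 0.1}
shares = allocate_shares(scores, max_share=0.4)
```

The cap plays the role of a diversification constraint: without it, the whole share would go to the single best-scoring product.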
