It has been quite some time since I last wrote about the business requirements and goals of the recommendation system. That's because my private life has been busy recently. Thankfully, I'm back, and here is the next part of my Redis Hackathon journey.
You can find the source code of this project on GitHub:
Open source recommendation system based on time-series data and statistical analysis. Written in TypeScript and Node.js, using Redis for storage. The recommendation system uses the Jaccard index to calculate the intersection between two sets. One set is represented by the maximum possible sum of tag scores and the other set is the sum of the scores of user events per tag. The higher the Jaccard index, the stronger the recommendation. It uses numbers to represent sets to increase performance.
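As a rough illustration of how two sets can be compared when each is represented by a single number, here is a minimal sketch. The function name `jaccardFromSums` is hypothetical and not taken from the project, and the sketch assumes the actor's score sum is contained in the maximum possible sum, so the intersection is the smaller number and the union is the larger one. The project's actual calculation may differ.

```typescript
// Hypothetical helper, not taken from the project's source.
// Assumption: the actor's score sum is a "subset" of the maximum possible sum,
// so the intersection is the smaller number and the union is the larger one.
function jaccardFromSums(actorScoreSum: number, maxScoreSum: number): number {
  const intersection = Math.min(actorScoreSum, maxScoreSum);
  const union = Math.max(actorScoreSum, maxScoreSum);
  // Guard against division by zero (no data yet) and against negative sums.
  if (union <= 0 || intersection < 0) return 0;
  return intersection / union;
}

// An actor who collected 3 points on tags worth at most 10 points in total.
const index = jaccardFromSums(3, 10);
console.log(index);
```

Working on two numbers instead of two collections keeps the comparison to a single division, which is one plausible reading of the performance remark above.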
- Use tag score and Jaccard index
- Content-based filtering
- Event-driven powered engine
- Naive exploration of new tags
- Suitable for product and content recommendation
- Fine-tuning of tag weights
- Minimalist and lightweight
- Written in TypeScript and Node.js
How it works
How the data is stored:
- Actors are stored in Redis as simple String keys with create date
- Items are Sets with tags as members. The item may have…
First of all, I set a taxonomy for the architecture. The system is built on the relations between four entities:
- Actor - the decisive entity, for whom we try to match recommendations. In most cases this is a user or a group of users.
- Item - the entity we try to match, e.g. a product in a store, an article on a blog or a video on a streaming platform.
- Event - an actor's action used to calculate recommendations. It consists of a tag with a score (weight).
- Recommendation - the final verdict: how close a given actor and item are.
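To make the taxonomy a bit more concrete, the four entities could be written down as TypeScript types. This is only a sketch: the type and field names (`Actor`, `Item`, `ActorEvent`, `Recommendation`, `createdAt` and so on) are my assumptions for illustration, not the names used in the project.

```typescript
// Hypothetical types; the names and fields are assumptions for illustration.
type Tag = string;

// The entity we recommend for, e.g. a user or a group of users.
interface Actor {
  id: string;
  createdAt: number; // Unix timestamp in milliseconds
}

// The entity being recommended, described by a set of unique tags.
interface Item {
  id: string;
  tags: Set<Tag>;
}

// An actor's action: a tag together with a score (weight).
interface ActorEvent {
  actorId: string;
  tag: Tag;
  score: number;
}

// The final verdict: how close a given actor and item are.
interface Recommendation {
  actorId: string;
  itemId: string;
  score: number;
}

const reader: Actor = { id: "user-1", createdAt: Date.now() };
const article: Item = { id: "post-42", tags: new Set(["redis", "typescript"]) };
const click: ActorEvent = { actorId: reader.id, tag: "redis", score: 1 };

console.log(article.tags.has(click.tag)); // the event's tag is one of the item's tags
```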
With the domain entities defined, I tried several Redis data structures to find one suitable for storing them. For actor data, the most useful was a Hash with the actorId as the key, the tag as the field and the score as the value. However, I wanted events to expire, and that isn't easy to do with this structure. So instead I decided to use the basic String structure, where the key contains all of that information and the value is the score. I then retrieve the data using SCAN with a wildcard pattern. Items have a set of unique tags, so the Set structure seems a perfect choice, and Actors, just to mark their existence, are again simple Strings holding a creation timestamp.
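Here is a minimal sketch of what such a key layout might look like. The `event:<actorId>:<tag>:<timestamp>` format, the helper names and the `COUNT` hint are all assumptions made for illustration; the project's real key format may be different. The helpers only build the key and the command arguments (`SET … EX …` for expiring events, `SCAN … MATCH …` for reading them back), so the sketch runs without a Redis server.

```typescript
// Hypothetical key layout: event:<actorId>:<tag>:<timestamp>
// Assumption: IDs and tags do not contain ":" themselves.
const EVENT_PREFIX = "event";

function eventKey(actorId: string, tag: string, timestamp: number): string {
  return `${EVENT_PREFIX}:${actorId}:${tag}:${timestamp}`;
}

// Wildcard pattern matching every event key of one actor.
function eventPattern(actorId: string): string {
  return `${EVENT_PREFIX}:${actorId}:*`;
}

// Recover the information that was packed into the key.
function parseEventKey(key: string): { actorId: string; tag: string; timestamp: number } {
  const [prefix = "", actorId = "", tag = "", timestamp = ""] = key.split(":");
  if (prefix !== EVENT_PREFIX || actorId === "" || tag === "") {
    throw new Error(`Not an event key: ${key}`);
  }
  return { actorId, tag, timestamp: Number(timestamp) };
}

// Arguments for: SET <key> <score> EX <ttlSeconds>
// The EX option makes a single event expire on its own.
function setEventCommand(
  actorId: string,
  tag: string,
  score: number,
  ttlSeconds: number,
  now: number = Date.now()
): string[] {
  return ["SET", eventKey(actorId, tag, now), String(score), "EX", String(ttlSeconds)];
}

// Arguments for: SCAN <cursor> MATCH <pattern> COUNT 100
// SCAN has to be repeated with the returned cursor until it comes back as "0".
function scanEventsCommand(actorId: string, cursor: string = "0"): string[] {
  return ["SCAN", cursor, "MATCH", eventPattern(actorId), "COUNT", "100"];
}

console.log(setEventCommand("user-1", "redis", 2, 3600).join(" "));
console.log(scanEventsCommand("user-1").join(" "));
```

Because every event is its own String key, each one gets its own time to live, which is what the Hash-based layout could not easily provide. The price is that reading an actor's events means scanning the keyspace with a pattern.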
The Node part of the application is written in TypeScript with Express. The system consists of an app layer, with the recommendation mechanism run by controllers. Redis is hidden behind a data access layer, so it can easily be replaced with a different storage engine.
High-level architecture of the recommendation system.
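The idea of hiding the storage behind a data access layer can be sketched like this. The `EventStore` interface, the `InMemoryEventStore` class and the `totalScore` function are hypothetical names I made up for the example; they only show the shape of the idea, not the project's actual code.

```typescript
// Hypothetical data access interface; the app layer depends only on this.
interface EventStore {
  addEvent(actorId: string, tag: string, score: number): Promise<void>;
  getTagScores(actorId: string): Promise<Map<string, number>>;
}

// Stand-in storage engine. A Redis-backed class could implement the same
// interface without any change to the code that uses it.
class InMemoryEventStore implements EventStore {
  private readonly events = new Map<string, Map<string, number>>();

  async addEvent(actorId: string, tag: string, score: number): Promise<void> {
    const scores = this.events.get(actorId) ?? new Map<string, number>();
    scores.set(tag, (scores.get(tag) ?? 0) + score);
    this.events.set(actorId, scores);
  }

  async getTagScores(actorId: string): Promise<Map<string, number>> {
    // Return a copy so callers cannot modify the stored data.
    return new Map<string, number>(this.events.get(actorId));
  }
}

// App-layer logic that knows nothing about the storage engine behind it.
async function totalScore(store: EventStore, actorId: string): Promise<number> {
  const scores = await store.getTagScores(actorId);
  let total = 0;
  scores.forEach((score) => {
    total += score;
  });
  return total;
}

async function demo(): Promise<void> {
  const store = new InMemoryEventStore();
  await store.addEvent("user-1", "redis", 2);
  await store.addEvent("user-1", "typescript", 1);
  console.log(await totalScore(store, "user-1"));
}

demo();
```

The methods return promises even for the in-memory version because a real Redis client in Node.js is asynchronous, so swapping the engine would not change the interface.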
In the next part, we are going to look closely at the algorithm used to find recommendations.