Savy

Scalable E-Commerce Architecture - Part 1: Inventory, Product and Search

In this article, I delve into crafting a comprehensive e-commerce micro-service architecture.
In my previous post, I discussed the payment service, utilizing Authorize.net, and now it's time to discuss product services.

Starting Point

I begin with the following figure. I put a "Product DB" in the center, with customer actions related to products on the right and agent actions on the left.

[Figure: a "Product DB" in the center, customer actions on the right, agent actions on the left]

Technologies and Specifications

  • Environment: LTS Node.js - 20.10
  • Programming Language: TypeScript - JavaScript
  • Database: RDBMS - MySQL 8 - Redis (ioredis)
  • Communication Style: REST API - Kafka Message broker (kafkajs)
  • Backend Framework: Strapi - Koa or NestJS
  • Frontend Framework: Nuxt or Next (or any other framework with SSR capability)
  • Service Contract: Swagger
  • API Test: Postman - Jest

Searching Challenge

In a micro-service architecture, following best practices and keeping data consistent across services is crucial.
Consider a scenario where some data is updated in the inventory service, such as a product running out of stock. Who is responsible for removing the product from sale in the storefront?
Similarly, when a product price is changed in the product management service, which part of the system calculates the new price? Who delivers the new price to the storefront?

When dealing with distributed systems (micro-services here), we have two options for keeping data consistent:

  • Normalization: each part of the data is stored only in the service that owns it.
  • Denormalization: a copy of the data is also stored in another service that needs it for its own processing.

In this system, I chose the second option. The main reason behind this decision is the need for searching and filtering products.
I've worked on many e-commerce projects where searching and filtering became a significant part of the system. So I prefer to divide the system into smaller services, with each service updating a central database for products.
Although this introduces redundancy in product data, it ensures that searching and filtering can be done efficiently.
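
The denormalized copy can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the event shapes (`StockChanged`, `PriceChanged`), the `ProductRecord` fields, and the in-memory `productDb` map are all hypothetical. In the real system the events would arrive through the Kafka broker and the copy would live in the central product database.

```typescript
// Hypothetical event shapes published by the services that own the data.
type StockChanged = { type: "StockChanged"; productId: string; quantity: number };
type PriceChanged = { type: "PriceChanged"; productId: string; finalPrice: number };
type ProductEvent = StockChanged | PriceChanged;

// Denormalized copy kept by the product service, used only for reads.
interface ProductRecord {
  id: string;
  quantity: number;
  finalPrice: number;
}

// In-memory stand-in for the central product database.
const productDb = new Map<string, ProductRecord>();

// Copy the values carried by an event into the read model.
// Nothing is calculated here: the owning service already decided the new values.
function applyEvent(event: ProductEvent): void {
  const current: ProductRecord =
    productDb.get(event.productId) ?? { id: event.productId, quantity: 0, finalPrice: 0 };
  if (event.type === "StockChanged") {
    current.quantity = event.quantity;
  } else {
    current.finalPrice = event.finalPrice;
  }
  productDb.set(event.productId, current);
}

// The storefront lists only products that still have stock in the copy.
function listSellable(): ProductRecord[] {
  return Array.from(productDb.values()).filter((p) => p.quantity > 0);
}
```

With this shape, an out-of-stock product disappears from `listSellable()` as soon as the inventory service's event has been applied, without the storefront ever calling the inventory service.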

Catalog Service

The catalog service is responsible for adding event rules. It saves its data in its own database and then sends a copy of the updated data to the product service, which is responsible for serving product data.

The drawback is redundant data: we have to keep a large database up to date. It may look like a monolithic database in the middle of a micro-service architecture, but the point is that this database and the product service do not execute any business logic; they only collect data. All business logic related to the catalog should be implemented in other services.

Filtering Product, Dynamic Prices

If we had only one place to calculate and display the final product price, a base price and catalog prices would be sufficient. For example, a single product page displays the specifications of only one product, including:

  • Displayed price (or regular price / base price)
  • Calculated price (or final price)
  • Base discount (or calculated discount)

We can easily save the base price and catalog prices (such as a category price) in the centralized products database and then search on them directly. In this case, we can use the Strapi search API to filter and search products by price range.
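
As a rough sketch of such a static price-range search, the helper below builds a REST query using the `filters[...][$gte]` / `filters[...][$lte]` and `pagination[...]` parameters of Strapi v4. The `/api/products` endpoint and the `basePrice` field name are assumptions made for illustration; adjust them to your own content type.

```typescript
// Build a Strapi-style REST query for a static price range.
// Assumptions: a collection exposed at /api/products and a numeric
// field named basePrice. Both names are hypothetical.
function priceRangeQuery(min: number, max: number, page: number, pageSize: number): string {
  const params: Array<[string, string]> = [
    ["filters[basePrice][$gte]", String(min)],
    ["filters[basePrice][$lte]", String(max)],
    ["pagination[page]", String(page)],
    ["pagination[pageSize]", String(pageSize)],
  ];
  // Encode keys and values so brackets and "$" survive as URL parameters.
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `/api/products?${query}`;
}
```

Because the price is a stored column, the framework can filter and paginate in one query, which is exactly what stops working once prices become dynamic.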

But what about dynamic prices?
These are prices that depend on dynamic parameters such as "customer group" policies, where two different groups of users see different prices for the same product.
I know this may sound a little rare, but we have to deal with this challenge. Some of the options are:

1- Performing a search without considering price, and then conducting a second search (price search) to remove unrelated data

This is the first option. For example, if a user selects a range of $20 to $30, we can perform the search without considering the price, then calculate the prices and keep only the products within that range.
This option is only suitable when we have a large number of products with similar prices.
It can sometimes be implemented in the frontend, although that is not considered best practice.
The biggest problem with this approach is pagination. With pagination, each page holds a limited number of products. After we remove the non-matching products from a page, it is possible that no items remain on it, and the user has to proceed to the next page. If we don't have a large number of products, we can fetch all products in a single API call and filter them by price. In that case I see no problem with this approach, especially for databases with fewer than 1K records.
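
The pagination problem is easy to reproduce in a small sketch. Everything here is illustrative: the `Item` shape, the 20% "vip" discount in `calculatedPrice`, and the in-memory `items` array standing in for the search backend are assumptions, not part of the original system.

```typescript
interface Item { id: number; basePrice: number }

// Hypothetical dynamic pricing rule: the "vip" group pays 80% of the base price.
function calculatedPrice(item: Item, group: string): number {
  return group === "vip" ? (item.basePrice * 80) / 100 : item.basePrice;
}

// Step 1: paginated search that ignores price (stands in for the framework search).
function searchPage(all: Item[], page: number, pageSize: number): Item[] {
  return all.slice((page - 1) * pageSize, page * pageSize);
}

// Step 2: post-filter the fetched page by the calculated price.
function filterPageByPrice(pageItems: Item[], group: string, min: number, max: number): Item[] {
  return pageItems.filter((item) => {
    const price = calculatedPrice(item, group);
    return price >= min && price <= max;
  });
}

const items: Item[] = [
  { id: 1, basePrice: 50 }, // vip price 40, lands on page 1
  { id: 2, basePrice: 60 }, // vip price 48, lands on page 1
  { id: 3, basePrice: 25 }, // vip price 20, lands on page 2
  { id: 4, basePrice: 30 }, // vip price 24, lands on page 2
];

// With a page size of 2 and a $20 to $30 range, page 1 ends up empty
// even though two matching products exist on page 2.
const firstPage = filterPageByPrice(searchPage(items, 1, 2), "vip", 20, 30);
const secondPage = filterPageByPrice(searchPage(items, 2, 2), "vip", 20, 30);
```

Fetching everything in one call (a page size equal to the catalog size) avoids the empty page, which is why the approach remains acceptable for small catalogs.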

2- Using price filter as a middleware

In this approach, we do exactly the opposite. Consider two different search systems. The first one calculates all product prices and finds the products within the requested price range.
It then passes those product IDs as an argument to the second system, which applies the remaining filters and does the pagination.

The first advantage of this approach is that we can use pagination, because the framework handles pagination in the final step. Unlike the previous approach, which suited searches without pagination, this approach is better suited to paginated search systems.

The drawback of this approach is that calculating prices for all products may be expensive.
It can be a significant processing load, especially for systems with a large number of products. Consider having to process 200K prices and send the matching IDs to the search system for further filtering!

Another problem is filtering based on IDs. We may have a landing page where only half of the products have dynamic prices and the others have only a base price.
We still have to process prices for all products, even those whose price could easily be filtered by the framework in the data layer using the ORM.

Another disadvantage of this approach is sorting by product price. Since the second search system only receives product IDs from the previous filter, it has no access to their prices at this step. Therefore, we may have to give up sorting by price in this approach.

This approach can be suitable for a medium-sized database, something like 1K to 10K products, especially with a wide variety of price ranges.
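
A minimal sketch of the two stages, under the same kind of assumptions as before: `CatalogEntry`, the 20% "vip" discount in `priceFor`, and the simple name match in stage 2 are hypothetical stand-ins for the real pricing rules and the framework search.

```typescript
interface CatalogEntry { id: number; name: string; basePrice: number }

// Hypothetical dynamic pricing rule: the "vip" group pays 80% of the base price.
function priceFor(entry: CatalogEntry, group: string): number {
  return group === "vip" ? (entry.basePrice * 80) / 100 : entry.basePrice;
}

// Stage 1 (price middleware): price every product and return the IDs in range.
// This is the expensive part, because it touches the whole catalog.
function idsInPriceRange(all: CatalogEntry[], group: string, min: number, max: number): Set<number> {
  const ids = new Set<number>();
  for (const entry of all) {
    const price = priceFor(entry, group);
    if (price >= min && price <= max) ids.add(entry.id);
  }
  return ids;
}

// Stage 2 (main search): apply the remaining filters to the received IDs, then paginate.
// It never sees the calculated prices, so it cannot sort by them.
function searchByIds(
  all: CatalogEntry[],
  ids: Set<number>,
  term: string,
  page: number,
  pageSize: number,
): CatalogEntry[] {
  const matches = all.filter((entry) => ids.has(entry.id) && entry.name.includes(term));
  return matches.slice((page - 1) * pageSize, page * pageSize);
}
```

Since the price filter runs before pagination, every page returned by `searchByIds` is full until the matches run out, which is the main gain over the first approach.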

3- Calculating and saving different prices into database

The third approach involves doing the homework sooner.
In this approach, when a dynamic price is added for a product, we calculate all possible prices based on the different rules.
For example, we may have 10 different prices for a single product, based on factors such as:
1- Customer groups
2- Events
3- Stores
4- Zones
5- Discounts
6- ...
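
The precalculation step might look like the sketch below. It assumes, purely for illustration, that every rule is a simple percentage discount tied to a single context key such as "group:vip" or "event:sale"; real rules (stores, zones, stacked discounts) would need a richer model. `PriceRule`, `PriceRow`, and `precomputePrices` are hypothetical names.

```typescript
// Hypothetical rule: a percentage discount tied to one context key,
// for example "group:vip", "store:berlin" or "event:sale".
interface PriceRule { contextKey: string; percentOff: number }

// One precomputed row per (product, context) pair, ready to be stored.
interface PriceRow { productId: number; contextKey: string; price: number }

// Calculate every possible price for a product up front.
// The "default" row holds the base price for requests that match no rule.
function precomputePrices(productId: number, basePrice: number, rules: PriceRule[]): PriceRow[] {
  const rows: PriceRow[] = [{ productId, contextKey: "default", price: basePrice }];
  for (const rule of rules) {
    rows.push({
      productId,
      contextKey: rule.contextKey,
      price: (basePrice * (100 - rule.percentOff)) / 100,
    });
  }
  return rows;
}
```

This function would run whenever a rule or a base price changes, so the cost is paid at write time rather than on every search.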

The next step is applying these rules to incoming search requests. The system has to automatically extract the relevant parameters from each request and apply them to the search query.
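
A rough sketch of that step, assuming stored price rows keyed by a context string and a request that carries an optional customer group. `StoredPrice`, `SearchRequest`, and the "group:..." key format are hypothetical, and the array operations stand in for what would be a single indexed database query.

```typescript
interface StoredPrice { productId: number; contextKey: string; price: number }

// Hypothetical request shape; customerGroup would normally come from the session.
interface SearchRequest {
  customerGroup?: string;
  minPrice: number;
  maxPrice: number;
  sort: "asc" | "desc";
}

// Derive the pricing context from the request instead of asking the caller for it.
function contextKeyFor(req: SearchRequest): string {
  return req.customerGroup ? `group:${req.customerGroup}` : "default";
}

// Stand-in for a data-layer query, roughly:
// WHERE context_key = ? AND price BETWEEN ? AND ? ORDER BY price
function searchByStoredPrice(rows: StoredPrice[], req: SearchRequest): StoredPrice[] {
  const key = contextKeyFor(req);
  const direction = req.sort === "asc" ? 1 : -1;
  return rows
    .filter((row) => row.contextKey === key && row.price >= req.minPrice && row.price <= req.maxPrice)
    .sort((a, b) => direction * (a.price - b.price));
}
```

Note that this sketch assumes a stored row exists for every product and context pair; a product without a row for the requested context would simply not appear in the results.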

The advantage of this approach is that filtering is handled in the data layer, at a low level of the system.

Another benefit is centralized data. When you have a source of truth for prices in your data layer, you can use various tools to perform searches (Elasticsearch, Algolia, a cache, etc.).

One of the downsides of this approach is the risk of inconsistency and stale reads.
For example, consider a situation where you update the price of a product, but due to latency or some other problem the stored price is not updated. The old price then keeps showing in the search results unless you implement extra strategies.
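
One possible extra strategy, offered here only as an assumption and not as something the design above prescribes, is to attach a version number to every price update. The search side then ignores updates that arrive out of order, and a periodic check against the source of truth reveals entries that still show an old price. All names below are hypothetical.

```typescript
interface PriceUpdate { productId: number; price: number; version: number }

// In-memory stand-in for the prices held by the search side.
const indexedPrices = new Map<number, PriceUpdate>();

// Apply an update only if it is newer than what the index already holds.
// Returns true when the index changed.
function applyPriceUpdate(update: PriceUpdate): boolean {
  const existing = indexedPrices.get(update.productId);
  if (existing && existing.version >= update.version) return false; // stale or duplicate
  indexedPrices.set(update.productId, update);
  return true;
}

// Compare the index with the source of truth and return the product IDs
// whose indexed price is missing or older than the current version.
function findStale(source: PriceUpdate[]): number[] {
  return source
    .filter((s) => (indexedPrices.get(s.productId)?.version ?? -1) < s.version)
    .map((s) => s.productId);
}
```

The IDs returned by `findStale` could then be re-sent to the search side, which bounds how long an old price can remain visible.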

Another downside is storing a large amount of unused data. For example, you may calculate 20 different prices for a single product that are never used in any search.
What will happen to those prices?
With a large number of products, it gets worse.

This approach provides a fast search and a source of truth for a wide range of search systems. However, you should consider the cost of storing a large amount of unused data.

This approach works for any number of products, and I consider it the best practice, as it also enables sorting by price.

The only challenge that may arise is when some dynamic prices cannot be calculated until specific requirements are met.
For example, a product may have a price that depends on an external API, or the price may be extremely sensitive to user input (e.g., the weight of gold or a mix of materials).
For those scenarios, I think it's better not to provide batch filtering by price.

Tools and 3rd parties

To build and scale the system quickly and reliably, we use some tools and outsource some functionality:

  • Payment: Authorize.net - PayPal
  • Fast Search: Algolia
  • Inventory Management: Zoho Inventory
  • CRM: Zoho CRM
  • Chat and Support: SalesIQ or any other online chat provider
  • Recommendation: MoEngage
  • Analytics: Google Analytics
  • Tag Management: Google Tag Manager (GTM)

Using these tools isn't mandatory, but they are very helpful for specific tasks, especially if you don't have a large budget or a big team. For small teams (2-5 developers), it's usually better to rely on proven third parties.

Conclusion

Selecting the optimal approach for dynamic pricing in e-commerce micro-service architectures is vital for performance. Consider factors like scalability and data consistency. Share your thoughts and experiences on dynamic pricing in the comments below - your feedback is valuable!
