DEV Community

Mikael Araya for Unchained Commerce

How Apache Solr Enables Precise Product Search on Professional Web Shops

Introduction

In the highly competitive world of e-commerce, providing a seamless and efficient shopping experience for customers is paramount to success. One of the most important aspects of this experience is having a strong search function that allows customers to easily find the products they are looking for. Research shows that customers who use site search are more likely to make a purchase and have a higher average order value than those who do not.

However, implementing an efficient and accurate search function on an e-commerce site can be challenging, particularly for backends with large product catalogs. That's where Solr comes in. Solr is a search platform built on top of Apache Lucene. It uses an inverted index and configurable text analysis to deliver fast and relevant search results.

Solr is designed to handle the unique challenges of e-commerce and other content-heavy websites, making it an excellent solution for businesses looking to improve their site's search capabilities. By using Solr, e-commerce businesses can provide their customers with a user-friendly and efficient search experience, leading to increased customer satisfaction and sales.

In this article, we'll explore how Solr can help you boost your site's search capabilities and drive sales. We'll start by explaining what Solr is and how it works. We'll then discuss the specific ways Solr can help improve product search on Unchained e-commerce backends, such as through faceted search and query suggestions. Additionally, we'll cover the benefits of using Solr for e-commerce searches, including its scalability and fault tolerance.

We'll also provide practical tips and best practices for implementing Solr on your site, including steps for data indexing and configuration, as well as how to optimize Solr performance and where to look when troubleshooting common issues. Finally, we'll point to an e-commerce site that has used Solr with Unchained to improve its search.

By the end of this article, you'll have a better understanding of how Solr can help improve your site's search capabilities and drive sales. You'll also have the practical knowledge and resources to implement Solr on your site, making it easier for your customers to find and purchase the products they want.

What is Solr and how does it work?

Solr is a highly scalable, fault-tolerant, and open-source search engine designed to handle complex data structures, including those found in e-commerce and content-heavy websites. Solr is built on top of Apache Lucene, which is a powerful search engine library that provides the foundational building blocks for Solr's capabilities.

One of the key benefits of Solr is its ability to provide fast and accurate search results. Solr achieves this through a combination of data indexing and querying techniques designed to optimize search performance. When a document is indexed, the text of each field, such as the product name, description, or category, is run through the analysis chain configured for that field in the schema. The first step, tokenization, splits the text into individual terms, known as tokens. Further filters can lowercase the tokens, drop stop words, or reduce words to their stems. Solr then stores the tokens in an inverted index, which is a data structure that maps each term to the documents that contain it.

When a user enters a search query, Solr analyzes the query in the same way and uses the inverted index to quickly find the matching documents. Solr then ranks the results by a relevance score. By default this score is BM25, which takes into account how often a term occurs in a document, how rare the term is across the whole index, and how long the field is. This process is called scoring, and it helps ensure that the most relevant results are returned first.
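
To make the indexing and lookup steps above more concrete, here is a minimal sketch of a toy inverted index. It is only an illustration of the idea: the `Doc` type, the sample products, and the very simple tokenizer are assumptions made for this example, and Lucene's real index format and analysis are far more elaborate.

```typescript
// Toy inverted index: maps each token to the set of document ids containing it.
type Doc = { id: string; text: string };

// Very simple analyzer: lowercase, then split on anything that is not a-z or 0-9.
const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);

const buildIndex = (docs: Doc[]): Map<string, Set<string>> => {
  const index = new Map<string, Set<string>>();
  for (const doc of docs) {
    for (const token of tokenize(doc.text)) {
      if (!index.has(token)) index.set(token, new Set<string>());
      index.get(token)!.add(doc.id);
    }
  }
  return index;
};

// Return the ids of documents that contain every query token (AND semantics).
const searchIndex = (index: Map<string, Set<string>>, query: string): string[] => {
  const postings = tokenize(query).map((token) => index.get(token) ?? new Set<string>());
  if (postings.length === 0) return [];
  return Array.from(postings[0]).filter((id) => postings.every((p) => p.has(id)));
};

const toyIndex = buildIndex([
  { id: 'p1', text: 'Red running shoes' },
  { id: 'p2', text: 'Blue running jacket' },
]);
console.log(searchIndex(toyIndex, 'running shoes')); // [ 'p1' ]
```

The lookup never scans the product texts themselves; it only intersects the posting sets for the query terms, which is why this structure stays fast as the catalog grows.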

Another key feature of Solr is its support for faceted search, which allows users to refine their search results based on specific criteria. For example, a user might search for "red shoes" and then use a faceted search to filter the results by size, brand, or price range. Solr achieves this by using a technique called faceting, which counts the matching documents for each value of a chosen field, such as brand or size, or for each range of a numeric field such as price. Selecting a facet value then adds a filter to the query. Faceting works best on fields with discrete values, which makes it a good fit for the structured product attributes of an e-commerce catalog.
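
As a rough sketch of what a faceted request can look like, the function below builds the query string for Solr's select handler. `facet`, `facet.field`, and `fq` are standard Solr request parameters, but the field names `brand_s` and `size_s` are assumptions for this example and are not part of the index built later in this article.

```typescript
// Build the query string for a faceted Solr search.
// `filters` maps a field name to the facet value the user has selected.
const buildFacetQuery = (text: string, filters: Record<string, string> = {}): string => {
  const params = new URLSearchParams();
  params.set('q', text);
  params.set('facet', 'true');
  // Ask Solr to count the matching documents per value of these (hypothetical) fields.
  for (const field of ['brand_s', 'size_s']) params.append('facet.field', field);
  // Each selected facet value becomes a filter query (fq) that narrows the result set.
  for (const field of Object.keys(filters)) {
    const escaped = filters[field].replace(/(["\\])/g, '\\$1');
    params.append('fq', `${field}:"${escaped}"`);
  }
  return params.toString();
};

const facetQuery = buildFacetQuery('red shoes', { brand_s: 'Acme' });
console.log(new URLSearchParams(facetQuery).getAll('fq')); // [ 'brand_s:"Acme"' ]
```

The resulting string would be appended to a URL such as `${SOLR_URL}/solr/products/select?`. Using `fq` rather than adding the filter to `q` keeps the filter out of the relevance score.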

In addition to faceted search, Solr also provides support for spell-checking and query suggestions. When a user enters a search query that contains a misspelled word, Solr can suggest alternative spellings based on the terms it has indexed, typically by looking for indexed terms within a small edit distance of the misspelled word. Fuzzy queries and phonetic analysis filters can additionally match words that are spelled or pronounced similarly. This helps ensure that users can still find what they are looking for, even if they make a mistake when entering their search query.
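
The edit-distance idea behind spelling suggestions can be sketched in a few lines. This is a plain Levenshtein distance over a hand-written vocabulary, used here as a simplified stand-in; it is not how Solr's spellcheck component is implemented, and the vocabulary and function names are assumptions for illustration.

```typescript
// Levenshtein distance: the number of single-character insertions, deletions,
// or substitutions needed to turn `a` into `b`.
const editDistance = (a: string, b: string): number => {
  // prev[j] is the distance between the first i-1 characters of a and the first j of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
};

// Suggest the closest vocabulary term, or undefined if nothing is within maxDistance.
const suggestSpelling = (
  word: string,
  vocabulary: string[],
  maxDistance = 2,
): string | undefined => {
  let best: string | undefined;
  let bestDistance = maxDistance + 1;
  for (const term of vocabulary) {
    const distance = editDistance(word.toLowerCase(), term);
    if (distance < bestDistance) {
      best = term;
      bestDistance = distance;
    }
  }
  return best;
};

console.log(suggestSpelling('shoos', ['shoes', 'shirts', 'socks'])); // shoes
```

In a real setup the vocabulary would be the terms already present in the index, which is why suggestions only ever point to words that actually return results.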

Solr also offers a range of configuration options and customization capabilities, which allow businesses to tailor the search experience to their specific needs. For example, businesses can configure Solr to weight certain fields more heavily in search results, or to prioritize certain products or categories over others. Solr can also be integrated with a variety of third-party tools and services, including content management systems, analytics platforms, and headless e-commerce engines such as Unchained.
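
One common way to weight fields is the `qf` parameter of Solr's Extended DisMax (`edismax`) query parser. The sketch below only builds the request parameters; the field names follow the `*_txt_de` pattern used later in this article, and the boost values are arbitrary assumptions.

```typescript
// Build edismax parameters that search several fields with different weights.
const buildWeightedQuery = (text: string, boosts: Record<string, number>): URLSearchParams => {
  const params = new URLSearchParams();
  params.set('q', text);
  params.set('defType', 'edismax'); // select the Extended DisMax query parser
  // qf lists the fields to search; "^n" multiplies the score contribution of a field.
  const qf = Object.keys(boosts)
    .map((field) => `${field}^${boosts[field]}`)
    .join(' ');
  params.set('qf', qf);
  return params;
};

const weighted = buildWeightedQuery('bandage', { title_txt_de: 3, subtitle_txt_de: 1 });
console.log(weighted.get('qf')); // title_txt_de^3 subtitle_txt_de^1
```

With these weights a match in the title counts three times as much as a match in the subtitle. Sensible values depend on the catalog and are best found by testing real queries.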

Benefits of Using Solr for E-Commerce Search

Solr is a powerful search engine that can significantly enhance the search capabilities of e-commerce websites. In this section, we will discuss the benefits of using Solr for e-commerce searches.

  1. More Accurate Search Results:
    Solr's text analysis and relevance ranking usually deliver more accurate search results than a simple substring or database text match. This is because Solr analyzes both the indexed product data and the user's query in the same way and scores every match for relevance. Solr's faceted search feature also allows users to filter search results based on specific product attributes such as color, price, and size.

  2. Scalability and Fault-tolerance:
    Solr is designed to be highly scalable and fault-tolerant, making it a reliable choice for large e-commerce sites. Solr can handle large volumes of data and can be easily scaled up or down depending on the website's needs. Additionally, Solr's fault tolerance ensures that search functionality is not disrupted in case of system failures or downtime.

  3. Multilingual Search Capabilities:
    With Solr, you can implement multilingual search capabilities on your e-commerce site, which can help attract and retain customers from diverse linguistic backgrounds. Solr provides support for multiple languages out of the box and allows for advanced language processing features such as stemming, synonyms, and more. It applies language-specific, grammar-aware analysis to the text it indexes. Depending on the configured stop words and stemmer, a search for "some books for" can be reduced to the stem "book" and show you the same results as searching for "book". It can also split up German compound words like "Massgeschneidert", indexing it as "mass" and "schneidern" for example, so that it can also find text that contains "ein auf Mass geschneidertes Hemd".

  4. Customizability and Flexibility:
    Solr is highly customizable and flexible, allowing you to tailor the search experience to your specific e-commerce site requirements. You can customize the search algorithm, indexing, and ranking techniques to ensure that the most relevant products appear at the top of search results.
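
The stemming example from the multilingual point above can be sketched as a toy analysis chain. The stop-word list and the plural-stripping "stemmer" are deliberately naive assumptions made for this illustration; Solr ships proper language-specific filters for both steps.

```typescript
// Toy analysis chain: lowercase -> tokenize -> remove stop words -> stem.
const STOP_WORDS = new Set(['some', 'for', 'the', 'a', 'an']);

// Naive stemmer: strip a trailing "s" from words longer than three characters.
const naiveStem = (token: string): string =>
  token.length > 3 && token.endsWith('s') ? token.slice(0, -1) : token;

const analyze = (text: string): string[] =>
  text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token))
    .map(naiveStem);

console.log(analyze('some books for')); // [ 'book' ]
console.log(analyze('book')); // [ 'book' ]
```

Because both inputs produce the same token, they match the same entries in the index. The same chain has to run at index time and at query time for this to work.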

A Short Introduction to Unchained Engine for E-Commerce

Unchained E-commerce Engine is an open-source e-commerce platform that is designed to provide a flexible and scalable solution for online retailers. It was developed by a team of experienced e-commerce professionals with the goal of providing a platform that is easy to use, customizable, and capable of handling high levels of traffic and transaction volume.

One of the key features of the Unchained E-commerce Engine is its modular architecture and its strict "code is law" methodology, which allow retailers to add and remove functionality as needed and to implement even very complex discounting and pricing logic in code.

This makes it easy to customize the platform to meet the specific needs of a business, and also ensures that the platform can be easily scaled up or down as traffic and transaction volume fluctuates.

How Solr Improves Product Search on Unchained E-commerce Backend

The Unchained Engine is a code-first e-commerce engine that is designed to handle most of an e-commerce site's transactional needs. However, it only provides full-text search out of the box. This means that the search capabilities of an Unchained e-commerce backend can be limited, which can negatively impact the user experience and ultimately lead to lost sales.

This is where Solr comes in. As a dedicated search engine, it addresses the challenges of implementing search on an Unchained e-commerce backend and provides a range of features that improve both the search capabilities and the overall user experience.

One of the key benefits of using Solr with an Unchained e-commerce backend is its support for spell-checking, query suggestions, and synonyms. With Solr, users can find what they are looking for even if they make a spelling mistake or use a synonym. This is particularly important for e-commerce sites that have a large and diverse product catalog, as users may use different terms to describe the same product.

In addition, Solr supports a range of customization options that can further improve search accuracy. For example, businesses can boost or prioritize certain products or categories in search results, which can be particularly useful for promoting seasonal products or best-sellers. This level of customization can help businesses to provide a more personalized and relevant search experience for their customers.

Another powerful feature of Solr is its faceted search capabilities, which allow users to filter search results based on different attributes such as price, brand, or color. This can help users to quickly and easily find the products they are looking for, even if they are unsure of the exact product name or description.

Solr's scalability and fault tolerance also make it a reliable choice for large e-commerce sites. As the size of an e-commerce site grows, so does the complexity of the search queries and indexing required to support it. Solr's distributed architecture allows it to scale with the growing demands of an e-commerce site, while its fault tolerance ensures that search functionality remains available even in the event of hardware or software failures.

Finally, Solr has already been used to improve the search capabilities of e-commerce businesses that use the Unchained Engine as their backend. For example, Publicare, a medical supply retailer, implemented Solr to improve its site search and saw an increase in conversion rates. This demonstrates the value of implementing Solr to improve the search capabilities of an Unchained e-commerce backend.

In summary, Solr is a powerful search engine that can greatly enhance the search capabilities of an Unchained e-commerce backend. Its spell-checking and query suggestion capabilities, faceted search options, scalability and fault tolerance, and track record make it an excellent choice for e-commerce businesses looking to improve their search functionality and drive sales.

How to Implement Solr on Your Unchained Engine-Backed E-commerce Site

Implementing Solr on your Unchained Engine-backed e-commerce site involves several steps that require careful planning and consideration. By organizing your product data into a format that Solr can understand and writing a filter plugin that connects Solr to your Unchained Engine, you can provide more accurate and relevant search results, which can increase engagement and drive sales.

The first step in implementing Solr is to decide whether to use a hosted Solr service or set up Solr on your own servers. Hosted services can be a good option for smaller e-commerce businesses or those with limited technical resources, as they allow you to outsource the infrastructure and maintenance of your search engine. However, larger businesses with more complex requirements may prefer to set up Solr on their own servers for greater control and customization.

Once you've decided on your implementation approach, the next step is to create a Solr core or, when Solr runs in SolrCloud mode, a collection. Either one holds the index that Solr will use to generate search results. This can be done through the Solr web interface, through a command-line tool, or using Solr API endpoints. You'll need to specify the schema for your data, which defines the fields and data types that will be indexed by Solr.

In this article, we will use the Solr API approach to manage the Solr instance, and below you will find some code snippets used to perform just that.

const { SOLR_URL } = process.env;

// Create the "products" collection through Solr's v2 Collections API.
// "publicare" is the name of a configset that must already exist on the Solr instance.
const createCollection = async () => {
 const url = `${SOLR_URL}/api/collections`;
 const response = await fetch(url, {
   method: 'POST',
   headers: {
     'Content-Type': 'application/json',
   },
   body: JSON.stringify({
     create: { name: 'products', config: 'publicare', numShards: 1 },
   }),
 });
 // Solr answers with a JSON status document for both success and failure.
 return response.json();
};

The above snippet creates a collection named products that will hold the product index on the Solr instance. It uses the create command of Solr's Collections API.
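
The schema mentioned earlier can be managed over HTTP as well. The following is a minimal sketch, assuming the collection uses a managed schema and that a field type named `pdate` exists, as it does in Solr's default configset. The `last_modified` field definition and the helper names are assumptions for illustration; fields that follow a dynamic-field pattern such as `*_txt_de` or `*_s` usually need no explicit definition.

```typescript
type FieldDefinition = {
  name: string;
  type: string;
  indexed: boolean;
  stored: boolean;
};

// Wrap field definitions in the Schema API's "add-field" command.
const addFieldsCommand = (fields: FieldDefinition[]) => ({ 'add-field': fields });

// Hypothetical explicit field for the timestamp of the last import.
const schemaCommand = addFieldsCommand([
  { name: 'last_modified', type: 'pdate', indexed: true, stored: true },
]);

// POST the command to the schema endpoint of the given collection.
const updateSchema = async (solrUrl: string, collection: string, command: object) => {
  const response = await fetch(`${solrUrl}/solr/${collection}/schema`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(command),
  });
  if (!response.ok) throw new Error(`Schema update failed: ${response.status}`);
  return response.json();
};

console.log(JSON.stringify(schemaCommand));
```

Calling `updateSchema(SOLR_URL, 'products', schemaCommand)` once after the collection has been created would register the field before the first documents are indexed.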

Because Unchained provides a GraphQL API endpoint, we will be using GraphQL tools to extract the product information from the engine.

After creating the collection, you'll need to index your data. This involves extracting and organizing the product information stored in your Unchained e-commerce engine database into a format that Solr can understand. There are several ways to do this, including using a data import handler or a custom script to send the data to the configured Solr endpoint. You'll also need to tell the import where to find the data to be indexed and where to send it. In this case, a custom script reads the products from the Unchained GraphQL API and sends them to the products collection we created above.

import gql from 'graphql-tag';
import { linkToExecutor, createServerHttpLink } from '@graphql-tools/links';
import localizedProducts from './queries/localizedProducts';

// Number of products fetched from Unchained and sent to Solr per request.
const BATCH_SIZE = 250;
const link = createServerHttpLink({
 uri: process.env.GRAPHQL_ENDPOINT,
});
const request = linkToExecutor(link);
const { SOLR_URL } = process.env;

// Send a batch of documents to the "products" collection and commit immediately.
// Resolves to true when Solr accepted the batch.
const updateSolr = async (body: any): Promise<boolean> => {
 if (!SOLR_URL) return false;

 const response = await fetch(`${SOLR_URL}/solr/products/update?commit=true`, {
   method: 'POST',
   headers: {
     'Content-Type': 'application/json',
   },
   body: JSON.stringify(body),
 });
 return response.status === 200;
};

// Fetch one page of products, index it, then recurse until a page comes back empty.
// Resolves to the IDs of all products that were indexed.
async function productsBatch(offset = 0): Promise<string[]> {
 const { data } = await request({
   document: gql(localizedProducts),
   variables: {
     limit: BATCH_SIZE,
     offset,
     filter: {
       published: true,
       title: { $ne: '' },
     },
     frFilter: {
       published: true,
       title_fr: { $ne: '' },
     },
     itFilter: {
       published: true,
       title_it: { $ne: '' },
     },
   },
 });
 // `data` is assumed to be an array of product records;
 // adjust this if your query nests the list under a field.
 if (!data?.length) return [];

 // Map each product to a Solr document with one set of text fields per language.
 const prepared = data.map((item: any) => {
   return {
     id: item._id,
     collection_s: 'products',
     title_txt_de: item?.de?.newsletterTitle,
     subtitle_txt_de: item?.de?.subTitle,
     slug_txt_de: item?.de?.slug,
     title_txt_fr: item?.fr?.newsletterTitle,
     subtitle_txt_fr: item?.fr?.subTitle,
     slug_txt_fr: item?.fr?.slug,
     title_txt_it: item?.it?.newsletterTitle,
     subtitle_txt_it: item?.it?.subTitle,
     slug_txt_it: item?.it?.slug,
     last_modified: new Date().toISOString(),
   };
 });
 const success = await updateSolr(prepared);
 const importedIDs: string[] = success ? prepared.map((doc: any) => doc.id) : [];
 return [...importedIDs, ...(await productsBatch(offset + BATCH_SIZE))];
}

const updateProductsSearchIndex = async () => {
   await productsBatch(0);
};

Once your data is indexed, you'll need to configure Solr to ensure that it is optimized for your specific e-commerce site's needs. This may involve configuring search fields, synonyms, stop words, and other parameters that affect how Solr processes search queries. It's important to test and refine your Solr configuration to ensure that it produces accurate and relevant search results.

To optimize Solr performance, there are several best practices you can follow. One tip is to monitor search analytics to identify patterns and trends in user behavior, which can help you fine-tune query parameters and boost certain products or categories in search results. Another tip is to use caching to reduce query times and improve overall search performance. You may also want to consider implementing features like faceted search, spell-checking, and query suggestions to improve the user experience and increase engagement.
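
Solr maintains its own internal caches, which are configured on the Solr side. As a complementary and much simpler technique, an application can also cache the results of frequent queries for a short time before they ever reach Solr. The class below is a minimal sketch of such an application-side cache; the class name, the injectable clock, and the time-to-live value are assumptions for illustration.

```typescript
// Small in-memory cache whose entries expire after a fixed time-to-live.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so that expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the entry so the caller falls through to a fresh search.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Return a cached result if one exists, otherwise run the search and remember it.
const cachedSearch = async (
  cache: TtlCache<string[]>,
  query: string,
  runSearch: (query: string) => Promise<string[]>,
): Promise<string[]> => {
  const hit = cache.get(query);
  if (hit) return hit;
  const result = await runSearch(query);
  cache.set(query, result);
  return result;
};
```

A short time-to-live, for example `new TtlCache<string[]>(60_000)`, keeps popular queries fast while limiting how long a newly indexed product stays invisible to cached searches.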

There are many resources available for learning more about Solr and troubleshooting common issues. The Apache Solr website provides comprehensive documentation, tutorials, and forums where users can ask questions and get support. Additionally, there are many online courses and communities dedicated to Solr, such as Solr Training, which offers in-depth training on Solr implementation and optimization.

Finally, after indexing and optimizing Solr, you will need to create a custom Filter plugin to extend the default Unchained full-text-based product search plugin to use Solr instead.

import { FilterDirector, FilterAdapter } from '@unchainedshop/core-filters';
import type { IFilterAdapter } from '@unchainedshop/types/filters';
import doProductsSearch from '../lib/search/doProductsSearch';

// Filter adapter that answers product searches from Solr instead of the default full-text search.
const SolrSearch: IFilterAdapter = {
 ...FilterAdapter,

 key: 'solr-products-search',
 label: 'Solr Product search',
 version: '1.0',
 orderIndex: 1,

 actions: (params) => {
   return {
     ...FilterAdapter.actions(params),
     async searchProducts({ productIds }) {
       const { queryString } = params.searchQuery;
       // Without a search term there is nothing to ask Solr; pass the IDs through unchanged.
       if (!queryString) return productIds;
       // Fall back to German when the request carries no locale.
       const language = params.localeContext?.language || 'de';
       const docs = await doProductsSearch(queryString, language, {
         log: this.log,
       });
       // Solr returns documents; Unchained only needs the product IDs.
       return docs.map((doc) => doc.id);
     },
   };
 },
};

// Register the adapter so the engine uses it for product search.
FilterDirector.registerAdapter(SolrSearch);
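
The plugin above imports a `doProductsSearch` helper that is not shown in this article. The following is a hypothetical sketch of what such a helper could look like, not the original implementation. It assumes the `products` collection and the `*_txt_de`, `*_txt_fr`, and `*_txt_it` fields from the indexing script, and it uses Solr's `edismax` query parser; the title boost and the row limit are arbitrary choices.

```typescript
// Hypothetical sketch of lib/search/doProductsSearch.
type SolrDoc = { id: string };
type SearchOptions = { log?: (message: string) => void };

const SUPPORTED_LANGUAGES = ['de', 'fr', 'it'];

// Build the /select parameters: search the localized fields, weighting the title highest.
const buildSearchParams = (queryString: string, language: string): URLSearchParams => {
  const lang = SUPPORTED_LANGUAGES.indexOf(language) >= 0 ? language : 'de';
  const params = new URLSearchParams();
  params.set('q', queryString);
  params.set('defType', 'edismax');
  params.set('qf', `title_txt_${lang}^3 subtitle_txt_${lang} slug_txt_${lang}`);
  params.set('fl', 'id'); // the filter plugin only needs the document ids
  params.set('rows', '100'); // arbitrary cap for this sketch
  return params;
};

const doProductsSearch = async (
  queryString: string,
  language: string,
  { log }: SearchOptions = {},
): Promise<SolrDoc[]> => {
  const { SOLR_URL } = process.env;
  if (!SOLR_URL) return [];
  const params = buildSearchParams(queryString, language);
  const response = await fetch(`${SOLR_URL}/solr/products/select?${params.toString()}`);
  if (!response.ok) {
    if (log) log(`Solr search failed with status ${response.status}`);
    return [];
  }
  // Solr wraps the matching documents in { response: { docs: [...] } }.
  const json = (await response.json()) as { response?: { docs?: SolrDoc[] } };
  return json.response?.docs ?? [];
};
```

To match the plugin's import, the function would be exported as the module's default. Returning an empty list on failure means a Solr outage yields no search results rather than an error page; whether that is the right trade-off depends on the shop.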

Conclusion

In conclusion, implementing an efficient search function is critical for any e-commerce site looking to drive sales and retain customers. While there are many search engines available in the market, Solr stands out as a powerful open-source search engine that provides fast and accurate search results, even when dealing with large amounts of data.

One of the main advantages of Solr is its highly customizable search functionality. With Solr, you can easily configure your search parameters to provide highly relevant results to your users. It provides advanced search features, such as faceted search, which allows your users to easily filter and refine their search results based on specific attributes like price, category, color, and more.

Additionally, Solr offers powerful spell-checking and query suggestion features that can help users find what they're looking for even if they misspell a keyword or use a term that is not commonly used. This helps improve the user experience and ultimately drives more sales.

Another important benefit of Solr is its scalability. With Solr, you can index and search large volumes of data in near real time, providing fast and accurate search results to your users, even during peak traffic periods.

Setting up and configuring Solr may require some technical knowledge, but there are many resources available online that can help you get started. Once set up, Solr offers a range of tools for monitoring and optimizing your search performance, such as query analysis and optimization, monitoring search analytics, and tuning query parameters.

In summary, Solr is a highly customizable, scalable, and powerful search engine that can help e-commerce sites improve their search capabilities, drive more sales, and ultimately increase customer loyalty. If you are a CTO, Solr can be a valuable addition to your e-commerce tech stack. Take the time to research and explore Solr, and see how it can help take your e-commerce site's search capabilities to the next level.

Top comments (2)

Samuel Rouse

Hi Mikael!

Thanks for cross-posting your article about Solr to DEV! My experience with Solr is limited, but I have worked with teams that leveraged it at billion-dollar-scale e-commerce sites, so it's exciting to see more articles about it.

Please forgive me if I am mistaken, but it appears this article may have leveraged generative AI. DEV's Guidelines for AI-assisted Articles include the following:

AI-assisted and -generated articles should not…
Promote any business, program, or course (including your own).

I am not certain if this article is against these guidelines, but I wanted to warn of it, just in case.

Mikael Araya

Hi Samuel,

Thank you for showing interest in this article. I understand your concern but this article is not intended for a promotion purpose, I'm just sharing my real-world experience working with SOLR.
Please feel free to reach out if you think this is breaking any guidelines and I'll gladly make the changes necessary to make it compliant.