DEV Community

Thinking out code

Speed up your search app with Redis

Search apps, browsers, result lists, and so on: a huge percentage of apps have some feature for finding detailed info in a database, in some cases handling many requests per minute. That feature can range from finding a single word to returning entire rows of information.

Let’s pretend we have a hypothetical user called Bob. He needs a Star Wars application that lets him find info about his favorite characters from the Star Wars movies by searching for a complete word. Bob needs an application with a fast response time that consumes different data sources.

We’re talking about a considerable amount of content, including people, planets, starships, and any other interesting topic from those movies. Let’s say we are the team behind that development. First of all, let’s clarify the original premise with the flow:

Basic flow app


As shown in the diagram above, there is one case with 3 possible scenarios:

  • Bob searches for a word, the word is found on the API source, and the result list is returned.
  • Bob searches for a word, the word isn’t found on the API source but is found on the database source, and the result list is returned.
  • Bob searches for a word, but the word isn’t found on either the API source or the database source, so a no-content response is returned.

Bob has his desired application, at least in theory: a global search across all the content from his favorite movies, backed by two different data sources.
Let’s bring some code to life to test this design. I’ll show every component step by step.

Hands-on code

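import java.util.List;
import java.util.Optional;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// The value of @RestController only names the bean; it is not a URL prefix,
// so the endpoint is served at "/" (add @RequestMapping("/finder") for a prefix).
@RestController
public class SearchRestController {

    @Autowired
    private SearchService searchService;

    @GetMapping("/")
    public ResponseEntity<List<ItemDTO>> searching(@RequestParam String word) {
        long start = System.currentTimeMillis();
        Optional<List<ItemDTO>> results = searchService.getListBy(word);
        // Log how long the lookup took, in milliseconds
        System.out.println("Searching for " + (System.currentTimeMillis() - start) + " ms");
        // Responds 200 with the list, or 404 when the Optional is empty
        return ResponseEntity.of(results);
    }
}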

This is a simple controller with a single method, searching, mapped with @GetMapping. It receives a word as a request parameter, calls the SearchService method, and responds with the list of results found.

@Component
public class SearchService {

    @Autowired private ApiClient apiClient;
    @Autowired private StarWarsRepository starWarsRepository;
    @Autowired private ObjectMapper mapper;

    // Try the API first; query the database only when the API returns nothing
    public Optional<List<ItemDTO>> getListBy(String word) {
        return apiClient.findBy(word)
                .or(() -> findAllByName(word));
    }

    private Optional<List<ItemDTO>> findAllByName(String word) {
        List<Item> items = starWarsRepository.findAllByName(word);
        if (items.isEmpty()) return Optional.empty();
        // Map each entity to its DTO
        List<ItemDTO> itemDTOS = new ArrayList<>();
        items.forEach(item -> itemDTOS.add(mapper.convertValue(item, ItemDTO.class)));
        return Optional.of(itemDTOS);
    }
}

It receives a word from the controller and:

  • Calls the API client.

    @Component
    public class ApiClient {
    
        @Autowired
        private RestTemplate restTemplate;
        static final String API_URL = "https://swapi.dev/api/people";
    
        public Optional<List<ItemDTO>> findBy(String word) {
            // The {word} placeholder lets RestTemplate URL-encode the search term
            ResultsDTO body = restTemplate
                    .getForEntity(API_URL + "?search={word}", ResultsDTO.class, word)
                    .getBody();
            // Treat a missing body or an empty result list as "not found"
            if (body == null || body.getResults() == null || body.getResults().isEmpty())
                return Optional.empty();
            return Optional.of(body.getResults());
        }
    }
    

    It sends a request with the search term to the Star Wars API.

  • If no result is returned, it calls the database repository.

    @Repository
    public interface StarWarsRepository extends CrudRepository<Item, Long> {
    
        List<Item> findAllByName(String word);
    }
    

    It looks up the word in the item table (Spring Data derives an exact-match query from the findAllByName method name). Since this is a proof of concept, it uses a simple H2 database; you could swap in another SQL driver. To fill the database, run the following script, adapting it as needed.

    insert into item(name, height) values('name1', '121');
    insert into item(name, height) values('name2', '121');
    insert into item(name, height) values('name3', '121');
    insert into item(name, height) values('name4', '121');
    insert into item(name, height) values('name5', '121');
    

The structure of the principal object would be:

public class Item {
    private Long id;
    private String name;
    private String height;
}

Nothing crazy, just basic fields, with the boilerplate handled by Lombok annotations.

Our pom file declares the following dependencies:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-redis</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>com.h2database</groupId>
            <artifactId>h2</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.24</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>redis.clients</groupId>
            <artifactId>jedis</artifactId>
            <version>3.9.0</version>
        </dependency>
  • Jedis: a Java client for Redis.
  • spring-boot-starter-data-redis: provides easy configuration and access to Redis from Spring apps, offering both low-level and high-level abstractions for interacting with the store, so you don't have to worry about infrastructure code.
  • h2: an in-memory database.
  • Lombok: generates boilerplate code (getters, setters, constructors) from annotations.
  • spring-boot-starter-data-jpa: implements the data access layer.

With those configurations and custom classes in place, the application now covers the basic design. With it up and running, let’s see how the search process performs in terms of time (ms) in the next 2 possible scenarios:

  • API

    curl http://localhost:8080/?word=luke
    
    {
        "name": "Luke Skywalker",
        "height": "172"
    }
    
    Searching for 1237 ms
    
  • Database

    curl http://localhost:8080/?word=name1
    
    {
        "name": "name1",
        "height": "121"
    }
    
    Searching for 1289 ms
    

After executing those cases, take a look at the next table:

| Case | API data source (ms) | Database source (ms) |
| --- | --- | --- |
| Results found (response time) | 1237 | 1289 |

As shown, the database source takes a little longer than the API source, which is expected because the service only queries the database after the API call has come back empty. The response time is not a big deal right now, but what happens when the database grows, or when there are not just two data sources (API, database) but many more? Could our app search faster?

The answer is yes, but we need to take into account the corner cases it has right now. In the next implementation, we’ll look at one of them.

The problem of searching for the same thing over and over again

What about one or more Bobs sending many requests with the same word? Is it worth running the whole search process every time for the same term?

Taking this scenario as an example, let's think about how to improve performance: instead of hitting several APIs or data sources when a term has been searched before, we query a common source that stores those repeated searches and responds faster, something like this:

Search design with a third source feature.


The third data source acts as cache storage, where a cache is understood as a temporary store.

A possible new improvement…

It is the same case as at the beginning, with 2 additional steps: storing results in a third source and reading them back from it.

  • Bob searches for a word, the word is found on the API source, the result is stored in the third source, and it is returned.
  • Bob searches for a word, the word isn’t found on the API source but is found on the database source, the result is stored in the third source, and it is returned.
  • Bob searches for a word, the word is found on the third source, and the result list is returned.
  • Bob searches for a word, but the word isn’t found on the API source, the database source, or the third source, so a no-content response is returned.

There are no additional steps when the data has been searched before: it was stored on the first request and stays available for later requests, which avoids repeated searches and improves the app's performance.

It looks simple, but what characteristics must this third data source have?
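The four scenarios above can be sketched in plain Java. This is only an illustration under assumptions: the CachedSearch class, the two source functions, and the in-memory map standing in for the third source are hypothetical names, not part of the article's project.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class CachedSearch {

    // Stand-in for the third source; a plain in-memory map is used here for illustration.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, Optional<List<String>>> apiSource;
    private final Function<String, Optional<List<String>>> databaseSource;

    public CachedSearch(Function<String, Optional<List<String>>> apiSource,
                        Function<String, Optional<List<String>>> databaseSource) {
        this.apiSource = apiSource;
        this.databaseSource = databaseSource;
    }

    public Optional<List<String>> search(String word) {
        // Word searched before: answer from the third source.
        List<String> cached = cache.get(word);
        if (cached != null) {
            return Optional.of(cached);
        }
        // Otherwise try the API first, then fall back to the database.
        Optional<List<String>> found = apiSource.apply(word)
                .or(() -> databaseSource.apply(word));
        // Store only real results; "nothing found" is not cached.
        found.ifPresent(results -> cache.put(word, results));
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> api = Map.of("luke", List.of("Luke Skywalker"));
        Map<String, List<String>> db = Map.of("name1", List.of("name1"));
        int[] sourceCalls = {0}; // counts how often a real source is queried

        CachedSearch search = new CachedSearch(
                word -> { sourceCalls[0]++; return Optional.ofNullable(api.get(word)); },
                word -> { sourceCalls[0]++; return Optional.ofNullable(db.get(word)); });

        System.out.println(search.search("luke"));  // found on the API
        System.out.println(search.search("luke"));  // served from the cache
        System.out.println(search.search("name1")); // found on the database
        System.out.println(search.search("yoda"));  // not found anywhere
        System.out.println("source calls: " + sourceCalls[0]);
    }
}
```

The second search for the same word never reaches the API or the database, which is the whole point of the third source.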

  • Low latency.
  • Capable of saving and returning data quickly.
  • Searching by word.
  • Native integration with our tech stack.

An option that probably fits these specs is Spring Cache, which offers several options for this kind of design: it stores and retrieves recent results with a small amount of configuration and a minimal response time.

Making it real with Spring and Redis cache integration

Spring Cache is an abstraction that uses proxies to intercept the request flow, which lets us add a third component. Here that component is Redis, acting as cache storage thanks to its nature, easy integration, and fast response time.

Redis also:

  • Can perform more than 11000 sets per second and more than 8000 gets per second.
  • Has no schemas: in this kind of (key-value) database, the stored objects need no strict definition, and values can range from a simple string to a serialized POJO.
  • Can expire stored values, which makes it suitable for temporary data.
  • Makes index configuration easy.
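To make the expiry idea concrete, here is a minimal plain-Java sketch of a store whose entries disappear after a time to live. It only imitates the concept: in Redis the expiry is handled by the server, and the ExpiringCache class and its injectable clock are hypothetical names introduced for this example.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

public class ExpiringCache {

    // A stored value together with the moment it stops being valid.
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the example can fast-forward time

    public ExpiringCache(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    public void put(String key, String value) {
        entries.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
    }

    public Optional<String> get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key); // expired: drop it and report a miss
            return Optional.empty();
        }
        return Optional.of(entry.value);
    }

    public static void main(String[] args) {
        long[] now = {0};
        ExpiringCache cache = new ExpiringCache(Duration.ofMinutes(60), () -> now[0]);

        cache.put("luke", "Luke Skywalker");
        System.out.println(cache.get("luke").isPresent()); // still within the time to live

        now[0] += Duration.ofMinutes(61).toMillis();       // pretend 61 minutes passed
        System.out.println(cache.get("luke").isPresent()); // the entry has expired
    }
}
```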

Updating our previous design, we have the following:

Redis, representing our cache storage


Now, we have the whole picture, with Redis in it as our third data source.

Let’s update our code

Adding the @EnableCaching annotation to the main class:

@SpringBootApplication
@EnableCaching
public class GrapefruitApplication {

    public static void main(String[] args) {
        SpringApplication.run(GrapefruitApplication.class, args);
    }
}

This annotation registers the components related to cache management, such as the CacheInterceptor and the other proxies that make @Cacheable work. Spring offers the following annotations:

  • @CacheEvict evicts a mapping based on a key.
  • @CachePut always invokes the method and stores its result in the associated cache, subject to the condition() and unless() expressions.
  • @Caching groups several cache annotations.
  • @Cacheable indicates that the result of invoking a method (or all methods in a class) can be cached.

How does @Cacheable work in terms of Redis?

It performs a get by key against Redis; if nothing comes back, it invokes the method and then performs a put to store the newly found result. The Redis cache can store null results by key, but this behavior can be switched off in the cache configuration shown further down.

Let’s see how the service method looks after the update.
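The get, invoke, put sequence can be imitated with a JDK dynamic proxy. This is a simplified sketch, not Spring's actual implementation: the Finder interface and the cached helper are hypothetical, and a map stands in for Redis.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheableSketch {

    // Hypothetical service interface, standing in for the search service.
    public interface Finder {
        String find(String word);
    }

    // Wraps a Finder in a proxy that caches the results of find(...) by argument.
    public static Finder cached(Finder target, Map<String, String> cache) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!method.getName().equals("find")) {
                return method.invoke(target, args);      // other methods pass through
            }
            String key = (String) args[0];
            String hit = cache.get(key);                 // 1. get by key
            if (hit != null) {
                return hit;                              // 2. hit: the method is skipped
            }
            Object result = method.invoke(target, args); // 3. miss: invoke the method
            if (result != null) {                        // same idea as unless = "#result == null"
                cache.put(key, (String) result);         // 4. put the new result
            }
            return result;
        };
        return (Finder) Proxy.newProxyInstance(
                Finder.class.getClassLoader(), new Class<?>[] {Finder.class}, handler);
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts invocations of the real method
        Finder slow = word -> {
            calls[0]++;
            return "luke".equals(word) ? "Luke Skywalker" : null;
        };
        Finder finder = cached(slow, new ConcurrentHashMap<>());

        finder.find("luke");
        finder.find("luke"); // answered by the proxy; the target is not called again
        System.out.println("target calls: " + calls[0]);
    }
}
```

The caller only sees a Finder; the caching happens in the proxy that sits in front of the real object, which is why the annotated method itself stays free of cache code.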

    @Cacheable(value = "itemCache",
            key = "{#word}", unless="#result == null")
    public Optional<List<ItemDTO>> getListBy(String word) {
        return apiClient.findBy(word)
                .or(() -> findAllByName(word));
    }

As I said before, the key identifies the objects stored in Redis, the value property is the name of the cache in Redis, and the unless property prevents null results from being stored.
Along the same lines, the RedisCacheManagerBuilderCustomizer and RedisCacheConfiguration beans let us set the expiration time for cache values, the serialization strategy, and so on.

    @Bean
    public RedisCacheConfiguration cacheConfiguration() {
        return RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(60))
                .disableCachingNullValues()
                .serializeValuesWith(
                        RedisSerializationContext.SerializationPair.fromSerializer(new GenericJackson2JsonRedisSerializer()));
    }

    @Bean
    public RedisCacheManagerBuilderCustomizer redisCacheManagerBuilderCustomizer() {
        return (builder) -> builder
                .withCacheConfiguration("itemCache", cacheConfiguration());
    }

Then we set this up with the Redis configuration file (application.yaml).

spring:
  datasource:
    url: jdbc:h2:mem:starwars
    driverClassName: org.h2.Driver
  jpa:
    defer-datasource-initialization: true
  redis:
    host: yourHost
    port: yourPort
    username: yourUsername
    password: yourPassword

You could use a local Redis environment or the Redis Labs environment, whose free plan offers 30MB of storage with high availability and is easy to set up. In this link, there are some references on how to connect to the Redis cloud environment that you may find useful.

With our application updated with all the configuration above, it only takes a few tests to see how this new feature works:

Let’s try searching for the word luke:

curl http://localhost:8080/?word=luke

{
    "name": "Luke Skywalker",
    "height": "172"
}

Searching for 1237 ms

Let’s try the same search again:

curl http://localhost:8080/?word=luke

{
    "name": "Luke Skywalker",
    "height": "172"
}

Searching for 175 ms

175 ms! An improvement of more than 85%.

Look what we have here! The response time is about 86 percent lower than the previous result. Of course, the same request had already been executed in a previous step, so this time it went through our cache interceptor, which returned the result list stored earlier.
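For reference, the relative improvement comes from the two timings logged above (1237 ms for the first request, 175 ms for the repeated one). A small sketch of the arithmetic, where the Improvement class and the reductionPercent method are names made up for this example:

```java
import java.util.Locale;

public class Improvement {

    // Relative reduction in response time, as a percentage of the original time.
    public static double reductionPercent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    public static void main(String[] args) {
        // (1237 - 175) / 1237 = 1062 / 1237, which is roughly 0.86
        double percent = reductionPercent(1237, 175);
        System.out.println(String.format(Locale.ROOT, "%.1f%% lower response time", percent));
    }
}
```

These are single measurements from one run, so the figure is indicative rather than a benchmark.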

Updating the previous comparison table, we have the following:

| Case | API data source (ms) | Database source (ms) | Redis source (ms) |
| --- | --- | --- | --- |
| Results found | 1237 | 1289 | 175 |

In big O terms, a search that has been requested before becomes a single key lookup in the cache, which is O(1) on average, so we save both time and steps: neither the API nor the database is queried again.

Think about how this could improve content search, dictionary apps, browsers, and wiki explorers.

Oh yeah! This looks really good, but what about the cons?

Well, yes, there is a disadvantage I would like to mention: this feature is a poor fit when our data sources change constantly, because the cache doesn’t hold their latest version, only the snapshot taken when the user last requested them. If we are talking about bank accounts and their balances, this could be a problem. Another problematic case is a delivery tracking application, which could not show the latest state of your order.

Or when you search for items in an e-commerce catalog, you need the stock of each item to be up to date so that you don't buy something that is out of inventory. An improvement that could fit this type of app is recent searches: suggesting results to the user based on the first letters typed, where the cache key would be those starting letters and the cached object the related list of items.
Surely the cache store has a few more disadvantages, but like most of the tools in use today, it is not a golden hammer that fixes every problem; it is a tool for a particular case.
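The stale-snapshot problem described above is easy to reproduce in plain Java. In this sketch, StaleCacheDemo and its two maps are hypothetical stand-ins for the real data source and the cache; removing the entry on update plays the role that @CacheEvict would play in a Spring application.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleCacheDemo {

    private final Map<String, Integer> database = new HashMap<>(); // source of truth
    private final Map<String, Integer> cache = new HashMap<>();    // last snapshot served

    // Updates the stock in the database, optionally evicting the cached snapshot.
    public void updateStock(String item, int stock, boolean evict) {
        database.put(item, stock);
        if (evict) {
            cache.remove(item);
        }
    }

    // Reads through the cache: the database is only consulted on a miss.
    public Integer readStock(String item) {
        return cache.computeIfAbsent(item, database::get);
    }

    public static void main(String[] args) {
        StaleCacheDemo shop = new StaleCacheDemo();

        shop.updateStock("lightsaber", 3, false);
        System.out.println(shop.readStock("lightsaber")); // first read fills the cache

        shop.updateStock("lightsaber", 0, false);         // sold out, cache not told
        System.out.println(shop.readStock("lightsaber")); // still the old snapshot

        shop.updateStock("lightsaber", 0, true);          // same update, with eviction
        System.out.println(shop.readStock("lightsaber")); // fresh value from the database
    }
}
```

Eviction on every write keeps reads correct, but the more often the data changes, the less the cache helps, which is why fast-changing data is a poor fit for this approach.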

So, to sum this up in a nutshell

  • Take a common search feature that can be found in many apps.
  • Identify a way to make it faster.
  • Find a third factor, a data source, that helps to achieve it.
  • Use Redis together with a well-known framework like Spring.
  • Voilà! Run it and see the results.

Here is the app repo.

JesusIgnacio / grapefruit

Redis with Spring cache for fast search results

Note: Yes, I called it Grapefruit. It has nothing to do with the app's objective, but I find it refreshing to name my repos after fruits.

That is all for now. I hope you enjoyed this article as much as I enjoyed writing it, and thank you for taking the time to read it.
If you have any questions, please let me know.
Best to you!

This post is in collaboration with Redis.
