DEV Community

Cover image for Optimising a web application (III): projecting
Rubén Rubio
Rubén Rubio

Posted on

Optimising a web application (III): projecting

Introduction

With this post, we finish the series of optimising a web application. As a reminder: the application consists of a Symfony application to manage the entrance and exit to buildings using validation of previously purchased tickets.

In the previous post, we performed some optimisations in an application by indexing the database. We improved the performance by a 57 %. However, the absolute response times were of around 300 ms, that we consider too high. In this post, we will investigate further issues and solve them.

Issue

As in the previous case, we relied on New Relic to identify the problem. After reviewing both the endpoints and the database metrics, we found the queries that were causing the issue. The application provides some real-time data for its users: the number of people inside a building and the names of the people inside a building (only for administrators). To provide this information, the application uses tables with logs of all the check-ins and check-outs of users.

So, in order to calculate the number of people there are now inside a building, the application counts the total of check-ins and checkouts along its whole existence, i.e., it scans data from years ago to give the numbers of the present moment. The query it uses is:

SELECT COUNT(event_checkin.id)
FROM event_checkin
WHERE (event_checkin.building_id = :building_id)
  AND (DATE(event_checkin.created_on) = DATE(:date)
  /*...*/
Enter fullscreen mode Exit fullscreen mode

The query is quite complex and used in several paths, affecting business rules, so we decided not to change it and look for alternatives.

Solution

We want to improve the performance for end-users and administrators, as they are using the application to check in/checkout: they need it to be fast to access/leave the building. The rest of paths are not critical in terms of response time.

We actually only want two pieces of data for these critical paths:

  • The number of people inside the building.
  • The names and some extra data, like the email, of the people inside the building (only available for the administrators).

What we can do is have this data prepared: we can act when the user checks-in and checks-out. We will use events to denormalize the data we need and generate projections: we will save together all data we need, so we only have to perform simple queries to get it.

The schema for the check-in will be like this:

Check-in/Checkout schema

When a user checks in, an event is fired to a queue service. Afterwards, two consumers process the event and execute the required action: updating the number of people inside a building or generating a projection with the user data. The checkout case is analogous.

We will only perform the changes to these paths, and leave the rest of paths untouched, so we do not affect the domain logic.

Implementation

The application uses hexagonal architecture, so these two commands, for check-in and checkout, are isolated and well identified in code. We had to perform the following changes:

  • Fire the events from the domain entities when a ticket is checked-in/checked-out.
  • Consume the events and generate/update the projections:
    • Update the number of people inside a building, by increasing or decreasing a value, depending on if the command is a check-in or a checkout, respectively.
    • Insert or delete a row with the data of the use inside a building, depending on if the command is a check-in or a checkout, respectively.
  • Change the paths from the external layer (the controllers) to the domain layer, so it reads the projected data instead of querying the database.

We could use Redis or another persistence system, but we chose to use MySQL to avoid performing more changes to the project. As the project uses hexagonal architecture, it is easy to swap the implementation in the future if required. We used Symfony Messenger with RabbitMQ as broker for messaging. How we set it up is beyond the scope of this post.

In the end, the queries we had to perform were simple and fast. To get the data of the people inside a building, the query becomes:

SELECT total_in
FROM view_building_data
WHERE building_id = :building_id
Enter fullscreen mode Exit fullscreen mode

And the query to get the information of people inside a building:

SELECT *
FROM view_users_inside_building
WHERE building_id = :building_id
Enter fullscreen mode Exit fullscreen mode

Both queries are using primary keys, so they are optimal.

Results

Unfortunately, the application is seasonal, i.e., it is only used in Summer, and we deployed these changes in Autumn. Therefore, we can not validate the results in production until next Summer.

We did, nonetheless, some checks ourselves. Both endpoints' response time was of around 1.1 seconds (with the possibility to get higher as the data grows up). After deploying the changes, the response time decreased to 150ms. As both queries are simple and use primary indexes, we are sure that this response time will always be similar.

Conclusions

We can allegedly affirm that we improved the response time. We will know for sure next Summer (there will be a post about it).

As for future improvements, we could stop using the logs tables everywhere they are used, and analyse that data differently, instead. It will be done if needed.

Summary

  • Again, using New Relic we identified performance issues in the queries of the application.
  • We proposed a solution based on events to decouple the application and project the data we need.
  • We implemented that solution by only changes those paths of the application that are most important for end-users.
  • We could not assert that the changes work in production, as the application is seasonal.

Top comments (0)