Ganesh Kumar

My AWS SQS Requests Skyrocketed to 1 Million at Month's Start. This is How I Implemented a Cost-Effective Solution.

We started using Amazon SQS for push notifications, triggering emails, and similar tasks, but the way it handles polling led to a big increase in request usage. To address this and save on cloud costs, we switched to Redis. Here’s our story.

At Hexmos, we are building innovative products like Feedback By Hexmos and FeedZap.

To keep our systems efficient and adaptable, we embraced a microservices architecture. This means we built separate, smaller services that work together seamlessly.

One of these services is a unified Identity Service, which manages user accounts and payments across all our products.

But behind the scenes, ensuring smooth communication between the different parts of our system can be a challenge.

Coordinating these independent services, especially for time-sensitive tasks like sending emails, requires a robust and reliable solution.

This article explores how we overcame the limitations of our initial approach and discovered a powerful solution with Redis Streams.

We'll delve into our challenges, why traditional message queuing systems weren't the perfect fit, and how Redis Streams helped us build a resilient notification system that keeps users informed and engaged.

Our Existing Microservices Architecture

Building Scalable Systems with Microservices

Our current architecture involves two services:

  1. Leave Request API:
    • Processes leave requests.
    • Creates a message containing relevant data.
    • Pushes the message to the message queue.
  2. Email API:
    • Processes received messages by sending appropriate emails.

Direct communication between these services can cause problems:

  • Tight Coupling: Changes in one service impact the other, hindering independent development.
  • Data Loss: If the Email API is down during signup, the email notification might be lost. Customers might not receive verification emails or leave notifications.


The Identity Service: A Foundation for User Experience

Why Two Services?

Why we use two separate backends in our architecture:

  • Backend dedicated to product: Handles core functionalities of the product.
  • Backend dedicated to emails and notifications: Manages all email and notification-related tasks.

This separation allows each service to be developed, deployed, and scaled independently, increasing the system's resilience and maintainability.

AWS SQS: A Step Towards Decoupling

We integrated AWS SQS as a connection between these services:

  1. Leave Request API:
    • Processes leave requests.
    • Creates a message (e.g., a JSON object) containing relevant data (user details, leave type, duration, etc.).
    • Pushes the message to the message queue.
  2. Message Queue (SQS):
    • Stores the message until it's processed.
    • Provides reliability and durability guarantees.
    • Can handle varying message rates, ensuring system scalability.
  3. Email API:
    • Continuously polls the message queue for new messages.
    • Processes received messages by sending appropriate emails.
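
To make the flow concrete, here is a minimal sketch of what the two sides might look like with boto3. The queue URL, payload fields, and the send_leave_email helper are illustrative placeholders, not our actual implementation.

```python
import json

import boto3

# Assumed names for illustration only.
sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/email-events"  # placeholder


def send_leave_email(data: dict) -> None:
    # Stand-in for the real email-sending logic in the Email API.
    print("sending leave email:", data)


# Producer side (Leave Request API): push a message describing the event.
def publish_leave_request(user_email: str, leave_type: str, duration_days: int) -> None:
    payload = {
        "user_email": user_email,
        "leave_type": leave_type,
        "duration_days": duration_days,
    }
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))


# Consumer side (Email API): poll the queue and process each message.
def poll_and_send_emails() -> None:
    response = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    for message in response.get("Messages", []):
        data = json.loads(message["Body"])
        send_leave_email(data)
        # Delete only after successful processing, so failed messages are retried.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
```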


If the Consumer BE (Email API) is down, messages are queued and not lost. Once the Consumer BE is back online, it processes the queued messages and sends email notifications using data from SQS.

Benefits of Using a Message Queue

  • Decoupling: Services become independent, improving maintainability and scalability.
  • Reliability: Messages are persisted in the queue, ensuring delivery even if the Email API is temporarily unavailable.
  • Performance: The Leave Request API can process requests faster without waiting for email delivery.
  • Scalability: Each service can be scaled independently to handle increasing load.
  • Error Handling: Retry mechanisms and dead-letter queues handle failed message processing (a configuration sketch follows below).
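
A dead-letter queue is attached to the main queue via a redrive policy: once a message fails processing more than maxReceiveCount times, SQS moves it aside instead of retrying forever. A rough boto3 sketch, with queue names chosen purely for illustration:

```python
import json

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# Create the dead-letter queue and look up its ARN (names are placeholders).
dlq_url = sqs.create_queue(QueueName="email-events-dlq")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Attach a redrive policy to the main queue: after 5 failed receives,
# SQS moves the message to the DLQ for later inspection.
main_url = sqs.create_queue(QueueName="email-events")["QueueUrl"]
sqs.set_queue_attributes(
    QueueUrl=main_url,
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "5"}
        )
    },
)
```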

AWS SQS Challenges and Finding the Perfect Alternative

As we integrated SQS into 7 different services, we started depleting our usage limits due to the free tier's restrictions.

AWS SQS Free Tier Limitations

While AWS SQS offers a convenient solution for message queuing, it presents some limitations for our specific needs:

Free Tier Limitations: The SQS free tier is capped at 1 million requests per month, which restricts the scalability of our growing application.


Hidden Costs: Exceeding the free tier results in significant cost increases, potentially impacting our budget.


Finding a Temporary Fix for It

Amazon SQS offers Short and Long polling options for receiving messages from a queue.

Short polling (default) – The ReceiveMessage request queries a subset of servers (based on a weighted random distribution) to find available messages and sends an immediate response, even if no messages are found.

Long polling – ReceiveMessage queries all servers for messages, sending a response once at least one message is available, up to the specified maximum. An empty response is sent only if the polling wait time expires. This option can reduce the number of empty responses and potentially lower costs.

As a temporary fix, we chose long polling. But this didn't fully solve our problem.
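
Switching to long polling is mostly a one-parameter change on the consumer: setting WaitTimeSeconds on receive_message (or ReceiveMessageWaitTimeSeconds on the queue itself) makes each call wait for messages instead of returning empty immediately. A rough sketch, reusing the sqs client and placeholder QUEUE_URL from the earlier example:

```python
# Long polling: each ReceiveMessage call now waits up to 20 seconds for
# messages instead of returning an empty response right away, which
# reduces the number of billable empty requests.
response = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,  # 20 seconds is the maximum allowed wait time
)
```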


Why Pub/Sub Wasn't the Perfect Fit

We explored alternative message queuing solutions, including Pub/Sub. However, Pub/Sub wasn't a suitable choice due to specific requirements:

Pros of Pub/Sub

Scalability: Pub/Sub systems are designed to handle a large number of publishers and subscribers efficiently.

Decoupling: It promotes loose coupling between systems, as publishers and subscribers don't need to know about each other.

Flexibility: Pub/Sub can be used for various messaging patterns, including publish-subscribe, fan-out, and request-reply.

Cons of Pub/Sub

Message Loss: Pub/Sub systems typically don't guarantee message delivery, especially if subscribers are offline or unable to process messages.

Message Ordering: Message order is not guaranteed in most Pub/Sub systems, which can be a limitation for certain applications.

Complexity: Managing subscribers and handling message delivery can be complex, especially for large-scale systems.

Latency: While Pub/Sub is generally fast, it might not be suitable for applications with strict real-time requirements.

The search for a viable alternative led us to Redis Streams, a powerful data structure within the Redis database.
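
To preview where we ended up, here is a minimal sketch of the same producer/consumer pattern on Redis Streams with redis-py: the producer appends entries with XADD, and the consumer blocks on XREAD until new entries arrive, so nothing published while the consumer was down is lost. The stream name and fields are placeholders.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
STREAM = "email-events"  # placeholder stream name

# Producer (Leave Request API): append an entry; Redis persists it in the stream.
r.xadd(STREAM, {"user_email": "alice@example.com", "leave_type": "sick", "duration_days": "2"})

# Consumer (Email API): read everything after the last ID we processed,
# blocking when the stream is empty instead of busy-polling.
last_id = "0"  # start from the beginning so entries queued while we were down are read too
while True:
    entries = r.xread({STREAM: last_id}, count=10, block=5000)  # block up to 5 seconds
    for _stream, messages in entries:
        for message_id, fields in messages:
            print("sending email for", fields)  # stand-in for the real email logic
            last_id = message_id
```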

Continue Reading: Article

Top comments (2)

Fraser Young

This is an insightful post! Could you elaborate more on the specific challenges faced with Pub/Sub that made Redis Streams a better alternative for your use case?

Ganesh Kumar

Thank you!
Pub/Sub doesn't fit for us because if the consumer is down, the messages published by the Leave Request API are lost forever.
So instead we use Redis Streams with XREAD and its blocking functionality.