DEV Community

Adeline Makokha for AWS Community Builders


A Symphony of EMR, Glue, SNS, SQS, and API Gateway for Seamless Cloud Orchestration

EMR
Amazon EMR is a service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.
▪ EMR utilizes a hosted Hadoop framework running on Amazon EC2 instances.
▪ Managed Hadoop framework for processing huge amounts of data.
▪ Also supports Apache Spark, HBase, Presto, and Flink.
▪ Most commonly used for log analysis, financial analysis, or extract, transform, and load (ETL) activities.
▪ A cluster is a collection of EC2 instances provisioned by EMR to run your Steps.
▪ A Step is a programmatic task that performs some processing on the data.
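
To make the idea of a Step more concrete, here is a minimal sketch of how one Step could be described in Python. The dictionary layout follows the shape boto3 expects for EMR steps; the helper name `build_spark_step`, the step name, and the S3 path are hypothetical and only there for illustration.

```python
from typing import Any


def build_spark_step(name: str, script_uri: str) -> dict[str, Any]:
    """Describe one EMR Step that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        # What the cluster should do if this step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets a step run a command such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
        },
    }


# Hypothetical script location; replace with your own bucket and key.
step = build_spark_step("clean-logs", "s3://example-bucket/jobs/clean_logs.py")
print(step["HadoopJarStep"]["Args"])
```

With boto3 installed and credentials configured, a list of such dictionaries could be passed as `Steps` to the EMR client's `add_job_flow_steps` call, together with the `JobFlowId` of a running cluster.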

GLUE

AWS Glue is a fully-managed, pay-as-you-go, extract, transform, and load (ETL) service that automates the time-consuming steps of data preparation for analytics.
▪ Used to organize, cleanse, validate, and format data for storage in a data warehouse or data lake.
▪ Simply point AWS Glue to your data stored on AWS, and AWS Glue discovers data and stores the associated metadata in the AWS Glue Data Catalog.

Simple Notification Service (SNS)
▪ Durable, secure pub/sub (Publisher - Subscriber) messaging service
▪ SNS is a public AWS service
▪ Coordinates the sending and delivery of messages
▪ Message payloads can be up to 256 KB
▪ SNS topics are the base entity of SNS
▪ A Publisher sends a message to a topic
▪ Topics have Subscribers which receive messages
▪ Examples of subscribers include email, SQS queues, mobile push notifications, and Lambda functions.
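
The publisher, topic, and subscriber relationship described above can be sketched with a tiny in-memory model. This is not the AWS API: the `Topic` class, its methods, and the two "inbox" lists are hypothetical stand-ins that only show how one published message fans out to every subscriber, and how a 256 KB payload limit might be enforced.

```python
from typing import Callable

MAX_PAYLOAD_BYTES = 256 * 1024  # the 256 KB limit quoted above


class Topic:
    """In-memory stand-in for an SNS topic (not the AWS API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: list[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        """Register a callable that will receive every published message."""
        self._subscribers.append(handler)

    def publish(self, message: str) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        if len(message.encode("utf-8")) > MAX_PAYLOAD_BYTES:
            raise ValueError("message exceeds 256 KB")
        for handler in self._subscribers:
            handler(message)
        return len(self._subscribers)


# Two pretend subscribers: an email inbox and a queue.
email_inbox: list[str] = []
queue_inbox: list[str] = []

orders = Topic("orders")
orders.subscribe(email_inbox.append)
orders.subscribe(queue_inbox.append)
delivered = orders.publish("order 42 created")
print(delivered, email_inbox, queue_inbox)
```

The publisher only knows about the topic, never about the individual subscribers, which is what keeps the two sides decoupled.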

Simple Queue Service (SQS)
▪ Amazon Simple Queue Service (Amazon SQS) is a web service that gives you access to message queues that store messages waiting to be processed.
▪ It is a fully managed public service.
▪ SQS is used for distributed/decoupled applications.
▪ Messages can be up to 256 KB in size. For a larger message, you can store the payload in S3 and put a link to it on the queue.
▪ Polling is the process of checking the queue for messages and retrieving them.
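
The "store it in S3 and send a link" idea from the list above can be sketched as follows. This is a toy model that assumes nothing about the real services: a dictionary stands in for the S3 bucket, a list stands in for the queue, and the names `send`, `receive`, `object_store`, and the `payloads/` key prefix are all hypothetical.

```python
import json
import uuid

MAX_MESSAGE_BYTES = 256 * 1024  # the 256 KB limit quoted above

object_store: dict[str, str] = {}  # stand-in for an S3 bucket
queue: list[str] = []              # stand-in for an SQS queue


def send(payload: str) -> None:
    """Enqueue small payloads inline; park large ones in the store and enqueue a pointer."""
    body = json.dumps({"inline": payload})
    if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
        key = f"payloads/{uuid.uuid4()}"
        object_store[key] = payload
        body = json.dumps({"pointer": key})
    queue.append(body)


def receive() -> str:
    """Take the oldest message and resolve a pointer back to the full payload."""
    body = json.loads(queue.pop(0))
    if "pointer" in body:
        return object_store[body["pointer"]]
    return body["inline"]


send("small")
send("x" * 300_000)  # too large to travel inline
first = receive()
second = receive()
print(first, len(second), len(object_store))
```

The consumer never needs to know which path a message took; it always gets the full payload back.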

SQS
▪ SQS is pull-based (polling), not push-based.
▪ Messages can be kept in the queue from 1 minute to 14 days (default is 4 days).
▪ The visibility timeout is the amount of time a message is invisible in the queue after a reader picks it up.
▪ If the job is processed within the visibility timeout, the reader deletes the message from the queue.
▪ If the job is not processed within the visibility timeout, the message becomes visible again and another reader can pick it up.
▪ Dead-letter queues can be used for problem messages. For example, if a message remains unprocessed after being delivered a certain number of times, it can be moved to a dead-letter queue for separate handling.
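
The visibility timeout and dead-letter behaviour described above can be walked through with a small simulation. This is a simplified in-memory model, not the AWS API: `ToyQueue`, `max_receives`, and the explicit `now` argument (a pretend clock in seconds) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Message:
    body: str
    receive_count: int = 0
    visible_at: float = 0.0  # time at which the message can be received again


class ToyQueue:
    """In-memory model of a visibility timeout and a dead-letter queue."""

    def __init__(self, visibility_timeout: float, max_receives: int) -> None:
        self.visibility_timeout = visibility_timeout
        self.max_receives = max_receives
        self.messages: list[Message] = []
        self.dead_letters: list[Message] = []

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float) -> Optional[Message]:
        for msg in list(self.messages):
            if msg.visible_at > now:
                continue  # still hidden by an earlier receive
            if msg.receive_count >= self.max_receives:
                # Received too many times without being deleted: move it aside.
                self.messages.remove(msg)
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.visible_at = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        """Called by the reader once the job has been processed."""
        self.messages.remove(msg)


q = ToyQueue(visibility_timeout=30, max_receives=2)
q.send("resize image 7")
first = q.receive(now=0)    # picked up, hidden until t=30
hidden = q.receive(now=10)  # nothing visible yet
second = q.receive(now=30)  # never deleted, so it reappears
third = q.receive(now=60)   # receive limit reached, moved to the dead-letter queue

q.send("resize image 8")
ok = q.receive(now=100)
q.delete(ok)                # processed in time, so the reader deletes it
print(hidden, third, [m.body for m in q.dead_letters], q.messages)
```

Note that nothing is removed from the queue automatically on success; the reader has to delete the message itself before the timeout expires.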

Types of SQS Queues

▪ Queues can be either standard or first-in-first-out (FIFO).
▪ Standard queues guarantee at-least-once delivery.
▪ Because standard queues are designed to be massively scalable using a highly distributed architecture, receiving messages in the exact order they are sent is not guaranteed.
▪ FIFO queues preserve the exact order in which messages are sent and received.
▪ FIFO queues provide exactly-once processing, which means that each message is delivered once and remains available until a consumer processes it and deletes it.
▪ FIFO queues support up to 3,000 messages per second with batching, or up to 300 messages per second without batching.
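
Because a standard queue guarantees at-least-once delivery, a consumer may occasionally see the same message twice, and not necessarily in send order. One common way to cope is to make the consumer idempotent. The sketch below is only an illustration: `handle`, `processed_ids`, and the simulated deliveries are hypothetical, and in a real system the set of seen IDs would likely live in a durable store rather than in memory.

```python
processed_ids: set[str] = set()
results: list[str] = []


def handle(message_id: str, body: str) -> bool:
    """Process a message once; return False if it was a duplicate delivery."""
    if message_id in processed_ids:
        return False
    results.append(body.upper())  # placeholder for the real work
    processed_ids.add(message_id)
    return True


# Simulated deliveries from a standard queue: "m1" arrives twice and out of order.
deliveries = [("m2", "second"), ("m1", "first"), ("m1", "first")]
outcomes = [handle(mid, body) for mid, body in deliveries]
print(outcomes, results)
```

With a FIFO queue this extra bookkeeping matters less, since ordering and exactly-once processing are handled by the queue itself.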

Polling in SQS

▪ Billing in SQS is based on requests.
▪ There are 2 types of polling:
▪ Short polling (immediate)
▪ Long polling (WaitTimeSeconds), which waits up to 20 seconds for messages to arrive. Recommended.
▪ Messages can be encrypted at rest using KMS.
▪ IAM policies are used to grant access to messages.
▪ Amazon MQ, Amazon SQS, and Amazon SNS are messaging services that are suitable for anyone from startups to enterprises.
▪ If you're using messaging with existing applications and want to move your messaging to the cloud quickly and easily, we recommend you consider Amazon MQ. It supports industry-standard APIs and protocols, so you can switch from any standards-based message broker to Amazon MQ without rewriting the messaging code in your applications.
▪ If you are building brand new applications in the cloud, we recommend you consider Amazon SQS and Amazon SNS. Amazon SQS and SNS are lightweight, fully managed message queue and topic services that scale almost infinitely and provide simple, easy-to-use APIs. You can use Amazon SQS and SNS to decouple and scale microservices, distributed systems, and serverless applications, and improve reliability.
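
Returning to the polling bullets in this section: since billing is per request, the choice between short and long polling shows up directly in the request count. The back-of-the-envelope sketch below assumes an empty queue, a single consumer, and a short-polling loop that retries once per second; the function name and the one-hour window are illustrative choices, not anything prescribed by SQS.

```python
import math


def receive_requests(idle_seconds: int, seconds_per_request: int) -> int:
    """Number of receive calls one consumer makes while the queue stays empty."""
    return math.ceil(idle_seconds / seconds_per_request)


IDLE = 3600  # one idle hour

# Assumption: a short-polling loop that retries once per second.
short_polling = receive_requests(IDLE, seconds_per_request=1)
# Long polling with the maximum 20-second wait per call.
long_polling = receive_requests(IDLE, seconds_per_request=20)

print(short_polling, long_polling)  # 3600 180
```

Under these assumptions, long polling makes 20 times fewer empty requests for the same idle hour, which is one reason it is the recommended option.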

API Gateway
