Vaibhav Namburi

Posted on May 28, 2020 • Edited on Jan 5, 2021 • Originally published at buildingstartups.co

7 Vital AWS Concepts easily explained

#aws #beginners #javascript #backend

Let's face it, AWS can make you pull your hair out if you don't understand what's happening.

Scratch that, that's programming in general.

What I'm about to share with you is basically what I wish I knew 4 years ago when I was working at a company as the only dev and they told me these exact words:

"Hey V, We've decided to move to AWS, and the old dev quit, can you help"

Seems like a straightforward sentence, but what followed was a lot of stress. Stress because as someone who always did front end and some backend work, I wasn't fully aware of deployment infrastructures or DevOps systems.

So this quick and (what I think) simple guide is to give you an overview of AWS (conceptually) that I wish I had when I started - this is not a setup tutorial (that will come later).

40 Apps deployed, millions of requests maintained and an AI startup later, here we go:

What is an EC2? How does it work?

This is one of the building blocks of AWS. You will definitely interact with an EC2 instance at some point in your AWS journey provided you're not going completely serverless (more on this later).

EC2 stands for Elastic Compute Cloud, and it's an AWS service that provides you with a server (like a box, a MacBook without a screen) to run your application. You get to decide all sorts of configurations: memory, box size and power. But in short, it is a server with a public IP address (if you want it to be public) as well as a public DNS hostname.

Once you build an EC2 instance, you can access it by SSHing into the box, i.e. the equivalent of logging in to the server with a username and password, except that you use a key pair. Once inside, you can do anything you'd normally do on a server:

Run node jobs
Do a hello world application
Launch a server
Route your server localhost:3000 to the outside world using NGINX

PS if you're wondering how the configuration is set up, AWS has this concept called Amazon Machine Images, which are basically "blueprints" for server configurations

You might wonder who decides what data goes in and out of the server. This depends on the security group your EC2 belongs to, as well as the network ACL of its VPC (this will be in a follow-up blog).
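To give a feel for what a security group does, here's a minimal sketch of inbound rule checking. It's a toy model, not AWS code: the `InboundRule` type, the function names and the IP ranges are all made up for illustration. The one real idea it shows is that a security group is an allow-list, so traffic is dropped unless some rule lets it in.

```typescript
// Hypothetical, simplified model of a security group's inbound rules.
type InboundRule = { port: number; sourceCidr: string };

// Convert a dotted IPv4 address such as "10.0.0.5" to a number.
function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True when `ip` falls inside a CIDR block such as "203.0.113.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const parts = cidr.split("/");
  const base = parts[0] ?? "0.0.0.0";
  const prefixBits = Number(parts[1] ?? "32");
  const blockSize = 2 ** (32 - prefixBits);
  const start = Math.floor(ipToInt(base) / blockSize) * blockSize;
  const addr = ipToInt(ip);
  return addr >= start && addr < start + blockSize;
}

// Allow-list behaviour: a request gets through only if at least one rule matches.
function isAllowed(rules: InboundRule[], sourceIp: string, port: number): boolean {
  return rules.some((rule) => rule.port === port && inCidr(sourceIp, rule.sourceCidr));
}

const webServerRules: InboundRule[] = [
  { port: 443, sourceCidr: "0.0.0.0/0" },     // HTTPS from anywhere
  { port: 22, sourceCidr: "203.0.113.0/24" }, // SSH only from a (made-up) office range
];

console.log(isAllowed(webServerRules, "198.51.100.7", 443)); // true
console.log(isAllowed(webServerRules, "198.51.100.7", 22));  // false
```

So a random visitor can reach the website over HTTPS, but only machines in the office range can even attempt to SSH into the box.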

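PPS: With EC2 you can also run a "spot server" (AWS calls it a Spot Instance). Let's say you want to do a job once a week but don't want to pay for a server the whole time: a spot server runs on AWS's spare capacity at a discount, charges you for the time it's operating, performs the task and then turns off. The catch is that AWS can take it back at short notice, so it suits jobs that can be interrupted and retried. Saving you $$$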

AWS S3

S3 is fantastic if you treat it right. Amazon S3 stands for Amazon Simple Storage Service (hope you're picking up their vibe with numbers in the abbreviations)

S3 is a programmatic Dropbox. You can upload photos, videos, JSON, gzips, entire frontend web projects and have them served via a public URL. It is also used for holding versions of your server when you're trying to auto-deploy it from GitHub or Bitbucket (more on this later) - basically, it can host a heap of different s**t
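If it helps, here's a toy in-memory model of that idea. This is not the AWS SDK: `ToyBucket`, its methods, the bucket name and the region are all hypothetical. What it does show is the mental model, which is a flat map from a key to some contents, plus a predictable URL per object.

```typescript
// Toy stand-in for a bucket: a flat map from object key to contents.
class ToyBucket {
  private objects = new Map<string, string>();
  private name: string;
  private region: string;

  constructor(name: string, region: string) {
    this.name = name;
    this.region = region;
  }

  // Store (or overwrite) an object under a key.
  put(key: string, body: string): void {
    this.objects.set(key, body);
  }

  // Fetch an object, or undefined if nothing lives under that key.
  get(key: string): string | undefined {
    return this.objects.get(key);
  }

  // Virtual-hosted-style URL for an object in this bucket.
  urlFor(key: string): string {
    return `https://${this.name}.s3.${this.region}.amazonaws.com/${key}`;
  }
}

// Made-up bucket name and region, purely for illustration.
const assets = new ToyBucket("my-app-assets", "us-east-1");
assets.put("avatars/user-42.png", "<binary image data>");
assets.put("index.html", "<h1>Hello from my frontend</h1>");

console.log(assets.urlFor("avatars/user-42.png"));
```

Notice that "avatars/" isn't a real folder, it's just part of the key. Whether anyone can actually open that URL depends on the bucket's permissions.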

The most common uses I've had for S3 have been 2 fold. One to host assets uploaded by users (if your customers upload a profile photo etc for example) and the second to serve my actual frontend website.

See, S3 has this magical feature where it lets you upload (for example) the dist folder of your Vue/React/Angular project into an S3 bucket and serve it to your customers. You can do this quite literally by pointing a CNAME you set up on GoDaddy (or any DNS provider) at your S3 URL (which they create for you automatically).

In order to "authenticate" or "secure" (put https on) your S3 bucket website URL, you'll need to associate it with something called CloudFront (I know, F me, so many things), which is Amazon's CDN. This service allows you to connect your actual custom domain "banana.com" to the S3 bucket by providing the S3 bucket as an "origin".

I won't go into the benefits of a CDN (content delivery network) here, but if your S3 bucket is public-facing, I don't see why you wouldn't put it behind one to speed up asset delivery.

Message Queue Services via SQS

Amazon has its own service (of course) for message queues. If you're not completely aware of what a message queue is, here's my way of understanding it.

If you've ever stood in line at a McDonalds, you see this little holding area where there are bags of food sitting around waiting to be distributed by a staff member.

That is the queue, and the message (i.e. the food) can only be processed once (i.e. once the message to make the food is handled, or once the food is given to the customer, that's it).

Message queues are a form of async communication. Their main role is to batch large loads of work, smooth out spiky workloads, and decouple heavyweight tasks (like large cron job processing).

(Image credits AWS)

Queue services are used extensively in modern architecture to speed up and simplify the process of building apps. Modern-day builds include several micro-services that are isolated from each other, and SQS allows data to be transferred from a producer (the one sending a message) to a consumer (the receiver) in a fast and effective way. Since it's async, the producer doesn't sit blocked waiting on the consumer, so one slow task doesn't stall the entire service.

Going back to the McDonald's example, imagine how crap the service would be if only one order could be handled at a time, and the next order couldn't begin until the previous one was delivered.

The process effectively works by sending and receiving messages. The producer sends a message by adding a task to the queue (putting an order on the delivery table at a McD's). The message sits on that table until a receiver takes it and does something with it (gives it to the customer).
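Here's the delivery table as a rough in-memory sketch. To be clear, `ToyQueue` and its methods are made up for this post and are not the SQS API. The real service tracks in-flight messages with receipt handles and a visibility timeout, which this toy replaces with a simple `release` call.

```typescript
type Message = { id: number; body: string };

class ToyQueue {
  private nextId = 1;
  private waiting: Message[] = [];
  private inFlight = new Map<number, Message>();

  // Producer side: put an order on the table.
  send(body: string): number {
    const id = this.nextId++;
    this.waiting.push({ id, body });
    return id;
  }

  // Consumer side: take the oldest waiting message. It stays "in flight"
  // (hidden from other consumers) until it is deleted or released.
  receive(): Message | undefined {
    const message = this.waiting.shift();
    if (message) this.inFlight.set(message.id, message);
    return message;
  }

  // Consumer confirms the work is done, so the message is gone for good.
  delete(id: number): boolean {
    return this.inFlight.delete(id);
  }

  // Simulates a consumer failing: the message goes back on the table for a retry.
  release(id: number): void {
    const message = this.inFlight.get(id);
    if (message) {
      this.inFlight.delete(id);
      this.waiting.push(message);
    }
  }
}

const orders = new ToyQueue();
orders.send("burger");
orders.send("fries");

const first = orders.receive();
if (first) {
  console.log(`handing over: ${first.body}`);
  orders.delete(first.id); // done, nobody else will ever get this order
}
```

The producer never waits for the consumer. It drops the order on the table and moves on, which is the whole point.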

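You might ask, okay, how does this work when there is one producer and many receivers? This is called a Pub/Sub system (Publish/Subscribe), and on AWS that job usually falls to SNS (Simple Notification Service), often with an SQS queue per subscriber.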

An example would be, if a sale is made on a Shopify store, there would be multiple services hooked into that "topic of a sale" to perform multiple, different/isolated tasks. For eg. Send a Slack notification to the shop owner, print out an order label, trigger an email sequence.
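A minimal sketch of that fan-out, using a toy in-memory topic. `ToyTopic` is a made-up class for illustration, not a real AWS client, and the order number is invented.

```typescript
type Subscriber = (event: string) => void;

class ToyTopic {
  private subscribers: Subscriber[] = [];

  // Register a service that cares about this topic.
  subscribe(subscriber: Subscriber): void {
    this.subscribers.push(subscriber);
  }

  // Every subscriber gets its own copy of the event.
  publish(event: string): void {
    this.subscribers.forEach((subscriber) => subscriber(event));
  }
}

const saleMade = new ToyTopic();
const actions: string[] = [];

saleMade.subscribe((sale) => { actions.push(`Slack notification for ${sale}`); });
saleMade.subscribe((sale) => { actions.push(`Order label for ${sale}`); });
saleMade.subscribe((sale) => { actions.push(`Email sequence for ${sale}`); });

saleMade.publish("order #1001");
console.log(actions);
```

The difference from a plain queue is that a queue message is handled by one consumer, whereas a published event reaches every subscriber. The publisher doesn't need to know who is listening.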

Load Balancers

The name says it all: a Load Balancer's job is to sit in front of a network of (for this example) EC2 boxes, spread incoming traffic across them, and keep checking whether each server is healthy.

If a server is overloaded or failing its health checks, the load balancer diverts traffic to the next available server.
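One common way to pick the "next available server" is round robin: go through the list in order and skip anything that's unhealthy. The sketch below assumes that strategy. `ToyLoadBalancer` and the server names are hypothetical, and a real load balancer runs its own health checks rather than reading a flag.

```typescript
type Server = { name: string; healthy: boolean };

class ToyLoadBalancer {
  private cursor = 0;
  private servers: Server[];

  constructor(servers: Server[]) {
    this.servers = servers;
  }

  // Round robin: walk the list in order, skipping servers marked unhealthy.
  pick(): Server | undefined {
    for (let i = 0; i < this.servers.length; i++) {
      const index = (this.cursor + i) % this.servers.length;
      const candidate = this.servers[index];
      if (candidate && candidate.healthy) {
        this.cursor = (index + 1) % this.servers.length; // start after this one next time
        return candidate;
      }
    }
    return undefined; // every server is down
  }
}

const lb = new ToyLoadBalancer([
  { name: "ec2-a", healthy: true },
  { name: "ec2-b", healthy: false }, // pretend this box is struggling
  { name: "ec2-c", healthy: true },
]);

const picks = [lb.pick(), lb.pick(), lb.pick()].map((s) => (s ? s.name : "none"));
console.log(picks.join(", ")); // ec2-a, ec2-c, ec2-a
```

The struggling box simply never gets picked until it is healthy again.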

You might wonder: wait, what if I have an open session with a server behind the load balancer? How is that session magically maintained or transferred across to a whole new server running in parallel? The answer is that it isn't transferred. In situations like this, the AWS Application Load Balancer is smart enough to keep sending the same client to the same server (you just need to tick the "make it sticky" checkbox when setting up the load balancer).

Another use case of load balancers is that they provide you with an SSL-certified (HTTPS) endpoint, so you don't need to add your own on the servers, at least during testing. You can expose this endpoint via a CNAME or a masked route (https://server.myapp.com). At this point, you need to make sure your EC2 instances are only accessible internally (i.e. remove any external IP access). This will make sure that any security threat is isolated to minimal points of entry.
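Conceptually, "sticky" just means the load balancer remembers which server a client was sent to the first time. Here's a hedged sketch of that idea. `StickyRouter` and the session IDs are invented for illustration, and the real ALB remembers the assignment with a cookie rather than an in-memory map.

```typescript
class StickyRouter {
  private assignments = new Map<string, string>();
  private next = 0;
  private servers: string[];

  constructor(servers: string[]) {
    this.servers = servers;
  }

  // First request from a session picks a server; later requests reuse it.
  route(sessionId: string): string {
    const existing = this.assignments.get(sessionId);
    if (existing !== undefined) return existing;

    const chosen = this.servers[this.next % this.servers.length];
    if (chosen === undefined) throw new Error("no servers configured");
    this.next++;
    this.assignments.set(sessionId, chosen);
    return chosen;
  }
}

const router = new StickyRouter(["ec2-a", "ec2-b"]);
console.log(router.route("alice")); // ec2-a
console.log(router.route("bob"));   // ec2-b
console.log(router.route("alice")); // ec2-a again, same box as before
```

Nothing is copied between servers here. The session survives because the client keeps landing on the box that already holds it.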

If you've liked reading so far, feel free to follow me for heaps more epic content

API Gateways

I learnt of API gateways during my quest to set up an SSL for an EC2 server. The first attempt was painful, I tried doing it within the EC2 instance, I was breaking my head (in hindsight, I overcomplicated things) but as a happy surprise, I came to learn of API gateways.

Think of an API gateway as a proxy, i.e. it's the middleman that receives your requests, does something to them if you want, and then sends them on to a backend the caller has no clue about.

There are many use cases for API Gateways, but the 2 I'm mentioning in particular are acting as a secure proxy for an EC2 instance and wrapping a request with auth tokens.

Have you ever needed to make a request from the front end to a 3rd-party service, where the only way to access that service is by adding an auth token to the request header, but that auth token is sensitive? You might think you need to go ahead and build an entire server to receive these requests, amend them and then send them to the 3rd-party API. That's a very painful way. An easier way is using an API gateway, which gives you the capability to mutate the request (in a limited way) before you send it off to the 3rd-party API.
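In code terms, the gateway is doing something like the function below. This is only a sketch of the idea, not how you configure API Gateway (that happens in the AWS console or config, not in your app code). The `HttpRequest` type, the token value and the `api.example.com` URL are all placeholders.

```typescript
type HttpRequest = { path: string; headers: Record<string, string> };

// Placeholder secret. In a real setup this lives in the gateway's configuration,
// never in frontend code.
const THIRD_PARTY_TOKEN = "secret-token-kept-server-side";

// Placeholder address of the 3rd-party API.
const THIRD_PARTY_BASE_URL = "https://api.example.com";

// Take the browser's request, attach the sensitive header, and build
// the request that actually goes out to the 3rd party.
function addAuthAndForward(incoming: HttpRequest): { url: string; headers: Record<string, string> } {
  return {
    url: THIRD_PARTY_BASE_URL + incoming.path,
    headers: { ...incoming.headers, Authorization: `Bearer ${THIRD_PARTY_TOKEN}` },
  };
}

const fromBrowser: HttpRequest = {
  path: "/v1/reports",
  headers: { Accept: "application/json" },
};

const outgoing = addAuthAndForward(fromBrowser);
console.log(outgoing.url);
console.log(Object.keys(outgoing.headers));
```

The browser only ever talks to the gateway, so the token never shows up in frontend code or in the network tab.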

Lambda Functions

AWS Lambda functions let you run "functions" in the cloud without needing to maintain a server. The function executes your code only when you need it to (certain time of day, or when it receives a request from somewhere) and it can scale really fast!

The common uses I've seen are mainly responding to changes in your DB and reacting to HTTP requests received from AWS API Gateway.

So you can treat lambda functions as part of a "serverless" architecture.

Supply the code to a lambda function, tell it what event it needs to react to and let it run free.
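Here's roughly what "supplying the code" looks like for a function that sits behind API Gateway. Treat it as a sketch: the two types are trimmed-down, hand-written versions of the event and response shapes (the real event has many more fields), and the greeting logic is just an example.

```typescript
// Trimmed-down shapes for an HTTP request arriving via API Gateway.
type ApiGatewayEvent = { queryStringParameters?: Record<string, string> | null };
type ApiGatewayResult = { statusCode: number; headers: Record<string, string>; body: string };

// Plain function holding the actual logic, so it's easy to test locally.
function buildGreeting(name: string | undefined): ApiGatewayResult {
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: `Hello, ${name ?? "world"}!` }),
  };
}

// The function Lambda would invoke for each incoming event.
// In a real deployment this would be exported as the function's entry point.
const handler = async (event: ApiGatewayEvent): Promise<ApiGatewayResult> => {
  const params: Record<string, string> = event.queryStringParameters ?? {};
  return buildGreeting(params["name"]);
};

// Simulate one invocation locally.
handler({ queryStringParameters: { name: "V" } }).then((response) => {
  console.log(response.statusCode, response.body);
});
```

There's no server setup anywhere in that file. You hand over the handler, say which event triggers it, and AWS runs it on demand.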

Amazon VPC

A Virtual Private Cloud is a private cloud within AWS' public cloud. Think of it as your own little office space inside a WeWork (LOL), where the building itself is publicly accessible to everyone.

Within that room, you've got your own systems, your own processes and your own communication layer set up. However, it can only be accessed via a restricted endpoint, i.e. the front door.

That's all for now, many more of these to come both in the format of a book and soon a course

Top comments (16)

Pacharapol Withayasakpunt • May 28 '20 • Edited

My two concerns about AWS

Lock-in
Cost monitoring / budget setting

Vaibhav Namburi • May 28 '20

Cost Monitoring is a really painful problem and trust me there's MANY people who have this issue!

There are some companies trying to fix it, but AWS is so cryptic in its spend sometimes it gets really hard to showcase

Lock-in, also fair, but that's the case with most cloud tools though and why they're valued at such high numbers, the LTV is off the chart

Rolf Streefkerk • May 29 '20

Lock-in,

this really depends, you can limit your lock in by using Docker based deployments. But even if you don't, there's always a form of lock in whatever solution you choose to use and there's always a cost to re-engineer the solution regardless of platform. So I believe the lock-in issue isn't that much of a deal, and especially not these days with virtualization and cloud agnostic deployment options available in the worst case.

Cost monitoring,
The tagging system in AWS is extremely important to use consistently such that you can more easily group your deployments and get pretty accurate day to day spend of those deployments. In cost explorer it's then a piece of cake to see which parts of your application incur what costs.
Then budget alerts are there to cover hard number value alerts.
If you're running a multi account setup, use organizations with consolidated billing. All your billing for all sub-accounts goes to one account. Again, the tagging system you've setup will help you sort out exactly what spend comes from where.
Really the tools are there.

Gayan Hewa • May 28 '20

Cost monitoring has improved a lot in the past couple of years. From my experience in the past couple of years, AWS spends a significant amount of engineering effort to reduce churn due to cost, especially for companies that are not fully locked in (😉). The alerting can be set to a smaller window such as 1 hr, allowing you to monitor cost in really small increments. Also, you should make the most of the Solutions Architects AWS provide. They bring in a lot of tips on how well you can cut down cost.

If you are using something like Terraform for IaC, there are some plugins that estimate the resource cost before provisioning.

In terms of vendor lock-in, it's more or less a design problem and a trade-off we make during development/product planning. There's nothing to be done on the AWS side. It's a conscious decision we make when opting in.

Maximilian Burszley • May 28 '20

the Solutions Architects AWS provide

Anywhere else that's just called "Sales Engineer"

Vaibhav Namburi • May 29 '20

Great response Gayan!

Thanks for sharing this! Really appreciate it

Dave • May 28 '20

As I'm predominantly a back end developer, with a life-long experience in Linux, I think you've grossly over-simplified the terms, but kudos on the attempt at least.

Noting what you list as your job title, please, for the love of everything holy tidy up the reference to "tunnelling" - when you say that, those of us with experience in Linux read "SSH tunnel", and that's categorically not what happens when you connect to an EC2. It's much closer to simply "SSHing" to an EC2 than it is "tunnelling."

Vaibhav Namburi • May 29 '20 • Edited

Hey Dave

Hahaha, I have, the target userbase for this article is definitely not people like you and me who have an nth layer understanding of how things operate, but rather people who're just getting into the space

"Noting what you list as your job title, please, for the love of everything holy tidy up the reference to "tunnelling"" - hahaha this made me laugh - love it!

It is simply SSHing, but to make it sound cooler I called it tunnelling 😂

Thanks for the feedback though, I'll update the point