Supratip Banerjee for AWS Community Builders

Amazon S3 best practices

The picture below gives an overview of Amazon S3 capabilities.

[Image: overview of Amazon S3 capabilities]

Amazon S3 evolution

a. Since 2006, the price of S3 has been reduced by 80%
b. Over time, S3 has also added several storage classes

S3 New Releases

a. In 2020, S3 launched the new Archive Access and Deep Archive Access tiers for the S3 Intelligent-Tiering storage class, because customers wanted the flexibility to move data automatically to the lowest-cost storage offering

In order to further reduce their storage costs, many customers prefer to archive rarely accessed objects directly to S3 Glacier or S3 Glacier Deep Archive. However, this requires you to build complex systems that understand the access patterns of objects for a long period of time and archive them when the objects are not accessed for months at a time.

With this release, AWS announced two new archive access tiers designed for asynchronous access and optimized for rare access at a very low cost: the Archive Access tier and the Deep Archive Access tier. You can opt in to one or both archive access tiers, and you can configure them at the bucket, prefix, or object tag level.

Now with S3 Intelligent-Tiering, you can get high throughput and low latency access to your data when you need it right away, and automatically pay about $1 per TB per month when objects haven’t been accessed for 180 days or more. Customers of S3 Intelligent-Tiering have already realized cost savings of up to 40%, and with the new archive access tiers they can reduce storage costs by up to 95% for rarely accessed objects.

How does it work:

Once you have activated one or both of the archive access tiers, S3 Intelligent-Tiering will automatically move objects that haven’t been accessed for 90 days to the Archive Access tier, and after 180 days without being accessed to the Deep Archive Access tier. At any time that an object that is in one of the archive access tiers is restored, the object will move to the Frequent Access tier within a few hours and then it will be ready to be retrieved.

Objects in the Archive Access tier are retrieved in 3-5 hours, and objects in the Deep Archive Access tier are retrieved within 12 hours. If you need faster access to an object in the Archive Access tier, you can pay for expedited retrieval by selecting it in the console.
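To make the opt-in step concrete, here is a minimal sketch of the configuration body in Python. It assumes the boto3 SDK is used to send it; the helper names, the configuration ID and the `logs/` prefix are made up for illustration. The 90 and 180 day values mirror the thresholds described above.

```python
def build_intelligent_tiering_config(config_id, prefix=None,
                                     archive_days=90, deep_archive_days=180):
    """Build a configuration that opts in to both archive access tiers."""
    config = {
        "Id": config_id,
        "Status": "Enabled",
        "Tierings": [
            # Days without access before an object moves to each tier
            {"Days": archive_days, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": deep_archive_days, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    }
    if prefix is not None:
        # Optional: limit the configuration to one prefix (hypothetical value)
        config["Filter"] = {"Prefix": prefix}
    return config


def apply_config(s3_client, bucket, config):
    """Send the configuration; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_bucket_intelligent_tiering_configuration(
        Bucket=bucket,
        Id=config["Id"],
        IntelligentTieringConfiguration=config,
    )
```

The configuration only affects objects that are already stored in the S3 Intelligent-Tiering storage class.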

b. S3 also launched Amazon S3 on Outposts to meet customers' requirement of keeping their data close to on-premises applications

  1. On that note, here is a quick glimpse of the existing storage classes

o S3 storage classes
o S3 Standard – 99.99% availability and 99.999999999% (11 nines) durability
o S3 Standard-Infrequent Access (S3 Standard-IA) – data that is accessed less frequently but needs rapid access when it is requested. Lower storage fee than S3 Standard, but a retrieval fee is charged
o S3 One Zone-IA – lower cost for infrequently accessed data, stored in one Availability Zone only (the older Reduced Redundancy Storage, RRS, is a separate legacy class)
o S3 Intelligent-Tiering – monitors how often objects are accessed and automatically moves them to the most cost-effective access tier, without performance impact or operational overhead
o S3 Glacier – archival storage that is cheaper than on-premises data archival
o S3 Glacier Deep Archive – lowest-cost storage class, for data where a retrieval time of 12 hours is acceptable
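As a rough illustration of how these classes show up in practice, the sketch below maps the names in the list to the `StorageClass` values accepted by the S3 API and builds the arguments for an upload. The helper name and the bucket, key and body values are hypothetical; the boto3 SDK is assumed for the actual call.

```python
# Storage class names from the list above mapped to S3 API values
STORAGE_CLASS_API_VALUES = {
    "S3 Standard": "STANDARD",
    "S3 Standard-IA": "STANDARD_IA",
    "S3 One Zone-IA": "ONEZONE_IA",
    "S3 Intelligent-Tiering": "INTELLIGENT_TIERING",
    "S3 Glacier": "GLACIER",
    "S3 Glacier Deep Archive": "DEEP_ARCHIVE",
}


def put_object_args(bucket, key, body, storage_class_name):
    """Build keyword arguments for an upload into a given storage class."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": STORAGE_CLASS_API_VALUES[storage_class_name],
    }


# Hypothetical upload straight into Standard-IA
args = put_object_args("example-bucket", "reports/2020.csv", b"a,b\n1,2\n",
                       "S3 Standard-IA")
# With boto3 installed: boto3.client("s3").put_object(**args)
```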

Amazon S3 Analytics and Insights

a. S3 Inventory for analysing individual objects within your bucket
b. S3 Storage Class Analysis, which analyses access patterns in a bucket to recommend an optimal storage class for you
c. S3 Storage Lens (recently launched) provides centralised, organisation-wide visibility into your S3 storage usage and activity, along with recommendations on how you can optimize your storage
i. 29 metrics, updated daily
ii. Up to 15 months of historical data to analyse
iii. Gives recommendations on how to optimize storage costs

S3 storage use cases

In the example below, we can see that 43% of the storage is in Standard-Infrequent Access, so we can make a judgement on moving it to an archival class to save cost.

[Image: example of storage distribution by storage class]
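The percentage in such a view is simple arithmetic over the bytes stored per class. Here is a small sketch; the byte counts are invented sample numbers chosen to reproduce the 43% figure, not real measurements.

```python
def storage_share_by_class(bytes_by_class):
    """Return each storage class's share of total bytes, as a percentage."""
    total = sum(bytes_by_class.values())
    if total == 0:
        return {name: 0.0 for name in bytes_by_class}
    return {
        name: round(100 * size / total, 1)
        for name, size in bytes_by_class.items()
    }


# Hypothetical byte counts per storage class
sample = {"STANDARD": 400, "STANDARD_IA": 430, "GLACIER": 170}
shares = storage_share_by_class(sample)
print(shares["STANDARD_IA"])  # 43.0
```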

Amazon S3 outlier

[Image: Amazon S3 outlier]

Amazon S3 lifecycle

a. Based on rules, S3 Lifecycle can move objects across storage classes
b. These rules
i. Are based on the creation date of the object
ii. Can be filtered to apply to

  1. the whole bucket
  2. a prefix
  3. tagged objects
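The rule structure above can be sketched as a small helper that builds one lifecycle rule with each of the three filter types. This is an illustration under assumptions: the helper names, rule ID, prefix and day counts are made up, and the boto3 SDK is assumed for sending the rule.

```python
def lifecycle_rule(rule_id, transitions, prefix=None, tag=None):
    """Build one lifecycle rule.

    transitions: list of (days_since_creation, storage_class) pairs
    prefix:      apply the rule to keys under this prefix
    tag:         (key, value) pair; apply the rule to objects with this tag
    With neither prefix nor tag, the rule applies to the whole bucket.
    """
    if prefix is not None and tag is not None:
        raise ValueError("this sketch supports a prefix or a tag, not both")
    if tag is not None:
        rule_filter = {"Tag": {"Key": tag[0], "Value": tag[1]}}
    else:
        # An empty prefix matches every object in the bucket
        rule_filter = {"Prefix": prefix or ""}
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": rule_filter,
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ],
    }


def apply_rules(s3_client, bucket, rules):
    """Send the rules; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


# Hypothetical rule: logs go to Standard-IA after 30 days, Glacier after 90
rule = lifecycle_rule("archive-logs",
                      [(30, "STANDARD_IA"), (90, "GLACIER")],
                      prefix="logs/")
```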

S3 Intelligent-Tiering

[Image: S3 Intelligent-Tiering]

Data protection with S3

  1. By default, all buckets are private
  2. We can change access through
     a. Bucket policies – applied at the bucket level
     b. Access control lists (ACLs) – for access to individual objects

Encryption in transit

  1. If the request uses HTTPS, the traffic is encrypted; this is what encryption in transit means
  2. The traffic between my computer and the server is encrypted, so no one in between can read it and understand what I am looking at
  3. Encryption in transit is achieved with SSL/TLS
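One common way to enforce encryption in transit is a bucket policy that denies any request not made over HTTPS, using the `aws:SecureTransport` condition key. The sketch below only builds the policy document; the function name and the bucket name are hypothetical.

```python
import json


def deny_insecure_transport_policy(bucket_name):
    """Build a bucket policy that rejects requests made without HTTPS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy_json = json.dumps(deny_insecure_transport_policy("example-bucket"))
# With boto3 installed:
# boto3.client("s3").put_bucket_policy(Bucket="example-bucket", Policy=policy_json)
```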

Encryption at Rest (server side)

  1. If a Word document is stored on a server without encryption, anyone who has access to the drive can read it
  2. Encryption at rest can be done on the
     a. Server side
     b. Client side
  3. With server-side encryption, AWS encrypts the object for us
  4. With client-side encryption, I encrypt the object myself and then upload it
  5. Server-side encryption options in AWS
     a. SSE-S3 (server-side encryption with S3-managed keys) – AWS manages this; we don’t have to do anything. S3 simply encrypts the object with a key it manages
     b. SSE-KMS (AWS Key Management Service) – the key lives in AWS KMS, so AWS and we manage the encryption together
     c. SSE-C (customer-provided keys) – here we provide the key
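The three server-side options differ mainly in the extra arguments sent with an upload. Here is a sketch of those arguments as they would be passed to boto3's `put_object`; the helper name is hypothetical and the KMS key ID shown in the usage line is a placeholder.

```python
import os


def encryption_args(mode, kms_key_id=None, customer_key=None):
    """Return the extra put_object arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # S3 manages the key; nothing else is needed
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


# Hypothetical usage: a random 256-bit key for SSE-C
sse_c_args = encryption_args("SSE-C", customer_key=os.urandom(32))
# With boto3 installed:
# boto3.client("s3").put_object(Bucket="example-bucket", Key="doc.docx",
#                               Body=b"...", **encryption_args("SSE-S3"))
```

With SSE-C the same key has to be supplied again when the object is read, since S3 does not store it.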

Versioning

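a. Stores all versions of an object
b. Even deleted versions are retained
c. Once enabled, versioning can’t be disabled, only suspended; the only way to remove it is to delete the bucket
d. MFA Delete can be enabled so that deleting a version requires multi-factor authentication
e. With versioning enabled, the storage used by a file is the sum of the sizes of all its versions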

f. When we upload a new version, it is always private by default, but the visibility setting of the older versions doesn’t change
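To illustrate the point about size, the sketch below sums the sizes of all versions per key. The input mimics the `Versions` entries returned by boto3's `list_object_versions`; the sample entries and helper names are invented for the example.

```python
from collections import defaultdict


def stored_bytes_per_key(versions):
    """Sum the size of every version of each key."""
    totals = defaultdict(int)
    for version in versions:
        totals[version["Key"]] += version["Size"]
    return dict(totals)


def enable_versioning(s3_client, bucket):
    """Turn versioning on; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


# Hypothetical listing: three versions of the same file
sample_versions = [
    {"Key": "report.docx", "VersionId": "v1", "Size": 100, "IsLatest": False},
    {"Key": "report.docx", "VersionId": "v2", "Size": 120, "IsLatest": False},
    {"Key": "report.docx", "VersionId": "v3", "Size": 150, "IsLatest": True},
]
print(stored_bytes_per_key(sample_versions))  # {'report.docx': 370}
```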

Replication

[Images: S3 Replication]

Object lock

[Image: S3 Object Lock]
