1. S3 is a secure and scalable storage service
You can securely store your files (called objects) in S3. A single object can be up to 5 TB in size.
2. Object Attributes:
S3 objects can have:
- Key (name of the object)
- Value (data)
- Version ID.
- Metadata (data about data you are storing)
- Subresources: Access control list, torrents.
3. S3 Naming convention:
There are some rules that you must follow when naming your S3 buckets:
- No uppercase letters and no underscores
- 3-63 characters long
- Not formatted as an IP address, and it must start with a lowercase letter or a number
- S3 is a universal namespace, so each bucket name must be globally unique.
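The rules above can be checked locally before you try to create a bucket. Here is a minimal sketch; the function name is my own, and it only covers the rules listed in this post (the full AWS rules have a few more cases, and global uniqueness can only be checked by S3 itself).

```python
import ipaddress
import re

# Lowercase letters, digits, dots and hyphens only; first character is a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]*$")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Check only the naming rules listed above (hypothetical helper)."""
    if not 3 <= len(name) <= 63:
        return False
    if not _BUCKET_NAME.match(name):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not an IP address, so the name passes
    return False  # formatted like an IP address


for candidate in ["my-bucket-2024", "My_Bucket", "ab", "192.168.5.4"]:
    print(candidate, looks_like_valid_bucket_name(candidate))
```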
4. S3 has the following features:
- Tiered storage
- Lifecycle management
- Versioning
- Encryption
- MFA Delete (multi-factor authentication): can only be configured through the CLI, not the console.
- Secure data using ACLs (Access Control Lists) and bucket policies.
- Signed URLs: URLs that are valid only for a limited time (e.g. a premium video service for logged-in users)
5. S3 storage classes:
- S3 Standard: 99.99% availability, 99.999999999% (11 nines) durability; it is the default storage class.
- S3 Standard-IA (Infrequent Access)
- S3 One Zone-IA
- S3 Intelligent-Tiering
- S3 Glacier (for data archiving, 99.999999999% durability of archives)
- S3 Glacier Deep Archive (data retrieval within 12 hours)
6. S3 Pricing Tiers:
You pay for:
- Storage
- Requests and data retrieval
- Data transfer
S3 Standard is the most expensive storage class, followed by:
- S3 IA
- then S3 Intelligent-Tiering
- then S3 One Zone-IA
- then S3 Glacier
- and finally S3 Glacier Deep Archive.
7. S3 Encryption:
Two types of encryption:
- Encryption in transit: SSL/TLS
- Encryption at rest (server side). There are three types of server-side encryption:
  - S3-managed keys (SSE-S3),
  - AWS Key Management Service keys (SSE-KMS),
  - Customer-provided keys (SSE-C).
- Then there is client-side encryption.
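To make the three server-side options more concrete, here is a small sketch of the request headers that select each mode on upload. The helper name and its parameters are my own. The header names are the standard `x-amz-server-side-encryption*` ones, and SSE-C requests must go over HTTPS.

```python
import base64
import hashlib


def sse_headers(mode: str, customer_key: bytes = b"", kms_key_id: str = "") -> dict:
    """Return the upload headers for one server-side encryption mode (hypothetical helper)."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # optional: a specific KMS key instead of the default one
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if len(customer_key) != 32:
            raise ValueError("SSE-C expects a 256-bit (32-byte) key")
        key_md5 = hashlib.md5(customer_key).digest()  # integrity check of the key itself
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": base64.b64encode(customer_key).decode(),
            "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(key_md5).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-KMS"))
```

With SSE-C, S3 does not store the key, so the same three headers have to be sent again when downloading the object.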
8. S3 Security:
- User based: IAM policies.
- Resource based, which can be managed in three ways:
  - Bucket policies, used to:
    - Grant public access to the bucket
    - Force objects to be encrypted at upload
    - Grant access to another account (cross-account)
  - Object ACL,
  - Bucket ACL.
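As an illustration of the "force encryption at upload" use case, here is a sketch of a bucket policy built as a Python dict and printed as JSON. The bucket name `example-bucket`, the `Sid` and the function name are placeholders of mine. The policy denies any `PutObject` request that does not carry the server-side encryption header.

```python
import json


def deny_unencrypted_uploads_policy(bucket: str) -> dict:
    """Build a bucket policy that rejects uploads without an SSE header (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # "Null": "true" matches requests where the header is absent.
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            }
        ],
    }


print(json.dumps(deny_unencrypted_uploads_policy("example-bucket"), indent=2))
```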
9. S3 CORS:
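- If a web page served from one origin (for example, a website hosted in one bucket) requests data from another S3 bucket, you need to enable CORS on the bucket being requested.
- Cross-Origin Resource Sharing lets you restrict which websites can request your files in S3, which also helps limit your costs.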
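A CORS configuration is a list of rules attached to the bucket. The sketch below builds one rule that only lets a single website issue GET requests; the origin `https://www.example.com` and the function name are placeholders, and the dict follows the `CORSRules` shape that the S3 CORS API expects, as far as I know.

```python
import json


def cors_configuration(allowed_origin: str) -> dict:
    """Build a one-rule CORS configuration for a single allowed website (sketch)."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": [allowed_origin],  # only this site may request files
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,  # how long browsers may cache the preflight response
            }
        ]
    }


print(json.dumps(cors_configuration("https://www.example.com"), indent=2))
```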
10. Consistency Model
- Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs. The older behavior, described below, no longer applies:
- Read-after-write consistency for PUTs of new objects:
  - As soon as an object was written, we could retrieve it (e.g. PUT 200 -> GET 200)
  - This was true, except if we did a GET beforehand to check whether the object existed (e.g. GET 404 -> PUT 200 -> GET 404), in which case it was eventually consistent
- Eventual consistency for DELETEs and PUTs of existing objects:
  - If we read an object right after updating it, we might get the older version (e.g. PUT 200 -> PUT 200 -> GET 200 (might be the older version))
  - If we deleted an object, we might still be able to retrieve it for a short time (e.g. DELETE 200 -> GET 200)
11. S3 Access Logs:
- For audit purposes
- Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
- That data can be analyzed using data analysis tools like Athena.
12. S3 pre-signed URLs:
- You can generate pre-signed URLs using the SDK or the CLI:
  - For downloads (easy, can use the CLI)
  - For uploads (harder, must use the SDK)
- Valid for 3600 seconds by default; you can change the timeout with the --expires-in [TIME_IN_SECONDS] argument
- Users given a pre-signed URL inherit the permissions of the person who generated the URL for GET / PUT. Examples:
  - Allow only logged-in users to download a premium video from your S3 bucket
  - Allow an ever-changing list of users to download files by generating URLs dynamically
  - Temporarily allow a user to upload a file to a precise location in your bucket
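To show what the SDK and CLI do under the hood, here is a rough sketch of Signature Version 4 query-string signing for a GET, using only the standard library. The function names, the bucket, the key and the credentials are placeholders. It assumes long-term credentials (no session token) and a virtual-hosted-style endpoint. In real code you would let the SDK or the CLI generate the URL.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def presign_get(bucket, key, region, access_key, secret_key, expires_in=3600, now=None):
    """Build a SigV4 pre-signed GET URL for one object (illustrative sketch)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{date}/{region}/s3/aws4_request"
    path = "/" + quote(key, safe="/")

    # Query parameters must be URL-encoded and sorted by name before signing.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires_in),
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )

    # Only the host header is signed; the body is not part of a pre-signed GET.
    canonical_request = "\n".join(
        ["GET", path, query, f"host:{host}\n", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join(
        [
            "AWS4-HMAC-SHA256",
            amz_date,
            scope,
            hashlib.sha256(canonical_request.encode()).hexdigest(),
        ]
    )

    # Derive the signing key: secret -> date -> region -> service -> "aws4_request".
    signing_key = _hmac(("AWS4" + secret_key).encode(), date)
    for part in (region, "s3", "aws4_request"):
        signing_key = _hmac(signing_key, part)
    signature = hmac.new(signing_key, string_to_sign.encode(), hashlib.sha256).hexdigest()

    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"


print(presign_get("example-bucket", "videos/premium.mp4", "us-east-1",
                  "AKIDEXAMPLE", "not-a-real-secret"))
```

The signature is computed with the generator's secret key, which is why whoever holds the URL gets that person's permissions on the object until the URL expires.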
13. S3 Performance:
- Baseline performance:
  - S3 scales automatically to high request rates, with a latency of 100-200 ms
  - Your app can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket.
- KMS limitation:
  - If you use SSE-KMS, you may be impacted by the KMS limits
  - When you upload, S3 calls the GenerateDataKey KMS API
  - When you download, S3 calls the Decrypt KMS API
  - These calls count towards the KMS quota per second (5,500, 10,000 or 30,000 req/s depending on the region)
  - You can request a quota increase through the Service Quotas console
- Multipart upload:
  - Recommended for files > 100 MB, required for files > 5 GB
  - Can help parallelize uploads (the file is divided into parts, which speeds up transfers)
- S3 Transfer Acceleration (upload only):
  - Increases transfer speed by sending the file to an AWS edge location, which forwards the data to the S3 bucket in the target region over the AWS backbone.
  - Compatible with multipart upload
  - Check this URL for the S3 Transfer Acceleration speed test: https://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/
- S3 byte-range fetches:
  - Parallelize GETs by requesting specific byte ranges
  - Better resilience in case of failures
  - Can be used to speed up downloads
  - Can be used to retrieve only partial data (for example, the head of a file)
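Byte-range fetches rely on the standard HTTP `Range` header. As a small sketch (the helper name is mine), this is how an object of a known size could be split into ranges that are then fetched in parallel, each with its own GET:

```python
def byte_ranges(object_size: int, part_size: int) -> list:
    """Return HTTP Range header values that together cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte object fetched in 4-byte parts:
print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

If one range fails, only that range needs to be retried, which is where the better resilience comes from. Requesting just the first range is enough to read the head of a file.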
14. S3 Select & Glacier Select:
- Retrieve less data by using SQL to perform server-side filtering
- Can filter by rows and columns (simple SQL statements)
- Less network transfer, less CPU cost on the client side.
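To give an idea of what such a request looks like, here is a sketch of the parameters for a select call on a CSV object. The bucket, the key, the column names (`name`, `city`, `country`) and the function name are all made up for the example. The dict follows the parameter names of the SDK's select call as I understand them, so treat it as an illustration, not a reference.

```python
def select_request(bucket: str, key: str) -> dict:
    """Build the parameters for a server-side SQL filter on a CSV object (sketch)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # Column filter (name, city) plus row filter (country = 'FR').
        "Expression": "SELECT s.name, s.city FROM S3Object s WHERE s.country = 'FR'",
        # "USE" tells S3 to read column names from the first CSV line.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


print(select_request("example-bucket", "data/customers.csv"))
```

Only the matching rows and the two selected columns travel over the network, instead of the whole file.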
15. Object & Glacier Vault Lock:
If you know of any other S3 functionality that I didn't mention, please feel free to post it in the comments.