AWS Storage Services

S3 (Simple Storage Service)

• It’s object-based storage, so it’s not suitable for installing an OS on or running a DB from.
• Objects are stored in buckets.
• The URL for a bucket looks like https://s3-<region>.amazonaws.com/<bucket-name>
• It consists of:

o Key: the name of the object
o Value: the actual data inside the file
o Version ID: identifies the version of the object (when versioning is enabled)
o Metadata: information about the object, such as content-type
o Subresources:
▪ Access Control List: per-object permissions
▪ Torrent: used for torrenting (not so important)
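
As a minimal sketch of these parts, here is boto3 code (Python) that stores and reads back an object; the bucket and key names are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Key = object name, Body = value, Metadata = user-defined information.
s3.put_object(
    Bucket="my-bucket",                      # hypothetical bucket name
    Key="notes/hello.txt",                   # the key
    Body=b"hello world",                     # the value
    ContentType="text/plain",                # system metadata
    Metadata={"department": "engineering"},  # user-defined metadata
)

obj = s3.get_object(Bucket="my-bucket", Key="notes/hello.txt")
print(obj["Body"].read())       # the value
print(obj["Metadata"])          # user-defined metadata
print(obj.get("VersionId"))     # version ID (present only when versioning is on)
```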

• Charge:

o Storage
o Requests
o Storage Management Pricing: e.g., cost-allocation tagging per department such as HR, Engineering, etc.
o Data Transfer Pricing
o Transfer Acceleration
▪ It lets users upload files to the nearest CloudFront edge location; AWS then moves the data over an optimized route to the Region-based S3 bucket (see the sketch below).
▪ When you enable this property on an S3 bucket, S3 provides a unique endpoint URL that connects to the edge locations.
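
A minimal sketch of enabling Transfer Acceleration with boto3 (the bucket name is hypothetical); once enabled, uploads can target the accelerated endpoint:

```python
import boto3

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-bucket",  # hypothetical bucket name
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then upload via the accelerated endpoint, e.g.:
#   https://my-bucket.s3-accelerate.amazonaws.com
```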

• It also provides static website hosting, and the URL looks like http://<bucket-name>.s3-website-<region>.amazonaws.com
• It provides unlimited storage
• Data is automatically distributed across a minimum of three physical facilities that are geographically separated within an AWS Region, and Amazon S3 can also automatically replicate data to any other AWS Region.
• Amazon S3 now provides increased performance to support at least 3,500 requests per second to add data and 5,500 requests per second to retrieve data, which can save significant processing time for no additional charge.
• It allows you to upload files; an object can be anywhere from 0 B to 5 TB in size.
• With S3 Multipart Upload, the maximum object size is 5 TB and each part can be 5 MB to 5 GB (the last part can be < 5 MB).
• Please refer to the table below for the complete limits (a multipart upload sketch follows the table):

| Limit | Value |
| --- | --- |
| Maximum object size | 5 TB |
| Maximum number of parts per upload | 10,000 |
| Part numbers | 1 to 10,000 (inclusive) |
| Part size | 5 MB to 5 GB (last part can be < 5 MB) |
| Maximum number of parts returned per list parts request | 1,000 |
| Maximum number of multipart uploads returned per list multipart uploads request | 1,000 |
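
A minimal multipart upload sketch with boto3 (bucket, key, and file names are hypothetical), following the initiate → upload parts → complete flow:

```python
import boto3

s3 = boto3.client("s3")

bucket, key = "my-bucket", "big-file.bin"  # hypothetical names

# 1. Initiate the multipart upload and get an upload ID.
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
upload_id = mpu["UploadId"]

parts = []
part_size = 100 * 1024 * 1024  # 100 MB; parts must be 5 MB-5 GB, except the last

# 2. Upload the file in parts, recording each part's ETag.
with open("big-file.bin", "rb") as f:
    part_number = 1
    while True:
        data = f.read(part_size)
        if not data:
            break
        resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=part_number,
                              UploadId=upload_id, Body=data)
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# 3. Complete the upload by supplying the ordered part list.
s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)
```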

• S3 uses a universal namespace, which means bucket names must be globally unique.
• S3 buckets are created at the Region level, but the console lists them at a “Global” level.
• When a file is uploaded to a bucket successfully, you receive HTTP status code 200.
• Read-after-write consistency for PUTs of new objects, immediately.
• Eventual consistency for overwrite PUTs and DELETEs: because data is spread across multiple AZs, updates to an existing object take some time to propagate, while creating a new object doesn’t.
• Encryption

o In Transit
▪ SSL/TLS
o At Rest
▪ Server side (see the sketch below)
• S3-managed keys – SSE-S3 (AWS self-managed)
• AWS Key Management Service – SSE-KMS; same as SSE-S3 but with an extra charge
• Customer-provided keys – SSE-C
▪ Client side
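
A minimal sketch of requesting server-side encryption at upload time with boto3 (bucket and key names are hypothetical); the ServerSideEncryption parameter selects SSE-S3 or SSE-KMS:

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: S3 manages the encryption keys.
s3.put_object(Bucket="my-bucket", Key="doc.txt", Body=b"data",
              ServerSideEncryption="AES256")

# SSE-KMS: keys managed in AWS KMS (extra charge).
s3.put_object(Bucket="my-bucket", Key="doc-kms.txt", Body=b"data",
              ServerSideEncryption="aws:kms")
```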

• Access Control List (for individual files)
• Bucket policy
• Versioning

o Once versioning has been enabled on a bucket, it cannot be disabled; it can only be suspended.
o Being a version control system, it keeps deleted files as well. When you delete a file, S3 keeps it and puts a “Delete Marker” on it, which can be seen by clicking the “Show” versions option; if you then delete the delete marker, the latest version of the file is restored.
o Being object-based storage, it keeps every version of a file, which consumes space, so it is not suggested for buckets holding large, frequently changing files.
o MFA Delete can be enabled to avoid accidental deletes (see the sketch below).
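
A minimal sketch of enabling versioning on a bucket with boto3 (bucket name hypothetical); note that the status can later be set to “Suspended” but never fully disabled:

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning; use {"Status": "Suspended"} to suspend it later.
s3.put_bucket_versioning(
    Bucket="my-bucket",  # hypothetical bucket name
    VersioningConfiguration={"Status": "Enabled"},
)
```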

• Cross Region Replication

o It enables copying objects from a bucket in one account to a bucket in any other account.
o Because replication overwrites objects, versioning is required: only a version-enabled bucket can replicate to another version-enabled bucket (see the sketch below).
o Replication copies data only from the time it is enabled; pre-existing objects have to be copied explicitly.
o It works for objects modified directly in the bucket, not for operations on versions (the “Show” option). E.g., if a file has 2 versions and is deleted, the delete is replicated to the destination bucket; but if the “Delete Marker” is deleted to bring the latest file back, that change is not replicated to the destination bucket.
o Replication to multiple destination buckets doesn’t work for now.
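
A minimal sketch of configuring cross-region replication with boto3 (bucket names, account ID, and IAM role are hypothetical); both buckets must already have versioning enabled:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="source-bucket",  # hypothetical source bucket (versioning enabled)
    ReplicationConfiguration={
        # Hypothetical IAM role that grants S3 permission to replicate.
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-everything",
            "Prefix": "",        # empty prefix = replicate all objects
            "Status": "Enabled",
            "Destination": {
                # Hypothetical destination bucket (versioning enabled).
                "Bucket": "arn:aws:s3:::destination-bucket",
            },
        }],
    },
)
```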

• Built for 99.99% availability, but Amazon’s SLA guarantees 99.9%.
• Designed for 99.999999999% (11 9s) durability, meaning that on average 99.999999999% of stored objects are expected to remain intact over a year.
• Tiered storage

o S3 Standard: 99.99% availability and 99.999999999% (11 9s) durability.
o S3-IA: for infrequently accessed data, but still spread across multiple AZs.
o S3 One Zone-IA: for infrequently accessed data, stored in only 1 AZ.
o Reduced Redundancy Storage (RRS): for frequently accessed data. Stores noncritical, reproducible data at lower levels of redundancy than Standard. It provides 99.99% availability and 99.99% durability. It is useful when the data can be regenerated, e.g., when retrieval metadata is kept in DynamoDB.
o Glacier:
▪ Expedited: quick retrieval (typically minutes).
▪ Standard: retrieval takes 3-5 hours.
▪ Bulk: retrieval takes 5-12 hours.

• Lifecycle management

o Can be used in conjunction with versioning.
o Can be applied to current and previous versions.
o It moves files from Standard down through the following tiers, in this order, each transition requiring a minimum age in days, with at least a 30-day gap between two tiers (see the sketch below):
▪ IA
▪ One Zone-IA
▪ Glacier
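
A minimal lifecycle-rule sketch with boto3 (bucket name and day counts are hypothetical), transitioning objects through the tiers in the order listed above with 30-day gaps:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-down",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to all objects
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 60, "StorageClass": "ONEZONE_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }],
    },
)
```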

Glacier

• Archival file storage, meant for content that is accessed very infrequently.

Snowball

• It’s a way to bring huge amounts of data into AWS data centers physically, using a tamper-proof, secured device. It’s mainly for legacy applications. It can import data to and export data from S3.
• Types

o Snowball
▪ It’s an appliance that helps transfer data to and from AWS.
▪ Once the data is processed and verified, AWS performs an erasure of the device.
▪ It supports up to 80 TB.
o Snowball Edge
▪ It supports up to 100 TB.
▪ It provides compute capability as well.
o Snowmobile
▪ It supports up to 100 PB.

Storage Gateway

• It’s a way to connect an organization’s on-premises data centers to AWS storage services.
• AWS provides VM-based software to be installed on your host machines, which then connects to AWS.
• Types:

o File Gateway (NFS): for flat files, stored in S3.

o Volume Gateway (iSCSI): block-based storage (e.g., OS volumes); backups are stored in Amazon Elastic Block Store, and only changed blocks are transferred.
▪ Stored Volumes: the entire data set is stored on-site and backed up to the cloud asynchronously. Volumes range from 1 GB to 16 TB.
▪ Cached Volumes: the most frequently accessed data is cached on-site while the full data set is stored in the cloud. Volumes range from 1 GB to 32 TB.
o Tape Gateway (VTL): for infrequently accessed data, which is then moved to Glacier via lifecycle management.

Elastic File System (EFS)

• It doesn’t need capacity pre-provisioning, unlike the storage we size when creating EC2 instances.
• EFS is file-based storage.
• It shrinks and expands based on need.
• It supports Network File System version 4 (NFSv4).
• It can scale up to petabytes.
• It supports up to thousands of concurrent NFS connections.
• Read-after-write consistency.
• It spreads across multiple AZs in a given Region.
• EC2 instances access EFS via mount targets.
• AWS recommends creating a mount target in every AZ so that each instance has a mount target in its own AZ.
• For an EC2 instance to access EFS, the EFS mount target’s security group must allow traffic from the instance (e.g., by sharing one of the EC2 instance’s security groups).
• By following the mount instructions shown in EFS, you can mount it on all EC2 instances (see the sketch below).
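
A minimal sketch of creating an EFS file system and a mount target with boto3 (the creation token, subnet, and security-group IDs are hypothetical); instances in that subnet can then NFS-mount the target:

```python
import boto3

efs = boto3.client("efs")

# Create the file system; capacity is elastic, so no size is specified.
fs = efs.create_file_system(
    CreationToken="my-efs",           # hypothetical idempotency token
    PerformanceMode="generalPurpose",
)

# In practice, wait for the file system to become available first.
# Create a mount target in one AZ's subnet; repeat per AZ as AWS recommends.
efs.create_mount_target(
    FileSystemId=fs["FileSystemId"],
    SubnetId="subnet-0123456789abcdef0",      # hypothetical subnet ID
    SecurityGroups=["sg-0123456789abcdef0"],  # must allow NFS (TCP 2049)
)
```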
