DEV Community

Cover image for Amazon S3 Fundamentals
Trịnh Duy Hưng
Trịnh Duy Hưng

Posted on • Updated on

Amazon S3 Fundamentals

1. Object Storage

Unlike the hierarchical structure used in file-based storage, object storage is a flat structure where the data, called an object, is located in a single repository known as a bucket. Object can be organized to imitate a hierarchy by attaching key name prefixes and delimiters. Prefixes and delimiters allow you to group similar items to help visually organize and easily retrieve your data. In the user interface, these prefixes give the appearance of a folder and subfolder structure but in reality, the storage is still a flat structure.

S3 was launched on March 14, 2006, which was the second service after Amazon SQS.

Amazon S3 is an object storage service which provides industry-leading scalability, data availability, security, and performance. It is intentionally built with a minimal feature set that focuses on simplicity and robustness. Customers have access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites.

To demonstrate how prefix works in S3, take a look at the picture below. To organize and group the oceanography data for the external vendor you created a logical hierarchy using the prefix ocean. Ocean looks like a subfolder but this is only to help make the structure readable.

Image description

In reality, the key name of the dolphin object is a little longer, allowing us to locate the ocean/dolphin.jpg object. The object still sits in a single flat-storage structure.

2.Bucket overview

Image description

Buckets are permanent containers that hold objects. You can create between 1 and 100 buckets in each AWS account. You can increase the bucket limit to a maximum of 1,000 buckets by submitting a service limit increase. Bucket sizes are virtually unlimited so you don't have to allocate a predetermined bucket size.

An Amazon S3 bucket is a storage option with the ability to:

  • Host a static website
  • Retain version information on objects
  • Apply life-cycle management policies to balance version retention with bucket size and cost.

Bucket limitations

There are some important restrictions and limitations:

  • Amazon S3 buckets are owned by the account that creates them and cannot be transferred to other accounts
  • Bucket names are globally unique. There can be no duplicate names within the entire S3 infrastructure. Once created, you cannot change a bucket name.
  • Buckets are permanent storage entities and only removable when they are empty. After deleting a bucket, the name becomes available for reuse by any account after 24 hours if not taken by another account.
  • There’s no limit to the number of objects you can store in a bucket. You can store all of your objects in a single bucket, or organize them across several buckets. However, you can't create a bucket from within another bucket, also known as nesting buckets.
  • By default, you can create up to 100 buckets in each of your AWS accounts. If you need additional buckets, you can increase your account bucket limit to a maximum of 1,000 buckets by submitting a service limit increase.

3.Organizing data using tags

A tag is a label that you assign to an AWS resource. Each tag consists of a key and an optional value, both of which you define to suit your company's requirements. Tags enable you to categorize your AWS resources or data in different ways.

For example, you could define a set of tags for your objects that help you track project data or owner. Amazon S3 tags are key-value pairs and apply to a whole bucket or to individual objects to help with identification, searches, and data classification. Using tags for your objects allows you to effectively manage your storage and provide valuable insight on how your data is used. Newly created tags assigned to a bucket, are not retroactively applied to its existing child objects.

  • Bucket tags allow you to track storage cost, or other criteria, by labeling your Amazon S3 buckets using cost allocation tags. A cost allocation tag is a key-value pair that you associate with an S3 bucket. After you activate cost allocation tags, AWS uses the tags to organize your resource costs on your cost allocation report. You can only use cost allocation tags on buckets and not on individual objects. AWS provides two types of cost allocation tags, an AWS-generated tag and user-defined tag. AWS defines, creates, and applies the AWS-generated tag, createdBy, for you after an S3 CreateBucket event. You define, create, and apply user-defined tags to your S3 bucket.
  • Each S3 bucket has a tag set. A tag set contains all of the tags that are assigned to that bucket and can contain as many as 50 tags, or it can be empty. Keys must be unique within a tag set but values don't. In the image below the Value: ocean-life is listed twice in tag sets named project/ocean-life and topic/ocean-life. Since values don't need to be unique these entries are fine. However, when attempting to add a second Key called "items" an error occurs because the key must be unique within the tag set.
  • Object tagging gives you a way to categorize and query your storage. You can add tags to an Amazon S3 object during the upload or after the upload. You can associate up to 10 tags with an object they must have unique tag keys.

Adding tags to your objects offer benefits such as the following:

  • Object tags enable fine-grained access control of permissions. For example, you could grant an IAM user permission to read-only objects with specific tags.
  • Object tags enable fine-grained object lifecycle management in which you can specify a tag-based filter, in addition to a key name prefix, in a lifecycle rule.
  • When using Amazon S3 analytics, you can configure filters to group objects together for analysis by object tags, key name prefix, or both prefix and tags.
  • You can also customize Amazon CloudWatch metrics to display information by specific tag filters. The next lesson provides more details.

4.Region location

Amazon S3 is a globally viewable service. This means that in the AWS Management Console you do not have to specify a region in order to view the buckets. Remember that when you initially create the bucket, you must choose a region to indicate where you want the bucket data to reside. The region you choose should be local to your users or consumers to optimize latency, minimize costs, or to address regulatory requirements.

For example, if you reside in Europe, you will want to create buckets in the Europe (Ireland) or Europe (Frankfurt) regions rather than creating your buckets in Asia Pacific (Sydney) or South America (Sao Paulo). This way the data is closer to your users and consumers, reducing latency and ensuring regulatory and meets country legal requirements.

  • Cross-Region Replication (CRR)
    If you need data stored in multiple regions, you can replicate your bucket to other regions using cross-region replication. This enables you to automatically copy objects from a bucket in one region to different bucket in a another, separate region. You can replicate the entire bucket or you can use tags to replicate only the objects with the tags you choose.

  • Same-Region Replication (SRR)
    Amazon S3 supports automatic and asynchronous replication of newly uploaded S3 objects to a destination bucket in the same AWS Region.
    SRR makes another copy of S3 objects within the same AWS Region, with the same redundancy as the destination storage class. This allows you to automatically aggregate logs from different S3 buckets for in-region processing, or configure live replication between test and development environments. SRR helps you address data sovereignty and compliance requirements by keeping a copy of your objects in the same AWS Region as the original.

5.Amazon S3 hosting static websites

You can use Amazon S3 to host a static website. With a static website, the individual web pages include only static content that does not change frequently. This is different from using a dynamic website, where the content is constantly changing and constantly updated. From the AWS Management Console, you can easily configure your bucket for static website hosting without needing to write any code.

To host a static website on Amazon S3, you configure a bucket for website hosting and then upload your content. When you configure a bucket as a static website, you must enable website hosting, set public read permissions, and create and add an index document.

Image description

Depending on your website requirements, you can also configure redirects, web traffic logging, and a custom error document. If you prefer not to use the AWS Management Console, you can create, update, and delete the website configuration programmatically by using the AWS SDKs.

Thank you, see you next time.

Top comments (0)