Amazon Simple Storage Service, widely known as Amazon S3, is a highly scalable, fast, and durable solution for object-level storage of any data type. Unlike the operating systems we are all used to, Amazon S3 does not store files in a file system; instead, it stores files as objects. Object storage lets users upload files, videos, and documents much as they would to popular cloud storage products like Dropbox and Google Drive. This makes Amazon S3 very flexible and platform agnostic.
Objects also hold metadata, which we will cover in detail later.
What is the difference between Amazon S3 and EC2?
To someone who’s never used AWS services, trying to understand their various products may be difficult the first time around. Between S3, EC2, RDS, VPC, and EKS, going through Amazon’s catalog of services can seem like alphabet soup.
To start putting the pieces of the AWS puzzle together, it's important to understand the core services that AWS offers. Today we'll compare EC2 with S3 and cover the different storage classes of S3.
Amazon EC2 and Amazon S3 are different enough that the average person should be able to draw the distinction without much difficulty.
Amazon EC2 provides access to cloud-based servers, also known as virtual machines. You can do pretty much anything on these virtual machines. Consider them the same as your own home computer, but running Linux (or Windows in some cases) and reached remotely through a terminal or shell.
S3 buckets are storage locations, often used for backing up data in conjunction with EC2 instances. You can store photos, text logs, videos, songs, books, and other files in an S3 bucket.
In short, think of Amazon EC2 as a personal computer that lives in the cloud, and Amazon S3 as an external hard drive or a cloud storage service similar to Dropbox.
How Amazon S3 works
Amazon S3 works as an object storage service, which is different from typical file storage or block storage. When a user uploads data to S3, the file is stored as an object with its metadata intact, and the object as a whole is given an ID (its key) that identifies it within its bucket.
There are two kinds of metadata: system-defined and user-defined. S3 uses system-defined metadata to maintain important attributes such as creation date, size, and last-modified date.
Objects also take in user-defined metadata. User-defined metadata allows users to assign key-value pairs to the data they upload. These key-value pairs help users identify, organize, and assign objects to specific resources, or allow for easy retrieval.
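As a rough illustration of the two kinds of metadata, here is a minimal Python sketch. It assumes the boto3 SDK's `put_object` and `head_object` calls; the bucket name, object key, and metadata pairs are hypothetical, and the `client` argument is expected to be something like `boto3.client("s3")`.

```python
from typing import Any, Dict


def build_put_request(bucket: str, key: str, body: bytes,
                      user_metadata: Dict[str, str]) -> Dict[str, Any]:
    """Assemble keyword arguments for an S3 PutObject call.

    User-defined metadata is passed as plain key-value pairs in "Metadata".
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "Metadata": dict(user_metadata),
    }


def upload_and_inspect(client: Any, request: Dict[str, Any]) -> Dict[str, Any]:
    """Upload the object, then read back both kinds of metadata."""
    client.put_object(**request)
    head = client.head_object(Bucket=request["Bucket"], Key=request["Key"])
    return {
        # System-defined metadata, maintained by S3 itself.
        "size": head["ContentLength"],
        "last_modified": head["LastModified"],
        # User-defined metadata, the pairs supplied at upload time.
        "user": head["Metadata"],
    }


request = build_put_request(
    bucket="example-bucket",        # hypothetical bucket name
    key="reports/2020-q1.csv",      # hypothetical object key
    body=b"region,revenue\n",
    user_metadata={"department": "finance", "retention": "7y"},
)
print(request["Metadata"])
```

With boto3 installed and AWS credentials configured, calling `upload_and_inspect(boto3.client("s3"), request)` would perform the upload and return the size, last-modified timestamp, and user-defined pairs for the stored object.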
Advantages of using Amazon S3
As previously mentioned, Amazon S3 has some unique benefits as an object storage service as compared to traditional file or block storage. Some major advantages of using Amazon S3 include durability, security, and reliability. Per Amazon’s documentation, Amazon S3 provides customers with a 99.999999999% rate of durability.
How does Amazon achieve this level of durability? For most storage classes, Amazon S3 redundantly stores your data across multiple devices spanning at least three Availability Zones (AZs) in an S3 Region.
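To get a feel for what eleven nines means in practice, the short Python sketch below works through the arithmetic. It assumes the durability figure can be read as an independent, per-object probability of surviving one year, which is a simplification made for illustration only.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # eleven nines, as quoted above


def expected_annual_losses(object_count: int) -> Decimal:
    """Expected number of objects lost per year.

    Assumes the durability figure is an independent annual survival
    probability for each object (a simplification).
    """
    return (Decimal(1) - DURABILITY) * object_count


def years_per_lost_object(object_count: int) -> Decimal:
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_losses(object_count)


losses = expected_annual_losses(10_000_000)
print("expected losses per year:", losses)
print("years per lost object:", years_per_lost_object(10_000_000))
```

Under that assumption, storing ten million objects gives an expected loss of 0.0001 objects per year, or roughly one object every 10,000 years.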
Amazon S3 Features
While durability and reliability are great features, Amazon S3 has other features that set it apart from other data storage services. Data can be transferred to S3 via the S3 API over the public internet, which makes uploading data and automating backups easier for developers.
Users are also able to take advantage of other S3 features such as:
- The ability to write, read, and delete objects from 1 byte to 5 terabytes
- Unlimited number of objects
- Authentication mechanisms that grant access to authorized users and deny access to everyone else
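- REST and SOAP API interfaces (AWS has deprecated SOAP over HTTP and does not add new S3 features to SOAP, so REST is the safer choice for new work)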
- Simplicity in managing data by segregating data by buckets, monitoring access, and controlling data life-cycles
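As one illustration of controlling data life-cycles, the sketch below builds a lifecycle configuration in Python. It assumes the boto3 SDK's `put_bucket_lifecycle_configuration` call; the rule ID, prefix, and day counts are hypothetical choices, and the storage class names it uses are described in the next section.

```python
from typing import Any, Dict


def build_lifecycle_config(prefix: str) -> Dict[str, Any]:
    """Describe one rule that moves aging objects to cheaper storage."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",   # hypothetical rule name
                "Filter": {"Prefix": prefix},  # only keys under this prefix
                "Status": "Enabled",
                "Transitions": [
                    # After 30 days, move to Standard-Infrequent Access.
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # After 90 days, move to Glacier.
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                # Delete the object one year after it was created.
                "Expiration": {"Days": 365},
            }
        ]
    }


def apply_lifecycle(client: Any, bucket: str, config: Dict[str, Any]) -> None:
    """Attach the configuration to a bucket; client is e.g. boto3.client("s3")."""
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_lifecycle_config("logs/")  # hypothetical prefix
print(config["Rules"][0]["Transitions"])
```

Keeping the rule as plain data makes it easy to review or version before `apply_lifecycle` sends it to a real bucket.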
Amazon S3 Storage Classes
Amazon S3 offers a wide range of storage classes, which gives users flexibility in configuring their storage. The different storage classes include:
- Amazon S3 Standard
  - This is your standard storage for "hot" or live data access. Availability is 99.99%, and it is usually the go-to storage class for most users.
- Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
  - This storage class is for data that is accessed infrequently. It offers lower storage costs but higher retrieval costs. This class is recommended for data that can sit at rest for at least 30 days without being accessed, such as archives, but that can be retrieved relatively quickly when needed.
- Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
  - Very similar to Amazon S3 Standard-IA, but costs 20% less because data is stored in a single Availability Zone rather than three.
- Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering)
  - A relatively new storage class introduced in late 2018. AWS monitors data in S3 Intelligent-Tiering and automatically moves objects from the frequent access tier to the infrequent access tier after 30 days of inactivity. If an object is then accessed, it is moved back to the frequent access tier.
- Amazon S3 Glacier (S3 Glacier)
  - S3 Glacier is great for archiving data for long-term storage that doesn't need instant access or retrieval. S3 Glacier offers storage at a much lower cost than S3 Standard and even S3 Standard-IA, but charges extra for data removed before the 90-day minimum and for retrieving more than 10 GB per month.
- Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive)
  - Amazon has extended its S3 Glacier service with Amazon S3 Glacier Deep Archive. Starting at $0.00099 per GB per month, it is the cheapest way to store data on AWS. This storage class is meant for data that you would never need instantly, as even the fastest restore takes up to 12 hours.
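To put the Deep Archive price in perspective, here is a small worked calculation in Python. It uses only the $0.00099 per GB per month figure quoted above; the 50 TB archive size is a hypothetical example, and retrieval, request, and early-deletion charges are deliberately left out.

```python
from decimal import Decimal

# USD per GB per month, the starting price quoted above.
DEEP_ARCHIVE_PRICE_PER_GB_MONTH = Decimal("0.00099")


def monthly_storage_cost(gigabytes: int) -> Decimal:
    """Storage-only cost for one month, ignoring all other fees."""
    return DEEP_ARCHIVE_PRICE_PER_GB_MONTH * gigabytes


def annual_storage_cost(gigabytes: int) -> Decimal:
    """Storage-only cost for twelve months at a constant size."""
    return monthly_storage_cost(gigabytes) * 12


archive_size_gb = 50_000  # hypothetical 50 TB archive, taking 1 TB as 1,000 GB
print(f"monthly: ${monthly_storage_cost(archive_size_gb):.2f}")
print(f"annual:  ${annual_storage_cost(archive_size_gb):.2f}")
```

At that rate a 50 TB archive costs about $49.50 per month, or $594 per year, for storage alone. Actual pricing varies by Region and changes over time, so treat the numbers as an order-of-magnitude guide.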
Amazon S3 Application Programming Interfaces (API)
Customers and end users have the option to interact with Amazon S3 via its API. Both REST and SOAP interfaces are provided, making S3 flexible and language neutral. This is great for developers who need to store and retrieve data via a programmable interface. The S3 API lets users store, retrieve, list, delete, and move objects in Amazon S3.
Since its introduction, Amazon's S3 API has been adopted as the standard for object-based storage interfaces. Many vendors and third-party software companies offer support and developer guides for Amazon S3, which continues to make it the leading object storage service.
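The store, retrieve, list, delete, and move operations mentioned above can be sketched in Python roughly as follows. This assumes the boto3 SDK, whose client exposes `put_object`, `get_object`, `list_objects_v2`, `delete_object`, and `copy_object`; the helper names are hypothetical, and "move" is implemented here as a copy followed by a delete.

```python
from typing import Any, List


def make_client() -> Any:
    """Create a real S3 client; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers below carry no dependency
    return boto3.client("s3")


def store(client: Any, bucket: str, key: str, data: bytes) -> None:
    client.put_object(Bucket=bucket, Key=key, Body=data)


def retrieve(client: Any, bucket: str, key: str) -> bytes:
    # The response body is a stream; read() returns its full contents.
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()


def list_keys(client: Any, bucket: str, prefix: str = "") -> List[str]:
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent when nothing matches. One call returns at most
    # 1,000 keys, so larger buckets would need pagination.
    return [item["Key"] for item in response.get("Contents", [])]


def delete(client: Any, bucket: str, key: str) -> None:
    client.delete_object(Bucket=bucket, Key=key)


def move(client: Any, bucket: str, source_key: str, target_key: str) -> None:
    """Copy the object to the new key, then delete the original."""
    client.copy_object(
        Bucket=bucket,
        Key=target_key,
        CopySource={"Bucket": bucket, "Key": source_key},
    )
    client.delete_object(Bucket=bucket, Key=source_key)
```

Because every helper takes the client as an argument, the same functions work against `make_client()` in production or against a small in-memory stand-in during tests.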
Integrating AWS S3 with Sumo Logic
Monitoring and logging are an important part of ensuring availability, reliability, and performance for customers. By collecting and monitoring logs from all sources and services within AWS, users gain valuable insight into how application infrastructure is performing, where the failure points are, and where tweaks can improve functionality.
In the next part of this three-part series, we'll go over AWS S3 monitoring and the importance of logging. We'll cover how to gather data, how to leverage that data, how to understand AWS S3 monitoring metrics, and how AWS S3 is monitored using Sumo Logic.