Today marks the launch of Amazon EBS (Elastic Block Store), the long awaited persistent storage service for EC2. Details can be found on the EC2 detail page, the press release and Jeff Barr's posting over on the AWS evangelists blog. Also the folks at Rightscale have two detailed postings: why Amazon EBS matters and Amazon EBS explained.
With the launch of the Elastic Block Store we complete an important milestone in offering a complete suite of storage solutions as part of the Amazon Infrastructure Services. Back in the days when we made the architectural decision to virtualize the internal Amazon infrastructure one of the first steps we took was a deep analysis of the way that storage was used by the internal Amazon services. We had to make sure that the infrastructure storage solutions we were going to develop would be highly effective for developers by addressing the most common patterns first. That analysis led us to three top patterns:
- Key-Value storage. The majority of the Amazon storage patterns were based on primary key access leading to single value or object. This pattern led to the development of Amazon S3.
- Simple Structured Data storage. A second large category of storage patterns were satisfied by access to simple query interface into structured datasets. Fast indexing allows high-speed lookups over large dataset. This pattern led to the development of Amazon SimpleDB. A common pattern we see is that secondary keys to objects stored in Amazon S3 are stored in SimpleDB, where lookups result in sets of S3 (primary) keys.
- Block storage. The remaining bucket holds a variety of storage patterns ranging special file systems such as ZFS to applications managing their own block storage (e.g. cache servers) to relational databases. This category is served by Amazon EBS which provides the fundamental building block for implementing a variety of storage patterns.
I have written before about the basic features of Amazon EBS:
- Amazon EBS will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.
- Developers can create multiple volumes, in size ranging from 1 GB to 1TB. This volume will be created within a specified Availability Zone and will be accessible by your EC2 instances running in that Availability Zone. As to be expected with a volume abstraction only one instance can have the volume mounted at any given time. Volumes can migrate and be reattached to other instances if necessary for failure handling or application migration reasons.
- The consistency of data written to this device is similar to that of other local and network-attached devices; it is under control of the developer when and how to force flush data to disk if you want to bypass the traditional lazy-writer functionality in the operating systems file-cache. Because of the session oriented model for access to the volume you do not need to worry about eventual consistency issues.
However Amazon EBS isn't just a massive volume storage array within an Availability Zone, it provides a unique feature that allows for the creation of novel storage management scenarios: the ability to create snapshots and store those snapshots into Amazon S3. These snapshots can then be used as the starting point for creating new volumes within any availability zone.
We see developers use this feature for long term backup purposes, for use in rollback strategies, for (world-wide) volume re-creation purposes. Snapshots also play an important role in building fault-tolerance scenarios when combined with managing applications using Elastic IP addresses and Availability Zones.
Congratulations to the EBS team for delivering a great service that will help a lot of EC2 customers managing their storage efficiently.