Big Just Got Bigger - 5 Terabyte Object Support in Amazon S3

Today, Amazon S3 announced a new breakthrough in supporting customers with large files by increasing the maximum supported object size from 5 gigabytes to 5 terabytes. This allows customers to store and reference a large file as a single object instead of smaller 'chunks'. When combined with the Amazon S3 Multipart Upload release, this dramatically improves how customers upload, store and share large files on Amazon S3.

Who has files larger than 5GB?

Amazon S3 has always been a scalable, durable and available data repository for almost any customer workload. However, as use of the cloud has grown, so have the file sizes customers want to store in Amazon S3 as objects. This is especially true for customers managing HD video or data-intensive instruments such as genomic sequencers. For example, a 2-hour movie on Blu-ray can be 50 gigabytes. The same movie stored in an uncompressed 1080p HD format is around 1.5 terabytes.

By supporting such large object sizes, Amazon S3 better enables a variety of interesting big data use cases. For example, a movie studio can now store and manage its entire catalog of high definition original files on Amazon S3 as individual objects. Any movie or collection of content could be easily pulled into Amazon EC2 for transcoding on demand and moved back into Amazon S3 for distribution through edge locations throughout the world with Amazon CloudFront. Or, BioPharma researchers and scientists can stream genomic sequencer data directly into Amazon S3, which frees up local resources and allows scientists to store, aggregate, and share human genomes as single objects in Amazon S3. Any researcher anywhere in the world then has access to a vast genomic data set, along with on-demand compute power for analysis, such as Amazon EC2 Cluster GPU Instances, that was previously only available to the largest research institutions and companies.

Multipart Upload and moving large objects into Amazon S3

To make uploading large objects easier, Amazon S3 also recently announced Multipart Upload, which allows you to upload an object in parts. You can upload parts in parallel to better utilize your available bandwidth and even stream data into Amazon S3 as it's being created. Also, if a given upload runs into a networking issue, you only have to restart that part, not the entire object, allowing you to recover quickly from intermittent network errors.
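To sketch what this looks like in practice, here is a minimal upload-in-parts flow using the AWS SDK for Python (boto3). The bucket name, key, file path, and 100 MB part size are illustrative placeholders rather than values from this announcement.

    import boto3

    # Minimal sketch of a multipart upload with boto3; the bucket, key, and
    # file path are placeholders, not values from the announcement.
    s3 = boto3.client("s3")
    bucket, key, path = "my-bucket", "movies/feature.mov", "feature.mov"
    part_size = 100 * 1024 * 1024  # 100 MB parts (each part except the last must be at least 5 MB)

    upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
    parts = []
    try:
        with open(path, "rb") as f:
            part_number = 1
            while True:
                data = f.read(part_size)
                if not data:
                    break
                # Each part is acknowledged individually, so a failed part can
                # be retried on its own without re-uploading the whole object.
                resp = s3.upload_part(
                    Bucket=bucket, Key=key, UploadId=upload["UploadId"],
                    PartNumber=part_number, Body=data,
                )
                parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
                part_number += 1
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload["UploadId"],
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the already-uploaded parts are cleaned up.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload["UploadId"])
        raise

This sketch uploads parts sequentially for clarity; because parts are independent, the same loop can be spread across threads or processes to upload several parts in parallel.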

Multipart Upload isn't just for customers with files larger than 5 gigabytes. With Multipart Upload, you can upload any object larger than 5 megabytes in parts. So, we expect customers with objects larger than 100 megabytes to extensively use Multipart Upload when moving their data into Amazon S3 for a faster, more flexible upload experience.
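For routine transfers you don't have to manage parts by hand. As a hedged example, boto3's higher-level transfer configuration can switch to Multipart Upload automatically once a file crosses a size threshold; the threshold, part size, and concurrency values below are illustrative choices, not official recommendations.

    import boto3
    from boto3.s3.transfer import TransferConfig

    # Illustrative settings only: switch to multipart above 100 MB,
    # upload 50 MB parts, and run up to 8 part uploads in parallel.
    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,
        multipart_chunksize=50 * 1024 * 1024,
        max_concurrency=8,
    )

    s3 = boto3.client("s3")
    # The file, bucket, and key names are placeholders.
    s3.upload_file("genome.fastq", "my-bucket", "datasets/genome.fastq", Config=config)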

More information

For more information on Multipart Upload and managing large objects in Amazon S3, see Jeff Barr's blog posts on Amazon S3 Multipart Upload and Large Object Support as well as the Amazon S3 Developer Guide.