Expanding the Cloud: More memory, more caching and more performance for your data

Today, we added two important choices for customers running high performance apps in the cloud: support for Redis in Amazon ElastiCache and a new high memory database instance (db.cr1.8xlarge) for Amazon RDS.

As we prepared to launch these features, I was struck not only by the range of services we provide to enable customers to run fully managed, scalable, high performance database workloads, including Amazon RDS, Amazon DynamoDB, Amazon Redshift and Amazon ElastiCache, but also by the pace at which these services are evolving and improving. Since you now have lots of choices to address your high performance database needs, I decided to write this blog to help you select the most appropriate services for your workload using lessons I have learnt by scaling the infrastructure for Amazon.com.

Choosing your database architecture may be the most critical decision you’ll make and has a disproportionate impact on the performance, scalability, and availability of your app. Get it right and your application will seamlessly scale from hundreds to tens of millions of users without difficulty, while remaining performant and available. Get it wrong and you’re looking at sleepless nights, struggling to keep up with growth and fighting to keep your app available while you rewrite critical portions of your code. Since databases are complex and have so much impact on our customers’ apps, from day 1 we have believed in delivering managed services and taking on the burden of provisioning, configuring, securing, backing up and restoring databases to enable our customers to focus on what they do best, which is to develop awesome apps for their users.

No single database architecture or solution can meet all of Amazon.com’s or our customers’ needs. For example, even within relational databases, some of the 3rd party apps we use at Amazon are only certified to run using Oracle databases whereas others use MySQL databases. Certain parts of our architecture used to run on relational databases but we just couldn’t scale them fast enough to meet the demands of our fast growing online retail business, particularly during the holiday shopping seasons. We endured significant disruptions to our retail infrastructure in the early 2000s and had to invent a new category of databases, such as Dynamo, that has come to be known as NoSQL. Since we moved these parts of our architecture to DynamoDB we can’t imagine doing it any other way because we don’t know of another solution that can seamlessly scale to our transaction rates while maintaining our stringent 100% availability demands. While we use DynamoDB extensively, we also have relational databases in other parts of our stack and they are equally critical.

AWS offers its customers a choice of different database services, each optimized for different workloads. DynamoDB is for customers who want high availability, predictable performance and scalability, and we limit some relational functionality to achieve these critical requirements. Amazon RDS, with support for MySQL, SQL Server and Oracle databases, is for customers with apps where relational database features and support for a specific brand of database are critical. We offer a high availability option called Amazon RDS Multi-AZ and commit to an availability SLA of 99.95%. We allow customers to provision the number of input/output operations per second (IOPS) they require by using Amazon RDS with Provisioned IOPS. Amazon ElastiCache is a fully managed, in-memory caching service for customers to optimize the latency, performance and cost of their read workloads. For our customers who need a scalable data warehouse we offer Amazon Redshift, a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools.

Today, we are further expanding the choices available for designing and developing highly scalable and high performance apps. In a relational database, how much of the working set fits in memory is critical for database performance. As the amount of data stored grows, the amount of memory needed also goes up. To address this, we are adding support for a new memory-optimized instance to Amazon RDS. The db.cr1.8xlarge has 88 ECUs, 244GB of memory, high-bandwidth networking, and the ability to deliver up to 20,000 IOPS for MySQL 5.6, an increase of 60 percent over the prior 12,500 IOPS limit for MySQL. This is an ideal instance for high-performance relational workloads.

Similar to how we offer multiple engines in Amazon RDS, starting today, we are supporting Redis as a new engine choice in Amazon ElastiCache, in addition to Memcached. I’ve seen Redis grow rapidly over the years and while some customers use it as a primary datastore, its main benefit is to augment your database tier to utilize data structures such as sorted sets and lists that are not readily available in traditional databases. Customers tell us that they love the ease of use and capabilities of Redis, but have been asking us to help simplify its management. Amazon ElastiCache for Redis provides the full capabilities of Redis and is designed to enable your existing libraries, applications and tools for Redis to just work. Amazon ElastiCache supports creation of Redis read replicas across availability zones and automatically detects and replaces failed read replicas. Integration with Amazon CloudWatch gives customers visibility into key performance metrics, further simplifying system management.

Many developers tell us that they want to rely on AWS to manage their databases so that they can spend their effort on building apps. For example, Scopely has built its gaming platform with DynamoDB as its primary datastore, while using Amazon RDS where it needs complex query support. For features that need data structures like sorted sets (e.g., leaderboards) it has been using Redis. With the launch of Redis in ElastiCache, Scopely is planning to move its self-managed Redis to Amazon ElastiCache to obtain the added benefits of monitoring and management without having to change its existing Redis toolchain. Similarly, gumi, one of the top game developers in Japan, uses Redis extensively in its platform for real-time leaderboard tracking in addition to its usage of RDS Multi-AZ and DynamoDB. gumi used to manage a large Redis fleet and is excited to begin moving away from the undifferentiated heavy lifting of self-managing Redis by adopting ElastiCache Redis.
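To make the leaderboard use case concrete, here is a minimal sketch of how a sorted set can back a leaderboard. It is an illustration under stated assumptions, not Scopely's or gumi's actual code. The helper names `record_score` and `top_players` are hypothetical, and `InMemorySortedSets` is a hypothetical in-process stand-in that mimics only the two sorted-set commands used here (ZINCRBY and ZREVRANGE) so the example runs without a server. The two helpers call `zincrby(name, amount, value)` and `zrevrange(name, start, end, withscores=True)`, which is the calling convention of the redis-py client, so a real client should be usable in place of the stand-in.

```python
from typing import Dict, List, Tuple


class InMemorySortedSets:
    """Hypothetical stand-in mimicking two Redis sorted-set commands."""

    def __init__(self) -> None:
        # One dict of member -> score per sorted-set key.
        self._sets: Dict[str, Dict[str, float]] = {}

    def zincrby(self, name: str, amount: float, value: str) -> float:
        # Like ZINCRBY: a missing member starts at 0, then amount is added.
        members = self._sets.setdefault(name, {})
        members[value] = members.get(value, 0.0) + amount
        return members[value]

    def zrevrange(self, name: str, start: int, end: int,
                  withscores: bool = False) -> list:
        # Like ZREVRANGE: highest score first; the end index is inclusive.
        members = self._sets.get(name, {})
        ranked = sorted(members.items(),
                        key=lambda kv: (kv[1], kv[0]), reverse=True)
        stop = None if end == -1 else end + 1
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


def record_score(client, board: str, player: str, points: float) -> float:
    """Add points to a player's running total and return the new total."""
    return client.zincrby(board, points, player)


def top_players(client, board: str, n: int) -> List[Tuple[str, float]]:
    """Return the n highest-scoring players as (player, score) pairs."""
    return client.zrevrange(board, 0, n - 1, withscores=True)


if __name__ == "__main__":
    client = InMemorySortedSets()
    record_score(client, "leaderboard", "ann", 50)
    record_score(client, "leaderboard", "bob", 70)
    record_score(client, "leaderboard", "ann", 30)
    print(top_players(client, "leaderboard", 2))
```

The design point is that the sorted set keeps members ordered by score as they are updated, so reading the top of the board is a range query rather than a sort over every player. With a real Redis client, members may come back as bytes unless the client is configured to decode responses.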

There are many customers like Scopely and gumi who believe they should “use the best tool for the specific use case”, but selecting the right database architecture can be challenging. To simplify the selection process, I recommend a simple rule of thumb. For critical workloads which need to be highly available at any scale, I generally recommend DynamoDB as it offers seamless scalability, predictable performance and high availability at low cost without any operational overhead. For workloads that need complex querying, transactions or specific relational features, I recommend Amazon RDS. It provides customers with familiar MySQL, Microsoft SQL Server or Oracle database engines while simplifying the monitoring and management of complex RDBMSs. You can augment your database tier with a caching layer using Amazon ElastiCache to lower read costs and reduce read latency using Memcached and now Redis, especially if you need those advanced data structures that are not typically provided by your database tier. You can analyze all of your data stored in DynamoDB or RDS using Amazon Redshift, a fully managed petabyte-scale data warehouse service that delivers increased query performance when analyzing virtually any size dataset for a tenth of the cost of most traditional data warehousing solutions.
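One common way to put a caching layer in front of a database is the cache-aside pattern: read from the cache first, fall back to the database on a miss, then store the result with a time-to-live. The sketch below illustrates that pattern under stated assumptions and is not an ElastiCache API. `TTLCache` is a hypothetical in-process stand-in for a Memcached or Redis client, and `cached_read` and its `load_from_db` callback are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-process stand-in for a Memcached/Redis client."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped and treated as a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def cached_read(cache: TTLCache, key: str,
                load_from_db: Callable[[str], str],
                ttl_seconds: float = 60.0) -> str:
    """Cache-aside read: serve from cache, else load and populate."""
    value = cache.get(key)
    if value is not None:
        return value                    # cache hit: no database read
    value = load_from_db(key)           # cache miss: go to the database
    cache.set(key, value, ttl_seconds)  # populate for later readers
    return value


if __name__ == "__main__":
    def slow_lookup(key: str) -> str:
        # Placeholder for a real database query.
        return "row-for-" + key

    cache = TTLCache()
    print(cached_read(cache, "user:1", slow_lookup))  # miss, loads from "db"
    print(cached_read(cache, "user:1", slow_lookup))  # hit, served from cache
```

The time-to-live bounds how stale a cached value can get; a shorter value trades more database reads for fresher data. Writes are not handled in this sketch, so an application would also need to update or invalidate the cached entry when the underlying row changes.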

We believe in providing customers with building blocks that allow them to construct the apps they need and I’m excited to see what they’re going to do with the new options we’re announcing today.

To learn more about Amazon RDS and the new instance type, you can visit http://aws.amazon.com/rds

To learn more about ElastiCache Redis, please refer to Jeff Barr’s blog and visit http://aws.amazon.com/elasticache