Expanding the Cloud: The Amazon Relational Database Service (RDS)


Today marks the launch of Amazon RDS - the Amazon Relational Database Service. Amazon RDS is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. Amazon RDS handles all the "muck" of relational database management, freeing up its users to focus on their applications and business.

Fine Tuning Data Management

At Amazon we have a long history of fine tuning our data management solutions to make sure that our systems can be reliable and cost-effective as we continue to scale. Almost from the beginning of operating the Amazon ecommerce platform it was clear that its scalability, reliability, performance, and cost-effectiveness were all dependent on the way that data was managed. In the first years of Amazon.com the site was architected like a traditional two-tier web system: a collection of application servers connected to a backend of databases. Many of the old-timer Amazonians recall how hard it was to scale the site and keep it reliable, as all of that work was rooted in scaling the centralized database servers. Looking back they jokingly talk about "duct tape and WD-40 engineering." With the move years ago from the two-tier system to a fine-grained, decentralized, service-oriented architecture, this changed dramatically.

In the Amazon services architecture, each service is responsible for its own data management, which means that each service team can pick exactly those solutions that are ideally suited for the particular application they are implementing. It allows them to tailor the data management system such that they get maximum reliability and guaranteed performance at the right cost as the system scales up. Early on, a distinction was made between key-value storage systems and structured data management. Key-Value storage systems play a very important role in the Amazon architecture and this has ultimately led to the creation of the Amazon Simple Storage Service (Amazon S3). Amazon S3 addresses the need for a highly scalable and reliable Key-Value data storage system while shielding customers from all the complexities such as geo-replication, capacity planning, and performance management at high scale.

Structured data management systems are traditionally served by relational databases, but these sophisticated systems have their limitations, especially when it comes to scale and reliability. Often they also require tremendous expertise to operate efficiently and reliably, especially when scaling up. Of course, a significant portion of the structured data world does not require RDBMS features such as complex transactions and relations, and can be served by a simpler, much more agile system. Such a simple structured storage system, for example, does not require the use of a rigid schema and can allow attributes and indexes to be adapted on the fly. The need for such a system led to the creation of Amazon SimpleDB, whose customers get the benefits of a simple, scalable structured storage system without having to worry about replication, backups, buffer cache optimizations, database resizing, etc.
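To make the idea of a schema-free store concrete, here is a minimal in-memory sketch in Python. It is purely illustrative: it is not how Amazon SimpleDB is implemented or accessed, and the class `SimpleStructuredStore` and its methods are hypothetical names invented for this example. It only shows items that carry whatever attributes they like, with an index that grows as new attributes appear.

```python
from collections import defaultdict


class SimpleStructuredStore:
    """Toy schema-free store (hypothetical; not the SimpleDB API)."""

    def __init__(self):
        # item name -> {attribute: value}; no schema is declared up front
        self._items = {}
        # attribute -> value -> set of item names; grows as attributes appear
        self._index = defaultdict(lambda: defaultdict(set))

    def put(self, name, **attributes):
        """Add or overwrite attributes on an item (values must be hashable)."""
        item = self._items.setdefault(name, {})
        for attr, value in attributes.items():
            if attr in item:
                # Drop the stale index entry before overwriting the value.
                self._index[attr][item[attr]].discard(name)
            item[attr] = value
            self._index[attr][value].add(name)

    def query(self, attr, value):
        """Return the sorted names of items whose attribute equals value."""
        return sorted(self._index.get(attr, {}).get(value, set()))


store = SimpleStructuredStore()
store.put("item1", color="red", size="L")
store.put("item2", color="red", weight_kg=2)  # new attribute, added on the fly
print(store.query("color", "red"))
print(store.query("weight_kg", 2))
```

Every attribute is indexed as it is written, so a query on a brand-new attribute works without any schema change. A real service also has to handle replication, durability, and scale, which this sketch ignores entirely.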

There are several applications and services that do need the feature richness of an RDBMS. Until now they were served through the use of the Relational Database AMIs that are available for Amazon EC2. These AMIs can be launched to create a compute instance with database technologies such as Vertica, Oracle, DB2, SQL Server, Sybase, and PostgreSQL. These RDBMS are best used in concert with the Amazon Elastic Block Store (EBS) to create a scalable and reliable storage volume that can be used for persisting the databases.

As I mentioned earlier, running your own database system efficiently and reliably requires expertise and dedication of resources. Quite a few of our AWS customers are running relational databases, either because they require the specific relational functionality or because they are using software packages that have been designed with an RDBMS as the database solution. These customers typically spend a significant amount of time on database management. Indeed, for many of these customers database management is yet another form of "muck": the tremendous amount of work they have to do that doesn't differentiate them and prevents them from focusing more on delivering value with their product. For these customers who require a relational database but do not have a need to exert complete administrative control over their database server, there is now another option: the Amazon Relational Database Service (Amazon RDS).

Amazon Relational Database Service

Amazon RDS provides a MySQL 5.1 relational database in the cloud. It provides cost-efficient and resizable capacity, while managing time-consuming database administration tasks for customers. The service takes much of the hassle out of setting up and managing relational databases, including tasks such as backups and code patching, freeing up its users to focus on their applications and business.

Amazon RDS provides the full capabilities of a MySQL Database, which means that libraries, applications and tools that have been designed for use with MySQL can be used without modification. This makes it very simple for customers to start using Amazon RDS. As with all AWS services Amazon RDS is a scalable resource; its storage, processing power and memory usage can be adjusted on demand and the customer only pays for those resources that have been used.
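As a rough illustration of what "without modification" can mean in practice, the sketch below builds the connection settings an application might pass to a MySQL driver. Everything named here is an assumption made for the example: the helper `mysql_connection_settings`, both hostnames, and the credentials are placeholders. The point is only that moving an existing MySQL application to a managed instance should change the host it connects to and nothing else.

```python
def mysql_connection_settings(host, user, password, database, port=3306):
    """Build keyword arguments in the shape common MySQL drivers accept.

    Hypothetical helper for illustration; 3306 is the standard MySQL port.
    """
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "database": database,
    }


# Placeholder values: a self-managed MySQL server ...
self_managed = mysql_connection_settings(
    "db.internal.example.com", "app", "app-password", "orders"
)
# ... and the same application pointed at a (made-up) managed endpoint.
managed = dict(self_managed, host="mydb.rds-endpoint.example.com")

# Only the host differs; queries and driver code stay the same.
changed = {key for key in managed if managed[key] != self_managed[key]}
print(changed)

# With a MySQL driver installed (an assumption, e.g. PyMySQL), the settings
# could be used as:
#     import pymysql
#     connection = pymysql.connect(**managed)
```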

Amazon RDS is a very important addition to our offering of database solutions as it addresses a significant stumbling block for many of our customers: the management of relational databases. Amazon RDS makes this much simpler, which will free up our customers' resources to focus on contributions that really matter to their own customers.

AWS customers now have three database solutions available:

  • Amazon RDS for when the application requires a relational database but you want to reduce the time you spend on database management. Amazon RDS automates common administrative tasks to reduce your complexity and total cost of ownership. It allows you to manage your database compute and storage resources with a simple API call, and you pay only for the infrastructure resources you actually consume.
  • Amazon EC2 Relational Database AMIs for when the application requires the use of a particular relational database and/or when the customer wants to exert complete administrative control over their database. An Amazon EC2 instance can be used to run a database, and the data can be stored within an Amazon Elastic Block Store (Amazon EBS) volume. Amazon EBS is a fast and reliable persistent storage feature of Amazon EC2. Available AMIs include IBM DB2, Microsoft SQL Server, MySQL, Oracle, PostgreSQL, Sybase, and Vertica.
  • Amazon SimpleDB for applications that do not require a relational model, and that principally demand index and query capabilities. Amazon SimpleDB eliminates the administrative overhead of running a highly available production database, and is unbound by the strict requirements of an RDBMS. With Amazon SimpleDB, you store and query data items via simple web services requests, and Amazon SimpleDB does the rest. In addition to handling infrastructure provisioning, software installation and maintenance, Amazon SimpleDB automatically indexes your data, creates geo-redundant replicas of the data to ensure high availability, and performs database tuning on customers' behalf. Amazon SimpleDB also provides no-touch scaling. There is no need to anticipate and respond to changes in request load or database utilization; the service simply responds to traffic as it comes and goes, charging only for the resources consumed.
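The three options above amount to a small decision rule, which can be sketched in Python as follows. This is only one reading of the list, not an official selection tool: the function `choose_database_option` and its parameter names are hypothetical, and the `needs_non_mysql_engine` flag assumes, as stated earlier, that Amazon RDS provides MySQL.

```python
def choose_database_option(
    needs_relational,
    needs_non_mysql_engine=False,
    needs_full_admin_control=False,
):
    """Map application requirements to one of the three options listed above.

    Hypothetical helper for illustration; real decisions involve more factors.
    """
    if not needs_relational:
        # Index and query capabilities without a relational model.
        return "Amazon SimpleDB"
    if needs_non_mysql_engine or needs_full_admin_control:
        # A particular database engine, or complete administrative control.
        return "Amazon EC2 Relational Database AMI"
    # Relational database with reduced management effort.
    return "Amazon RDS"


print(choose_database_option(needs_relational=True))
print(choose_database_option(needs_relational=True, needs_full_admin_control=True))
print(choose_database_option(needs_relational=False))
```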

More details at the Amazon RDS detail page and the AWS developer blog. Other relevant readings are James Hamilton's posting and the RightScale blog.