Good Advice on Keeping Your Database Simple and Fast
Keeping your database simple and fast is often difficult if you use higher-level frameworks such as ActiveRecord in Ruby or Java object persistence technologies such as Hibernate. A lot of magic happens out of sight that you have no control over. If you then have to scale your application, it is often the relational database these technologies require that becomes the performance and scaling bottleneck, often requiring complex custom implementations of partitioning and sharding to make it work.
The AWS services Amazon S3 and Amazon SimpleDB were designed to handle the dominant storage usage patterns within Amazon, and they greatly reduced our need to rely on relational storage for scaling our systems. But applications and services that need to operate at enterprise scale almost never use a single storage technique. For example, a common pattern is that objects stored in S3 under a primary key have a collection of secondary keys (e.g. metadata) stored in SimpleDB. SimpleDB provides very fast indexing for querying the metadata, and a query returns the primary keys of the objects located in S3.
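To make the pattern concrete, here is a minimal sketch in Python. It does not call the real S3 or SimpleDB APIs: the classes ObjectStore and MetadataIndex, the functions store_object and find_objects, and the sample keys and attributes are all hypothetical, in-memory stand-ins chosen for illustration. The only point is the division of labor: the object store holds the data under a primary key, and the index answers metadata queries with primary keys.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for a blob store such as S3."""

    def __init__(self):
        self._blobs = {}  # primary key -> object bytes

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class MetadataIndex:
    """Hypothetical in-memory stand-in for an attribute index such as SimpleDB."""

    def __init__(self):
        self._items = {}  # primary key -> dict of metadata attributes

    def put_attributes(self, key, attributes):
        self._items[key] = dict(attributes)

    def query(self, **conditions):
        """Return the sorted primary keys whose metadata matches every condition."""
        return sorted(
            key
            for key, attrs in self._items.items()
            if all(attrs.get(name) == value for name, value in conditions.items())
        )


def store_object(objects, index, key, data, metadata):
    """Write the object under its primary key, then index its metadata."""
    objects.put(key, data)
    index.put_attributes(key, metadata)


def find_objects(objects, index, **conditions):
    """Query the index for primary keys, then fetch each object by key."""
    return {key: objects.get(key) for key in index.query(**conditions)}


if __name__ == "__main__":
    objects, index = ObjectStore(), MetadataIndex()
    store_object(objects, index, "msg-1", b"hello", {"sender": "alice", "folder": "inbox"})
    store_object(objects, index, "msg-2", b"invoice", {"sender": "bob", "folder": "receipts"})
    print(find_objects(objects, index, folder="inbox"))
```

In a real deployment the two writes go to separate services, so an application following this pattern also has to decide what to do when one write succeeds and the other fails. The sketch leaves that out.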
At SXSW Interactive there was a great panel/presentation by Mike Subelsky, co-founder of AWS customer OtherInbox, about their experiences with scaling Ruby-on-Rails applications in the Cloud. They demonstrated that with Amazon EC2 and Amazon S3, Ruby/Rails scales just fine. The room was packed and there was some great Q&A.
During the Q&A, OtherInbox co-founder and CEO Joshua Baer gave some great insight into the changing role of relational databases and some really good advice about how they were able to keep their database simple and fast. After the session I asked Joshua to explain it once more for the readers of this weblog.