All Things Distributed
Today, I’m thrilled to announce that we have expanded the query capabilities of DynamoDB. We call the newest capability Local Secondary Indexes (LSI). While DynamoDB already allows you to perform low-latency queries based on your table’s primary key, even at tremendous scale, LSI will now give you the ability to perform fast queries against other attributes (or columns) in your table. This gives you the ability to perform richer queries while still meeting the low-latency demands of responsive, scalable applications.
Our customers have been asking us to expand the query capabilities of DynamoDB, and we’re excited to see how they use LSI. Milo Milovanovic, Washington Post Principal Systems Architect, reports that “database performance and scalability are critical for delivering new services to our 34+ million readers on any device. For this reason, we chose DynamoDB to power our popular Social Reader app and site experience on socialreader.com. The fast and flexible query performance that local secondary indexes provide will allow us to further optimize our social intelligence, and continue to improve our readers’ experiences.”
As I discussed in a recent blog post, after years of building highly scalable and highly available e-commerce and cloud computing services, Amazon has come to realize that relational databases should only be used when an application truly needs the complex query, table join and transaction capabilities of a full-blown relational database. In all other cases, when such relational features are not needed, we default to DynamoDB as it offers a more available, more scalable and ultimately a lower cost solution.
When DynamoDB launched last year, it offered simple but powerful query capabilities. Customers could choose from two types of keys for primary index querying: Simple Hash Keys and Composite Hash Key / Range Keys:
Simple Hash Key gives DynamoDB the Distributed Hash Table abstraction. The key is hashed over the different partitions to optimize workload distribution. For more background on this please read the original Dynamo paper.
Composite Hash Key with Range Key allows the developer to create a primary key that is the composite of two attributes, a “hash attribute” and a “range attribute.” When querying against a composite key, the hash attribute must be matched exactly, but a range operation can be specified for the range attribute: e.g. all orders from Werner in the past 24 hours, or all games played by an individual player in the past 24 hours.
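To make the composite key idea concrete, here is a minimal in-memory sketch of "exact match on the hash attribute, range operation on the range attribute." It is only a toy model and not how DynamoDB is implemented. The class `CompositeKeyTable`, its method names and the sample orders are hypothetical, and timestamps are assumed to be epoch seconds.

```python
import bisect
from collections import defaultdict


class CompositeKeyTable:
    """Toy model: each hash key owns a list of items sorted by range key."""

    def __init__(self):
        # hash key -> (sorted range keys, items in the same order)
        self._partitions = defaultdict(lambda: ([], []))

    def put(self, hash_key, range_key, item):
        keys, items = self._partitions[hash_key]
        pos = bisect.bisect_left(keys, range_key)
        if pos < len(keys) and keys[pos] == range_key:
            items[pos] = item  # same primary key: overwrite the item
        else:
            keys.insert(pos, range_key)
            items.insert(pos, item)

    def query(self, hash_key, low, high):
        """Exact match on the hash key, inclusive range on the range key."""
        if hash_key not in self._partitions:
            return []
        keys, items = self._partitions[hash_key]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return items[start:end]


# "All orders from Werner in the past 24 hours" (sample data is made up).
now = 1_000_000
orders = CompositeKeyTable()
orders.put("Werner", now - 200_000, {"order_id": "A"})  # older than 24 hours
orders.put("Werner", now - 3_600, {"order_id": "B"})    # one hour ago
orders.put("John", now - 60, {"order_id": "C"})         # different hash key

recent = orders.query("Werner", now - 86_400, now)
print(recent)
```

Because the items under one hash key are kept in range key order, the range condition becomes two binary searches and a slice, with no scan over other customers' orders.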
With LSI we expand DynamoDB’s existing query capabilities with support for more complex queries. Customers can now create indexes on non-primary key attributes and quickly retrieve records within a hash partition (i.e., items that share the same hash value in their primary key).
Since we launched DynamoDB, we have seen many database customers migrate their apps from traditional sharded relational database deployments to DynamoDB. Some of these developers who were used to the broad query flexibility offered by relational databases asked us to add more query functionality to DynamoDB. These developers will now find LSI to be useful and familiar, as it enables them to index non-primary key attributes and quickly query records within a hash partition. LSI enables more applications to benefit from DynamoDB’s scalability, availability, resilience, low cost and minimal operational overhead.
What are Local Secondary Indexes (LSI)?
As an example, let’s say that your social gaming application tracks player activity. Database scalability is important for social games, which can attract tens of millions of players soon after launch. Consistent, rock solid low-latency database performance is important too, because social games are highly interactive. Let’s examine how DynamoDB would support a social game, and then add the benefit of local secondary indexes.
DynamoDB stores information as database tables, which are collections of individual items. Each item is a collection of data attributes. The items are analogous to rows in a spreadsheet, and the attributes are analogous to columns. Each item is uniquely identified by a primary key, which in this example is composed of two attributes, called the hash and range.
DynamoDB queries refer to the hash and range attributes of the items you’d like to access. Local secondary indexes let you query on the hash key together with an attribute other than the range key. LSI queries are local in the sense that they always target a single hash key, just as standard queries do.
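One way to picture what "local" means is the following small sketch, which models an index as a second sort order kept inside each hash key's group of items. It is a conceptual illustration under my own assumptions, not DynamoDB's implementation. The helper names `build_local_index` and `query_index` and the sample records are hypothetical.

```python
from collections import defaultdict


def build_local_index(items, hash_attr, index_attr):
    """Group items by hash key, then sort each group by the indexed attribute."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[index_attr])
    return dict(index)


def query_index(index, hash_value, limit, descending=False):
    """Read up to `limit` items for one hash key, in index order."""
    group = index.get(hash_value, [])
    ordered = group[::-1] if descending else group
    return ordered[:limit]


# Made-up sample data for a social game.
games = [
    {"PlayerName": "John", "GameStartTime": "2013-04-13T10:00", "Score": 1200},
    {"PlayerName": "John", "GameStartTime": "2013-04-14T09:30", "Score": 3400},
    {"PlayerName": "John", "GameStartTime": "2013-04-15T21:15", "Score": 800},
    {"PlayerName": "Mary", "GameStartTime": "2013-04-14T11:00", "Score": 5000},
]

score_index = build_local_index(games, "PlayerName", "Score")
top_two = query_index(score_index, "John", limit=2, descending=True)
print([game["Score"] for game in top_two])
```

Mary's higher score never shows up in John's result, because the index query never leaves John's hash key.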
Based on the design of your game, you might decide to record each player’s final score for each game he completes. You would track at least three pieces of data:
<img src="/images/ddb-model.png" width="619"/>
In DynamoDB, your Player Activity table might look like this:
<img src="/images/ddb-tab1.png" width="619"/>
Suppose you always want to show players a history of the last 10 games they played. This is a natural fit for DynamoDB. By setting up a DynamoDB table with PlayerName as the hash key and GameStartTime as the range key, you can quickly run queries like: “Show me the last 10 games played by John”. However, once you set up your table like this, you couldn’t run efficient queries on other attributes like “Score”. That was before LSI. Now, you can use LSI to define a secondary index on the “Score” attribute and quickly run queries like “Show me John’s all-time top 5 scores.” The query result is automatically ordered by score.
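As a rough sketch, the table and the two queries above might be described with request parameters like the ones below. The table name `PlayerActivity`, the index name `ScoreIndex`, the attribute types and the throughput values are my assumptions for illustration. The parameter names follow the current low-level DynamoDB API, which may differ from the API that was available when this post was written.

```python
# Hypothetical names and values: "PlayerActivity", "ScoreIndex", the attribute
# types and the provisioned throughput are assumptions made for illustration.
create_table_params = {
    "TableName": "PlayerActivity",
    "AttributeDefinitions": [
        {"AttributeName": "PlayerName", "AttributeType": "S"},
        {"AttributeName": "GameStartTime", "AttributeType": "S"},
        {"AttributeName": "Score", "AttributeType": "N"},
    ],
    # Primary key: PlayerName is the hash key, GameStartTime the range key.
    "KeySchema": [
        {"AttributeName": "PlayerName", "KeyType": "HASH"},
        {"AttributeName": "GameStartTime", "KeyType": "RANGE"},
    ],
    # The local secondary index keeps the same hash key but ranges over Score.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "ScoreIndex",
            "KeySchema": [
                {"AttributeName": "PlayerName", "KeyType": "HASH"},
                {"AttributeName": "Score", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def last_games_query(player, limit=10):
    """'Show me the last 10 games played by John' (uses the primary key)."""
    return {
        "TableName": "PlayerActivity",
        "KeyConditionExpression": "PlayerName = :p",
        "ExpressionAttributeValues": {":p": {"S": player}},
        "ScanIndexForward": False,  # newest GameStartTime first
        "Limit": limit,
    }


def top_scores_query(player, limit=5):
    """'Show me John's all-time top 5 scores' (uses the Score index)."""
    params = last_games_query(player, limit)
    params["IndexName"] = "ScoreIndex"  # highest Score first on this index
    return params


print(top_scores_query("John"))
```

With boto3 installed and credentials configured, these dictionaries could be passed along as `boto3.client("dynamodb").create_table(**create_table_params)` and `boto3.client("dynamodb").query(**top_scores_query("John"))`. The only difference between the two queries is the `IndexName`, which selects the sort order DynamoDB reads from.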
With LSI, your application can get the data it needs much more quickly and efficiently than ever before. No more downloading and sorting through results. By using LSI, you can now push that work to DynamoDB. Crucially, DynamoDB does so while protecting the scalability and performance that our customers demand. Tables with one or more LSIs will exhibit the same latency and throughput performance as those without any indexes.
<img src="/images/ddb-tab2.png" width="650"/>
Start with DynamoDB
The enhanced query flexibility that local secondary indexes provide means DynamoDB can support an even broader range of workloads. As I mentioned earlier, because the scalability and availability of our apps are critically important at Amazon, we already start with DynamoDB as the default choice for every application that does not require the flexibility of relational databases like Oracle or MySQL. Customers tell us they’re adopting the same practice, particularly in the areas of digital advertising, social gaming and connected device applications where high availability, seamless scalability, predictable performance and low latency are critical.
Valentino Volonghi, Chief Architect of retargeting platform AdRoll, says “we use DynamoDB to bid on more than 7 billion impressions per day on the Web and FBX. AdRoll’s bidding system accesses more than a billion cookie profiles stored in DynamoDB, and sees uniform low-latency response. In addition, the availability of DynamoDB in all AWS regions allows our lean team to meet the rigorous low latency demands of real-time bidding in countries across the world without having to worry about infrastructure management.” In the past I have also highlighted other advertising applications from customers like Madwell and Shazam where seamless scale, high availability, predictable performance and low latency are very important.
Ankur Bulsara, CTO of the Scopely social gaming platform, says LSI will enable his team to deploy DynamoDB even more broadly. “We default to DynamoDB wherever we can, and also use MySQL for some query types,” he says. “We’re very excited that local secondary indexes will allow us to further remove traditional RDMSes from our ever-growing stack. DynamoDB is the future, and with LSI, the future is very bright.” In the past, I have highlighted many other gaming customers such as Electronic Arts and Halfbrick Studios. Gaming customers value DynamoDB’s seamless scale, since successful games can scale from a few users to tens of millions of users in a matter of weeks.
Today, local secondary indexes must be defined at the time you create your DynamoDB tables. In the future, we plan to provide you with the ability to add or drop LSIs on existing tables. If you want to equip an existing DynamoDB table with local secondary indexes immediately, you can export the data from your existing table using Elastic MapReduce, and import it into a new table with LSI.
You can get started with DynamoDB and Local Secondary Indexes right away with the DynamoDB free tier – LSI is available today in all AWS regions except GovCloud.
For more information, please see the appropriate topics in the Amazon DynamoDB developer guide.