<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>All Things Distributed</title>
 <link href="http://www.allthingsdistributed.com/atom.xml" rel="self"/>
 <link href="http://www.allthingsdistributed.com/"/>
 <updated>2013-05-15T18:37:29-07:00</updated>
 <id>http://www.allthingsdistributed.com/</id>
 <author>
   <name>Werner Vogels</name>
   <email>werner@allthingsdistributed.com</email>
 </author>

 
 <entry>
   <title>DynamoDB Keeps Getting Better (and cheaper!)</title>
   <link href="http://www.allthingsdistributed.com/2013/05/dynamodb-keeps-getting-better-and-cheaper.html"/>
   <updated>2013-05-15T18:30:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/05/dynamodb-keeps-getting-better-and-cheaper</id>
   <content type="html">&lt;p&gt;We love getting feedback so we can deliver the improvements and new features that really matter to our customers. You can see from the pace at which we roll out new functionality that teams across AWS take this very seriously. One of the teams that’s iterating quickly is DynamoDB. They recently launched &lt;a href=&quot;http://www.allthingsdistributed.com/2013/04/dynamdb-local-secondary-indices.html&quot;&gt;Local Secondary Indexes&lt;/a&gt; and today they are releasing several new features that will help customers build faster, cheaper, and more flexible applications:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parallel Scans&lt;/strong&gt; – To increase the throughput of table scans, the team has introduced new functionality that allows you to scan a table with multiple threads concurrently. Until now scans could only be performed sequentially, but with this new feature a scan can be split into multiple segments, each retrieved on its own thread.&lt;/p&gt;
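
&lt;p&gt;To make the segment idea concrete, here is a minimal local sketch. It assumes a hypothetical in-memory list standing in for a table, and the helper names &lt;code&gt;scan_segment&lt;/code&gt; and &lt;code&gt;parallel_scan&lt;/code&gt; are illustrative only; it does not call DynamoDB. In the actual Scan request each worker passes its own Segment number together with the shared TotalSegments value.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a table; this is not the DynamoDB API.
TABLE = [{"id": i, "score": i * 10} for i in range(20)]


def scan_segment(segment, total_segments):
    """Return the items that fall into one logical segment of the table."""
    return [item for item in TABLE if item["id"] % total_segments == segment]


def parallel_scan(total_segments):
    """Scan every segment on its own worker thread and merge the results."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, segment, total_segments)
            for segment in range(total_segments)
        ]
        items = []
        for future in futures:
            items.extend(future.result())
    return items


if __name__ == "__main__":
    # Every item is returned exactly once, regardless of the segment count.
    print(len(parallel_scan(4)))
```

&lt;p&gt;The segments are disjoint and together cover the whole table, so the merged result matches what a sequential scan would return, although not necessarily in the same order.&lt;/p&gt;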

&lt;p&gt;&lt;strong&gt;Provisioned Throughput&lt;/strong&gt; – To allow customers to respond to changes in load more rapidly, DynamoDB will allow the provisioned throughput to be decreased 4 times per day (previously twice per day).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Read Capacity Metering&lt;/strong&gt; – The read capacity unit will be increased from 1KB to 4KB.  As a result many read scenarios will have their cost reduced to ¼ of the original cost. This also makes the DynamoDB/Redshift integration even more cost-effective, as exporting data from DynamoDB into Redshift could be up to four times cheaper.&lt;/p&gt;
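
&lt;p&gt;The arithmetic behind that saving can be sketched as follows. The function name &lt;code&gt;read_capacity_units&lt;/code&gt; is hypothetical, and the sketch assumes that each read is charged by rounding the item size up to a whole number of units.&lt;/p&gt;

```python
import math


def read_capacity_units(item_size_kb, unit_size_kb):
    """Units consumed by one read, rounding the item size up to whole units."""
    return math.ceil(item_size_kb / unit_size_kb)


if __name__ == "__main__":
    for size_kb in (0.5, 4, 6):
        before = read_capacity_units(size_kb, 1)  # 1KB unit (old metering)
        after = read_capacity_units(size_kb, 4)   # 4KB unit (new metering)
        print(size_kb, "KB item:", before, "units before,", after, "units after")
```

&lt;p&gt;Under this assumption a 4KB item drops from 4 units to 1, while an item of 1KB or less costs 1 unit either way, which is why the saving applies to many read scenarios rather than all of them.&lt;/p&gt;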

&lt;p&gt;We’re excited to give DynamoDB customers better read performance, lower costs, and more provisioning flexibility. Please keep the feedback coming – we’re listening.&lt;/p&gt;

&lt;p&gt;For more details see the &lt;a href=&quot;http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#QueryAndScanParallelScan&quot;&gt;DynamoDB Developer Guide&lt;/a&gt; and for an overview visit the &lt;a href=&quot;http://aws.typepad.com/aws/2013/05/amazon-dynamodb-parallel-scans-and-other-good-news.html&quot;&gt;AWS developer blog&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Using continuations to implement thread management and communication in operating systems</title>
   <link href="http://www.allthingsdistributed.com/2013/05/continuations-in-operating-system.html"/>
   <updated>2013-05-10T09:30:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/05/continuations-in-operating-system</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/nycawssummit.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;I have returned from a great series of &lt;a href=&quot;http://aws.amazon.com/aws-summit-2013/&quot;&gt;AWS Summits&lt;/a&gt; in NYC and in Europe so it is time to get back to some weekend reading.&lt;/p&gt;

&lt;p&gt;During the nineties much operating systems research focussed on &lt;a href=&quot;http://en.wikipedia.org/wiki/Microkernel&quot;&gt;microkernels&lt;/a&gt;, which resulted in a large collection of prototype systems: Mach 3.0, L3/L4, Plan 9, Exokernel, Minix and others. Not many of those made it into production; the version of Mach that rolled into Mac OS X through the XNU integration was an earlier, monolithic version. I believe commercially QNX has been the most successful microkernel.&lt;/p&gt;

&lt;p&gt;There was a wealth of interesting, fundamental research triggered by the concepts of microkernels: new communication paradigms, memory management structures, schedulers, etc. It resulted in many publications that go back to the roots of OS research. For this weekend’s reading I picked a more esoteric paper. As part of the Mach 3.0 research &lt;a href=&quot;http://research.microsoft.com/en-us/people/richdr/&quot;&gt;Rich Draves&lt;/a&gt; implemented continuations inside the operating system as a fundamental structuring component for communication, thread management and exception handling. There were performance improvements but more importantly, and my reason for reading the paper again, it had an impact on how the OS was structured, leading to a reduction in complexity.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://wv.ly/YKDUUy&quot;&gt;&lt;em&gt;Using continuations to implement thread management and communication in operating systems&lt;/em&gt;&lt;/a&gt;, Richard Draves, Brian N. Bershad, Richard F. Rashid, Randall W. Dean, Proceedings of the thirteenth ACM symposium on Operating systems principles, Pages 122-136&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Expanding the Cloud: Faster, More Flexible Queries with DynamoDB</title>
   <link href="http://www.allthingsdistributed.com/2013/04/dynamdb-local-secondary-indices.html"/>
   <updated>2013-04-17T10:30:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/04/dynamdb-local-secondary-indices</id>
   <content type="html">&lt;p&gt;Today, I’m thrilled to announce that we have expanded the query capabilities of &lt;a href=&quot;http://aws.amazon.com/dynamodb&quot;&gt;DynamoDB&lt;/a&gt;.  We call the newest capability &lt;strong&gt;Local Secondary Indexes (LSI)&lt;/strong&gt;.  While DynamoDB already allows you to perform low-latency queries based on your table’s primary key, even at tremendous scale, LSI will now give you the ability to perform fast queries against other attributes (or columns) in your table.  This gives you the ability to perform richer queries while still meeting the low-latency demands of responsive, scalable applications.&lt;/p&gt;

&lt;p&gt;Our customers have been asking us to expand the query capabilities of DynamoDB and we’re excited to see how they use LSI.  Milo Milovanovic, Washington Post Principal Systems Architect, reports that “database performance and scalability are critical for delivering new services to our 34+ million readers on any device.  For this reason, we chose DynamoDB to power our popular &lt;a href=&quot;https://itunes.apple.com/us/app/washington-post-social-reader/id496610078&quot;&gt;Social Reader app&lt;/a&gt; and site experience on &lt;a href=&quot;http://www.socialreader.com&quot;&gt;socialreader.com&lt;/a&gt;.  The fast and flexible query performance that local secondary indexes provide will allow us to further optimize our social intelligence, and continue to improve our readers’ experiences.”&lt;/p&gt;

&lt;p&gt;As I discussed in a &lt;a href=&quot;http://www.allthingsdistributed.com/2013/03/dynamodb-one-year-later.html&quot;&gt;recent blog post&lt;/a&gt;, after years of building highly scalable and highly available e-commerce and cloud computing services, Amazon has come to realize that relational databases should only be used when an application truly needs the complex query, table join and transaction capabilities of a full-blown relational database.  In all other cases, when such relational features are not needed, we default to DynamoDB as it offers a more available, more scalable and ultimately a lower cost solution.&lt;/p&gt;

&lt;p&gt;When DynamoDB launched last year, it offered simple but powerful query capabilities.  Customers could choose from two types of keys for primary index querying: Simple Hash Keys and Composite Hash Key / Range Keys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Simple Hash Key gives DynamoDB the Distributed Hash Table abstraction. The key is hashed over the different partitions to optimize workload distribution. For more background on this please read the &lt;a href=&quot;http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf&quot;&gt;original Dynamo paper&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Composite Hash Key with Range Key allows the developer to create a primary key that is the composite of two attributes, a “hash attribute” and a “range attribute.” When querying against a composite key, the hash attribute needs to be uniquely matched but a range operation can be specified for the range attribute: e.g. all orders from Werner in the past 24 hours, or all games played by an individual player in the past 24 hours.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;With LSI we expand DynamoDB’s existing query capabilities with support for more complex queries.  Customers can now create indexes on non-primary key attributes and quickly retrieve records within a hash partition (i.e., items that share the same hash value in their primary key).&lt;/p&gt;

&lt;p&gt;Since we launched DynamoDB, we have seen many database customers migrate their apps from traditional sharded relational database deployments to DynamoDB.  Some of these developers who were used to the broad query flexibility offered by relational databases asked us to add more query functionality to DynamoDB.  These developers will now find LSI to be useful and familiar, as it enables them to index non-primary key attributes and quickly query records within a hash partition. LSI enables more applications to benefit from DynamoDB’s scalability, availability, resilience, low cost and minimal operational overhead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are Local Secondary Indexes (LSI)?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As an example, let’s say that your social gaming application tracks player activity.  Database scalability is important for social games, which can attract tens of millions of players soon after launch.  Consistent, rock solid low-latency database performance is important too, because social games are highly interactive.  Let’s examine how DynamoDB would support a social game, and then add the benefit of local secondary indexes.&lt;/p&gt;

&lt;p&gt;DynamoDB stores information as database tables, which are collections of individual items.  Each item is a collection of data attributes.  The items are analogous to rows in a spreadsheet, and the attributes are analogous to columns.  Each item is uniquely identified by a primary key, which is composed of its first two attributes, called the hash and range.&lt;/p&gt;

&lt;p&gt;DynamoDB queries refer to the hash and range attributes of items you’d like to access.  Local secondary indexes let you query for hash keys together with other attributes besides the range key.  LSI queries are local in the sense they always refer to the same hash key as standard queries.&lt;/p&gt;

&lt;p&gt;Based on the design of your game, you might decide to record each player’s final score for each game he completes.  You would track at least three pieces of data:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/ddb-model.png&quot; width=&quot;619&quot;&gt;&lt;/p&gt;

&lt;p&gt;In DynamoDB, your Player Activity table might look like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/ddb-tab1.png&quot; width=&quot;619&quot;&gt;&lt;/p&gt;

&lt;p&gt;Suppose you always want to show players a history of the last 10 games they played. This is a natural fit for DynamoDB.  By setting up a DynamoDB table with PlayerName as the hash key and GameStartTime as the range key, you can quickly run queries like: “Show me the last 10 games played by John”. However, once you set up your table like this, you couldn’t run efficient queries on other attributes like “Score”.  That was before LSI.  Now, you can use LSI to define a secondary index on the “Score” attribute and quickly run queries like “Show me John’s all-time top 5 scores.” The query result is automatically ordered by score.&lt;/p&gt;
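
&lt;p&gt;The difference between the two access paths can be illustrated with a toy in-memory model. This is only a sketch: the class &lt;code&gt;PlayerActivityTable&lt;/code&gt; and its methods are hypothetical and are not the DynamoDB API, and the sketch sorts at query time for brevity, whereas a real index keeps its ordering maintained as items are written.&lt;/p&gt;

```python
from collections import defaultdict


class PlayerActivityTable:
    """Toy table keyed by PlayerName (hash) and GameStartTime (range)."""

    def __init__(self):
        # One list of items per hash key value, mimicking a hash partition.
        self._partitions = defaultdict(list)

    def put(self, player_name, game_start_time, score):
        self._partitions[player_name].append(
            {"PlayerName": player_name, "GameStartTime": game_start_time, "Score": score}
        )

    def last_games(self, player_name, limit):
        """Primary-key query: newest games first, ordered by the range key."""
        items = self._partitions.get(player_name, [])
        return sorted(items, key=lambda item: item["GameStartTime"], reverse=True)[:limit]

    def top_scores(self, player_name, limit):
        """Index-style query: same hash key, but ordered by Score instead."""
        items = self._partitions.get(player_name, [])
        return sorted(items, key=lambda item: item["Score"], reverse=True)[:limit]


if __name__ == "__main__":
    table = PlayerActivityTable()
    table.put("John", "2013-04-01T10:00", 120)
    table.put("John", "2013-04-02T10:00", 340)
    table.put("John", "2013-04-03T10:00", 90)
    table.put("Mary", "2013-04-02T11:00", 500)
    print([item["Score"] for item in table.top_scores("John", 2)])
```

&lt;p&gt;Both queries stay inside a single player’s partition, which is what makes the index “local”: only the attribute used for ordering changes.&lt;/p&gt;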

&lt;p&gt;With LSI, your application can get the data it needs much more quickly and efficiently than ever before.  No more downloading and sorting through results.  By using LSI, you can now push that work to DynamoDB.  Crucially, it does so while protecting the scalability and performance that our customers demand.  Tables with one or more LSIs will exhibit the same latency and throughput performance as those without any indexes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/ddb-tab2.png&quot; width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with DynamoDB&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The enhanced query flexibility that local secondary indexes provide means DynamoDB can support an even broader range of workloads.  As I mentioned earlier, since scalability and availability of our apps are of critical importance at Amazon, we already start with DynamoDB as the default choice for every application that does not require the flexibility of relational databases like Oracle or MySQL.  Customers tell us they’re adopting the same practice, particularly in the areas of digital advertising, social gaming and connected device applications where high availability, seamless scalability, predictable performance and low latency are critical.&lt;/p&gt;

&lt;p&gt;Valentino Volonghi, Chief Architect of retargeting platform AdRoll, says “we use DynamoDB to bid on more than 7 billion impressions per day on the Web and FBX. AdRoll’s bidding system accesses more than a billion cookie profiles stored in DynamoDB, and sees uniform low-latency response. In addition, the availability of DynamoDB in all AWS regions allows our lean team to meet the rigorous low latency demands of real-time bidding in countries across the world without having to worry about infrastructure management.” In the past I have also highlighted other advertising applications from customers like Madwell and Shazam where seamless scale, high availability, predictable performance and low latency are very important.&lt;/p&gt;

&lt;p&gt;Ankur Bulsara, CTO of the Scopely social gaming platform, says LSI will enable his team to deploy DynamoDB even more broadly.  “We default to DynamoDB wherever we can, and also use MySQL for some query types,” he says.  “We’re very excited that local secondary indexes will allow us to further remove traditional RDBMSes from our ever-growing stack.  DynamoDB is the future, and with LSI, the future is very bright.” In the past, I have highlighted many other gaming customers such as &lt;a href=&quot;http://www.allthingsdistributed.com/2012/06/amazon-dynamodb-growth.html&quot;&gt;Electronic Arts and Halfbrick Studios&lt;/a&gt;.  Gaming customers value DynamoDB’s seamless scale, since successful games can scale from a few users to tens of millions of users in a matter of weeks.&lt;/p&gt;

&lt;p&gt;Today, local secondary indexes must be defined at the time you create your DynamoDB tables.  In the future, we plan to provide you with an ability to add or drop LSI for existing tables.  If you want to equip an existing DynamoDB table with local secondary indexes immediately, you can &lt;a href=&quot;http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMRforDynamoDB.html&quot;&gt;export the data from your existing table using Elastic MapReduce&lt;/a&gt;, and import it to a new table with LSI.&lt;/p&gt;

&lt;p&gt;You can get started with DynamoDB and Local Secondary Indexes right away with the DynamoDB free tier – LSI is available today in all AWS regions except GovCloud.&lt;/p&gt;

&lt;p&gt;For more information, please see the appropriate topics in the &lt;a href=&quot;http://aws.amazon.com/documentation/dynamodb/&quot;&gt;Amazon DynamoDB developer guide&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Join Processing in Relational Databases</title>
   <link href="http://www.allthingsdistributed.com/2013/04/join-processing-relational-databases.html"/>
   <updated>2013-04-12T04:00:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/04/join-processing-relational-databases</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/bayern.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;Joins are one of the fundamental relational database query operations. It is very hard to implement the join operation efficiently as there are many unknowns in the execution of the operation. In the early days much relational database research went into understanding the complexity of performing joins, what exactly impacted their performance and which approach performed better under which conditions. In 1992 Priti Mishra and Margaret Eich conducted a survey of what had been achieved until then in join processing and described in detail the algorithms, the implementation complexity and the performance, which makes it a good back-to-basics paper to read this weekend.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://wv.ly/Zmxq95&quot;&gt;&lt;em&gt;Join Processing in Relational Databases&lt;/em&gt;&lt;/a&gt;, Priti Mishra and Margaret H. Eich, ACM Computing Surveys (CSUR), Volume 24 Issue 1, March 1992, Pages 63-113&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Practical Applications of Triggers and Constraints: Successes and Lingering Issues</title>
   <link href="http://www.allthingsdistributed.com/2013/04/practical-applications-triggers-constraints.html"/>
   <updated>2013-04-05T08:30:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/04/practical-applications-triggers-constraints</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/manhattan.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;At the end of the 80's Ceri and Widom were researching the fundamentals of integrity constraints in databases. In 2000 they were invited by the VLDB conference to review 10 years of work around Constraints and Triggers with an eye on the practical application of both abstractions. The resulting paper gives a good overview of the fundamentals of both concepts.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://wv.ly/10zvNHC&quot;&gt;&lt;em&gt;Practical Applications of Triggers and Constraints: Success Stories and Lingering Issues&lt;/em&gt;&lt;/a&gt;, Stefano Ceri, Roberta Cochrane, and Jennifer Widom, In 26th Very Large Data Bases Conference Proceedings, Cairo, September 2000, Pages 254-262&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Principles of Transaction-Oriented Database Recovery</title>
   <link href="http://www.allthingsdistributed.com/2013/03/transaction-oriented-database-recovery.html"/>
   <updated>2013-03-29T15:30:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/03/transaction-oriented-database-recovery</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/bbridge3.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;I have been reading mainly newer papers in the beginning of this year, but it is time to get back to the basics and start reading some more historical papers again, from the time when researchers and engineers were laying the foundations for our current systems. A good early paper to start with is the survey that Härder and Reuter did on database recovery in 1983.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.minet.uni-jena.de/dbis/lehre/ws2005/dbs1/HaerderReuter83.pdf&quot;&gt;&lt;em&gt;Principles of Transaction-Oriented Database Recovery&lt;/em&gt;&lt;/a&gt;, Theo Härder and Andreas Reuter, ACM Computing Surveys, Volume 15 Issue 4, December 1983, Pages 287-317&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Netflix OSS Cloud Prize</title>
   <link href="http://www.allthingsdistributed.com/2013/03/netflix-oss-cloud-prize.html"/>
   <updated>2013-03-20T17:00:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/03/netflix-oss-cloud-prize</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/cloudprize.jpg&quot; width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;Netflix has over the years become one of the absolute best engineering powerhouses for building cloud-native applications. At AWS we are very proud to be their infrastructure partner and every day we learn from how they use our cloud services.  Many of the observations I talk about in my “21st Century Application Architectures” presentation come from seeing Netflix architects at work.&lt;/p&gt;

&lt;p&gt;Netflix has gone beyond just building great applications; they have made fundamental pieces of their cloud platform available as open source and many in the industry have responded to that with great enthusiasm, evidenced by the packed Netflix House in February where people came to hear more about NetflixOSS.&lt;/p&gt;

&lt;p&gt;But Netflix has even gone a step beyond this by sponsoring a contest for the best open source contributions to the NetflixOSS platform. The &lt;a href=&quot;https://github.com/netflix/cloud-prize&quot;&gt;Netflix OSS Cloud Prize&lt;/a&gt; carries a cash reward of $100K distributed over 10 categories. Each of the category winners will also receive $5K worth of AWS credits. The contest will run through September 15, 2013, after which a Judging Panel, which I am excited to be part of, will pick the winners. The winners will be announced on October 16 and the trophies will be presented at the AWS re:Invent conference in November in Las Vegas.&lt;/p&gt;

&lt;p&gt;The ten categories are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Best example application mash-up&lt;/li&gt;
&lt;li&gt;Best new monkey&lt;/li&gt;
&lt;li&gt;Best contribution to code quality&lt;/li&gt;
&lt;li&gt;Best new feature&lt;/li&gt;
&lt;li&gt;Best contribution to operational tools, availability and manageability&lt;/li&gt;
&lt;li&gt;Best portability enhancement&lt;/li&gt;
&lt;li&gt;Best contribution to performance improvements&lt;/li&gt;
&lt;li&gt;Best datastore integration&lt;/li&gt;
&lt;li&gt;Best usability enhancement&lt;/li&gt;
&lt;li&gt;Judges’ choice award&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;For more details check out the &lt;a href=&quot;https://github.com/Netflix/Cloud-Prize/wiki&quot;&gt;documentation wiki&lt;/a&gt; of the &lt;a href=&quot;https://github.com/netflix/cloud-prize&quot;&gt;Cloud Prize node on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Then go fork the repo and start building! I am very much looking forward to the results.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Elastic Beanstalk a la Node</title>
   <link href="http://www.allthingsdistributed.com/2013/03/beanstalk-a-la-node.html"/>
   <updated>2013-03-11T16:00:00-07:00</updated>
   <id>http://www.allthingsdistributed.com/2013/03/beanstalk-a-la-node</id>
   <content type="html">&lt;p&gt;I spent a lot of time talking to AWS developers, many working in the gaming and mobile space, and most of them have been finding Node.js well suited for their web applications. With its asynchronous, event-driven programming model, Node.js allows these developers to handle a large number of concurrent connections with low latencies. These developers typically use EC2 instances combined with one of our database services to create web services used for data retrievals or to create dynamic mobile interfaces.&lt;/p&gt;

&lt;p&gt;Today, &lt;a href=&quot;http://aws.amazon.com/elasticbeanstalk/&quot;&gt;AWS Elastic Beanstalk&lt;/a&gt; just added support for Node.js to help developers easily deploy and manage these web applications on AWS. Elastic Beanstalk automates the provisioning, monitoring, and configuration of many underlying AWS resources such as Elastic Load Balancing, Auto Scaling, and EC2. Elastic Beanstalk also provides automation to deploy your application, rotate your logs, and customize your EC2 instances. To get started, visit the &lt;a href=&quot;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_nodejs.html&quot;&gt;AWS Elastic Beanstalk Developer Guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two years, lots of progress, and more to come…&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Almost two years ago, we launched Elastic Beanstalk to help developers deploy and manage web applications on AWS. The team has made significant progress and continues to iterate at a phenomenal rate.&lt;/p&gt;

&lt;p&gt;Elastic Beanstalk now supports Java, PHP, Python, Ruby, Node.js, and .NET. You can deploy and manage your applications in any AWS region (except for GovCloud). Many tools are available for you to deploy and manage your application; just choose your favorite flavor. If you’re building Java applications, you can use the AWS Toolkit for Eclipse. If you’re building .NET applications, you can use the AWS Toolkit for Visual Studio. If you prefer to work in a terminal, you can use a command line tool called ‘eb’ along with Git. Partners like eXoCloud IDE also offer integration with Elastic Beanstalk.&lt;/p&gt;

&lt;p&gt;Elastic Beanstalk seamlessly connects your application to an RDS database, secures your application inside a VPC, and allows you to integrate with any AWS resource using a new mechanism called configuration files. Simply put, Elastic Beanstalk is highly customizable to meet the needs of your applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who is using Elastic Beanstalk?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Companies of all sizes are using Elastic Beanstalk. &lt;a href=&quot;http://www.youtube.com/watch?v=4kQzzhwmY1M&quot;&gt;Intuit&lt;/a&gt; for example uses Elastic Beanstalk for a mobile application backend called txtweb. &lt;a href=&quot;http://aws.amazon.com/solutions/case-studies/peel/&quot;&gt;Peel&lt;/a&gt; uses Elastic Beanstalk to host a real-time web service that interacts with DynamoDB.&lt;/p&gt;

&lt;p&gt;The one commonality for all these customers is the time savings and the productivity increase that they get when using Elastic Beanstalk. Elastic Beanstalk helps them focus on their applications, on scaling their applications, and on meeting tight deadlines. Productivity gains don’t always mean that you merely deliver things faster. Sometimes with increased productivity, you can also do more with less.&lt;/p&gt;

&lt;p&gt;The Elastic Beanstalk team helped me get some data around how small teams can build large scale applications. One company with fewer than 10 employees runs a mobile backend on Elastic Beanstalk that handles an average of 17,000 requests per second and peak traffic of more than 20,000 requests per second. The company drove the project from development to delivery in less than 2 weeks. It’s very impressive to see the innovation and scale that Elastic Beanstalk can provide even for small teams.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>DynamoDB One Year Later: Bigger, Better, and 85% Cheaper…</title>
   <link href="http://www.allthingsdistributed.com/2013/03/dynamodb-one-year-later.html"/>
   <updated>2013-03-07T20:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2013/03/dynamodb-one-year-later</id>
   <content type="html">&lt;p&gt;Time passes very quickly around here and I hadn’t realized until recently that over a year has gone by since we launched &lt;a href=&quot;http://aws.amazon.com/dynamodb&quot;&gt;DynamoDB&lt;/a&gt;. As I sat down with the DynamoDB team to review our progress over the last year, I realized that DynamoDB had surpassed even my own expectations for how easily applications could achieve massive scale and high availability with DynamoDB. Many of our customers have, with the click of a button, created DynamoDB deployments in a matter of minutes that are able to serve trillions of database requests per year. I’ve written about it before, but I continue to be impressed by &lt;a href=&quot;http://www.shazam.com&quot;&gt;Shazam’s&lt;/a&gt; use of DynamoDB, which is an extreme example of how DynamoDB’s fast and easy scalability can be quickly applied to building high scale applications. Shazam’s mobile app was &lt;a href=&quot;http://www.shazam.com/music/web/pressrelease.html?nid=NEWS20120206135526&quot;&gt;integrated with Super Bowl ads&lt;/a&gt;, which allowed advertisers to run highly interactive advertising campaigns during the event. Shazam needed to handle an enormous increase in traffic for the duration of the Super Bowl and used DynamoDB as part of their architecture. After working with DynamoDB for only three days, they had already managed to go from the design phase to a fully production-ready deployment that could handle the biggest advertising event of the year.&lt;/p&gt;

&lt;p&gt;In the year since DynamoDB launched, we have seen widespread adoption by customers building everything from e-commerce platforms, real-time advertising exchanges, and mobile applications to Super Bowl advertising campaigns, Facebook applications, and online games. This rapid adoption has allowed us to benefit from the scale economies inherent in our architecture. We have also reduced our underlying costs through significant technical innovations from our engineering team. I’m thrilled that we are able to pass along these cost savings to our customers in the form of significantly lower prices - as much as 85% lower than before.&lt;/p&gt;

&lt;p&gt;The details of our price drop are as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Throughput costs&lt;/strong&gt;: We are dropping our provisioned throughput costs for both read requests and write requests by 35%. We are also introducing a Reserved Capacity model that offers customers discounted pricing if they reserve read and write capacity for one or three years.  For customers reserving capacity for three years, the price of throughput will drop from today’s prices by 85%. For customers reserving capacity for one year, the price of throughput will drop from today’s prices by 70%. For more details on reserved capacity, please read the DynamoDB FAQs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Indexed Storage costs&lt;/strong&gt;: We are lowering the price of indexed storage by 75%.  For example, in our US East (N. Virginia) Region, the price of data storage will drop from $1 per GB per month to $0.25. All data items continue to be stored on Solid State Drives (SSDs) and are automatically replicated across multiple distinct Availability Zones to provide very high durability and availability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How are we able to do this? &lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;DynamoDB runs on a fleet of SSD-backed storage servers that are specifically designed to support DynamoDB. This allows us to tune both our hardware and our software to ensure that the end-to-end service is both cost-efficient and highly performant. We’ve been working hard over the past year to improve storage density and bring down the costs of our underlying hardware platform. We have also made significant improvements to our software by optimizing our storage engine, replication system and various other internal components. The DynamoDB team has a mandate to keep finding ways to reduce the cost and I am glad to see them delivering in a big way. DynamoDB has also benefited from its rapid growth, which allows us to take advantage of economies of scale. As with our other services, as we’ve made advancements that allow us to reduce our costs, we are happy to pass the savings along to you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When is it appropriate to use DynamoDB?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I am often asked: When is it appropriate to use &lt;a href=&quot;http://aws.amazon.com/dynamodb&quot;&gt;DynamoDB&lt;/a&gt; instead of a relational database?&lt;/p&gt;

&lt;p&gt;We used relational databases when designing the Amazon.com ecommerce platform many years ago. As Amazon’s business grew from being a startup in the mid-1990s to a global multi-billion-dollar business, we came to realize the scaling limitations of relational databases. A number of high profile outages at the height of the 2004 holiday shopping season can be traced back to scaling relational database technologies beyond their capabilities. In response, we began to develop a collection of storage and database technologies to address the demanding scalability and reliability requirements of the Amazon.com ecommerce platform.  This was the genesis of NoSQL databases like &lt;a href=&quot;http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html&quot;&gt;Dynamo&lt;/a&gt; at Amazon. From our own experience designing and operating a highly available, highly scalable ecommerce platform, we have come to realize that relational databases should only be used when an application really needs the complex query, table join and transaction capabilities of a full-blown relational database. In all other cases, when such relational features are not needed, a NoSQL database service like DynamoDB offers a simpler, more available, more scalable and ultimately a lower cost solution.&lt;/p&gt;

&lt;p&gt;We now believe that when it comes to selecting a database, no single database technology – not even one as widely used and popular as a relational database like Oracle, Microsoft SQL Server or MySQL – will meet all database needs. A combination of NoSQL and relational databases may better serve the needs of a complex application. Today, DynamoDB is very widely used within Amazon, everywhere we don’t need the power and flexibility of relational databases like Oracle or MySQL. As a result, we have seen enormous cost savings, on the order of 50% to 90%, while achieving higher availability and scalability as our internal teams have moved many of their workloads onto DynamoDB.&lt;/p&gt;

&lt;p&gt;So, what should you do when you’re building a new application and looking for the right database option? My recommendation is as follows: Start by looking at &lt;a href=&quot;http://aws.amazon.com/dynamodb&quot;&gt;DynamoDB&lt;/a&gt; and see if that meets your needs. If it does, you will benefit from its scalability, availability, resilience, low cost, and minimal operational overhead. If a subset of your database workload requires features specific to relational databases, then I recommend moving that portion of your workload into a relational database engine like those supported by Amazon RDS. In the end, you’ll probably end up using a mix of database options, but you will be using the right tool for the right job in your application.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Expanding the Cloud - Introducing AWS OpsWorks, a Powerful Application Management Solution</title>
   <link href="http://www.allthingsdistributed.com/2013/02/aws-opsworks.html"/>
   <updated>2013-02-18T23:30:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2013/02/aws-opsworks</id>
   <content type="html">&lt;p&gt;Today Amazon Web Services launched &lt;a href=&quot;http://aws.amazon.com/opsworks&quot;&gt;AWS OpsWorks&lt;/a&gt;, a flexible application management solution with automation tools that enable you to model and control your applications and their supporting infrastructure. OpsWorks allows you to manage the complete application lifecycle, including resource provisioning, configuration management, application deployment, software updates, monitoring, and access control.&lt;/p&gt;

&lt;p&gt;As with all the &lt;a href=&quot;http://aws.amazon.com/application-management&quot;&gt;AWS Application Management services&lt;/a&gt;, AWS OpsWorks is provided at no additional charge. AWS customers pay only for the resources they use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simplified Application Management&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OpsWorks is designed for IT administrators and ops-minded developers who want an easy way to manage applications of nearly any scale and complexity without sacrificing control. With OpsWorks you can create a logical architecture, provision resources based on that architecture, deploy your applications and all supporting software and packages in your chosen configuration, and then operate and maintain the application through lifecycle stages such as auto-scaling events and software updates.&lt;/p&gt;

&lt;p&gt;Application management has traditionally been complex and time consuming because developers have had to choose among different types of application management options that limited flexibility, reduced control, or required time to develop custom tooling. Designed to simplify processes across the entire application lifecycle, OpsWorks eliminates these challenges by providing an end-to-end flexible, automated solution that provides more operational control over applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Flexible&lt;/strong&gt; – OpsWorks is designed to support a wide variety of application architectures and can work with any software that has a scripted installation. Because OpsWorks uses the Chef framework, developers can use existing recipes or leverage hundreds of community-built configurations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automated&lt;/strong&gt; – OpsWorks uses automation to simplify operations. OpsWorks provides an event-driven configuration system with rich deployment tools that allow you to efficiently manage applications over their lifetime, including support for customizable deployments, rollback, patch management, auto scaling, and auto healing. You can roll out an application update by simply updating a single configuration and clicking a button, reducing the time spent on routine tasks.
For example, OpsWorks can set up instances to host your apps based on the exact configurations that you specify (code to deploy, RAID configuration, etc.), scale your apps using load-based or time-based auto scaling, and maintain the health of your apps by detecting and replacing failed instances. When a new app server instance starts, OpsWorks will use built-in recipes to configure the app server software and deploy your apps, and can also apply your specified recipes to make changes to your database and monitoring infrastructure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operational Control&lt;/strong&gt; – OpsWorks promotes conventions and sane defaults, such as template security groups, but also supports the ability to customize any aspect of an application’s configuration.  You can then reproduce the exact configuration on new instances and apply changes to all instances, ensuring consistent configuration at any time.  With support for scripted changes using Chef recipes at defined stages in the application lifecycle, you have fine-grained control of your application and its interaction with related components. Your recipes can be stored with your source code, making it easy to track changes. From one-time deployments to auto scaled growth, your application will reflect your settings through its complete lifecycle.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;img src=&quot;/images/firstrun.png&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS OpsWorks Unique Features&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Customers have long been asking for an Application Management solution that allows them to manage the whole application lifecycle. OpsWorks has some unique features that help customers achieve this:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Model and support any application&lt;/em&gt;&lt;br/&gt;
You can deploy your application in the configuration you choose on Amazon Linux and Ubuntu. OpsWorks lets you model your application with layers. Layers define how to configure a set of resources that are managed together. For example, you might define a web layer for your application that consists of EC2 instances, EBS volumes including RAID configuration and mount points, and Elastic IPs.  You can also define the software configuration for each layer, including installation scripts and initialization tasks. When an instance is added to a layer, OpsWorks automatically applies the specified configuration.&lt;/p&gt;

&lt;p&gt;OpsWorks provides pre-defined layers for common technologies such as Ruby, PHP, HAProxy, Memcached, and MySQL. OpsWorks promotes conventions but is flexible enough to let you customize any aspect of your environment. You can extend or modify pre-defined layers, or create your own from scratch. Because OpsWorks supports Chef recipes, you can leverage hundreds of community-built configurations such as PostgreSQL, Nginx, and Solr. For example, you can create an application that consists of multiple Python apps installed on Django connected to a CouchDB database.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Automate tasks&lt;/em&gt;&lt;br/&gt;
OpsWorks enables you to automate management actions so that they are performed reliably. You can benefit from automatic failover, package management, EBS volume RAID setup, and rule-based or time-based auto scaling.  Common tasks are handled for you automatically, and you can also extend and customize that automation.  OpsWorks supports continuous configuration through lifecycle events that automatically update your instances’ configuration to adapt to environment changes, such as auto scaling events.  With OpsWorks there is no need to log in to several machines and manually update your configuration.  Whenever your environment changes, OpsWorks updates your configuration.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Control access&lt;/em&gt; &lt;br/&gt;
OpsWorks lets you control access to your application.  You choose which IAM users should have access to the application’s resources, and assign permissions that define what they can do. These controls can prevent users from inadvertently changing production resources.  An event view shows change history to simplify root cause analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Application Management Solutions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the availability of AWS OpsWorks, Amazon Web Services now has a number of different &lt;a href=&quot;http://aws.amazon.com/application-management&quot;&gt;Application Management Services&lt;/a&gt; that address the different needs of Administrators and Developers.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/app-svcs-comparison-graphic.png&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;AWS Elastic Beanstalk&lt;/strong&gt; is an easy-to-use solution for building web apps and web services with popular application containers such as Java, PHP, Python, Ruby and .NET. You upload your code and Elastic Beanstalk automatically does the rest. Elastic Beanstalk supports the most common web architectures, application containers, and frameworks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AWS OpsWorks&lt;/strong&gt; is a powerful end-to-end solution that gives you an easy way to manage applications of nearly any scale and complexity without sacrificing control. You model, customize, and automate the entire application throughout its lifecycle. OpsWorks provides integrated experiences for IT administrators and ops-minded developers who want a high degree of productivity and control over operations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AWS CloudFormation&lt;/strong&gt; is a building block service that enables customers to provision and manage almost any AWS resource via a domain specific language. You define JSON templates and use them to provision and manage AWS resources, operating systems and application code.  CloudFormation focuses on providing foundational capabilities for the full breadth of AWS, without prescribing a particular model for development and operations.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Besides these solutions, you can of course manage your compute resources directly, for example using CloudWatch, AutoScaling and Elastic Load Balancing. There are also various free tools available, such as &lt;a href=&quot;http://techblog.netflix.com/2012/06/asgard-web-based-cloud-management-and.html&quot;&gt;Asgard, the management and deployment tool&lt;/a&gt; made available by Netflix. Many AWS partners also provide commercial solutions for managing your applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the launch of AWS OpsWorks, customers now have a very powerful solution that allows them to manage their applications easily without giving up control. For more information on AWS OpsWorks visit &lt;a href=&quot;http://aws.amazon.com/opsworks&quot;&gt;its detail page&lt;/a&gt;. For a comparison of the different Application Management services visit this &lt;a href=&quot;http://aws.amazon.com/application-management&quot;&gt;overview page&lt;/a&gt;. For a hands-on overview of AWS OpsWorks visit the &lt;a href=&quot;http://aws.typepad.com/aws/2013/02/aws-opsworks-flexible-application-management-in-the-cloud.html&quot;&gt;AWS developer blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;AWS OpsWorks is built on technology developed by Berlin company &lt;a href=&quot;http://peritor.com&quot;&gt;Peritor&lt;/a&gt;, the creators of &lt;a href=&quot;http://scalarium.com&quot;&gt;Scalarium&lt;/a&gt;, which was acquired by AWS in 2012.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Amazon Redshift and Designing for Resilience</title>
   <link href="http://www.allthingsdistributed.com/2013/02/amazon-redshift-resilience.html"/>
   <updated>2013-02-15T00:01:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2013/02/amazon-redshift-resilience</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/redshift.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;As you may remember from our announcement at re:Invent in November 2012, Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service that delivers fast query performance at less than one tenth the cost of most traditional data warehouse systems. I’ve been eagerly waiting for Amazon Redshift’s launch since we announced the service preview at re:Invent and I’m delighted that it’s now available for all customers in the US East (N. Virginia) Region, with additional AWS Regions planned for the coming months. To get started with Amazon Redshift, visit: &lt;a href=&quot;http://aws.amazon.com/redshift&quot;&gt;http://aws.amazon.com/redshift&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Redshift and Resilience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Previously, I’ve written at length about &lt;a href=&quot;http://www.allthingsdistributed.com/2012/11/amazon-redshift.html&quot;&gt;how Amazon Redshift achieves high performance&lt;/a&gt;. Today, I’m going to focus on Amazon Redshift’s durability and fault tolerance.&lt;/p&gt;

&lt;p&gt;Amazon Redshift uses locally attached storage to deliver high IO performance. To provide data durability, Amazon Redshift maintains multiple copies of your data at all times. When you load data into an Amazon Redshift cluster, it is synchronously replicated to multiple drives on other nodes in the cluster. Your data is also automatically replicated to Amazon S3, which is designed for 99.999999999% durability. Backups of your data to Amazon S3 are continuous, incremental, and automatic. This combination of in-cluster replication and continuous backup to Amazon S3 ensures you have a highly durable system. You simply load the data and Amazon Redshift takes care of the rest.&lt;/p&gt;

&lt;p&gt;Amazon Redshift implements a number of features that make the service resilient to drive and node failures within the data warehouse cluster. Although individual component failures are rare, as the number of components in a system increases, the probability that at least one of them fails also increases. For small per-drive failure probabilities, the probability of a drive failure somewhere in a large cluster is approximately the probability of an individual drive failure times the number of drives in the cluster. If you have a 50-node 8XL cluster containing a total of 1,200 hard drives, you will inevitably experience a drive failure at some point. You have to anticipate these sorts of failures and design your systems to be resilient to them.&lt;/p&gt;
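
&lt;p&gt;To make the failure arithmetic above concrete, here is a minimal Python sketch. It assumes drive failures are independent and uses a hypothetical 3% annual failure rate per drive; that rate is an illustrative assumption, not a measured Amazon Redshift figure, and the function names are invented for this example.&lt;/p&gt;

```python
def prob_any_failure(per_drive_prob, drive_count):
    """Probability that at least one of drive_count independent drives fails."""
    return 1.0 - (1.0 - per_drive_prob) ** drive_count


def expected_failures(per_drive_prob, drive_count):
    """Expected number of failed drives: the 'probability times count' figure."""
    return per_drive_prob * drive_count


# Hypothetical annual failure rate per drive (an assumption for illustration).
ANNUAL_FAILURE_RATE = 0.03
# Drive count taken from the 50-node 8XL example in the text.
DRIVES = 1200

print("expected failures per year:", expected_failures(ANNUAL_FAILURE_RATE, DRIVES))
print("probability of at least one:", prob_any_failure(ANNUAL_FAILURE_RATE, DRIVES))
```

&lt;p&gt;Under these assumed numbers the cluster should expect roughly 36 drive failures a year, so a failed drive is a routine event to design for rather than an exception.&lt;/p&gt;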

&lt;p&gt;Amazon Redshift continuously monitors your data warehouse cluster for drive and node failures. If Amazon Redshift detects a drive failure, it automatically begins using the other in-cluster copy of the data on that drive to serve queries while also creating another copy of the data on healthy drives within the cluster. If all of the copies within the cluster are unavailable, it will bring the data down from S3. This is all entirely transparent to the running system. If Amazon Redshift detects a failure that requires a node to be replaced, it automatically provisions and configures a new node and adds it to your cluster so you can resume operations.&lt;/p&gt;

&lt;p&gt;But what about a scenario in which you need to restore an entire cluster? You can use any of the saved system or user backups to restore a copy of your cluster with a few clicks. Amazon Redshift automatically provisions and configures your cluster and begins restoring data from Amazon S3 to each node in your cluster in parallel. Amazon Redshift’s streaming restore feature enables you to resume querying as soon as the new cluster is created and basic metadata is restored. The data itself will be pulled down from S3 in the background, or brought in on demand as needed by individual queries. This is important since most queries in a typical data warehouse only access a small fraction of the data. For example, you might have three years of data in the warehouse, but have most queries referencing the last day or week. These queries will become performant quickly, as the hot data set is brought down.&lt;/p&gt;

&lt;p&gt;I tell developers all the time to plan for failure and to design their systems around it. Performance is important, but it just doesn’t matter unless the system is up. I’m very happy to see Amazon Redshift incorporating sound principles of distributed systems design for achieving availability and durability at petabyte scale. I can’t wait to see how our customers will use the service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Redshift and Amazon DynamoDB&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m also pleased that we have built a powerful and easy-to-use integration between Amazon Redshift and one of our other highly available and durable services: Amazon DynamoDB. You can move all of your Amazon DynamoDB data into an Amazon Redshift table with a single command run from within Amazon Redshift:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;copy table_redshift from 'dynamodb://table_dynamodb'
credentials 'aws_access_key_id=xxx;aws_secret_access_key=xxx' readratio 50;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I’m excited that Amazon Redshift is now available to everyone. I can’t wait to see how our customers will use the service.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Expanding the Cloud: The Amazon Elastic Transcoder</title>
   <link href="http://www.allthingsdistributed.com/2013/02/amazon-elastic-transcoder.html"/>
   <updated>2013-02-11T16:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2013/02/amazon-elastic-transcoder</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/superbowl.jpg&quot;/ width=&quot;650&quot;&gt;&lt;/p&gt;

&lt;p&gt;While I was returning from an exciting time in New Orleans watching the Super Bowl, AWS launched a very cool, brand new service: &lt;a href=&quot;http://aws.amazon.com/elastictranscoder/&quot;&gt;Amazon Elastic Transcoder&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Amazon Elastic Transcoder is designed to provide easy-to-use, scalable and cost-effective video transcoding in the cloud. Jeff Barr did an excellent job running through the service on his &lt;a href=&quot;http://aws.typepad.com/aws/2013/01/amazon-elastic-transcoder.html&quot;&gt;blog&lt;/a&gt; and you can also check out the &lt;a href=&quot;http://aws.amazon.com/elastictranscoder&quot;&gt;detail page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I spent a lot of time talking to AWS customers who were also in New Orleans, many of whom work with media, and all emphasized that transcoding fits naturally with services that we already offer like storage (&lt;a href=&quot;http://aws.amazon.com/s3&quot;&gt;Amazon S3&lt;/a&gt; and &lt;a href=&quot;http://aws.amazon.com/glacier&quot;&gt;Glacier&lt;/a&gt;) and delivery (&lt;a href=&quot;http://aws.amazon.com/cloudfront&quot;&gt;Amazon CloudFront&lt;/a&gt;). The Amazon Elastic Transcoder is designed to integrate seamlessly with all the other AWS services.&lt;/p&gt;

&lt;p&gt;The service has been met with great &lt;a href=&quot;https://twitter.com/collin1000/status/297964106386911232&quot;&gt;enthusiasm&lt;/a&gt; from many of our customers. I’ll leave you with a couple of fantastic videos, created by our customers themselves, showing how to use Amazon Elastic Transcoder. The first is from &lt;a href=&quot;http://cloudshoring.wordpress.com/2013/01/30/transcoding-a-1080p-hd-video-on-amazon-elastic-cloud-video-transcoder-in-30-seconds/&quot;&gt;Sankar Nagarajan of Cloudshoring&lt;/a&gt; and the second is from &lt;a href=&quot;http://webvideouniversity.com/podcast/video/2013/02/01/amazon-elastic-transcoder-what-it-is-and-how-to-use-it/&quot;&gt;Web Video University&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Epidemics</title>
   <link href="http://www.allthingsdistributed.com/2013/01/epidemics.html"/>
   <updated>2013-01-25T18:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2013/01/epidemics</id>
   <content type="html">&lt;img src=&quot;/images/dumbo.jpg&quot;/ width=&quot;650&quot;&gt;
&lt;p&gt;
My paper to read this weekend was Alan Demers' seminal paper on epidemic techniques for database replication. I realized that in 2004, before my Amazon days, I had already written a blog post about the fundamental publications in the area of epidemics, so this seems like a good moment to revisit that with updated links, etc.
&lt;/p&gt;

&lt;h3&gt;History of Epidemics&lt;/h3&gt;

&lt;p&gt;In the past 6-8 years we have been using various epidemic techniques in building our 
reliable and scalable distributed systems 
with great success. Now that industry is starting to deal with issues of scale 
that can almost only be solved by using epidemic techniques, it becomes 
important to produce some basic pointers to the origin of the use of epidemics 
in distributed systems.&lt;/p&gt;
&lt;p&gt;In a nutshell: An epidemic style of communication or state 
sharing gives you a highly robust medium for distributed interaction. The main 
advantages are&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;i&gt;Probabilistic model.&lt;/i&gt; This doesn't mean that it gives fewer 
	guarantees than a deterministic model, but that we now have a good framework 
	for reasoning about the spread of information through a system over time.&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Asynchronous communication pattern&lt;/i&gt;. Any good epidemic communication 
system allows you to operate in a 'fire-and-forget' mode, where, even if the 
initial sender fails, all surviving nodes will receive the update (or none 
will). &lt;/li&gt;
	&lt;li&gt;&lt;i&gt;Autonomous &amp;amp; decentralized actions&lt;/i&gt;. Epidemic techniques enable you to take actions 
	based on the data you received without the need for additional communication to reach 
	agreement with your partners; you can take decisions autonomously.&lt;/li&gt;
&lt;li&gt;&lt;i&gt;Robust with respect to message loss &amp;amp; node failures.&lt;/i&gt; Once a message 
has been received by at least one of your peers it is almost impossible to 
prevent the spread of the information through the system. In the most popular 
demo I give, the system still operates under 90% message loss with limited or no 
loss in functionality.&lt;/li&gt;
	&lt;li&gt;&lt;i&gt;Rigorous mathematical underpinnings&lt;/i&gt;. Finally we have a set of 
	protocols where we can use rigorous mathematical techniques to reason about 
	the operation of the protocols under all sorts of conditions.&lt;/li&gt;
&lt;/ul&gt;
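
&lt;p&gt;The robustness claim in the list above can be explored with a small simulation. The following Python sketch models a simple push-style gossip: in each round every informed node sends the update to randomly chosen peers, and each message is dropped with a fixed probability. This is a toy model under stated assumptions, not the protocol from any particular paper or from my demo; the function name, parameters, and node counts are all invented for illustration.&lt;/p&gt;

```python
import random


def gossip_rounds(node_count, loss_rate, fanout=1, seed=0, max_rounds=10000):
    """Simulate push gossip over a lossy network.

    Each round, every informed node sends the update to `fanout` peers chosen
    uniformly at random; each message is lost with probability `loss_rate`.
    Returns (rounds_run, informed_count).
    """
    rng = random.Random(seed)
    informed = {0}  # node 0 originates the update
    rounds = 0
    while node_count > len(informed) and max_rounds > rounds:
        rounds += 1
        newly_informed = set()
        for _sender in informed:
            for _ in range(fanout):
                target = rng.randrange(node_count)
                if rng.random() >= loss_rate:  # message survived the network
                    newly_informed.add(target)
        informed |= newly_informed
    return rounds, len(informed)


# Compare a loss-free network with one that drops 90% of all messages.
print("no loss: ", gossip_rounds(1000, loss_rate=0.0))
print("90% loss:", gossip_rounds(1000, loss_rate=0.9))
```

&lt;p&gt;In this toy model, heavy message loss slows the spread of the update but does not stop it: as long as some messages get through, the number of informed nodes keeps growing until everyone has the update.&lt;/p&gt;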
&lt;p&gt;These 
techniques have a long history in science, but mainly in biology and 
epidemiology, and in mathematics. The bible of epidemics from a theoretical point 
of view is:&lt;/p&gt;
&lt;blockquote&gt;
	&lt;p&gt;&lt;i&gt;Epidemic Theory of Infectious Diseases and its Applications&lt;/i&gt; N.T.J. 
	Bailey Hafner Press Second Edition, 1957&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is not a computer science text, but it explains the real 
fundamentals. If you are interested in more CS-oriented texts, the following 
list has some general papers that deal with the basics of epidemic communication 
or 'gossip' and are a good starting point to learn about the fundamentals.&lt;/p&gt;
&lt;p&gt;I purposely left the Cornell papers off this list, so as not to appear too 
self-serving. I believe the work at Cornell has been ground-breaking in bringing 
the techniques to a larger CS audience and applying them to building robust and 
scalable distributed systems. I'll give a reading list of Cornell work in a 
follow-up posting.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;B. Baker, R. Shostak, &lt;i&gt;
	&lt;a href=&quot;files/gossips-telephones.pdf&quot;&gt;Gossips and telephones&lt;/a&gt;&lt;/i&gt;, Discrete Mathematics 
	2(1972), pp. 191--193. 
	&lt;/li&gt;
	&lt;li&gt;A. Demers, D. Greene, A. Hauser, W. 
	Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and 
	D. Terry.&lt;i&gt;
	&lt;a href=&quot;http://ftp.se.scene.org/pub/bitsavers.org/pdf/xerox/parc/techReports/CSL-89-1_Epidemic_Algorithms_for_Replicated_Database_Maintenance.pdf&quot;&gt;Epidemic algorithms for replicated database maintenance&lt;/a&gt;&lt;/i&gt;. 
	In Proc. ACM Symp. on the Principles of Distr. Computing, 
	pages 1--12, August 1987. &lt;/li&gt;
	&lt;li&gt;R. Golding and K. Taylor. &lt;i&gt;
	&lt;a href=&quot;http://www.soe.ucsc.edu/share/technical-reports/1992/ucsc-crl-92-13.pdf&quot;&gt;Group membership in the epidemic style&lt;/a&gt;&lt;/i&gt;. Technical 
	Report UCSC-CRL-92-13, University of California, Santa Cruz, 
	May 1992.&lt;/li&gt;
	&lt;li&gt;D. Agrawal, A. El-Abbadi, and R. Steinke. &lt;i&gt;
	&lt;a href=&quot;http://www.cse.scu.edu/~jholliday/112609-2.pdf&quot;&gt;Epidemic algorithms in replicated databases&lt;/a&gt;&lt;/i&gt;. In 
	Proc. 16th ACM SIGACT-SIGMOD Symp. Princip. of Database 
	Systems (PODS), Tucson, Arizona, May 1997 &lt;/li&gt;
	&lt;li&gt;R. Karp, C. Schindelhauer, S. Shenker, B. 
	Vocking, &lt;i&gt;
	&lt;a href=&quot;http://archive.cone.informatik.uni-freiburg.de/pubs/rumor.pdf&quot;&gt;Randomized rumor spreading&lt;/a&gt;&lt;/i&gt;, Proc. IEEE Symp. 
	Foundations of Computer Science, 2000.&lt;/li&gt;
&lt;/ul&gt;
</content>
 </entry>
 
 <entry>
   <title>My Best Christmas Present – Root Domain Support for Amazon S3 Website Hosting</title>
   <link href="http://www.allthingsdistributed.com/2012/12/root-domain-amazon-s3-website.html"/>
   <updated>2012-12-27T12:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/12/root-domain-amazon-s3-website</id>
   <content type="html">&lt;p&gt;I have been a big fan of the &lt;a href=&quot;http://docs.amazonwebservices.com/AmazonS3/latest/dev/WebsiteHosting.html&quot;&gt;Amazon S3 Static Website Hosting&lt;/a&gt; feature since its launch and this blog happily is being served from it. S3 is not only a highly reliable and available storage service but also one of the most powerful web serving engines that exists today.  By storing your website in Amazon S3, you suddenly no longer have to worry about scaling, replication, performance, security, etc. All of that is handled seamlessly by S3.&lt;/p&gt;

&lt;p&gt;As such I am very happy that the Amazon S3 team has finally removed the last dependency on an external piece of infrastructure. Until the launch today of &lt;a href=&quot;http://docs.amazonwebservices.com/AmazonS3/latest/dev/WebsiteHosting.html&quot;&gt;S3 Website Root Domain&lt;/a&gt; support you could not host your website at the root domain, but only at a subdomain. For example this website is served from the &lt;a href=&quot;http://www.allthingsdistributed.com&quot;&gt;www.allthingsdistributed.com&lt;/a&gt; domain. To have visitors also be able to type in &lt;a href=&quot;http://allthingsdistributed.com&quot;&gt;allthingsdistributed.com&lt;/a&gt; (without the www) I had to make use of a “naked domain redirect” service. I happily made use of the great service that the folks at &lt;a href=&quot;http://wwwizer.com&quot;&gt;wwwizer&lt;/a&gt; (thanks!!) provided. However I can now rely on the excellent reliability and scalability of Amazon S3 for the redirect as well.&lt;/p&gt;

&lt;p&gt;With the launch of the support for hosting root domains in Amazon S3 Website Hosting, I now can manage the whole site via &lt;a href=&quot;http://aws.amazon.com/s3/&quot;&gt;Amazon S3&lt;/a&gt; and &lt;a href=&quot;http://aws.amazon.com/route53/&quot;&gt;Amazon Route 53&lt;/a&gt; (AWS’s DNS service). Each service has one new feature. Route 53 can now specify that a root domain (e.g. allthingsdistributed.com) use an S3 Website alias. And, S3 Website Hosting can redirect that incoming traffic to your preferred domain name (e.g. www.allthingsdistributed.com).&lt;/p&gt;

&lt;p&gt;I needed to take only two steps to get this working for All Things Distributed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the first step I created a new Amazon S3 bucket with the root domain name and enabled it for Website hosting. The URL for this is then &lt;em&gt;http://allthingsdistributed.com.s3-website-us-east-1.amazonaws.com/&lt;/em&gt;. In the website hosting section I selected the new option to “Redirect all request to another hostname”, which in my case is &lt;em&gt;www.allthingsdistributed.com&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Then in the Route 53 console I assigned the new URL (&lt;em&gt;http://allthingsdistributed.com.s3-website-us-east-1.amazonaws.com/&lt;/em&gt;) as an IPv4 Address Alias to the &lt;em&gt;allthingsdistributed.com&lt;/em&gt; record.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;This is, of course, the setup to use if you want both DNS names to end up at the same website. But the new Route 53 functionality by itself allows you to send traffic to your Amazon S3 website hosted at the root domain, which was not possible before. You can read more about this functionality in &lt;a href=&quot;http://docs.amazonwebservices.com/AmazonS3/latest/dev/WebsiteHosting.html&quot;&gt;the walkthrough for setting up a static website in S3&lt;/a&gt;.&lt;/p&gt;
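
&lt;p&gt;For readers who prefer to script the two steps rather than click through the consoles, the following Python sketch builds the two request payloads as plain dictionaries. The field names mirror the shape of the S3 website configuration and the Route 53 change batch as I understand those APIs, but treat them as assumptions and check them against the current documentation. Nothing here calls AWS; the function names are invented, and the hosted zone ID is a deliberate placeholder that must be replaced with the value AWS publishes for the S3 website endpoint in your region.&lt;/p&gt;

```python
import json


def redirect_website_config(target_host):
    """Website configuration for the root-domain bucket: redirect every request."""
    return {"RedirectAllRequestsTo": {"HostName": target_host}}


def root_alias_change(root_domain, website_endpoint, endpoint_zone_id):
    """Change batch that points the root domain's A record at the S3 website endpoint."""
    return {
        "Changes": [
            {
                "Action": "CREATE",
                "ResourceRecordSet": {
                    "Name": root_domain + ".",
                    "Type": "A",  # an IPv4 address alias, as in step 2
                    "AliasTarget": {
                        "HostedZoneId": endpoint_zone_id,
                        "DNSName": website_endpoint,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


# Step 1: the bucket named after the root domain redirects to the www host.
print(json.dumps(redirect_website_config("www.allthingsdistributed.com"), indent=2))

# Step 2: alias the root domain to the regional S3 website endpoint.
# "EXAMPLE_ZONE_ID" is a placeholder, not a real hosted zone ID.
print(json.dumps(
    root_alias_change(
        "allthingsdistributed.com",
        "s3-website-us-east-1.amazonaws.com",
        "EXAMPLE_ZONE_ID",
    ),
    indent=2,
))
```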

&lt;p&gt;Christmas couldn’t have been better this year thanks to the Amazon S3 and Route 53 teams.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>An Album for Each Year - 2012 Version</title>
   <link href="http://www.allthingsdistributed.com/2012/12/an-album-for-each-year-2012.html"/>
   <updated>2012-12-22T18:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/12/an-album-for-each-year-2012</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://www.allthingsdistributed.com/2008/07/an_album_for_each_year.html&quot;&gt;About 5 years ago&lt;/a&gt; I joined a challenge to
list &amp;quot;a favorite album for every year of your life.&amp;quot; The challenge has two restrictions: &lt;i&gt;only one album per year and
there can be no repeats of artists&lt;/i&gt;. I added for myself the
restriction that I should actually own the album, which restricts the set to
choose from significantly and also makes for some peculiar choices.&lt;/p&gt;

&lt;p&gt;My list stopped in 2007, so now that 2012 is almost at its end it is a good moment to add the next 5 years to the list:&lt;/p&gt;

&lt;p&gt;
1958: Jerry Lee Lewis, &lt;i&gt;Great Balls of Fire &lt;/i&gt;&lt;br&gt;
1959: Ray Charles, &lt;i&gt;What'd I Say &lt;/i&gt;&lt;br&gt;
1960: Miles Davis, &lt;i&gt;Sketches of Spain &lt;/i&gt;&lt;br&gt;
1961: Robert Johnson, &lt;i&gt;King of the Delta Blues Singers &lt;/i&gt;&lt;br&gt;
1962: Booker T &amp; MG, &lt;i&gt;Green Onions &lt;/i&gt;&lt;br&gt;
1963: James Brown, &lt;i&gt;Live at the Apollo &lt;/i&gt;&lt;br&gt;
1964: John Coltrane, &lt;i&gt;A Love Supreme &lt;/i&gt;&lt;br&gt;
1965: Bob Dylan, &lt;i&gt;Highway 61 Revisited &lt;/i&gt;&lt;br&gt;
1966: Cream, &lt;i&gt;Fresh Cream &lt;/i&gt;&lt;br&gt;
1967: The Doors, &lt;i&gt;The Doors &lt;/i&gt;&lt;br&gt;
1968: Johnny Cash, &lt;i&gt;At Folsom Prison &lt;/i&gt;&lt;br&gt;
1969: Rolling Stones, &lt;i&gt;Let it Bleed &lt;/i&gt;&lt;br&gt;
1970: The Who, &lt;i&gt;Live at Leeds &lt;/i&gt;&lt;br&gt;
1971: Marvin Gaye, &lt;i&gt;What's going on &lt;/i&gt;&lt;br&gt;
1972: Deep Purple, &lt;i&gt;Made in Japan &lt;/i&gt;&lt;br&gt;
1973: Pink Floyd, &lt;i&gt;Dark Side of the Moon &lt;/i&gt;&lt;br&gt;
1974: Genesis, &lt;i&gt;The Lamb Lies Down on Broadway &lt;/i&gt;&lt;br&gt;
1975: Led Zeppelin, &lt;i&gt;Physical Graffiti &lt;/i&gt;&lt;br&gt;
1976: Eagles, &lt;i&gt;Hotel California &lt;/i&gt;&lt;br&gt;
1977: The Stranglers, &lt;i&gt;Rattus Norvegicus  &lt;/i&gt;&lt;br&gt;
1978: Herman Brood and his Wild Romance, &lt;i&gt;Shpritsz &lt;/i&gt;&lt;br&gt;
1979: The Clash, &lt;i&gt;London Calling &lt;/i&gt;&lt;br&gt;
1980: AC/DC, &lt;i&gt;Back in Black &lt;/i&gt;&lt;br&gt;
1981: The Police, &lt;i&gt;Ghost in the Machine &lt;/i&gt;&lt;br&gt;
1982: Steel Pulse, &lt;i&gt;True Democracy &lt;/i&gt;&lt;br&gt;
1983: U2, &lt;i&gt;Under a Blood Red Sky &lt;/i&gt;&lt;br&gt;
1984: Talking Heads, &lt;i&gt;Stop Making Sense &lt;/i&gt;&lt;br&gt;
1985: John Cougar Mellencamp, &lt;i&gt;Scarecrow &lt;/i&gt;&lt;br&gt;
1986: Run DMC, &lt;i&gt;Raising Hell &lt;/i&gt;&lt;br&gt;
1987: Guns N' Roses, &lt;i&gt;Appetite for Destruction &lt;/i&gt;&lt;br&gt;
1988: Public Enemy, &lt;i&gt;It Takes A Nation of Millions to Hold Us Back &lt;/i&gt;&lt;br&gt;
1989: Eric Clapton, &lt;i&gt;Journeyman &lt;/i&gt;&lt;br&gt;
1990: Angelo Badalamenti, &lt;i&gt;Twin Peaks Soundtrack &lt;/i&gt;&lt;br&gt;
1991: Nirvana, &lt;i&gt;Nevermind &lt;/i&gt;&lt;br&gt;
1992: Rage Against the Machine, &lt;i&gt;Rage Against the Machine &lt;/i&gt;&lt;br&gt;
1993: Live, &lt;i&gt;Throwing Copper &lt;/i&gt;&lt;br&gt;
1994: Neil Young, &lt;i&gt;Sleeps with Angels &lt;/i&gt;&lt;br&gt;
1995: Garbage, &lt;i&gt;Garbage &lt;/i&gt;&lt;br&gt;
1996: James Cotton, &lt;i&gt;Deep in the Blues &lt;/i&gt;&lt;br&gt;
1997: Erykah Badu, &lt;i&gt;Baduizm &lt;/i&gt;&lt;br&gt;
1998: DMX, &lt;i&gt;Flesh of my Flesh, Blood of my Blood&lt;/i&gt;&lt;br&gt;
1999: Red Hot Chili Peppers, &lt;i&gt;Californication &lt;/i&gt;&lt;br&gt;
2000: Eminem, &lt;i&gt;The Marshall Mathers LP &lt;/i&gt;&lt;br&gt;
2001: The Strokes, &lt;i&gt;Is This It &lt;/i&gt;&lt;br&gt;
2002: Richard Locker, &lt;i&gt;Jewish Cello Masterpieces &lt;/i&gt;&lt;br&gt;
2003: Linkin Park, &lt;i&gt;Meteora &lt;/i&gt;&lt;br&gt;
2004: Green Day, &lt;i&gt;American Idiot &lt;/i&gt;&lt;br&gt;
2005: Fiona Apple, &lt;i&gt;Extraordinary Machine &lt;/i&gt;&lt;br&gt;
2006: Matisyahu, &lt;i&gt;Youth &lt;/i&gt;&lt;br&gt;
2007: Foo Fighters, &lt;i&gt;Echoes, Silence, Patience &amp; Grace &lt;/i&gt;&lt;br&gt;
2008: Hercules and Love Affair, &lt;i&gt;Hercules and Love Affair&lt;/i&gt;&lt;br&gt;
2009: Street Sweeper Social Club, &lt;i&gt;Street Sweeper Social Club&lt;/i&gt;&lt;br&gt;
2010: The Black Keys, &lt;i&gt;Brothers&lt;/i&gt;&lt;br&gt;
2011: Skrillex, &lt;i&gt;Bangarang&lt;/i&gt;&lt;br&gt;
2012: Jack White, &lt;i&gt;Blunderbuss&lt;/i&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;The last year is always the hardest, as it hasn't really become clear yet what this year's best album is going to be. I might have picked Fiona Apple again, but she already has the 2005 spot. Same for Eminem. Nas is definitely one of this year's better rappers, but just like Kanye in 2010 I can't convince myself it is this year's best album. Although everyone hypes Frank Ocean and Kendrick Lamar, they still have to grow on me. If I hadn't settled on Jack White I might have picked Hans Zimmer's soundtrack for The Dark Knight Rises.&lt;/p&gt;

&lt;p&gt;And to repeat my comments from 5 years ago:&lt;/p&gt;

&lt;p&gt;&lt;i&gt;The hardest part was leaving albums out; too
many masterpieces in the 70's, for example. But some other eras were
difficult as well: I really wanted Linton Kwesi Johnson in there but every time he had formidable
competition. Madness got beaten by AC/DC, Beastie Boys by Run DMC, Nirvana
kept Metallica out, The Stranglers win it from Johnny Rotten every time.
Honorable mentions for Traffic, Apocalyptica, Counting Crows and Nine Inch
Nails; they almost made it.&lt;/i&gt;&lt;/p&gt;
 
</content>
 </entry>
 
 <entry>
   <title>The Back-to-Basics Readings of 2012</title>
   <link href="http://www.allthingsdistributed.com/2012/12/paper-readings-2012.html"/>
   <updated>2012-12-18T22:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/12/paper-readings-2012</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/amsterdamcs.jpg&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;p&gt;After the AWS re:Invent conference I spent two weeks in Europe for the last customer visits of the year. I have since returned and am now in New York City, winding down the last activities of the year before spending the holidays here with family. Do not expect too many blog posts or Twitter updates, although there are still a few very exciting AWS news updates to come this year.&lt;/p&gt;

&lt;p&gt;I thought this was a good moment to collect all the readings I suggested this year in one summary post. It was not until later in the year that I started recording the readings here on the blog, so I hope this is indeed the complete list. I am pretty sure that some, if not all, of these papers deserve a place in the hall of fame of best papers in distributed systems.&lt;/p&gt;

&lt;p&gt;Feb 11 - &lt;a href=&quot;http://www.cs.utexas.edu/users/lorenzo/papers/SurveyFinal.pdf&quot;&gt;&lt;em&gt;A Survey of Rollback-Recovery Protocols in Message-Passing Systems&lt;/em&gt;&lt;/a&gt;, E. N. (Mootaz) Elnozahy, Lorenzo Alvisi, Yi-Min Wang, David B. Johnson, ACM Computing Surveys (CSUR), Volume 34, Issue 3, September 2002&lt;/p&gt;

&lt;p&gt;May 19 - &lt;a href=&quot;http://necsi.edu/research/sports/soccer/soccer.pdf&quot;&gt;&lt;em&gt;Science of Winning Soccer: Emergent pattern-forming dynamics in association football&lt;/em&gt;&lt;/a&gt;, L. Vilar, D. Araújo, K. Davids, Y. Bar-Yam, Journal of Systems Science and Complexity&lt;/p&gt;

&lt;p&gt;May 28 - &lt;a href=&quot;http://courses.csail.mit.edu/6.852/01/papers/VirtTime_GlobState.pdf&quot;&gt;&lt;em&gt;Virtual Time and Global States of Distributed Systems&lt;/em&gt;&lt;/a&gt;, Friedemann Mattern, Parallel and Distributed Algorithms, North-Holland (1989) , p. 215--226.&lt;/p&gt;

&lt;p&gt;Jul 4 - &lt;em&gt;&lt;a href=&quot;http://www.stanford.edu/class/cs240/readings/89-leases.pdf&quot;&gt;Leases: An efficient fault-tolerant mechanism for distributed file cache consistency&lt;/a&gt;&lt;/em&gt;, Gray, Cary, and David Cheriton, Vol. 23. No. 5. ACM, 1989.&lt;/p&gt;

&lt;p&gt;Jul 6 - &lt;em&gt;&lt;a href=&quot;http://wv.ly/LuOJ6T&quot;&gt;End-To-End Arguments in System Design&lt;/a&gt;&lt;/em&gt;, by J. H. Saltzer, D. P. Reed, and D. D. Clark, ACM Transactions on Computer Systems 2(4):277-288, November 1984&lt;/p&gt;

&lt;p&gt;Jul 13 - &lt;em&gt;&lt;a href=&quot;http://wv.ly/NrRpQ9&quot;&gt;Hints for Computer Systems Design&lt;/a&gt;&lt;/em&gt;, Butler W. Lampson, Proceedings of the Ninth ACM Symposium on Operating Systems Principles, pp. 33-48, October 1983, Bretton Woods, NH, USA.&lt;/p&gt;

&lt;p&gt;Jul 20 - &lt;em&gt;&lt;a href=&quot;http://www.stanford.edu/class/cs240/readings/disco.pdf&quot;&gt;Disco: Running Commodity Operating Systems on Scalable Multiprocessors&lt;/a&gt;&lt;/em&gt; by Edouard Bugnion, Scott Devine, Kinshuk Govil, Mendel Rosenblum in the Proceedings of the 16th ACM Symposium on Operating Systems Principles, October 5-8, 1997, St. Malo, France.&lt;/p&gt;

&lt;p&gt;Jul 20 - &lt;em&gt;&lt;a href=&quot;http://www.cl.cam.ac.uk/Research/SRG/netos/papers/2003-xensosp.pdf&quot;&gt;Xen and the art of virtualization&lt;/a&gt;&lt;/em&gt; by Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Timothy L. Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, in the Proceedings of the 19th ACM Symposium on Operating Systems Principles, October 19-22, 2003, Bolton Landing, NY USA.&lt;/p&gt;

&lt;p&gt;Aug 11 - &quot;&lt;em&gt;&lt;a href=&quot;http://wv.ly/OfxZku&quot;&gt;On the Naming and Binding of Network Destinations&lt;/a&gt;&lt;/em&gt;&quot;, Saltzer, J. H., RFC 1498, August 1993.&lt;/p&gt;

&lt;p&gt;Aug 17 - &lt;em&gt;&lt;a href=&quot;http://wv.ly/SxMFIN&quot;&gt;SEDA: An Architecture for Well-Conditioned, Scalable Internet Services&lt;/a&gt;&lt;/em&gt;, Matt Welsh, David Culler, and Eric Brewer. In Proceedings of the Eighteenth Symposium on Operating Systems Principles (SOSP-18), Banff, Canada, October, 2001.&lt;/p&gt;

&lt;p&gt;Aug 24 - &lt;em&gt;&lt;a href=&quot;http://www.hpl.hp.com/techreports/tandem/TR-86.1.pdf&quot;&gt;The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time&lt;/a&gt;&lt;/em&gt;, Jim Gray and Gianfranco Putzolu, Proceedings of the ACM SIGMOD Conference, pp. 395–398, 1987&lt;/p&gt;

&lt;p&gt;Aug 24 - &lt;em&gt;&lt;a href=&quot;ftp://ftp.research.microsoft.com/pub/tr/tr-97-33.pdf&quot;&gt;The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb&lt;/a&gt;&lt;/em&gt;, Jim Gray and Goetz Graefe, ACM SIGMOD Record 26 (4): 63–68, 1997&lt;/p&gt;

&lt;p&gt;Aug 24 - &lt;em&gt;&lt;a href=&quot;http://cacm.acm.org/magazines/2009/7/32091-the-five-minute-rule-20-years-later/fulltext&quot;&gt;The Five-Minute Rule 20 Years Later: and How Flash Memory Changes the Rules&lt;/a&gt;&lt;/em&gt;. Goetz Graefe, ACM Queue 6(4): 40-52 (2008)&lt;/p&gt;

&lt;p&gt;Sep 7 - &lt;em&gt;&lt;a href=&quot;http://wv.ly/OWQV4K&quot;&gt;Adaptive load sharing in homogeneous distributed systems&lt;/a&gt;&lt;/em&gt;, D Eager, ED Lazowska and J Zahorjan - IEEE transactions on software engineering, 1986&lt;/p&gt;

&lt;p&gt;Aug 31 - &lt;em&gt;&lt;a href=&quot;http://wv.ly/R2Dnml&quot;&gt;Granularity of Locks and Degrees of Consistency&lt;/a&gt;&lt;/em&gt;, J. Gray, R. Lorie, G.F. Putzolu, and I.L. Traiger, Modeling in Data Base Management Systems, G.M. Nijssen ed., North Holland Pub., 1976, pp. 364-394.&lt;/p&gt;

&lt;p&gt;Sep 14 - &lt;em&gt;&quot;&lt;a href=&quot;http://nms.lcs.mit.edu/papers/cryptdb-sosp11.pdf&quot;&gt;CryptDB: Protecting Confidentiality with Encrypted Query Processing&lt;/a&gt;&quot;&lt;/em&gt;, Raluca Ada Popa, Catherine Redfield, Nickolai Zeldovich, and Hari Balakrishnan, Symposium on Operating Systems Principles, Cascais, Portugal, October 2011&lt;/p&gt;

&lt;p&gt;Sep 21 - &lt;em&gt;&lt;a href=&quot;http://www.stanford.edu/class/cs240/readings/89-leases.pdf&quot;&gt;Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency&lt;/a&gt;&lt;/em&gt;, Cary G. Gray and David R. Cheriton, Proceedings of the Twelfth ACM Symposium on Operating Systems Principles (SOSP), December 1989, Litchfield Park, AZ, USA.&lt;/p&gt;

&lt;p&gt;Sep 28 - &lt;em&gt;&lt;a href=&quot;http://www.michaelnielsen.org/ddi/why-bloom-filters-work-the-way-they-do/&quot;&gt;Why Bloom filters work the way they do&lt;/a&gt;&lt;/em&gt;, Michael Nielsen, Data-driven Intelligence, September 26, 2012&lt;/p&gt;

&lt;p&gt;Sep 28 - &lt;em&gt;&lt;a href=&quot;http://pages.cs.wisc.edu/~jussara/papers/00ton.pdf&quot;&gt;Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol&lt;/a&gt;&lt;/em&gt;, Li Fan, Pei Cao, Jussara Almeida, Andrei Broder, IEEE/ACM Transactions on Networking, 8(3):281-293, 2000.&lt;/p&gt;

&lt;p&gt;Oct 12 - &lt;em&gt;&lt;a href=&quot;http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-77.pdf&quot;&gt;Automatic Reconfiguration in Autonet&lt;/a&gt;&lt;/em&gt;, Thomas Rodeheffer and Michael Schroeder, Proceedings of the 13th ACM Symposium on Operating Systems Principles, October 13-16, 1991, Pacific Grove, CA USA.&lt;/p&gt;

&lt;p&gt;Nov 2 - &lt;a href=&quot;http://www.cs.cmu.edu/~15-610/READINGS/required/availability/gifford79.pdf&quot;&gt;&lt;em&gt;Weighted voting for replicated data&lt;/em&gt;&lt;/a&gt;, David K. Gifford, Proceedings of the 7th ACM Symposium on Operating Systems Principles, December 10-12, 1979, Pacific Grove, CA USA&lt;/p&gt;

&lt;p&gt;Nov 9 - &lt;a href=&quot;http://jmiller.uaa.alaska.edu/cse465-fall2012/papers/needham1978.pdf&quot;&gt;&lt;em&gt;Using Encryption for Authentication in Large Networks of Computers&lt;/em&gt;&lt;/a&gt;, Roger M. Needham and Michael D. Schroeder, Communications of the ACM 21(12), December 1978, pp.993-998.&lt;/p&gt;

&lt;p&gt;Nov 17 - &lt;a href=&quot;http://inst.cs.berkeley.edu/~cs262/sp02/Papers/afs.pdf&quot;&gt;&lt;em&gt;Scale and performance in a distributed file system&lt;/em&gt;&lt;/a&gt;, John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West, ACM Trans. on Computer Systems 6(1), February 1988, pp. 51-81.&lt;/p&gt;

&lt;p&gt;Nov 30 - &lt;a href=&quot;http://courses.csail.mit.edu/6.885/spring06/papers/AwerbuchPeleg-focs.pdf&quot;&gt;&lt;em&gt;Sparse Partitions&lt;/em&gt;&lt;/a&gt;, Baruch Awerbuch and David Peleg, Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 503-513, October 1990.&lt;/p&gt;

&lt;p&gt;I hope you enjoyed some of them. I certainly did. Next year there will be a new batch so catch up on what you haven't read during the holidays :-).&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - Sparse Partitions</title>
   <link href="http://www.allthingsdistributed.com/2012/11/sparse-partitions.html"/>
   <updated>2012-11-30T11:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/11/sparse-partitions</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/vegas.jpg&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;p&gt;The amazing &lt;a href=&quot;http://reinvent.awsevent.com&quot;&gt;AWS re:Invent&lt;/a&gt; conference wrapped up last night and I am on my way to Europe for a last visit to customers this year. I am carrying with me a more theoretical paper on the principles of distributed computing: Sparse Partitions by Awerbuch and Peleg. It deals with the failure of control as networks grow larger and presents several solutions based on locality that have found practical applications.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://courses.csail.mit.edu/6.885/spring06/papers/AwerbuchPeleg-focs.pdf&quot;&gt;&lt;em&gt;Sparse Partitions&lt;/em&gt;&lt;/a&gt;, Baruch Awerbuch and David Peleg, Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 503-513, October 1990.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Expanding the Cloud – Announcing Amazon Redshift, a Petabyte-scale Data Warehouse Service</title>
   <link href="http://www.allthingsdistributed.com/2012/11/amazon-redshift.html"/>
   <updated>2012-11-28T09:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/11/amazon-redshift</id>
   <content type="html">&lt;p&gt;Today, we are excited to announce the limited preview of &lt;a href=&quot;http://aws.amazon.com/redshift&quot;&gt;Amazon Redshift&lt;/a&gt;, a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift enables customers to obtain dramatically increased query performance when analyzing datasets ranging in size from hundreds of gigabytes to a petabyte or more, using the same SQL-based business intelligence tools they use today. Customers have been asking us for a data warehouse service for some time now and we’re excited to be able to deliver this to them.&lt;/p&gt;

&lt;p&gt;Amazon Redshift uses a variety of innovations to enable customers to rapidly analyze datasets ranging in size from several hundred gigabytes to a petabyte and more. Unlike traditional row-based relational databases, which store data for each row sequentially on disk, Amazon Redshift stores each column sequentially. This means that Redshift performs much less wasted IO than a row-based database because it doesn’t read data from columns it doesn’t need when executing a given query. Also, because similar data are stored sequentially, Amazon Redshift can compress data efficiently, which further reduces the amount of IO it needs to perform to return results.&lt;/p&gt;
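
&lt;p&gt;To make the row-versus-column difference concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Amazon Redshift is implemented: the table, its column names, and the helper functions are all hypothetical, and disk IO is approximated by counting the values a scan has to touch.&lt;/p&gt;

```python
# Hypothetical three-column table; the rows and column names are illustrative only.
rows = [
    ("2012-11-01", "books", 12.50),
    ("2012-11-01", "music", 9.99),
    ("2012-11-02", "books", 30.00),
    ("2012-11-02", "video", 19.95),
]
column_names = ("order_date", "category", "amount")

# Row-oriented layout: the values of each row sit next to each other.
row_store = [value for row in rows for value in row]

# Column-oriented layout: the values of each column sit next to each other.
column_store = {
    name: [row[i] for row in rows] for i, name in enumerate(column_names)
}


def values_read_row_store(wanted_columns):
    """A scan of the row layout touches every value, whatever columns are wanted."""
    return len(row_store)


def values_read_column_store(wanted_columns):
    """A scan of the column layout touches only the requested columns."""
    return sum(len(column_store[name]) for name in wanted_columns)


# A query that only needs the (hypothetical) "amount" column.
total_amount = sum(column_store["amount"])
row_reads = values_read_row_store(["amount"])
column_reads = values_read_column_store(["amount"])
print(round(total_amount, 2), row_reads, column_reads)
```

&lt;p&gt;In this toy table, summing one column out of three touches a third of the values in the column layout, and the gap widens as tables get wider.&lt;/p&gt;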

&lt;p&gt;Amazon Redshift’s architecture and underlying platform are also optimized to deliver high performance for data warehousing workloads. Redshift has a massively parallel processing (MPP) architecture, which enables it to distribute and parallelize queries across multiple low cost nodes. The nodes themselves are designed specifically for data warehousing workloads. They contain large amounts of locally attached storage on multiple spindles and are connected by a minimally oversubscribed 10 Gigabit Ethernet network. This configuration maximizes the amount of throughput between your storage and your CPUs while also ensuring that data transfer between nodes remains extremely fast.&lt;/p&gt;

&lt;p&gt;When you provision an Amazon Redshift cluster, you can select from 1 to 100 nodes depending on your storage and performance requirements and easily scale up or down as those requirements change. You have a choice of two node types when provisioning a cluster: an extra large (XL) node with 2TB of compressed storage or an eight extra large (8XL) node with 16TB of compressed storage. Amazon Redshift’s MPP architecture makes it easy to resize your cluster to keep pace with your storage and performance requirements. You can start with 2TB of capacity in your data warehouse cluster and easily scale up to a petabyte and more.&lt;/p&gt;

&lt;p&gt;Parallelism isn’t just about queries. Amazon Redshift takes it a step further by applying it to operations like loads, backups, and restores. For example, when loading data from Amazon S3, you simply issue a SQL copy command with the location of your S3 bucket. Redshift analyzes the contents of your bucket and loads each node in parallel, taking advantage of the increased bandwidth of multiple connections to S3. If you choose to load your data in a round robin fashion, you’re done. If you choose a hash-partitioning scheme, your data is automatically redistributed to the correct node. Amazon Redshift also extends this parallelism to backups, which are taken from each node and are automated, continuous, and incremental. Restoring a cluster from an S3 backup is also a node-parallel operation. With all of these operations, our goal is to minimize the time you spend performing operations with large data sets.&lt;/p&gt;
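
&lt;p&gt;The two distribution choices mentioned above can be sketched roughly as follows. This is an assumption-laden illustration: the node count, the sample keys, and the use of CRC32 as the hash are hypothetical and say nothing about the hash function Redshift actually uses.&lt;/p&gt;

```python
import zlib

NODE_COUNT = 4  # hypothetical cluster size


def round_robin_node(row_index):
    """Round robin: rows are dealt out to nodes in turn, regardless of content."""
    return row_index % NODE_COUNT


def hash_node(key):
    """Hash partitioning: rows with the same key always land on the same node.

    CRC32 is used here only because it is deterministic and in the standard
    library; it is an assumption, not the function Redshift uses.
    """
    return zlib.crc32(key.encode("utf-8")) % NODE_COUNT


# Hypothetical distribution-key values for five incoming rows.
customer_ids = ["c1", "c2", "c1", "c3", "c1"]

round_robin = [round_robin_node(i) for i in range(len(customer_ids))]
hashed = [hash_node(customer_id) for customer_id in customer_ids]

print(round_robin)
print(hashed)
```

&lt;p&gt;Round robin spreads rows evenly, while hash partitioning keeps all rows for the same key on one node, which is why a hash-partitioned load may need to move rows after they arrive.&lt;/p&gt;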

&lt;p&gt;The result of our focus on performance has been dramatic. Amazon.com’s data warehouse team has been piloting Amazon Redshift and comparing it to their on-premise data warehouse for a range of representative queries against a two billion row data set. They saw speedups ranging from 10x – 150x!&lt;/p&gt;

&lt;p&gt;Until now, these levels of performance and scalability were prohibitively expensive. I’m happy to say that this is not how we do things at Amazon. You can get started with a single 2TB Amazon Redshift node for $0.85/hour On-Demand and pay by the hour with no long-term commitments or upfront costs. This works out to $3,723 per terabyte per year. If you have stable, long running workloads, you can take advantage of our three year reserved instance pricing to lower Redshift’s price to under $1,000 per terabyte per year, one tenth the price of most data warehousing solutions available to customers today. In the case of Amazon.com’s data warehouse team, their existing data warehouse is a multi-million dollar system with 32 nodes, 128 CPUs, 4.2TB of RAM, and 1.6PB of disk. They achieved their speedups with an Amazon Redshift cluster with 2 8XL nodes and an effective 3 year reserved instance price of $3.65/hour, or less than $32,000 per year.&lt;/p&gt;
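
&lt;p&gt;The per-year figures above follow directly from the hourly rates. A small Python sketch of the arithmetic, assuming a 365-day year and continuous use (the function names are made up for this example):&lt;/p&gt;

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def yearly_cost(hourly_rate):
    """Cost in dollars of running at the given hourly rate for a full year."""
    return hourly_rate * HOURS_PER_YEAR


def yearly_cost_per_terabyte(hourly_rate, terabytes):
    """Yearly cost divided by the storage capacity it buys."""
    return yearly_cost(hourly_rate) / terabytes


# One 2TB node at the $0.85/hour On-Demand rate quoted in the text.
on_demand_per_tb = yearly_cost_per_terabyte(0.85, 2)

# The two-node 8XL cluster at the quoted effective rate of $3.65/hour.
two_node_reserved = yearly_cost(3.65)

print(round(on_demand_per_tb), round(two_node_reserved))  # prints 3723 31974
```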

&lt;p&gt;In addition to being expensive, self-managed on-premise data warehouses require significant time and resources to administer. Loading, monitoring, tuning, taking backups, and recovering from faults are complex and time-consuming tasks. Amazon Redshift changes this by managing all the work needed to set up, operate, and scale a data warehouse, enabling you to focus on analyzing your data and generating business insights.&lt;/p&gt;

&lt;p&gt;We designed Amazon Redshift with integration and compatibility in mind. Redshift integrates with Amazon Simple Storage Service (S3) and Amazon DynamoDB, with support for Amazon Relational Database Service (RDS) and Amazon Elastic MapReduce coming soon. You can connect your SQL-based clients or business intelligence tools to Amazon Redshift using standard PostgreSQL drivers over JDBC or ODBC connections. Jaspersoft and MicroStrategy have already certified Amazon Redshift for use with their platforms, with additional business intelligence tools coming soon.&lt;/p&gt;

&lt;p&gt;I believe that Amazon Redshift’s combination of performance, price, manageability, and compatibility will make analyzing larger and larger data sets economically justifiable. I look forward to seeing how our customers put this technology to work.&lt;/p&gt;

&lt;p&gt;To learn more about Amazon Redshift, visit the AWS blog and sign up for an invitation to the limited preview at &lt;a href=&quot;http://aws.amazon.com/redshift&quot;&gt;http://aws.amazon.com/redshift&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Back-to-Basics Weekend Reading - The Andrew File System</title>
   <link href="http://www.allthingsdistributed.com/2012/11/andrew-file-system.html"/>
   <updated>2012-11-17T23:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/11/andrew-file-system</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/brussel.jpg&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;p&gt;This weekend I am heading to Brussels for meetings with the European Commission, specifically with Vice-President Neelie Kroes, who owns the Digital Agenda for the EU, about how to accelerate cloud usage in both business and government in Europe.&lt;/p&gt;

&lt;p&gt;I am bringing with me a paper about one of the first distributed systems to actually see widespread commercial deployment. The Andrew File System (AFS) was developed at CMU and was much more than just a distributed file system: it had a very interesting caching and volume replication architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://inst.cs.berkeley.edu/~cs262/sp02/Papers/afs.pdf&quot;&gt;&lt;em&gt;Scale and performance in a distributed file system&lt;/em&gt;&lt;/a&gt;, John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West, ACM Trans. on Computer Systems 6(1), February 1988, pp. 51-81.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Expanding the Cloud – introducing the Asia Pacific (Sydney) Region</title>
   <link href="http://www.allthingsdistributed.com/2012/11/asia-pacifc-sydney-region.html"/>
   <updated>2012-11-12T05:00:00-08:00</updated>
   <id>http://www.allthingsdistributed.com/2012/11/asia-pacifc-sydney-region</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/sydneypan.jpg&quot; width=&quot;650&quot;/&gt;&lt;/p&gt;

&lt;p&gt;Today, Amazon Web Services has greater worldwide coverage with the launch of a new AWS Region in Sydney, Australia. This new Asia Pacific (Sydney) Region has been highly requested by companies worldwide, and it provides low latency access to AWS services for those who target customers in Australia and New Zealand. The Region launches with two Availability Zones to help customers build highly available applications.&lt;/p&gt;

&lt;p&gt;I have visited Australia at least twice every year for the past four years and I have seen first-hand evidence of the tremendous interest there is in the AWS service. Many young businesses as well as established enterprises are already using AWS, many of them targeting customers globally. Cool ecommerce sites such as &lt;a href=&quot;http://www.redbubble.com/&quot;&gt;redbubble.com&lt;/a&gt;, big traffic sites such as &lt;a href=&quot;http://realestate.com.au&quot;&gt;realestate.com.au&lt;/a&gt;, innovative crowd sourcing with &lt;a href=&quot;http://www.99designs.com&quot;&gt;99designs&lt;/a&gt;, big-data driven real-time advertising trading with &lt;a href=&quot;http://www.brandscreen.com&quot;&gt;Brandscreen&lt;/a&gt;, mobile sports apps by &lt;a href=&quot;http://www.vodafone.com.au/personal/cricket/cricket/live-app&quot;&gt;Vodafone Hutchison Australia&lt;/a&gt;, big banks like &lt;a href=&quot;http://www.commbank.com.au/&quot;&gt;Commonwealth Bank of Australia&lt;/a&gt;: these are just a small sample of the wide variety of companies that have been using AWS extensively for quite some time already, and I know they will put the new Region to good use.&lt;/p&gt;

&lt;p&gt;But it is not only Australian companies that have frequently requested a local AWS Region. Companies from outside Australia that would like to start delivering their products and services to the Australian market are also enthusiastic about serving Australia with low latency. Many of these firms have wanted to enter this market for years but refrained because of the daunting task of acquiring local hosting or datacenter capacity. These companies can now benefit from the fact that the new Asia Pacific (Sydney) Region is similar to all other AWS Regions, which enables software developed for other Regions to be quickly deployed in Australia as well.&lt;/p&gt;

&lt;p&gt;You can learn more about our growing global infrastructure footprint at &lt;a href=&quot;http://aws.amazon.com/about-aws/globalinfrastructure&quot;&gt;http://aws.amazon.com/about-aws/globalinfrastructure&lt;/a&gt;. Please also visit the &lt;a href=&quot;http://aws.typepad.com/aws/2012/11/asia-pacific-sydney-region-open.html&quot;&gt;AWS developer blog&lt;/a&gt; for more great stories from our Australian customers and partners.&lt;/p&gt;
</content>
 </entry>
 

</feed>
