February 2006 Archives

Not Really A Top Ten


Halley Suitt, now CEO of Top Ten Sources, has a way of talking me into doing things that I actually have no time for. This time it was giving her a list of my top ten weblog feeds. This is an impossible task, as I don’t think there is such an absolute list, but try explaining that to someone who runs a company called Top Ten Sources. So I compromised by selecting some of the categories I have organized my reading into, and from each category I picked one or two weblogs that I find consistently worthwhile reading. You can find the list at werner-vogels-top-10.toptensources.com.

PS. In retrospect I think that in a past life Halley must have been Eve in the Garden of Eden, as I am sure she can talk any man into eating the whole darn apple.

Generating SiteInfo with MT

Comments (2)

On the A9 Developer weblog there is an article today on the SiteInfo feature, which until now was only available if you used the A9 Toolbar. They have now developed a Firefox extension that gives you the same functionality. Normally the SiteInfo tab gives you the Alexa information about a site (traffic ranking, reviews, etc.) and presents you with a short “people that visit this site also visit …” list of sites. However, if you go to IBM.com or Amazon.com you will see that this tab changes into a site navigation menu. Today's posting on the A9 weblog and an earlier posting in September give some other examples.

You can control this by generating a siteinfo.xml file and placing it in the root directory of your website. The details of the file format are in the SiteInfo Spec. I added a template to my Movable Type installation that automatically generates this file with a menu for the latest postings, categories, etc.

netinfo-small.JPG

My Movable Type template for this can be downloaded from here. Set it up as a new template that generates the file siteinfo.xml on rebuild.

Associating

Comments (1) | TrackBacks (1)

I have to admit that one of the reasons for my “January Recommendations” posting was to experiment with the Amazon.com associates program. Associates earn up to 10% of the purchase if they drive traffic to Amazon.com and the customer actually places an order. Not that I was expecting to make any money off this, but I was interested in seeing what the experience is like for an associate who needs to create a collection of links, and how good the tools are for tracking the results.

It appears that my readers are not going to earn my dinner for me: of the almost 2000 people who saw the posting, only 63 actually clicked on one of the links to visit the product page at Amazon.com. And none of you actually ordered anything…

This is the summary from associates central:

As you can see, 3.2% of you actually looked at the product detail pages referred to in the January recommendations posting. The last four columns are conversion, items ordered, items shipped, and referral fees. Sadly, they are all still 0. (Conversion is the percentage of clicks that results in an order.)

I am having great fun, though, with the tools at associates central. There are lots of trending tools and other reports that allow you to build a real business out of being an associate.
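The two ratios in that report can be checked with a minimal Python sketch. The function name is made up for illustration, and the view count is rounded up to 2000 (the posting only says "almost 2000"), so the real click rate is slightly higher than what this prints.

```python
def rate_percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, or 0.0 if the denominator is 0."""
    if denominator == 0:
        return 0.0
    return 100.0 * numerator / denominator


# Assumption: "almost 2000" views is rounded up to 2000.
views = 2000
clicks = 63
orders = 0

click_rate = rate_percent(clicks, views)   # share of readers who clicked a link
conversion = rate_percent(orders, clicks)  # share of clicks that became an order

print(f"click rate: {click_rate:.2f}%")
print(f"conversion: {conversion:.2f}%")
```

With the rounded view count the click rate comes out just above 3%, in line with the 3.2% in the report, and the conversion stays at zero as long as nobody orders.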

It also gives you many more ways to collect referral fees than just direct product links. There are different sizes of banners for promotions in different categories, search boxes, and recommended product links.

The link above is what Amazon.com produces dynamically as recommended products for the keywords “service oriented architecture” in the books category. The next project is to see how easy it would be to integrate this into the weblog feeds…

Update: the recommendation code generated by the associates program uses an iframe, and as such the dynamic content doesn't show up in some feed readers. In the feeds I have replaced it with an image for the moment to show what it should have looked like.

Opportunities in Modeling Complex Distributed Systems


Modeling systems has always been part of the toolkit of the computer scientist. We often try to reduce systems to simple queuing models to understand throughput and latency questions, and then use those results to predict resource usage and drive allocation. Can one actually be confident that such a simple model accurately reflects reality? With the increased complexity of distributed systems based on a large-scale autonomous services model, these techniques become a lot less reliable.
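To make "simple queuing model" concrete, here is a minimal Python sketch of the textbook M/M/1 queue (one server, Poisson arrivals, exponential service times). The choice of M/M/1 and the example rates are assumptions for illustration only; the posting does not say which models are used in practice.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue; rates are in requests per second."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, service rate positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")

    utilization = arrival_rate / service_rate                 # rho = lambda / mu
    mean_response_time = 1.0 / (service_rate - arrival_rate)  # W = 1 / (mu - lambda)
    mean_in_system = utilization / (1.0 - utilization)        # L = rho / (1 - rho)
    return {
        "utilization": utilization,
        "mean_response_time": mean_response_time,
        "mean_in_system": mean_in_system,
    }


# Hypothetical service: handles 100 requests/s and receives 80 requests/s.
metrics = mm1_metrics(arrival_rate=80.0, service_rate=100.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A single formula like this says nothing about replication, caches, or failures across datacenters, which is exactly where such models stop being trustworthy.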

I would like to use modeling techniques to focus on more than just achieving simple SLAs. I want to understand the cost impact of using certain algorithms in combination with specific node and network configurations, especially under certain failure scenarios. I would use such an economic model during the design phase of a service or application, to evaluate different algorithms for achieving consistency and availability based on their cost impact. For example, if a service needs state persistence that can survive a complete datacenter outage, and the service needs to be accessed by clients in ten datacenters with a certain SLA, there is a range of algorithmic and configuration choices to be made.

In these situations, systems design has often focused on trying to achieve the performance and availability SLAs first, which in itself is difficult enough. The economics of the different algorithm and configuration choices are often considered to be of secondary importance. However, when you are determining the cost of a system, you have to consider the size of the replication units in combination with the density of the storage nodes, the reliability of the storage system, the step-function cost of inter-datacenter networking, and the location and reliability of data caches. This results in a base cost plus a cost per storage operation that is different in a quorum-based system than in a probabilistic system. This holds especially true when you include in the model the cost of recovering from a cache node, storage node, or datacenter failure.
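The "base cost plus a cost per storage operation" structure can be sketched in a few lines of Python. Everything here is hypothetical: the class, the function, and above all the numbers are invented for illustration and say nothing about which design is actually cheaper. A real model would derive both terms from replication unit size, node density, network pricing, and failure recovery costs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostModel:
    name: str
    base_cost: float           # fixed cost per period (nodes, inter-datacenter links)
    cost_per_operation: float  # marginal cost of one storage operation

    def total_cost(self, operations: float) -> float:
        """Linear model: base cost plus per-operation cost times volume."""
        return self.base_cost + self.cost_per_operation * operations


def break_even_operations(a: CostModel, b: CostModel) -> Optional[float]:
    """Operation count at which both designs cost the same, or None if there is none."""
    rate_diff = a.cost_per_operation - b.cost_per_operation
    if rate_diff == 0:
        return None
    crossover = (b.base_cost - a.base_cost) / rate_diff
    return crossover if crossover > 0 else None


# Invented figures, for illustration only.
quorum = CostModel("quorum-based", base_cost=50_000.0, cost_per_operation=0.00004)
probabilistic = CostModel("probabilistic", base_cost=65_000.0, cost_per_operation=0.00001)

crossover = break_even_operations(quorum, probabilistic)
print(f"break-even at about {crossover:,.0f} operations per period")
```

Even this toy version shows why the answer depends on workload: below the crossover one design is cheaper, above it the other is.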

Many have assumed that throwing a lot of cheap hardware at the problem is the answer to many of these questions, but our experience is that when taking complex multi-datacenter configurations into account, the answers are less obvious. As we build new services we need better models that can handle these very complex, multi-variant scenarios to make sure we build the right services at the right cost. At Amazon we are fortunate to have a lot of data that will allow us to make progress on these questions.

I have positions open for experienced engineers/scientists who want to work with me on the problems of modeling complex distributed systems. These are some of the things I will look for:

  • You have a very solid understanding of distributed systems and networking
  • You know how to do data-driven analysis and truly understand statistics
  • You understand large scale monitoring and data collection architectures
  • You are familiar with the current state of the art in distributed systems modeling
  • You are an experienced engineer with a track record of building complex systems
  • If you are not that experienced, you may have an advanced degree with a proven expertise in modeling complex distributed systems and have demonstrated involvement with large software projects (e.g. open-source).
  • You have a proven ability to effectively communicate the results of data analysis and modeling
  • You live in or are willing to relocate to the greater Seattle area

If you are interested in this work and feel that you are qualified, send me an email with your resume.

A Neutral Net


Today the Senate hearings on Net Neutrality took place. The details of the testimonies are online.

I believe that everyone who has a vested interest in seeing business and service innovation continue to flourish on the internet should be concerned about the proposal for access-tiering introduced by the last-mile broadband vendors.

I suggest you read some of the testimonies to get an idea of what was discussed at the hearing. I believe that Larry Lessig's testimony today was particularly eloquent, building upon Powell's "Internet Freedoms" policy combined with the end-to-end argument. If you read only one testimony, it should be his.

Another Reality

Comments (1)

Because of my absentmindedness, last year I forgot to point to another writer who has joined the public Amazon/A9 weblog family: in December Claire Giordano of OpenSolaris fame joined A9 as director of product management. Please visit Claire's Alternate Version of Reality.

The Big Day


GO HAWKS!

kimhawks.JPG
Kim @ the Seahawks send-off party

Spring Systems Conferences


The programs for two conferences with mainly operating and distributed systems contributions, Eurosys and NSDI, are online now.

With 58 papers between the two conferences, the systems research field appears to be very healthy. I participated in the program committee for Eurosys, and I believe there was a total of over 170 submissions. Assuming NSDI received a similar number of submissions, and of the same quality, there is a lot of good experimental work going on.

I sure hope the organizers of both conferences will make the papers available online as soon as the authors produce their final versions. There is no gain in keeping the content of the papers hidden until the conference: making the papers available early will create a better prepared audience, with better questions and discussion.

The Eurosys program committee was the last remnant of my former academic life, and it is going to be my last program committee for a while; I don’t have the bandwidth anymore to give papers the attention they deserve. For Eurosys I had to review between 30 and 40 papers, each with 14 pages of deep technical content. We did a lot of group reviewing, which was a great help, but I spent too many late nights on this for it to be fun.

Links Below Sea Level Needed

Comments (1)

In case you didn't know: I am Dutch. If you had heard me speak, you probably would have guessed that tidbit of useless information. In the past months I have received a number of critical comments that I am not keeping up with all the developments in the Netherlands in terms of computing and ecommerce. To be honest, I have very little insight into how the Dutch IT world has developed, given that I left the Netherlands almost 15 years ago. A lot has happened since.

Some of my Dutch contacts have now started to convince me that it would be interesting to follow some of the more recent developments. I am looking for a few good Dutch feeds that would allow me to track what is going on in IT and ecommerce in the Netherlands. I have been given three links to bootstrap: webwereld.nl, vnunet.nl and emerce.nl. Some of these seem to be mainly international news translated into Dutch, so I am looking for more links to really good Dutch feeds.