These are the old pages from the weblog as they were published at Cornell. Visit for up-to-date entries.

April 27, 2004

History of Epidemics

In the past 6-8 years we have been using various epidemic techniques in building our reliable and scalable distributed systems with great success. Now that industry is starting to deal with issues of scale that can almost only be solved by using epidemic techniques, it becomes important to produce some basic pointers to the origin of the use of epidemics in distributed systems.

In a nutshell: An epidemic style of communication or state sharing gives you a highly robust medium for distributed interaction. The main advantages are

  • Probabilistic model. This doesn't mean that it gives less guarantees than a deterministic model, but that we now have a good framework for reasoning about the spread of information through a system over time.
  • A-synchronous communication pattern. Any good epidemic communication system allows you to operate in a 'fire-and -forget' mode, where, even if the initial sender fails, all surviving nodes will receive the update (or none will).
  • Autonomous & decentralized actions. Epidemics techniques enable you to take actions based on the data you received without the need for additional communication to reach agreement with your partners; you can take decisions autonomously.
  • Robust with respect to message loss & node failures. Once a message has been received by at least one of your peers it is almost impossible to prevent the spread of the information through the system. In the most popular demo I give, the system still operates under 90% message loss with limited or no loss in functionality.
  • Rigorous mathematical underpinnings. Finally we have a set protocols where we can use rigorous mathematical techniques to reason about the operation of the protocols under all sorts of conditions

These techniques have a long history in science, but mainly in biology and epidemiology, and in mathematics. The bible of epidemics from a theorectic point of view is:

Epidemic Theory of Infectious Diseases and its Applications N.T.J. Bailey Hafner Press Second Edition, 1957

This is not a computer science text, but it explains the real fundamentals. If you interested in more CS oriented texts than the following list has some general papers that deal with the basics of epidemic communication or 'gossip' and are a good starting point to learn about fundamentals.

I purposely left the Cornell papers off this list, as not to appear too self-serving. I believe the work at Cornell has been ground-breaking in bringing the techniques to a larger CS audience and applying it to building robust and scalable distributed systems. I'll give a reading list of Cornell work in a follow-up posting.

Posted by Werner Vogels at April 27, 2004 05:40 PM

Epidemic networking
Excerpt: We've already adopted the "virus" paradigm from the biologists, so why not use other natural phenomenon to describe - and improve - networking?
Weblog: Compendium
Tracked: April 28, 2004 09:57 AM