These are the old pages from the weblog as they were published at Cornell. Visit for up-to-date entries.

March 25, 2004

Simulating Transactions

I have been spending quite a bit of time on a simulator (in C#) for distributed systems interaction. The simulator is meant to support investigations into the impact of node and network failures on protocols. I had been banging my head against the wall on some of the visualization support I wanted to build for use with the simulation. This morning Julia kindly helped me clear the hurdle I was stuck behind.

The work was triggered by a suggestion from Jim Gray to collaborate on the simulation of two-phase commit and the new Paxos commit that Jim wrote together with Leslie Lamport. Paxos commit is a transaction protocol based on the well-known Paxos consensus protocol [1], intended to overcome the blocking failure scenarios 2PC is known for; Pat Helland used to call 2PC 'The Unavailability Protocol'. I am not necessarily convinced Paxos is the way to go about building fault-tolerance support for transactions, but I am certainly willing to help drive a stake through the heart of two-phase commit.

The current package (all in C#) consists of:

  • Basic discrete event simulator. Calendar Queue (Brown), activities, events, entities, event-observable monitoring. Performance counters for monitoring the operation of the simulator.
  • Statistics package. Large collection of random number generators and event generators based on them. Basic stats, clouds, histograms, etc. Outputs in several formats, most importantly to scpl (and to Excel).
  • Simple Node, Message, Network and Router package. Allows you to configure node and message failure rates and message delays, all based on distribution generators from the Stats package.
  • Epidemics. Two demos for the basic simulator based on gossip communication and an epidemic failure detector.
  • Transaction simulation package. Derived from the Simple Net package to include operations, nodes and message handling common to transaction simulations. Includes extensive log writer and reader facilities for offline simulation analysis.
  • Two Phase Commit Simulation. Ported from code that Jim wrote.
  • Timeline User control. This is a WinForms user control that can be used to visualize a distributed interaction. Those of you who have studied distributed systems textbooks have seen these timelines before. The line representing a process changes color as its state changes, and arrows represent message flow. The control can display all events at once, step through them event by event, or animate the event flow. A lot more work is needed to make this work well.
  • Simulation Explorer. A demo app that brings reading a simulation, stats processing and timeline display together.
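
To give a flavor of how the first few pieces in the list above fit together, here is a minimal sketch, written in Python rather than the C# of the actual package. It is not the package's code: every name in it (Simulator, run_transaction, loss_rate and so on) is made up for illustration, a plain binary heap stands in for the Calendar Queue, and only message loss is modeled, not node crashes or store operations. The coordinator sends a prepare message to each resource manager, each manager votes yes, and the coordinator commits when all votes are in or aborts when its timer fires first.

```python
import heapq
import random


class Simulator:
    """Tiny discrete-event core: a clock plus a time-ordered event queue."""

    def __init__(self):
        self.now = 0.0
        self._queue = []   # entries are (time, sequence, callback)
        self._seq = 0      # tie-breaker so simultaneous events stay FIFO

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback()


def run_transaction(rng, n_managers=2, loss_rate=0.1, mean_delay=1.0, timeout=10.0):
    """Simulate one 2PC voting round; return (outcome, decision_time)."""
    sim = Simulator()
    state = {"votes": 0, "outcome": None, "time": None}

    def decide(outcome):
        if state["outcome"] is None:   # the first decision wins
            state["outcome"] = outcome
            state["time"] = sim.now

    def send(handler):
        # Drop the message with probability loss_rate, otherwise deliver
        # it after an exponentially distributed delay.
        if rng.random() >= loss_rate:
            sim.schedule(rng.expovariate(1.0 / mean_delay), handler)

    def on_vote():
        state["votes"] += 1
        if state["votes"] == n_managers:
            decide("commit")

    def on_prepare():
        send(on_vote)   # every manager votes yes in this sketch

    for _ in range(n_managers):
        send(on_prepare)
    sim.schedule(timeout, lambda: decide("abort"))   # coordinator timer
    sim.run()
    return state["outcome"], state["time"]


# Hypothetical usage: run a batch and count the aborted transactions.
rng = random.Random(42)
results = [run_transaction(rng) for _ in range(1000)]
aborted = sum(1 for outcome, _ in results if outcome == "abort")
print(f"{aborted} of {len(results)} transactions aborted")
```

Passing the random generator in explicitly keeps a run repeatable for a given seed, which matters once you want to replay one particular aborted transaction in a timeline view.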

Here is a Windows Media movie with the simulation explorer in action (280K).

Below are some screenshots from the demo app; click on the thumbnails to see the full images. The simulation explorer holds a datagrid with the transactions and the related events. It has scpl graphs for the commit and abort times and a time progress plot. The timeline control shows the visualization of the current transaction. The timeline for each process changes with its state. A gray bar above the timeline indicates a store operation. If the timeline disappears, the node has crashed. Text near the timeline indicates the firing of a particular timer. I need to find icons to go with these events. You can filter on aborted transactions, message loss and transactions with node failures.

This is the basic view with the beginning of a normal 2PC transaction in the timeline.

This is the same transaction scrolled to the end of the timeline,
where the commit happens. (The gray bars are store operations.)

Here the transactions are filtered to show only the aborted transactions,
and the timeline displays the end of a transaction where it is aborted
because the second resource manager failed.

[1] The Paxos consensus protocol is described in the paper "The Part-Time Parliament" by Leslie Lamport.

Posted by Werner Vogels at March 25, 2004 02:27 PM