Back-to-Basics Weekend Reading: Byzantine Generals
In Reliable Distributed Systems, we need to handle different failure scenarios. Many of those deal with message loss and process failure. However, there is a class of scenarios that deal with malfunctioning processes, which send out conflicting information. The challenge is developing algorithms that can reach an agreement in the presence of these failures.
Lamport described that he was frustrated with the attention that Dijkstra had gotten for describing a computer science problem as the story of dining philosophers. He decided the best way to attract attention to a particular distributed systems problem was to present it in terms of a story; hence, the Byzantine Generals.
Abstractly, the problem can be described in terms of a group of generals of the Byzantine army, who camped with their troops around an enemy city. Communicating only by messenger, the generals were required to agree upon a common battle plan. However, one or more of them may be traitors who would try to confuse the others. The problem is: to find an algorithm that ensures the loyal generals will reach an agreement.
It is shown, using only oral messages, this problem is solvable if and only if more than two-thirds of the generals are loyal. So, a single traitor can confound two loyal generals. With unforgeable written messages, the problem is solvable for any number of generals and possible traitors. This weekend, I will be going back in time and reading three fundamental papers that laid-out the problems, and its first solutions. In the SIFT paper, the problem is first described, the “reaching agreement” paper describes the fundamental 3n+1 processor solution, and the last paper reviews and generalizes the previous results.
Maybe you will enjoy them as well. “SIFT: Design and Analysis of a Fault-Tolerant Computer for Aircraft Control” John H. Wensley, Leslie Lamport, Jack Goldberg, Milton W. Green, Karl N. Levitt, P. M. Melliar-Smith, Robert E. Shostak, Charles B. Weinstock, in Proceedings of the IEEE 66, October 5, 1978
“Reaching Agreement in the Presence of Faults” M. Pease, R. Shostak, and L. Lamport, 1980, J. ACM 27, 2 (April 1980), 228-234.
“The Byzantine Generals Problem”, Lamport, L.; Shostak, R.; Pease, M. (1982), ACM Transactions on Programming Languages and Systems. 4 (3): 382–401. doi:10.1145/357172.357176.