Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults
Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) 2009.
areas
Distributed Systems
abstract
This paper argues for a new approach to building Byzantine fault tolerant replication systems. We observe that
although recently developed BFT state machine replication protocols are quite fast, they don’t tolerate Byzantine
faults very well: a single faulty client or server is capable of rendering PBFT, Q/U, HQ, and Zyzzyva virtually
unusable. In this paper, we (1) demonstrate that existing protocols are dangerously fragile, (2) define a set of
principles for constructing BFT services that remain useful even when Byzantine faults occur, and (3) apply these
principles to construct a new protocol, Aardvark. Aardvark can achieve peak performance within 40% of that of
the best existing protocol in our tests and provide a significant fraction of that performance when up to f servers
and any number of clients are faulty. We observe useful throughputs between 11706 and 38667 requests per
second for a broad range of injected faults.