Simon S. Lam (more publications)
Protocol Design for Peer-to-Peer Networks
Several proposed peer-to-peer networks use hypercube routing for scalability. Consistency of neighbor tables in hypercube routing guarantees the existence of a path from any source node to any destination node. Consistency, however, can be broken by the failure of one node. To improve the robustness of hypercube routing, we generalized the concept of consistency to K-consistency, for K ≥ 1. We showed that a K-consistent hypercube routing network provides at least K disjoint paths from any source node to any destination node with a probability close to 1. We discovered a semantic structure, called C-set tree, for protocol design and reasoning about K-consistency. We then designed and specified a new join protocol together with a proof that it generates K-consistent neighbor tables for an arbitrary number of concurrent joins (under the assumption that there is no concurrent leave or failure).
To investigate the question of how high a rate of node dynamics can be supported by P2P networks based upon hypercube routing, we designed and evaluated a failure recovery protocol based upon local information for K-consistent networks. The failure recovery protocol was then integrated with the join protocol using the Lam-Shankar approach for module composition. The integrated protocols were evaluated by a set of simulation experiments in which nodes joined a 2000-node network and nodes (both old and new) were randomly selected to fail concurrently over 10,000 simulated seconds. In each such churn experiment, we took a snapshot of neighbor tables in the network once every 50 seconds and evaluated connectivity and consistency measures over time as a function of the churn rate, timeout value in failure recovery, and K. Storage and communication overheads were also evaluated. We found our protocols to be effective, efficient, and stable for an average node lifetime as low as 8.3 minutes (the median lifetime measured for Napster and Gnutella is shown in a previous paper [Sariou 2002] to be 60 minutes).
We found that for K-consistency networks, where K was 2 or 3, with a timeout value of 5 seconds in our failure recovery protocol, the minimum average lifetime decreases as the number of nodes n increases for n > 500. At the same time, the maximum churn rate increases more than linearly as n increases for n > 500. This trend suggests that the ability of our protocols to sustain churn is scalable in network size.