A Scalable Information Management Middleware for Large Distributed Systems
Information management is one of the key tasks of any large-scale
distributed application. The goal of this dissertation is to design
and build a general and scalable information management middleware for
large distributed systems that will facilitate design, development,
and deployment of distributed applications and that will enable
application developers to explore the tradeoffs between communication
cost, response latency, and consistency.
In this dissertation, we present a Scalable Distributed Information
Management System (SDIMS) that aggregates information about
large-scale networked systems and that can serve as a basic building
block for a broad range of large-scale distributed applications by
providing detailed views of nearby information and summary views of
global information. To serve as a basic building block, an SDIMS
should have four properties: scalability to many machines and data
items, flexibility to accommodate a broad range of applications,
administrative isolation for security and availability, and robustness
to node and network failures.
We design, implement, and evaluate an SDIMS that (1) leverages
Distributed Hash Tables (DHTs) to create scalable aggregation trees,
(2) provides flexibility through a simple API that lets applications
control propagation of reads and writes and through a self-tuning
mechanism that adapts the propagation to observed load in the system,
(3) provides administrative isolation through a novel Autonomous DHT
algorithm, and (4) achieves robustness to node and network
reconfigurations through lazy reaggregation, on-demand reaggregation,
and tunable spatial replication.
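
As a rough illustration of the propagation tradeoff described above, the following toy sketch models a single binary aggregation tree in one process. It is not the SDIMS implementation or its API: the class name ToyAggregationTree, the parameter up, and the message counters are all assumptions introduced for this example, and the tree is a fixed binary hierarchy rather than one derived from a DHT. Writes push aggregates up a configurable number of levels; reads above that level re-aggregate on demand.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ToyAggregationTree:
    """Toy model (hypothetical, not the SDIMS API) of one aggregation tree.

    Leaves are numbered 0 .. 2**depth - 1. The level-l group of a leaf is
    leaf_id >> l, so level 0 is the leaf itself and level `depth` is the root.
    `up` is the number of levels that every write eagerly propagates.
    The aggregation function must compose hierarchically (sum, min, max).
    """

    def __init__(self, depth: int, up: int,
                 aggr: Callable[[List[int]], int] = sum) -> None:
        if not 0 <= up <= depth:
            raise ValueError("up must be between 0 and depth")
        self.depth, self.up, self.aggr = depth, up, aggr
        # (level, group) -> aggregate stored at that virtual node
        self.store: Dict[Tuple[int, int], int] = {}
        self.write_msgs = 0  # crude count of messages caused by writes
        self.read_msgs = 0   # crude count of messages caused by reads

    def update(self, leaf_id: int, value: int) -> None:
        """Store a leaf value and recompute aggregates for levels 1..up."""
        self.store[(0, leaf_id)] = value
        for level in range(1, self.up + 1):
            group = leaf_id >> level
            kids = [(level - 1, 2 * group), (level - 1, 2 * group + 1)]
            parts = [self.store[k] for k in kids if k in self.store]
            self.store[(level, group)] = self.aggr(parts)
            self.write_msgs += 1  # one hop per level propagated

    def probe(self, leaf_id: int, level: int) -> Optional[int]:
        """Return the aggregate of the level-`level` subtree containing leaf_id."""
        group = leaf_id >> level
        if level <= self.up:
            # Already maintained by writes: answered without extra messages.
            return self.store.get((level, group))
        # On-demand re-aggregation: gather the stored aggregates at level
        # `up` from every group beneath the requested subtree.
        span = level - self.up
        first = group << span
        parts = [self.store[(self.up, g)]
                 for g in range(first, first + (1 << span))
                 if (self.up, g) in self.store]
        self.read_msgs += 1 << span
        return self.aggr(parts) if parts else None
```

With up equal to the tree depth, every write travels to the root and reads are answered from stored aggregates; with up set to 0, writes stay local and each global read pays the cost of gathering leaf values. Intermediate settings trade write bandwidth against read latency, which is the kind of choice the abstract says applications can make.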
Through extensive simulations and micro-benchmark experiments on
several real testbeds, we observe that our system is an order of
magnitude more scalable than existing approaches, provides a wide
range of choices for applications to control the propagation of data
to trade off bandwidth cost against response latency, achieves
administrative isolation at the cost of modestly increased
read latency compared to flat DHTs, and gracefully handles
failures. We implement several applications on top of SDIMS, including
a file location system and a multicast system. We also use SDIMS in
two other research efforts in our lab: as a controller for a
distributed file replication system and as an information-gathering
plane in a distributed network monitoring system.
Praveen Yalagandula
Last modified: Tue Aug 16 14:56:33 CDT 2005