John Carter
University of Utah
ABSTRACT
Many of the systems research problems of yesterday are the commercial realities of today: client-server computing, distributed file systems, the Internet, secure distributed communication, fault tolerance, clustered databases and filesystems, distributed object-based computing, and much more. The key question currently facing systems researchers is in what areas the research community can and should concentrate its effort, and in what areas industry is likely to dominate future development. An area that I believe is a critical foundation for all distributed systems efforts, both academic and industrial, is that of MANAGING DISTRIBUTED SHARED STATE. The systems community has significant experience that could be applied to this problem by exploiting the results of work from several disparate areas of research: distributed shared memory (DSM), software fault tolerance, reliable multicast protocols, distributed and clustered file systems, and distributed databases (to name a few). In this position paper, I motivate why state sharing in large, loosely-coupled distributed systems should be treated as a grand challenge for systems researchers, outline the problems that need to be overcome, and suggest a concrete testbed that should be developed to solve this grand challenge problem.
GRAND CHALLENGE PROBLEM STATEMENT Essentially all distributed systems applications at some level boil down to the problem of managing distributed shared state. Consider the following commercial application areas:
* Distributed file systems (AFS, NFS, client-server NTFS, ...) * Clustered file systems (DEC, Tandem, Microsoft, ...) * Distributed directory services (Novell's NDS, Microsoft's AD, ...) * Distributed databases (Oracle, Sybase, SQL Server, ...) * Distributed object systems (DCOM, CORBA, ...) * Collaborative groupware (e.g., Lotus Notes) * Internet commerce * The World Wide Web
All of these services, and many more, all perform essentially the same function, albeit in very different settings. That function is MANAGING DISTRIBUTED SHARED STATE and providing a convenient way for users and other applications to access, update, and delete the information being managed. Unfortunately, while the problem of managing distributed shared state is clearly shared by all of the above applications, there is no common means of managing the data, and in fact every system `rolls their own' solution. As the impact of the Internet and the ubiquity of networking in most areas of life increases, and as disparate distributed systems become increasingly integrated, it would significantly improve the underlying foundation of the Internet if there were a convenient way for a wide class of applications to SHARE DATA in a CONVENIENT, SECURE, and ROBUST way.
The thesis of this position statement is that IT SHOULD BE POSSIBLE FOR DISTANT CLIENTS AND SERVERS TO SHARE STATE (anything from web pages to database records to real time images being exchanged by a teleconferencing system) without each such instance of sharing requiring specially written code. Specifically, IT SHOULD BE NO MORE DIFFICULT FOR DISTANT CLIENTS AND SERVERS TO SHARE STATE THAN IT IS FOR THEM TO COMMUNICATE. Just as the common TCP/IP networking framework hides many complex and confusing issues such as handling link failures, routing, congestion, lost packets, and the like, there should be support for distributed state sharing that lets most applications remain oblivious to the many problems associated with managing shared state, such as heterogeneity, security, high availability, caching strategies, and coherence management. Unfortunately, it is very much *not* the case at this time that state sharing is easy, and without the help of the systems research community, it is unlikely to ever be the case.
Consider three examples of distributed systems with loosely-coupled producers and consumers of distributed state:
* a collection of designers and engineers from around the country collaborating through WebNFS (an internet-wide file system from SUN) to design the world's next great widget,
* a parallel computation being executed across the Internet using a distributed shared memory engine to share state, and
* a web-based commerce system that allows a consumer to safely buy a widget by requiring a wide variety of independent connected systems (e.g., data repositories at the client, the client's bank, the merchant, the merchant's bank, the merchant's supplier, the supplier's bank, ...).
Each of these systems can be built using today's technology. However, each requires an independent set of incompatible protocols for exchanging, caching, and managing shared state. In general, the state of the art in distributed state management is that every distributed system rolls its own solution. This limits interoperability, hurts performance, makes it hard to write new applications, and represents a tremendous amount of wasted (duplicated) effort.
PROPOSED SOLUTION
What is needed, in my opinion, is a COMMON INFRASTRUCTURE FOR WIDE AREA STATE SHARING, akin to the common TCP/IP infrastructure that underlies the Internet. This infrastructure would provide consumers with the ability to acquire coherent copies of requested data, without (necessarily) specifying the location(s) of the data, the best protocol for accessing the data, etc. Consumers would simply name the state that they desire and request that they be given a copy of the data, perhaps also specifying that they want to have it in an exclusive mode because they intend to modify it. The state sharing infrastructure would be responsible for such providing support for such issues as COHERENCE, HIGH AVAILABILITY, HETEROGENEITY, and SECURITY, depending on the needs of the consumers.
Currently, wide area state sharing is implemented in one of four ways: * PULL-BASED with NO CACHING: In this model, consumers contact the data producers directly every time they need some datum, e.g., a database client that querying a database server that is the sole repository for some database.
* PULL-BASED with SERVER-CENTRIC CACHING: In this model, consumers or `proxies' may cache recently pulled data, thereby allowing the request to be serviced in the local cache. These caches, however, are either managed via callbacks from the producer, or not kept coherent. This model is dominant for most distributed systems, including the WWW.
* PULL-BASED with DISTRIBUTED CACHING: In this model, there is no notion of a fixed `server' for data. When a consumer needs a piece of data, it is served from any machine that has a copy. Consumers that cache data cooperate to keep it coherent to the degree coherence is required. This model is prevalent in software DSM systems.
* PUSH-BASED: The hottest model in industry, the push-based model makes the producers of information responsible for forwarding this information to interested consumers. It has been advocated by many companies for content delivery on the internet (e.g., Pointcast, Marimba, Netscape, and Microsoft).
In my opinion, the common support mechanism for wide area state sharing should be a combination of pull- and push-based data distribution, combined with an extensive caching infrastructure. SUN CEO Scott McNealy has been quoted as saying that caching is absolutely essentially for the internet to avoid melting down due to its own popularity. I agree wholeheartedly!
Most often, consumers need to collect data on demand. Unfortunately, the current server-centric model of both the web and most distributed services (i.e., client-server systems) has severe scalability problems. These inherently pull-based systems have enormous bandwidth requirements, and servers are increasingly incapable of handling the load being placed on them as the number of clients and the variety of services these clients need explodes exponentially. Both of these factors make the decentralized data dissemination mechanism of the `pull-based with distributed caching' model attractive. This indicates that techniques developed for executing shared-memory parallel programs on tightly-coupled systems using DSM might be useable in the wide-area state sharing infrastructure I propose. Often producers know in advance that a set of consumers wants/needs all new data as it is generated, and in this scenario the push-based model fits most naturally.
PROPOSED RESEARCH AGENDA
I propose the following grand challenge problem. The systems research community should develop a testbed that allows a wide variety of distributed services and applications to be built on top of a common coherent data sharing infrastructure. This will require input from a number of major research areas, including:
* DISTRIBUTED SHARED MEMORY: As discussed above, the decentralized nature of software DSM's caching policies is likely to be critical if we are to build a SCALABLE shared storage infrastructure. Conventional DSM sytems, however, are poorly suited for building distributed applications, as they ignore critical issues such as heterogeneity, high availability, security, interoperability, and interoperability between multiple independent applications.
* DISTRIBUTED FILE SERVERS and DATABASES: Both the distributed file systems and distributed databases communities have addressed to varying degrees the problems of ensuring that shared state is kept consistent in the face of concurrent access, partial system failures, and insecure links. This experience will be invaluable.
* DISTRIBUTED OBJECT SYSTEMS: The DOS community has spent considerable effort solving the problems of resource location, distributed garbage collection, and coherence. These issues are all central to the grand challenge problem outlined above.
* MULTICAST PROTOCOLS: Keeping a distributed shared state coherent efficiently requires a potentially large number of entities to agree on the values of the shared state. A scalable multicast mechanism for passing along updates (for push-based `broadcasting') or invalidations (for invalidating caches) will likely be needed, and this community has much to offer.
* SOFTWARE FAULT TOLERANCE: If a common distributed shared state infrastructure is to succeed, it cannot fail as soon as any node or small subset of nodes fails. For these reasons, it will need to incorporate sophisticated mechanisms for duplexing and repairing state so that it can tolerate failures.
* DISTRIBUTED SYSTEM SECURITY: Finally, if such a system is to be used across an insecure Internet, it must be possible to ensure that shared state is not misused or corrupted by untrusted third parties.
I am sure that many more issues will arise in any discussion(s) we have concerning the proposed grand challenge problem outline above, but I believe that this is a solid starting point. It will require the input of researchers from a wide variety of areas, but if we succeed, it could well revolutionize the way complex distributed systems are built in the future.
CONCRETE SYSTEM GOAL
To ensure that something concrete comes from the effort, I propose that a common testbed be developed and evaluated that is capable of supporting at least three significant uses of distributed shared state across the Internet. Perhaps we could call it the DSS-BONE (Distributed Shared State backBONE). In any event, I propose that this DSS-BONE be demonstrated by supporting:
* a collection of web clients and servers, where web clients can load the desired web pages from any client or server that has a copy of the page,
* an internet-wide file system that allows geographically dispersed individuals to read and write from a shared filesystem without sacrificing coherence or performance (to the extent possible), and
* a distributed directory service built using a collection of shared objects (e.g., DCOM or CORBA objects) that export their state to distributed clients and that cooperate amongst themselves to keep the distributed directory up to date.
Initially, I propose that we just `make it work,' without concerning ourselves too much with performance, security, heterogeneity, or fault tolerance. Then, as we gain experience, we should iteratively add or improve upon these features until we have a working DSS-BONE on top of which a wide variety of heterogenous applications across unstable links and heterogenous platforms.
To encourage broad adoption of the underlying sharing infrastructure, those working towards this goal should endeavor to develop public standards (using the informal IETF standards process) and encourage industry input from those companies most interested in distributed applications and shared state management (e.g., Microsoft, IBM/Lotus, Netscape, Oracle, Novell, and the like).