B. Liskov and L. Shrira
MIT, Laboratory for Computer Science
The growth of the Internet and the Web is leading to global applications with geographically distributed components. Support for such applications is inadequate at present. For example, the Web provides no consistency guarantees when applications require access to mutable online information stored at multiple locations, nor is there adequate support for applications that must continue to provide possibly degraded service in the presence of failures.
Global applications will be extremely important in the future and their support poses a grand challenge in the sense that their needs must be addressed by systems research. However, to address these needs, it is not sufficient to focus only on operating systems and other low level system components; instead work on application requirements and support for those requirements is needed. Once the needed support has been defined, synergistic research in operating systems, networks, compilers, and storage systems technology can lead to performance improvements.
This paper focuses on research on support for global applications. Such research will lead to advances in two important areas: (1) an improved understanding of what a platform must provide to support the development of robust global applications; and (2) new software technology that makes possible a high-performance and scalable implementation of such a platform. Below we sketch what we think are promising directions to follow in this work.
Basic Assumptions
We are interested in supporting applications that provide service to users based on persistent online information. Application components run on a range of client machines, from powerful workstations to small devices. Persistent information is stored at {\em repositories} as objects, i.e., encapsulated data structures with access provided via method calls. This is a very general model. For example, a file system can be thought of as an object repository that stores just two kinds of objects (files and directories). Another example is a repository that stores web pages; yet another is an object-oriented database system in which users can define new object types, and can provide code that implements these types.
An application carries out transactions consisting of groups of method calls. Transactions are desirable because they work properly in the presence of failures and concurrency (without the need for applications to do explicit synchronization, which is error-prone). By restricting applications to using transactions, we greatly increase the probability that persistent state remains consistent in spite of concurrency and failures.
We assume that there is a large number of repositories and a much larger number of clients. Furthermore clients and repositories may be physically very far from one another. The research needs to address problems raised by large scale and physical distance, both at the level of the computation model, and at the implementation level.
Computation Model
We assume a computation model is needed that is based on transactions. However, we believe the model must both be more powerful than, and also weaker, than what is provided today. The power is required to allow applications to run transactions involving multiple clients and repositories. For example, a repository may only be willing to ship its information to certain trusted clients. Applications running at other clients that need to use the protected information must access it via the trusted client. This requires distributed transactions with subparts that run concurrently; furthermore, distribution might involve downloading of code.
Weaker semantics can reduce the cost of transactions, or allow transactions to work in the presence of failures, while still meeting application needs. For example, an application running a very long-running simulation as a parallel/distributed program running at many clients might make use of an old but consistent snapshot of online information. The simulation might run as a sequence of transactions, committing periodically as a way of taking checkpoints.
The trick with weaker semantics is to provide it in a way that still provides a sound basis for reasoning about applications. Such reasoning is greatly simplified if it is possible to rely on system invariants. The appeal of transactions is that they can ensure invariants: a transaction maps a system from one consistent state to another, and if this is impossible (because of a failure or a conflict with some other transaction), the transaction is forced to abort. Databases typically support weaker transaction semantics, e.g., ``level-2 consistency'', but at the loss of atomicity and consistency; in fact it can be hard to tell exactly what such mechanisms provide since they are defined in terms of particular concurrency control techniques rather than in terms of end-to-end semantics. We believe weak transaction semantics is a research area that is ripe for exploration, leading to better understanding of the tradeoffs between performance gains and semantic consequences.
The computation model additionally needs to support the construction of ``survivable'' applications, i.e., applications that can continue to provide service to users in spite of unmasked failures. We assume that the network and the repositories are highly reliable, but even so failures can occur that cannot be masked. Ways are needed to inform the application of the problem. Then it can continue to provide a possibly degraded service in an application-dependent way. Degraded service might be based on periodic snapshots of information some of which is now missing; weaker transaction semantics can allow the application to replace the missing information with the snapshot. Transactions may not be able to commit immediately, e.g., when modifications are made to objects at a failed repository; thus the model probably needs to include a notion of ``tentative'' commit.
The computation model might best be made available via extensions to a programming language such as Java, or it might be made available via a new programming language.
Implementation Issues
Efficient support for applications in this model requires development of mechanisms for implementing distributed/parallel transactions, and development for implementation techniques for weak transaction semantics. Additionally, we require efficient caching mechanisms for clients, and good replication techniques for repositories.
The caching mechanisms should allow applications to get the benefits that large caches provide, even if they run on very small clients (e.g., hand-held devices). This can be provided by ``cooperative caching'', i.e., multiple nodes cooperate to provide caching to clients. The cooperative caching techniques developed so far have typically not provided fine grained support for objects. An exception is our fragment reconstruction technique, but that technique is designed for use in a local area net, in which it is acceptable for for servers to maintain directories that track the content of client caches. Such a scheme will not work in a large, wide area net.
Furthermore, cooperative caching raises new security issues. Caches might not fully trust one another, either to avoid mistakes (e.g., to modify the cached copy by mistake) or to avoid malicious misuse, such as reading private information. Mechanisms based on encryption and digital signatures must be developed to protect objects cached at mutually distrusting replicas from unauthorized access, yet support fine-grained modifications to those objects.
With respect to replication, we believe that a primary/backup approach is the right one, but that it should be extended to provide better service. In particular, it is desirable for clients to be able to fetch information from backups as well as the primary, or even from ``weak'' replicas, whose state is not necessarily current since they are updated in the background rather than as part of committing a transaction. Weak replicas can be used by clients when the repository (primary and backups) is available as a way of improving performance, and when the repository has failed, as a way of providing service in the presence of unmasked failures.
The key question when clients fetch from nodes other than the primary is how to ensure that commit happens properly. When clients always fetch from the primary, the primary has accurate information about what objects might be stored at the client; when clients fetch elsewhere this will no longer be true. Such a situation would be major problem for some concurrency control schemes, e.g., locking, but we believe an optimistic approach can be extended to handle it well.
We also believe that more work is needed on efficient failover techniques, especially ones that work properly in the presence of weak replicas. The result of a failover is preferentially a replica group composed of only strong replicas, but if this is not possible, a group containing some (or all) weak replicas will be formed. In the former case the effects of all committed transactions survive the failover, so we still have a working repository. In the latter case, the effects of some transactions will not be known after the failover. Thus the repository, while still available, can provide only degraded service. This service is probably not useful for all applications but we believe it is useful for some, especially if we can find a way to bound the differences between the states at the strong and weak replicas.
Summary
To confront the grand challenge of support for reliable global applications, we need to focus on application needs and requirements. We have sketched a promising approach based on improved understanding of requirements for a robust platform for global application support and new software technologies that enable high performance and scalable implementation of such a platform.