Everywhere Availability, Scalability and Cost-Efficiency: Challenges for Web-based Services

Pei Cao

Computer Sciences Department

University of Wisconsin-Madison

1. Web-based Services

The ubiquitous presence of the Internet and the uniform interface of the Web are fundamentally changing people's perception and use of computers and information technology. Computers are no longer just for particular tasks; increasingly, we rely on computers as access points to the Internet and rely on that access in our daily lives. We use the Web as an information resource and a library to find answers to our questions; we use the Web to initiate financial transactions, compare prices and hunt for bargains; we use the Web to check flight information, book reservations and arrange all aspects of our travel. The Internet and Web interface provide a channel for content and service providers to reach their potential customers. The providers' utilization of this channel further fuels customers' reliance on web-based access to content and services.

Web-based services are a new scenario of distributed computing. Not only do we have potentially hundreds of millions of users, we also have tens of millions of servers (i.e. content providers and service providers). Users and servers are connected through the Internet, which is unreliable and has very limited bandwidth in many places. Servers vary in their resources. Some, such as companies, can afford powerful fault-tolerant machines and high bandwidth connections to the Internet; some, such as academic institutions, can only afford a few workstations to run a Web server and low bandwidth connections to the Internet. Yet, users count on being able to access any server at any time.

People's increasing reliance on Web-based services poses three requirements on the services: Everywhere Availability, Scalability and cost-Efficiency (EASE).

By everywhere availability, we mean that users can count on being able to access any web server anywhere, anytime. Such availability is regardless of the location of the user, the time of the access, the load at the server and the load on the inter-connecting network. Availability does not mean strict fault tolerance: for many services, interruptions of the service lasting under a few seconds are tolerable. However, availability does mean that services are provided with low delay --- long waiting times for web-based services easily cancel out any productivity gain they bring.

By scalability, we mean that servers should be able to provide services to all potential customers, whose number is fast growing and approaching hundreds of millions.

By cost-efficiency, we mean that availability and scalability should be achieved with the lowest possible cost to the server and the network. There are many content and service providers in the world that have very limited resources. For some, the constraint is the processing power of the server machine. For some, the constraint is the bandwidth of the network connecting the server to the Internet. Yet for some, the bottleneck is the bandwidth of the network connecting the server to the users. For example, servers on the other side of the Atlantic Ocean have very limited bandwidth to users inside the United States. If availability and scalability require high-bandwidth networks or expensive hardware, the scope and usefulness of Web-based services will be severely limited. Thus, solutions need to be found such that resource-constrained servers are reachable by all interested users everywhere.

The current state of Web-based services falls far short of the goal of EASE. Users often have to wait long for responses from resource-limited servers. Connections frequently time out when a server is overloaded. Existing solutions to ensure availability in overload situations (for example, the Deep Blue match) require multiple mirrored sites, high-end server machines, and multiple high-bandwidth connections to the Internet. Clearly, the majority of content and service providers cannot afford such solutions.

Providing everywhere availability and scalability with cost efficiency can significantly increase the scope and dependability of web-based services, and consequently increase the role Internet technology plays in people's lives. The difficulty, however, lies in the fact that scalability and availability need to be provided over a network whose capacity grows very slowly, and for servers whose resources are inherently limited. Thus, it is a challenge to the systems and network research community to design infrastructures and mechanisms to provide EASE for Web-based services.

Achieving the goal requires advances in many fields. In the following section, we look at how we can take advantage of the emerging user-proxy-server paradigm of web accesses for low-cost service replication and distribution.

2. User-Proxy-Server

One growing trend in Internet and Web access technology is the deployment of proxy servers (also called proxies) at institutions and Internet Service Providers (ISPs). Originally introduced as a way to channel Web access through firewalls, proxy servers intercept Web requests made by users and handle the connections to Web servers. Proxy servers are often built for a particular user population (e.g. employees of a company or subscribers to an ISP) and are supported by the user population, either directly through subscriber fees or indirectly through company income. Located at gateways and network access points, they tend to be more resource-rich than average Web servers. Today, proxies are expanding their roles to include document caching, content filtering and conversion, and access auditing.

Proxies change the communication paradigm of Web-based services from user-server to user-proxy-server. Instead of the many to many communication between users and servers, the bulk of the communication now flows many to one between users and proxies, then many to many between proxies and servers. Users can still bypass proxies to access the Internet in the case of proxy failure, but most of the time the communication is intercepted and controlled by proxies.

Thus, proxies provide a natural way for servers to migrate their functionalities and move ``closer'' to users. For users, proxies are the connection points to the Internet. Therefore, complete replication of all Web server functionalities at all proxies would indeed make Web-based services scalable and everywhere available, as long as users can always connect to a proxy. Total replication, however, can easily overwhelm the proxies.

What is needed is an infrastructure for servers to distribute their processing to proxies, and for proxies to decide when to perform local processing on servers' behalf and when to forward requests directly to servers. In particular, the following issues need to be addressed:

1) A scheme for proxies to cache not only contents, but also ``functionality'' from a server.

Currently, most proxies only cache static documents from content providers. This simple form of off-loading the processing of servers has already demonstrated significant gains: server load and network traffic can be reduced by nearly half. Passive document caching, however, does not benefit service providers, which feature queries, dynamic contents, and transactions. Thus, what is needed is active caching: a scheme to allow proxies to cache ``partial functionalities'', i.e. code and portions of databases, from a server and perform local processing for requests directed to the server. The functionality cached at the proxies need not be complete; proxies can communicate with the server to complete processing of a request. By allowing part of the processing to be done at proxies, servers can significantly increase their availability and scalability.
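To make the idea of active caching concrete, the following is a minimal sketch in Python, not a proposed design. It assumes a hypothetical price-lookup service: the names OriginServer, ActiveCachingProxy, lookup and handle are invented for illustration. The proxy holds a portion of the server's database together with the lookup code, answers locally when the cached portion suffices, and contacts the server otherwise.

```python
from typing import Dict, Optional


class OriginServer:
    """Hypothetical service provider that owns the full price table."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self.prices = prices
        self.requests_served = 0  # counts requests that reached the server

    def lookup(self, item: str) -> Optional[float]:
        self.requests_served += 1
        return self.prices.get(item)


class ActiveCachingProxy:
    """Caches the lookup code plus a portion of the server's database."""

    def __init__(self, server: OriginServer, cached_items: Dict[str, float]) -> None:
        self.server = server
        self.partial_db = dict(cached_items)  # incomplete by design

    def handle(self, item: str) -> Optional[float]:
        # Process locally when the cached portion is sufficient.
        if item in self.partial_db:
            return self.partial_db[item]
        # Otherwise complete the request by contacting the server.
        return self.server.lookup(item)
```

In this sketch every request answered from partial_db is one the server never sees, which is the off-loading effect the paragraph above describes.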

Along with the caching mechanism, there needs to be a protocol by which the server can trust the proxy not to disclose the cached functionalities to third parties, and trust the proxy to perform the execution of these functionalities correctly. There also needs to be a protocol by which the proxy can specify what failure semantics it can provide to cached functionalities from a server. This is particularly important for transaction-oriented servers. If proxies can preserve partial results across failures, a server can let user requests be processed and saved at the proxy for the period when it is unreachable due to network or server congestion.
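One way a proxy might hold requests while a server is unreachable is a store-and-forward queue. The sketch below is illustrative only: StoreAndForwardProxy, ServerUnreachable and the send callback are hypothetical names, and the in-memory queue stands in for the stable storage a real proxy would need in order to preserve requests across its own failures.

```python
from collections import deque
from typing import Callable, Deque, List


class ServerUnreachable(Exception):
    """Raised by the transport when the server cannot be contacted."""


class StoreAndForwardProxy:
    def __init__(self, send: Callable[[str], str]) -> None:
        self.send = send                    # hypothetical transport to the server
        self.pending: Deque[str] = deque()  # assumption: would be persisted on disk

    def submit(self, request: str) -> str:
        """Forward the request, or save it if the server is unreachable."""
        try:
            return self.send(request)
        except ServerUnreachable:
            self.pending.append(request)
            return "queued"

    def flush(self) -> List[str]:
        """Replay saved requests in arrival order; stop at the first failure."""
        results: List[str] = []
        while self.pending:
            try:
                results.append(self.send(self.pending[0]))
            except ServerUnreachable:
                break
            self.pending.popleft()  # drop only after a successful send
        return results
```

A request is removed from the queue only after the server has accepted it, so a failure in the middle of flush leaves the remaining requests intact for a later retry.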

2) A mechanism by which a server can ensure the consistency and control the execution of its cached functionalities at proxies.

Merely having a caching mechanism is not enough. There must also be a way for servers to control such caching. A server needs to make sure that the information cached at the proxy and the code executed at the proxy are up to date and consistent with those at the server. Stale information can do more harm than good. A server should also be able to revoke the caching of its functionalities if necessary. Since the number of proxies is potentially on the order of hundreds of thousands, it is important to provide a mechanism by which a resource-limited server can control its cached parts at all proxies.
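Leases are one possible way for a resource-limited server to keep control over many proxies, and the sketch below assumes that approach; it is not a mechanism specified in this paper. The names Server, CachedFunctionality, validate and usable are hypothetical. A proxy may use its cached copy without contacting the server until the lease expires, and must then revalidate; the server refuses revalidation if the copy is stale or the proxy has been revoked.

```python
from typing import Set


class Server:
    def __init__(self) -> None:
        self.version = 1               # bumped whenever code or data changes
        self.revoked: Set[str] = set() # proxies no longer allowed to cache

    def update(self) -> None:
        self.version += 1

    def revoke(self, proxy_id: str) -> None:
        self.revoked.add(proxy_id)

    def validate(self, proxy_id: str, cached_version: int) -> bool:
        return proxy_id not in self.revoked and cached_version == self.version


class CachedFunctionality:
    def __init__(self, proxy_id: str, server: Server, lease_seconds: float) -> None:
        self.proxy_id = proxy_id
        self.server = server
        self.lease_seconds = lease_seconds
        self.version = server.version  # version fetched when cached
        self.lease_expires = 0.0       # forces validation on first use

    def usable(self, now: float) -> bool:
        if now < self.lease_expires:
            return True                # inside the lease: no server contact
        if self.server.validate(self.proxy_id, self.version):
            self.lease_expires = now + self.lease_seconds
            return True
        return False                   # stale or revoked: must refetch
```

The lease length is the trade-off in this sketch: the server handles at most one validation per proxy per lease period, but an update or revocation takes effect at a proxy only once its current lease has run out.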

3) Management of resources for proxies.

To a proxy, the question is whether to cache the functionality from a server, and upon a particular request, whether to execute the code on the server's behalf, or direct the request to the server. The proxy has to estimate the resources that would be consumed by the local processing, and weigh the gain (reduction in access latency seen by the user) against the loss (consuming resources that could be used to better service other users). Research is needed in compilers and operating systems to predict the resource needs, in terms of CPU cycles, memory footprint, and I/O bandwidth, of the execution of a cached server functionality. In fact, existing OS technologies are quite inadequate at providing the necessary information even after a job is finished, let alone beforehand. Research is also needed to enable proxies to estimate the condition of the network connection and expected latency to a particular server.

In making a decision using these estimates, the proxy needs to balance several considerations. On the one hand, it should consider the resources of a server, and the degree to which a server relies on the proxy's local processing for availability and scalability. On the other hand, the proxy services a collection of users and needs to consider the overall services it provides to the users. How to balance these considerations, and yet still provide availability and scalability for resource-limited servers to the maximum degree possible, remains an open question.
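The decision the proxy faces can be sketched as a simple comparison, assuming the estimates discussed above are available. Everything here is a hypothetical illustration: the Estimate fields, the 0.8 load limit, and the rule that local time inflates by 1 / (1 - load) are assumptions chosen for simplicity, not results from this paper. How to weigh a server's dependence on the proxy is left out, since that remains an open question.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    local_cpu_seconds: float          # predicted CPU time if run at the proxy
    server_round_trip_seconds: float  # predicted latency if forwarded
    proxy_load: float                 # current proxy utilization in [0, 1)


def should_execute_locally(e: Estimate, load_limit: float = 0.8) -> bool:
    """Return True if local execution is expected to help the user
    without unduly hurting the proxy's other users.

    Assumes load_limit <= 1 so the division below is always defined.
    """
    # Loss: a busy proxy protects its other users by forwarding.
    if e.proxy_load >= load_limit:
        return False
    # Gain: compare forwarding latency with local time, inflated by load.
    expected_local = e.local_cpu_seconds / (1.0 - e.proxy_load)
    return expected_local < e.server_round_trip_seconds
```

For example, a request costing 0.05 CPU seconds on a lightly loaded proxy beats a 0.5 second round trip, while the same request is forwarded once the proxy's load reaches the limit.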

One difficulty in conducting research for Web-based services is that it is hard to reflect large-scale network dynamics in a limited experimental testbed. Because experiments on the Internet can be disruptive, it is often desirable to test systems on an isolated local area network. However, conclusions drawn on an experimental testbed may be inaccurate for the Internet. Research is needed to distinguish which aspects of system behavior (CPU utilization, memory consumption due to state information, I/O latency, etc.) are affected by which aspects of network dynamics (bandwidth, message delay, number of concurrent network connections, server load, etc.). We can then try to capture different aspects of network dynamics in a testbed, and run many experiments to predict a system's performance on the Internet with confidence. For this purpose, the collection and analysis of large-scale network traces, observation of the behavior of large user populations, and benchmark design are all useful tools.

3. Summary

Web-based services have the potential to fundamentally change the way people use information technology and significantly boost the productivity gains from the technology. For them to be widespread and useful, web-based services need to have EASE: everywhere availability, scalability and cost-efficiency. We have explored ways in which the current user-proxy-server paradigm can be used for service distribution and replication toward the goal of EASE. We believe that it is important for the research community to observe, influence, and move the infrastructure for Web-based services toward a more scalable and cost-efficient architecture.


