Technical Report Abstracts


TR-04-01

Liu, Huaiyu and Simon Lam. "Consistency-preserving Neighbor Table Optimization for P2P Networks." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-01. May 2004. 19 pages.

Constructing and maintaining consistent neighbor tables and optimizing neighbor tables to improve routing locality are two important issues in p2p networks. In this paper, we address the problem of preserving consistency while optimizing neighbor tables for p2p networks with node dynamics. We present a general strategy: identify a consistent subnet as large as possible and only replace a neighbor with a closer one if both of them belong to the subnet. We realize the general strategy in the context of hypercube routing. First, we present a join protocol that enables the identification of a large consistent subnet with very low cost when new nodes join. Next, we define an optimization rule to constrain neighbor replacements to preserve consistency, and present a set of optimization heuristics to optimize neighbor tables with low cost. The join protocol is then integrated with a failure recovery protocol. By evaluating the protocols through simulation experiments, we found our protocols and optimization heuristics to be effective, efficient, and scalable to a large number of network nodes.

Download


TR-04-02

Liu Wenguo and Daniel P. Miranker. "An extension to SQL92 for Biological Databases." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-02. January 15, 2004. 23 pages.

MoBIoS is a project that aims at inventing a new generation database management system (DBMS) for molecular biological data. This DBMS has a set of features including storage of sequences, retrieval of sequences based on similarity metric, and a query language (mSQL) embodying the semantics of Genomics and Proteomics and allowing for concise expression of Bioinformatics studies. This paper focuses on designing and implementing mSQL. mSQL is based on the standard SPJU relational operators, with extensions in operators to enable splitting sequences into fragments, merging overlapped fragments, grouping fragments that can be merged, interior fragment select, searching, and joining optimized with metric space index. Query optimization rules are given based on these operator extensions. The concept of Sequence View is introduced, which has all the merits of standard database View, and provides for the explicit fragmentation of a sequence into overlapping substrings of a fixed length and the concurrent construction of a metric-space index on the fragments in order to accelerate matching of those fragments. Three sample biological queries (Whole Genome Join, Conserved Primer Pair Discovery, and Electronic PCR) are expressed in mSQL, query plan trees based on mSQL biological operator extensions and Sequence View are discussed. Biological applications show that mSQL is a suitable and efficient declarative querying language for querying primary structure of biological data sets with the potential to be extended for querying secondary structure.

Download


TR-04-03

Li, Xiaozhou, Jayadev Misra, and C. Greg Plaxton. "Concurrent Maintenance of Rings." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-03. February 2004. 34 pages.

A central problem for structured peer-to-peer networks is topology maintenance, that is, how to properly update neighbor variables when membership changes (i.e., nodes join or leave the network, possibly concurrently). In this paper, we consider the maintenance of the ring topology, the basis of several peer-to-peer networks, in the fault-free environment. We design, and prove the correctness of, protocols that maintain a bidirectional ring under both joins and leaves. Our protocols update neighbor variables once a membership change occurs. Using an assertional proof method, we show that, although the ring topology may be tentatively disrupted during membership changes, our protocols eventually restore the ring topology once membership changes subside. Our protocols are simple and our proofs are rigorous and explicit.

Download


TR-04-04

Kokku, Ravi, Upendra Shevade, Nishit Shah, Harrick Vin, and Mike Dahlin. "Adaptive Processor Allocation in Packet Processing Systems." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-04. July 2004. 14 pages.

The functionality of packet processing applications is often partitioned into pipeline stages; these stages are allocated a subset of the multiple processors available in a packet processing system. The workload, and hence the processing requirement, for each pipeline stage fluctuates over time. Adapting processor allocations to pipeline stages at run-time can improve robustness of the system to traffic fluctuations, can reduce processor provisioning requirement of the system, and can conserve energy. In this paper, we present an on-line algorithm for adapting processor allocations while ensuring that the additional delay suffered by packets as a result of adaptation is deterministically bounded. The resulting Processor Allocation Algorithm (PAL) is simple, but it allocates only as many processors to stages as needed to meet packet delay guarantees, accounts for system reconfiguration overheads, and copes with the unpredictability of packet arrival patterns. A key contribution of PAL is its generality; it captures the adaptation opportunities in the system as a finite state automaton (FSA)---the methodology for constructing the FSA can be applied to a variety of application requirements and system configurations. We demonstrate that for a set of trace workloads PAL can reduce processor provisioning level by 30-50%, reduce energy consumption by 60-70% while increasing the average packet processing delay by less than 150milliseconds. We describe our prototype implementation for Intel's IXP2400-based packet processing system.

Download


TR-04-05

Briggs, Willard J., Wenguo Liu, Rui Mao, Weijia Xu, Rui Mao, and Daniel P. Miranker. "SQL Extensions and Database Mechanisms for Managing Biosequences." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-05. . 12 pages.

Biologically effective retrieval and analysis of sequences entails much more than finding matching strings. While identification and storage of biological sequences usually comprises long functional units (e.g. genes, proteins and chromosomes), the analysis and retrieval of those sequences is primarily concerned with finding ordered sets of short matching subsequences (q-grams). This characterization applies both to homology search algorithms, i.e., BLAST searches, as well as a growing toolkit of algorithms in comparative genomics that are tantamount to executing joins on pairs of whole genome sequences (whole genome joins).

To support these two logical views of sequence data, we introduce mSQL, a set of extensions to SQL92. We have implemented mSQL as a component of MoBIoS, the Molecular Biological Information System. We describe the materialization of sets of q-grams as a metric-space index. Such a physical structure provides an access path for indexed-nested loop and merge-style joins, enabling O(mlogn) comparative genomic analysis. We detail the optimization of paged MVP-trees to support a metric for the retrieval of protein sequences. Empirical results demonstrate O(logn) retrieval times for local alignments

Download


TR-04-06

Xu, Weijia, Daniel P. Miranker, Rui Mao, and Shu Wang. "Metric-Space Search of Protein Sequence Databases." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-06. October 31, 2003. 14 pages.

Download


TR-04-07

Martin, Jean-Philippe and Lorenzo Alvisi. "Fast Byzantine Paxos." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-07. February 24, 2004. 11 pages.

We present the first Byzantine Paxos protocol to achieve consensus in just two communication steps in the common case and discuss how it can be used to implement a fast Byzantine fault-tolerant replicated state machine.

Download


TR-04-08

Martin, Jean-Philippe and Lorenzo Alvisi. "A Framework for Dynamic Byzantine Storage." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-08. January 14, 2004. 29 pages.

We present a quorum-based protocol for a Byzantine fault-tolerant storage system that can dynamically adapt its failure threshold and server count, allowing the storage system to be reconfigured in anticipation of possible failures or to replace servers as desired. Our protocol provides confirmable wait-free atomic semantics while tolerating Byzantine failures from the clients or servers. The system can grow without bound to tolerate as many failures as desired. Finally, the protocol is optimal and fast: only the minimal number of servers - 3f+1 - is needed to tolerate any f failures and, in the common case, reads require only one message round-trip.

Download


TR-04-09

Johnson, Gregory S., William R. Mark, and Christopher A. Burns. "The Irregular Z-Buffer and its Application to Shadow Mapping." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-09. .

The classical Z-buffer algorithm samples a scene at regularly spaced points on an image plane. We present an extension of this algorithm called the irregular Z-buffer that permits sampling of the scene at arbitrary points on the image plane. The sample points are stored in a two-dimensional spatial data structure which is queried during rasterization. The irregular Z-buffer can be applied to shadow rendering, where we demonstrate that it eliminates the sampling artifacts previously associated with shadow mapping. We describe the extensions to modern graphics hardware necessary to support efficient operation of the irregular Z-buffer, and simulate the performance on the extended hardware. Our results indicate that the irregular Z-buffer can be used to produce hard shadows that are as accurate as those produced by a ray tracer or shadow volume-based renderer. We also find that shadow mapping on the irregular Z-buffer retains many of the performance and application-development simplicity advantages of standard shadow mapping.

Download


TR-04-10

Yu, Zeyun and Chandrajit Bajaj. "Visualization of Icosahedral Virus Structures from Reconstructed Volumetric Maps." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-10. April 9, 2004. 17 pages.

In this paper we present an automatic algorithm to segment the asymmetric subunits of an icosahedral density map. This approach is readily applicable to the structural analysis of a broad range of macromolecular structures that are reconstructed using the cryo-electron microscopy (cryoEM) technique. Our algorithm includes three major steps: the detection of critical points, the detection of icosahedral symmetry axes, and the segmentation of asymmetric subunits. The experiments on the real molecular density maps reconstructed by cryoEM technique as well as the synthetic maps calculated from the Protein Data Bank (PDB) demonstrate the high-quality performance of our approach.

Download


TR-04-11

Park, Sangmin and Chandrajit Bajaj. "Multi-Dimensional Transfer Function Design." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-11. April 9, 2004. 8 pages.

Direct volume rendering can be accomplished through correct transfer functions that convert data values to visual properties such as transparency and color. Multi-dimensional transfer functions that have additional axes of gradient magnitude and the second derivative as well as intensity, are useful for the visualization of the features that intensity ranges are overlaped, but gradient magnitudes are distinguishable. However, multi-dimensional transfer functions are hard to control by hand, since correct combination of data values of each axis should be searched to find a correct voxel role in terms of boundary by try and error in multi-dimesional space. In other words, some voxels exactly on a boundary should be totally opaque and others supporting the boundary should have proper transparencies based on the distance from the boundary. In this paper, we present how to decide the voxel role that is fixed once a volume dataset is generated and suggest a multi-dimensional transfer function that minizes user interactions through the pre-defined voxel roles.

Download


TR-04-12

McGuire, Tommy. "Correct Implementation of Network Protocols." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-12. April 2004.. 190 pages.

A number of issues combine to make network protocol development significantly more difficult than other areas of computer programming: problems with time, concurrency, and failures; interactions between the network protocol and its environment; and obstacles in developing the protocol over time. In order to address these issues, we introduce the Timed Abstract Protocol notation and the Austin Protocol Compiler. The Timed Abstract Protocol, or TAP, notation is a domain-specific formal language for describing asynchronous message-passing network protocols, with two execution models: an abstract execution model and a concrete execution model. The abstract execution model is suited for protocol design, comprehension, and correctness verification. The concrete execution model is suited for protocol implementation. We show that the two models are equivalent: that a protocol interpreted under the concrete model preserves the intended behavior of the protocol interpreted under the abstract model. The Austin Protocol Compiler, or APC, is a system that transforms a protocol given in the Timed Abstract Protocol notation into executable C code and provides a runtime environment for the protocol. In order to demonstrate the effectiveness of the TAP notation and APC, we present implementations of a secure encryption key exchange protocol, a failure discovery protocol, and a Domain Name System server. While discussing the latter, we examine the performance of the APC implementation and show that it is comparable to two other DNS servers. The combination of the Timed Abstract Protocol notation and the Austin Protocol Compiler addresses the issues of network protocol development by allowing precise and verifiable descriptions of protocols which can be made executable easily, in order both to gain experimental experience and to provide reference implementations.

Download


TR-04-13

Fernholz, Daniel and Vijaya Ramachandran. "Cores and Connectivity in Sparse Random Graphs." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-13. April 30, 2004. 23 pages.

We consider k-connectivity in a sparse random graph with a specified degree sequence. When dealing with sparse random graphs, properties that require connectivity are most appropriately phrased in terms of a giant subgraph that satisfies that property since a sparse random graph is generally not connected. Here a giant subgraph is one that contains a constant fraction of the vertices in the original graph.

We obtain a tight threshold on the existence of a giant k-vertex or k-edge connected subgraph for k>2 in a sparse random graph drawn from G(n,p). Although k-connectivity in G(n,p) has been widely studied in the literature, results for higher connectivity have applied mainly to the case when p > \ln n/n, which makes it likely that the graph is connected, and very little is known about higher connectivity in sparse G(n,p).

The k-core of a graph is the maximal induced subgraph with minimum degree k. The key tool in the derivation of our connectivity results is our recent theorem and proof strategy for the giant k-core threshold in sparse random graphs with a specified degree sequence.

A degree sequence exhibits a power law if the number of vertices of degree i is proportional to 1/i^{\beta}, for a suitable constant \beta. We establish that for every k>2, a random power-law graph has a giant k-core if 2< \beta < 3, and it has no giant 3-core if \beta \geq 3. Thus for \beta \geq 3 our results establish that a random power-law graph has no giant 3-connected subgraph. For 2< \beta < 3 we derive a weaker result, that any (k-1)-separator in the k-core must separate a set of size at most n^c where c<1 depends on \beta.

Finally, the existence of a giant k-core in a random graph is related in an informal way to the probability that the genealogy tree of a certain branching process contains a perfect infinite (k-1)-ary tree. We provide a solution to this latter problem in terms of probability generating functions. This result is tangential to the rest of our paper, but may be of independent interest.

Download


TR-04-14

Chen, Victor Yenwen. "Explicit Construction of Ramsey-type Graphs." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-14. May, 2004. 23 pages.

We survey some recent constructions of Ramsey-type graphs, large graphs with small clique and independent set sizes. Then continuing the works of Grolmusz, we construct matrices over the ring Z mod pq, where p and q are distinct primes, such that the diagonal entries are 0 modulo pq and the off diagonal entries are nonzero modulo pq. These matrices lead to a construction of Ramsey graphs. Our work simplifies Grolmusz's construction while still matching the best known constructive asymptotic bound.

Download


TR-04-15

Low, Tze Meng and Robert A. van de Geijn. "An API for Manipulating Matrices Stored by Blocks." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-15. May 11, 2004.. 16 pages.

We discuss an API that simplifies the specification of a data structure for storing hierarchical matrices (matrices stored recursively by blocks) and the manipulation of such matrices. This work addresses a recent demand for libraries that support such storage of matrices for performance reasons. We believe a move towards such libraries has been slow largely because of the difficulty that has been encountered when implementing them using more traditional coding styles. The impact on ease of coding and performance is demonstrated in examples and experiments. The applicability of the approach for sparse matrices is also discussed.

Download


TR-04-16

Boyed, Dipak Chand. "Analysis of the TRIPS Architecture." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-16. May 2004. 41 pages.

Introducing a new architecture has never been a trouble-free subject for computer architects. Growing complexity and compatibility issues often hinder the progress of new architectures even though the technological trends demand rapid, wholesome changes. TRIPS, is one such novel architecture that targets the problems of wire delays, memory latencies, power consumption and saturated parallelism. TRIPS aims to be a scalable, malleable, dynamically adaptive and non-specialized architecture that supports diverse applications. This thesis analyzes the advantages and disadvantages of the TRIPS architecture and its prototype. Because this thesis is the first to evaluate the TRIPS architecture, it tries to establish a foundation based on which the evaluation and analysis of the TRIPS architecture can be carried out smoothly in the future. The goals of this thesis are three-fold. The thesis provides details of the tools developed to analyze the TRIPS architecture, reports the metrics that provide data on the effects of the features of the TRIPS architecture on the program output, and evaluates the efficiency of the TRIPS model by comparing it with the Alpha 21264 (RISC) machine for a set of common source programs.

Download


TR-04-17

Cook, William and Siddhartha Rai. "Safe Query Objects: Statically-Typed Objects as Remotely-Executable Queries." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-17. May 2004. 14 pages.

When building scalable systems that involve general-purpose computation and persistent data, object-oriented languages and relational databases are often essential components. Yet the impedance mismatch between these technologies has not been completely overcome by existing integration approaches. Call level interfaces like ODBC and JDBC are an unsafe and fragile form of metaprogramming: database queries are constructed at runtime as strings and executed as programs against the database engine. Object/relational mapping and persistent object systems do not support query shipping, in which complex queries are sent to the database for execution. This paper presents Safe Query Objects, an integrated approach to impedance mismatch that allows query behavior to be defined using statically-typed objects and methods. In addition, safe query objects support query shipping by automatically generating code to execute queries remotely in a relational database. A concrete implementation of this solution is presented using the OpenJava macro language and Java Data Objects.

Download


TR-04-19

Gautam, Aashin. "Stamping Out Spam." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-19. May 10, 2004. 21 pages.

We describe a new service of spam control in this paper. This service starts by providing some k e-pennies to each user of a mail server enabling this user to send k emails. Every time an email is sent or received, the balance of e-pennies in the users account is decremented or incremented respectively. Extra e-pennies can be attained from participating e-banks, which allow users to buy or sell e-pennies.

Download


TR-04-20

Bajaj, Chandrajit, Peter Djeu, Vinay Siddavanahalli, and Anthony Thane. "Interactive visual exploration of large flexible multi-component molecular complexes." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-20. December 2003. 10 pages.

While molecular visualization software has advanced over the years, today, most tools still operate on individual molecular structures with little facility to manipulate large multi-component complexes, or depict molecular flexibility. We extend and accelerate 3D image-based rendering via programmable graphics units to provide an order of magnitude speedup over traditional triangle-based rendering. Additionally, by employing a biochemically-sensitive level-of-detail hierarchy, we communicate appropriate molecular volume occupancy and shape while dramatically reducing the visual clutter that normally inhibits higher-level spatial comprehension. The hierarchical, image based rendering also allows dynamically-computed molecular properties data (e.g. electrostatics potential) to be mapped onto the molecular surface, tying molecular structure to molecular function. The collection of these techniques have been encapsulated in an interactive molecular exploration tool we call TexMol (short for Texture Molecular viewer), which supports simultaneous volumetric and structural rendering and synchronized multi-viewing.

Download


TR-04-21

Li, Xiaozhou, Jayadev Misra, C. Greg Plaxton, and Anthony Thane. "Active and Concurrent Maintenance of a Structured Peer-to-Peer Network Topology." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-21. May 2004. 25 pages.

A central problem for structured peer-to-peer networks is topology maintenance, that is, how to properly update neighbor variables when nodes join and leave the network, possibly concurrently. In this paper, we present a protocol that maintains Ranch, a structured peer-to-peer network topology consisting of multiple rings. The protocol handles both joins and leaves concurrently and actively (i.e., neighbor variables are updated once a join or a leave occurs). We use an assertional method to prove the correctness of the protocol, that is, we first come up with a global invariant and then show that every action of the protocol preserves the invariant. The protocol is simple and the proof is rigorous and explicit.

Download


TR-04-22

Dhillon, Inderjit S., Suvrit Sra, and Joel A. Tropp. "Triangle Fixing Algorithms for the Metric Nearness Problem." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-22. June 27, 2004. 15 pages.

Numerous problems in machine learning, data mining, databases and statistics involve pairwise dissimilarities amongst a set of objects. These dissimilarities can be represented as edge weights in a complete graph with the objects as the vertices. Often one desires the dissimilarities to satisfy the properties of a metric---especially the triangle inequality. Applications where metric data is important include clustering, metric based indexing, classification, query processing, and approximation algorithms. In this paper we present algorithms for solving the Metric Nearness Problem: Given a non-metric graph (dissimilarity matrix), find the ``nearest'' metric graph whose edge weights satisfy the triangle inequalities. This paper presents algorithms that exploit the innate structure of the problem for solving it efficiently for nearness in Lp norms. Empirically, the algorithms have time and storage requirements linear in the number of triangle constraints. The methods are easily parallelizable enabling the solution of large problems.

Download


TR-04-24

Banerjee, A., I. Dhillon, J. Ghosh, S. Merugu, and D. Modha. "A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-24. September 2004. 10 pages.

Co-clustering, or simultaneous clustering of the rows and columns of two-dimensional data matrices, is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. An information-theoretic approach that is applicable when the data matrix can be interpreted as a two-dimensional empirical joint probability distribution, was recently proposed. However, in many situations, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein (i) loss functions corresponding to all Bregman divergences, which include squared Euclidean distance and KL-divergence as special cases, can be used, thereby making it applicable to a wide range of data matrices, (ii) various conditional expectation based constraints can be considered based on the statistics that need to be preserved, thereby giving rise to different parametric co-clustering models, and (iii) the maximum entropy principle is generalized to the minimum Bregman information principle to provide a natural model selection technique. The analysis yields an elegant meta algorithm that is guaranteed to achieve local optimality. Our methodology encompasses a vast majority of previously known clustering and co-clustering algorithms based on alternate minimization. We provide examples and empirical evidence to establish the generality and efficacy of the proposed co-clustering framework.

Download


TR-04-25

Dhillon, Inderjit S., Yuqiang Guan, and Brian Kulis. "A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-25. February 18, 2005. 22 pages.

Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning objective include ratio cut, normalized cut and ratio association. Our equivalence has important consequences: the weighted kernel k-means algorithm may be used to directly optimize the graph partitioning objectives, and conversely, spectral methods may be used to optimize the weighted kernel k-means objective. Hence, in cases where eigenvector computation is prohibitive, we eliminate the need for any eigenvector computation for graph partitioning. Moreover, we show that the Kernighan-Lin objective can also be incorporated into our framework, leading to an incremental weighted kernel k-means algorithm for local optim ization of the objective. We further discuss the issue of convergence of weighted kernel k-means for an arbitrary graph affinity matrix and provide a number of experimental results. These results show that non-spectral methods for graph partitioning are as effective as spectral methods and can be used for problems such as image segmentation in addition to data clustering.

Download


TR-04-26

Liu, Alex X. and Mohamed G. Gouda. "Removing Redundancy from Packet Classifiers." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-26. June 4, 2004. 10 pages.

Packet classification is the core mechanism that enables many networking services such as firewall access control and traffic accounting. Reducing memory space for packet classification algorithms is of paramount importance because a packet classifier must use very limited on-chip cache to store complex data structures. This paper proposes the first ever scheme that can significantly reduce memory space for all packet classification algorithms. The scheme is to remove all redundant rules in a packet classifier before a classification algorithm starts building data structures. By removing redundant rules, we can save more than 73% of memory for a packet classifier that examines eight packet fields. In this paper, we categorize redundant rules into upward redundant rules and downward redundant rules. We give a necessary and sufficient condition for identifying each type of redundant rule. We present two efficient algorithms for detecting and removing the two types of redundant rules respectively. The two algorithms make use of a graph model of packet classifiers, called packet decision diagrams. The experimental results shows that our algorithms are very efficient.

Download


TR-04-27

Iyer, Subramanian, Debashis Sahoo, Jawahar Jain, and E. Allen Emerson. "On partitioning and model checking." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-27. 2004. 11 pages.

State space partitioning-based approaches have been proposed in the literature to address the state-space explosion problem in model checking. These approaches, whether sequential or distributed, perform a large amount of work in the form of inter-partition (cross-over) image computations, which can be expensive. We present a model checking algorithm that aggregates such cross-over images by localising computation to individual partitions. Experimentally, this algorithm consistently outperforms extant approaches in terms of time taken, as well as cross-over image computation cost.

Download


TR-04-28

Dahlin, Mike, Lei Gao, Amol Nayate, Arun Venkataramani, Praveen Yalagandula, and Jiandan Zheng. "PRACTI Replication for Large-Scale Systems." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-28. March 2004. 14 pages.

Many replication mechanisms for large scale distributed systems exist, but they require a designer to compromise a system's replication policy (e.g., by requiring full replication of all data to all nodes), consistency policy (e.g., by supporting per-object coherence but not multi-object consistency), or topology policy (e.g., by assuming a hierarchical organization of nodes.) In this paper, we present the first PRACTI (Partial Replication, Arbitrary Consistency, and Topology Independence) mechanisms for replication in large scale systems. These new mechanisms allow construction of systems that replicate or cache any data on any node, that provide a broad range of consistency and coherence guarantees, and that permit any node to communicate with any other node at any time. Our evaluation of a prototype suggests that by disentangling mechanism from policy, PRACTI replication enables better trade-offs for system designers than possible with existing mechanisms. For example, for one workload we study, PRACTI's partial replication reduces bandwidth requirements by over an order of magnitude compared to full replication for nodes that only care about a subset of the system's data.

Download


TR-04-29

Xie, Fei. "Integration of Model Checking into Software Development Processes." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-29. August 2004. 163 pages.

Testing has been the dominant method for validation of software systems. As software systems become complex, conventional testing methods have become inadequate. Model checking is a powerful formal verification method. It supports systematic exploration of all states or execution paths of the system being verified. There are two major challenges in practical and scalable application of model checking to software systems: (1) the applicability of model checking to software systems and (2) the intrinsic complexity of model checking.

In this dissertation, we have developed a comprehensive approach to integration of model checking into two emerging software development processes: Model-Driven Development (MDD) and Component-Based Development (CBD), and a combination of MDD and CBD. This approach addresses the two major challenges under the following framework: (1) bridging applicability gaps through automatic translation of software representations to directly model-checkable formal representations, (2) seamless integration of state space reduction algorithms in the translation through static analysis, and (3) scaling model checking capability and achieving state space reduction by systematically exploring compositional structures of software systems. We have integrated model checking into MDD by applying mature model checking techniques to industrial design-level software representations through automatic translation of these representations to the input formal representations of model checkers. We have developed a translation-based approach to compositional reasoning of software systems, which simplifies the proof, implementation, and application of compositional reasoning rules at the software system level by reusing the proof and implementation of existing compositional reasoning rules for directly model-checkable formal representations. We have developed an integrated state space reduction framework which systematically conducts a top-down decomposition of a large and complex software system into directly model-checkable components by exploring domain-specific knowledge. We have designed, implemented, and applied a bottom-up approach to model checking of component-based software systems, which composes verified systems from verified components and integrates model checking into CBD. We have further scaled model checking of component-based systems by exploring the synergy between MDD and CBD, i.e., specifying components in executable design languages, and realizing the bottom-up approach based on model checking of software designs through translation.

Download


TR-04-30

Li, Xiaozhou. "Maintaining the Chord Ring." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-30. July 27, 2004. 4 pages.

This short note gives an active and concurrent topology maintenance protocol for the Chord ring. The protocol maintains predecessors and successors, but not fingers. The protocol is a simple extension of a ring maintenance protocol established in a previous paper.

Download


TR-04-32

Sustik, M.A., J.A. Tropp, I.S. Dhillon, and R.W. Heath Jr.. "On the existence of equiangular tight frames." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-32. March 12, 2007. 8 pages.

An equiangular tight frame (ETF) is a $d\times N$ matrix that has unit-norm columns and orthogonal rows of norm $\sqrt{N/d}$. Its key property is that the absolute inner products between pairs of columns are (i) identical and (ii) as small as possible. ETFs have applications in communications, coding theory, and sparse approximation. Numerical experiments indicate that ETFs arise for very few pairs $(d, N)$, and it is an important challenge to develop restrictions on the pairs for which they can exist. In a recent paper Holmes and Paulsen established a necessary condition for the existence of an $N$-vector equiangular tight frame in a $d$-dimensional real Euclidean space. By applying field theory and results of graph theory we develop stronger necessary conditions and thereby rule out many possibilities admitted by the work of Holmes and Paulsen. It has been verified that a real equiangular tight frame exists for each pair $(d, N)$ with $N \leq 100$ that meets the new conditions. The arguments also extend to deliver novel necessary conditions for the existence of equiangular tight frames whose Gram matrices have entries drawn from a discrete set of complex numbers.

Download


TR-04-33

Riche, Taylor L., Jayaram Mudigonda, and Harrick M. Vin. "Experimental Evaluation of Load Balancers in Packet Processing Systems." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-33. October 2004. 8 pages.

The load balancer is a fundamental building block for implementing high-throughput applications on multi-core architectures (e.g., network processors). In this paper, we consider two canonical load balancing schemes in the context of packet processing systems: (1) packet-level load balancing that determines the mapping of a packet to processor independently for each packet; and (2) flow-level load balancing that maps a flow to a processor and directs all subsequent packets of that flow to the mapped processor. By identifying the application, system, and trace characteristics that affect their relative performance, we address the fundamental question: under what operating conditions, should one choose packet-level load balancing over flow-level load balancing, and vice versa?

Download


TR-04-34

Chaput, Harold Henry. "The Constructivist Learning Architecture: A Model of Cognitive Development for Robust Autonomous Robots." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-34. August 2004. 91 pages.

Autonomous robots are used more and more in remote and inaccessible places where they cannot be easily repaired if damaged or improperly programmed. A system is needed that allows these robots to repair themselves by recovering gracefully from damage and adapting to unforeseen changes. Newborn infants employ such a system to adapt to a new and dynamic world by building a hierarchical representation of their environment. This model allows them to respond robustly to changes by falling back to an earlier stage of knowledge, rather than failing completely. A computational model that replicates these phenomena in infants would afford a mobile robot the same adaptability and robustness that infants have. This dissertation presents such a model, the Constructivist Learning Architecture (CLA), that builds a hierarchical knowledge base using a set of interconnected self-organizing learning modules. The dissertation then demonstrates that CLA (1) replicates current studies in infant cognitive development, (2) builds sensorimotor schemas for robot control, (3) learns a goal-directed task from delayed rewards, and (4) can fall back and recover gracefully from damage. CLA is a new approach to robot control that allows robots to recover from damage or adapt to unforeseen changes in the environment. CLA is also a new approach to cognitive modeling that can be used to better understand how people learn for their environment in infancy and adulthood.

Download


TR-04-35

Roles, Angela. "Certificate Chain Expressions." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-35. August 2004. 15 pages.

Since the traditional sense of certificates in networks uses only the identity of the issuer, the identity of the subject, and the public key of the subject, there is no way to limit the use of an issued certificate. However, there may be cases in which we also wish a certificate to express limited "trust" of other nodes in the network. Thus, our research extends a certificate to also contain a certificate chain expression which indicates the situations in which the current certificate may be chained with other certificates. We define a calculus for these certificate chain expressions and present some properties, and then we illustrate the properties with a specific certificate graph.

Download


TR-04-36

Li, Xiaozhou. "Ranch: A Dynamic Network Topology." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-36. August 2004. 166 pages.

Peer-to-peer computing is an emerging paradigm that has the potential of harnessing enormous amounts of under-utilized computational resources (e.g., home computers). A central problem in peer-to-peer computing is how to organize the network nodes so that sophisticated applications can be efficiently supported. The cornerstone of a peer-to-peer network is a dynamic network topology that determines the neighbor relationships to be maintained by the network nodes. This dissertation is concerned with algorithmic and concurrency issues in dynamic network topologies.

We present Ranch (random cyclic hypercube), a simple, recursive topology consisting of a collection of rings. Ranch is a scalable topology. In particular, it has logarithmic in-degree, out-degree, and diameter, and it uses only a logarithmic number of messages for a node to join or leave the network. Ranch also has a number of additional desirable properties, including locality awareness and fault tolerance. We show how to build a name resolution scheme for Ranch that enables the peer-to-peer network to find data items efficiently. Our results include a name replication scheme and a fault-tolerant lookup algorithm.

We address the problem of topology maintenance in peer-to-peer networks, that is, how to properly update the neighbor variables when nodes join and leave the network, possibly concurrently. We design, and prove the correctness of, protocols that maintain the ring topology, the basis of several peer-to-peer networks, in the fault-free environment. Our protocols handle both joins and leaves actively (i.e., they update the neighbor variables as soon as a join or a leave occurs). We use an assertional method to prove the correctness of our protocols, that is, we first design a global invariant for a protocol and then show that every action of the protocol preserves the invariant. Our protocols are simple and our proofs are rigorous and explicit.

We extend our results on the maintenance of rings to address the maintenance of Ranch. We present active and concurrent maintenance protocols that handle both joins and leaves for Ranch, along with their assertional correctness proofs. The protocols for Ranch use the protocols for rings as a building block. The protocols and the correctness proofs for Ranch substantially extend those for rings.

We present simulation results that demonstrate the scalability and locality awareness of Ranch.

Download


TR-04-37

Jung, Eunjin, Ehab Elmallah, and Mohamed G. Gouda. "Optimal Dispersal of Special Certificate Graphs." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-37. September 5, 2004. 6 pages.

We consider a network where nodes can issue certificates that identify the public keys of other nodes in the network. The issued certificates in a network constitute a directed graph, called the certificate graph of the network. The issued certificates are dispersed among the network nodes such that the following condition holds. If any node u needs to send messages to any other node v in the network, then u can use the certificates stored in both u and v to obtain the public key of v (then u can securely send messages to v). The cost of a dispersal which assigns certificates to the nodes of a network is measured by the average number of certificates that need to be stored in one node. A dispersal is optimal if its cost is minimum. In this paper, we present three algorithms and show that each algorithm computes optimal dispersals for a rich class of certificate graphs. The time complexity of each of these algorithms, when one of the algorithms is used to disperse the certificates from a given certificate gra

Download


TR-04-38

Chowdhury, Rezaul Alam and Vijaya Ramachandran. "External-Memory Exact and Approximate All-Pairs Shortest-Paths in Undirected Graphs." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-38. August 31, 2004. 25 pages.

We present several new external-memory algorithms for finding all-pairs shortest paths in a V-node, E-edge undirected graph. Our results include the following, where B is the block-size. We present cache-oblivious algorithms with O(V sort(E)) I/Os for all-pairs shortest paths and diameter in unweighted undirected graphs, where sort(E) is the number of I/Os required to sort E data items in external memory. For weighted undirected graphs we present a cache-aware APSP algorithm that performs O(V (sqrt{VE/B} + E/B log{V/B})) I/Os. We also present efficient cache-aware algorithms that find paths between all pairs of vertices in an unweighted graph whose lengths are within a small additive constant of the shortest path length.

All of our results improve earlier results known for these problems. For approximate APSP we provide the first nontrivial results. Our diameter result uses O(V + E) extra space, and all of our other algorithms use O(V^2) space. In our work on external-memory algorithm for APSP in weighted undirected graphs we develop the notion of a `slim data structure' that might have other applications in external-memory computations.

Download


TR-04-42

Zhang, X. Brian, Simon S. Lam, and Huaiyu Liu. "Efficient Group Rekeying Using Application-Layer Multicast." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-42. April, 2005. 16 pages.

In secure group communications, there are both rekey and data traffic. We propose to use application-layer multicast to support concurrent rekey and data transport. Rekey traffic is bursty and requires fast delivery. It is desired to reduce rekey bandwidth overhead as much as possible since it competes for bandwidth with data traffic. Towards this goal, we propose a multicast scheme that exploits proximity in the underlying network. We further propose a rekey message splitting scheme to significantly reduce rekey bandwidth overhead at each user access link and network link. We formulate and prove correctness properties for the multicast scheme and rekey message splitting scheme. We have conducted extensive simulations to evaluate our approach. Our simulation results show that our approach can reduce rekey bandwidth overhead from several thousand encrypted new keys (encryptions, in short) to less than ten encryptions for more than 90% of users in a group of 1024 users

Download


TR-04-43

Joffrain, Thierry, Tze Meng Low, Enrique S. Quintana-Ortí, Robert van de Geijn, and Field van Zee. "On Accumulating Householder Transformations." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-43. October 12, 2004. 8 pages.

A theorem related to the accumulation of Householder transformations into a single orthogonal transformation, known as the compact WY transform, is presented. It provides a simple characterization of the computation of this transformation and suggests an alternative algorithm for computing it. It also suggests an alternative transformation, the UT transform, with the same utility as the compact WY Transform, which requires less computation and has similar stability properties. That alternative transformation was first published over a decade ago, but has gone unnoticed by the community.

Download


TR-04-44

Quintana-Ortí, Enrique S. and Robert van de Geijn. "Improving the Performance of Reduction to Hessenberg Form." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-44. October 12, 2004. 13 pages.

In this paper, a modification of the blocked algorithm for reduction to Hessenberg form is presented that improves performance by shifting more computation from less efficient matrix-vector operations to highly efficient matrix-matrix operations. Significant performance improvements are reported relative to the performance achieved by the current LAPACK implementation.

Download


TR-04-45

Li, Xiaozhou, C. Greg Plaxton, Mitul Tiwari, and Arun Venkataramani. "Online Hierarchical Cooperative Caching." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-45. October 18, 2004. 23 pages.

We address a hierarchical generalization of the well-known disk paging problem. In the hierarchical cooperative caching problem, a set of $n$ machines residing in an ultrametric space cooperate with one another to satisfy a sequence of read requests to a collection of read-only files. A seminal result in the area of competitive analysis states that the "least recently used" (LRU) paging algorithm is constant-competitive if it is given a constant-factor blowup in capacity over the offline algorithm. Does such a constant-competitive deterministic algorithm, with a constant-factor blowup in the machine capacities, exist for the hierarchical cooperative caching problem? In this paper, we present a deterministic hierarchical generalization of LRU that is constant-competitive when the capacity blowup is linear in $d$, the depth of the cache hierarchy. Furthermore, we exhibit an infinite family of depth-$d$ hierarchies such that any randomized hierarchical cooperative caching algorithm with capacity blowup $b$ has competitive ratio $\Omega(\log\frac{d}{b})$ against an oblivious adversary. Thus, our upper and lower bounds imply a tight bound of $\Theta(d)$ on the capacity blowup required to achieve constant competitiveness.

Download


TR-04-46

Balakrishnan, Vinod, Ravi Kokku, Aaron Kunze, Harrick M. Vin, and Erik J. Johnson. "Supporting Run-time Adaptation in Packet Processing Systems." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-46. November 2004. 14 pages.

Implementors of packet-processing applications on multi-core processors must balance two requirements: (1) adapt processor allocations dynamically to reduce the overall resource provisioning requirement for the system, achieve robustness to traffic fluctuations, and reduce energy consumption; and (2) utilize, for each application stage, resources (e.g., memory levels, inter-processor communication mechanisms, etc.) closer or local to the processors on which the stages are mapped to achieve the highest possible throughput. In this paper, we describe the design and implementation of a run-time adaptation system that can meet these two requirements simultaneously. Our design allows each application stage to utilize local resources whenever possible in the steady state. Upon adapting the allocation of processors to stages, the run-time system (1) binds each resource usage within a stage to a new resource instance; and (2) checkpoints and migrates the state from the previous resource instance to the newly-bound resource instance. We describe the design and implementation of our adaptation system in the context of a packet processing system designed using the Intel IXP2400 network processor. We show that our design has little impact (14%) on the steady-state throughput of the system. We further show that our design is able to perform resource adaptation for a real application in less than 100 ms, allowing processor allocations to be adapted at a very fine time-scale.

Download


TR-04-47

Kim, Min Sik, Yi Li, and Simon S. Lam. "Eliminating Bottlenecks in Overlay Multicast." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-47. November 2004. 10 pages.

Recently many overlay multicast systems have been proposed to overcome limited availability of IP multicast. Because they perform multicast forwarding without support from routers, data may be delivered multiple times over the same physical link, causing a bottleneck. This problem is more serious for applications demanding high bandwidth such as multimedia distribution. Although such bottlenecks can be removed by changing overlay topology, a naive approach may create bottlenecks in other parts of the network. In this paper, We propose an algorithm that removes all bottlenecks caused by the redundant data delivery of overlay multicast, detecting such bottlenecks using a wavelet-based technique we recently proposed. In a case where the source rate is constant and the available bandwidth of each link is not less than the rate, our algorithm guarantees that every node receives at the full source rate. Simulation results show that even in a network with a dense receiver population, our algorithm finds a tree that satisfies all the receiving nodes while other heuristic-based approaches often fail.

Download


TR-04-48

Yalagandula, Praveen and Mike Dahlin. "Research Challenges for a Scalable Distributed Information Management System." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-48. November 2004. 18 pages.

A Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems can serve as a basic building for a broad-range of large-scale distributed applications simplifying the design, development, and deployment of such services. In this document, we outline four key requirements such an aggregation system should satisfy to be useful as a general middleware building block -- scalability with both nodes and data attributes, flexibility to accommodate broad range of services, administrative autonomy and isolation for availability and security, and robustness to reconfigurations in the system. We propose a new aggregation framework that leverages Distributed Hash Tables (DHTs) and a new aggregation abstraction that builds on a previously proposed abstraction in Astrolabe. We also present details of several significant applications that we propose to build on top of SDIMS.

Download


TR-04-49

Choi, Young-ri, Mohaamed G. Gouda, Hongwei Zwang, and Anish Arora. "Routing on a Logical Grid in Sensor Networks." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-49. February 22, 2005. 28 pages.

We present a protocol for routing data messages from any sensor to the base station in a sensor network. The protocol maintains an incoming spanning tree whose root is the base station. The spanning tree is constructed as follows. First, each sensor in the network is assigned a unique identifier as if the sensors form a logical two-dimensional grid. Second, each sensor, other than the base station, uses its own identifier to compute the identifiers of its "potential parents" in the spanning tree. Third, the base station starts to periodically send "connected messages". When a sensor receives a connected message from anyone of its potential parents, the sensor makes this potential parent its parent in the tree and starts to periodically send connected messages. This routing protocol has three advantages over earlier protocols: overhead of the protocol is very small, the protocol balances the load over the whole network, and the protocol has nice fault-tolerance and security properties. We have implemented this protocol over a sensor network that consisted of about 50 MICA2 motes and observed that the protocol delivers 72% - 99% of the messages to the base station under heavy and bursty traffic (that generated 100 data messages per 15 seconds). We also ran the same experiment on a version of the distance vector routing protocol and observed that this protocol can delivery only 34% of the messages to the base station.

Download


TR-04-50

Low, Tze Meng, Kent F. Milfeld, Robert A. van de Geijn, and Field G. Van Zee. "Parallelizing FLAME Code with OpenMP Task Quotes." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-50. December 2004. 24 pages.

We discuss the OpenMP parallelization of linear algebra algorithms that are coded using the Formal Linear Algebra Methods Environment (FLAME) API. This API expresses algorithms at a higher level of abstraction, avoids the use of indices, and thus represents these algorithms as they are formally derived and presented. Traditional OpenMP directives require an explicit loop index, or explicit critical-region constructs on a variable, in order to indicate parallelism in loops and thus the lack of indices previously posed a challenge. A feature, task queues, that has been proposed for adoption into OpenMP 3.0 overcomes this problem. We illustrate the issues and solutions by discussing the parallelization of the symmetric rank-k update and report impressive performance on a 4 CPU Itanium2 server.

Download


TR-04-51

Lopez-Herrejon, Roberto E. and Jacob Neal Sarvela. "Frame-Based Code Generation of Multi-Dimensional Feature Sets or Origami meets Frames." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-51. January 3, 2005. 17 pages.

Software product lines are defined in terms of feature sets that can be orthogonally arranged, according to different design criteria, to constitute dimensions in a design space. When feature sets are well-structured, it is possible to use Origami, a feature-oriented programming technique, to guide feature modelling and composition. However, a considerable manual effort still remains to maintain the consistency of multi-dimensional feature sets, as feature implementations are parametrically interdependent. In this paper, we present a code-generation technique that integrates frame technology and Origami to address this problem. Our motivating example is the expression problem, viewed as an instance of a two-dimensional feature set.

Download


TR-04-52

Lopez-Herrejon, Roberto E. and Don Batory. "The Expression Problem as a Product Line and its Implementation in AHEAD." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-52. January 3, 2005. 6 pages.

Software product lines are defined in terms of feature sets. A family member is defined with the set of features it implements. Several modularization techniques have recently appeared so in an effort to evaluate their usability to implement product lines, we propose a simple yet illustrative product line example based on the expression problem that we believe captures the most basic requirements to modularize features.

Download


TR-04-53

Fernholz, Daniel and Vijaya Ramachandran. "The Diameter of Sparse Random Graphs." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-53. December, 2004. 38 pages.

We derive an expression of the form c ln n +- o(ln n) for the diameter of a sparse random graph with a specified degree sequence. The result holds a.a.s., assuming certain convergence and supercriticality conditions are met. The proof is constructive and yields a method for computing the constant c. For the random graph G(n,p) with np=Theta(1)+1, we solve for c in closed form.

Download


TR-04-54

Gouda, M. G., A. Arora, J. A. Cobb, T. Herman, and S. Kulkarni. "A Proposal for a Common Sensor Network Protocol." The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-04-54. October, 2004. 8 pages.

We discuss the need for a common sensor network protocol, and present a preliminary design of such a protocol.

Download


Questions to trcenter@cs.utexas.edu