Martin, Jean-Philippe, Lorenzo Alvisi, and Michael Dahlin. "Small Byzantine Quorum Systems". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-01 (technical report). January 2002. 34 pages.
In this paper we present two protocols for asynchronous Byzantine Quorum Systems (BQS) built on top of reliable Channels -- one for self-verifving data and the other for any data. Our protocols tolerate f Byzantine failures with f fewer servers than existing solutions by eliminating nonessential work in the write protocol and by using read and write quorums of different sizes. In practice, however, engineering asynchronous reliable channels is difficult in many environments. To address this concern, we moded the original asynchronous BQS protocol of Malkhi and Reiter to work on unreliable channels and discuss how our two new asynchronous protocols can be used to derive an efficient Byzantine systems.
Jiménez, Daniel A. "Delay-Sensitive Branch Predictors for Future Technologies". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-02 (dissertation). May 2002. 165 pages.
Accurate branch prediction is an essential component of a modem, deeply pipelined microprocessors. Because the branch predictor is on the critical path for fetching instructions, it must deliver a prediction in a single cycle. However, as feature sizes shrink and clock rates increase, access delay will significantly decrease the size and accuracy of branch predictors that can be accessed in a single cycle. Thus, there is a tradeoff between branch prediction accuracy and latency. Deeper pipelines improve overall performance by allowing more aggressive clock rates, but some performance is lost due to increased branch misprediction penalties. Ironically, with shorter clock periods, the branch predictor has less time to make a prediction and might have to be scaled back to make it faster, which decreases accuracy and reduces the advantage of higher clock rates. We propose several methods for breaking the tradeoff between accuracy and latency in branch predictors. Our methods fall into two broad categories: hierarchical predictors using purely hardware implementations, and cooperative predictors that off-load some prediction work to the compiler. We describe hierarchical organizations that extend traditional predictors. We then describe a highly accurate branch predictor based on a neural learning technique. Using a hierarchical organization, this complex multi-cycle predictor can be used as a component of a fast delay-sensitive predictor. We introduce a novel cooperative branch predictor that off-loads most of the prediction work to the compiler with profiling. The compiler communicates profiled information to the microprocessor using extensions to the instruction set. This Boolean formula predictor has a small and fast hardware implementation, and will work in less than one cycle in even the smallest technologies with the most aggressive projected clock rates. Finally, we present another cooperative technique, branch path re-aliasing, that moves complexity off of the critical path for making a prediction and into the compiler; this technique increases accuracy by reducing destructive aliasing during the less critical update stage.
Dhillon, Inderjit S., Yuqiang Guan, and J. Kogan. "Refining clusters in high dimensional text data". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-03 (technical report). January 2002. 18 pages.
The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means can often yield qualitatively poor results, especially for small clusters, say 25-30 documents per cluster, where it tends to get stuck at a local maximum far away from the optimal. In this paper, we present the "first-variation" principle that refines a given clustering by incrementally moving data points between clusters, thus achieving a higher objective function value. Combining first-variation with spherical k-means yields a powerful "ping-pong" strategy that often qualitatively improves k-means clustering. We present several experimental results to show that our proposed method works well in clustering high-dimensional and sparse text data.
Guyer, Samuel Z., Emery D. Berger, and Calvin Lin. "Detecting Errors with Configurable Whole-program Dataflow Analysis". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-04 (technical report). September 2002. 8 pages.
NO ABSTRACT
McGuire, Tommy M. "The Austin Protocol Compiler Reference Manual". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-05 (technical report). September 2002. 17 pages.
This manual provides a reference for the Austin Protocol Compiler, the APT language, and the runtime system. It describes the basic usage of the compiler, the fundamentals of the execution of the APT language, the syntax and semantics of the language, and the C interface provided by the runtime system and the generated code.
Zhang, Xiaoyu, Chandrajit Bajaj, and Vijaya Ramachandran. "Parallel and Out-of-core View-dependent Isocontour Visualization Using Random Data Distribution". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-06 (technical report). January 2002. 9 pages.
NO ABSTRACT
Sohn, Bong-Soo, Chandrajit Bajaj, Sanghun Park, and Vinay K. Siddavanahalli. "Volumetric Video Compression for Real-Time Playback". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-07 (technical report). January 2002. 6 pages.
In this paper, we describe a compression scheme for volumetric video data(3D space x 1D time) where each frame of the volume is decompressed and rendered in real-time. Since even one frame size of volume is very large, run-time decompression can be a bottleneck for real-time playback of time-varying volume data. To increase the run-time decompression speed and compression ratio, we decompose the volume into small blocks and only update significantly changing blocks. The results show that our block truncation based compression scheme compromises decompression speed and image quality well enough for interactive time-varying visualization.
Sastry, Nishanth R. and Simon S. Lam. "CYRF: A Framework for Window-based Unicast Congestion Control". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-09 (technical report). January 2002. 30 pages.
This work presents a comprehensive theoretical framework of window based congestion control protocols called CYRF (for Choose Your Response Function) that are designed to converge to fairness and efficiency. We first derive a sufficient condition for convergence to fairness. Using this, we show how fair window increase/decrease policies can be constructed from suitable pairs of monotonically non-decreasing functions. We also give a new characterization of TCP-friendliness and simple rules for smooth CYRF flows to be TCP-friendly. Specific protocols that meet the needs of a given situation can be designed within this framework. We experimentally investigate two CYRF protocols for streaming media-like applications: LOG, a protocol suitable for applications that need to trade-off between smoothness and aggressiveness, and SIGMOID, which is designed to be "TCP-friendl.y" even in a network with drop-tail queues (unlike other non-linear window based protocols).
Kim, Changkyu, Doug Burger, and Stephen W. Keckler. "An Adaptive Cache Structure for Future High-Performance Systems". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-10 (technical report). February 2002. 12 pages.
On-chip cache sizes are likely to continue to grow over the next decade as working sets, available chip capacity, and memory latencies all increase. Traditional cache architectures, with fixed sizes and discrete latencies, lock one organization down at design time, which will provide inferior performance across a range of workloads. In addition, expected increases in on-chip communication delays will make the time to retrieve data in a cache a function of the data's physical location. Consequently, cache access times will become a continuum of latencies rather than a single one. This non-uniformity will make static organizations particularly limited for single-chip servers, in which multiple processors will be different distances from the cache controller. In this paper, we propose a set of adaptive, high-performance cache design, called Non-Uniform Cache Architectures (NUCAs). We extend these physical designs with logical policies that allow important data to migrate closer to the processor within the same cache. We show that these adaptive level-two NUCA designs provide 1.6 times the performance of a Uniform Cache Architecture of any size, and that the adaptive NUCA scheme outperforms static NUCA schemes by 9% for multi-megabyte, on-chip server caches with large numbers of banks.
Martin, Jean-Philippe, Lorenzo Alvisi, and Michael Dahlin. "Minimal Byzantine Quorums (SUPERSEDED BY TR02-38)". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-12 (technical report). February 2002. 13 pages.
Byzantine quorum systems can provide fault tolerant storage in hazardous environments, but the redundant servers they require increase software development and hardware costs. In order to minimize the number of servers required to implement Byzantine quorum services, we develop a new algorithm that uses a ''Listeners'' pattern of network communication to detect and resolve ordering ambiguities created by concurrent accesses to the system. In our analysis of this algorithm, we (1) identify the lower bound on the number of servers required to implement distributed data services in a Byzantine environment, (2) describe new protocols that reduce the number of servers necessary for generic data while providing strong consistency semantics, and (3) show that the new protocols match the lower bounds for the number of servers and provide the best consistency semantics for this number of servers.
Pettie, Seth. "A Faster All Pairs Shortest Path Algorithm for Real-Weighted Sparse Graphs". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-13 (technical report). February 2002. 12 pages.
We present a faster all-pairs shortest paths algorithm for arbitrary real-weighted directed graphs. The algorithm works in the fundamental comparison-addition model and runs in O ( mn + n 2 log log n ) time, where m and n are the number of edges & vertices, respectively. This is strictly faster than Johnson's algorithm (for arbitrary edge-weights) and Dijkstra's algorithm (for positive edge-weights) when m = o ( n log n ) and matches the running time of Hagerup's APSP algorithm, which assumes integer edge-weights and a more powerful model of computation. Our algorithm is based on a generalization of Hagerup's "component hierarchy" to real --weighted directed graphs. The component hierarchy approach, first initiated by Thorup, is a non-greedy method for computing single-source shortest paths, and, by repeated application, all-pairs shortest paths. Component hierarchy-based algorithms have been designed for integer-weighted graphs, directed and undirected, and real-weighted undirected graphs. We show that in a precise sense, adapting the component hierarchy approach to the general case (arbitrary real-weighted directed graphs) is more complicated that for the restricted graph classes treated earlier.
Jiménez, Daniel A. and Calvin Lin. "Composite Confidence Estimators for Enhanced Speculation Control". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-14 (technical report). January 2002. 16 pages.
Many speculative microarchitectural techniques such as eager execution, value prediction, pipeline gating, and others rely on a confidence estimator to predict whether a speculative action will be beneficial. Since they cannot achieve perfect accuracy, confidence estimators must balance the cost of inhibiting a potentially useful speculation with the cost of a mis-speculation. The performance of confidence estimators with respect to these competing costs is quantified by two values: the Spec, i.e., the probability that an incorrect prediction is identified as low-confidence, and the Pvn, i.e. the probability that a prediction identified as having low confidence will be incorrect. We propose a way to allow more effective use of speculation control techniques by combining multiple confidence estimators into a composite confidence estimator. This new class of confidence estimators provides increased performance and finer control over the trade-off between Spec and Pvn. The main contributions of this paper are twofold. First, we describe techniques for building efficient composite confidence estimators, and evaluate them with a previously proposed statistical methodology, emphasizing the relationship of Spec and Pvn. Second, we use a detailed microarchitectural simulator to evaluate the ability of our estimator to support an energy reduction technique called pipeline gating. Using previous confidence estimators, pipeline gating reduces the amount of extra work due to mis-speculated instructions by 22%, with a reduction in IPC of 5%. With the same impact on IPC, our confidence estimators reduce extra work by 31%.
Sharygina, Natasha and James C. Browne. "Data Abstraction for Cycle Intensive Programs". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-15 (technical report). March 2002. 20 pages.
This paper reports on the design, implementation and evaluation of a data abstraction algorithm which is effective in reducing the complexity of model-checking for control properties of cycle intensive programs.
Sharygina, Natasha, James C. Browne, and Delbert Tesar. "Development of Verifiable Programs - Applications of an Approach Based on Executable Object-Oriented Specifications". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-16 (technical report). March 2002. 11 pages.
This paper presents the first phase of a two phase approach for development of object-oriented software systems which combines validation of OOA models formulated in xUML (an executable subset of UML, with formal verification through model checking.
Dhillon, Inderjit, Subramanyam Mallela, and Rahul Kumar. "Enhanced Word Clustering for Hierarchical Text Classification". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-17 (technical report). March 1, 2002. 17 pages.
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering" of features has been found to achieve significant improvements over feature selection in terms of classification accuracy, especially at lower number of features [Baker & McCallum 98, Slonim & Tishby 2001]. However the existing clustering techniques are agglomerative in nature resulting in (i) sub-optimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value, thus converging to a local minimum. We show that our algorithm minimizes the "within-cluster Jensen-Shannon divergence" while simultaneously maximizing the "between-cluster Jensen-Shannon divergence." In comparison to the previously proposed agglomerative strategies our divisive algorithm achieves higher classification accuracy especially at lower number of features. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results on the 20 News groups data set and a 3-level hierarchy of HTML documents collected from Dmoz Open Directory.
Gouda, Mohamed G., Chin-Tser Huang, and Mootaz Elnozahy. "Key Trees and the Security of Interval Multicast". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-18 (technical report). October 26, 2001. 15 pages.
A key tree is a distributed data structure of security keys that can be used in two ways by a group of users. First, any user in the group can use the group key in the key tree to securely broadcast data items to all other users in the group. The cost of each secure broadcast is one encryption. Second, the key tree can be maintained efficiently when a user leaves the group, or when a new user joins the group. The cost of maintaining a key tree when a user leaves or joins the group is O(log n) encryptions, where n is the number of users in the group. In this paper, we describe a third use of key trees where any user in the group can use the different keys in the key tree to securely multicast data items to different subgroups within the group. We show that the cost of securely multicasting a data item to a subgroup whose users are "consecutive" is O(log n) encryptions. We also show that the cost of securely multicasting a data item to an arbitrary subgroup is O(n/2) encryptions. However, this cost can be reduced to one encryption by introducing an additional key tree to the group. Finally, we present a calculus for identifying user subgroups in a group with multiple key trees.
Shivakumar, Premkishore, Michael Kistler, Stephen W. Keckler, Doug Burger, and Lorenzo Alvisi. "Modeling the Impact of Device and Pipeline Scaling on the Soft Error Rate of Processor Elements". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-19 (technical report). 2002. 24 pages.
NO ABSTRACT
Barbançon, Francois and Daniel P. Miranker. "SPHINX: Schema Integration by Example". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-20 (technical report). July 2002, revised December 2, 2002. 38 pages.
We focus on the problem of semi-automated query discovery for XML views without requiring the intervention of an expert to guarantee a correct final result. Given multiple independent sources of heterogeneous XML data structures, our tool, SPHINX, lets a non-technical user define views using example-based, graphical, interaction. SPHINX embodies a syntactically derived meta-model of federating view definitions. The meta-model of federating view definitions enables, first, an active-learning method which removes the burden of generating the examples from the user. Second, SPHINX can identify when the learned view definition converges to a vetted, semantically accurate result. Thus, the users do not need to verify the correctness of the transformation.
Pettie, Seth. "On the Comparison-Addition Complexity of All-Pairs Shortest Paths". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-21 (technical report). May 28, 2002. 14 pages.
We present an all-pairs shortest path algorithm for arbitrary graphs that performs O ( mn log alpha ( m,n )) comparison and addition operations, where m and n are the number of edges and vertices, resp., and alpha is Tarjan's inverse-Ackermann function. Our algorithm eliminates the sorting bottleneck inherent in approaches based on Dijkstra's algorithm, and for graphs with O ( n ) edges our algorithm is within a tiny O ( log alpha ( n,n )) factor of optimal. Our algorithm can be implemented to run in polynomial time (granted, a large polynomial). We leave open the problem of providing an efficient implementation.
Pettie, Seth. "An Inverse-Ackermann Style Lower Bound for MST Verification Queries". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-22 (technical report). April 23, 2002. 13 pages.
We consider the problem of preprocessing an edge-weighted tree T in order to quickly answer queries of the following type: does a given edge e belong in the minimum spanning tree of It is well-known that offline minimum spanning tree verification is solvable in linear time. We demonstrate that this online variant of the problem is intrinsically harder. In particular, a preprocessing cost of is necessary to answer queries with at most t comparisons, where is the t th -row inverse of Ackermann's function. For the case of linear preprocessing this implies a query lower bound of comparisons. Our lower bound also holds for randomized preprocessing algorithms.
Cao, Thuan D., John F. Hall, and Robert A. van de Geijn. "Parallel Cholesky Factorization of a Block Tridiagonal Matrix". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-23 (technical report). April 2002. 15 pages.
In this paper we discuss the parallel implementation of the Cholesky factorization of a positive definite symmetric matrix when that matrix is block tridiagonal. While parallel implementations for this problem, and closely related problems like the factorization of banded matrices, have been previously reported in the literature, those implementations dealt with the special cases where the block size (bandwidth) was either very large (wide) or very small (narrow). We present a solution that can be used for the entire spectrum of cases, ranging from extremely large (wide) to very small (narrow). Preliminary performance results collected on a Cray T3E-600 distributed memory supercomputer show that our implementation attains respectable performance. Indeed, factorization of a matrix with block size 1000 and a total dimension of more than 500,000 takes about 3.6 minutes on 128 processors.
Gorinsky, Sergey, Sugat Jain, and Harrick Vin. "Multicast Congestion Control with Distrusted Receivers". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-25 (technical report). September 2002. 8 pages.
Congestion control protocols rely on receivers to support fair bandwidth sharing. However, a receiver has incentives to elicit self-beneficial bandwidth allocations and hence may manipulate its congestion control protocol. Whereas the issue of receiver misbehavior has been studied for unicast congestion control, the impact of receiver misbehavior in multicast remains unexplored. In this paper, we examine the problem of fair congestion control in distrusted multicast environments. We classify standard mechanisms for multicast congestion control and determine their potential vulnerabilities to receiver misbehavior. Our evaluation of prominent multicast protocols shows that each of them is susceptible to attacks by a misbehaving receiver.
Amenta, Nina, Sunghee Choi, Maria Jump, Ravi Kolluri, and Thomas Wahl. "Finding Alpha-Helices in Skeletons". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-27 (technical report). October2002. 12 pages.
NO ABSTRACT
Gorinsky, Sergey and Harrick Vin. "Extended Analysis of Binary Congestion Control". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-28 (technical report). January 16, 2002. 5 pages.
Congestion control in the Internet relies on binary adjustment algorithms. For example, Transmission Control Protocol (TCP) in its congestion avoidance mode behaves similarly to Additive-Increase Multiplicative-Decrease (AIMD) algorithm. The classical analysis by Chiu and Jain recommends AIMD based on the assertion that among stable linear algorithms, AIMD ensures the quickest convergence to fair states. We demonstrate incorrectness of this assertion. For an asynchronous version of Chiu-Jain model, we show that AIMD is sensitive to initial conditions and has multiple unfair attractors. Our findings question the appropriateness of AIMD for binary congestion control. We attribute some of our observations to unrealistic features of Chiu-Jain model and argue that binary adjustment algorithms should be analyzed in a more realistic model.
Zhang, X. Brian, Simon S. Lam, Dong-Young Lee, and Y. Richard Yang. "Protocol Design for Scalable and Reliable Group Rekeying (Revised Nov. 2002)". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-29 (technical report). June 2002. 26 pages.
We present the design and specification of a scalable and reliable protocol for group rekeying together with performance evaluation results. The protocol is based upon the use of key trees for secure groups and periodic batch rekeying. At the beginning of each rekey interval, the key server sends a rekey message to all users consisting of encrypted new keys (encryptions, in short) carried in a sequence of packets. We present a scheme for identifying keys, encryptions, and users, and a key assignment algorithm that ensures that the encryptions needed by a user are in the same packet. Our protocol provides reliable delivery of new keys to all users eventually. It also attempts to deliver new keys to all users with a high probability by the end of the rekey interval. For each rekey message, the protocol runs in two steps: a multicast step followed by a unicast step. Proactive FEC multicast is used to reduce delivery latency. Our experiments show that a small FEC block size can be used to reduce encoding time at the server without increasing server bandwidth overhead. Early transition to unicast, after at most two multicast rounds, further reduces the worst-case delivery latency as well as user bandwidth requirement. The key server adaptively adjusts the proactivity factor based upon past feedback information; our experiments show that the number of NACKs after a multicast round can be effectively controlled around a target number. Throughout the protocol design, we strive to minimize processing and bandwidth requirements for both the key server and users.
Middleton, Brittany E. "A Brief Introduction to Nonstandard Analysis". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-31 (honors). May 22, 2002. 38 pages.
ACL2(r), authored by Dr. Ruben A. Gamboa, is an extension of ACL2, to allow reasoning about the real and complex irrational numbers. The modifications are based on nonstandard analysis. Jun Sawada, of IBM Research Labs, required the support of ACL2(r) in his proof efforts to verify that the approximation used the square root function of the IBM Power4 processor had the accuracy required. To verify the accuracy of the square root function, Taylor's Theorem was chosen. Taylor's Theorem with Remainder is an approximation tool commonly used in mathematics that provides a means for estimating f(x) for an arbitrary x in the interval [ a , b ] from the values of f and its derivatives at a . This thesis serves as an introduction to the work completed in the proof of Taylor's Theorem with Remainder, as well as a general, concise introduction of nonstandard analysis in ACL2(r). The introduction is presented as a guide through the foundational lemmas needed in the proof of Taylor's Theorem with Remainder.
Andersen, Timothy, Kenneth O. Stanley, and Risto Miikkulainen. "Neuro-Evolution Through Augmenting Topologies Applied to Evolving Neural Networks to Play Othello". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-32 (technical report). May 23, 2002. 18 pages.
NO ABSTRACT
Lee, Delwin F. "A Formal Presentation of Electronic Commerce Protocols". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-33 (honors). May 23, 2002. 55 pages.
In this paper, we investigate the security of microcommerce digital cash protocols that can facilitate the selling of content over the Internet. Examples of such content are web pages, newspaper articles, and even individual plays of computer games. We focus on two particular digital cash protocols, namely Compaq's Millicent and IBM's Micropayments. For each of these two protocols, we present a formal specification using the Abstract Protocol notation, and then discuss how an adversary can attack the protocol using message forgery, modification, and replay. We then use three concepts of convergence theory, namely closure, convergence, and protection, to show that each protocol is secure against these attacks. Finally, we formally specify and verify the Secure Sockets Layer protocol, which can be used to provide privacy for these digital cash protocols.
Gouda, Mohamed G. and Chin-Tser Huang. "A Secure Address Resolution Protocol". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-34 (technical report). June 2002. 26 pages.
We propose an architecture for securely resolving IP addresses into hardware addresses over an Ethernet. The proposed architecture consists of a secure server connected to the Ethernet and two protocols: an invite-accept protocol and a request-reply protocol. Each computer connected to the Ethernet can use the invite-accept protocol to periodically record its IP address and its hardware address in the database of the secure server. Each computer can later use the request-reply protocol to obtain the hardware address of any other computer connected to the Ethernet from the database of the secure server. These two protocols are designed to overcome the actions of any adversary that can lose sent messages, arbitrarily modify the fields of sent messages, and replay old messages.
Chowdhury, Rezaul Alam and Vijaya Ramachandran. "Improved Distance Oracles for Avoiding Link-Failure". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-35 (technical report). June 28, 2002. 11 pages.
We consider the problem of preprocessing an edge-weighted directed graph to answer queries that ask for the shortest path from any given vertex to another avoiding a failed link. We present two algorithms that improve on earlier results for this problem. Our first algorithm, which is a modification of an earlier method, improves the query time to a constant while maintaining the earlier bounds for preprocessing time and space. Our second result is a new algorithm whose preprocessing time is considerably faster than earlier results and whose query time and space are worse by no more than a logarithmic factor.
Zhang, X. Brian, Simon S. Lam, and Dong-Young Lee. "Group Rekeying with Limited Unicast Recovery". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-36 (technical report). July 2002. 28 pages.
In secure group communications, a key server can deliver a "group-oriented" rekey message[20] to a large number of users efficiently using IP multicast. For reliable delivery, Keystone[21] proposed the use of forward error correction (FEC) in an initial multicast, followed by the use of unicast delivery for users that cannot recover their new keys from the multicast. In this paper, we investigate how to limit unicast recovery to a small fraction r of the user population. By specifying a very small r , almost all users in the group will receive their new keys within a single multicast round. We present analytic models for deriving r as a function of the amount of FEC redundant information (denoted by h ) and the rekeying interval duration (denoted by T ) for both Bernoulli and two-state Markov Chain loss models. From our analyses, we conclude that r decreases exponentially as h increases. We then present a protocol designed to adaptively adjust ( h , T ) to achieve a specified r . In particular, our protocol chooses from among all feasible ( h , T ) pairs one with h and T values close to their feasible minima. Our protocol also adapts to an increase in network traffic. Simulation results using ns-2 show that with network congestion our adaptive FEC protocol can still achieve a specified r by adjusting values of h and T .
Martin, Jean-Philippe, Lorenzo Alvisi, and Michael Dahlin. "Minimal Byzantine Storage". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-38 (technical report). August 2002. 19 pages.
Byzantine fault-tolerant storage systems can provide high availability in hazardous environments, but the redundant servers they require increase software development and hardware costs. In order to minimize the number of servers required to implement fault-tolerant storage services, we develop a new algorithm that uses a "Listeners" pattern of network communication to detect and resolve ordering ambiguities created by concurrent accesses to the system. When storing generic data, our protocol requires 3f+1 servers to tolerate up to f Byzantine faults - f fewer than the 4f+1 required by existing protocols for non-self-verifying data. In addition, SBQ-L provides atomic consistency semantics, which is stronger than the regular or pseudo-atomic semantics provided by these existing protocols. We show that this protocol is optimal in the number of servers - any protocol that provides safe semantics or stronger requires at least 3f+1 servers to tolerate f Byzantine faults in an asynchronous system. We also examine protocols that store self-verifying data (i.e. data that cannot be undetectably altered). Existing protocols can use self-verifying data to reduce the number of servers required to tolerate faults but because SBQ-L already uses the minimum possible number of servers for its semantics, self-verifying data provides no advantage. Finally, we examine a non-confirmable writes variation of the SBQ-L protocol where a client cannot determine when its writes complete. We show that SBQ-L with non--confirmable writes provides regular semantics with 2f+1 servers and that this number of servers is minimal.
Gorinsky, Sergey and Harrick Vin. "Extended Analysis of Binary Adjustment Algorithms". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-39 (technical report). August 2002. 5 pages.
Congestion control in the Internet relies on binary adjustment algorithms. For example, Transmission Control Protocol (TCP) in its congestion avoidance mode behaves similarly to Additive-Increase Multiplicative-Decrease (AIMD) algorithm. Chiu and Jain offer a theoretical justification for choosing AIMD: among stable linear algorithms, AIMD ensures the quickest convergence to maxmin-fair states. Whereas Chiu-Jain model rests on a well-known unrealistic assumption of uniform feedback, more precise analytic characterizations of TCP behavior are developed and validated. In particular, the advanced theory and experiments agree that TCP congestion control does not converge to maxmin fairness. However, despite the recent progress in TCP feedback modeling, it is still common to use Chiu-Jain model for comparison of binary adjustment algorithms. This paper argues against such practice. We provide evidence that due to the incorrect assumption of uniform feedback, Chiu-Jain model is not suitable for trustworthy conclusions about properties of an adjustment algorithm. We emphasize that until algorithms are not analyzed with a more realistic feedback model, optimal choice of an binary adjustment algorithm will remain an open problem.
Nitya Ranganathan, Ramadass Nagarajan, Daniel Jimenez, Doug Burger, Stephen W. Keckler, and Calvin Lin. "Combining Hyperblocks and Exit Prediction to Increase Front-End Bandwidth and Performance". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-41 (technical report). May 2002. 27 pages.
As processor issue widths increase, high-bandwidth parallel fetches across multiple branches will be necessary to prevent Flynn's bottleneck. In this paper, we show that by combining hyperblocks and exit predictors, high instruction bandwidths can be achieved in a streamlined, simple design that requires only a third of the predictions of a conventional predictor. Exit predictors identify the first branch in a hyperblock that will be taken with a single prediction, implicitly predicting all previous branches in the hyperblock as not taken. Hyperblock-based exit predictors reduce the number of predictions by 72%, while maintaining an average prediction accuracy of 94.7% (compared to 95.9% for the best conventional predictor we measured, a PAs-gshare predictor). With this accuracy, a 250Kbit exit predictor achieves a mean 3.9 misses per thousand instructions (MPKI), fetching 91 useful straight-line instructions per prediction, on average. Since so many instructions are fetched with a single simple prediction, these predictors can be used for high frequencies, run-ahead prediction, initial prediction in multi-stage schemes, instruction prefetching, or power reduction. We conclude by showing that hyperblock exit predictors achieve higher accuracy than other high-bandwidth predictors at the same fetch bandwidth, and that these predictors can be used to achieve high ILP on advanced architectures such as Grid Processors.
He, Chun (Cindy). "Formal Specifications of Traceback Marking Protocols". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-42 (honors). May 2002. 21 pages.
Denial-of-Service attacks and Distributed Denial-of-Service attacks are serious security problems over the Internet due to their nature. They are easy to implement, hard to prevent, and very difficult to trace. This paper describes Denial-of-Service attacks and Distributed Denial-of-Service attacks and presents various traceback that are proposed to identify the sources of theses attackers in the Internet. Finally it gives formal specifications for a class of these traceback protocols called marking protocols.
Sahni, Jasleen Kaur. "Scalable Network Architectures for Providing Per-flow Service Guarantees". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-43 (dissertation). August 2002. 153 pages.
NO ABSTRACT
Bajaj, Chandrajit, Ariel Shamir, and Bong-Soo Sohn. "Progressive Tracking of Isosurfaces in Time-Varying Scalar Fields". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-44 (technical report). August 2002. 8 pages.
Scientific simulations and measurements often involve time dependent processes and produce time dependent data sets. Isosurface extraction is an important tool for visualizing three or two-dimensional time varying scalar fields defined by such data. Nevertheless, the size of the data and the dynamic nature impose difficulty in devising efficient and effective time dependent isosurface extraction techniques. In this paper, we describe a progressive algorithm for time dependent isosurface extraction. The algorithm maintains efficiency in time and space by exploiting coherency in both temporal and spatial dimensions of the data, as well as in the function values domain. It creates the isosurface of consecutive time steps progressively from previous time steps allowing time critical utilization. In addition, it can track evolving isosurface components and identify topology change events such as merge, split, vanish and create. This information is used to define several visualization techniques such as tracking of individual components, which help gain better understanding of the dynamic structure of the data.
Lopez-Herrejon, Roberto E. and Don Batory. "Using AspectJ to Implement Product-Lines: A Case Study". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-45 (technical report). 2002. 9 pages.
Aspect-Oriented Programming (AOP) is an emerging technology whose goal is to modularize concerns that cross-cut multiple classes. The purpose of this report is to describe how one of the main representatives of AOP, namely AspectJ, was used to implement a simple yet illustrative product-line of graph algorithms so that we can focus on the implementation details. We expect that studies like this can shed light on the applicability of AOP beyond the traditional examples of logging and debugging.
Liu, Huaiyu and Simon S. Lam. "Neighbor Table Construction and Update in a Dynamic Peer-to-Peer Network". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-46 (technical report). September 20, 2002. 28 pages.
In a system proposed by Plaxton, Rajaraman and Richa (the PRR scheme), the expected cost of accessing a replicated object was proved to be asymptotically optimal. The PRR scheme, however, assumes a static set of nodes and pre-existence of consistent and optimal neighbor tables in nodes. To extend the PRR scheme to a dynamic, distributed environment, such as the Internet, various protocols are needed (for node joining, leaving, table optimization, and failure recovery). In this paper, we first present a conceptual foundation, called C-set trees, for protocol design and reasoning about consistency. We then present the detailed specification of a join protocol. In our protocol, only nodes that are joining need to keep extra state information about the joining process. We present a rigorous proof that the join protocol generates consistent neighbor tables for an arbitrary number of concurrent joins. The crux of our proof is based upon induction on a C-set tree. Our join protocol can also be used for building consistent neighbor tables for a set of nodes at network initialization time. Lastly, we present both analytic and simulation results on the communication cost of a join in our protocol.
Desikan, Rajagopalan, Charles R. Lefurgy, Stephen W. Keckler, and Doug Burger. "On-chip MRAM as a High-Bandwidth, Low-Latency Replacement for DRAM Physical Memories". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-47 (technical report). 2002. 18 pages.
NO ABSTRACT
Kim, Min Sik, Simon S. Lam, and Lee Dong-Young. "Optimal Distribution Tree for Internet Streaming Media". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-48 (technical report). September 2002. 18 pages.
Internet radio and television stations require significant bandwidth to support delivery of high quality audio and video streams to a large number of receivers. IP multicast is an appropriate delivery model for these applications. However, widespread deployment of IP multicast on the Internet is unlikely in the near future. An alternative is to build a multicast tree in the application layer. Previous studies have addressed tree construction in the application layer. However, most of them focus on reducing delay. Few systems have been designed to achieve a high throughput for bandwidth-intensive applications. In this paper, we present a distributed algorithm to build an application-layer tree. We prove that our algorithm finds a tree such that the average incoming rate of receivers in the tree is maximized (under certain network model assumptions). We also describe protocols that implement the algorithm. For implementation on the Internet, there is a tradeoff between the overhead of available bandwidth measurements and fast convergence to the optimal tree. This tradeoff can be controlled by tuning some parameters in our protocols. Our protocols are also designed to maintain a small number, O(log n), of soft states per node to adapt to network changes and node failures.
Dhillon, Inderjit S., Edward M. Marcotte, and Usman Roshan. "Diametrical Clustering for identifying anti-correlated gene clusters". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-49 (technical report). September 2002. 9 pages.
Clustering genes based upon their expression patterns allows us to define cellular pathways and predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive --- genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. In this paper, we present a new "diametrical clustering" algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster; each singular vector serving as the prototype of a "diametric" cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteosome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
Dahlin, Mike, Bharat Chandra, Lei Gao, and Amol Nayate. "End-to-End WAN Service Availability". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-50 (technical report). September 2002. 26 pages.
This study seeks to understand how network failures affect the availability of service delivery across wide area networks and to evaluate classes of techniques for improving end-to-end service availability. Using several large-scale connectivity traces, we develop a model of network unavailability that includes key parameters such as failure location and failure duration. We then use trace-based simulation to evaluate several classes of techniques for coping with network unavailability. We find that caching alone is seldom effective at insulating services from failures but that the combination of mobile extension code and prefetching can improve average unavailability by as much as an order of magnitude for classes of service whose semantics support disconnected operation. We find that routing-based techniques may provide significant improvements, but that the improvements of many individual techniques are limited because they do not address all significant categories of network failures. By combining the techniques we examine, some systems may be able to reduce average unavailability by as much as one or two orders of magnitude.
Berger, Emery D. "Memory Management for High-Performance Applications". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-52 (dissertation). August 2002. 113 pages.
Memory managers are a source of performance and robustness problems for application software. Current general-purpose memory managers do not scale on multiprocessors, cause false sharing of heap objects, and systematically leak memory. Even on uniprocessors, the memory manager is often a performance bottleneck. General-purpose memory managers also do not provide the bulk deletion semantics required by many applications, including web servers and compilers. The approaches taken to date to address these and other memory management problems have been largely ad hoc. Programmers often attempt to work around these problems by writing custom memory managers. This approach leads to further difficulties, including data corruption caused when programmers inadvertently free custom-allocated objects to the general-purpose memory manager. In this thesis, we develop a framework for analyzing and designing high-quality memory managers. We develop a memory management infrastructure called heap layers that allows programmers to compose efficient memory managers from reusable and independently testable components. We conduct the first comprehensive examination of custom memory managers and show that most of these achieve slight or no performance improvements over a state-of-the-art general-purpose memory manager. Building on the knowledge gained in this study, we develop a hybrid memory management abstraction called reaps that combines the best of both approaches, allowing server applications to manage memory quickly and flexibly while avoiding memory leaks. We identify a number of previously unnoticed problems with concurrent memory management and analyze previous work in the light of these discoveries. We then present a concurrent memory manager called Hoard and prove that it avoids these problems.
Bientinesi, Paolo, John A. Gunnels, Margaret E. Myers, Enrique S. Quintana-Orti, and Robert A. van de Geijn. "The Science of Deriving Dense Linear Algebra Algorithms". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-53 (technical report). September 2002. 21 pages.
In this paper we present a systematic approach to the derivation of families of high-performance algorithms for a large set of frequently encountered dense linear algebra operations. As part of the derivation a constructive proof of the correctness of the algorithm is given. The paper is structured so that it can be used as a tutorial for novices. However, the method has been shown to yield new, high-performance algorithms for well-studied linear algebra operations and should also be of interest to the "high priests of high performance."
Amenta, Nina and Sunghee Choi. "Blocked Randomized Incremental Constructions". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-54 (technical report). October 2002. 11 pages.
Randomized incremental constructions are widely used in computational geometry, but they perform very badly on large data because of their inherently random memory access patterns. We define an insertion order which removes enough randomness to significantly improve performance, but leaves enough randomness so that the algorithms remain theoretically optimal, or nearly so.
Goto, Kazushige and Robert van de Geijn. "On Reducing TLB Misses in Matrix Multiplication". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-55 (technical report). November 2002. 19 pages.
During the last decade, a number of projects have pursued the high-performance implementation of matrix multiplication. Typically, these projects organize the computation around an "inner kernel," C=A T B+C, that keeps one of the operands in the L1 cache, while streaming parts of the other operands through that cache. Variants include approaches that extend this principle to multiple levels of cache or that apply the same principle to the L2 cache while essentially ignoring the L1 cache. The intent is to optimally amortize the cost of moving data between memory layers. The approach proposed in this paper is fundamentally different. We start by observing that for current generation architectures, much of the overhead comes from Translation Look-aside Buffer (TLB) table misses. While the importance of caches is also taken into consideration, it is the minimization of such TLB misses that drives the approach. The result is a novel approach that achieves highly competitive performance on a broad spectrum of current high-performance architectures.
Napper, Jeff, Lorenzo Alvisi, and Harrick Vin. "A Fault-Tolerant Java Virtual Machine". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-56 (technical report). May 2002. 16 pages.
The Java programming language was designed for portability and safe code distribution, not for fault-tolerance. We modify the Sun JDK1.2 to provide transparent fault-tolerance for many Java applications under the crash failure model. Our approach is to log non-deterministic events at the JVM interface using a primary-backup architecture. In particular, we identify the sources of non-determinism in the JVM due to asynchronous exceptions and multi-threaded access to shared data, as well as the non-determinism present at the native method interface. We analyze the overhead introduced in our system by each of these sources of non-determinism and compare the performance of different techniques for handling multi-threading.
Kistler, Michael and Lorenzo Alvisi. "Improving the Performance of Software Distributed Shared Memory with Speculation". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-57 (technical report). February 2002. 26 pages.
NO ABSTRACT
Gorinsky, Sergey, Sugat Jain, Harrick Vin, and Yongguang Zhang. "Lightweight Protection Against Inflated Subscription in Multicast Congestion Control". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-58 (technical report). November 7, 2002. 18 pages.
Group membership regulation is a useful mechanism for multicast congestion control: RLM, RLC, FLID-DL, and WEBRC form a promising line of multi-group protocols where receivers provide no feedback to the sender but control congestion via group subscription. Unfortunately, the group subscription mechanism also offers receivers an opportunity to elicit self-beneficial bandwidth allocations. In particular, a misbehaving receiver can ignore guidelines for group subscription and choose an unfairly high subscription level in a multi-group multicast session; this poses a serious threat to fairness of bandwidth allocation. In this paper, we present a lightweight solution for the problem of inflated subscription. Our design guards access to multicast groups with dynamic keys and consists of two independent components: DELTA (Distribution of ELigibility To Access) - a novel method for in-band distribution of group keys to receivers that are eligible to access the groups according to the congestion control protocol, and SIGMA (Secure Internet Group Management Architecture) - a generic architecture for key-based group access at edge routers. DELTA and SIGMA are the first to solve the problem of inflated subscription in multi-group protocols for multicast congestion control.
Garg, Rahul, Vijay Kumar, Atri Rudra, and Akshat Verma. "Coalitional games on graphs: core structure, substitutes and frugality". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-59 (technical report). November 2002. 11 pages.
We study mechanisms that can be modelled as coalitional games with transferrable utilities, and apply ideas from mechanism design and game theory to problems arising in a network design setting. We establish an equivalence between the game-theoretic notion of agents being substitutes and the notion of frugality of a mechanism. We characterize the core of the network design game and relate it to outcomes in a sealed bid auction with VCG payments. We show that in a game, agents are substitutes if and only if the core of the forms a complete lattice. We look at two representative games -- Minimum Spanning Tree and Shortest Path in this light. Finally, we design an ascending price mechanism for such games and study the strategic behavior of agents.
Plaxton, C. Greg. "Approximation Algorithms for Hierarchical Location Problems". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-60 (technical report). November 2002. 17 pages.
We formulate and (approximately) solve hierarchical versions of two prototypical problems in discrete location theory, namely, the metric uncapacitated k -median and facility location problems. Our work yields new insights into hierarchical clustering, a widely used technique in data analysis. First, we show that every metric space admits a hierarchical clustering that is within a constant factor of optimal at every level of granularity with respect to the average (squared) distance objective. Second, we provide a natural solution to the leaf ordering problem encountered in the traditional dendrogram-based approach to the visualization of hierarchical clusterings.
Li, Xiaozhou and C. Greg Plaxton. "On Name Resolution in Peer-to-Peer Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-61 (technical report). November 2002. 22 pages.
An efficient name resolution scheme is the cornerstone of any peer-to-peer network. The foundation of an efficient name resolution scheme is a dynamic network topology that determines the neighbor relationships to be maintained by the nodes in the network. The name resolution scheme proposed by Plaxton, Rajaraman, and Richa, which we hereafter refer to as the PRR scheme, is a scalable scheme that also provides provable locality properties on a certain class of growth-restricted metric spaces. On arbitrary metric spaces, however, some performance bounds of PRR are significantly weakened. In this paper, we define a class of network topologies called hyperdelta networks and observe that the PRR topology may be viewed as a random hyperdelta network. We then propose SPRR (simplified PRR), a variant of the PRR scheme that performs well on arbitrary metric spaces. SPRR imposes additional constraints on PRR neighbor selection by placing the nodes on a cycle. Although SPRR does not provide as strong locality properties as PRR, it exploits locality heuristically yet effectively. Finally, a significant level of fault tolerance can be achieved in SPRR without adding much complexity.
Mettu, Ramgopal R. "Approximation Algorithms for NP-Hard Clustering Problems". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-62 (dissertation). August 2002. 114 pages.
Given a set of n points and their pairwise distances, the goal of clustering is to partition the points into a "small" number of "related" sets. Clustering algorithms are used widely to manage, classify, and summarize many kinds of data. In this dissertation, we study the classic facility location and k-median problems in the context of clustering, and formulate and study a new optimization problem that we call the online median problem. For each of these problems, it is known to be NP-hard to compute a solution with cost less than a certain constant factor times the optimal cost. We give simple constant-factor approximation algorithms for the facility location, k-median, and online median problems with optimal or near-optimal time bounds. We also study distance functions that are "approximately" metric, and show that such distance functions allow us to obtain a faster online median algorithm and to generalize our analysis to other objective functions, such as that of the well-known k-means heuristic. Given n points, the associated interpoint distances and nonnegative point weights, and a nonnegative penalty for each point, the facility location problem asks us to identify a set of cluster centers so that the weighted average cluster radii and the sum of the cluster center penalties are both minimized. The k-median problem asks us to identify exactly k cluster centers while minimizing just the weighted average cluster radii. We give a simple greedy algorithm for the facility location problem that runs in O(n^2) time and produces a solution with cost at most 3 times optimal. For the k-median problem, we develop and make use of a sampling technique that we call "successive sampling," and give a randomized constant-factor approximation algorithm that runs in O(n(k+\log{n}+\log^2{n})) time. We also give an Omega(nk) lower bound on the running time of any randomized constant-factor approximation algorithm for the k-median problem that succeeds with even a negligible constant probability. In many settings, it is desirable to browse a given data set at differing levels of granularity (i.e., number of clusters). To address this concern, we formulate a generalization of the k-median problem that we call the online median problem. The online median problem asks us to compute an ordering of the points so that, over all i, when a prefix of length i is taken as a set of cluster centers, the weighted average radii of the induced clusters is minimized. We show that a natural generalization of the greedy strategy that we call "hierarchically greedy" yields an algorithm that produces an ordering such that every prefix of the ordering is within a constant factor of the associated optimal cost. Furthermore, our algorithm has a running time of Theta(n^2). Finally, we study the performance of our algorithms in practice. We present implementations of our k-median and online median algorithms; our experimental results indicate that our approximation algorithms may be useful in practice.
Zhang, Yongjie, Chandrajit Bajaj, and Bong-Soo Sohn. "Adaptive Multiresolution and Quality 3D Meshing from Imaging Data". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-63 (technical report). December 2002. 10 pages.
This paper presents an algorithm to extract adaptive and quality 3D meshes directly from volumetric imaging data - primarily Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). The extracted tetrahedral and hexahedral meshes are extensively used in the Finite Element Method (FEM). Our comprehensive approach combines bilateral and anisotropic (feature specific) diffusion filtering, with contour spectrum based, relevant isosurface and interval volume selection. Then, a top-down octree subdivision coupled with the dual contouring method is used to rapidly extract adaptive and multiresolution 3D finite element (tetrahedral and hexahedral) meshes from volumetric imaging data. The main contributions are extending the dual contouring method to interval volume tetrahedralization and hexahedralization with curvilinear feature sensitive adaptation. Compared to other tetrahedral extraction methods from interval volumes (Marching Cubes and Marching Tetrahedra), our method generates high quality adaptive multiresolution 3D meshes without introducing any hanging nodes. Our method has the properties of crack prevention, feature preservation and feature sensitivity.
Nakhleh, Luay, Daniel Miranker, Francois Barbancon, William H. Piel, and Michael Donoghue. "Requirements of Phylogenetic Databases". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-64 (technical report). December 2002. 21 pages.
We examine the organizational impact on phylogenetic databases of the increasing sophistication in the need and use of phylogenetic data. A primary issue is the use of the unnormalized representation of phylogenies in Newick format as a primitive data type in existing phylogenetic databases. In particular, we identify and enumerate a list of potential applications of such databases and queries (use-cases) that biologists may wish to see integrated into a phylogenetic database management system. We show there are many queries that would best be supported by a normalized data model where phylogenies are stored as lists of edges. Since many of the queries require transitive traversals of the phylogenies we demonstrate, constructively, that complex phylogenetic queries can be conveniently constructed as Datalog programs. We address concerns with respect to the cost and performance of the normalized representation by developing and empirically evaluating a feasibility prototype.
Ahmed, Muqtadar. "The Anonymity Ring". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-65 (honors). Fall 2002. 38 pages.
NO ABSTRACT
Pereira, Kevin. "An Empirical Investigation into a Location Sensing System Based on RF Signal Strength of TinyOS Motes". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-66 (honors). Fall 2002. 14 pages.
With the increasing usage of handheld, mobile wireless devices and the unprecedented popularity of celluar phones, the ability to know where a mobile device is physically located at any given time is fast becoming an intrinsic feature of any wireless application and service. This paper will primarily deal with an investigation into the feasibility of using TinyOS motes to build a system in incremental stages that would be able to locate tagged objects within an acceptable boundary of error
Agaram, Kartik K., Stephen W. Keckler, Calvin Lin, and Kathryn McKinley. "Phase Analysis of Program Memory Behavior". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-67 (technical report). 2002. 17 pages.
This paper describes a methodology for measuring and analyzing a program's memory performance. This method can both identify phase behavior and analyze the behavior of individual data structures. Our approach uses a new tool that combines compile-time annotation of memory allocation sites with a detailed microprocessor simulator. Using this infrastructure, we decompose the complex access patterns of four SPEC-2000 benchmarks by phase and by data structure. We study how different data structures coexist in a common memory system and we distinguish data structures that miss because of external conflicts from those that miss because of poor intrinsic locality. These results provide a richer understanding of the application than those delivered by tools that simply aggregate memory behavior into a single miss-rate statistic. These results also suggest effective optimizations for each data structure and phase of an application. In various phases of the execution of the benchmarks we identify optimizations, such as data-structure specific caching, static layout transformations, software prefetching and streaming, that are likely to be most effective. We show that the set of effective optimizations vary by application and program execution phase.
Lopez-Herrejon, Roberto E. and Don Batory. "Using Hyper/J to Implement Product-Lines: A Case Study". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-68 (technical report). 2002. 9 pages.
Aspect-Oriented Programming (AOP) is an emerging technology whose goal is to modularize concerns that may involve several classes. The purpose of this report is to describe how one of the main representatives of AOP, namely Hyper/J, was used to implement a simple yet illustrative product-line of graph algorithms.
Erdem, Esra. "Theory of Applications of Answer Set Programming". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-69 (dissertation). August 2002. 233 pages.
Answer set programming (ASP) is a new form of declarative logic programming. ASP interprets a logic program as a constraint on sets of literals, just as a propositional formula can be viewed as a constraint on assignments of truth values to atoms. The concept of an answer set was originally proposed as a semantics of negation as failure in Prolog. Instead of traditional Prolog systems, ASP uses answer set solvers. The input of Prolog consists of a logic program and a query, and Prolog computes answer substitutions; the input of an answer set solver is a logic program, and the solver computes the program's answer sets. The idea of ASP is to represent a given computational problem as a logic program whose answer sets correspond to solutions, and to use an answer set solver to find an answer set. We have investigated the application of ASP to several combinatorial search problems, including planning, wire routing, and phylogeny reconstruction. Planning is the problem of finding a sequence of actions that leads to a given goal. Wire routing is the problem of determining the physical locations of all wires interconnecting the circuit components on a chip. Phylogeny reconstruction is the problem of constructing and labeling an evolutionary tree for a set of taxa (taxonomic units), which describes the evolution of the taxa in that set from their most recent common ancestor. In our work on phylogeny reconstruction, we have generated several conjectures about the evolutionary history of Indo-European languages. The work on the use of ASP for planning has led us to the investigation of some theoretical questions related to answer sets. One is the problem of equivalent transformations of logic programs: under what conditions can we replace a program by an equivalent program that can be processed by an answer set solver more efficiently? Another problem is related to completion--a process that can translate a logic program into a set of formulas of classical logic. In some cases, the interpretations satisfying the completion of a program are also the answer sets for that program. In such cases, we can use propositional solvers--systems that compute a model of a given set of clauses--to find the program's answer sets. For some problems, propositional solvers are more efficient than answer set solvers. Therefore, we have investigated under what conditions we can use propositional solvers to find the program's answer sets.
Mittal, Neeraj. ": Techniques for Analyzing Distributed Computations". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-70 (dissertation). May 2002. 182 pages.
Inherent non-determinism in distributed programs and presence of multiple threads of control makes it difficult to write correct distributed software. Not surprisingly, distributed systems are particularly vulnerable to software faults. To build a distributed system capable of tolerating software faults, two important problems need to be addressed: fault detection and fault recovery . The fault detection problem requires finding a (consistent) global state of the computation that satisfies certain predicate ( e.g. , violation of mutual exclusion). To prevent a fault from causing any serious damage such as corrupting stable storage, it is essential that it be detected in a timely manner. However, we prove that detecting a predicate in 2-CNF, even when no two clauses contain variables from the same process, is an NP-complete problem. We develop a technique, based on computation slicing , to reduce the size of the computation and thus the number of global states to be examined for detecting a predicate. Slicing can be used to throw away the extraneous global states of the computation in an efficient manner, and focus on only those that are currently relevant for our purpose. To detect a fault, therefore, rather than searching the state-space of the computation, it is much more efficient to search the state-space of the slice. We identify several useful classes of predicates for which the slice can be computed efficiently. Our experimental results indicate that slicing can lead to an exponential reduction over existing techniques both in terms of time as well as space for fault detection. To recover from faults, we consider rollback recovery approach, which involves restoring the system to a previous state and then re-executing. We focus on rollback recovery using controlled re-execution , which is useful and effective for tolerating synchronization faults . Unlike other approaches which depend on chance and do not ensure that the re-execution is fault-free, the controlled re-execution method avoids synchronization faults during re-execution in a deterministic fashion. Specifically, it selectively adds synchronization dependencies during re-execution to ensure that the previously detected synchronization faults do not occur again. We provide efficient algorithms to solve the problem for two important classes of synchronization faults.
Kolbly, Donovan. ": Extensible Language Implementation". The University of Texas at Austin, Department of Computer Sciences. Report# TR-02-71 (dissertation). December 2002. 177 pages.
This work presents several new approaches to the construction of extensible languages, and is the first system to combine local, dynamically extensible context-free syntax with the expressive power of meta-level procedures. The unifying theme of our system is that meaning should be computed relative to local context. We show how this theme is manifest in an implementation of a Scheme macro system which achieves hygienic macro expansion without rewriting. Additionally, our Scheme macro system makes available compile-time meta-objects for additional power in writing macros; macros that pattern match on compile-time types for optimization at macro-processing time are one example. This approach is currently in use in our RScheme implementation of Scheme. We also show the how this approach is applied to languages with conventional syntax, using Java as an example. We present a dynamically extensible parser based on the Earley parsing algorithm. This approach is practical as well as flexible; a straightforward implementation in C parses a 600-line (2777 token) file in about 44ms on an 866MHz Pentium III. We also describe a language extension framework that makes possible an extensible variant of Java, in which new syntax can be supplied by the casual programmer with only limited knowledge of the underlying compiler implementation or approach. This finally makes available to Java programmers the easy access to structured macro facilities that Lisp programmers find so powerful. Finally, we demonstrate this framework by constructing a deterministic finite automaton language extension to Java.
For help please contact trcenter@cs.utexas.edu