Skip to Content

The University of Texas at Austin

CS Technical Abstracts (2007)

TR-07-01

Sherstov, Alexander A. "Halfspace Matrices". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-01 (regular report). January 4, 2007. 18 pages.

NO ABSTRACT

DOWNLOAD tr07-01.pdf

TR-07-02

Sherstov, Alexander A. "Separating AC^0 from Depth-2 Majority Circuits". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-02 (regular report). January 4, 2007. 14 pages.

NO ABSTRACT

DOWNLOAD tr07-02.pdf

TR-07-03

Chowdhury, Rezaul A., Hai-Son Le, and Vijaya Ramachandran. "Efficient Cache-oblivious String Algorithms for Bioinformatics". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-03 (regular report). January 25, 2007. 26 pages.

We present theoretical and experimental results on cache-efficient and parallel algorithms for some well-studied string problems in bioinformatics: 1. Pairwise alignment. Optimal pairwise global sequence alignment using affine gap penalty; 2. Median. Optimal alignment of three sequences using affine gap penalty; 3. RNA secondary structure prediction. Maximizing number of base pairs in RNA secondary structure with simple pseudoknots. For each of these three problems we present cache-oblivious algorithms that match the best-known time complexity, match or improve the best-known space complexity, and improve significantly over the cache-efficiency of earlier algorithms. We also show that these algorithms are easily parallelizable, and we analyze their parallel performance. We present experimental results that show that all three cache-oblivious algorithms run faster on current desktop machines than the best previous algorithm for the problem. For the first two problems we compare our code to publicly available software written by others, and for the last problem our comparison is to our implementation of Akutsu's algorithm. We also include empirical results showing good performance for the parallel implementations of our algorithms for the first two problems. Our methods are applicable to several other dynamic programs for string problems in bioinformatics including local alignment, generalized global alignment with intermittent similarities, multiple sequence alignment under several scoring functions such as `sum-of-pairs' objective function and RNA secondary structure prediction with simple pseudoknots using energy functions based on adjacent base pairs.

DOWNLOAD tr07-03.pdf

TR-07-04

Aiyer, Amitanand S., Lorenzo Alvisi, and Rida A. Bazzi. "Wait Free Atomic Semantics and Writebacks -- Preliminary Version". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-04 (regular report). February 26, 2007. 8 pages.

In the presence of Byzantine faults no protocolcan achieve wait-free atomic semantics without having the reader perform a write back. All existing protocols for wait-free atomic semantics write back the entire value to the servers. We show the first protocol that does not writeback the value, but still achieves wait-free atomic semantics by writing back the timestamp. Further, we also show that wait-free atomic semantics can be achieved even when the readers are only allowed to change just {\it 1-bit of information} at the servers.

DOWNLOAD tr07-04.pdf

TR-07-05

Sra, Suvrit, Prateek Jain, and Inderjit Dhillon. "Modeling data using directional distributions: Part II". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-05 (regular technical report). February 5, 2007, updated May 2008. 15 pages.

High-dimensional data is central to most data mining applications, and only recently has it been modeled via directional distributions. In [Banerjee et al., 2003] the authors introduced the use of the von Mises-Fisher (vMF) distribution for modeling high-dimensional directional data, particularly for text and gene expression analysis. The vMF distribution is one of the simplest directional distributions. TheWatson, Bingham, and Fisher-Bingham distributions provide distri- butions with an increasing number of parameters and thereby commensurately increased modeling power. This report provides a followup study to the initial development in [Banerjee et al., 2003] by presenting Expectation Maximization (EM) procedures for estimating parameters of a mixture of Watson (moW) distributions. The numerical challenges associated with parameter estimation for both of these distributions are significantly more difficult than for the vMF distribution. We develop new numerical approximations for estimating the parameters permitting us to model real- life data more accurately. Our experimental results establish that for certain data sets improved modeling power translates into better results.

DOWNLOAD tr07-05.pdf

TR-07-06

Mao, Yun, Feng Wang, Lili Qiu, Simon S. Lam, and Jonathan M. Smith. "S4: Small State and Small Stretch Routing Protocol for Large Wireless Sensor Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-06 (regular report). May 8, 2007. 14 pages.

Routing protocols for wireless sensor networks must address the challenges of reliable packet delivery at increasingly large scales and with highly constrained node resources. Attempts to reduce routing state can result in undesirable worst-case routing performance, as measured by stretch, which is the ratio of the hop count of the selected path to that of the optimal path. We present a new routing protocol, Small State and Small Stretch (S4), which jointly minimizes the state and stretch. S4 uses a combination of beacon distance-vector based global routing state and scoped distance-vector based local routing state to achieve a worst-case stretch of 3 using $O(\sqrt{N})$ routing state per node in an N-node network. Its average routing stretch is close to 1. S4 further incorporates local failure recovery to achieve resilience to dynamic topology changes. We use multiple simulation environments to assess performance claims at scale, and use experiments in a 42-node wireless sensor network testbed to evaluate performance under realistic RF and failure dynamics. The results show that S4 achieves scalability, efficiency, and resilience in a wide range of scenarios.

DOWNLOAD tr07-06.pdf

TR-07-07

Grauman, Kristen and Trevor Darrell. "Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-07 (regular report). February 12, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-08

Jain, Prateek, Brian Kulis, and Inderjit Dhillon. "Online Linear Regression using Burg Entropy". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-08 (regular report). February 14, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-09

Shevade, Upendra, Ravi Kokku, and Harrick M. Vin. "Run-time System for Scalable Throughput-oriented Services". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-09 (regular report). February 20, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-10

Cook, William R. and Ali H. Ibrahim. "Programming Languages & Databases: What's the Problem?". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-10 (regular report). February 20, 2007. 18 pages.

The problem of integrating databases and programming languages has been open for nearly 45 years. During this time much progress has been made, in exploring specialized database programming languages, orthogonal persistence, object-oriented databases, transaction models, data access libraries, embedded queries, and object-relational mapping. While new solutions are proposed every year, none has yet proven fully satisfactory. One explanation for this situation is that the problem itself is not sufficiently well defined, so that partial solutions continue to be proposed and evaluated based upon incomplete metrics, making directed progress difficult. This paper is an attempt to clarify the problem, rather than propose a new solution. We review issues that arise on the boundary between programming languages and databases, including typing, optimization, and reuse. We develop specific criteria for evaluating solutions and apply these to the solution approaches mentioned above. The analysis shows that progress has been made, yet the key problem of meeting all the criteria simultaneously remains open.

DOWNLOAD tr07-10.pdf

TR-07-11

Brown, Daniel and William R. Cook. "Monadic Memoization Mixins". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-11 (regular report). February 20, 2007. 11 pages.

Memoization is a familiar technique for improving the performance of programs: computed answers are saved so that they can be reused later instead of being recomputed. In a pure functional language, memoization of a function is complicated by the need to manage the table of saved answers between calls to the function, including recursive calls within the function itself. A lazy recursive data structure can be used to maintain past answers--although achieving an efficient algorithm can require a complex rewrite of the function into a special form. Memoization can also be defined as a language primitive--but to be useful it would need to support a range of memoization strategies. In this paper we develop a technique for modular memoization within a pure functional language. We define monadic memoization mixins that are composed (via inheritance) with an ordinary monadic function to create a memoized version of the function. As a case study, we memoize a recursive-descent parser written using standard parser combinators. A comparison of the performance of different approaches shows that memoization mixins are efficient for a small example.

DOWNLOAD tr07-11.pdf

TR-07-12

Choi, Taehwan and Sunghoon Seo. "ABC² : A New Approach to Seamless Mobility between Cellular Networks and WLAN". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-12 (regular report). December 21, 2007. 26 pages.

With the advent of new innovative mobility protocols, it is hard to re- alize seamless mobility. There are several reasons for this problem. First, service providers have to deploy in the entire networks and this demands a lot of financial investment. Second, even with those innovations, new mobility protocols are likely to be ignored by customers if it costs a lot of money for subscribers. Third, it is likely that most mobility protocols does not support both high bandwidth and large range. Although the widespread deployment of cellular networks guarantees mobility in large range, it cannot completely support multimedia data. Similarly, although the explosive growth of WLAN provides the conve- nience of wireless with high bandwidth, it lacks large range support. Thus, in this paper, we propose a new approach, ABC2, for seamless mobility such that our approach provides both high bandwidth and large range. ABC2 will provide not only an ÒAlways Best ConnectedÓ but also an ÒAlways Best ComplementedÓ interworking architecture between cellular networks and WLAN. ABC2 architecture is novel in that packet data network is layered on top of circuit switch network such that it can use current legacy tele- phone networks for mobility management and it can provide multimedia data with low cost investment as the Internet advances to the higher band- width. In our analysis, we show that ABC2 performs better in terms of location updates, handoff latency, and signaling delay compared to most mobility protocols dependent on Mobile IP.

DOWNLOAD tr07-12.pdf

TR-07-13

Aiyer, Amitanand S., Lorenzo Alvisi, and Rida A. Bazzi. "Lightweight writeback for Byzantine storage systems". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-13 (regular report). March 1, 2007. 24 pages.

We present the first optimally resilient, bounded, wait-free implementation of a replicated register providing atomic semantics in a system in which readers can be Byzantine, up to $f$ servers ($n \geq (3f+1)$) are subject to Byzantine failures and servers do not communicate with each other. Unlike previous solutions, the sizes of messages sent to writers depend only on the actual number of active readers and not on the total number of readers in the system. Timestamps generated by our solution are non-skipping and messages sent to readers and writers contain only a finite number of values, operation identifiers, and timestamps. We introduce {\em lightweight write back}, a new mechanism which enables readers to write back only the (non skipping) timestamp of the value read and not the value itself. This is particularly important since Byzantine readers that write back a value could force the servers to process infinitely large messages, whereas a non-skipping timestamp is practically finite in size. With a novel use of secret sharing techniques combined with writeback throttling, we manage to tolerate Byzantine readers without the use of any unproven cryptographic assumptions.

DOWNLOAD tr07-13.pdf

TR-07-14

Rozner, Eric, Jayesh Seshadri, Yogita Mehta, and Lili Qiu. "SOAR: Simple Opportunistic Adaptive Routing Protocol for Wireless Mesh Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-14 (regular report). March 5, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-15

Jain, N., D. Kit, P. Mahajan, P. Yalagandula, M. Dahlin, and Y. Zhang. "{STAR:} Self Tuning Aggregation for Scalable Monitoring (extended)". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-15 (regular report). March 21, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-16

Navratil, Paul Arthur, Jarrett L. Johnson, and Volker Bromm. "Visualization of Cosmological Point-Based Data Sets". The University of Texas at Austin, Departments of Computer Sciences and Astronomy. Report# TR-07-16 (regular report). March 27, 2007. 7 pages.

We describe our visualization process for a particle-based simulation of the formation of the first stars and their impact on cosmic history. The dataset consists of several hundred time-steps of point simulation data, each approximately 500MB in size. For each time-step, we interpolate the point data onto a regular grid using a method taken from the radiance estimate of photon mapping. We import the resulting regular grid representation into ParaView, with which we extract isosurfaces across multiple variables. Our images provide insights into the evolution of the early universe, tracing the cosmic transition from an initially homogeneous state to one of increasing complexity. Specifically, our visualizations capture the build-up of regions of ionized gas around the first stars, their evolution, and their complex interactions with the surrounding matter. These phenomena were not clearly visible with other visualization techniques. These observations will guide the upcoming James Webb Space Telescope, the key astronomy mission of the next decade.

DOWNLOAD tr07-16.pdf

TR-07-17

Bond, Michael D. and Kathryn S. McKinley. "Probabilistic Calling Context". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-17 (regular report). March 29, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-18

Bond, Michael D., Nicholas Nethercote, Stephen W. Kent, Samuel Z. Guyer, and Kathryn S. McKinley. "Tracking Bad Apples: Reporting the Origin of Null and Undefined Value Errors". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-18 (regular report). March 29, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-19

Navratil, Paul Arthur, Donald S. Fussell, and Calvin Lin. "Dynamic Ray Scheduling for Improved System Performance". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-19 (regular report). April 12, 2007. 8 pages.

The performance of full-featured ray tracers has historically been limited by the hardware's floating point computational power. However, next generation multi-threaded multi-core architectures promise to provide sufficient CPU power to support real time frame rates. In such systems, the emerging problem will be limited memory system performance in terms of both on-chip cache and DRAM-to-cache bandwidth. This paper presents a novel ray tracing algorithm that significantly improves both cache utilization and DRAM-to-cache bandwidth. The key insight is to view ray traversal as a scheduling problem, which allows our algorithm to match ray traversal computations and intersection computations with available system resources. Using a detailed simulator, we show that our algorithm reduces the amount of geometry brought into the cache by up to 32x for primary rays and up to 60x for shadow rays, in exchange for the small overhead of maintaining the ray schedule. Moreover, our algorithm creates units of work that are more amenable to parallelization than traditional Whitted-style ray tracers.

DOWNLOAD tr07-19.pdf

TR-07-20

Kotla, Ramakrishna, Lorenzo Alvisi, and Mike Dahlin. "SafeStore : A Durable and Practical Storage System". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-20 (regular report). April 30, 2007. 18 pages.

This paper presents SafeStore, a distributed storage system designed to maintain long-term data durability despite conventional hardware and software faults, environmental disruptions, and administrative failures caused by human error or malice. The architecture of SafeStore is based on fault isolation, which Safe- Store applies aggressively along administrative, physical, and temporal dimensions by spreading data across autonomous storage service providers (SSPs). However, current storage interfaces provided by SSPs are not designed for high end-to-end durability. In this paper, we propose a new storage system architecture that (1) spreads data efficiently across autonomous SSPs using informed hierarchical erasure coding that, for a given replication cost, provides several additional 9Õs of durability over what can be achieved with existing black-box SSP interfaces, (2) performs an efficient end-to-end audit of SSPs to detect data loss that, for a 20% cost increase, improves data durability by two 9Õs by reducing MTTR, and (3) offers durable storage with cost, performance, and availability competitive with traditional storage systems. We instantiate and evaluate these ideas by building a SafeStore-based file system with an NFSlike interface.

DOWNLOAD tr07-20.pdf

TR-07-21

Aiyer, Amitanand S., Lorenzo Alvisi, Allen Clement, and Harry C. Li. "Information-Theoretically Secure Byzantine Paxos". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-21 (regular report). May 18, 20077. 11 pages.

We present Information Theoretically secure Byzantine Paxos (IT ByzPaxos), the first deterministic asynchronous Byzantine consensus protocol that is provably secure despite a computationally unbounded adversary. Previous deterministic asynchronous algorithms for Byzantine consensus rely on unproven number theoretic assumptions (i.e ., digital signatures) to maintain agreement. IT ByzPaxos instead uses secret sharing techniques that are information theoretically secure to ensure that all correct processes agree. Our protocol guarantees safety in an asynchronous system and provides progress under eventual synchrony. IT ByzPaxos matches the 3f+1 lower bound on the number of processes for Byzantine consensus.

DOWNLOAD tr07-21.pdf

TR-07-22

Dimitrov, Nedialko and Indrajit Roy. "A Competitive, Primal-Dual Algorithm for Stable Coalitions in a Cluster". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-22 (regular report). May 7, 2007. 21 pages.

In this paper we study the following Cluster Profit Problem. Highly parallelizable requests arrive on network nodes. Each request is associated with a tuple (g, r). The requester is willing to pay k*g if k machines execute the request in parallel. If some machines work on a request, those machines must pay the request processing costs, r, as well as connection costs to the request. The problem is to find a profit maximizing assignment of machines to request, such that each machine works on at most one request. The Cluster Profit Problem can be viewed as a profit maximizing variant of the Facility Location Problem. We show a constant-competitive, primal-dual algorithm for the Cluster Profit Problem. We show the algorithm is game theoretically stable with respect to group deviations on the part of the participating machines. Finally, we provide a distributed implementation of the algorithm that uses only local network structure for the required computations.

DOWNLOAD tr07-22.pdf

TR-07-23

Rozner, Eric, Yogita Mehta, Aditya Akella, and Lili Qiu. "Traffic-Aware Channel Assignment in Enterprise Wireless LANs". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-23 (regular report). May 7, 2007. 15 pages.

Campus and enterprise wireless networks are increasingly characterized by ubiquitous coverage and rising traffic demands. Efficiently assigning channels to access points (APs) in these networks can significantly affect the performance and capacity of the WLANs. The state-of-the-art approaches assign channels statically, without considering prevailing traffic demands. In this paper, we show that the quality of a channel assignment can be improved significantly by incorporating observed traffic demands at APs and clients into the assignment process. We refer to this as traffic-aware channel assignment. We conduct extensive tracedriven and synthetic simulations and identify deployment scenarios where traffic-awareness is likely to be of great help, and scenarios where the benefit is minimal. We address key practical issues in using traffic awareness, including measuring an interference graph, handling non-binary interference, collecting traffic demands, and predicting future demands based on historical information. We present an implementation of our assignment scheme for a 25-node WLAN testbed. Our testbed experiments show that traffic-aware assignments offers superior network performance under a wide range of real network configurations. On the whole, our approach is simple yet effective. It can be incorporated into existing WLANs with little modification to the wireless nodes or infrastructure.

DOWNLOAD tr07-23.pdf

TR-07-24

Hollander, Jeremy. "A Link Layer Discovery Protocol Fuzzer". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-24 (regular report). May 7, 2007. 6 pages.

We developed the first Link Layer Discovery Protocol (LLDP) fuzzer with ten test cases to find security vulnerabilities in LLDP-enabled network devices. The current test cases look for off-by-one errors, consistency errors, buffer overflows and stack injections. Furthermore, our fuzzer can easily be extended with additional test cases.

DOWNLOAD tr07-24.pdf

TR-07-25

Li, Harry C., Allen Clement, Amitanand Aiyer, and Lorenzo Alvisi. "The Paxos Register". The University of Texas at Austin, Department of Computer Sciences and Astronomy. Report# TR-07-25 (regular report). May 14, 2007. 21 pages.

We introduce the Paxos register to simplify and unify the presentation of Paxos-style consensus protocols. We use our register to show how Lamport's Classic Paxos and Castro and Liskov's Byzantine Paxos are the same consensus protocol, but for different failure models. We also use our register to compare and contrast Byzantine Paxos with Martin and Alvisi's Fast Byzantine Consensus. The Paxos register is a write-once register that exposes two important abstractions for reaching consensus: (i) read and write operations that capture how processes in Paxos protocols propose and decide values and (ii) tokens that capture how these protocols guarantee agreement despite partial failures. We encapsulate the differences of several Paxos-style protocols in the implementation details of these abstractions.

DOWNLOAD tr07-25.pdf

TR-07-26

Sherstov, Alexander A. "Communication Complexity under Product and Nonproduct Distributions". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-26 (regular report). May 15, 2007. 16 pages.

See abstract on report. This report doesn't print properly in HTML.

DOWNLOAD tr07-26.pdf

TR-07-27

Arora, Anish, Young-ri Choi, Mohamed G. Gouda, Jason O. Hallstrom, Ted Herman, William M. Leal, and Nigamanth Sridhar. "A Programming Construct for Sensor Applications". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-27 (regular report). May 18, 2007. 16 pages.

DESAL is an NSF-sponsored project to produce a programming language, also called DESAL, for developing software for sensor network applications. The DESAL programming language is both statebased (rather than event-based) and rule-based (rather than being linearly executed). In this paper, we present a new programming construct called Phases, that can be incorporated into the DESAL programming language. This construct allows a programmer to develop software (for sensor network applications) that can run on many sensor networks, that support phases, without having to worry about the physical characteristics of these sensor networks, such as the number of sensors in the network, and the end-to-end delay of the network.

DOWNLOAD tr07-27.pdf

TR-07-28

Kampasi, Abhinay, Yin Zhang, Giovanni Di Crescenzo, Abhrajit Ghosh, and Rajesh Talpade. "Improving Stepping Stone Detection Algorithms using Anomaly Detection Techniques". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-28 (regular report). May 21, 2007. 8 pages.

Network attackers frequently use a chain of compromised intermediate nodes to attack a target machine and maintain anonymity. This chain of nodes between the attacker and the target is called a stepping stone chain. Various algorithms have been proposed to detect stepping stones, timing correlation based algorithms being one of them. However, the existing timing based algorithms are susceptible to failure if the attacker actively tries to evade detection using jitter or chaff. We have developed three anomaly detection algorithms to detect the presence of jitter and chaff in interactive connections. Experiments performed on Deter using real-world traces and live traffic demonstrate that the algorithms perform well with very low false positives and false negatives and have a high success percentage of about 99%. These algorithms based on response times from the server and causality of traffic in both directions of an interactive connection have made the existing stepping stone detection framework more robust and resistant to evasion.

DOWNLOAD tr07-28.pdf

TR-07-29

Hanson, Heather. "Coordinated Power, Energy, and Temperature Management". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-29 (dissertation). June 18, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-30

Tok, Teck Bok and Calvin Lin. "Applying Relevance-Based Context Partitioning to Pointer Analysis". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-30 (regular tech report). July 6, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-31

Liu, Huaiyu and Simon Lam. "Neighbor Table Construction and Update for Resilient Hypercube Routing in P2P Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-31 (regular tech report). July 12, 2007. 28 pages.

Several proposed peer-to-peer networks use hypercube routing for scalability. Consistency of neighbor tables in hypercube routing guarantees the existence of a path from any source node to any destination node. Such consistency, however, can be broken by node failures. To improve the robustness of hypercube routing, we first generalize the concept of consistency to K-consistency, for K>=1, which is shown to provide at least K disjoint paths for any source-destination pair with a probability close to 1. Our next objective is to design and specify a new join protocol together with a proof that it generates K-consistent neighbor tables for an arbitrary number of concurrent joins. We first present a conceptual foundation, called C-set trees, for protocol design and reasoning about K-consistency. We then present a detailed specification of a join protocol, and a rigorous proof of correctness for the join protocol. The crux of our proof is based upon induction on C-set trees. Both theoretical analysis and simulation results show that the join protocol is scalable to a large number of network nodes.

DOWNLOAD tr07-31.pdf

HR-07-31

Madigan, Ryan. "Creating a High-Level Control Module for an Autonomous Mobile Robot Operating in an Urban Environment". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-31 (honors thesis). Spring 2007. 15 pages.

The DARPA Urban Challenge is a government-sponsored competition scheduled for November 2007. The challenge is to create a fully autonomous vehicle built upon a street-legal frame that can navigate an urban driving environment. Naturally, urban driving environments require the vehicle to obey traffic laws, interact properly with other vehicles on the road, and generally drive much like a human would. This functionality requires a good amount of sophisticated decision making that was not necessary to complete the 2005 DARPA Grand Challenge. This paper outlines our strategy for implementing a control module to effectively maintain awareness of our vehicleÕs highlevel situation and to ensure that the vehicle is working toward completing its mission while following the rules of the road.

DOWNLOAD cs-07-31-madigan.pdf

TR-07-32

Aiyer, Amitanand, Lorenzo Alvisi, and Rida Bazzi. "Bounded Wait-Free Implementation of Optimally resilient Byzantine Storage without (Unproven) Cryptographic assumptions in P2P Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-32 (regular tech report). July 17, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

HR-07-32

Marker, Bryan. "On Composing Matrix Multiplication from Kernels". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-32 (honors thesis). Spring 2007. 21 pages.

Matrix multiplication is often treated as a basic unit of computation in terms of which other operations are implemented, yielding high performance. In this paper initial evidence is provided that there is a benefit gained when lower level kernels, from which matrix multiplication is composed, are exposed. In particular it is shown that matrix multiplication itself can be coded at a high level of abstraction as long as interfaces to such low-level kernels are exposed. Furthermore, it is shown that higher performance implementations of parallel matrix-multiplication can be achieved by exploiting access to these kernels. Experiments on Itanium2 and Pentium 4 servers support these insights.

DOWNLOAD cs-07-32-marker.pdf

TR-07-33

Sherstov, Alexander. "A Discrepancy-Based Proof of Razborov's Quantum Lower Bounds". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-33 (regular tech report). July 24, 2007. 25 pages.

NO ABSTRACT

DOWNLOAD tr07-33.pdf

HR-07-33

Moldenhauer, William. "Bringing Verification to a Virtual World". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-33 (honors thesis). Spring 2007. 29 pages.

Second Life is a virtual world where avatars representing real entities interact through scripted transactions. Avatars may be anonymous so that relationships from the real world do not carry into Second Life, and their trustworthiness may not be apparent. Our goal is to apply software verification techniques to establish a basis for trust in Second Life within the realm of scripted transactions. Even though Second Life is a virtual world, it has an economy that provides very real financial opportunity as well as risk. Scripts can be used to help manage assets of financial value in Second Life, but can they be trusted to handle financial assets appropriately? This project aims to provide a solution to this problem by automatically generating models of Second Life scripts that can be verified through the use of traditional software verification tools. This verification capability serves as an initial step towards a certification system for Second Life, providing a framework for formally verifying that Second Life scripts correctly implement their desired behaviors and certifying scripts with varying levels of trust, based on the properties they can be formally shown to satisfy.

DOWNLOAD cs-07-33-moldenhauer.pdf

TR-07-34

Sherstov, Alexander. "Approximate Inclusion-Exclusion for Arbitrary Symmetric Functions". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-34 (regular tech report). July 24, 2007. 29 pages.

NO ABSTRACT

DOWNLOAD tr07-34.pdf

HR-07-34

Roche, David Lan. "Experimental Study of High Performance Priority Queues". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-34 (honors thesis). Spring 2007. 35 pages.

The priority queue is a very important and widely used data structure in computer science, with a variety of applications including DijkstraÕs Single Source Shortest Path algorithm on sparse graph types. This study presents the experimental results of a variety of priority queues. The focus of the experiments is to measure the speed and performance of highly specialized priority queues in out-of-core and memory intensive situations. The priority queues are run in-core on small input sizes as well as out-of-core using large input sizes and restricted memory. The experiments compare a variety of well-known priority queue implementations such as Binary Heap with highly specialized implementations, such as 4-ary Aligned Heap, Chowdhury and RamachandranÕs Auxiliary Buffer Heap, and Fast Binary Heap. The experiments include Cache-Aware as well as Cache-Oblivious priority queues. The results indicate that the high-performance priority queues easily outperform traditional implementations. Also, overall the Auxiliary Buffer Heap has the best performance among the priority queues considered in most in-core and out-of-core situations.

DOWNLOAD cs-07-34-roche.pdf

TR-07-35

Batory, Don. "From Implementation To Theory in Product Synthesis". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-35 (regular tech report). January 18, 2007. 13 pages.

Future software development will rely on product synthesis, i.e., the synthesis of code and non-code artifacts for a target component or application. Prior work on feature-based product synthesis can be unified by applying elementary ideas from category theory. Doing so reveals (a) important and previously unrecognized properties that product synthesis tools must satisfy, and (b) non-obvious generalizations of current techniques that will guide future research efforts in automated product development.

DOWNLOAD tr07-35.pdf

HR-07-35

Zacharski, Adam. "Xen and the Art of Distributed Virtual Machine Management". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-35 (honors thesis). Spring 2007. 24 pages.

In this thesis report, a method for automating the creation and management of many virtual machines across multiple physical machines is described. By using the Xen paravirtualization software as a base, this method adds tools to create new virtual machine images and allows for live migration without any human intervention. This method uses management agents that run on the physical and virtual machines. There is also a management GUI that can be run from any computer on the network. All of these agents (the management clients and the GUI) communicate using a publish/subscribe based messaging system. For example, when a machine is overloaded the server agent on that machine broadcasts a message to the publish/subscribe message bus that contains a request for help. This system was evaluated on a network of four dual processor dual core servers. Multiple high load virtual machines were started on a single physical machine. The system was then monitored to make sure that the virtual machines were evenly distributed among the other physical machines on the network. The test results show that this method for virtual machine management could be used for the management of a few hundred virtual machines.

DOWNLOAD cs-07-35-zacharski.pdf

TR-07-36

Batory, Don and Egon Borger. "On The Modularization of Theorems for Software Product Lines". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-36 (regular tech report). July 24, 2007. 10 pages.

A goal of software product lines is the economical synthesis of programs in a family of programs. In this paper, we explain how theorems about program properties can be integrated into feature-based development of software product lines. As a case study, we analyze an existing Java/JVM compilation correctness proof for defining, interpreting, compiling, and executing bytecode for the Java language. We explain how features modularize both programs and theorems. By composing features, the source code and theorems for a program are synthesized. Generated theorems may then be certified manually or automatically using a proof checker, opening a new line of research in verification.

DOWNLOAD tr07-36.pdf

TR-07-37

Quintana-Orti, Gregorio, Enrique S. Quintana-Orti, Ernie Chan, Robert A. van de Geijn, and Field G. Van Zee. "Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-37 (regular tech report). July 31, 2007. 15 pages.

This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-blocks are presented. Each implementation views a block of a matrix as the fundamental unit of data, and likewise, operations over these blocks as the primary unit of computation. The first is a conventional blocked algorithm similar to those included in libFLAME and LAPACK but expressed in a way that allows operations in the so-called critical path of execution to be computed as soon as their dependencies are satisfied. The second algorithm captures a higher degree of parallelism with an approach based on Givens rotations while preserving the performance benefits of algorithms based on blocked Householder transformations. We show that the implementation effort is greatly simplified by expressing the algorithms in code with the FLAME/FLASH API, which allows matrices stored by blocks to be viewed and managed as matrices of matrix blocks. The SuperMatrix run-time system utilizes FLASH to assemble and represent matrices but also provides out-of-order scheduling of operations that is transparent to the programmer. Scalability of the solution is demonstrated on a ccNUMA platform with 16 processors.

DOWNLOAD tr07-37.pdf

TR-07-38

Wiedermann, Ben. "Know your place: Selectively executing statements based on context". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-38 (regular tech report). July 31, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-39

Wiedermann, Ben. "Quit Copying Me: A Framework for Evaluating Optimal Garbage Collection Costs". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-39 (regular tech report). July 31, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-40

Kotla, Ramakrishna, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. "Zyzzyva: Speculative Byzantine Fault Tolerance". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-40 (regular tech report). August 6, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-41

Chan, Ernie, Field G. Van Zee, Paolo Bientinesi, Enrique S. Quintana-Orti, Gregorio Quintana-Orti, and Robert van de Geijn. "SuperMatrix: A Multithreaded Runtime Scheduling System for Algorithms-by-Blocks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-41 (regular tech report). August 23, 2007. 13 pages.

This paper describes SuperMatrix, a runtime system that parallelizes matrix operations for SMP and/or multi-core architectures. We use this system to demonstrate how code described at a high level of abstraction can achieve high performance on such architectures while completely hiding the parallelism from the library programmer. The key insight entails viewing matrices hierarchically, consisting of blocks that serve as units of data where operations over those blocks are treated as units of computation. The implementation transparently enqueues the required operations, internally tracking dependencies, and then executes the operations utilizing out-of-order execution techniques inspired by superscalar microarchitectures. This separation of concerns allows library developers to implement algorithms without concerning themselves with the parallelization aspect of the problem. Different heuristics for scheduling operations can be implemented in the runtime system independent of the code that enqueues the operations. Results gathered on a 16 CPU ccNUMA Itanium2 server demonstrate excellent performance.

DOWNLOAD tr07-41.pdf

TR-07-42

Gouda, Mohamed G. and Yan Li. "The Truth System: Can a System of Lying Processes Stabilize?". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-42 (regular tech report). August 24, 2007. 19 pages.

We introduce a new abstract system, called the truth system. In the truth system, a process deduces a true value, with high probability, from an incoming stream of both true and false values, where the probability that a value in the incoming stream is true is at least 0.6. At each instant, the receiving process maintains at most one candidate of the true value, and eventually the process reaches the conclusion that its candidate value equals, with high probability, the true value. In this paper, we present three versions of the truth system, discuss their properties, and show how to choose their parameters so that their probability of error is small, i.e. about 10^{-6}. The third version, called the stable system, is the most valuable. We employ the stable system as a building block in a stabilizing unidirectional token ring of n processes. When n is small, i.e. about 100 or less, the stable system can be considered error-free and we argue that the resulting token ring is stabilizing with high probability. We simulate the token ring, when n is at most 100, and observe that the ring always stabilizes even though each process lies about its state 40% of the time.

DOWNLOAD tr07-42.pdf

TR-07-43

Govindan, Madhu Saravana Sibi, Stephen W. Keckler, Sani Nassif, and Emrah Acar. "A Temperature-Aware Power Estimation Methodology". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-43 (regular tech report). August 29, 2007. 11 pages.

Reducing power consumption, improving designer productivity and mitigating thermal effects are grand challenges for future CMOS-based designs in the nanometer regime [1]. Solving these challenges requires a power estimation methodology that is temperature aware and simple, fast and accurate. In this paper, we present such a power estimation methodology that utilizes data from different levels of modeling abstraction and is applicable to both current and future processors. Our methodology leverages design data from the gatelevel model and activity factors from the structural RTL model and refines the initial power estimates based on a thermal and power grid model. We demonstrate our methodology using a SOC-style, tiled, general purpose, chip multiprocessor implemented at 130nm and provide scaled-down estimates at 90nm, 65nm, 45nm and 32nm technologies.

DOWNLOAD tr07-43.pdf

TR-07-44

Xiong, Ming, Song Han, Kam-Liu Lam, and Deji Chen. "Deferrable Scheduling for Maintaining Real-Time Data Freshness: Algorithms, Analysis, and Results". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-44 (regular tech report). September 5, 2007.

Periodic update transaction model has been used to maintain freshness (or temporal validity) of real-time data. Period and deadline assignment has been the main focus in the past studies such as the More-Less scheme in which update transactions are guaranteed by the Deadline Monotonic scheduling algorithm to complete by their deadlines. In this article, we propose deferrable scheduling algorithm for fixed priority transactions - a novel approach for minimizing update workload while maintaining temporal validity of real-time data. In contrast to previous work, update transactions scheduled by the deferrable scheduling algorithm follow a sporadic task model. The deferrable scheduling algorithm exploits the semantics of temporal validity constraint of real-time data by judiciously deferring the sampling times of update transaction jobs as late as possible. We present a theoretical estimation of its processor utilization, examine its schedulability, and present a sufficient condition. Our experimental results verify the theoretical estimation of the processor utilization. We demonstrate through the experiments that the deferrable scheduling algorithm is a very effective approach, and it significantly outperforms the More-Less scheme in terms of reducing processor workload.

DOWNLOAD tr07-44.pdf

TR-07-45

Bush, Kevin Beckwith and Doug Burger. "DSP Extensions to the TRIPS ISA". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-45 (regular tech report). September 7, 2007. 17 pages.

In this paper, we propose a set of DSP extensions to the TRIPS ISA and evaluate their performance. By extending the TRIPS ISA with specialized DSP instructions, we offer an explorative look at the interaction conventional specialization techniques (such as SIMD instructions) have with EDGE ISAs. We discuss the implementation and its feasibility and provide non-intrusive compiler support through hand-written library functions. Finally, we evaluate the performance benefits of our extensions with custom library-emphasizing benchmarks and compare our results with those of the industry standard TI c6416 digital signal processor.

DOWNLOAD tr07-45.pdf

TR-07-46

Sherstov, Alexander A. "The Pattern Matrix Method for Lower Bounds on Quantum Communication". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-46 (regular tech report). September 6, 2007. 28 pages.

See abstract in report.

DOWNLOAD tr07-46.pdf

TR-07-47

Kulis, Brian, Suvrit Sra, Stefanie Jegelka, and Inderjit S. Dhillon. "Scalable Semidefinite Programming using Convex Perturbations". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-47 (regular tech report). September 13, 2007.

Several important machine learning problems can be modeled and solved via semidefinite programs. Often, researchers invoke off-the-shelf software for the associated optimization, which can be inappropriate for many applications due to computational and storage requirements. In this paper, we introduce the use of convex perturbations for semidefinite programs (SDPs). Using a particular perturbation function, we arrive at an algorithm for SDPs that has several advantages over existing techniques: a) it is simple, requiring only a few lines of Matlab, b) it is a first-order method which makes it scalable, c) it can easily exploit the structure of a particular SDP to gain efficiency (e.g., when the constraint matrices are low-rank). We demonstrate on several machine learning applications that the proposed algorithm is effective in finding fast approximations to large-scale SDPs.

DOWNLOAD tr07-47.pdf

TR-07-48

Jain, Prateek, Brian Kulis, and Kristen Grauman. "Fast Similarity Search for Learned Metrics". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-48 (regular tech report). September 14, 2007. 11 pages.

We propose a method to efficiently index into a large database of examples according to a learned metric. Given a collection of examples, we learn a Mahalanobis distance using an information theoretic metric learning technique that adapts prior knowledge about pairwise distances to incorporate similarity and dissimilarity constraints. To enable sub-linear time similarity searchunder the learned metric, we show how to encode a learned Mahalanobis parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for sparse input vector spaces whose high dimensionality make it infeasible to learn an explicit weighting over the feature dimensions. We demonstrate the approach applied to systems and image datasets, and show that our learned metrics improve accuracy relative to commonly-used metric baselines, while our hashing construction permits efficient indexing with a learned distance and very large databases.

DOWNLOAD tr07-48.pdf

TR-07-49

Diamond, Jeff, Behnam Robatmilli, Stephen W. Keckler, Robert van de Geijn, Kazushige Goto, and Doug Burger. "High Performance Linear Algebra on a Spatially Distributed Processor". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-49 (regular tech report). September 14, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-50

Quintana-Orti, Gregorio, Enrique S. Quintana-Orti, Ernie Chan, Robert van de Geijn, and Field G. Van Zee. "Design and Scheduling of an Algorithm-by-Blocks for LU Factorization on Multithreaded Architectures". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-50 (regular tech report). September 19, 2007. 16 pages.

The scalable parallel implementation, targeting SMP and/or multicore architectures, of the LU factorization of a matrix is studied. It is shown that an algorithm-by-blocks exposes a higher degree of parallelism than traditional implementations based on multithreaded BLAS. This algorithm requires a different pivoting strategy, incremental pivoting, but allows most computation to be cast in terms of matrix-matrix multiplication without adversely increasing the operation count. Its implementation using the SuperMatrix runtime system is discussed and the scalability of the solution is demonstrated on an ccNUMA platform with 16 processors.

DOWNLOAD tr07-50.pdf

TR-07-51

Quintana-Orti, Gregorio, Enrique S. Quintana-Orti, Alfredo Remon, and Robert van de Geijn. "SuperMatrix for the Factorization of Band Matrices". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-51 (regular tech report). September 24, 2007. 10 pages.

We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. The SuperMatrix run-time system allows an out-of-order scheduling of operations and a blocked packed storage for band matrices that are transparent to the programmer. Experimental results for the Cholesky factorization of band matrices on a CC-NUMA platform with sixteen processors demonstrate the scalability of the solution.

DOWNLOAD tr07-51.pdf

TR-07-52

Djeu, Peter, Warren Hunt, Rui Wang, Ikrima Elhassan, Gordon Stoll, and William R. Mark. "Razor: An Architecture for Dynamic Multiresolution Ray Tracing". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-52 (regular tech report). January 24, 2007. 15 pages.

Rendering systems organized around the ray tracing visibility algorithm provide a powerful and general tool for generating realistic images. These systems are being rapidly adopted for offline rendering tasks, and there is increasing interest in utilizing ray tracing for interactive rendering as well. Unfortunately, standard ray tracing systems suffer from several fundamental problems that limit their flexibility and performance, and until these issues are addressed ray tracing will have no hope of replacing Z-buffer systems for most interactive graphics applications. To realize the full potential of ray tracing, it is necessary to use variants such as distribution ray tracing and path tracing that can compute compelling visual effects: soft shadows, glossy reflections, ambient occlusion, and many others. Unfortunately, current distribution ray tracing systems are fundamentally inefficient. They have high overhead for rendering large dynamic scenes, use excessively detailed geometry for secondary rays, perform redundant computations for shading and secondary rays, and have irregular data access and computation patterns that are a poor match for cost-effective hardware. We describe Razor, a new software architecture for a distribution ray tracer that addresses these issues. Razor supports watertight multiresolution geometry using a novel interpolation technique and a multiresolution kD-tree acceleration structure built on-demand each frame from a tightly integrated application scene graph. This dramatically reduces the cost of supporting dynamic scenes and improves data access and computation patterns for secondary rays. The architecture also decouples shading computations from visibility computations using a two-phase shading scheme. It uses existing best-practice techniques including bundling rays into SIMD packets for efficient computation and memory access. We present an experimental system that implements these techniques at near-interactive frame rates. We present results from this system demonstrating the effectiveness of its software architecture and algorithms.

DOWNLOAD tr07-52.pdf

TR-07-53

Sherstov, Alexander A. "Unbounded-Error Communication Complexity of Symmetric Functions". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-53 (regular tech report). September 26, 2007. 35 pages.

See techreport for abstract.

DOWNLOAD tr07-53.pdf

TR-07-54

Chen, Mo, Rezaul Alam Chowdhury, Vijaya Ramachandran, David Lan Roche, and Lingling Tong. "Priority Queues and Dijkstra's Algorithm". The University of Texas at Austin, Departments of Computer Sciences. Report# TR-07-54 (regular tech report). October 12, 2007. 25 pages.

We study the impact of using different priority queues in the performance of Dijkstra's SSSP algorithm. We consider only general priority queues that can handle any type of keys (integer, floating point, etc.); the only exception is that we use as a benchmark the DIMACS Challenge SSSP code which can handle only integer values for distances. Our experiments were focussed on the following: 1. We study the performance of two variants of Dijkstra's algorithm: the well-known version that uses a priority queue that supports the Decrease-Key operation, and another that uses a basic priority queue that supports only Insert and Delete-Min. For the latter type of priority queue we include several for which high-performance code is available such as bottom-up binary heap, aligned 4-ary heap, and sequence heap. 2. We study the performance of Dijkstra's algorithm designed for flat memory relative to versions that try to be cache-efficient. For this, in main part, we study the difference in performance of Dijkstra's algorithm relative to the cache-efficiency of the priority queue used, both in-core and out-of-core. We also study the performance of an implementation of Dijkstra's algorithm that achieves a modest amount of additional cache-efficiency in undirected graphs through the use of two cache-efficient priority queues. This is theoretically the most cache-efficient implementation of Dijkstra's algorithm currently known. Overall, our results show that using a standard priority queue without the decrease-key operation results in better performance than using one with the decrease-key operation in most cases; that cache-efficient priority queues improve the performance of Dijkstra's algorithm, both in-core and out-of-core on current processors; and that the dual priority queue version of Dijkstra's algorithm has a significant overhead in the constant factor, and hence is quite slow in in-core execution, though it performs by far the best on sparse graphs out-of-core.

DOWNLOAD tr07-54.pdf

HR-07-36

Chen, Mo. "Measuring and Improving the Performance of Cache-efficient Priority Queues in Dijkstra's Algorithm". The University of Texas at Austin, Department of Computer Sciences. Report# HR-07-36 (honors thesis). Spring 2007. 20 pages.

The priority queue is an useful data structure in computation. There currently exist many implementations of this data structure, including some that are cache-aware and some cache-oblivious. In this study, we compare the performance of several implementations of priority queues in Dijkstra's Single Source Shortest Path algorithm. We compare high per- formance heaps, such as the 4ary Aligned Heap, Fast Binary Heap and Sequence Heap against in-house heap implementations, namely Buffer Heap and Auxiliary Buffer Heap, and against well-known implementations such as the textbook Binary Heap. We focus our analysis on the benefit of supporting decrease-key operations within the priority queue. Results indicate that graph density a ect the relative performance of the different priority queues, and that using the Decrease-Key operation incurs unexpected performance hits. Furthermore, we will propose a parallel version of Buffer Heap which remains cache-oblivious and scales to a high level of parallelism.

DOWNLOAD cs-07-36-chen.pdf

TR-07-55

Cho, Hyuk and Inderjit S. Dhillon. "Effect of data transformation on residue". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-55 (regular tech report). October 11, 2007. 23 pages.

Recently, Aguilar-Ruiz [2005] considers a data matrix containing both scaling and shifting factors and shows that the mean squared residue [Cheng and Church, 2000], called RESIDUE(II) in this paper, is useful to discover shifting patterns, but not appropriate to find scaling patterns. This finding draws our attention on the weakness of RESIDUE(II) measure and the need of new approaches to discover both scaling and shifting patterns in the considered matrix. To resolve the weakness of RESIDUE(II) in finding scaling patterns, we propose a simple remedy that still uses the same residue measure. The main idea is to remove hidden scaling factors in the considered data matrix by taking a specific data transformation. We investigate various data transformations including no transformation, double centering, mean centering, standard deviation normalization, and Z-score transformation. Further, we apply these data transformations to row/column dimension of data matrix models with different global/local scaling and global/local shifting factors. First, we characterize the properties of the data transformations on different data matrix models, including six Euclidean co-clustering schemes in Bregman co-clustering algorithms [Banerjee et al., 2007] as well as other existing data models in the literature. In particular, we formally analyze the effect of each data transformation on the two residues [Cho et al., 2004], here called RESIDUE(I) and RESIDUE(II), respectively. Then, we apply all the data transformations to publicly available human cancer gene expression datasets and empirically validate the analysis results by using the minimum sum squared residue co-clustering (MSSRCC) algorithms [Cho et al., 2004]. In conclusion, through column standard deviation normalization or column Z-score transformation, we are able to overcome the shortcoming of RESIDUE(II) in finding scaling patterns and discover both scaling and shifting patterns.

DOWNLOAD tr07-55.pdf

TR-07-56

Rozner, Eric, Anand Padmanabha Iyer, Yogita Mehta, Lili Qiu, and Mansoor Jafry. "ER: Efficient Retransmission Scheme for Wireless LANs". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-56 (regular tech report). October 11, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-57

Belaramani, Nalini, Mike Dahlin, Amol Nayate, and Jiandan Zheng. "Making Replication Simple with URSA". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-57 (regular tech report). October 23, 2007. 14 pages.

Our goal is to qualitatively simplify developing large scale data replication systems and improving existing ones. To realize this goal, we address a fundamental question: What are the right abstractions for defining replication systems? Our new architecture, URSA, defines (1) mechanisms that define abstractions for storage, communication, and consistency that automatically handle the bookkeeping needed to allow policies to distribute data however they want, (2) a model of policy that cleanly separates safety and liveness concerns, (3) a way to define safety policies leveraging standard consistency algorithms, and (4) a way to define liveness policies using a declarative language. By capturing the right abstractions, URSA dramatically reduces the effort needed to construct a new system, and it facilitates innovation by making it easy to evolve existing systems. For example, we build systems that closely approximate Coda, Bayou, Pangaea, Chain Replication, TRIP, and TierStore using fewer than 200 lines of system-specific policy code for each, and we add significant new features like cooperative caching to Coda and TierStore, small-device support for Bayou, or hierarchy to TRIP using fewer than 10 additional lines.

DOWNLOAD tr07-57.pdf

TR-07-58

Ramadan, Hany E., Christopher J. Rossbach, Owen S. Hofmann, and Emmett Witchel. "Dependence-Aware Transactional Memory". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-58 (regular tech report). October 25, 2007. 11 pages.

Transactional memory is a promising programming model to enable high performance programs with reasonable programmer effort on the parallel architectures favored by modern processor manufacturers. This paper introduces dependence-aware transactions, a new method for maintaining the conflict serializability safety property of memory transactions while allowing significant freedom for an implementation's version management and conflict detection. The paper evaluates two implementations of dependence-aware transactional memory: one all-software implementation and one mostly-hardware implementation. The results ofmicro-benchmarks indicate the promise of dependence-aware transactions to increase the performance of a transactional memory system, with no change in the transactional programming model.

DOWNLOAD tr07-58.pdf

TR-07-59

Lee, Dong-Young and Simon S. Lam. "Efficient and Accurate Protocols for Distributed Delaunay Triangulation Under Churn". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-59 (regular tech report). November 9, 2007, Revised May 16, 2008, Revised September 1, 2008. 14 pages.

We design a new suite of protocols for a set of nodes in d-dimension (d > 1) to construct and maintain a distributed Delaunay triangulation (DT) in a dynamic environment. The join, leave, and failure protocols in the suite are proved to be correct for a single join, leave, and failure, respectively. For a system under churn, it is impossible to maintain a correct distributed DT continually. We define an accuracy metric such that accuracy is 100% if and only if the distributed DT is correct. The suite also includes a maintenance protocol designed to recover from incorrect system states and to improve accuracy. In designing the protocols, we make use of two novel observations to substantially improve protocol efficiency. First, in the neighbor discovery process of a node, many replies to the node's queries contain redundant information. Second, the use of a new failure protocol that employs a proactive approach to recovery is better than the reactive approaches used in prior work. Experimental results show that our new suite of protocols maintains high accuracy for systems under churn and each system converges to 100% accuracy after churning stopped. They are much more efficient than protocols in prior work.

DOWNLOAD tr07-59.pdf

TR-07-60

Li, Yan and Mohamed G. Gouda. "Sources and Monitors: A Trust Model for Peer-to-Peer Networks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-60 (regular tech report). November 12, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-61

Zheng, Jiandan, Nalini Belaramani, and Mike Dahlin. "Implementing flexible consistency on PRACTI". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-61 (regular tech report). November 20, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-62

Sustik, Matyas A. and J S. Moore. "String Searching Over Small Alphabets". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-62 (regular tech report). December 11, 2007. 5 pages.

We propose a new string searching procedure inspired by the Boyer--Moore algorithm. The two key ideas of our improvement are to keep track of all the previously matched characters within the current alignment and not to move the reading position unconditionally to the end of the pattern when a mismatch occurs. The result is an algorithm with increased average shift amounts and a guarantee that any character of the text is read at most once. The algorithm performs especially well when the alphabet size is small. We also discuss and test variants of the original idea aimed at practical implementations.

DOWNLOAD tr07-62.pdf

TR-07-63

Morton, Kristi, David Kitchin, and William Cook. "Orc-X: Combining Orchestrations and XQuery". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-63 (regular tech report). December 6, 2007. 7 pages.

In designing a language for distributed computing, the handling of data and distribution can be viewed as largely orthogonal concerns, as long as the data model supports the communication requirements of the distribution model. This view contrasts strongly with approaches based on distributed objects, which typically enforce a tight coupling of state and behavior. We have previously presented Orc, a language that provides simple but powerful constructs to orchestrate distributed computations. Previous versions of Orc included only simple data types, since these were sufficient to demonstrate the concurrency primitives. However, Orc's communication model is based on web services, which support complex XML documents in addition to simple data types. Thus it is natural to consider XML as an appropriate data model for Orc. We present Orc-X, an extension of the Orc language with an XML data model and XML-specific data management capabilities from XQuery. We demonstrate that Orc-X is well-suited for the application domain of distributed resource management protocols such as the Narada mesh overlay protocol.

DOWNLOAD tr07-63.pdf

TR-07-64

Bond, Michael D. and Kathryn S. McKinley. "Tolerating Memory Leaks". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-64 (regular tech report). December 7, 2007.

NO ABSTRACT

DOWNLOAD NOT AVAILABLE

TR-07-65

Wehrman, Ian, David Kitchin, William R. Cook, and Jayadev Misra. "Properties of the Timed Operational and Denotational Semantics of Orc". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-65 (regular tech report). December 14, 2007. 125 pages.

Orc is a language for structured concurrent programming. Orc provides three powerful combinators that define the structure of a concurrent computation. These combinators support sequential and concurrent execution, and concurrent execution with blocking and termination. Orc is particularly well-suited for task orchestration, a form of concurrent programming with applications in workflow, business process management, and web service orchestration. Orc provides constructs to orchestrate the concurrent invocation of services while managing time-outs, priorities, and failures of services or communication. Previous work on the semantics of Orc has focused on its asynchronous behavior. In this report, we define a relative-time operational semantics of Orc that allows reasoning about delays, which are introduced explicitly by time-based constructs or implicitly by network delays. We develop a number of identities among Orc expressions and define an equality relation that is a congruence. We also develop a denotational semantics for Orc, in which the meaning of an Orc expression is a set of traces. A number of properties of the semantics are shown here, including equivalence of the operational and denotational semantics.

DOWNLOAD tr07-65.pdf

TR-07-66

Batory, Don and Doug Smith. "Finite Map Spaces and Quarks: Algebras of Program Structure". The University of Texas at Austin, Department of Computer Sciences. Report# TR-07-66 (regular tech report). December 20, 2007. 28 pages.

We present two algebras that unify the disparate software composition models of Feature-Oriented Programming and Aspect-Oriented Programming. In one algebra, a finite map space underlies program synthesis, where adding finite maps and modifying their contents are fundamental operations. A second and more general algebra uses quarks, a construct that represents both expressions and their transformations. Special cases of our algebras correspond to existing systems and languages, and thus can serve as a foundation for next-generation tools and languages for feature-based program synthesis.

DOWNLOAD tr07-66.pdf

For help please contact trcenter@cs.utexas.edu