Calvin Lin
    Associate Professor of Computer Sciences
    University of Texas, Austin
 
   
 
Home
Research Interests
Publications
Teaching
Students
Research Summary

  My research takes a broad view of how languages and compilers can improve system performance and programmer productivity. My projects pursue two general directions. The first direction, which is software-driven, attempts to simplify programming by raising the level at which we create and optimize programs. The second direction, which is technology-driven, devises compiler techniques for future silicon technology trends, where, for example, in 35 nm technology the latency across a single chip is expected to be about 30 cycles.

Software-driven projects

  • Broadway: Compiler techniques for optimizing software libraries
  • Compiler support for improving the fault tolerance of supercomputing clusters
  • Java Layers: Language and compiler support for component-based programming

Technology-driven projects

  • Branch prediction for future technologies
  • Compiling for the TRIPS Processor
  • Characterizing performance tradeoffs of future microprocessors

The remainder of this page describes each of these projects in more detail.

Domain-Specific Compiler Optimizations: The Broadway Compiler

  This project seeks to make existing and future libraries more usable, more efficient, and more portable. Our Broadway Compiler is a C compiler that can be configured with domain-specific information to optimize the use and implementation of software libraries. The key to this compiler is a simple annotation language that can express domain-specific information which cannot be inferred through conventional static analysis. Our approach provides a separation of concerns that shields the application programmer from the annotations and that allows programmers to use simple, clean interfaces.

Our new optimizations can be divided into three categories:

  • Domain-specific generalizations of classic optimizations
  • Techniques for optimizing sequences of library calls
  • Machine-specific optimizations
We are also adding support for Java class libraries, which we intend to evaluate by optimizing the Java Swing library.

Publications:

Personnel:
         
Samuel Guyer, Walter Chang, Ben Hardekopf, Teck Tok

Scalable Fault Tolerance for Supercomputing Clusters

  This project seeks to provide fault tolerance for the world's fastest supercomputing clusters. The supercomputers at Los Alamos, Sandia, and Livermore National Laboratories consist of thousands of nodes, which collectively suffer from poor reliability. The mean-time-to-failure (MTTF) of today's 6,000 node systems is about 1.6 hours, and the MTTF for next year's 16,000 node ASCI Q machine is expected to be 20 minutes. This project will customize the Egida library-- which has been implemented underneath MPI-- for the specific needs of scientific computations. Whereas most fault tolerance protocols make worst-case assumptions about the behavior of the participating processes, we will use compilers to analyze the specific behavior of the applications to determine their specific fault-tolerance needs. This project builds on previous work on Egida by Lorenzo Alvisi and Harrick Vin.

Publications:

Personnel:
          Lorenzo Alvisi, Sung-Eun Choi, Alison Norman,

Component-Based Programming with Java Layers

  This project aims to simplify the creation and maintenance of large, complex software, particularly families of related software versions. Java Layers (JL) is a language that accomplishes this goal by extending Java in three ways:
  • JL lets users trivially create applications from components by writing simple type equations.
  • JL provides a simple mechanism for creating components. These components, or layers,are a type of Java class that obey certain restrictions on their interfaces.
  • JL allows programmers to specify design rules, which provide semantic constraints about how layers can be meaningfully composed.
We have recently finished our Java Layers compiler, and we are currently evaluating the language by using it to build a family of Widget libraries for diverse platforms that include a PC, PalmOS, and cell phones. A key scientific question is whether a domain-independent language such as JL can achieve the same level of performance as domain-specific languages, which can hard-code domain-specific optimizations into their compilers. Thus, we are exploring ways to allow domain-specific information to be cleanly specified in JL, and we are exploring ways to use such information in the compilation process.

JL is an instance of Batory's GenVoca model of software generators. For more information on GenVoca, see Don Batory's research page.

Publications:

Project Page:
          Java Layers Research Page

Personnel:
          Rich Cardone, Adam Brown

Branch Prediction for Future Technologies

  This project explores ways to widen the interface between compilers and hardware. We are currently focusing on branch prediction, with a particular eye towards the constraints of future technologies. As latencies within a chip become increasingly significant, there is a growing tradeoff between branch predictor accuracy and predictor delay: Complex predictors with large tables can achieve high accuracy, thereby increasing IPC, but large tables lead to either multiple cycle access or to limits on clock rate, both of which reduce IPC. We are studying various mechanisms for dealing with delay and accuracy:
  • Hierarchical predictor organizations
  • Neural systems that replace two-bit counters
  • Techniques that shift some of the burden of branch prediction to the compiler, including a novel way of encoding in each branch instruction a boolean function that remove the tables altogether

Publications:

Personnel:
          Daniel Jiménez, Heather Hanson, Stephen W. Keckler

Compiling for the Adaptive TRIPS Microprocessor

  The vertically integrated TRIPS project is motivated by the increasing intra-chip latencies of silicon technology, where in 35nm technology with aggressive clock rates, it is expected to take 30 cycles to cross a chip. Two key aspects of the TRIPS project are (1) a novel microarchitecture that is designed to scale to 35nm technology, and (2) a highly adaptive computing system that provides an unprecedented level of feedback, reconfiguration, and integration between the hardware and software.

The TRIPS compiler project focuses on three issues. First, we are studying code scheduling strategies for the TRIPS Grid Processor, which exposes the latencies amongst a 2D grid of ALU's, and between the ALU's and the memory system. Second, we are studying ways to create efficient libraries for this flexible piece of hardware. The goal is to provide a number of LibOS's (Library Operating Systems), which are each tailored for a different hardware configuration by using the Broadway approach to customizing libraries. Finally, we are studying ways that a compiler can create self-adaptive programs that can dynamically negotiate for resources. We are thus developing techniques for identifying phase behavior with respect to various types of behavior, including memory access, communication, and CPU utilization. We are also developing methods of estimating resource requirements for each phase in a program.

Publications:

Personnel:
          Ben Rister, Kathryn McKinley

Other TRIPS Personnel:
  Lorenzo Alvisi, Doug Burger, Mike Dahlin, Mootaz Elnozahy, Peter Hofstee, Lizzy John, Steve Keckler, Charles Lefurgy, Harrick Vin

Characterizing Performance Tradeoffs of Modern Microprocessors

  As memory latencies continue to grow relative to processor speeds, microprocessors will rely on increasingly sophisticated latency-hiding mechanisms. These new hardware features present new challenges and tradeoffs for compilers. For example, loop fusion and array contraction, which typically improve performance by improving locality, can decrease performance when hardware prefetching units are provided. This project studies such tradeoffs and develops new compilation strategies for future microprocessors, as exemplified by the IBM Power 4. Our goals are (1) to develop cost models that capture the effects of modern hardware, including cache effects and sophisticated prefetching hardware, and (2) to use these models to guide new compilation strategies.

Publications:

Personnel:
          Ibrahim Hur, Kathryn McKinley

C-Breeze Compiler Infrastructure

  C-Breeze is a compiler infrastructure that's intended to be small and easy to use, both as a teaching tool and a research tool. C-Breeze is currently a source-to-source translator for ANSI C (ISO 9899), but we are in the process of creating a PowerPC back-end, and we are adding support for object oriented optimizations so we can compile Java programs (producing Java source as output).

C-Breeze is written in C++ and is organized as a series of phases. Phases are provided to parse source code into an AST, to dismantle the AST to a low level intermediate representation, to build a control flow graph, and to convert to SSA form. Various other analysis and optimization phases are available, such as a context-sensitive interprocedural pointer analysis, PRE using SSA form, a configurable interprocedural dataflow analysis engine, and more mundane phases such as liveness analysis and constant propagation. New phases can be added by using the Visitor design pattern through built-in phases to traverse and change the AST.

We plan to release this software shortly. See the C-Breeze Home Page for more information.

Personnel:
  Samuel Guyer, Daniel Jiménez, Mike Kistler, Viktor Lapinskii, Michael Michael, Teck Tok

Other Projects:

  Last updated: December 4, 2006