Research Interests

I am interested in compilers and architecture. In particular, I am interested in how best (in terms of programmability, performance, power, reliability and design simplicity) to distribute responsibilities between the microarchitecture and the compiler with respect to dependences, speculation, and parallelization.

Current Project

My dissertation is about finding bugs quickly in concurrent programs while still providing coverage guarantees. We combine partial-order reduction, a technique for reducing the size of the state space while preserving coverage, with bounded search, a technique to prioritize executions likely to manifest bugs. By combining these approaches, we guide a model checker towards errors quickly, while still providing coverage guarantees.

Previous Projects

I previously worked on spatial instruction scheduling for TRIPS, an EDGE architecture with static placement, dynamic issue (SPDI) scheduling. An SPDI scheduler is responsible only for the placement of instructions. It does not order the instructions, although it does reason about their expected issue order to guide placement decisions.

The instruction scheduler maps instructions in a dataflow graph to a partitioned substrate that exposes communication latencies to the compiler. The key challenge for a spatial instruction scheduler is balancing latency and concurrency. The scheduler must expose concurrency to improve performance, yet must be careful that doing so does not increase communication overheads to such an extent that the benefits of concurrency are lost.

I implemented an instruction scheduler that can place instructions on any physical topology. To support a new topology, the scheduler requires an interface implementation that describes the communication latencies for that topology. The scheduler currently supports the TRIPS prototype ISA, the TRIPS prototype microarchitecture, and the TFlex microarchitecture, which fully partitions the register file and data cache banks and must support anywhere from 1 to 32 cores.

The scheduler converts a TRIPS intermediate language (TIL) file into a TRIPS assembly language (TASL) file. It converts the TIL from operand format into target format, constructing a dataflow graph for the instructions within each hyperblock, and also constructing a hyperblock flow graph, which represents control flow between hyperblocks for global analysis.

Before placing instructions, the scheduler inserts software fanout trees as necessary to send values to many consumers. The scheduler attempts to form these software fanout trees in a way that minimizes the critical path length through the program by assigning more critical instructions to locations in the software fanout tree with a shorter dependence height. We use a Huffman-like algorithm similar to the scheme developed by Hartley and Casavant to build software fanout trees in a way that is sensitive to critical path length.

After fanout insertion, the scheduler traverses each hyperblock flow graph in reverse post-order, placing the instructions within each block using the Spatial Path Scheduling algorithm.

Microsoft Research Internship, Summer 2009

In Summer 2009 I worked with Madan Musuvathi at Microsoft Research on CHESS, a stateless model-checker for concurrent software. I completed work on best-first search that I began in summer 2008, and submitted it for publication. I began working on combining partial-order reduction with bounded search, and I did preliminary research on memory consistency models for cloud computing.

Microsoft Research Internship, Summer 2008

In Summer 2008 I worked with Shaz Qadeer and Madan Musuvathi at Microsoft Research to improve CHESS, a stateless model-checker for concurrent software. I implemented a best-first search to allow CHESS to find bugs more quickly, and created several cost functions to help identify executions likely to cause errors. I developed a space-efficient, parallelizable representation for the search space to reduce the overheads of a best-first search.

Microsoft Research Internship, Fall 2007

In Fall 2007 I did an intership at Microsoft Research, where I worked with Jim Larus on using data race detection in a software transactional memory system. I implemented Goldilocks, a sound and complete dynamic data race detector with support for transactional memory to eliminate the need to support strong atomicity for scenarios in which a data race exists. I also collaborated with Shaz Qadeer during this internship.


Spring 2008

CS395T - Computation Orchestration

WGS393 - Gender, Technology, and Information

CS395 - Conference Course

PED102G - Basic Scuba Diving

Fall 2007

CS395 - Conference Course

Spring 2007

CS395T - Hardware and Software for Multicore Systems

EE382V - Computer Architecture: User System Interplay

CS395T - Data Mining: A Statistical Learning Perspective

CS395 - Conference Course

Fall 2006

CS388L - Introduction to Mathematical Logic

CS380C - Compilers

CS395 - Conference Course

Spring 2006

CS395T - Fine-grained Parallelism

CS395T - High-performance and Pasallel Computing

CS395 - Conference Course

Fall 2005

CS381K - Artificial Intelligence

CS382M - Advanced Computer Architecture

CS395 - Conference Course

The University of Texas at Austin