TRIPS Technical Overview
The TRIPS project has developed scalable processor and memory system technologies for nanoscale microprocessor chips. These technologies are intended to mitigate increasing on-chip communication latency, to improve power efficiency and reduce design complexity in high-performance systems, and to provide programmers with familiar instruction execution models.
Key Technologies and Innovations
TRIPS Hardware and Software
To enable scalable and distributed processor cores, the TRIPS team developed Explicit Data Graph Execution (EDGE) architectures and implemented one in a custom ASIC, the TRIPS prototype chip. Unlike traditional processor architectures, which operate at the granularity of a single instruction, EDGE ISAs support large graphs of computation mapped to a flexible hardware substrate. Instructions in each graph communicate directly with other instructions rather than going through a shared register file. This capability not only reduces design complexity but also amortizes execution overheads over a large graph of instructions.
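The direct instruction-to-instruction communication described above can be illustrated with a minimal Python sketch. It is not the TRIPS ISA or its encoding: the `Instr` class, the `(consumer index, operand slot)` target format, and the `"out"` destination are all hypothetical names chosen for illustration. Each instruction names the consumers of its result instead of writing to a register, and fires once both of its operands have arrived.

```python
from dataclasses import dataclass
from operator import add, mul, sub
from typing import Callable


@dataclass(frozen=True)
class Instr:
    """One node of a dataflow block (hypothetical encoding, not the TRIPS ISA)."""
    op: Callable[[int, int], int]  # binary operation applied once both operands arrive
    targets: tuple                 # (consumer index or "out", operand slot or output name)


def run_block(instrs, inputs):
    """Fire instructions as their operands arrive; return the block's outputs."""
    operands = [dict() for _ in instrs]  # per-instruction operand buffers
    outputs = {}
    ready = []

    def deliver(dest, slot, value):
        # A result travels straight to its consumer; no register file is involved.
        if dest == "out":
            outputs[slot] = value
            return
        operands[dest][slot] = value
        if len(operands[dest]) == 2:
            ready.append(dest)

    for (dest, slot), value in inputs.items():
        deliver(dest, slot, value)
    while ready:
        idx = ready.pop()
        result = instrs[idx].op(operands[idx][0], operands[idx][1])
        for dest, slot in instrs[idx].targets:
            deliver(dest, slot, result)
    return outputs


# Block computing (a + b) * (a - b): instructions 0 and 1 both feed instruction 2.
BLOCK = [
    Instr(add, ((2, 0),)),
    Instr(sub, ((2, 1),)),
    Instr(mul, (("out", "result"),)),
]
BLOCK_INPUTS = {(0, 0): 7, (0, 1): 3, (1, 0): 7, (1, 1): 3}  # a = 7, b = 3
```

Calling `run_block(BLOCK, BLOCK_INPUTS)` evaluates the whole graph as one unit. The order in which the add and subtract fire does not matter, because the multiply waits for both operands.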
The TRIPS microarchitecture is fundamentally distributed and composed of tiles that communicate via control and operand networks. The implementation includes protocols that enable the disparate tiles to act cohesively as a single high-performance processor. The TRIPS team has also developed a scalable on-chip memory system composed of multiple memory banks connected via a high-bandwidth on-chip network. The banks can be configured to operate as a non-uniform cache architecture (NUCA), a novel memory organization developed by the TRIPS team.
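As a rough illustration of why such a banked cache is "non-uniform", the sketch below models access latency as a fixed bank access time plus a per-hop network cost. Every number and name here is an assumption made for illustration (16 banks in a 4-wide grid, 64-byte lines, the cycle counts, a requester at one corner, static address interleaving); none of them are taken from the TRIPS design.

```python
def bank_for_address(addr, num_banks=16, line_bytes=64):
    """Pick a bank by interleaving cache lines across banks (assumed static mapping)."""
    return (addr // line_bytes) % num_banks


def access_latency(bank, banks_per_row=4, bank_cycles=3, hop_cycles=1):
    """Latency in cycles for a requester assumed to sit next to bank 0.

    Banks are laid out row by row on a grid; each network hop adds hop_cycles.
    """
    row, col = divmod(bank, banks_per_row)
    hops = row + col  # Manhattan distance from the requester's corner
    return bank_cycles + hops * hop_cycles
```

Under these assumed parameters the nearest bank answers in 3 cycles and the farthest in 9, so the latency a request sees depends on where its line lives.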
The TRIPS processor executes code generated by a custom compiler from sequential C or Fortran programs. The compiler includes algorithms designed to create large blocks that can execute atomically, according to the EDGE specifications. In addition, the compiler includes a spatial instruction scheduler which places instructions to be executed on the distributed execution substrate such that communication latency and contention among the tiles are minimized.
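The idea behind spatial scheduling can be sketched with a simple greedy placer. This is not the TRIPS compiler's algorithm: the function names, the 4x4 grid, and the one-slot-per-tile limit are assumptions made for illustration. Each instruction is placed on the free tile that minimizes the total Manhattan distance to its producers, and the per-tile slot limit acts as a crude stand-in for contention.

```python
from itertools import product


def manhattan(a, b):
    """Hop count between two tiles on a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def place(deps, rows=4, cols=4, slots_per_tile=1):
    """Greedily map instructions to tiles.

    deps maps each instruction to the list of instructions that produce its
    operands, and must list producers before their consumers.
    """
    tiles = list(product(range(rows), range(cols)))
    load = {tile: 0 for tile in tiles}
    placement = {}
    for instr, producers in deps.items():
        free = [t for t in tiles if load[t] < slots_per_tile]
        if not free:
            raise ValueError("block does not fit on the grid")
        # Communication cost: total hops from every producer to this tile.
        best = min(free, key=lambda t: sum(manhattan(t, placement[p]) for p in producers))
        placement[instr] = best
        load[best] += 1
    return placement


# "c" consumes the results of "a" and "b", so it should land next to them.
EXAMPLE_DEPS = {"a": [], "b": [], "c": ["a", "b"]}
```

A production scheduler would also weigh the critical path, network contention, and distance to register and cache banks. The sketch only shows the basic trade-off between placing dependent instructions close together and not overloading any tile.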
In 2003, the TRIPS team began the implementation of a prototype system including a custom ASIC, custom system boards, and custom software tools. First silicon was delivered on September 27, 2006. Each TRIPS chip contains two scalable processor cores, each of which can execute up to 16 instructions per cycle. The prototype system can be scaled up to 32 processor chips for a peak performance approaching 500 gigaflops. The team will use the prototype to demonstrate the end-to-end application capabilities of EDGE hardware and software, to identify performance bottlenecks in the architecture, and to continue to develop and refine algorithms in the compiler.
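The peak figures quoted above can be cross-checked with a little arithmetic. The sketch assumes that every issued instruction counts as one floating-point operation, which is an assumption of this example and not a statement from the project. The text does not give a clock frequency, so the code only derives the rate that the quoted numbers would imply.

```python
CORES_PER_CHIP = 2              # from the text
INSTRS_PER_CYCLE_PER_CORE = 16  # from the text
MAX_CHIPS = 32                  # from the text
QUOTED_PEAK_FLOPS = 500e9       # "approaching 500 gigaflops"


def peak_instrs_per_cycle(chips=MAX_CHIPS):
    """Upper bound on instructions issued per cycle across the whole system."""
    return chips * CORES_PER_CHIP * INSTRS_PER_CYCLE_PER_CORE


def implied_clock_hz(peak_flops=QUOTED_PEAK_FLOPS, chips=MAX_CHIPS):
    """Clock rate implied by the quoted peak, assuming one flop per instruction."""
    return peak_flops / peak_instrs_per_cycle(chips)
```

A full 32-chip system could issue at most 1024 instructions per cycle, so a peak near 500 gigaflops corresponds to a clock somewhat below 490 MHz under the stated assumption.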
TRIPS is designed to be a general-purpose architecture that performs well across a wide range of applications. The current application suite includes desktop/workstation applications (SPEC), embedded applications (EEMBC), and signal processing applications. The TRIPS team is currently tuning compiler algorithms and adding more applications to the test suite.
The TRIPS processor has been in development for three years and is currently funded by the Defense Advanced Research Projects Agency's (DARPA) Polymorphous Computing Architectures program. The TRIPS team consists of 30 faculty members, research scientists, graduate students, undergraduates, and postdocs, and is led by Professors Doug Burger and Stephen W. Keckler. The TRIPS compiler effort is led by Professor Kathryn McKinley.