Karu's homepage

Karthikeyan Sankaralingam
Ph. D. Student
Department of Computer Sciences
The University of Texas at Austin
The University of Wisconsin, Madison - Spring 2007
karu@cs.utexas.edu PGP public key

I am a Ph. D. student at the University of Texas at Austin working in the CART group, with Professors Steve Keckler and Doug Burger. I am one of the lead designers of the TRIPS processor microarchitecture and ISA. My research work includes design of the Grid Processor Architecture, microarchitecture techniques for supporting various types of data parallel execution, and design of scalable bypass networks.

My PhD dissertation introduces the concept of polymorphous architectures and describes them as a unified approach for extracting concurrency of different granularities. Details here

I am now an Assistant Professor at the University of Wisconsin-Madison, Department of Computer Science. Here is my new webpage

Current research

PhD Dissertation: Polymorphous Architectures: A Unified Approach for Extracting Concurrency of Different Granularities. [ Abstract ]
The Grid Processor Architecture [ Paper Abstract ]
A scalable distributed microarchitecture which provides ultra-wide issue computation, large instruction windows, scales to future wire-delay dominated technologies, and can replace conventional out-of-order superscalar processors. The three main features are:
  1. Encode dataflow dependences in the ISA to enable direct instruction-instruction communication and reduce the overheads of detecting and managing dependencies that conventional out-of-order processors must pay.
  2. Partition the program into well-defined blocks to limit the scope of the dependencies, so that the number of dependence arcs does not grow too large to encode in the instruction space.
  3. Distribute the computation core completely, using microarchitecture control and data networks with only nearest-neighbor links for communication, to manage design complexity and address wire-delay scaling and reliability.
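As a rough illustration of the first two features, here is a minimal Python sketch of a block whose instructions name their consumers directly, so that results are forwarded producer-to-consumer without any dependence lookup. It is a toy model under stated assumptions, not the TRIPS or Grid Processor ISA: the names `Instr`, `run_block`, and `example_block`, the three opcodes, and the (consumer index, operand slot) target format are all hypothetical.

```python
from dataclasses import dataclass, field
import operator


@dataclass
class Instr:
    op: str          # hypothetical opcodes: "const", "add", "mul"
    targets: list    # (consumer_index, operand_slot) pairs within the same block
    value: int = 0   # immediate, used only by "const"
    operands: dict = field(default_factory=dict)  # slot -> value received so far


OPS = {"add": operator.add, "mul": operator.mul}
ARITY = {"const": 0, "add": 2, "mul": 2}


def run_block(block):
    """Fire instructions as their operands arrive.

    Returns {instruction_index: result} for instructions with no targets,
    which this toy model treats as the outputs of the block.
    """
    # Instructions that need no operands can fire immediately.
    ready = [i for i, ins in enumerate(block) if ARITY[ins.op] == 0]
    outputs = {}
    while ready:
        i = ready.pop()
        ins = block[i]
        if ins.op == "const":
            result = ins.value
        else:
            result = OPS[ins.op](ins.operands[0], ins.operands[1])
        if not ins.targets:
            outputs[i] = result
        # Forward the result straight to each named consumer.
        for consumer, slot in ins.targets:
            block[consumer].operands[slot] = result
            if len(block[consumer].operands) == ARITY[block[consumer].op]:
                ready.append(consumer)
    return outputs


def example_block():
    """Block computing (2 + 3) * 4."""
    return [
        Instr("const", [(3, 0)], value=2),
        Instr("const", [(3, 1)], value=3),
        Instr("const", [(4, 1)], value=4),
        Instr("add", [(4, 0)]),
        Instr("mul", []),
    ]


print(run_block(example_block()))  # {4: 20}
```

Because targets are indices within one block, the number of bits needed to name a consumer is bounded by the block size, which is the point of the second feature.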
Polymorphous architectures - TRIPS [ Paper Abstract ]
My dissertation research introduced the concept of a polymorphous processor: one that can adapt to the workload by altering the behavior of coarse microarchitecture units to suit the application. I was the lead for the TRIPS prototype chip's ISA, microarchitecture, processor verification, and physical design.
Mechanisms for data level parallelism [ Paper Abstract ]
In my dissertation research, I identified fundamental attributes for classifying DLP programs based on memory behavior, control structures, and amount of available parallelism. I proposed a set of universal mechanisms, which can provide architectural capability for run-time processor customization for any application sub-domain within DLP.
Distributed Pagerank for P2P Systems [ Paper Abstract ]
I developed a distributed formulation of the Google PageRank algorithm that can be executed on a peer-to-peer network. This work provides the first solution for ranking documents on P2P networks. Extending this idea to web servers, I showed how to build a completely decentralized search engine.
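To give a flavor of what a distributed formulation can look like, here is a small Python simulation in which each document belongs to a peer and rank contributions that cross peer boundaries are counted as messages. It is only a sketch under stated assumptions and not the algorithm from the paper: the function name `distributed_pagerank`, the synchronous rounds, the damping and tolerance defaults, and the message accounting are all hypothetical. It uses the standard PageRank update, rank = (1 - d) + d * (sum of incoming shares).

```python
def distributed_pagerank(links, owner, damping=0.85, tol=1e-6, max_rounds=1000):
    """Simulate PageRank over documents spread across peers.

    links: doc -> list of docs it links to (every target must be a key of links)
    owner: doc -> id of the peer that stores the doc
    Returns (rank, cross_peer_messages).
    """
    rank = {doc: 1.0 for doc in links}
    cross_peer_messages = 0
    for _ in range(max_rounds):
        # Each peer splits the rank of its documents evenly over their
        # out-links and delivers the shares to whoever owns the targets.
        inbox = {doc: 0.0 for doc in links}
        for doc, outs in links.items():
            if not outs:
                continue  # documents without out-links contribute nothing here
            share = rank[doc] / len(outs)
            for target in outs:
                inbox[target] += share
                if owner[target] != owner[doc]:
                    cross_peer_messages += 1  # would travel over the network
        new_rank = {doc: (1 - damping) + damping * inbox[doc] for doc in links}
        delta = max(abs(new_rank[doc] - rank[doc]) for doc in links)
        rank = new_rank
        if delta < tol:
            break  # ranks have stopped changing
    return rank, cross_peer_messages


# Three documents in a cycle, stored on two peers.
links = {"a": ["b"], "b": ["c"], "c": ["a"]}
owner = {"a": "peer0", "b": "peer0", "c": "peer1"}
ranks, messages = distributed_pagerank(links, owner)
print(ranks, messages)
```

A real peer-to-peer deployment would be asynchronous and would have to cope with peers joining and leaving; the synchronous loop above only shows how the rank computation decomposes by document owner.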
TRIPS Prototype Processor [ Paper Abstract ]
This paper describes the control protocols in the TRIPS processor, a distributed, tiled microarchitecture that supports dynamic execution. It details each of the five types of reused tiles that compose the processor, the control and data networks that connect them, and the distributed microarchitectural protocols that implement instruction fetch, execution, flush, and commit. We also describe the physical design issues that arose when implementing the microarchitecture in a 170M transistor, 130nm ASIC prototype chip.
GMailRSS
This is just a neat tool I wrote for using GMail as an RSS reader. This is simply a shameless plug for the tool :)

Future research

  • Easy-to-program, energy-efficient multi-teraflops architectures.
  • Analyze emerging applications for a world of computing dominated by mobility, ubiquitous internet connectivity, and highly compact form factors.
  • Architectures for post-CMOS technologies that can scale for multiple generations beyond CMOS (most likely a technology that relies on something other than charge transport).


Select publications

Patents

  • U.S. patent #6772199, assigned 07/03/2004. Method and system for enhanced cache efficiency utilizing selective replacement exemption. With T. Keller.
  • US patent application filed 10/31/02. A High-Performance Technology-Scalable Processor Architecture. With R. Nagarajan, D.C. Burger, and S.W. Keckler.

Publications by type

Conference publications

Journal articles

Tech Reports and Workshop publications

Other Publications


Miscellaneous

Some course related projects that I have done.
  • Evaluating the Instruction Folding Mechanism of the PicoJava processor
  • A very simple raytracer: Handles reflection, refraction, and ambient, diffuse, and specular lighting for directional and positional sources. Handles only spheres and planes :-( This is what we needed for one of the course assignments. The code is not documented and is public domain.
  • Graph representation: A report I did in a course in formal verification. When I started the project, the goal was to see whether it is possible to come up with a representation more succinct than Binary Decision Diagrams for representing boolean functions. The project turned out to be a survey of the literature, and I found that some succinct representations (motivated primarily by VLSI CAD considerations) can be applied to boolean functions and are as succinct as BDDs.
  • FM partitioning, placement and maze routing: This code does FM partitioning and uses it for standard cell placement. A maze router that can handle multi-terminal nets then does the routing. It works with a restricted file format specific to VLSI netlists. I wrote this code for a course project and hence it is not well documented :-( Released as public domain.
  • Ubiquitous computing using smartcards: A project in which we motivated the use of smartcards for ubiquitous computing and developed a software prototype. The code is not very well documented and is released as public domain.

Other ideas

This is a list of some tools and systems that I think would be very useful and interesting to build.

You can contact me at karu@cs.utexas.edu. PGP public key