Karu's homepage
Karthikeyan Sankaralingam
Ph. D. Student
Department of Computer Sciences
The University of Texas at Austin
The University of Wisconsin, Madison - Spring 2007
karu@cs.utexas.edu PGP public key
I am a Ph. D. student at the University of Texas at Austin working in
the CART group, with Professors Steve Keckler
and Doug Burger.
I am one of the lead designers of
the TRIPS processor
microarchitecture and
ISA. My research work includes design of the Grid Processor
Architecture, microarchitecture techniques for supporting various types
of data parallel execution, and design of scalable bypass networks.
My PhD dissertation introduces the concept of polymorphous
architectures and describes polymorphous architectures as a unified
approach for extracting concurrency of different granularities.
Details here
I am now an Assistant Professor at the University of Wisconsin-Madison, Department of Computer Science. Here is my new webpage
Current research
- PhD Dissertation: Polymorphous Architectures: A Unified Approach for Extracting Concurrency of Different Granularities. [ Abstract ]
- The Grid Processor Architecture
[ Paper Abstract ]
- A scalable distributed microarchitecture which provides ultra-wide issue computation, large instruction windows, scales to future wire-delay dominated technologies, and can replace conventional out-of-order superscalar processors. The three main features are:
- Encode dataflow dependences in the ISA to enable direct
instruction-instruction communication and reduce the overheads of
detecting and managing dependencies that conventional out-of-order
processors must pay.
- Partition the program into well-defined
blocks to limit the scope of the dependencies, so that the number of
dependence arcs stays small enough to encode in the instruction
space.
- To manage design
complexity and address wire delay scaling and reliability, the
computation core is completely distributed using microarchitecture
control and data networks with only nearest-neighbor links used for
communication.
- Polymorphous architectures - TRIPS
[ Paper Abstract ]
- My dissertation research introduced the concept of a polymorphous processor - one that can adapt to the workload by altering the behavior of coarse microarchitecture units to suit the application. I was a lead on the TRIPS prototype chip's
ISA, microarchitecture, processor verification, and physical design.
- Mechanisms for data level parallelism
[ Paper Abstract ]
- In my dissertation research, I identified fundamental
attributes for classifying DLP programs based on memory behavior,
control structures, and amount of available parallelism. I proposed a
set of universal mechanisms, which can provide architectural
capability for run-time processor customization for any application
sub-domain within DLP.
- Distributed Pagerank for P2P Systems
[ Paper Abstract ]
- I developed a
distributed formulation of the Google pagerank algorithm that can be
executed on a Peer-to-Peer network. This work provides the first
solution for ranking documents on P2P networks. Extending this idea to
web servers, I showed how to build a completely decentralized search
engine.
- TRIPS Prototype Processor
[ Paper Abstract ]
- This paper describes
the control protocols in the TRIPS processor, a distributed, tiled
microarchitecture that supports dynamic execution. It details each of
the five types of reused tiles that compose the processor, the control
and data networks that connect them, and the distributed
microarchitectural protocols that implement instruction fetch,
execution, flush, and commit. We also describe the physical design
issues that arose when implementing the microarchitecture in a 170M
transistor, 130nm ASIC prototype chip.
- GMailRSS
- This is just a neat tool I wrote for using
GMail as an
RSS reader. This is simply a shameless plug for the tool :)
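To make the distributed Pagerank item above more concrete, here is a minimal sketch of rank computation by message passing between peers. It is only an illustration under stated assumptions: the names `Peer` and `run_rounds`, the damping factor of 0.85, the synchronous rounds, and the unnormalized update rule are all hypothetical choices made for this example, not the formulation from the paper.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor; not taken from the paper


class Peer:
    """A peer that stores some documents and their outgoing links."""

    def __init__(self, links):
        # links: {doc_id: [doc_ids this document links to]}
        self.links = links
        self.rank = {doc: 1.0 for doc in links}

    def outgoing_messages(self):
        """Split each local document's rank evenly among its out-links."""
        messages = []
        for doc, targets in self.links.items():
            for target in targets:
                messages.append((target, self.rank[doc] / len(targets)))
        return messages

    def apply(self, inbox):
        """Recompute local ranks from the contributions received this round."""
        for doc in self.links:
            self.rank[doc] = (1 - DAMPING) + DAMPING * inbox.get(doc, 0.0)


def run_rounds(peers, owner, rounds=50):
    """Run synchronous rounds; owner maps doc_id -> index of the peer storing it."""
    for _ in range(rounds):
        inboxes = [defaultdict(float) for _ in peers]
        for peer in peers:
            for target, share in peer.outgoing_messages():
                inboxes[owner[target]][target] += share
        for peer, inbox in zip(peers, inboxes):
            peer.apply(inbox)
    return {doc: r for peer in peers for doc, r in peer.rank.items()}


# Three documents spread over two peers: a -> b, c -> a, b -> a
peers = [Peer({"a": ["b"], "c": ["a"]}), Peer({"b": ["a"]})]
owner = {"a": 0, "c": 0, "b": 1}
ranks = run_rounds(peers, owner)
```

No peer ever sees the whole link graph here: each one only sends rank shares to the peers that own the link targets, which is the property that makes the computation decentralized.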
Future research
- Easy-to-program, energy-efficient
multi-teraflops architectures.
- Analyze emerging applications for a
world of computing dominated by mobility, ubiquitous internet
connectivity, and highly compact form factors.
- Architectures for post-CMOS technologies to replace CMOS and scale to
multiple generations beyond CMOS (most likely
a technology that relies on something other than charge transport).
Select publications
- Distributed Microarchitectural Protocols in the TRIPS Prototype Processor, MICRO-39, pdf
- Universal Mechanisms for Data-Parallel Architectures, MICRO-36, pdf
- Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture, ISCA-30, pdf
- A Design Space Evaluation of Grid Processor Architectures, MICRO-34, pdf
- Distributed Pagerank for P2P systems, HPDC-12, pdf
- U.S. patent #6772199, assigned 07/03/2004.
Method and system for enhanced cache efficiency utilizing selective replacement exemption. With T. Keller.
- US patent application filed 10/31/02.
A High-Performance Technology-Scalable Processor Architecture. With R. Nagarajan, D.C. Burger, and S.W. Keckler.
Publications by type
Conference publications
- "Distributed Microarchitectural Protocols in the TRIPS Prototype
Processor," K. Sankaralingam, R. Nagarajan, R. McDonald, R. Desikan, S.
Drolia, M. Govindan, P. Gratz, D. Gulati, H. Hanson, C. Kim, H. Liu, N.
Ranganathan, S. Sethumadhavan, S. Sharif, P. Shivakumar, S. W. Keckler,
and D. Burger.
39th International Symposium on Microarchitecture (MICRO), December,
2006.
pdf
bib
- "Dataflow Predication,"
A. Smith, R. Nagarajan, K. Sankaralingam, R. McDonald, D. Burger, S. W.
Keckler, and K. S. McKinley,
39th International Symposium on Microarchitecture (MICRO), December,
2006.
pdf
bib
- "Universal Mechanisms for Data-Parallel Architectures,"
K. Sankaralingam, S.W.Keckler, W.R. Mark, and D.C. Burger,
36th International Symposium on Microarchitecture (MICRO), December, 2003.
pdf
slides
bib
- "Routed Inter-ALU Networks for ILP Scalability and Performance,"
K. Sankaralingam, V.A. Singh, S.W. Keckler, and D.C. Burger,
21st International Conference on Computer Design (ICCD), 2003.
pdf
slides
bib
- "Distributed Pagerank for P2P Systems,"
K. Sankaralingam, S. Sethumadhavan, and J.C. Browne,
12th International Symposium on High Performance Distributed Computing (HPDC), 2003.
pdf
slides
bib
- "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture,"
K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D.C. Burger, S.W.
Keckler, and C.R. Moore,
30th Annual International Symposium on Computer Architecture (ISCA), June 2003.
pdf
bib
- "A Wire-Delay Scalable Microprocessor Architecture for High Performance Systems," S.W. Keckler, D.C. Burger, C.R. Moore, R. Nagarajan, K. Sankaralingam, V. Agarwal, M.S. Hrishikesh, N. Ranganathan, and P. Shivakumar,
2003 International Solid-State Circuits Conference (ISSCC), February, 2003
pdf
bib
- "A Design Space Evaluation of Grid Processor Architectures,"
R. Nagarajan, K. Sankaralingam, D.C. Burger, and S.W. Keckler,
34th International Symposium on Microarchitecture (MICRO), December, 2001.
pdf
slides
bib
Journal articles
- "Pagerank Computation and Keyword Search on Distributed Systems and P2P
Networks," K. Sankaralingam, M. Yalamanchi, S. Sethumadhavan, and J. C.
Browne, Journal of Grid Computing, 2003, Volume 1, Issue 3, pp.
291-307 (Invited
Paper)
bib
- "TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP,"
K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh,
N. Ranganathan, D. Burger, S. W. Keckler, R. G. McDonald, and
C. R. Moore, ACM Transactions on Architecture and Code Optimization
(TACO), March 2004 bib
- "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture,"
K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D.C. Burger, S.W.
Keckler, and C.R. Moore,
IEEE Micro, Nov/Dec 2003
bib
Tech Reports and Workshop publications
- "Design and Analysis of Routed Inter-ALU Networks for ILP Scalability and Performance," V.A. Singh, K. Sankaralingam, S.W. Keckler, and D.C. Burger,
UT-Austin Computer Sciences Technical Report TR-03-17, February, 2001.
pdf
bib
- "A Technology-Scalable Architecture for Fast Clocks and High ILP,"
K. Sankaralingam, R. Nagarajan, D.C. Burger, S.W. Keckler.
5th Workshop on the Interaction of Compilers and Computer Architecture,
at HPCA-7, January, 2001.
pdf
bib
- "SimpleScalar Simulation of the PowerPC Instruction Set Architecture,"
K. Sankaralingam, R. Nagarajan, S.W. Keckler, and D.C. Burger.
UT-Austin Computer Sciences Technical Report TR-00-04, February, 2001.
pdf
bib
- "Towards an Optimal File Allocation Strategy for SpecWEB99," Tom Keller,
K. Sankaralingam and H. Peter Hofstee,
Workshop on Workload Characterization, September 2000
bib
Other Publications
- "More on Arbitrary Boundary Packed Arithmetic,"
K. Sankaralingam and R. Sankaralingam,
5th International Conference on High Performance Computing (HiPC), 1998.
pdf
bib
- "Computer Model of Flamelet Distribution on the Burner Surface
of Composite Solid Propellants," K. Sankaralingam and S.R. Chakravarthy,
38th Aerospace Sciences Meeting Conference and Exhibit 2000, Reno, Nevada.
bib
- "A Computer Model of Flamelet Distribution on the Burning Surface
of a Composite Solid Propellant," K. Sankaralingam and S.R. Chakravarthy,
Combustion Science and Technology, 2000, Vol 161, pp. 49-68
bib
Miscellaneous
Some course related projects that I have done.
- Evaluating the Instruction Folding Mechanism
of the PicoJava processor
- A very simple raytracer: Handles
reflection, refraction, and ambient, diffuse, and specular lighting for
directional and positional
sources. Handles only spheres and planes :-( This is what we needed for
one of the course assignments. The code is not documented and is public
domain.
- Graph representation: A report I did in
a course in formal verification. When I started the project, the
goal was to see whether it is possible to come up with a representation
more succinct than Binary Decision Diagrams for representing boolean
functions. The project turned out to be a survey of the literature,
and I found that some succinct representations (motivated primarily
by VLSI CAD considerations) can be applied to boolean functions and are
as succinct as BDDs.
- FM partitioning, placement
and Maze routing: This code does FM partitioning and uses it for
standard cell placement. A maze router that can handle multi-terminal nets
then does the routing. It works with a restricted file format
specific to VLSI netlists. I wrote this code for a course project and
hence it is not well documented :-( Released as public domain.
- Ubiquitous computing using smartcards:
A project in which we motivated the use of smartcards for ubiquitous
computing and developed a software prototype. Code not very well
documented - released as public domain.
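For readers unfamiliar with the maze routing mentioned in the FM partitioning project above, here is a minimal sketch of the general technique: a breadth-first wavefront expansion over a grid, followed by a backtrace. It is not the project code. The function name `maze_route` and the string-based grid are assumptions made for this example, and it handles only two-terminal nets rather than the multi-terminal nets or the netlist file format of the original.

```python
from collections import deque


def maze_route(grid, source, target):
    """Return a shortest path of (row, col) cells from source to target, or None.

    grid is a list of strings; '#' marks a blocked cell, anything else is free.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {source: None}  # also serves as the visited set
    frontier = deque([source])
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            # Backtrace from target to source through the parent links.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None


grid = [
    "....",
    ".##.",
    "....",
]
path = maze_route(grid, (0, 0), (2, 3))
```

Because the expansion is breadth-first, the first time the target is reached the path is a shortest one. One common way to extend this to a multi-terminal net is to route from the already connected cells to the next terminal, repeating until every terminal is reached.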
Other ideas
This is a list of some tools and systems that I
think would be very useful, and interesting to build as well.
- RSS meltdown and the right way to read RSS feeds: I think the
status quo of subscribing to RSS feeds and using some sort of "reader" to
go read these RSS items every day is headed towards disaster - both the
email-style interface of marking items as read and the so-called "River
of News" style aggregator are just too clunky. I think the amount of
information is just too much and more often than not the information in
many RSS feeds is either the same or there is a huge amount of overlap.
In my opinion the right way to read RSS feeds is through some sort of
automated recommendation engine, coupled with a relevance matching
algorithm tied to a good user interface. The interface should show all
related RSS items together - and discovery of new blogs and RSS feeds
should be automatic - people shouldn't have to mine and discover these on
their own! Somewhat similar to the Google News idea - but with a cleaner
UI, with an actually working ranking metric for the RSS items.
Google Desktop 2.0 has some of this
in a very rudimentary way.
Google Reader IMHO is a complete disaster. Some of this is very similar
to my idea for news gathering from 4 years ago!
Added Jan 2005
- Book rentals: Increasingly my reading needs are met not
by the university here or the public library, but by Amazon.com. It
doesn't make a whole lot of sense for me to buy all the books I want to
read. How easy would it be for Amazon to do a la Netflix for books?
Hmm - this may be my greatest, biggest money making idea yet! This is
such a simple, and retrograde, idea that I am surprised no one has done it! Perhaps
there are business model constraints. Amazon's book recommendation
coupled with rentals could be a good combination. Amazon could make huge
amounts of money recirculating textbooks, fiction, etc.
Added June 2004
On second thoughts, apparently such a service exists
(www.booksfree.com), and they have a decent collection.
- Truly anonymous and robust internet access.
Develop a mechanism to truly anonymously publish on the internet.
More here.
Added April 2003
- Automatic news gathering, relevance matching and categorizing.
News articles from all over the web should be crawled periodically
(every 2 to 3 hours or so), and the articles which
discuss the same topic should be grouped together
under one headline.
A single webpage should then be created with stories classified
according to area, like Business, Sports, Science etc.
Somewhat like a Unix merge of all the newspapers, updated dynamically.
An option to travel back in time to any instant in history should be
possible. Eventually this will turn out to be an automatic history
generator! I guess Google has made one small step towards
this: Google currently does something similar.
Added around June 2001
- Personalized complete record of every spoken word:
I think this is already implementable. Every person carries some sort
of recording device with him (preferably solid state storage with
no moving parts) which has a microphone and detects when
conversations take place. Whenever it detects that people are speaking, it
starts recording. Current voice recorders can do all of this.
A person carries this with him all the time. Battery life problems need
to be addressed. When the person gets home or to some sort of computer
terminal, the voice recorder is fitted into a docking station that
unloads all of the recorded communication into permanent storage. Speech
recognition software then identifies each voice in the communication,
transcribes the audio, and stores it in a compressed text format.
I guess there are important privacy issues still to be addressed.
Apart from that, such a device is easily buildable using today's
technology.
Added around March 2001.
See also:
Darpa LifeLog,
HP Factoid,
Jim Gray's Digital Immortality
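As a rough illustration of the grouping step in the automatic news gathering idea above, here is a minimal sketch that puts headlines about the same topic into one group by word overlap. Everything in it is an assumption made for the example: the function names, the Jaccard similarity measure, the 0.3 threshold, and the greedy single pass. A real system would need crawling, better text matching, and categorization on top of this.

```python
def tokens(headline):
    """Lowercase word set, ignoring very short words."""
    return {w for w in headline.lower().split() if len(w) > 3}


def jaccard(a, b):
    """Overlap between two word sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0


def group_headlines(headlines, threshold=0.3):
    """Greedily place each headline in the first group it resembles."""
    groups = []  # each group is a list of headlines; the first is its representative
    for headline in headlines:
        for group in groups:
            if jaccard(tokens(headline), tokens(group[0])) >= threshold:
                group.append(headline)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([headline])
    return groups


headlines = [
    "Mars rover sends first images",
    "First images arrive from Mars rover",
    "Stock markets close higher on Friday",
]
groups = group_headlines(headlines)
```

With these sample headlines, the two Mars stories end up in one group and the markets story in another.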
You can contact me at karu@cs.utexas.edu.
PGP public key