UTCS Colloquium/Architecture: Shubu Mukherjee/Intel, Fault Screeners, ACES 2.402, Thursday, March 27, 2008, 3:30 p.m.

Contact Name: Jenna Whitney
Date: Mar 27, 2008 3:30pm - 4:30pm

There is a signup schedule for this event (UT EID required).

Type of Talk: UTCS Colloquium/Architecture

Speaker/Affiliation: Shubu Mukherjee/Intel

Date/Time: Thursday, March 27, 2008, 3:30 p.m.

Location: ACES 2.402

Host: Doug Burger

Talk Title:

Fault Screeners

Talk Abstract:
Fault screeners are a new breed of fault identification technique that can probabilistically detect if a transient fault has affected the state of a processor. We demonstrate that fault screeners function because of two key characteristics. First, we show that much of the intermediate data generated by a program inherently falls within certain consistent bounds. Second, we observe that these bounds are often violated by the introduction of a fault. Thus, fault screeners can identify faults by directly watching for data inconsistencies arising in an application's behavior.

We present an idealized algorithm capable of identifying over 85% of injected faults on the SpecInt suite and over 75% on average overall. Further, in a realistic implementation on a simulated Pentium-III-like processor, about half of the errors due to injected faults are identified while still in speculative state. Errors detected this early can be eliminated by a pipeline flush. In this talk, we present a hardware-based version of this screening algorithm and show that its implementation reduces overall performance by less than 1%.
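
As a rough illustration of the screening idea described in the abstract, the sketch below learns value bounds for one program location and flags any later value that falls outside them. It is a minimal software sketch under stated assumptions, not the algorithm or hardware design presented in the talk: the `RangeScreener` class, the `flip_bit` helper, and the `"loop_index"` location name are all hypothetical and chosen only for this example.

```python
class RangeScreener:
    """Hypothetical range-based screener: tracks min/max of values seen per location."""

    def __init__(self):
        # Maps a location identifier to the (low, high) bounds observed so far.
        self.bounds = {}

    def train(self, loc, value):
        """Widen the learned bounds for `loc` to include `value`."""
        low, high = self.bounds.get(loc, (value, value))
        self.bounds[loc] = (min(low, value), max(high, value))

    def check(self, loc, value):
        """Return True if `value` lies outside the bounds learned for `loc`."""
        if loc not in self.bounds:
            return False  # nothing learned yet, so nothing to compare against
        low, high = self.bounds[loc]
        return not (low <= value <= high)


def flip_bit(value, bit):
    """Model a transient fault as a single bit flip (an assumption of this sketch)."""
    return value ^ (1 << bit)


# Learn bounds from fault-free values of a loop counter running 0..999.
screener = RangeScreener()
for i in range(1000):
    screener.train("loop_index", i)

clean = 512
faulty_high = flip_bit(clean, 30)  # high-order flip pushes the value far out of range
faulty_low = flip_bit(clean, 0)    # low-order flip yields 513, still inside the bounds

print(screener.check("loop_index", clean))        # fault-free value is not flagged
print(screener.check("loop_index", faulty_high))  # out-of-bounds value is flagged
print(screener.check("loop_index", faulty_low))   # in-bounds corruption is missed
```

The missed low-order flip is why detection of this kind is probabilistic: a fault is caught only when it moves the data outside its usual bounds.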

Speaker Bio:
Shubu Mukherjee is a Principal Engineer and Director of Intel's SPEARS Group (Simulation and Pathfinding of Efficient and Reliable Systems). The SPEARS Group is responsible for spearheading architectural change and innovation in the delivery of enterprise processors and chipsets by building and supporting simulation and analytical models of performance, power, and reliability. Dr. Mukherjee is widely recognized both within and outside Intel as one of the experts on architecture design for soft errors. He has made pioneering contributions towards the design of Redundant Multithreading (RMT) techniques, architectural vulnerability modeling for soft errors, the creation of a performance modeling infrastructure called Asim (jointly with Dr. Joel Emer), the design of the Alpha 21364 interconnection network, and the creation of the first shared memory prediction scheme.

Prior to joining Intel, Shubu worked at Compaq for 3 years and at Digital Equipment Corporation for 10 days. Dr. Mukherjee received his B.Tech. from the Indian Institute of Technology, Kanpur, and his M.S. and PhD from the University of Wisconsin-Madison. He was the General Chair of ASPLOS (Architectural Support for Programming Languages and Operating Systems) 2004. He has co-authored over 40 external papers. He holds 8 patents and has filed over 30 more at Intel. Dr. Mukherjee's book, titled Architecture Design for Soft Errors, just appeared in the market.