UTCS Colloquia - Dongyoon Lee, Faculty Candidate, University of Michigan, Ann Arbor, "Holistic System Design for Determinism" ACES 2.302

Contact Name: Kate Callard
ACES 2.302
Apr 16, 2013 11:00am - 12:00pm

Signup Schedule: http://apps.cs.utexas.edu/talkschedules/cgi/list_events.cgi

Talk Audience: UTCS Faculty, Grads, Undergrads, Other Interested Parties

Host: Lorenzo Alvisi

Talk Abstract: With the advent of multiprocessor systems, it is now up to programmers to explicitly expose parallelism and take advantage of parallel computing resources. However, parallel programming is inherently complex, as programmers have to reason about all possible thread interleavings. A deterministic replay system that records and reproduces the execution of parallel programs can serve as a foundation for building many useful tools (e.g., time-travel debuggers and fault-tolerance systems) by overcoming the inherent non-determinism in multiprocessor systems. While it is well known how to replay uniprocessor systems, it is much harder to provide deterministic replay of shared-memory multithreaded programs on multiprocessors, because shared-memory accesses add a high-frequency source of non-determinism.

I introduce a new insight into deterministic replay: for many replay uses, it is sufficient to guarantee only that the recorded and replayed executions produce the same output and the same final state, and thus it is possible to support replay without logging precise shared-memory dependencies. I call this relaxed but sufficient replay guarantee “external determinism” and leverage this observation to build efficient multiprocessor replay systems. In this talk, I will introduce three replay systems: Respec, Chimera, and Rosa. Respec enables software-only deterministic replay at low overhead with operating system support. Chimera leverages static data-race analysis to build an efficient software-only replay solution. Lastly, Rosa provides an ultra-low-overhead replay solution with minimal hardware extensions.
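
As a rough illustration of the "external determinism" idea described above, the following toy sketch compares two executions by their output and final state only, ignoring the order of shared-memory updates. It is not taken from Respec, Chimera, or Rosa; the names `run`, `externally_equivalent`, and the three-thread schedule are hypothetical, and the shared updates are deliberately commutative so that every interleaving agrees externally. Real programs give no such guarantee, which is why a replay system has to enforce or check the match.

```python
from itertools import permutations

def run(schedule):
    """Simulate one interleaving of atomic updates to a shared counter.

    `schedule` is a sequence of (thread_id, amount) steps (hypothetical
    toy model). Returns (trace, final_state, output).
    """
    counter = 0
    trace = []  # internal view: which thread saw which intermediate value
    for thread_id, amount in schedule:
        counter += amount
        trace.append((thread_id, counter))
    output = f"total={counter}"  # what an outside observer would see
    return trace, counter, output

def externally_equivalent(a, b):
    """Compare final state and output only, ignoring the internal trace."""
    return a[1:] == b[1:]

# One step per thread, so every ordering is a valid interleaving.
steps = [("T1", 5), ("T2", 7), ("T3", 1)]

recorded = run(steps)
replayed = run(list(reversed(steps)))

# The interleavings differ internally but match externally.
internal_match = recorded[0] == replayed[0]
external_match = externally_equivalent(recorded, replayed)

# In this toy, all interleavings are externally equivalent.
all_match = all(
    externally_equivalent(recorded, run(order))
    for order in permutations(steps)
)
```

Here `internal_match` is false while `external_match` is true: a replay that reproduces only the output and final state would accept the second run even though its shared-memory access order differs from the recording.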

Speaker Bio: Dongyoon is currently a PhD candidate in the EECS department at the University of Michigan, Ann Arbor. He received an M.S. degree in computer science and engineering from the University of Michigan, Ann Arbor, in 2009 and a B.S. degree in electronic engineering from Seoul National University, Korea, in 2004. He has worked at the intersection of operating systems, computer architecture, and dynamic/static program analysis, with a focus on developing practical solutions to improve the programmability, reliability, and security of parallel programs. He has been awarded a 2012 VMware Graduate Fellowship, a Best Paper award at ASPLOS 2011, and the Grand Prize in an embedded software contest held in Korea.