Publications


Pacer: Proportional Detection of Data Races

Michael D. Bond, Katherine E. Coons, and Kathryn S. McKinley
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2010), Toronto, June 2010

Paper (for now, the submission is available as a tech report):   PDF

Source code:

The source code will be available from the Jikes RVM Research Archive when the final paper is ready (March or April). Let me know if you'd like a preliminary version of the source code.
Abstract:

Data races indicate serious concurrency bugs such as order, atomicity, and sequential consistency violations. Races are difficult to find and fix, often manifesting only in deployment. The frequency of these bugs will likely increase as software adds parallelism to exploit multicore hardware. Unfortunately, sound and precise race detectors slow programs by factors of eight or more, which is too expensive to deploy.

This paper presents a precise, low-overhead sampling-based data race detector called Pacer. Pacer makes a proportionality guarantee: it detects any race at a rate equal to the sampling rate, by finding races whose first access occurs during a global sampling period. During sampling, Pacer tracks all accesses using the sound and precise FastTrack algorithm. In non-sampling periods, Pacer discards sampled access information that cannot be part of a reported race, and Pacer simplifies tracking of the happens-before relationship, yielding near-constant, instead of linear, overheads. Experimental results confirm our design claims: time and space overheads scale with the sampling rate, and sampling rates of 1-3% yield overheads low enough to consider in production software. Furthermore, our results suggest Pacer reports each race at a rate equal to the sampling rate. The resulting system provides a "get what you pay for" approach to race detection that is suitable for identifying real, hard-to-reproduce races in deployed systems.



Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses

Michael D. Bond, Graham Z. Baker, and Samuel Z. Guyer
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2010), Toronto, June 2010

Paper (for now, the submission is available as a tech report):   PDF

Source code:

The source code will be available from the Jikes RVM Research Archive when the final paper is ready (March or April). Let me know if you'd like a preliminary version of the source code.
Abstract:

Calling context – the set of active methods on the stack – is critical for understanding the dynamic behavior of large programs. Dynamic program analysis tools, however, are almost exclusively context insensitive because of the prohibitive cost of representing calling contexts at run time. Deployable dynamic analyses, in particular, are limited to reporting only static program locations.

This paper presents Breadcrumbs, an efficient technique for recording and reporting dynamic calling contexts. It builds on an existing technique for computing a compact (one word) encoding of each calling context that client analyses can use in place of a program location. The key feature of our system is a search algorithm that can reconstruct a calling context from its encoding using only a static call graph and a small amount of dynamic information collected in cold methods. Breadcrumbs requires no offline training or program modifications, and handles all language features, including dynamic class loading. On average, it adds 10% to 20% overhead to existing dynamic analyses, depending on how much additional information it collects: more information slows down execution, but improves the decoding algorithm.

We use Breadcrumbs to add context sensitivity to two dynamic analyses: a race detector and an analysis that identifies the origins of null pointer exceptions. Our system can reconstruct nearly all of the contexts for the reported bugs in a few seconds. These calling contexts are non-trivial, and they significantly improve both the precision of the analyses and the quality of the bug reports.



Leak Pruning

Michael D. Bond and Kathryn S. McKinley
14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2009), Washington, DC, March 2009

Paper:  PDF   PS

Talk:  PPT (version 2007)   PPT (versions 97-2003)   PDF

Source code:

Available for download from the Jikes RVM Research Archive
Abstract:

Managed languages improve programmer productivity with type safety and garbage collection, which eliminate memory errors such as dangling pointers, double frees, and buffer overflows. However, because garbage collection uses reachability to over-approximate live objects, programs may still leak memory if programmers forget to eliminate the last reference to an object that will not be used again. Leaks slow programs by increasing collector workload and frequency. Growing leaks eventually crash programs.

This paper introduces leak pruning, which keeps programs running by predicting and reclaiming leaked objects at run time. It predicts dead objects and reclaims them based on observing data structure usage patterns. Leak pruning preserves semantics because it waits for heap exhaustion before reclaiming objects and poisons references to objects it reclaims. If the program later tries to access a poisoned reference, the virtual machine (VM) throws an error. We show leak pruning has low overhead in a Java VM and evaluate it on 10 leaking programs. Leak pruning does not help two programs, executes five substantial programs 1.6-81X longer, and executes three programs, including a leak in Eclipse, for at least 24 hours. In the worst case, leak pruning defers fatal errors. In the best case, it keeps leaky programs running with preserved semantics and consistent throughput.



Laminar: Practical Fine-Grained Decentralized Information Flow Control

Indrajit Roy, Donald E. Porter, Michael D. Bond, Kathryn S. McKinley, and Emmett Witchel
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2009), Dublin, June 2009

Paper:  PDF   PS

Talk (by Indrajit Roy):   PPT

Source code:

The JVM and OS source are available for download from the Jikes RVM Research Archive.
Abstract:

Decentralized information flow control (DIFC) is a promising model for writing programs with powerful, end-to-end security guarantees. Current DIFC systems that run on commodity hardware can be broadly categorized into two types: language-level and operating system-level DIFC. Language level solutions provide no guarantees against security violations on system resources, like files and sockets. Operating system solutions can mediate accesses to system resources, but are inefficient at monitoring the flow of information through fine-grained program data structures.

This paper describes Laminar, the first system to implement decentralized information flow control using a single set of abstractions for OS resources and heap-allocated objects. Programmers express security policies by labeling data with secrecy and integrity labels, and then access the labeled data in lexically scoped security regions. Laminar enforces the security policies specified by the labels at runtime. Laminar is implemented using a modified Java virtual machine and a new Linux security module. This paper shows that security regions ease incremental deployment and limit dynamic security checks, allowing us to retrofit DIFC policies on four application case studies. Replacing the applications' ad-hoc security policies changes less than 10% of the code, and incurs performance overheads from 1% to 56%. Whereas prior DIFC systems only support limited types of multithreaded programs, Laminar supports a more general class of multithreaded DIFC programs that can access heterogeneously labeled data.



Tolerating Memory Leaks

Michael D. Bond and Kathryn S. McKinley
23rd Annual International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2008), Nashville, October 2008

Paper:  PDF   PS

Talk:  PPT (version 2007)   PPT (versions 97-2003)   PDF

Source code:

Available for download from the Jikes RVM Research Archive
Abstract:

Type safety and garbage collection in managed languages eliminate memory errors such as dangling pointers, double frees, and leaks of unreachable objects. Unfortunately, a program still leaks memory if it maintains references to objects it will never use again. Leaked objects decrease program locality and increase garbage collection frequency and workload. A growing leak will eventually exhaust memory and crash the program.

This paper introduces a leak tolerance approach called Melt that safely eliminates performance degradations and crashes due to leaks of dead but reachable objects in managed languages, given sufficient disk space to hold leaking objects. Melt (1) identifies stale objects that the program is not accessing; (2) segregates in-use and stale objects by storing stale objects to disk; and (3) preserves safety by activating stale objects if the program subsequently accesses them. We design and build a prototype implementation of Melt in a Java VM and show it adds overhead low enough for production systems. Whereas existing VMs grind to a halt and then crash on programs with leaks, Melt keeps many of these programs running much longer without significantly degrading performance. Melt provides users the illusion of a fixed leak and gives developers more time to fix leaky programs.



Probabilistic Calling Context

Michael D. Bond and Kathryn S. McKinley
22nd Annual International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2007), Montreal, October 2007

Paper:  PDF   PS

Talk:  PPT   PDF

Source code:

Available for download from the Jikes RVM Research Archive

Jones and Ryder used our implementation for context-sensitive allocation sites in their ISMM 2008 paper.

Abstract:

Calling context enhances program understanding and dynamic analyses by providing a rich representation of program location. Compared to imperative programs, object-oriented programs use more interprocedural and less intraprocedural control flow, increasing the importance of context sensitivity for analysis. However, prior online methods for computing calling context, such as stack-walking or maintaining the current location in a calling context tree, are expensive in time and space. This paper introduces a new online approach called probabilistic calling context (PCC) that continuously maintains a probabilistically unique value representing the current calling context. For millions of unique contexts, a 32-bit PCC value has few conflicts. Computing the PCC value adds 3% average overhead to a Java virtual machine. PCC is well-suited to clients that detect new or anomalous behavior since PCC values from training and production runs can be compared easily to detect new context-sensitive behavior; clients that query the PCC value at every system call, Java utility call, and Java API call add 0-9% overhead on average. PCC adds space overhead proportional to the distinct contexts stored by the client (one word per context). Our results indicate PCC is efficient and accurate enough to use in deployed software for residual testing, bug detection, and intrusion detection.



Tracking Bad Apples: Reporting the Origin of Null and Undefined Value Errors

Michael D. Bond, Nicholas Nethercote, Stephen W. Kent, Samuel Z. Guyer, and Kathryn S. McKinley
22nd Annual International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2007), Montreal, October 2007

Paper:  PDF   PS

Talk:  PPT   PDF

Source code:

Both implementations are publicly available:
Bug suite:
We evaluated origin tracking of null references in Java using 12 null pointer exceptions. We've made them available in the Bad Apples Suite. The suite automatically downloads, installs, and runs the buggy programs with NPE-inducing inputs.
Abstract:

Programs sometimes crash due to unusable values, for example, when Java and C# programs dereference null pointers and when C and C++ programs use undefined values to affect program behavior. A stack trace produced on such a crash identifies the effect of the unusable value, not its cause, and is often not much help to the programmer.

This paper presents efficient origin tracking of unusable values; it shows how to record where these values come into existence, correctly propagate them, and report them if they cause an error. The key idea is value piggybacking: when the original program stores an unusable value, value piggybacking instead stores origin information in the spare bits of the unusable value. Modest compiler support alters the program to propagate these modified values through operations such as assignments and comparisons. We evaluate two implementations: the first tracks null pointer origins in a JVM, and the second tracks undefined value origins in a memory-checking tool built with Valgrind. These implementations show that origin tracking via value piggybacking is fast and often useful, and in the Java case, has low enough overhead for use in a production environment.



Correcting the Dynamic Call Graph Using Control Flow Constraints

Byeongcheol Lee, Kevin Resnick, Michael D. Bond, and Kathryn S. McKinley
16th International Conference on Compiler Construction (CC 2007), Braga, Portugal, March 2007

Paper:  PDF  PS  (A longer tech report version includes proofs and algorithms.)

Talk (by Byeongcheol Lee):   PPT

Source code:

Available for download from the Jikes RVM Research Archive
Abstract:

To reason about programs, dynamic optimizers and analysis tools use sampling to collect a dynamic call graph (DCG). However, sampling has not achieved high accuracy with low runtime overhead. As object-oriented programmers compose increasingly complex programs, inaccurate call graphs will inhibit analysis and optimizations. This paper demonstrates how to use static and dynamic control flow graph (CFG) constraints to improve the accuracy of the DCG. We introduce the frequency dominator (FDOM), a novel CFG relation that extends the dominator relation to expose static relative execution frequencies of basic blocks. We combine conservation of flow and dynamic CFG basic block profiles to further improve the accuracy of the DCG. Together these approaches add minimal overhead (1%) and achieve 85% accuracy compared to a perfect call graph for SPEC JVM98 and DaCapo benchmarks. Compared to sampling alone, accuracy improves by 12 to 36%. These results demonstrate that static and dynamic control-flow information offer accurate information for efficiently improving the DCG.



Bell: Bit-Encoding Online Memory Leak Detection

Michael D. Bond and Kathryn S. McKinley
12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XII), San Jose, October 2006

Paper:  PDF   PS

Talk:  PPT  PDF

Source code:

Available for download from the Jikes RVM Research Archive

Tang, Gao, and Qin modified our implementation for their USENIX 2008 paper.

Abstract:

Memory leaks compromise availability and security by crippling performance and crashing programs. Leaks are difficult to diagnose because they have no immediate symptoms. Online leak detection tools benefit from storing and reporting per-object sites (e.g., allocation sites) for potentially leaking objects. In programs with many small objects, per-object sites add high space overhead, limiting their use in production environments.

This paper introduces Bit-Encoding Leak Location (Bell), a statistical approach that encodes per-object sites to a single bit per object. A bit loses information about a site, but given sufficient objects that use the site and a known, finite set of possible sites, Bell uses brute-force decoding to recover the site with high accuracy.

We use this approach to encode object allocation and last-use sites in Sleigh, a new leak detection tool. Sleigh detects stale objects (objects unused for a long time) and uses Bell decoding to report their allocation and last-use sites. Our implementation steals four unused bits in the object header and thus incurs no per-object space overhead. Sleigh's instrumentation adds 29% execution time overhead, which adaptive profiling reduces to 11%. Sleigh's output is directly useful for finding and fixing leaks in SPEC JBB2000 and Eclipse, although sufficiently many objects must leak before Bell decoding can report sites with confidence. Bell is suitable for other leak detection approaches that store per-object sites, and for other problems amenable to statistical per-object metadata.



Continuous Path and Edge Profiling

Michael D. Bond and Kathryn S. McKinley
38th International Symposium on Microarchitecture (MICRO-38), Barcelona, November 2005

Paper:  PDF  PS

Talk:  PPT  PDF

Source code:

Available for download from the Jikes RVM Research Archive
Abstract:

Microarchitectures increasingly rely on dynamic optimization to improve performance in ways that are difficult or impossible for ahead-of-time compilers. Dynamic optimizers in turn require continuous, portable, low cost, and accurate control-flow profiles to inform their decisions, but prior approaches have struggled to meet these goals simultaneously.

This paper presents PEP, a hybrid instrumentation and sampling approach for continuous path and edge profiling that is efficient, accurate, and portable. PEP uses a subset of Ball-Larus path profiling to identify paths with low overhead, and uses sampling to mitigate the expense of storing paths. PEP further reduces overhead by using profiling to guide instrumentation placement. PEP improves profile accuracy with a modified version of Arnold-Grove sampling. The resulting system has 1.2% average and 4.3% maximum overhead, 94% path profile accuracy, and 96% edge profile accuracy on a set of Java benchmarks.



Practical Path Profiling for Dynamic Optimizers

Michael D. Bond and Kathryn S. McKinley
3rd International Symposium on Code Generation and Optimization (CGO-3), San Jose, March 2005

Paper:  PDF  PS

Talk:  PPT  PDF

Source code:

Available as part of the Scale compiler. Please contact me if you're interested in using the same version we used in the paper.

Vaswani, Nori, and Chilimbi modified our path profiling implementation for their POPL 2007 and FSE 2007 papers.

Abstract:

Modern processors are hungry for instructions. To satisfy them, compilers need to find and optimize execution paths across multiple basic blocks. Path profiles provide this context, but their high overhead has so far limited their use by dynamic compilers. We present new techniques for low overhead online practical path profiling (PPP). Following targeted path profiling (TPP), PPP uses an edge profile to simplify path profile instrumentation (profile-guided profiling). PPP improves over prior work by (1) reducing the amount of profiling instrumentation on cold paths and paths that the edge profile predicts well and (2) reducing the cost of the remaining instrumentation.

Experiments in an ahead-of-time compiler perform edge profile-guided inlining and unrolling prior to path profiling instrumentation. These transformations are faithful to staged optimization, and create longer, harder to predict paths. We introduce the branch-flow metric to measure path flow as a function of branch decisions, rather than weighting all paths equally as in prior work. On SPEC2000, PPP maintains high accuracy and coverage, but has only 5% overhead on average (ranging from -3% to 13%), making it appealing for use by dynamic compilers.



Targeted Path Profiling: Lower Overhead Path Profiling for Staged Dynamic Optimization Systems

Rahul Joshi, Michael D. Bond, and Craig Zilles
2nd International Symposium on Code Generation and Optimization (CGO-2), Palo Alto, March 2004

Best student presenter

Paper:  PDF

Talk:  PPT  PDF

Source code:

Subsumed by the Practical Path Profiling source code. Please contact me if you're interested in the original Targeted Path Profiling implementation.
Abstract:

In this paper, we present a technique for reducing the overhead of collecting path profiles in the context of a dynamic optimizer. The key idea to our approach, called Targeted Path Profiling (TPP), is to use an edge profile to simplify the collection of a path profile. This notion of profile-guided profiling is a natural fit for dynamic optimizers, which typically optimize the code in a series of stages.

TPP is an extension to the Ball-Larus Efficient Path Profiling algorithm. Its increased efficiency comes from two sources: (i) reducing the number of potential paths by not enumerating paths with cold edges, allowing array accesses to be substituted for more expensive hash table lookups, and (ii) not instrumenting regions where paths can be unambiguously derived from an edge profile. Our results suggest that on average the overhead of profile collection can be reduced by half (SPEC95) to almost two-thirds (SPEC2000) relative to the Ball-Larus algorithm with minimal impact on the information collected.



Efficient, Context-Sensitive Detection of Semantic Attacks

Michael D. Bond, Varun Srivastava, Kathryn S. McKinley, and Vitaly Shmatikov
UT Austin Computer Sciences Technical Report TR-09-14, 2009

Tech report:   PDF

Abstract:

Software developers are increasingly choosing memory-safe languages such as Java because they help deploy higher-quality software faster. As a result, semantic vulnerabilities – omitted security checks, misconfigured security policies, and other software design errors – are supplanting memory-corruption exploits as the primary cause of security violations. Semantic attacks are difficult to detect because they do not violate underlying program semantics.

Dynamic anomaly detection identifies unusual program behavior and thus offers a potential defense against semantic attacks. We have reproduced and evaluated several real-world semantic exploits that target subtle bugs in real Java applications and libraries. Some of these attacks involve code paths that appear anomalous only if program calling context and history are tracked. While greater context and history sensitivity enables detection of these attacks, very high sensitivity leads to many benign behaviors appearing anomalous (false positives).

We present Pecan, a defense against semantic attacks based on dynamic anomaly detection. Pecan efficiently tracks program behavior using depth-limited context and history sensitivity, and thus enables precise detection of unusual behaviors associated with security violations without reporting many false positives. We extend prior work on efficient dynamic context sensitivity to achieve efficient depth-limited context sensitivity. Pecan adds overhead low enough for deployed use, and with a particular configuration of context and history sensitivity, it can detect all the attacks, while reporting few false positives during benign executions.



Diagnosing and Tolerating Bugs in Deployed Systems

Michael David Bond
Doctoral Dissertation, Department of Computer Sciences, The University of Texas at Austin, December 2008

ACM SIGPLAN Outstanding Doctoral Dissertation Award (citation)

Document:

Official PDF
PDF with 2 pages on a page
PDF with 4 pages on a page
Abstract:

Deployed software is never free of bugs. These bugs cause software to fail, wasting billions of dollars and sometimes causing injury or death. Bugs are pervasive in modern software, which is increasingly complex due to demand for features, extensibility, and integration of components. Complete validation and exhaustive testing are infeasible for substantial software systems, and therefore deployed software exhibits untested and unanalyzed behaviors.

Software behaves differently after deployment due to different environments and inputs, so developers cannot find and fix all bugs before deploying software, and they cannot easily reproduce post-deployment bugs outside of the deployed setting. This dissertation argues that post-deployment is a compelling environment for diagnosing and tolerating bugs, and it introduces a general approach called post-deployment debugging. Techniques in this class are efficient enough to go unnoticed by users and accurate enough to find and report the sources of errors to developers. We demonstrate that they help developers find and fix bugs and help users get more functionality out of failing software.

To diagnose post-deployment failures, programmers need to understand the program operations – control and data flow – responsible for failures. Prior approaches for widespread tracking of control and data flow often slow programs by two times or more and increase memory usage significantly, making them impractical for online use. We present novel techniques for representing control and data flow that add modest overhead while still providing diagnostic information directly useful for fixing bugs. The first technique, probabilistic calling context (PCC), provides low-overhead context sensitivity to dynamic analyses that detect new or anomalous deployed behavior. Second, Bell statistically correlates control flow with data, and it reconstructs program locations associated with data. We apply Bell to leak detection, where it tracks and reports program locations responsible for real memory leaks. The third technique, origin tracking, tracks the originating program locations of unusable values such as null references, by storing origins in place of unusable values. These origins are cheap to track and are directly useful for diagnosing real-world null pointer exceptions.

Post-deployment diagnosis helps developers find and fix bugs, but in the meantime, users need help with failing software. We present techniques that tolerate memory leaks, which are particularly difficult to diagnose since they have no immediate symptoms and may take days or longer to materialize. Our techniques effectively narrow the gap between reachability and liveness by providing the illusion that dead but reachable objects do not consume resources. The techniques identify stale objects not used in a while and remove them from the application and garbage collector's working set. The first technique, Melt, relocates stale memory to disk, so it can restore objects if the program uses them later. Growing leaks exhaust the disk eventually, and some embedded systems have no disk. Our second technique, leak pruning, addresses these limitations by automatically reclaiming likely leaked memory. It preserves semantics by waiting until heap exhaustion to reclaim memory – then intercepting program attempts to access reclaimed memory.

We demonstrate the utility and efficiency of post-deployment debugging on large, real-world programs – where they pinpoint bug causes and improve software availability. Post-deployment debugging efficiently exposes and exploits programming language semantics and opens up a promising direction for improving software robustness.