Cognitive Systems Research Group


Description

Autonomic computing has tremendous potential to advance computer systems research by simplifying the design, deployment, configuration, and support of large-scale systems. To realize such potential in practice, however, a number of research challenges must be addressed to integrate machine learning with computer systems. The goal of the proposed research is to obtain a fundamental understanding of such challenges and develop effective techniques to address them in the most general way possible. To this end, we will develop prototypes that integrate machine learning and systems for a broad range of very different system applications that span distributed systems, software support, operating systems, networking, and security. By leveraging our expertise across levels of computer systems and across types of machine learning, our research promises to significantly advance the state of the art in all these systems by making them self-tuning, self-correcting, self-reporting, self-managing, and self-protecting. In addition, our effort will help broaden the foundation of machine learning by both developing new techniques and adapting existing techniques to better fulfill the requirements of these real-world autonomic systems. Finally, the techniques that we develop and the lessons learned from our experience will help both ourselves and others to make progress towards fully integrating machine learning and systems.

Our own experience, and the experience of others, shows that machine learning cannot be integrated into systems as a simple black box. This proposal is motivated by the recognition that to realize the goals of autonomic computing, we will need to achieve a much tighter coupling between systems and machine learning in which system designs are adapted to facilitate machine-learning-based control, and in which machine learning techniques are advanced to meet the demands of large-scale systems.

In order to tightly couple systems and machine learning, this project addresses two classes of research challenges: those relating to defining the systems/AI interface, and those pertaining to tailoring machine learning towards autonomic computing.

By carefully defining the systems/AI interface, we aim to develop general techniques for designing systems that better support autonomic operation. The key enablers of autonomic operation are (i) expressive representations of system behavior that are efficient to measure; (ii) techniques that use machine learning models to improve system feedback and interoperability; and (iii) safe system control, so that users can trust autonomic operation.

The project also aims to advance the state of the art in machine learning by developing new techniques to meet the demands of autonomic computing. For machine learning to meet the challenge of autonomic systems, (i) existing algorithms need to be modified to address privacy and security issues, and (ii) new algorithms need to be developed, including reinforcement learning algorithms for sequential decision-making problems and ensemble methods designed to exploit hierarchical feature representations.

Our goal is to learn how to build autonomic systems. To make autonomic computing practical and widely applicable, we must build several different systems that integrate machine learning and systems. Machine learning and computer systems are large and varied fields: the deep lessons of how to make them work together will become clear only if our case studies contain a reasonable sampling from each field. We have chosen three case studies that cover a wide variety of systems and employ many different types of learning algorithms. The first system performs adaptive resource management for performance tuning of distributed systems, in particular a web server with a database back end. The second provides improved software support by classifying program behavior. The third automates the detection and diagnosis of changing network conditions and the reaction to them.

This project is supported by NSF grant CNS-0615104.

The seed research on this project was sponsored by an IBM faculty award to Peter Stone.

Members

Faculty

James C. Browne

Mike Dahlin

Kathryn S. McKinley

Risto Miikkulainen

Raymond J. Mooney

Vitaly Shmatikov

Peter Stone

Emmett Witchel

Yin Zhang

Students

Nalini Belaramani

Jason Davis

Jungwoo Ha

Nick Jong

Dmitry Kit

Hany Ramadan

Christopher Rossbach

Indrajit Roy

Shimon Whiteson

Jonathan Wildstrom

Publications

Jungwoo Ha, Christopher J. Rossbach, Jason V. Davis, Indrajit Roy, Hany E. Ramadan, Donald E. Porter, David L. Chen, Emmett Witchel.
Improved Error Reporting for Software that Uses Black Box Components.
In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, San Diego, CA, June 2007.

Jonathan Wildstrom, Peter Stone, Emmett Witchel, and Mike Dahlin.
Machine Learning for On-Line Hardware Reconfiguration.
In the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, January 2007.

Jason V. Davis, Jungwoo Ha, Christopher J. Rossbach, Hany E. Ramadan, and Emmett Witchel.
Cost-Sensitive Decision Tree Learning for Forensic Classification.
In Proceedings of the 17th European Conference on Machine Learning, Berlin, Germany, September 2006.

Fall 2007 Meeting Schedule

Starting September 11, meetings will be held biweekly at 11 am in ACES 2.404B.

Paper Readings

Fall 2007

Mary Jean Harrold, James A. Jones, Tongyu Li, Donglin Liang, Alessandro Orso, Maikel Pennings, Saurabh Sinha, Steven Spoon. Regression Test Selection for Java Software, OOPSLA 2001
Shan Lu et al. MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs, SOSP 2007
Secil Ugurel, Robert Krovetz, C. Lee Giles, David M. Pennock, Eric J. Glover, and Hongyuan Zha. What's the Code? Automatic Classification of Source Code Archives, KDD 2002
George Kofi Baah, Alexander Gray, and Mary Jean Harrold. Online Anomaly Detection of Deployed Software: A Statistical Machine Learning Approach, SOQUA 2006
Lin Tan, Ding Yuan, and Yuanyuan Zhou. /* iComment: Bugs or Bad Comments? */, SOSP 2007

Summer 2007

Jeremy Lau, Stefan Schoenmackers, and Brad Calder. Transition Phase Classification and Prediction, HPCA 2005

Spring 2007

K. Cooper, D. Subramanian, and L. Torczon. Adaptive optimizing compilers for the 21st century, Journal of Supercomputing 2002
Giorgio Fumera, Ignazio Pillai, Fabio Roli. Spam Filtering Based On The Analysis Of Text Information Embedded Into Images, JMLR 2006
Long Fei, Samuel P. Midkiff. Artemis: Practical Runtime Monitoring of Applications for Errors, PLDI 2006
J. Zico Kolter, Marcus A. Maloof. Learning to Detect and Classify Malicious Executables in the Wild, JMLR 2006
John Cavazos and Michael O'Boyle. Method-Specific Dynamic Compilation using Logistic Regression, OOPSLA 2006

Fall 2006

Greg Hamerly, Erez Perelman, Jeremy Lau, Brad Calder, Timothy Sherwood. Using Machine Learning to Guide Architecture Simulation, JMLR 2006
James Newsome, Brad Karp, Dawn Song. Paragraph: Thwarting Signature Learning By Training Maliciously, RAID 2006
Jonathan Wildstrom, Peter Stone, Emmett Witchel, and Mike Dahlin. Machine Learning for On-Line Hardware Reconfiguration, IJCAI 2007
Varun Aggarwal, Wesley O. Jim, Una-May O'Reilly. Filter Approximation Using Explicit Time and Frequency Domain Specifications, GECCO 2006
Bianca Schroeder, Adam Wierman and Mor Harchol-Balter. Open Versus Closed: A Cautionary Tale, NSDI 2006

Summer 2006

G. Tesauro, R. Das, N. Jong and M. Bennani. A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation, ICAC 2006
Alice X. Zheng, Michael I. Jordan, Ben Liblit, Mayur Naik, Alex Aiken. Statistical Debugging: Simultaneous Identification of Multiple Bugs, ICML 2006
Sandeep Uttamchandani, Li Yin, Guillermo Alvarez, John Palmer, Gul Agha. Chameleon: a self-evolving, fully-adaptive resource arbitrator for storage systems, USENIX 2005

Spring 2006

Benjamin J. Kuipers, Alex X. Liu, Aashin Gautam, and Mohamed G. Gouda. Zmail: Zero-sum free market control of spam, ADSN 2005
V. Yegneswaran, J. T. Giffin, P. Barford and S. Jha. An Architecture for Generating Semantics-Aware Signatures, USENIX Security 2005
Blum, Dwork, McSherry, Nissim. Practical Privacy: The SuLQ Framework
J. Semke, J. Mahdavi, M. Mathis. Automatic TCP Buffer Tuning, ACM SIGCOMM '98
Lindell, Pinkas. Privacy Preserving Data Mining, Lecture Notes in Computer Science 2000
Evfimievski A., Gehrke J., Srikant R. Limiting Privacy Breaches in Privacy Preserving Data Mining, PODS 2003
P. Broadwell, M. Harren and Naveen Sastry. Scrash: A System for Generating Secure Crash Information, USENIX Security 2003

Fall 2005

Calder, Grunwald, Jones, Lindsay, Martin, Mozer, and Zorn. Evidence-based Static Branch Prediction using Machine Learning
Helen J. Wang, John C. Platt, Yu Chen, Ruyun Zhang and Yi-Min Wang. Automatic Misconfiguration Troubleshooting with PeerPressure
Ira Cohen, S. Zhang, M. Goldszmidt, J. Symons, T. Kelly, A. Fox. Capturing, Indexing, Clustering, and Retrieving System History
D.A. Menasce, R. Dodge and D. Barbara. Preserving QoS of E-commerce Sites Through Self-Tuning: A Performance Model Approach
Li Zhuang, Feng Zhou, and J. D. Tygar. Keyboard Acoustic Emanations Revisited
C. Liu, X. Yan, H. Yu, J. Han & P. S. Yu. Mining Behavior Graphs for "Backtrace" of Noncrashing Bugs
D. T. McWherter, B. Schroeder, A. Ailamaki & M. Harchol-Balter. Priority Mechanisms for OLTP and Transactional Web Applications

Summer 2005

J.F. Murray, G. F. Hughes & K. Kreutz-Delgado. Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application, JMLR 6 (May): 783--816, 2005
B. Liblit, M. Naik, A. X. Zheng, A. Aiken & M. Jordan. Scalable Statistical Bug Isolation, PLDI 2005
N. Dalvi, P. Domingos, Mausam, S. Sanghai & D. Verma. Adversarial Classification

Spring 2005

L. Ertoz, E. Eilertson, A. Lazarevic, P. Tan, J. Srivastava, V. Kumar, P. Dokas. The MINDS - Minnesota Intrusion Detection System, "Next Generation Data Mining", MIT Press, 2004
J. Z. Kolter & M. A. Maloof. Learning to Detect Malicious Executables in the Wild, KDD04
J. Newsome, B. Karp & D. Song. Polygraph: Automatically Generating Signatures for Polymorphic Worms, SPS 05
G. Tesauro et al. Decompositional Reinforcement Learning and Workload Management
S. Forrest, J. Balthrop, M. Glickman & D. Ackley. Computation in the Wild
J. Balthrop, F. Esponda, S. Forrest & M. Glickman. Coverage and Generalization in an Artificial Immune System
A. Fox, E. Kiciman & D. Patterson. Combining Statistical Monitoring and Predictable Recovery for Self-Management, WOSS 04

Fall 2004

A. Fern, R. Givan, B. Falsafi & T. N. VijayKumar. Dynamic Feature Selection for Hardware Prediction, 2004
A. V. Mirgorodskiy & B. P. Miller. Autonomous Analysis of Interactive Systems with Self-Propelled Instrumentation, MCNC 2005
IBM white paper. An architectural blueprint for Autonomic Computing
B. Liblit, A. Aiken, A. X. Zheng & M. I. Jordan. Bug Isolation via Remote Program Sampling, PLDI 2003
I. Cohen, M. Goldszmidt, T. Kelly, J. Symons & J. Chase. Correlating instrumentation data to system states: a building block for automated diagnosis and control, OSDI 04
A. B. Brown, J. Hellerstein, M. Hogstrom, T. Lau, S. Lightstone, P. Shum & M. Peterson. Benchmarking Autonomic Capabilities: Promises and Pitfalls, ICAC 2004
Y. Diao, J.L. Hellerstein, S. Parekh & J.P. Bigus. Managing Web Server Performance with AutoTune Agent, IBM Systems Journal, Vol 42, No. 1, 2003

Summer 2004

M. Y. Chen, E. Kiciman, E. Fratkin, A. Fox & E. Brewer. Pinpoint: Problem Determination in Large, Dynamic Internet Services, DSN02
P. Barham, R. Isaacs, R. Mortier & D. Narayanan. Magpie: real-time modelling and performance-aware systems, HotOS03
A. Brown, G. Kar & A. Keller. An Active Approach to Characterizing Dynamic Dependencies for Problem Determination in a Distributed Application Environment, IM01
Y.H. Chang, T. Ho & L. P. Kaelbling. Mobilized Ad-Hoc Networks: A Reinforcement Learning Approach, AIM03
M. K. Aguilera, J. C. Mogul, J. L. Wiener, P. Reynolds & A. Muthitacharoen. Performance Debugging for Distributed Systems of Black Boxes, SOSP03
W. E. Walsh, G. Tesauro, J. O. Kephart & R. Das. Utility Functions in Autonomic Systems, ICAC04

Spring 2004

J. L. Hellerstein, F. Zhang & P. Shahabuddin. Characterizing normal operation of a web server: Application to workload forecasting and problem detection, CMG 98
G. Aggarwal, M. Datar, N. Mishra & R. Motwani. On Identifying Stable Ways to Configure Systems, ICAC04
M. Mesnier, E. Thereska, D. Ellard, G. R. Ganger & M. Seltzer. File Classification in Self-* Storage Systems, ICAC04
S. Whiteson & P. Stone. Towards Autonomic Computing: Adaptive Network Routing and Scheduling, IAAI04
T. Abdelzaher, K. G. Shin, N. Bhatti. Performance Guarantees for Web Server End-Systems: A Control-Theoretical Approach, PDS02
M. Chen, A. X. Zheng, J. Lloyd, M. I. Jordan & E. Brewer. Failure Diagnosis Using Decision Trees, ICAC04
F. Gomez, D. Burger & R. Miikkulainen. A Neuroevolution Method for Dynamic Resource Allocation on a Chip Multiprocessor, IJCNN-01
J.O. Kephart & D. M. Chess. The Vision of Autonomic Computing, IEEE Computer, 36(1):41--50. IEEE, January 2003
R.J. Brachman. Systems That Know What They're Doing, IEEE Intelligent Systems, 17(6), Nov-Dec 2002, pages 67-71

For more information, please contact Indrajit Roy