Cognitive Systems Research Group



Autonomic computing has tremendous potential to advance computer systems research by simplifying the design, deployment, configuration, and support of large-scale systems. To realize such potential in practice, however, a number of research challenges must be addressed to integrate machine learning with computer systems. The goal of the proposed research is to obtain a fundamental understanding of such challenges and develop effective techniques to address them in the most general way possible. To this end, we will develop prototypes that integrate machine learning and systems for a broad range of very different system applications that span distributed systems, software support, operating systems, networking, and security. By leveraging our expertise across levels of computer systems and across types of machine learning, our research promises to significantly advance the state of the art in all these systems by making them self-tuning, self-correcting, self-reporting, self-managing, and self-protecting. In addition, our effort will help broaden the foundation of machine learning by both developing new techniques and adapting existing techniques to better fulfill the requirements of these real-world autonomic systems. Finally, the techniques that we develop and the lessons learned from our experience will help both ourselves and others to make progress towards fully integrating machine learning and systems.

Our own experience, and the experience of others, shows that machine learning cannot be integrated into systems as a simple black box. This proposal is motivated by the recognition that to realize the goals of autonomic computing, we will need to achieve a much tighter coupling between systems and machine learning in which system designs are adapted to facilitate machine-learning-based control, and in which machine learning techniques are advanced to meet the demands of large-scale systems.

In order to tightly couple systems and machine learning, this project addresses two classes of research challenges: those relating to defining the systems/AI interface, and those pertaining to tailoring machine learning towards autonomic computing.

By carefully defining the systems/AI interface, we aim to develop general techniques for designing systems that better support autonomic operation. The key enablers of autonomic operation are to (i) develop expressive representations of system behavior that are efficient to measure; (ii) develop techniques that use machine learning models to improve system feedback and interoperability; and (iii) ensure safe system control so that users can trust autonomic operation.

The project aims to advance the state of the art in machine learning by developing new techniques to meet the demands of autonomic computing. For machine learning to meet the challenges of autonomic systems, (i) existing algorithms need to be modified to address privacy and security issues, and (ii) new algorithms need to be developed, including reinforcement learning algorithms for sequential decision-making problems and ensemble methods designed to exploit hierarchical feature representations.

Our goal is to learn how to build autonomic systems. To make autonomic computing practical and widely applicable, we must build several different systems that integrate machine learning with computer systems. Machine learning and computer systems are large and varied fields: the deep lessons of how to make them work together will become clear only if our case studies contain a reasonable sampling from each field. We have chosen three case studies that cover a wide variety of systems and employ many different types of learning algorithms. The first system performs adaptive resource management for performance tuning of distributed systems, particularly a web server with a database back end. The second provides improved software support by classifying program behavior. The third automates the detection and diagnosis of, and reaction to, changing network conditions.

This project is supported by NSF grant CNS-0615104.

The seed research on this project was sponsored by an IBM faculty award to Peter Stone.



James C. Browne
Mike Dahlin
Kathryn S. McKinley
Risto Miikkulainen
Raymond J. Mooney
Vitaly Shmatikov
Peter Stone
Emmett Witchel
Yin Zhang


Nalini Belaramani
Jason Davis
Jungwoo Ha
Nick Jong
Dmitry Kit
Hany Ramadan
Christopher Rossbach
Indrajit Roy
Shimon Whiteson
Jonathan Wildstrom


Fall 2007 Meeting Schedule


Suggested Readings

  • Crispin Cowan, Seth Arnold, Steve Beattie, Chris Wright, and John Viega. Defcon Capture the Flag: Defending Vulnerable Code from Intense Attack. DARPA DISCEX III Conference, 2003.
  • Secil Ugurel, Robert Krovetz, C. Lee Giles, David M. Pennock, Eric J. Glover, and Hongyuan Zha. What's the Code? Automatic Classification of Source Code Archives. KDD, 2002.
  • Murali Haran, Alan Karr, Alessandro Orso, Adam Porter, and Ashish Sanil. Applying Classification Techniques to Remotely-Collected Program Execution Data. FSE, 2005.
  • George K. Baah, Alexander Gray, and Mary Jean Harrold. On-line Anomaly Detection of Deployed Software: A Statistical Machine Learning Approach. FSE, 2006.
  • A. Zheng, M. I. Jordan, B. Liblit, M. Naik, and A. Aiken. Statistical Debugging: Simultaneous Identification of Multiple Bugs. ICML, 2006.
  • H. Liu, V. Bhat, M. Parashar, and S. Klasky. An Autonomic Service Architecture for Self-Managing Grid Applications. IEEE Computer Society Press, 2005.
  • Andrej Bratko, Bogdan Filipic, Gordon V. Cormack, Thomas R. Lynam, and Blaz Zupan. Spam Filtering Using Statistical Data Compression Models. JMLR, 2006.
  • P. Ruth, J. Rhee, D. Xu, R. Kennell, and S. Goasguen. Autonomic Live Adaptation of Virtual Computational Environments in a Multi-domain Infrastructure. ICAC, 2006.
  • R. Tibshirani and T. Hastie. Margin Trees for High-dimensional Classification. JMLR, 2007.
  • Umut A. Acar, Guy Blelloch, Matthias Blume, and Kanat Tangwongsan. An Experimental Analysis of Self-Adjusting Computation. PLDI, 2006.
  • Sandeep Uttamchandani, Kaladhar Voruganti, Sudarshan M. Srinivasan, John Palmer, and David Pease. Polus: Growing Storage QoS Management Beyond a "4-Year Old Kid". FAST, 2004.
  • Lin Qiao, Balakrishna R. Iyer, Divyakant Agrawal, Amr El Abbadi, and Sandeep Uttamchandani. PulStore: Automated Storage Management with QoS Guarantee in Large-scale Virtualized Storage Systems. ICAC, 2005.
  • Miscellaneous Topics in Privacy Preserving (Collection of papers)
  • If you find a paper interesting and would like to recommend it for reading, please let us know during a meeting or email Indrajit Roy.

    Paper Readings

    Fall 2007

    Summer 2007

    Spring 2007

    Fall 2006

    Summer 2006

    Spring 2006

    Fall 2005

    Summer 2005

    Spring 2005

    Fall 2004

    Summer 2004

    Spring 2004

    For more information, please contact Indrajit Roy