publications| bio| research statement|
cv | professional activities | twitter


Email: swarat@cs.utexas.edu
Skype: swarat.chaudhuri

Administrative support:
Megan Booth, meganb@cs.utexas.edu

Mailing address:
GDC 5.810
Computer Science Department,
University of Texas at Austin
Austin, TX 78712.


My new lab at UTCS is still growing! If you would like to do a Ph.D. with me, please apply to UTCS and mention my name in the application, or write to me if you have questions. If you would like to do a postdoc, please write to me with a summary of your background and objectives.


My research lies in the intersection of programming languages (PL) and machine learning (ML). Through a synthesis of ideas from PL and ML, I aim to help create a new generation of intelligent systems that are reliable, transparent, and secure, and can perform complex tasks that are beyond the scope of contemporary AI.

A recurring theme in my recent work is to represent learning-enabled systems as neurosymbolic programs that model high-level reasoning using PL primitives and lower-level pattern recognition using neural networks. The learning of such models is an instance of program synthesis, the problem of automatically discovering programs from user-provided specifications. We approach this problem using a combination of statistical ML techniques and language-directed, solver-driven algorithms from PL and formal methods.

Here are some of the concrete contexts in which my students, collaborators, and I are exploring these ideas.

  • Learning over constrained data: Neurosymbolic methods are useful for learning over data that must satisfy rich structural and semantic constraints. For example, our recent work [ICLR18] studies learning over source code, a form of such data, with the goal of building AIs that help software engineers. We show that a combination of deep learning and symbolic abstraction mechanisms outperform purely neural techniques in this setting.
  • Interpretable and certified learning: Trust is a central challenge in modern AI. In particular, neural networks can be alarmingly brittle and biased, but yet cannot be debugged using traditional software development methodologies. Our research addresses these problems in two ways. First, we develop methods for formally certifying neural networks [S&P18, PLDI19]. Second, we develop methods for learning high-level programmatic models that are expressive but can also be vetted and debugged by humans [ICML18].
  • Programming knowledge into learners: Blackbox ML models often need unreasonable amounts of data to learn reliably and understand commonsense facts about the world. Models with programmatic elements that capture partial, domain-specific knowledge about the world can be a way around this problem. For example, our recent work studies programmatic policies, discovered using neurosymbolic methods, for reinforcement learning, and shows that they lead to better generalization and lower variance in learning than purely neural policies [ICML18, ICML19, NeurIPS19].
  • Compositional learning: Traditional ML models are monolithic entities that must be trained from scratch on every new dataset. In contrast, modern programming is inherently compositional. Our recent work bridges the gap between these perspectives through a program synthesis system that searches over complex compositions or a set of symbolic combinators and a set of neural "library functions". We show that such a compositional approach is an effective way of transferring knowledge across rounds of learning [NeurIPS18].
  • Learning to reason: Finally, I am interested in combining statistical and symbolic techniques in building the reasoning engines that enable modern formal methods. Such engines, ranging from satisfiability solvers to higher-order theorem provers to formal program synthesizers, must search vast combinatorial spaces of symbolic expressions. This is, needless to say, difficult; however, hand-designed heuristics have sped up these solvers tremendously in the recent past. We aim to use domain-specific data and exploration to automatically learn and optimize these heuristics.

Check out this keynote I gave at CAV 2019 for more details on some of these topics. You can also read these posts ([1], [2], [3]) I wrote at ACM SIGPLAN's blog PL Perspectives.


Before I began working on PL ∩ ML, I worked for many years on the theory and practice of classical formal methods. My Ph.D. work was on the logical and automata-theoretic foundations of software model checking [TOPLAS11]. Subsequently, I worked on the analysis of quantitative program properties such as robustness [POPL10, CACM12]. I have been working on program synthesis from formal specifications since 2009. Highlights of my work on this topic include:

  • Synthesis of functional programs: We developed techniques, combining language-directed search and type-directed automatic reasoning, for synthesizing functional programs that satisfy a given constraint, for instance a specification of what output to produce on a set of example inputs [PLDI15]. These methods are especially useful for automatically discovering scripts for accomplishing various end-user tasks [PLDI16, PLDI17].
  • Optimal program synthesis: We studied methods for synthesizing programs that optimize a quantitative objective (in some cases, while provably satisfying a boolean requirement). The most significant of our approaches to this problem was program smoothing, a systematic method for approximating programs by differentiable functions [POPL14-b, PLDI10].
  • Reactive program synthesis: We developed a new symbolic approach to the classic problem of synthesizing reactive controllers that provably satisfy of a temporal logic requirement. [POPL14-a]. Subsequent work utilized these methods to do better policy synthesis for robots [ICAPS16].
  • Multilevel program synthesis: We developed techniques to synthesize programs under requirements that cut across multiple levels of abstraction. We studied this problem in the setting of robotics; our motivating application was the discovery of policies for robots that satisfy a set of high-level task objectives as well as a set of lower-level motion constraints [ICRA14, RSS16, ICAPS16, AAMAS18, WAFR18].

There is a direct arc from this line of work to research on program learning. Indeed, program learning extends formal program synthesis by allowing specifications to be only partially known and setting up generalization to unseen inputs as an explicit goal.

  • Abhinav Verma (Ph.D.)
  • Greg Anderson (Ph.D.; co-advised with Isil Dillig)
  • Josh Hoffman (Ph.D.; co-advised with Joydeep Biswas)
  • Chenxi Yang (Ph.D.)
  • Ethan Wen (Ph.D.; starting in Spring 2021)
  • Calvin Smith (Postdoc)
  • Anders Miltner (Postdoc; co-advised with Isil Dillig)
  • Dipak Chaudhari (Research scientist)


(Fall 2020) CS 378: Logic in Computer Science and Artificial Intelligence

(Spring 2020) CS 395T: Program Synthesis

(Spring 2019, Spring 2018) COMP 403/503: Reasoning about software

(Fall 2019, Fall 2018, Fall 2016, Fall 2015, Fall 2014, Spring 2014) COMP 382: Reasoning about algorithms

(Spring 2015, Fall 2013, Fall 2012) COMP 507: Computer-Aided Program Design