Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: Learning for Planning and Problem Solving

Most learning research concerns classification. Research on learning for planning and problem solving focuses instead on improving the performance of an AI planning or problem-solving system through experience. Our work has focused on integrating explanation-based learning (EBL) and inductive learning (specifically, inductive logic programming, ILP) to improve both the efficiency (speedup learning) and the solution quality of planning and problem-solving systems: the learner solves sample problems and induces heuristics that help the planner avoid backtracking and sub-optimal solutions.

Our work has focused on two systems:

  • SCOPE: Learning control rules for partial-order planning to improve efficiency and plan quality
  • DOLPHIN: Learning clause-selection rules for dynamic optimization of logic programs
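
Both systems share the same basic learning loop: solve sample training problems, label the choices made during search, and induce selection heuristics that steer later search away from backtracking and poor-quality refinements. The Python sketch below is a minimal, hypothetical illustration of that loop; the planner interface, the decision records, and the stubbed induction step are assumptions made for exposition, not the actual SCOPE or DOLPHIN implementations (which operate on Prolog/UCPOP representations and use FOIL-style induction constrained by EBL proofs).

    # Hypothetical sketch of EBL+ILP-style control-rule learning for a planner.
    # The planner is assumed to record its search decisions; the inductive
    # (ILP) step is stubbed out.

    def collect_decision_examples(planner, training_problems):
        """Solve sample problems and label each search decision."""
        positives, negatives = [], []
        for problem in training_problems:
            trace = planner.solve_and_trace(problem)   # assumed to return decision records
            for decision in trace.decisions:
                if decision.on_solution_path:
                    positives.append(decision)         # choices on the final solution path
                else:
                    negatives.append(decision)         # choices that were backtracked over
        return positives, negatives

    def induce_selection_rules(positives, negatives):
        """Placeholder for the inductive step: a FOIL-like search over candidate
        rule conditions, constrained by explanation-based generalization."""
        raise NotImplementedError("ILP induction omitted from this sketch")

    def guided_choice(candidates, state, rules):
        """At planning time, prefer refinements whose learned conditions hold."""
        preferred = [c for c in candidates if any(r.matches(c, state) for r in rules)]
        return preferred or candidates                 # fall back if no rule fires
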
  1. CAPE: Corrective Actions from Precondition Errors using Large Language Models
    [Details] [PDF]
    Shreyas Sundara Raman, Vanya Cohen, Ifrah Idrees, Eric Rosen, Raymond Mooney, Stefanie Tellex, and David Paulius
    In International Conference on Robotics and Automation (ICRA), May 2024.
    Extracting knowledge and reasoning from large language models (LLMs) offers a path to designing intelligent robots. Common approaches that leverage LLMs for planning are unable to recover when actions fail and resort to retrying failed actions without resolving the underlying cause. We propose a novel approach (CAPE) that generates corrective actions to resolve precondition errors during planning. CAPE improves the quality of generated plans through few-shot reasoning on action preconditions. Our approach enables embodied agents to execute more tasks than baseline methods while maintaining semantic correctness and minimizing re-prompting. In VirtualHome, CAPE improves a human-annotated plan correctness metric from 28.89 percent to 49.63 percent over SayCan, whilst achieving competitive executability. Our improvements transfer to a Boston Dynamics Spot robot initialized with a set of skills (specified in language) and associated preconditions, where CAPE improves correctness by 76.49 percent with higher executability compared to SayCan. Our approach enables embodied agents to follow natural language commands and robustly recover from failures.
    ML ID: 426
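
    The corrective re-prompting loop described in this abstract can be sketched as follows. This is a hedged illustration only: llm_complete, agent.execute, and PreconditionError are hypothetical stand-ins, not the CAPE codebase or a real robot API.

        # Hypothetical sketch of corrective re-prompting on precondition errors.
        # llm_complete(prompt) is assumed to return the next action as text, and
        # agent.execute(action) is assumed to raise PreconditionError with a
        # message (e.g., "the door is locked") when preconditions are unmet.
        from collections import deque

        class PreconditionError(Exception):
            """Raised by the (hypothetical) agent when an action cannot be executed."""

        def execute_plan_with_corrections(agent, llm_complete, task, plan, max_reprompts=3):
            pending = deque(plan)
            executed = []
            reprompts = 0
            while pending:
                action = pending[0]
                try:
                    agent.execute(action)
                    executed.append(pending.popleft())
                except PreconditionError as err:
                    reprompts += 1
                    if reprompts > max_reprompts:
                        return executed, False          # give up rather than loop forever
                    # Re-prompt with the error so the LLM proposes a corrective action
                    # (e.g., "unlock the door") instead of blindly retrying the failure.
                    prompt = (f"Task: {task}\n"
                              f"Executed so far: {executed}\n"
                              f"Action '{action}' failed because: {err}\n"
                              f"Corrective action:")
                    pending.appendleft(llm_complete(prompt).strip())
            return executed, True
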
  2. Using Planning to Improve Semantic Parsing of Instructional Texts
    [Details] [PDF] [Slides (PDF)]
    Vanya Cohen, Raymond Mooney
    Association for Computational Linguistics (ACL), Natural Language Reasoning and Structured Explanations Workshop, July 2023.
    We develop a symbolic planning-based decoder to improve the few-shot semantic parsing of instructional texts. The system takes long-form instructional texts as input and produces sequences of actions in a formal language that enable execution of the instructions. This task poses unique challenges since input texts may contain long context dependencies and ambiguous and domain-specific language. Valid semantic parses also require sequences of steps that constitute an executable plan. We build on recent progress in semantic parsing by leveraging large language models to learn parsers from small amounts of training data. During decoding, our method employs planning methods and domain information to rank and correct candidate parses. To validate our method, we evaluate on four domains: two household instruction-following domains and two cooking recipe interpretation domains. We present results for few-shot semantic parsing using leave-one-out cross-validation. We show that utilizing planning domain information improves the quality of generated plans. Through ablations we also explore the effects of our decoder design choices.
    ML ID: 420
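
    The reranking idea in this abstract can be sketched as follows. This is a hedged, simplified illustration: llm_parse_candidates and plan_is_executable are hypothetical stand-ins for the paper's LLM-based parser and its symbolic planning/validation step, not its actual interfaces.

        # Hedged sketch: keep the parser's ranking of candidate parses, but
        # require that the chosen parse yields an executable plan in the domain.
        def parse_with_planning(text, llm_parse_candidates, plan_is_executable, n=10):
            candidates = llm_parse_candidates(text, n=n)   # ordered by model score
            for parse in candidates:
                if plan_is_executable(parse):              # planner-based filter/correction
                    return parse
            return candidates[0] if candidates else None   # fall back to the top parse
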
  3. Planning with Large Language Models via Corrective Re-prompting
    [Details] [PDF]
    Shreyas Sundara Raman, Vanya Cohen, Eric Rosen, Ifrah Idrees, David Paulius, Stefanie Tellex
    In Foundation Models for Decision Making Workshop at NeurIPS 2022, December 2022.
    Extracting the common sense knowledge present in Large Language Models (LLMs) offers a path to designing intelligent, embodied agents. Related works have queried LLMs with a wide range of contextual information, such as goals, sensor observations and scene descriptions, to generate high-level action plans for specific tasks; however, these approaches often involve human intervention or additional machinery to enable sensor-motor interactions. In this work, we propose a prompting-based strategy for extracting executable plans from an LLM, which leverages a novel and readily-accessible source of information: precondition errors. Our approach assumes that actions are only afforded execution in certain contexts, i.e., implicit preconditions must be met for an action to execute (e.g., a door must be unlocked to open it), and that the embodied agent has the ability to determine if the action is/is not executable in the current context (e.g., detect if a precondition error is present). When an agent is unable to execute an action, our approach re-prompts the LLM with precondition error information to extract an executable corrective action to achieve the intended goal in the current context. We evaluate our approach in the VirtualHome simulation environment on 88 different tasks and 7 scenes. We evaluate different prompt templates and compare to methods that naively resample actions from the LLM. Our approach, using precondition errors, improves executability and semantic correctness of plans, while also reducing the number of re-prompts required when querying actions.
    ML ID: 415
  4. Using Multi-Strategy Learning to Improve Planning Efficiency and Quality
    [Details] [PDF]
    Tara A. Estlin
    PhD Thesis, Department of Computer Sciences, The University of Texas at Austin, Austin, TX, 1998.
    Artificial intelligence planning systems have become an important tool for automating a wide variety of tasks. However, even the most current planning algorithms suffer from two major problems. First, they often require infeasible amounts of computation time to solve problems in most domains. Second, they are not guaranteed to return the best solution to a planning problem, and in fact can sometimes return very low-quality solutions. One way to address these problems is to provide a planning system with domain-specific control knowledge, which helps guide the planner towards more promising search paths. Machine learning techniques enable a planning system to automatically acquire search-control knowledge for different applications. A considerable amount of planning and learning research has been devoted to acquiring rules that improve planning efficiency, also known as speedup learning. Much less work has been done on learning knowledge to improve the quality of plans, even though this is an essential feature for many real-world planning systems. Furthermore, even less research has been done on acquiring control knowledge to improve both these metrics.

    The learning system presented in this dissertation, called SCOPE, takes a unique approach to learning control knowledge for planning. SCOPE learns domain-specific control rules for a planner that improve both planning efficiency and plan quality, and it is one of the few systems that can learn control knowledge for partial-order planning. SCOPE's architecture integrates explanation-based learning (EBL) with techniques from inductive logic programming. Specifically, EBL is used to constrain an inductive search for control heuristics that help a planner choose between competing plan refinements. Since SCOPE uses a very flexible training approach, its learning algorithm can be easily focused to prefer search paths that are better for particular evaluation metrics. SCOPE is extensively tested on several planning domains, including a logistics transportation domain and a production manufacturing domain. In these tests, it is shown to significantly improve both planning efficiency and plan quality and to be more robust than a competing approach.

    ML ID: 81
  5. Learning to Improve both Efficiency and Quality of Planning
    [Details] [PDF]
    Tara A. Estlin and Raymond J. Mooney
    In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), 1227-1232, Nagoya, Japan, 1997.
    Most research in learning for planning has concentrated on efficiency gains. Another important goal is improving the quality of final plans. Learning to improve plan quality has been examined by a few researchers; however, little research has been done on learning to improve both efficiency and quality. This paper explores this problem by using the SCOPE learning system to acquire control knowledge that improves on both of these metrics. Since SCOPE uses a very flexible training approach, we can easily focus its learning algorithm to prefer search paths that are better for particular evaluation metrics. Experimental results show that SCOPE can significantly improve both the quality of final plans and overall planning efficiency.
    ML ID: 77
  6. Integrating Explanation-Based and Inductive Learning Techniques to Acquire Search-Control for Planning
    [Details] [PDF]
    Tara A. Estlin
    Technical Report AI96-250, Department of Computer Sciences, University of Texas, Austin, TX, 1996.
    Planning systems have become an important tool for automating a wide variety of tasks. Control knowledge guides a planner to find solutions quickly and is crucial for efficient planning in most domains. Machine learning techniques enable a planning system to automatically acquire domain-specific search-control knowledge for different applications. Past approaches to learning control information have usually employed explanation-based learning (EBL) to generate control rules. Unfortunately, EBL alone often produces overly complex rules that actually decrease rather than improve overall planning efficiency. This paper presents a novel learning approach for control knowledge acquisition that integrates explanation-based learning with techniques from inductive logic programming. In our learning system SCOPE, EBL is used to constrain an inductive search for control heuristics that help a planner choose between competing plan refinements. SCOPE is one of the few systems to address learning control information for newer, partial-order planners. Specifically, this proposal describes how SCOPE learns domain-specific control rules for the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains, and to be more effective than a pure EBL approach.

    Future research will be performed in three main areas. First, SCOPE's learning algorithm will be extended to include additional techniques such as constructive induction and rule utility analysis. Second, SCOPE will be more thoroughly tested; several real-world planning domains have been identified as possible testbeds, and more in-depth comparisons will be drawn between SCOPE and other competing approaches. Third, SCOPE will be implemented in a different planning system in order to test its portability to other planning algorithms. This work should demonstrate that machine-learning techniques can be a powerful tool in the quest for tractable real-world planning.

    ML ID: 67
  7. Multi-Strategy Learning of Search Control for Partial-Order Planning
    [Details] [PDF]
    Tara A. Estlin and Raymond J. Mooney
    In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), 843-848, Portland, OR, August 1996.
    Most research in planning and learning has involved linear, state-based planners. This paper presents SCOPE, a system for learning search-control rules that improve the performance of a partial-order planner. SCOPE integrates explanation-based and inductive learning techniques to acquire control rules for a partial-order planner. Learned rules are in the form of selection heuristics that help the planner choose between competing plan refinements. Specifically, SCOPE learns domain-specific control rules for a version of the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains.
    ML ID: 63
  8. Integrating EBL and ILP to Acquire Control Rules for Planning
    [Details] [PDF]
    Tara A. Estlin and Raymond J. Mooney
    In Proceedings of the Third International Workshop on Multi-Strategy Learning (MSL-96), 271-279, Harpers Ferry, WV, May 1996.
    Most approaches to learning control information in planning systems use explanation-based learning to generate control rules. Unfortunately, EBL alone often produces overly complex rules that actually decrease planning efficiency. This paper presents a novel learning approach for control knowledge acquisition that integrates explanation-based learning with techniques from inductive logic programming. EBL is used to constrain an inductive search for selection heuristics that help a planner choose between competing plan refinements. SCOPE is one of the few systems to address learning control information in the newer partial-order planners. Specifically, SCOPE learns domain-specific control rules for a version of the UCPOP planning algorithm. The resulting system is shown to produce significant speedup in two different planning domains.
    ML ID: 60
  9. Hybrid Learning of Search Control for Partial-Order Planning
    [Details] [PDF]
    Tara A. Estlin and Raymond J. Mooney
    In Malik Ghallab and Alfredo Milani, editors, New Directions in AI Planning, 129-140, Amsterdam, 1996. IOS Press.
    This paper presents results on applying a version of the DOLPHIN search-control learning system to speed up a partial-order planner. DOLPHIN integrates explanation-based and inductive learning techniques to acquire effective clause-selection rules for Prolog programs. A version of the UCPOP partial-order planning algorithm has been implemented as a Prolog program and DOLPHIN used to automatically learn domain-specific search control rules that help eliminate backtracking. The resulting system is shown to produce significant speedup in several planning domains.
    ML ID: 52
  10. Integrating ILP and EBL
    [Details] [PDF]
    Raymond J. Mooney and John M. Zelle
    SIGART Bulletin (special issue on Inductive Logic Programming), 5(1):12-21, 1994.
    This paper presents a review of recent work that integrates methods from Inductive Logic Programming (ILP) and Explanation-Based Learning (EBL). ILP and EBL methods have complementary strengths and weaknesses, and a number of recent projects have effectively combined them into systems with better performance than either of the individual approaches. In particular, integrated systems have been developed for guiding induction with prior knowledge (ML-SMART, FOCL, GRENDEL), refining imperfect domain theories (FORTE, AUDREY, Rx), and learning effective search-control knowledge (AxA-EBL, DOLPHIN).
    ML ID: 30
  11. Combining FOIL and EBG to Speed-Up Logic Programs
    [Details] [PDF]
    John M. Zelle and Raymond J. Mooney
    In Proceedings of the 13th International Joint Conference on Artificial Intelligence, 1106-1111, 1993. San Francisco, CA: Morgan Kaufmann.
    This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speed-up may be achieved. The algorithm is shown to be an improvement over competing EBL approaches in several domains. Additionally, the algorithm is capable of automatically transforming some intractable algorithms into ones that run in polynomial time.
    ML ID: 27
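
    The kind of speedup described here can be pictured with a toy generate-and-test program: unguided search over permutations is exponential, while a clause-selection condition that only extends a candidate with its minimum remaining element makes the same search deterministic and polynomial. The Python sketch below is an illustration of the idea only, not the paper's Prolog formulation, and the guard is hand-written here rather than learned.

        # Toy illustration of speedup from a clause-selection guard.
        from itertools import permutations

        def naive_sort(xs):
            """Generate-and-test: try permutations until a sorted one is found."""
            for perm in permutations(xs):
                if all(a <= b for a, b in zip(perm, perm[1:])):
                    return list(perm)

        def guided_sort(xs):
            """Same search, but each choice point is pruned by a selection condition."""
            remaining, result = list(xs), []
            while remaining:
                choice = min(remaining)    # the "control rule": pick the element that
                remaining.remove(choice)   # keeps the partial result sorted
                result.append(choice)
            return result
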
  12. Learning Search-Control Heuristics for Logic Programs: Applications to Speedup Learning and Language Acquisition
    [Details] [PDF]
    John M. Zelle
    March 1993. Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin.
    This paper presents a general framework, learning search-control heuristics for logic programs, which can be used to improve both the efficiency and accuracy of knowledge-based systems expressed as definite-clause logic programs. The approach combines techniques of explanation-based learning and recent advances in inductive logic programming to learn clause-selection heuristics that guide program execution. Two specific applications of this framework are detailed: dynamic optimization of Prolog programs (improving efficiency) and natural language acquisition (improving accuracy). In the area of program optimization, a prototype system, DOLPHIN, is able to transform some intractable specifications into polynomial-time algorithms, and outperforms competing approaches in several benchmark speedup domains. A prototype language acquisition system, CHILL, is also described. It is capable of automatically acquiring semantic grammars, which uniformly incorporate syntactic and semantic constraints to parse sentences into case-role representations. Initial experiments show that this approach is able to construct accurate parsers which generalize well to novel sentences and significantly outperform previous approaches to learning case-role mapping based on connectionist techniques. Planned extensions of the general framework and the specific applications as well as plans for further evaluation are also discussed.
    ML ID: 21
  13. Speeding-up Logic Programs by Combining EBG and FOIL
    [Details] [PDF]
    John M. Zelle and Raymond J. Mooney
    In Proceedings of the 1992 Machine Learning Workshop on Knowledge Compilation and Speedup Learning, Aberdeen, Scotland, July 1992.
    This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speed-up may be achieved. The algorithm produces not only EBL-like speed up of problem solvers, but is capable of automatically transforming some intractable algorithms into ones that run in polynomial time.
    ML ID: 18
  14. The Effect of Rule Use on the Utility of Explanation-Based Learning
    [Details] [PDF]
    Raymond J. Mooney
    In Proceedings of the 11th International Joint Conference on Artificial Intelligence, 725-730, 1989. San Francisco, CA: Morgan Kaufmann.
    The utility problem in explanation-based learning concerns the ability of learned rules or plans to actually improve the performance of a problem solving system. Previous research on this problem has focused on the amount, content, or form of learned information. This paper examines the effect of the use of learned information on performance. Experiments and informal analysis show that unconstrained use of learned rules eventually leads to degraded performance. However, constraining the use of learned rules helps avoid the negative effect of learning and leads to overall performance improvement. Search strategy is also shown to have a substantial effect on the contribution of learning to performance by affecting the manner in which learned rules are used. These effects help explain why previous experiments have obtained a variety of different results concerning the impact of explanation-based learning on performance.
    ML ID: 210
  15. Generalizing the Order of Operators in Macro-Operators
    [Details] [PDF]
    Raymond J. Mooney
    In Proceedings of the Fifth International Conference on Machine Learning (ICML-88), 270-283, Ann Arbor, MI, June 1988.
    A number of machine learning systems have been built which learn macro-operators or plan schemata, i.e., general compositions of actions that achieve a goal. However, previous research has not addressed the issue of generalizing the temporal order of operators and learning macro-operators with partially-ordered actions. This paper presents an algorithm for learning partially-ordered macro-operators which has been incorporated into the EGGS domain-independent explanation-based learning system. Examples from the domains of computer programming and narrative understanding are used to illustrate the performance of this system. These examples demonstrate that generalizing the order of operators can result in more general as well as more justified concepts. A theoretical analysis of the time complexity of the generalization algorithm is also presented.
    ML ID: 209
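
    One way to picture the generalization described in this last abstract: rather than keeping the full temporal order of a training sequence, retain only the orderings needed to supply each operator's preconditions. The sketch below is a simplified, hypothetical illustration under that assumption (operators given as explicit precondition/add/delete sets); it is not the EGGS generalization algorithm, and threat resolution between deleting steps and causal links is omitted.

        # Simplified sketch: extract a partial order from a totally ordered
        # macro-operator by keeping an ordering i < j only when step i supplies
        # a precondition of step j (a causal link).
        def partial_order(steps):
            """Return ordering constraints (i, j) meaning step i must precede step j."""
            constraints = set()
            for j, consumer in enumerate(steps):
                for p in consumer["pre"]:
                    for i in range(j - 1, -1, -1):     # most recent earlier producer of p
                        if p in steps[i]["add"]:
                            constraints.add((i, j))
                            break
            return constraints

        # Example: steps 1 and 2 both depend on step 0 but not on each other,
        # so they remain unordered in the learned macro-operator.
        steps = [
            {"pre": set(),      "add": {"a"},    "del": set()},
            {"pre": {"a"},      "add": {"b"},    "del": set()},
            {"pre": {"a"},      "add": {"c"},    "del": set()},
            {"pre": {"b", "c"}, "add": {"goal"}, "del": set()},
        ]
        print(sorted(partial_order(steps)))    # [(0, 1), (0, 2), (1, 3), (2, 3)] -- no (1, 2)
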