Leveraging Commonsense Reasoning and Multimodal Perception for Robot Spoken Dialog Systems.
Dongcai Lu, Shiqi Zhang, Peter Stone, and Xiaoping Chen.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017.
Probabilistic graphical models, such as partially observable Markov decision processes (POMDPs), have been used in stochastic spoken dialog systems to handle the inherent uncertainty in speech recognition and language understanding. Such dialog systems are limited in that only a relatively small number of domain variables can be included in the model while still ensuring the generation of good-quality dialog policies. At the same time, the non-language perception modalities on robots, such as vision-based facial expression recognition and Lidar-based distance detection, can hardly be integrated into this process. In this paper, we use a probabilistic commonsense reasoner to “guide” our POMDP-based dialog manager, and present a principled multimodal dialog management (MDM) framework that allows the robot’s dialog belief state to be seamlessly updated both by observations of human spoken language and by exogenous events such as changes in human facial expressions. The MDM approach has been implemented and evaluated both in simulation and on a real mobile robot using guidance tasks.
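To make the belief-update idea concrete, below is a minimal sketch of a Bayesian dialog-state update driven by two observation modalities, in the spirit of the MDM framework described in the abstract. Everything in it (the toy state space, the likelihood tables, the function names) is a hypothetical illustration rather than the paper's implementation, and the sketch omits the probabilistic commonsense reasoner that the paper uses to guide the dialog manager.

    import numpy as np

    # Toy dialog states: which item the human is requesting.
    STATES = ["coffee", "tea", "water"]

    # Hypothetical likelihood tables: for each possible observation,
    # the likelihood P(o | s) of that observation under each state.
    LANG_OBS = {
        "heard_coffee": np.array([0.70, 0.20, 0.10]),  # noisy speech result
        "heard_tea":    np.array([0.15, 0.70, 0.15]),
    }
    FACE_OBS = {
        "smile": np.array([0.50, 0.30, 0.20]),  # human appears satisfied
        "frown": np.array([0.10, 0.45, 0.45]),  # human appears dissatisfied
    }

    def bayes_update(belief, likelihood):
        # Standard Bayesian belief update: b'(s) is proportional to P(o | s) * b(s).
        posterior = likelihood * belief
        return posterior / posterior.sum()

    belief = np.full(len(STATES), 1.0 / len(STATES))         # uniform prior
    belief = bayes_update(belief, LANG_OBS["heard_coffee"])  # spoken-language observation
    belief = bayes_update(belief, FACE_OBS["frown"])         # exogenous facial-expression event
    print(dict(zip(STATES, belief.round(3))))

The point of the sketch is that both modalities feed the same belief vector through the same update rule, so an exogenous perception event (the frown) can shift probability mass away from the state the speech channel favored, which is the seamless multimodal updating the abstract describes.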
@InProceedings{IROS17-Lu,
author = {Dongcai Lu and Shiqi Zhang and Peter Stone and Xiaoping Chen},
title = {Leveraging Commonsense Reasoning and Multimodal Perception for Robot
Spoken Dialog Systems},
booktitle = {Proceedings of the IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS)},
location = {Vancouver, Canada},
month = {September},
year = {2017},
abstract = {
Probabilistic graphical models, such as partially observable Markov decision
processes (POMDPs), have been used in stochastic spoken dialog systems to handle
the inherent uncertainty in speech recognition and language understanding. Such
dialog systems suffer from the fact that only a relatively small number of
domain variables are allowed in the model, so as to ensure the generation of
good-quality dialog policies. At the same time, the non-language perception
modalities on robots, such as vision-based facial expression recognition and
Lidar-based distance detection, can hardly be integrated into this process. In
this paper, we use a probabilistic commonsense reasoner to ``guide'' our
POMDP-based dialog manager, and present a principled, multimodal dialog
management (MDM) framework that allows the robot's dialog belief state to be
seamlessly updated by both observations of human spoken language, and exogenous
events such as the change of human facial expressions. The MDM approach has been
implemented and evaluated both in simulation and on a real mobile robot using
guidance tasks.
},
}