This book discusses the Partially Observable Markov Decision Process (POMDP) framework as applied to dialogue systems. It presents the POMDP as a formal framework that represents uncertainty explicitly while supporting automated policy solving. The authors propose and implement an end-to-end approach for learning the components of a dialogue POMDP model. Starting from scratch, they show how to learn the state, the transition model, the observation model and finally the reward model from unannotated and noisy dialogues. Together, these form a significant set of contributions that can potentially inspire substantial further work. This concise manuscript is written in simple language and is full of illustrative examples, figures, and tables.
1 Introduction.- 2 A few words on topic modeling.- 3 Sequential decision making in spoken dialog management.- 4 Learning the dialog POMDP model components.- 5 Learning the reward function.- 6 Application on healthcare dialog management.- 7 Conclusions and future work.
Hamidreza Chinaei is a postdoctoral fellow in the Department of Computer Science at the University of Toronto, under the supervision of Dr. Frank Rudzicz, through an NSERC Engage Fund with IBM Canada. Dr. Chinaei received his PhD in Computer Science from Laval University in 2013, on the application of machine learning to speech and natural language processing tasks, and his MMath in Computer Science from the University of Waterloo, on semantic query optimization. He received the Industrial Track Student Scholarship and Award from the 2012 Canadian AI Conference and the Best Student Paper Award from the International Conference on Agents and Artificial Intelligence in 2009.
Brahim Chaib-draa received a Diploma in Computer Engineering from the École Supérieure d'Électricité (SUPELEC) de Paris, Paris, France, in 1978 and a Ph.D. degree in Computer Science from the Université du Hainaut-Cambrésis, Valenciennes, France, in 1990. In 1990, he joined the Department