Learning user modelling strategies for adaptive referring expression generation in spoken dialogue systems
Date
30/06/2011
Author
Janarthanam, Srinivasan Chandrasekaran
Abstract
We address the problem of dynamic user modelling for referring expression generation in spoken dialogue systems, i.e. how a spoken dialogue system should choose
referring expressions to refer to domain entities to users with different levels of domain
expertise, whose domain knowledge is initially unknown to the system. We approach
this problem using a statistical planning framework: Reinforcement Learning techniques in Markov Decision Processes (MDP).
We present a new reinforcement learning framework to learn user modelling strategies for adaptive referring expression generation (REG) in resource-scarce domains
(i.e. where no large corpus exists for learning). As a part of the framework, we present
novel user simulation models that are sensitive to the referring expressions used by
the system and are able to simulate users with different levels of domain knowledge.
Such models are shown to simulate real user behaviour more closely than baseline user
simulation models.
In contrast to previous approaches to user-adaptive systems, we do not assume that
the user’s domain knowledge is available to the system before the conversation starts.
We show that, using our framework, it is possible to learn an adaptive user modelling
policy in resource-scarce domains from a small corpus of non-adaptive dialogues. We
also show that the learned user modelling strategies performed better in terms of adaptation than hand-coded baseline policies on both simulated and real users. With real
users, the learned policy produced an increase of around 20% in adaptation in comparison
to the best performing hand-coded adaptive baseline. We also show that adaptation to
the user’s domain knowledge improves task success (99.47% for the learned policy vs 84.7% for the hand-coded baseline) and reduces dialogue time
(11% relative difference). This is because users found it easier to identify domain
objects when the system used adaptive referring expressions during the conversations.
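
To make the general idea concrete, the following is a minimal sketch of learning a referring expression policy with tabular Q-learning against a simulated user. It is not the framework or the user simulation models presented in the thesis. Everything in it is an assumption made for illustration: the two user types and their knowledge probabilities (P_KNOWS), the two expression types (JARGON and DESCRIPTIVE), the reward of +1 or -1 for matching or mismatching the user's knowledge, the dialogue length (TURNS), and all function names. The system starts with no information about the user, and its state is only the number of jargon expressions the user has understood or asked to have clarified so far.

```python
import random
from collections import defaultdict

# Hypothetical action set: describe the entity in plain words, or use its technical name.
DESCRIPTIVE, JARGON = 0, 1
TURNS = 6  # assumed number of referring expressions per dialogue
# Assumed probability that a simulated user of each type knows a given jargon term.
P_KNOWS = {"expert": 0.9, "novice": 0.1}


def simulate_user_knows(user_type, rng):
    """Toy user simulation: does this user know the jargon term for the next entity?"""
    return rng.random() < P_KNOWS[user_type]


def greedy_action(q, state):
    """Return the action with the highest learned value in this state."""
    return max((DESCRIPTIVE, JARGON), key=lambda a: q[state][a])


def train(episodes=20000, alpha=0.05, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning over states (jargon understood, jargon clarified)."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(episodes):
        # The user type is hidden from the system; it only shapes the simulated responses.
        user_type = rng.choice(["expert", "novice"])
        state = (0, 0)
        for turn in range(TURNS):
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                action = rng.choice([DESCRIPTIVE, JARGON])
            else:
                action = greedy_action(q, state)
            knows = simulate_user_knows(user_type, rng)
            # Assumed adaptation reward: +1 when the expression type matches
            # the user's knowledge of the term, -1 otherwise.
            reward = 1.0 if (action == JARGON) == knows else -1.0
            # Only jargon reveals evidence: the user either understands it
            # or asks for clarification. Descriptive expressions leave the state unchanged.
            next_state = state
            if action == JARGON:
                understood, clarified = state
                if knows:
                    next_state = (understood + 1, clarified)
                else:
                    next_state = (understood, clarified + 1)
            done = turn == TURNS - 1
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q
```

Calling train() returns the learned value table, and greedy_action(q, (1, 0)) or greedy_action(q, (0, 1)) can then be used to inspect what the policy prefers after one understood jargon term or after one clarification request. The toy reward here only measures how well expression choices match the simulated user's knowledge; task success and dialogue time, which the thesis evaluates with real users, are not modelled.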