|dc.description.abstract||In complex teaching scenarios it can be difficult for teachers to exhaustively express all information a learner requires to master a task. However, the teacher, who will have internalised the task's objectives, will be able to identify good and bad actions in specific scenarios and would be able to formulate advice upon observing those scenarios. This thesis focuses on the design, implementation and evaluation of models that enable experts to teach agents through such situated feedback in an Interactive Task Learning (ITL) setting.
There is a class of highly natural speech acts which has so far gone largely unexplored in the domain of ITL: corrections in which a teacher articulates the mistake that the learning agent has just made. The aim of this thesis is to show that the evidence provided by such speech acts can be exploited in an ITL setting to learn a task in a data-efficient manner. Further, we aim to show that this is made possible by capturing within the learning agent's models the constraints that are imposed by dialogue coherence. A dialogue is coherent if the current utterance relates to a salient part of its dialogue context with a specific coherence relation, such as explanation, contrast, correction, or elaboration. Our model exploits the semantics of these relations to restrict the set of possible interpretations of the teacher's utterance and of how the utterance relates to the objects involved in the action the teacher is giving feedback on.
We test our hypothesis on a tower-building task where the set of allowed towers is constrained by rules. The agent starts out ignorant of these rules and, perhaps more fundamentally, is also unaware of the domain-level concepts used to define the rules and of the natural language (NL) terms that denote those concepts. We develop an agent which utilises the coherence of the extended dialogue to interpret and disambiguate the teacher's feedback, and utilises this (estimated) interpretation to refine its model of the domain, the mapping from NL descriptions to their denotations, given their observable visual features, and the planning problem being addressed. We extend this model to deal with utterances containing anaphora and to deal with an imperfect teacher: that is, one who occasionally fails to provide the appropriate correction in a timely way, and/or who is confident, but wrong, about the learner's ability to identify from the teacher's utterance the salient part of the context that it is intended to correct. Finally, we use these ideas to learn the manner in which actions should be performed.||en