Learning to tell tales: automatic story generation from corpora
McIntyre, Neil Duncan
Automatic story generation has a long-standing tradition in the field of Artificial Intelligence. The ability to create stories on demand holds great potential for entertainment and education. For example, modern computer games are becoming more immersive, containing multiple story lines and hundreds of characters, which has substantially increased the amount of work required to produce each game. Allowing a game to write its own story line could keep it engaging for the player whilst shifting the burden of writing away from the game’s developers. In education, intelligent tutoring systems could provide students with instant feedback and suggestions on how to write their own stories. Although several approaches have been introduced in the past (e.g., story grammars, story schemas and autonomous agents), they all rely heavily on handwritten resources, which places severe limitations on their scalability and usage. In this thesis we motivate a new approach to story generation which takes its inspiration from recent research in Natural Language Generation. The result is an interactive data-driven system for the generation of children’s stories. One of the key features of this system is that it is end-to-end, realising the various components of the generation pipeline stochastically. Knowledge relating to the generation and planning of stories is acquired automatically from corpora and reformulated into new stories to be presented to the user. We also show that story generation can be viewed as a search task, operating over the large number of stories that can be generated from knowledge inherent in a corpus. Using trainable scoring functions, our system can search the story space using different document-level criteria. In this thesis we focus on two of these, namely coherence and interest. We also present two major paradigms for generation through search: (a) generate and rank, and (b) genetic algorithms.
We show the effects on perceived story interest, fluency and coherence that result from these approaches. In addition, we show how plots induced from the corpus can be used explicitly to guide the generation process, providing a heuristically motivated starting point for story search. We motivate extensions to the system and show that additional modules can be used to improve the quality of the generated stories and the overall scalability of the system. Finally, we highlight the current strengths and limitations of our approach and discuss possible future directions for this field of research.