A common feature of news reports is the reference to events other than the one which is
central to the discourse. Previous research has suggested Gricean explanations for this;
more generally, the phenomenon has been referred to simply as "journalistic style".
Whatever the underlying reasons, recent investigations into information extraction
have emphasised the need for a better understanding of the mechanisms that can be
used to recognise and distinguish between multiple events in discourse.
Existing information extraction systems approach the problem of event recognition
in a number of ways. However, although frameworks and techniques for black-box
evaluations of information extraction systems have been developed in recent years,
almost no attention has been given to the evaluation of techniques for event recognition,
despite general acknowledgment of the inadequacies of current implementations. Not
only is it unclear which mechanisms are useful, but there is also little consensus as to
how such mechanisms could be compared.
This thesis presents a formalism for representing event structure, and introduces an
evaluation metric through which a range of event recognition mechanisms are quantitatively
compared. These mechanisms are implemented as modules within the CONTESS
event recognition system, and explore the use of linguistic phenomena such as temporal
phrases, locative phrases and cue phrases, as well as various discourse structuring heuristics.
Our results show that, whilst temporal and cue phrases are consistently useful in
event recognition, locative phrases are better ignored. A number of further linguistic
phenomena and heuristics are examined, providing an insight into their value for event