Incremental semi-supervised learning for anomalous trajectory detection
Sillito, Rowland R.
The acquisition of a scene-specific normal behaviour model underlies many existing approaches to the problem of automated video surveillance. Since it is unrealistic to acquire a comprehensive set of labelled behaviours for every surveyed scenario, modelling normal behaviour typically corresponds to modelling the distribution of a large collection of unlabelled examples. In general, however, it would be desirable to be able to filter an unlabelled dataset to remove potentially anomalous examples. This thesis proposes a simple semi-supervised learning framework that could allow a human operator to efficiently filter the examples used to construct a normal behaviour model by providing occasional feedback: Specifically, the classification output of the model under construction is used to filter the incoming sequence of unlabelled examples so that human approval is requested before incorporating any example classified as anomalous, while all other examples are automatically used for training. A key component of the proposed framework is an incremental one-class learning algorithm which can be trained on a sequence of normal examples while allowing new examples to be classified at any stage during training. The proposed algorithm represents an initial set of training examples with a kernel density estimate, before using merging operations to incrementally construct a Gaussian mixture model while minimising an information-theoretic cost function. This algorithm is shown to outperform an existing state-of-the-art approach without requiring off-line model selection. Throughout this thesis behaviours are considered in terms of whole motion trajectories: in order to apply the proposed algorithm, trajectories must be encoded with fixed length vectors. To determine an appropriate encoding strategy, an empirical comparison is conducted to determine the relative class-separability afforded by several different trajectory representations for a range of datasets. The results obtained suggest that the choice of representation makes a small but consistent difference to class separability, indicating that cubic B-Spline control points (fitted using least-squares regression) provide a good choice for use in subsequent experiments. The proposed semi-supervised learning framework is tested on three different real trajectory datasets. In all cases the rate of human intervention requests drops steadily, reaching a usefully low level of 1% in one case. A further experiment indicates that once a sufficient number of interventions has been provided, a high level of classification performance can be achieved even if subsequent requests are ignored. The automatic incorporation of unlabelled data is shown to improve classification performance in all cases, while a high level of classification performance is maintained even when unlabelled data containing a high proportion of anomalous examples is presented.