Recognising activities by jointly modelling actions and their effects
Date
26/11/2015
Author
Vafeias, Efstathios
Abstract
With the rapid increase in adoption of consumer technologies, including inexpensive
but powerful hardware, robotics appears poised at the cusp of widespread deployment
in human environments. A key barrier that remains is machine understanding
and interpretation of human activity through a perceptual medium such as
computer vision or RGB-D sensing, for example with the Microsoft Kinect sensor.
This thesis contributes novel video-based methods for activity recognition. Specifically,
the focus is on activities that involve interactions between the human user and
objects in the environment. Based on streams of poses and object tracking, machine
learning models are provided to recognize a variety of these interactions. The
thesis's main contributions are (1) a new model for interactions that explicitly learns the
human-object relationships through a latent distributed representation, (2) a practical
framework for labeling chains of manipulation actions in temporally extended activities
and (3) an unsupervised sequence segmentation technique that relies on slow feature
analysis and spectral clustering.
These techniques are validated by experiments on publicly available data sets,
such as the Cornell CAD-120 activity corpus, one of the most extensive publicly
available data sets of its kind that is also annotated with ground-truth information.
Our experiments demonstrate the advantages of the proposed methods over
state-of-the-art alternatives from the recent literature on sequence classifiers.