Representation and recognition of human actions in video
Abstract
Automated human action recognition plays a critical role in the development of human-machine
communication, aiming for a more natural interaction between artificial intelligence and human
society. Recent developments in technology have permitted a shift from traditional human action
recognition, performed in well-constrained laboratory environments, to realistic unconstrained
scenarios. This advancement has given rise to new problems and challenges that available methods
do not yet address. Thus, the aim of this thesis is to study innovative approaches that address
the challenging problems of human action recognition from video captured in unconstrained
scenarios. To this end, novel action representations, feature selection methods, fusion strategies
and classification approaches are formulated.
More specifically, a novel interest-point-based action representation is first introduced, which
describes actions as clouds of interest points accumulated at different temporal scales. The idea
behind this method is to extract holistic features from the point clouds, describing the spatial and
temporal dynamics of the action explicitly and globally. Since the proposed clouds-of-points
representation exploits information that is alternative and complementary to conventional
interest-point-based methods, a more solid representation is then obtained by fusing the two
representations through a Multiple Kernel Learning strategy. The validity of the proposed
approach in recognising actions on a well-known benchmark dataset is demonstrated, as is the
superior performance achieved by fusing the two representations.
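
To illustrate the fusion step, the following is a minimal sketch of a fixed-weight kernel combination, a simplification of full Multiple Kernel Learning, which would learn the weight instead of fixing it. The histogram features, the chi-square kernel choice and the weight value are illustrative assumptions, not the thesis implementation.

    # Sketch: fusing two action representations via a weighted kernel sum.
    # A simplification of Multiple Kernel Learning (MKL), which would
    # learn the weight beta rather than fixing it.
    import numpy as np
    from sklearn.svm import SVC

    def chi2_kernel(A, B, gamma=0.05):
        """Exponential chi-square kernel, common for bag-of-features histograms."""
        d = np.array([[np.sum((a - b) ** 2 / (a + b + 1e-10)) for b in B] for a in A])
        return np.exp(-gamma * d)

    # X_cloud / X_ip: illustrative histograms from the clouds-of-points and
    # interest-point representations (rows = videos); y: action labels.
    rng = np.random.default_rng(0)
    X_cloud, X_ip = rng.random((40, 64)), rng.random((40, 128))
    y = rng.integers(0, 4, size=40)

    beta = 0.5                                 # fixed fusion weight (MKL would learn it)
    K = beta * chi2_kernel(X_cloud, X_cloud) + (1 - beta) * chi2_kernel(X_ip, X_ip)

    clf = SVC(kernel="precomputed").fit(K, y)  # SVM trained on the fused kernel
    print("training accuracy:", clf.score(K, y))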
Since the proposed method is limited by dynamic backgrounds and fast camera movements, a
novel trajectory-based representation is then formulated. Unlike interest points, trajectories can
simultaneously retain motion and appearance information even in noisy and crowded scenarios.
Additionally, they can cope with drastic camera movements and support robust region-of-interest
estimation. An equally important contribution is the proposed collaborative feature selection,
performed to remove redundant and noisy components. In particular, a novel feature selection
method based on Multi-Class Delta Latent Dirichlet Allocation (MC-DLDA) is introduced.
Crucially, to enrich the final action representation, the trajectory representation is adaptively
fused with a conventional interest-point representation. The proposed approach is extensively
validated on different datasets, and the reported performances are comparable with the best of
the state of the art. The obtained results also confirm the fundamental contribution of both the
collaborative feature selection and the adaptive fusion.
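
As a rough illustration of feature selection on a bag-of-trajectories vocabulary, the sketch below uses mutual information as a simple stand-in criterion, since the MC-DLDA formulation itself is given in the thesis body and is not reproduced here. The codebook size, number of retained codewords and data are illustrative assumptions.

    # Sketch: selecting informative codewords from a trajectory codebook.
    # Mutual information is a stand-in for the thesis's MC-DLDA criterion.
    import numpy as np
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    rng = np.random.default_rng(1)
    X = rng.random((60, 500))        # illustrative codebook histograms (videos x codewords)
    y = rng.integers(0, 6, size=60)  # action labels

    # Keep the 100 codewords most informative about the action classes,
    # discarding redundant / noisy components of the vocabulary.
    selector = SelectKBest(mutual_info_classif, k=100).fit(X, y)
    X_sel = selector.transform(X)
    print("first selected codewords:", np.flatnonzero(selector.get_support())[:10])
    print("reduced shape:", X_sel.shape)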
Finally, the problem of realistic human action classification in very ambiguous scenarios is
addressed. In these circumstances, standard feature selection methods and multi-class classifiers
prove inadequate because of sparse training sets, high intra-class variation and inter-class
similarity. Thus, both the feature selection and the classification problems need to be redesigned.
The proposed idea is to iteratively decompose the classification task into subtasks and to select
the optimal feature set and classifier according to the context of each subtask. To this end, a
cascaded feature selection and action classification approach is introduced. The proposed cascade
aims to classify actions by exploiting as much information as possible while simplifying the
multi-class classification into a cascade of binary separations. Specifically, instead of separating
multiple action classes simultaneously, the overall task is automatically divided into easier binary
subtasks. Experiments have been carried out on challenging public datasets; the obtained results
demonstrate that, with an identical action representation, the cascaded classifier significantly
outperforms standard multi-class classifiers.
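
The sketch below shows one way such a cascade of binary separations can be built: each stage peels off the class that is easiest to separate from the rest, and a test sample walks through the stages until a stage accepts it. The stage-ordering heuristic and the linear SVM stages are illustrative choices; the per-stage feature selection of the thesis is omitted for brevity.

    # Sketch: replacing one multi-class classifier with a cascade of
    # binary one-vs-rest stages, ordered by separability.
    import numpy as np
    from sklearn.svm import LinearSVC

    def train_cascade(X, y):
        """Peel off one class per stage with a binary one-vs-rest SVM."""
        stages, remaining = [], list(np.unique(y))
        while len(remaining) > 1:
            mask = np.isin(y, remaining)
            Xr, yr = X[mask], y[mask]
            # Pick the class whose one-vs-rest separation is easiest in this context.
            best = max(remaining, key=lambda c: LinearSVC(dual=False)
                       .fit(Xr, yr == c).score(Xr, yr == c))
            stages.append((best, LinearSVC(dual=False).fit(Xr, yr == best)))
            remaining.remove(best)
        return stages, remaining[0]          # last class is the fallback

    def predict_cascade(stages, fallback, x):
        for label, clf in stages:            # walk the stages until one accepts
            if clf.predict(x.reshape(1, -1))[0]:
                return label
        return fallback

    rng = np.random.default_rng(2)
    X, y = rng.random((120, 50)), rng.integers(0, 4, size=120)
    stages, fallback = train_cascade(X, y)
    print("prediction:", predict_cascade(stages, fallback, X[0]))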
Authors
Bregonzio, Matteo