Weakly Supervised Learning of Objects and Attributes.
Abstract
This thesis presents weakly supervised learning approaches to directly
exploit image-level tags (e.g. objects, attributes) for comprehensive
image understanding, including tasks such as object localisation, image
description, image retrieval, semantic segmentation, person re-identification
and person search. Unlike conventional approaches, which tackle the
weakly supervised problem by learning a discriminative model, a generative
Bayesian framework is proposed which provides better mechanisms
to resolve the ambiguity problem. The proposed model differs significantly
from existing approaches in that: (1) All foreground object
classes are modelled jointly in a single generative model that encodes the
co-existence of multiple objects so that “explaining away” inference can resolve
ambiguity and lead to better learning. (2) Image backgrounds are shared
across classes to better learn varying surroundings and “push out” objects
of interest. (3) The Bayesian formulation enables the exploitation of various
types of prior knowledge to compensate for the limited supervision
offered by weakly labelled data, as well as Bayesian domain adaptation
for transfer learning.
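
The “explaining away” effect mentioned in point (1) can be illustrated with a toy example. The sketch below is not the model proposed in the thesis: it assumes a hypothetical two-class noisy-OR network in which either of two object classes ("cat", "dog") can produce the same ambiguous visual evidence, and all probabilities are made-up values chosen only for illustration. It shows that once one class is known to be present, the evidence lends less support to the other.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the thesis).
PRIOR = {"cat": 0.3, "dog": 0.3}      # prior probability each class is present
STRENGTH = {"cat": 0.8, "dog": 0.8}   # chance a present class produces the evidence
LEAK = 0.05                           # chance the evidence appears with no class present


def p_evidence(state):
    """Noisy-OR likelihood of observing the evidence given which classes are present."""
    p_absent = 1.0 - LEAK
    for name, present in state.items():
        if present:
            p_absent *= 1.0 - STRENGTH[name]
    return 1.0 - p_absent


def posterior(query, given=None):
    """P(query present | evidence observed, plus optional known class states)."""
    given = given or {}
    names = list(PRIOR)
    numerator = denominator = 0.0
    # Exact inference by enumerating every joint assignment of the classes.
    for values in product([0, 1], repeat=len(names)):
        state = dict(zip(names, values))
        if any(state[k] != v for k, v in given.items()):
            continue
        weight = p_evidence(state)
        for name in names:
            weight *= PRIOR[name] if state[name] else 1.0 - PRIOR[name]
        denominator += weight
        if state[query]:
            numerator += weight
    return numerator / denominator


p_cat_alone = posterior("cat")                   # evidence only
p_cat_given_dog = posterior("cat", {"dog": 1})   # evidence, and "dog" known present
print(f"P(cat | evidence)      = {p_cat_alone:.3f}")
print(f"P(cat | evidence, dog) = {p_cat_given_dog:.3f}")
```

The second probability is lower than the first: the known "dog" already accounts for the evidence, so it no longer counts as strongly towards "cat". Modelling all classes jointly is what makes this kind of reasoning available.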
Detecting objects is the first and a critical component of the image understanding
paradigm. Unlike conventional fully supervised object detection
approaches, the proposed model aims to train an object detector
from weakly labelled data. A novel framework based on a Bayesian latent
topic model is proposed to address the problem of localising objects
as bounding boxes in images and videos given only image-level object labels.
The inferred object locations can then be used as annotations to train a
classic object detector with conventional approaches.
However, objects cannot tell the whole story in an image. Beyond detecting
objects, a general visual model should be able to describe objects
and segment them at a pixel level. Another limitation of the initial model is
that it still requires an additional object detector. To remedy the above two
drawbacks, a novel weakly supervised non-parametric Bayesian model is
presented to model objects, attributes and their associations automatically
from weakly labelled images. Once learned, the proposed
model can describe a new image with a combination of objects and
attributes, as well as their locations and segmentation.
Finally, this thesis further tackles the weakly supervised learning problem
from a transfer learning perspective, by considering the fact that there
are always some fully labelled or weakly labelled data available in a related
domain, while too little labelled data exists for training in the
target domain. A powerful semantic description is transferred from existing
fashion photography datasets to surveillance data to solve the person
re-identification problem.
Authors
SHI, ZHIYUAN
Collections
- Theses [3711]