Weakly Supervised Learning of Objects and Attributes.
MetadataShow full item record
This thesis presents weakly supervised learning approaches to directly exploit image-level tags (e.g. objects, attributes) for comprehensive image understanding, including tasks such as object localisation, image description, image retrieval, semantic segmentation, person re-identification and person search, etc. Unlike the conventional approaches which tackle weakly supervised problem by learning a discriminative model, a generative Bayesian framework is proposed which provides better mechanisms to resolve the ambiguity problem. The proposed model significantly differentiates from the existing approaches in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes multiple objects co-existence so that “explaining away” inference can resolve ambiguity and lead to better learning. (2) Image backgrounds are shared across classes to better learn varying surroundings and “push out” objects of interest. (3) the Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Detecting objects is the first and critical component in image understanding paradigm. Unlike conventional fully supervised object detection approaches, the proposed model aims to train an object detector from weakly labelled data. A novel framework based on Bayesian latent topic model is proposed to address the problem of localisation of objects as bounding boxes in images and videos with image level object labels. The inferred object location can be then used as the annotation to train a classic object detector with conventional approaches. However, objects cannot tell the whole story in an image. Beyond detecting objects, a general visual model should be able to describe objects and segment them at a pixel level. Another limitation of the initial model is that it still requires an additional object detector. To remedy the above two drawbacks, a novel weakly supervised non-parametric Bayesian model is presented to model objects, attributes and their associations automatically from weakly labelled images. Once learned, given a new image, the proposed model can describe the image with the combination of objects and attributes, as well as their locations and segmentation. Finally, this thesis further tackles the weakly supervised learning problem from a transfer learning perspective, by considering the fact that there are always some fully labelled or weakly labelled data available in a related domain while only insufficient labelled data exist for training in the target domain. A powerful semantic description is transferred from the existing fashion photography datasets to surveillance data to solve the person re-identification problem.
- Theses