dc.description.abstract | As touch devices have rapidly proliferated, sketch has gained much popularity as an
alternative input to text descriptions and speech. This is because sketch has the
advantage of being both informative and convenient, which has stimulated sketch-related
research in areas such as sketch recognition, sketch segmentation, sketch-based
image retrieval, and photo-to-sketch synthesis. Although these fields have been well studied,
existing sketch methods still suffer from difficulty in aligning the sketch and photo domains,
resulting in unsatisfactory quality for both fine-grained retrieval and synthesis between the
sketch and photo modalities. To address these problems, in this thesis we propose a series
of novel methods for free-hand sketch-related tasks and offer insights to guide future
research.
Sketch conveys fine-grained information, making fine-grained sketch-based image retrieval
(SBIR) one of the most important topics in sketch research. The basic solution for this task
is to learn to exploit the informativeness of sketches and link it to other modalities.
Beyond informativeness, semantic information is also important for
understanding the sketch modality and linking it with other related modalities. In this thesis,
we show that semantic information can effectively bridge the domain gap between the sketch
and photo modalities. Based on this observation, we propose an attribute-aware
deep framework that exploits attribute information to aid fine-grained SBIR. Text
descriptions are considered another semantic alternative to attributes, with the added
advantage of being more flexible and natural; we exploit them in our proposed
deep multi-task framework. The experimental study shows that semantic
attribute information can improve fine-grained SBIR performance by a large margin.
Sketch also has unique characteristics, such as its temporal information. The sketch synthesis
task requires understanding both the semantic meaning behind sketches and the sketching
process. The semantic meaning of sketches has been well explored through the
sketch recognition and sketch retrieval challenges. However, the sketching process has
largely been ignored, even though it is also very important for
understanding the sketch modality, especially given the unique temporal characteristics
of sketches. In this thesis, we propose the first deep photo-to-sketch synthesis
framework, which achieves good performance on the sketch synthesis task, as shown in
the experiments.
Generalisability is an important criterion for judging whether existing methods can
be applied in real-world scenarios, especially considering the difficulty and cost
of collecting sketches and pairwise annotations. We thus propose a generalised
fine-grained SBIR framework. Specifically, we follow a meta-learning strategy and train
a hyper-network to generate instance-level classification weights for the subsequent matching
network. The effectiveness of the proposed method is validated by extensive
experimental results. | en_US |