    Machine Learning Architectures for Video Annotation and Retrieval

    View/Open: MARKATOPOULOU_Foteini_Final_PhD_060918.pdf (6.449 MB)
    Publisher
    Queen Mary University of London
    Abstract
    In this thesis we design machine learning methodologies for video annotation and retrieval using either predefined semantic concepts or ad-hoc queries. Concept-based video annotation refers to the annotation of video fragments with one or more semantic concepts (e.g. hand, sky, running) chosen from a predefined concept list. Ad-hoc queries are textual descriptions that may contain objects, activities, locations, etc., and combinations of these. Our contributions are:
    • A thorough analysis of extending and using different local descriptors for improved concept-based video annotation, together with a stacking architecture whose first layer consists of concept classifiers trained on local descriptors and whose last layer improves their prediction accuracy by implicitly capturing concept relations (a minimal sketch of this idea follows the abstract).
    • A cascade architecture that orders and combines many classifiers, trained on different visual descriptors, for the same concept.
    • A deep learning architecture that exploits concept relations at two levels. At the first level, building on ideas from multi-task learning, we propose an approach that learns concept-specific representations as sparse linear combinations of representations of latent concepts. At the second level, building on ideas from structured output learning, we introduce at training time a new cost term that explicitly models the correlations between concepts, thereby explicitly modelling the structure of the output space (i.e. the concept labels).
    • A fully automatic ad-hoc video search architecture that combines concept-based video annotation with textual query analysis, and transforms concept-based keyframe and query representations into a common semantic embedding space.
    Our architectures have been extensively evaluated on the TRECVID SIN 2013, TRECVID AVS 2016 and other large-scale datasets, demonstrating their effectiveness compared with similar approaches.
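    The stacking architecture from the first contribution can be illustrated with a short sketch. The code below is a rough, simplified illustration under stated assumptions, not the thesis implementation: it assumes precomputed visual descriptors per keyframe, uses scikit-learn logistic regression for both layers, and all variable names and toy dimensions are made up for the example. The first layer scores each concept independently from the descriptors; the second layer re-scores each concept from the full vector of first-layer scores, which is where correlations between concepts (e.g. "sky" with "outdoor") can be picked up.

```python
# Minimal two-layer stacking sketch for concept-based video annotation
# (illustrative only; descriptors, labels, and model choice are assumptions).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_keyframes, n_dims, n_concepts = 500, 128, 10           # toy sizes (assumption)
X = rng.normal(size=(n_keyframes, n_dims))               # visual descriptors
Y = rng.integers(0, 2, size=(n_keyframes, n_concepts))   # binary concept labels

# First layer: one independent classifier per concept, trained on the
# visual descriptors. Out-of-fold predictions keep the second layer from
# seeing scores that leak the training labels.
first_layer = [LogisticRegression(max_iter=1000) for _ in range(n_concepts)]
meta_features = np.column_stack([
    cross_val_predict(clf, X, Y[:, k], cv=5, method="predict_proba")[:, 1]
    for k, clf in enumerate(first_layer)
])

# Second layer: each concept is re-scored from the full vector of
# first-layer concept scores, implicitly capturing concept relations.
second_layer = [
    LogisticRegression(max_iter=1000).fit(meta_features, Y[:, k])
    for k in range(n_concepts)
]

# Refit the first layer on all data for use at test time.
for k, clf in enumerate(first_layer):
    clf.fit(X, Y[:, k])

def annotate(x_new):
    """Return refined concept scores for new keyframe descriptors."""
    scores = np.column_stack([clf.predict_proba(x_new)[:, 1]
                              for clf in first_layer])
    return np.column_stack([clf.predict_proba(scores)[:, 1]
                            for clf in second_layer])
```

    Out-of-fold predictions are used to build the second-layer training features so that no first-layer classifier is scored on keyframes it was trained on; fitting and scoring on the same data would make the meta-features over-optimistic.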
    Authors
    Markatopoulou, Foteini
    URI
    http://qmro.qmul.ac.uk/xmlui/handle/123456789/44693
    Collections
    • Theses [3651]
    Licence information
    The copyright of this thesis rests with the author, and no quotation from it or information derived from it may be published without the prior written consent of the author.
