
dc.contributor.author: Martinez-Alvarez, Miguel
dc.date.accessioned: 2017-10-09T13:21:10Z
dc.date.available: 2017-10-09T13:21:10Z
dc.date.issued: 2014-07-09
dc.date.submitted: 2017-10-09T11:20:45.768Z
dc.identifier.citation: Martinez-Alvarez, M. 2014. Knowledge-Enhanced Text Classification: Descriptive Modelling and New Approaches. Queen Mary University of London [en_US]
dc.identifier.uri: http://qmro.qmul.ac.uk/xmlui/handle/123456789/27205
dc.description: PhD [en_US]
dc.description.abstract: The knowledge available to text classification and information retrieval systems has changed significantly in recent years, both in nature and in quantity. Several sources of information can now potentially improve the classification process, and systems should be able to adapt to incorporate multiple sources of available data in different formats. This is especially important in environments where the required information changes rapidly and its utility may be contingent on timely implementation. For these reasons, the importance of adaptability and flexibility in information systems is growing rapidly. Current systems are usually developed for specific scenarios. As a result, significant engineering effort is needed to adapt them when new knowledge appears or the information needs change. This research investigates the use of knowledge within text classification from two perspectives. On the one hand, it applies descriptive approaches to the seamless modelling of text classification, focusing on knowledge integration and complex data representation. The main goal is to achieve a scalable and efficient approach for rapid prototyping of text classification that can incorporate different sources and types of knowledge, and to minimise the gap between the mathematical definition and the modelling of a solution. On the other hand, it improves different steps of the classification process where knowledge exploitation has traditionally not been applied. In particular, this thesis introduces two classification sub-tasks, namely Semi-Automatic Text Classification (SATC) and Document Performance Prediction (DPP), and several methods to address them. SATC focuses on selecting the documents most likely to be wrongly assigned by the system so that they can be classified manually, while automatically labelling the rest. DPP estimates the classification quality that will be achieved for a document, given a classifier. In addition, we propose a family of evaluation metrics to measure degrees of misclassification, and an adaptive variation of k-NN. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Queen Mary University of London [en_US]
dc.rights: The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without the prior written consent of the author
dc.subject: Electronic Engineering and Computer Science [en_US]
dc.subject: Information retrieval [en_US]
dc.subject: text classification [en_US]
dc.subject: Semi-Automatic Text Classification [en_US]
dc.subject: Document Performance Prediction [en_US]
dc.title: Knowledge-Enhanced Text Classification: Descriptive Modelling and New Approaches [en_US]
dc.type: Thesis [en_US]


This item appears in the following Collection(s)

  • Theses [4235]
    Theses Awarded by Queen Mary University of London
