QMUL-SDS at EXIST: Leveraging pre-trained semantics and lexical features for multilingual sexism detection in social networks

Jiang, A; Zubiaga, A; IberLEF

dc.contributor.author	Jiang, A	en_US
dc.contributor.author	Zubiaga, A	en_US
dc.contributor.author	IberLEF	en_US
dc.date.accessioned	2023-07-20T11:27:34Z
dc.date.available	2021-06-21	en_US
dc.date.issued	2021-08-02	en_US
dc.identifier.issn	1613-0073	en_US
dc.identifier.uri	https://qmro.qmul.ac.uk/xmlui/handle/123456789/89671
dc.description.abstract	Online sexism is an increasing concern for those who experience gender-based abuse on social media platforms, as it has affected the healthy development of the Internet with negative impacts on society. The EXIST shared task proposes the first task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 [30]. It provides a benchmark sexism dataset with Twitter and Gab posts in both English and Spanish, along with a task articulated in two subtasks consisting of sexism detection at different levels of granularity: Subtask 1 Sexism Identification is a classical binary classification task to determine whether a given text is sexist or not, while Subtask 2 Sexism Categorisation is a finer-grained classification task focused on distinguishing different types of sexism. In this paper, we describe the participation of the QMUL-SDS team in EXIST. We propose an architecture made of the last 4 hidden states of XLM-RoBERTa and a TextCNN with 3 kernels. Our model also exploits lexical features relying on the use of new and existing lexicons of abusive words, with a special focus on sexist slurs and abusive words targeting women. Our team ranked 11th in Subtask 1 and 4th in Subtask 2 among all the teams on the leaderboard, clearly outperforming the baselines offered by EXIST.	en_US
dc.format.extent	469 - 483	en_US
dc.rights	This item is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
dc.rights	Attribution 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/us/	*
dc.title	QMUL-SDS at EXIST: Leveraging pre-trained semantics and lexical features for multilingual sexism detection in social networks	en_US
dc.type	Conference Proceeding
dc.rights.holder	© 2021 The Author(s). CEUR Workshop Proceedings
pubs.notes	Not known	en_US
pubs.publication-status	Published	en_US
pubs.volume	2943	en_US
dcterms.dateAccepted	2021-06-21	en_US
rioxxterms.funder	Default funder	en_US
rioxxterms.identifier.project	Default project	en_US

Files in this item

Name:: Zubiaga Qmul-sds at exist 2021 ...
Size:: 299.4Kb
Format:: application/
Description:: Published version

Name:: license_rdf
Size:: 914bytes
Format:: application/rdf+xml

This item appears in the following Collection(s)

Electronic Engineering and Computer Science [3475]
