Lexicools at SemEval-2023 Task 10: Sexism Lexicon Construction via XAI

Nakwijit, P; Samir, M; Purver, M

dc.contributor.author	Nakwijit, P	en_US
dc.contributor.author	Samir, M	en_US
dc.contributor.author	Purver, M	en_US
dc.date.accessioned	2023-11-16T11:32:17Z
dc.date.issued	2023-01-01	en_US
dc.identifier.isbn	9781959429999	en_US
dc.identifier.uri	https://qmro.qmul.ac.uk/xmlui/handle/123456789/91950
dc.description.abstract	This paper presents our work on SemEval-2023 Task 10, Explainable Detection of Online Sexism (EDOS) (Kirk et al., 2023), using lexicon-based models. Our approach consists of three main steps: lexicon construction based on Pointwise Mutual Information (PMI) and Shapley values; lexicon augmentation using an unannotated corpus and Large Language Models (LLMs); and, lastly, lexical incorporation for Bag-of-Words (BoW) logistic regression and fine-tuning LLMs. Our results demonstrate that our Shapley approach effectively produces a high-quality lexicon. We also show that simply counting the presence of certain words in our lexicons and comparing the counts can outperform a BoW logistic regression on tasks B and C and a fine-tuned BERT on task C. In the end, our classifier achieved F1-scores of 53.34% and 27.31% on the official blind test sets for tasks B and C, respectively. We additionally provide an in-depth analysis highlighting model limitations and bias. We also present our attempts to understand the model’s behavior based on our constructed lexicons. Our code and the resulting lexicons are open-sourced in our GitHub repository at https://github.com/SirBadr/SemEval2023-Task10.	en_US
dc.format.extent	23 - 43	en_US
dc.title	Lexicools at SemEval-2023 Task 10: Sexism Lexicon Construction via XAI	en_US
dc.type	Conference Proceeding
pubs.notes	Not known	en_US
pubs.publication-status	Published	en_US
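
The abstract mentions PMI-based lexicon construction and a classifier that counts lexicon words and compares the counts. The following is a minimal sketch of that general idea only, not the authors' implementation: the function names (build_pmi_lexicon, classify_by_count), the parameters (top_k, min_count), the whitespace tokenization, and the placeholder data are all assumptions made for illustration. The Shapley-value and lexicon-augmentation steps are not shown.

```python
import math
from collections import Counter


def build_pmi_lexicon(docs, labels, top_k=1, min_count=2):
    """Return {label: [words]} holding the top_k highest-PMI words per label.

    Hypothetical helper: PMI is estimated from document-level word presence.
    """
    n_docs = len(docs)
    label_counts = Counter(labels)
    word_counts = Counter()
    joint_counts = Counter()
    for text, label in zip(docs, labels):
        for word in set(text.lower().split()):  # count each word once per document
            word_counts[word] += 1
            joint_counts[(word, label)] += 1

    lexicon = {}
    for label in label_counts:
        scores = {}
        for (word, lab), joint in joint_counts.items():
            if lab != label or word_counts[word] < min_count:
                continue  # skip other labels and rare words
            # PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) )
            scores[word] = math.log2(
                joint * n_docs / (word_counts[word] * label_counts[label])
            )
        # Highest PMI first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        lexicon[label] = ranked[:top_k]
    return lexicon


def classify_by_count(text, lexicon):
    """Predict the label whose lexicon has the most words present in text."""
    tokens = set(text.lower().split())
    counts = {
        label: sum(word in tokens for word in words)
        for label, words in lexicon.items()
    }
    # On a tie, the alphabetically first label wins (an arbitrary choice).
    return max(sorted(counts), key=counts.get)


# Placeholder corpus with neutral tokens and two made-up labels.
docs = ["alpha beta gamma", "alpha delta", "epsilon zeta", "epsilon gamma"]
labels = ["A", "A", "B", "B"]

lexicon = build_pmi_lexicon(docs, labels, top_k=1, min_count=2)
prediction = classify_by_count("epsilon zeta", lexicon)
```

On this placeholder corpus the lexicon keeps "alpha" for label A and "epsilon" for label B, since "gamma" occurs under both labels and therefore scores a PMI of zero for each.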

This item appears in the following Collection(s)

Electronic Engineering and Computer Science [3387]