Word Sense Distance and Similarity Patterns in Regular Polysemy - Insights Gained from Human Annotations of Graded Word Sense Similarity and an Investigation of Contextualised Language Models
Abstract
This thesis investigates the notion of distance between different interpretations of polysemic words. It presents a novel, large-scale dataset containing close to 18,000 human annotations that rate both the nuanced sense similarity in lexically ambiguous word forms and the acceptability of combining their different sense interpretations in a single co-predication structure. The collected data suggests that different polysemic sense extensions can be perceived as significantly dissimilar in meaning, forming patterns of word sense similarity in some types of regular metonymic alternations. These observations call into question traditional theories postulating a fully under-specified mental representation of polysemic sense. Instead, the collected data supports more recent hypotheses of a structured representation of polysemy in the mental lexicon, suggesting some form of sense grouping, clustering, or hierarchical ordering based on word sense similarity. The new dataset is then also used to evaluate the performance of a range of contextualised language models in predicting graded word sense similarity. Our findings suggest that, without any dedicated fine-tuning, BERT Large in particular shows a relatively high correlation with the collected judgements. The model, however, struggles to consistently reproduce the similarity patterns observed in the human data, or to cluster word senses solely based on their contextualised embeddings. Finally, this thesis presents a pilot algorithm for automatically detecting words that exhibit a given polysemic sense alternation. Formulated in an unsupervised fashion, this algorithm is intended to bootstrap the collection of an even larger dataset of ambiguous language use that could be used in the fine-tuning or evaluation of computational language models for (graded) word sense disambiguation tasks.
Authors
Haber, J