Geometric Methods for Context Sensitive Distributional Semantics
Abstract
This thesis describes a novel methodology, grounded in the distributional semantic paradigm,
for building context sensitive models of word meaning, affording an empirical exploration
of the relationship between words and concepts. Anchored in theoretical linguistic insight
regarding the contextually specified nature of lexical semantics, the work presented here
explores a range of techniques for the selection of subspaces of word co-occurrence dimensions
based on a statistical analysis of input terms as observed within large-scale textual
corpora. The relationships between word-vectors that emerge in the projected subspaces
can be analysed in terms of a mapping between their geometric features and their semantic
properties. The power of this modelling technique is its ability to generate ad hoc
semantic relationships in response to an extemporaneous linguistic or conceptual situation.
The product of this approach is a generalisable computational linguistic methodology,
capable of taking input in various forms, including word groupings and sentential context,
and dynamically generating output from a broad base model of word co-occurrence
data. To demonstrate the versatility of the method, this thesis will present competitive
empirical results on a range of established natural language tasks including word similarity
and relatedness rating, metaphor and metonymy detection, and analogy completion.
A range of techniques will be applied in order to explore the ways in which different
aspects of projected geometries can be mapped to different semantic relationships, allowing
for the discovery of a range of lexical and conceptual properties for any given input
and providing a basis for an empirical exploration of distinctions between the semantic
phenomena under analysis. The case made here is that the flexibility of these models
and their ability to extend output to evaluations of unattested linguistic relationships
constitute the groundwork for a method for the extrapolation of dynamic conceptual
relationships from large-scale textual corpora.
This method is presented as a complement and a counterpoint to established distributional
methods for generating lexically productive word-vectors. Where contemporary
vector space models of distributional semantics have almost universally involved either
the factorisation of co-occurrence matrices or the incremental learning of abstract representations
using neural networks, the approach described in this thesis preserves the
connection between the individual dimensions of word-vectors and statistics pertaining
to observations in a textual corpus. The hypothesis tested here is that the maintenance
of actual, interpretable information about underlying linguistic data allows for the contextual
selection of non-normalised subspaces with more nuanced geometric features. In
addition to presenting competitive results for various computational linguistic targets,
the thesis will suggest that the transparency of its representations indicates scope for
the application of this model to various real-world problems where an interpretable relationship
between data and output is highly desirable. This, finally, demonstrates a way
towards the productive application of the theory and philosophy of language to computational
linguistic practice.
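
As a rough illustration of the contextual subspace selection described in this abstract, the following minimal sketch picks co-occurrence dimensions based on statistics of a set of input words and then inspects the geometry of the resulting projection. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy counts, the use of positive pointwise mutual information (PPMI) as the weighting, the mean-score selection rule, and the helper names `ppmi`, `select_subspace` and `cosine`.

```python
import numpy as np

# Hypothetical toy data: rows are target words, columns are co-occurrence terms.
words = ["cat", "dog", "car", "bus"]
contexts = ["purr", "bark", "pet", "drive", "road", "the"]
counts = np.array([
    [8, 0, 6, 0, 1, 9],   # cat
    [0, 9, 7, 1, 1, 9],   # dog
    [0, 0, 0, 8, 7, 9],   # car
    [0, 0, 0, 6, 9, 9],   # bus
], dtype=float)


def ppmi(counts):
    """Positive PMI weighting (an assumed choice of co-occurrence statistic)."""
    total = counts.sum()
    p_wc = counts / total
    p_w = p_wc.sum(axis=1, keepdims=True)
    p_c = p_wc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore"):
        pmi = np.log2(p_wc / (p_w * p_c))
    # Zero counts give -inf, which is clipped to 0 along with negative values.
    return np.maximum(pmi, 0.0)


def select_subspace(matrix, input_rows, k):
    """Indices of the k dimensions with the highest mean score over the input words."""
    mean_scores = matrix[input_rows].mean(axis=0)
    return np.argsort(mean_scores)[::-1][:k]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


weights = ppmi(counts)
inputs = [words.index("cat"), words.index("dog")]

# Each selected dimension still corresponds to a named co-occurrence term.
dims = select_subspace(weights, inputs, k=3)
selected_terms = [contexts[i] for i in dims]

# Project every word into the context-specific subspace without normalising.
projected = weights[:, dims]
norms = np.linalg.norm(projected, axis=1)
cat_dog = cosine(projected[inputs[0]], projected[inputs[1]])

print("selected dimensions:", selected_terms)
print("norms in subspace:", dict(zip(words, norms.round(3))))
print("cosine(cat, dog):", round(cat_dog, 3))
```

Because the projection is left unnormalised in this sketch, vector norms as well as angles remain available as geometric features, and each retained dimension can be read off directly as a co-occurrence term.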
Authors
McGregor, Stephen