Hyperdimensional utterance spaces

Jussi Karlgren and Pentti Kanerva

Human language has a large and varying number of features, both lexical items and constructions, which interact to represent various aspects of communicative information. High-dimensional semantic spaces have proven useful and effective for aggregating and processing lexical information for many language processing tasks. This paper describes a hyperdimensional processing model for language data, a straightforward extension of models previously used for words to handling utterance or text level information. A hyperdimensional model is able to represent a broad range of linguistic and extra-linguistic features in a common integral framework which is suitable as a bridge between symbolic and continuous representations, as an encoding scheme for symbolic information and as a basis for feature space exploration. This paper provides an overview of the framework and an example of how it is used in a pilot experiment.

Presented at the 1st Biennial Conference on Design of Experimental Search and Information Retrieval Systems (DESIRES), 2018