Random indexing of multidimensional data

Fredrik Sandin, Blerim Emruli, and Magnus Sahlgren

This paper presents a model for generalising random indexing to multidimensional arrays, thereby enabling approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which are also the theoretical basis for ordinary random indexing and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments, including comparisons with ordinary random indexing and principal component analysis, which demonstrate that a multidimensional generalisation of random indexing is feasible. An open source implementation of generalised random indexing is provided.
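
To illustrate the idea of a sparse random projection applied to a two-dimensional array, the following is a minimal sketch and not the authors' implementation. It assumes that each row and column index is assigned a sparse ternary index vector (a few +1 and -1 entries, zeros elsewhere), that a weighted entry is encoded by adding the outer product of the two index vectors to a small state array, and that it is decoded by projecting the state back onto those vectors. The function names, dimensions, and number of nonzero entries are hypothetical choices made for the example.

```python
import numpy as np


def make_index_vectors(n_items, dim, nonzeros, rng):
    """Return an (n_items, dim) array of sparse ternary index vectors.

    Assumption: each vector has `nonzeros` entries (an even number),
    half +1 and half -1, placed at random positions.
    """
    vectors = np.zeros((n_items, dim), dtype=np.int8)
    signs = np.array([1, -1] * (nonzeros // 2), dtype=np.int8)
    for row in vectors:  # each row is a view, so assignment updates `vectors`
        positions = rng.choice(dim, size=nonzeros, replace=False)
        row[positions] = signs
    return vectors


def encode(state, row_vecs, col_vecs, i, j, weight):
    """Add `weight` for entry (i, j) to the compressed state array."""
    ri = np.flatnonzero(row_vecs[i])
    cj = np.flatnonzero(col_vecs[j])
    # Only nonzeros * nonzeros elements of the state are touched.
    state[np.ix_(ri, cj)] += weight * np.outer(row_vecs[i, ri], col_vecs[j, cj])


def decode(state, row_vecs, col_vecs, i, j):
    """Estimate the accumulated weight of entry (i, j) from the state."""
    ri = np.flatnonzero(row_vecs[i])
    cj = np.flatnonzero(col_vecs[j])
    signs = np.outer(row_vecs[i, ri], col_vecs[j, cj])
    return np.sum(state[np.ix_(ri, cj)] * signs) / (len(ri) * len(cj))


rng = np.random.default_rng(0)
# Hypothetical sizes: a 1000 x 1000 array represented in a 64 x 64 state.
row_vecs = make_index_vectors(1000, 64, 4, rng)
col_vecs = make_index_vectors(1000, 64, 4, rng)
state = np.zeros((64, 64))

encode(state, row_vecs, col_vecs, 12, 345, 3.0)
estimate = decode(state, row_vecs, col_vecs, 12, 345)
```

With a single stored entry, the decoded value equals the encoded weight. Once many entries share the same state array, their index vectors overlap at random and decoding becomes approximate, which is the trade-off a random projection makes for the reduced size.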

Knowledge and Information Systems (2016). doi:10.1007/s10115-016-1012-2