Corpus ID: 49564629

Neural Random Projections for Language Modelling

Davide Nunes and Luis Antunes
Neural network-based language models deal with data sparsity problems by mapping the large discrete space of words into a smaller continuous space of real-valued vectors. By learning distributed vector representations for words, each training sample informs the neural network model about a combinatorial number of other patterns. In this paper, we exploit the sparsity in natural language even further by encoding each unique input word using a fixed sparse random representation. These sparse… 
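The fixed sparse random codes described above can be sketched roughly as follows. This is a minimal illustration only, not the paper's implementation; the dimensionality, number of nonzeros, bipolar values, and hash-based seeding are all assumptions:

```python
import hashlib
import numpy as np

def sparse_random_vector(word, dim=1000, nnz=10):
    """Illustrative sketch: a fixed k-sparse random code for a word.

    The vector has `nnz` nonzero entries (half +1, half -1) at positions
    drawn from a generator seeded by a stable hash of the word, so every
    unique word always maps to the same sparse representation.
    """
    # stable seed (Python's built-in hash() varies across runs)
    seed = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nnz, replace=False)
    vec[positions] = np.array([1.0, -1.0] * (nnz // 2))
    return vec
```

Because the codes are fixed rather than learned, no embedding table for the input vocabulary needs to be stored or updated; high-dimensional sparse codes are nearly orthogonal with high probability, which keeps distinct words distinguishable.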

HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing Enabled Embedding of n-gram Statistics

This paper investigates distributed representations of the n-gram statistics of texts, formed using a hyperdimensional-computing-enabled embedding, and evaluates the applicability of the embedding on one large and three small standard classification datasets using nine classifiers.
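A common way to build such hyperdimensional embeddings of n-gram statistics can be sketched as below. This is a hedged illustration of the general technique (binding by elementwise product, position encoded by permutation, superposition by summation); the dimensionality and encoding details are assumptions, not necessarily this paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 1024  # illustrative hyperdimensional vector size
# one random bipolar HD vector per character (illustrative alphabet)
ALPHABET = {c: rng.choice([-1.0, 1.0], DIM) for c in "abcdefghijklmnopqrstuvwxyz "}

def embed_ngram_stats(text, n=3):
    """Sketch: embed the n-gram statistics of a text into a single HD vector.

    Each n-gram is bound by the elementwise product of its characters'
    vectors, with position j encoded by rolling (permuting) the vector
    j steps; all n-grams are superposed by summation.
    """
    acc = np.zeros(DIM)
    for i in range(len(text) - n + 1):
        gram = np.ones(DIM)
        for j, ch in enumerate(text[i:i + n]):
            gram *= np.roll(ALPHABET[ch], j)  # position via permutation
        acc += gram
    return acc
```

The resulting fixed-size vector can be fed to any standard classifier, trading some accuracy for much smaller models, which is the resource/performance trade-off the paper studies.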

Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings

Simple and fully general methods for converting from contextualized representations to static lookup-table embeddings are introduced which are applied to 5 popular pretrained models and 9 sets of pretrained weights and reveal that pooling over many contexts significantly improves representational quality under intrinsic evaluation.

High-dimensional statistical inference: Theoretical development to data analytics

D. Ayyala, Computer Science, Handbook of Statistics, 2020

A Neural Probabilistic Language Model

This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.

A fast and simple algorithm for training neural probabilistic language models

This work proposes a fast and simple algorithm for training NPLMs based on noise-contrastive estimation, a recently introduced procedure for estimating unnormalized continuous distributions, and demonstrates the scalability of the proposed approach by training several neural language models on a 47M-word corpus with an 80K-word vocabulary.

Distributed Representations of Words and Phrases and their Compositionality

This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
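The negative-sampling objective mentioned above can be sketched as follows, assuming standard skip-gram notation (a center word vector, a context word vector, and a handful of sampled "noise" words); this is an illustrative per-pair loss, not the paper's full training procedure:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(center, context, negatives):
    """Sketch of the skip-gram negative-sampling objective for one pair:
    maximize log sigma(v_c . v_o) + sum_k log sigma(-v_c . v_neg_k),
    returned here as a negative log-likelihood to minimize.
    """
    pos = np.log(sigmoid(center @ context))
    neg = sum(np.log(sigmoid(-center @ n)) for n in negatives)
    return -(pos + neg)
```

The key design choice is that only the sampled negative words are updated per step, replacing the full softmax over the vocabulary with a few binary discriminations.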

Three new graphical models for statistical language modelling

It is shown how real-valued distributed representations for words can be learned at the same time as learning a large set of stochastic binary hidden features that are used to predict the distributed representation of the next word from previous distributed representations.

Hierarchical Probabilistic Neural Network Language Model

A hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, is introduced; it yields a speed-up of about 200 during both training and recognition.
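The core idea of such a hierarchical decomposition can be sketched as below: the probability of a word becomes a product of binary left/right decisions along its path in a tree over the vocabulary, so only O(log V) node scores are needed instead of a full softmax. The function names and the sign convention for branches are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hierarchical_word_prob(hidden, path_vectors, path_signs):
    """Sketch: P(word | context) as a product of binary decisions.

    Each internal node on the word's root-to-leaf path has a vector;
    sign +1/-1 encodes which branch leads toward the word.
    """
    p = 1.0
    for v, sign in zip(path_vectors, path_signs):
        p *= sigmoid(sign * (hidden @ v))
    return p
```

Because sigmoid(x) + sigmoid(-x) = 1 at every node, the leaf probabilities of a complete tree sum to exactly 1, so the decomposition is a proper distribution over the vocabulary.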

Character-Aware Neural Language Models

A simple neural language model that relies only on character-level inputs is presented; it is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.

Linguistic Regularities in Sparse and Explicit Word Representations

It is demonstrated that analogy recovery is not restricted to neural word embeddings, and that a similar amount of relational similarities can be recovered from traditional distributional word representations.

Efficient Estimation of Word Representations in Vector Space

Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

Exploring the Limits of Language Modeling

This work explores recent advances in Recurrent Neural Networks for large scale Language Modeling, and extends current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language.

Encoding syntactic dependencies by vector permutation

This work proposes an approach based on vector permutation and Random Indexing to encode several syntactic contexts in a single WordSpace, in which words are represented as points in a geometric space.
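The permutation trick for tagging a context with its syntactic role can be sketched as follows. This is a minimal illustration under assumed parameters (the relation-to-shift table, dimensionality, and sparsity are hypothetical, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, NNZ = 512, 8  # illustrative Random Indexing parameters

def index_vector():
    """A sparse random index vector, as used in Random Indexing."""
    v = np.zeros(DIM)
    pos = rng.choice(DIM, size=NNZ, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], NNZ)
    return v

# hypothetical mapping: dependency relation -> permutation (roll) offset
RELATION_SHIFT = {"nsubj": 1, "dobj": 2}

def add_context(memory, context_iv, relation):
    """Roll the context word's index vector by the relation's shift,
    then superpose it into the target word's memory vector; the shift
    marks which syntactic role the context filled."""
    return memory + np.roll(context_iv, RELATION_SHIFT[relation])
```

Because `np.roll` is invertible, rolling a memory vector back by a relation's shift recovers (approximately, under superposition noise) the contexts that appeared in that role, which is how multiple syntactic contexts can share one WordSpace.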