• Corpus ID: 49564629

Neural Random Projections for Language Modelling

@article{Nunes2018NeuralRP,
  title={Neural Random Projections for Language Modelling},
  author={Davide Nunes and Luis Antunes},
  journal={ArXiv},
  year={2018},
  volume={abs/1807.00930}
}
Neural network-based language models deal with data sparsity problems by mapping the large discrete space of words into a smaller continuous space of real-valued vectors. By learning distributed vector representations for words, each training sample informs the neural network model about a combinatorial number of other patterns. In this paper, we exploit the sparsity in natural language even further by encoding each unique input word using a fixed sparse random representation. These sparse… 
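The fixed sparse random codes mentioned in the abstract can be illustrated with a minimal sketch: each vocabulary word is assigned, once and for all, a high-dimensional ternary vector with a handful of random ±1 entries. The dimensionality and sparsity level below are illustrative assumptions, not the paper's settings.

```python
# Sketch: fixed sparse random representations for words (illustrative sizes).
import numpy as np

def sparse_random_vector(dim=1000, nnz=10, rng=None):
    """Ternary vector with `nnz` randomly placed +1/-1 entries."""
    rng = rng if rng is not None else np.random.default_rng()
    v = np.zeros(dim)
    idx = rng.choice(dim, size=nnz, replace=False)
    v[idx] = rng.choice([-1.0, 1.0], size=nnz)
    return v

rng = np.random.default_rng(0)
# Each unique word gets one fixed sparse code; the code itself is never trained.
codes = {w: sparse_random_vector(rng=rng) for w in ["the", "cat", "sat"]}
```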

HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing Enabled Embedding of n-gram Statistics

TLDR
This paper investigates distributed representations of n-gram statistics of texts formed using hyperdimensional computing-enabled embeddings, and evaluates their applicability on one large and three small standard datasets for classification tasks using nine classifiers.
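As a rough sketch of how hyperdimensional computing can embed n-gram statistics (the general idea, not necessarily HyperEmbed's exact pipeline): each symbol gets a random bipolar hypervector, position within an n-gram is encoded by a cyclic shift, the shifted vectors are bound by elementwise multiplication, and all n-grams are bundled by summation. The dimensionality and character-level n-grams are illustrative assumptions.

```python
# Sketch of hyperdimensional n-gram encoding (illustrative parameters).
import numpy as np

DIM = 10_000
rng = np.random.default_rng(1)
item = {}  # one random bipolar hypervector per symbol

def hv(symbol):
    if symbol not in item:
        item[symbol] = rng.choice([-1, 1], size=DIM)
    return item[symbol]

def ngram_embedding(text, n=3):
    acc = np.zeros(DIM)
    for i in range(len(text) - n + 1):
        gram = np.ones(DIM, dtype=int)
        for pos, ch in enumerate(text[i:i + n]):
            # encode position by cyclically shifting (permuting) the hypervector
            gram = gram * np.roll(hv(ch), pos)
        acc += gram          # bundle all n-grams into one vector
    return acc

e = ngram_embedding("language modelling")
```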

Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings

TLDR
Simple and fully general methods for converting contextualized representations to static lookup-table embeddings are introduced and applied to 5 popular pretrained models and 9 sets of pretrained weights, revealing that pooling over many contexts significantly improves representational quality under intrinsic evaluation.
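A minimal sketch of the pooling idea behind such reductions: collect a word's contextual vectors over many occurrences and average them into a single static vector. Here `contextual_vectors` is a hypothetical placeholder for any pretrained contextual encoder; the toy corpus and 8-dimensional vectors are illustrative.

```python
# Sketch: distill a static lookup table by mean-pooling contextual vectors.
from collections import defaultdict
import numpy as np

def contextual_vectors(tokens):
    """Placeholder: one vector per token from some contextual model."""
    return [np.random.default_rng(abs(hash(t)) % 2**32).normal(size=8) for t in tokens]

sums = defaultdict(lambda: np.zeros(8))
counts = defaultdict(int)
corpus = [["the", "bank", "approved", "the", "loan"],
          ["we", "sat", "on", "the", "river", "bank"]]
for sent in corpus:
    for tok, vec in zip(sent, contextual_vectors(sent)):
        sums[tok] += vec
        counts[tok] += 1

static_embedding = {w: sums[w] / counts[w] for w in sums}  # pooled static vectors
```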

High-dimensional statistical inference: Theoretical development to data analytics

  • D. Ayyala · Computer Science · Handbook of Statistics · 2020

References

Showing 1-10 of 54 references

A Neural Probabilistic Language Model

TLDR
This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
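A minimal sketch of the feed-forward architecture this model family uses: embedding lookup for the previous words, concatenation, a tanh hidden layer, and a softmax over the vocabulary. Layer sizes and the PyTorch framing are illustrative choices, not the original implementation.

```python
# Sketch of a feed-forward neural probabilistic language model (illustrative sizes).
import torch
import torch.nn as nn

class NPLM(nn.Module):
    def __init__(self, vocab=10_000, ctx=3, dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)          # shared word feature vectors
        self.hidden = nn.Linear(ctx * dim, hidden)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, context_ids):                  # (batch, ctx)
        x = self.emb(context_ids).flatten(1)         # concatenate context embeddings
        return self.out(torch.tanh(self.hidden(x)))  # logits over the next word

logits = NPLM()(torch.randint(0, 10_000, (2, 3)))
```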

A fast and simple algorithm for training neural probabilistic language models

TLDR
This work proposes a fast and simple algorithm for training NPLMs based on noise-contrastive estimation, a newly introduced procedure for estimating unnormalized continuous distributions, and demonstrates the scalability of the approach by training several neural language models on a 47M-word corpus with an 80K-word vocabulary.
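As a hedged sketch of the noise-contrastive estimation objective (the general NCE formulation, not necessarily this paper's exact parameterisation): the model's unnormalised score for the observed word is trained to separate it from k samples drawn from a noise distribution q. The tensors below are toy values.

```python
# Sketch of the NCE objective for one predicted word vs. k noise samples.
import math
import torch
import torch.nn.functional as F

def nce_loss(score_data, scores_noise, log_q_data, log_q_noise, k):
    # Treat the observed word as "data" and the k sampled words as "noise";
    # P(data | w) = sigmoid(score(w) - log(k * q(w))).
    pos = F.logsigmoid(score_data - (math.log(k) + log_q_data))
    neg = F.logsigmoid(-(scores_noise - (math.log(k) + log_q_noise)))
    return -(pos + neg.sum())

loss = nce_loss(torch.tensor(2.0), torch.tensor([0.1, -0.3]),
                torch.tensor(-5.0), torch.tensor([-4.0, -6.0]), k=2)
```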

Distributed Representations of Words and Phrases and their Compositionality

TLDR
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
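A short sketch of the negative-sampling objective for a single (center, context) pair with a few sampled negative words; the vector dimensionality and number of negatives are illustrative.

```python
# Sketch of the skip-gram negative-sampling loss for one training pair.
import torch
import torch.nn.functional as F

def sgns_loss(v_center, u_context, u_negatives):
    pos = F.logsigmoid(u_context @ v_center)              # pull the true pair together
    neg = F.logsigmoid(-(u_negatives @ v_center)).sum()   # push sampled negatives apart
    return -(pos + neg)

d = 16
loss = sgns_loss(torch.randn(d), torch.randn(d), torch.randn(5, d))
```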

Three new graphical models for statistical language modelling

TLDR
It is shown how real-valued distributed representations for words can be learned at the same time as learning a large set of stochastic binary hidden features that are used to predict the distributed representation of the next word from previous distributed representations.
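One of the model families described here, the log-bilinear one, can be sketched as follows: the representation of the next word is predicted as a learned linear function of the previous words' representations and scored against every word's representation (the stochastic binary hidden features of the other variants are omitted). Sizes, the two-word context, and PyTorch are illustrative assumptions.

```python
# Sketch of a log-bilinear next-word model (illustrative sizes, no hidden features).
import torch
import torch.nn as nn

class LogBilinear(nn.Module):
    def __init__(self, vocab=5000, dim=32, ctx=2):
        super().__init__()
        self.R = nn.Embedding(vocab, dim)                         # word representations
        self.C = nn.Parameter(torch.randn(ctx, dim, dim) * 0.01)  # per-position transforms

    def forward(self, context_ids):                     # (batch, ctx)
        r = self.R(context_ids)                         # (batch, ctx, dim)
        pred = torch.einsum("bcd,cde->be", r, self.C)   # predicted next-word representation
        return pred @ self.R.weight.T                   # scores against all words

scores = LogBilinear()(torch.randint(0, 5000, (4, 2)))
```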

Hierarchical Probabilistic Neural Network Language Model

TLDR
A hierarchical decomposition of the conditional probabilities, constrained by prior knowledge extracted from the WordNet semantic hierarchy, is introduced and yields a speed-up of about 200 both during training and recognition.
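A toy sketch of the hierarchical decomposition: the probability of a word given the history is a product of binary left/right decisions along the word's path in a tree (WordNet-derived in the paper; the two-node path and random vectors below are purely illustrative).

```python
# Sketch: P(word | h) as a product of binary decisions along a tree path.
import torch

def tree_prob(h, path):
    """path: list of (node_vector, direction) pairs with direction in {+1, -1}."""
    p = torch.tensor(1.0)
    for node_vec, d in path:
        p = p * torch.sigmoid(d * (node_vec @ h))   # one binary decision per inner node
    return p

h = torch.randn(8)                                  # toy history representation
path_to_word = [(torch.randn(8), +1), (torch.randn(8), -1)]
print(tree_prob(h, path_to_word))
```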

Character-Aware Neural Language Models

TLDR
A simple neural language model that relies only on character-level inputs is presented; it is able to encode, from characters only, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
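A minimal sketch of composing a word representation from characters only (character embeddings, a 1-D convolution over the character sequence, and max-over-time pooling); filter counts and sizes are illustrative, and the highway layers of the full model are omitted.

```python
# Sketch: word vector from character-level inputs via CNN + max pooling.
import torch
import torch.nn as nn

char_emb = nn.Embedding(100, 16)            # 100 characters, 16-dim embeddings
conv = nn.Conv1d(16, 32, kernel_size=3)     # 32 character n-gram filters

def word_vector(char_ids):                  # (word_len,)
    x = char_emb(char_ids).T.unsqueeze(0)   # (1, 16, word_len)
    return conv(x).max(dim=2).values        # max-over-time pooling -> (1, 32)

vec = word_vector(torch.tensor([5, 17, 42, 8]))
```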

Linguistic Regularities in Sparse and Explicit Word Representations

TLDR
It is demonstrated that analogy recovery is not restricted to neural word embeddings, and that a similar amount of relational similarities can be recovered from traditional distributional word representations.
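The analogy-recovery procedure referred to here is plain vector arithmetic plus cosine similarity, and it applies equally to sparse explicit vectors. A sketch, with random toy vectors used only to exercise the code:

```python
# Sketch: analogy recovery a : b :: c : ? by vector offset and cosine similarity.
import numpy as np

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def analogy(vecs, a, b, c):
    """Return the word d maximising cos(vecs[d], vecs[b] - vecs[a] + vecs[c])."""
    target = vecs[b] - vecs[a] + vecs[c]
    return max((w for w in vecs if w not in {a, b, c}),
               key=lambda w: cos(vecs[w], target))

# Random toy table; works the same whether vectors are dense or sparse/explicit.
toy = {w: np.random.default_rng(i).normal(size=50)
       for i, w in enumerate(["king", "queen", "man", "woman"])}
print(analogy(toy, "man", "king", "woman"))
```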

Efficient Estimation of Word Representations in Vector Space

TLDR
Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

Exploring the Limits of Language Modeling

TLDR
This work explores recent advances in Recurrent Neural Networks for large-scale Language Modeling and extends current models to deal with two key challenges of this task: corpora and vocabulary sizes, and the complex, long-term structure of language.

Encoding syntactic dependencies by vector permutation

TLDR
This work proposes an approach based on vector permutation and Random Indexing to encode several syntactic contexts in a single WordSpace in which words are represented by mathematical points in a geometric space.
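A small sketch of the permutation-plus-Random-Indexing idea: each word has a fixed sparse index vector, each dependency relation is assigned its own fixed permutation, and a word's context vector accumulates the permuted index vectors of its syntactic neighbours. Dimensions, sparsity, and the toy relations are illustrative assumptions.

```python
# Sketch: Random Indexing with relation-specific permutations (illustrative sizes).
import numpy as np

DIM, NNZ = 2000, 8
rng = np.random.default_rng(2)

def index_vector():
    v = np.zeros(DIM)
    idx = rng.choice(DIM, NNZ, replace=False)
    v[idx] = rng.choice([-1, 1], NNZ)
    return v

index = {w: index_vector() for w in ["dog", "barks", "loud"]}          # fixed sparse codes
perm = {rel: rng.permutation(DIM) for rel in ["nsubj", "advmod"]}      # one permutation per relation
context = {w: np.zeros(DIM) for w in index}

# Each (head, relation, dependent) triple updates the head's context vector
# with the dependent's index vector permuted according to the relation.
for head, rel, dep in [("barks", "nsubj", "dog"), ("barks", "advmod", "loud")]:
    context[head] += index[dep][perm[rel]]
```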
...