• Corpus ID: 686481

Character-Aware Neural Language Models

@inproceedings{Kim2016CharacterAwareNL,
  title={Character-Aware Neural Language Models},
  author={Yoon Kim and Yacine Jernite and David A. Sontag and Alexander M. Rush},
  booktitle={AAAI},
  year={2016}
}
We describe a simple neural language model that relies only on character-level inputs. [...] The results suggest that on many languages, character inputs are sufficient for language modeling. Analysis of word representations obtained from the character composition part of the model reveals that the model is able to encode, from characters only, both semantic and orthographic information.
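As a concrete illustration of the character-composition idea described in this abstract, below is a minimal sketch in PyTorch: characters are embedded, convolved, and max-pooled over time into a word vector that would then feed a word-level recurrent language model. The module and parameter names (CharWordEncoder, n_filters, etc.) are illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    # Sketch: embed characters, convolve, max-pool over time -> word vector.
    def __init__(self, n_chars=100, char_dim=15, n_filters=100, width=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # 1-D convolution over the character sequence of a single word
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=width)

    def forward(self, char_ids):              # (batch, max_word_len)
        x = self.char_emb(char_ids)           # (batch, len, char_dim)
        x = x.transpose(1, 2)                 # (batch, char_dim, len)
        x = torch.tanh(self.conv(x))          # (batch, n_filters, len-width+1)
        word_vec, _ = x.max(dim=2)            # max-over-time pooling
        return word_vec                       # (batch, n_filters)

enc = CharWordEncoder()
print(enc(torch.randint(1, 100, (4, 12))).shape)  # torch.Size([4, 100])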
Character-level language modeling with hierarchical recurrent neural networks
  • Kyuyeon Hwang, Wonyong Sung
  • Computer Science, Mathematics
    2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2017
TLDR
This work proposes hierarchical RNN architectures consisting of multiple modules with different timescales, and shows better perplexity than Kneser-Ney (KN) 5-gram word-level language models on the One Billion Word Benchmark with only 2% of the parameters.
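A toy sketch of the multi-timescale idea, assuming PyTorch: a fast character-level LSTM runs at every step, while a slow word-level LSTM updates only when a word boundary (here, a space) is read. The class and the boundary convention are assumptions for illustration, not the paper's exact modules.

import torch
import torch.nn as nn

class TwoTimescaleLM(nn.Module):
    def __init__(self, n_chars=128, dim=64):
        super().__init__()
        self.dim = dim
        self.emb = nn.Embedding(n_chars, dim)
        self.char_cell = nn.LSTMCell(2 * dim, dim)  # char embedding + slow state
        self.word_cell = nn.LSTMCell(dim, dim)
        self.out = nn.Linear(dim, n_chars)

    def forward(self, char_ids, space_id=32):        # char_ids: (batch, T)
        B, T = char_ids.shape
        hc = torch.zeros(B, self.dim); cc = torch.zeros(B, self.dim)
        hw = torch.zeros(B, self.dim); cw = torch.zeros(B, self.dim)
        logits = []
        for t in range(T):
            # fast module sees the character plus the slow module's state
            x = torch.cat([self.emb(char_ids[:, t]), hw], dim=1)
            hc, cc = self.char_cell(x, (hc, cc))
            # slow module updates only where a space was just read
            boundary = (char_ids[:, t] == space_id).float().unsqueeze(1)
            hw_new, cw_new = self.word_cell(hc, (hw, cw))
            hw = boundary * hw_new + (1 - boundary) * hw
            cw = boundary * cw_new + (1 - boundary) * cw
            logits.append(self.out(hc))
        return torch.stack(logits, dim=1)            # (batch, T, n_chars)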
Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction
TLDR
The main technical contribution of this work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction.
Gated Word-Character Recurrent Language Model
TLDR
A recurrent neural network language model (RNN-LM) with long short-term memory (LSTM) units that utilizes both character-level and word-level inputs and outperforms word-level language models on several English corpora.
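A minimal sketch of the gating idea, assuming PyTorch: a scalar gate computed from the word embedding mixes it with a character-derived embedding of the same word. Names and dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class GatedWordChar(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.gate = nn.Linear(dim, 1)

    def forward(self, word_emb, char_emb):            # both (batch, dim)
        g = torch.sigmoid(self.gate(word_emb))        # (batch, 1), in (0, 1)
        return g * char_emb + (1 - g) * word_emb      # convex combination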
Enhancing recurrent neural network-based language models by word tokenization
TLDR
This paper presents a recurrent neural network language model based on the tokenization of words into three parts: the prefix, the stem, and the suffix. It outperforms the baseline n-gram model, the basic recurrent neural network language model (RNNLM), and the GPU-based recurrent neural network language model (CUED-RNNLM).
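A toy prefix/stem/suffix splitter, for flavor only: the affix lists below are placeholders, not the paper's actual morphological analysis.

PREFIXES = ("un", "re", "pre")
SUFFIXES = ("ing", "ness", "ed", "s")

def split_word(word):
    # pick the first matching affix from each (placeholder) list
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s)), "")
    stem = rest[: len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix

print(split_word("unhappiness"))  # ('un', 'happi', 'ness')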
Character and Subword-Based Word Representation for Neural Language Modeling Prediction
TLDR
This work investigates the effect of using subword units (character and factored morphological decomposition) to build output representations for neural language modeling and shows that augmenting the output word representations with character-based embeddings can significantly improve the performance of the model.
Character-based Neural Machine Translation
TLDR
A neural machine translation model that views the input and output sentences as sequences of characters rather than words, which alleviates many of the challenges associated with preprocessing/tokenization of the source and target languages.
A Character-Word Compositional Neural Language Model for Finnish
TLDR
A new Character-to-Word-to-Character (C2W2C) compositional language model that uses characters as input and output while still internally processing word-level embeddings, and can respond well to the challenges of morphologically rich languages.
Character-based Neural Machine Translation
TLDR
A neural MT system is proposed that uses character-based embeddings in combination with convolutional and highway layers to replace the standard lookup-based word representations, providing improved results even when the source language is not morphologically rich.
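The highway layers mentioned here follow the standard formulation y = t * H(x) + (1 - t) * x, where t is a learned transform gate; a sketch, assuming PyTorch:

import torch
import torch.nn as nn

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.h = nn.Linear(dim, dim)   # transform
        self.t = nn.Linear(dim, dim)   # transform gate
        # bias the gate toward carrying the input through at initialization
        nn.init.constant_(self.t.bias, -2.0)

    def forward(self, x):
        t = torch.sigmoid(self.t(x))
        return t * torch.relu(self.h(x)) + (1 - t) * x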
Comparing Character-level Neural Language Models Using a Lexical Decision Task
TLDR
The overall number of parameters in the network turns out to be the most important predictor of accuracy; in particular, there is little evidence that deeper networks are beneficial for this task.
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
TLDR
This paper evaluates and compares convolutional neural networks for the task of morphological tagging on three morphologically different languages and shows that these models implicitly discover understandable linguistic rules.

References

Showing 1-10 of 83 references
Subword Language Modeling with Neural Networks
We explore the performance of several types of language models on the word-level and the character-level language modeling tasks. This includes two recently proposed recurrent neural network...
Learning Character-level Representations for Part-of-Speech Tagging
TLDR
A deep neural network is proposed that learns character-level representations of words and associates them with usual word representations to perform POS tagging, producing state-of-the-art POS taggers for two languages.
Boosting Named Entity Recognition with Neural Character Embeddings
TLDR
This work proposes a language-independent NER system that uses automatically learned features only, and demonstrates that the same neural network which has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and without any handcrafted features.
LSTM Neural Networks for Language Modeling
TLDR
This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.
genCNN: A Convolutional Architecture for Word Sequence Prediction
TLDR
It is argued that the proposed novel convolutional architecture, named genCNN, can give an adequate representation of the history and therefore can naturally exploit both short- and long-range dependencies.
Context dependent recurrent neural network language model
TLDR
This paper improves recurrent neural network language model performance by providing a contextual real-valued input vector in association with each word; this vector conveys contextual information about the sentence being modeled and is obtained by performing latent Dirichlet allocation on a block of preceding text.
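A sketch of feeding a contextual vector alongside each word, assuming PyTorch; the LDA step is taken as given (offline), so topic_vec below is a random placeholder, not a real topic posterior.

import torch
import torch.nn as nn

emb = nn.Embedding(10000, 128)
rnn = nn.RNN(128 + 20, 256, batch_first=True)     # input = word emb + topic vec

word_ids = torch.randint(0, 10000, (1, 5))        # (batch, seq)
topic_vec = torch.rand(1, 20)                     # placeholder topic proportions
ctx = topic_vec.unsqueeze(1).expand(1, 5, 20)     # repeat for every time step
out, _ = rnn(torch.cat([emb(word_ids), ctx], dim=2))
print(out.shape)                                  # torch.Size([1, 5, 256])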
Co-learning of Word Representations and Morpheme Representations
TLDR
This paper introduces the morphological knowledge as both additional input representation and auxiliary supervision to the neural network framework and will produce morpheme representations, which can be further employed to infer the representations of rare or unknown words based on their morphological structure.
Better Word Representations with Recursive Neural Networks for Morphology
TLDR
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
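A toy left-to-right recursive composition of morpheme vectors with one shared weight matrix, vaguely in the spirit of the summary above; the class and dimensions are assumptions, not the paper's model.

import torch
import torch.nn as nn

class MorphemeComposer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.combine = nn.Linear(2 * dim, dim)

    def forward(self, morpheme_vecs):     # non-empty list of (dim,) tensors
        v = morpheme_vecs[0]
        for m in morpheme_vecs[1:]:
            v = torch.tanh(self.combine(torch.cat([v, m])))
        return v                          # composed word representation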
A Convolutional Neural Network for Modelling Sentences
TLDR
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
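The DCNN's characteristic pooling operation is k-max pooling, which keeps the k largest activations per feature map in their original order; a sketch, assuming PyTorch (the dynamic, depth-dependent choice of k is omitted).

import torch

def kmax_pool(x, k):
    # x: (batch, channels, time); keep the k largest values per channel,
    # preserving their original left-to-right order
    idx = x.topk(k, dim=2).indices.sort(dim=2).values
    return x.gather(2, idx)

print(kmax_pool(torch.randn(2, 3, 10), k=4).shape)  # torch.Size([2, 3, 4])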
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
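The source-reversal trick amounts to a one-line preprocessing step: reverse the source token sequence, leave the target untouched. A sketch:

def reverse_source(pairs):
    # reverse only the source side of each (source, target) pair
    return [(src[::-1], tgt) for src, tgt in pairs]

pairs = [(["the", "cat", "sat"], ["le", "chat", "assis"])]
print(reverse_source(pairs))
# [(['sat', 'cat', 'the'], ['le', 'chat', 'assis'])]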