The Importance of Subword Embeddings in Sentence Pair Modeling
@inproceedings{Lan2018THEIO, title={The Importance of Subword Embeddings in Sentence Pair Modeling}, author={Wuwei Lan and Wei Xu}, booktitle={NAACL}, year={2018} }
Sentence pair modeling is critical for many NLP tasks, such as paraphrase identification, semantic textual similarity, and natural language inference. Most state-of-the-art neural models for these tasks rely on pretrained word embeddings and compose sentence-level semantics in varied ways; however, few works have attempted to verify whether we really need pretrained embeddings in these tasks. In this paper, we study how effective subword-level (character and character n-gram) representations are…
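To make the subword idea concrete, here is a minimal sketch (not the authors' code) of composing a word vector from hashed character n-grams, in the spirit of fastText/charagram-style subword embeddings; the bucket count, dimensions, and hashing scheme are illustrative assumptions.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
NUM_BUCKETS, DIM = 100_000, 50
# One (randomly initialized) vector per hash bucket; training would tune these.
ngram_table = rng.normal(scale=0.1, size=(NUM_BUCKETS, DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams from a word padded with boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """Average the hashed n-gram vectors; even OOV words get a vector."""
    idx = [zlib.crc32(g.encode()) % NUM_BUCKETS for g in char_ngrams(word)]
    return ngram_table[idx].mean(axis=0)

print(word_vector("paraphrase").shape)    # (50,)
print(word_vector("paraphrasing").shape)  # shares many n-grams with "paraphrase"
```

Because the vector is built from n-grams rather than looked up whole, morphologically related or misspelled words land near each other, which is what makes subword representations attractive for noisy sentence pair data.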
2 Citations
Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering
- Computer Science · COLING · 2018
It is shown that encoding contextual information with LSTMs and modeling inter-sentence interactions are critical; the Enhanced Sequential Inference Model performs best on larger datasets, while the Pairwise Word Interaction Model achieves the best performance when less data is available.
Co-Stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences
- Computer Science · EMNLP · 2018
This paper proposes Co-Stack Residual Affinity Networks (CSRAN), a new and universal neural architecture for matching text sequences, and introduces a new bidirectional alignment mechanism that learns affinity weights by fusing sequence pairs across stacked hierarchies.
References
Inter-Weighted Alignment Network for Sentence Pair Modeling
- Computer Science · EMNLP · 2017
A model that measures the similarity of a sentence pair by focusing on interaction information is proposed; a word-level similarity matrix is used to discover fine-grained alignments between the two sentences.
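As a rough illustration of such a word-level similarity matrix, here is a small numpy sketch with made-up embeddings; the model's inter-weighting and the rest of its architecture are not reproduced.

```python
import numpy as np

def similarity_matrix(A, B):
    """Cosine similarity between every word pair of two sentences.
    A: (len_a, dim) and B: (len_b, dim) are word embedding matrices."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T  # (len_a, len_b)

rng = np.random.default_rng(1)
sent_a, sent_b = rng.normal(size=(5, 50)), rng.normal(size=(7, 50))
S = similarity_matrix(sent_a, sent_b)
alignment = S.argmax(axis=1)  # for each word of sentence A, its closest word in B
print(S.shape, alignment)
```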
Towards Universal Paraphrastic Sentence Embeddings
- Computer Science · ICLR · 2016
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
- Computer Science · Transactions of the Association for Computational Linguistics · 2016
This work presents a general Attention-Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs, so that the representation of each sentence takes its counterpart into consideration.
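A toy version of the attention matrix ABCNN places between two sentences' feature maps is sketched below, using a 1/(1 + euclidean distance) match score as in the paper; the feature maps here are random stand-ins rather than CNN outputs.

```python
import numpy as np

def attention_matrix(F1, F2):
    """A[i, j] = 1 / (1 + ||F1[i] - F2[j]||), higher when positions match."""
    diff = F1[:, None, :] - F2[None, :, :]   # (len1, len2, dim)
    return 1.0 / (1.0 + np.linalg.norm(diff, axis=-1))

rng = np.random.default_rng(2)
F1, F2 = rng.normal(size=(6, 30)), rng.normal(size=(8, 30))
A = attention_matrix(F1, F2)
print(A.shape)  # (6, 8); row/column sums can re-weight each sentence's convolution input
```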
From Paraphrase Database to Compositional Paraphrase Model and Back
- Computer Science · Transactions of the Association for Computational Linguistics · 2015
This work proposes models that leverage the phrase pairs from the Paraphrase Database (PPDB) to build parametric paraphrase models, scoring paraphrase pairs more accurately than PPDB's internal scores while simultaneously improving its coverage.
Boosting Named Entity Recognition with Neural Character Embeddings
- Computer Science · NEWS@ACL · 2015
This work proposes a language-independent NER system that uses only automatically learned features, and demonstrates that the same neural network that has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and no handcrafted features.
Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss
- Computer Science · ACL · 2016
This work presents a novel bi-LSTM model that combines the POS tagging loss function with an auxiliary loss function accounting for rare words; it obtains state-of-the-art performance across 22 languages and works especially well for morphologically complex languages.
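Purely to illustrate the multi-task idea, the toy sketch below sums a POS tagging loss with an auxiliary loss over hypothetical word-frequency bins, so rare words contribute an extra training signal; the shapes, bins, and logits are made up, not the paper's setup.

```python
import numpy as np

def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    z = logits - logits.max()
    return -(z - np.log(np.exp(z).sum()))[target]

pos_logits = np.array([2.0, 0.1, -1.0])  # scores over 3 POS tags
freq_logits = np.array([0.3, 1.2])       # scores over 2 frequency bins (auxiliary task)
loss = cross_entropy(pos_logits, 0) + cross_entropy(freq_logits, 1)
print(round(float(loss), 3))  # one joint loss drives both predictions
```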
Character-Aware Neural Language Models
- Computer Science · AAAI · 2016
A simple neural language model that relies only on character-level inputs is shown to encode both semantic and orthographic information from characters alone, suggesting that for many languages character inputs are sufficient for language modeling.
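The word encoder at the heart of such character-aware models can be sketched in a few lines: embed characters, run narrow 1-D convolutions, and max-pool over time. The dimensions, filter width, and byte-based character lookup below are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(3)
CHAR_DIM, N_FILTERS, WIDTH = 15, 25, 3
char_emb = rng.normal(size=(128, CHAR_DIM))              # one vector per ASCII code
filters = rng.normal(size=(N_FILTERS, WIDTH, CHAR_DIM))  # convolution filters

def word_encoding(word):
    """Max-over-time pooling of 1-D convolutions over character embeddings."""
    X = char_emb[[ord(c) % 128 for c in "^" + word + "$"]]  # (T, CHAR_DIM)
    T = len(X)
    conv = np.array([[np.sum(X[t:t + WIDTH] * f) for t in range(T - WIDTH + 1)]
                     for f in filters])                     # (N_FILTERS, T-WIDTH+1)
    return np.tanh(conv).max(axis=1)                        # (N_FILTERS,)

print(word_encoding("morphology").shape)  # (25,): a word vector built from characters
```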
Neural Paraphrase Identification of Questions with Noisy Pretraining
- Computer Science · SWCN@EMNLP · 2017
A variant of the decomposable attention model is shown to result in accurate performance on the problem of paraphrase identification of questions, while being far simpler than many competing neural architectures.
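The attend-compare-aggregate pattern behind the decomposable attention model fits in a short numpy sketch; the feed-forward networks F, G, H of the paper are collapsed into single random projections here, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
DIM, HID = 50, 20
Wf = rng.normal(size=(DIM, HID)) / np.sqrt(DIM)          # stand-in for network F
Wg = rng.normal(size=(2 * DIM, HID)) / np.sqrt(2 * DIM)  # stand-in for network G

def softmax(x, axis):
    z = x - x.max(axis=axis, keepdims=True)
    return np.exp(z) / np.exp(z).sum(axis=axis, keepdims=True)

def decomposable_attention(A, B):
    """A: (la, DIM), B: (lb, DIM) word embeddings of the two questions."""
    E = (A @ Wf) @ (B @ Wf).T              # attend: alignment scores
    beta = softmax(E, axis=1) @ B          # soft alignment of B to each word of A
    alpha = softmax(E, axis=0).T @ A       # soft alignment of A to each word of B
    v1 = np.tanh(np.concatenate([A, beta], axis=1) @ Wg).sum(axis=0)  # compare + aggregate
    v2 = np.tanh(np.concatenate([B, alpha], axis=1) @ Wg).sum(axis=0)
    return np.concatenate([v1, v2])        # features for a final classifier (network H)

A, B = rng.normal(size=(5, DIM)), rng.normal(size=(7, DIM))
print(decomposable_attention(A, B).shape)  # (40,)
```

Note there is no recurrence: every step is an independent projection plus a soft alignment, which is what keeps the model far simpler than its competitors.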
From Characters to Words to in Between: Do We Capture Morphology?
- Linguistics · ACL · 2017
None of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.
Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation
- Computer Science · EMNLP · 2015
A model for constructing vector representations of words by composing characters using bidirectional LSTMs; it requires only a single vector per character type and a fixed set of parameters for the compositional model, and yields state-of-the-art results in language modeling and part-of-speech tagging.
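A condensed numpy sketch of this compositional idea: run an LSTM over a word's character embeddings in both directions and concatenate the final states into one word vector. The weights are random stand-ins (and shared across directions for brevity, unlike the real model); the gate arithmetic follows the standard LSTM equations.

```python
import numpy as np

rng = np.random.default_rng(5)
CHAR_DIM, HID = 10, 16
char_emb = rng.normal(size=(128, CHAR_DIM))            # one vector per character type
W = rng.normal(size=(4 * HID, CHAR_DIM + HID)) * 0.1   # all four gates, stacked
b = np.zeros(4 * HID)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_final_state(xs):
    """Run a plain LSTM over a sequence of vectors; return the last hidden state."""
    h, c = np.zeros(HID), np.zeros(HID)
    for x in xs:
        i, f, o, g = np.split(W @ np.concatenate([x, h]) + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def c2w(word):
    X = char_emb[[ord(ch) % 128 for ch in word]]
    return np.concatenate([lstm_final_state(X), lstm_final_state(X[::-1])])

print(c2w("cats").shape)  # (32,): a single word vector composed from characters
```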