Corpus ID: 351666

Natural Language Processing (Almost) from Scratch

@article{Collobert2011NaturalLP,
  title={Natural Language Processing (Almost) from Scratch},
  author={Ronan Collobert and J. Weston and L. Bottou and Michael Karlen and K. Kavukcuoglu and P. Kuksa},
  journal={J. Mach. Learn. Res.},
  year={2011},
  volume={12},
  pages={2493-2537}
}
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis…
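As a rough illustration of the window-approach tagger the abstract describes (not the authors' actual SENNA system), the sketch below stacks a lookup-table (embedding) layer, a linear hidden layer with a HardTanh nonlinearity, and a per-word tag classifier; the class name, hyperparameters, and toy data are illustrative assumptions.

```python
# Minimal sketch of a window-based neural tagger in the spirit of
# Collobert et al. (2011); names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class WindowTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=50, window=5, hidden=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)    # lookup-table layer
        self.hidden = nn.Linear(window * emb_dim, hidden) # concatenated window
        self.act = nn.Hardtanh()                          # HardTanh nonlinearity
        self.out = nn.Linear(hidden, n_tags)              # one score per tag

    def forward(self, word_windows):
        # word_windows: (batch, window) indices of a word and its context
        e = self.embed(word_windows).flatten(start_dim=1)
        return self.out(self.act(self.hidden(e)))

# Toy usage: score the tags of one 5-word window around a target word.
model = WindowTagger(vocab_size=10000, n_tags=45)
scores = model(torch.randint(0, 10000, (1, 5)))
loss = nn.functional.cross_entropy(scores, torch.tensor([3]))
```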
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
TLDR: It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
Bidirectional Recursive Neural Networks for Token-Level Labeling with Structure
TLDR: This work proposes a novel architecture that aims to capture the structural information around an input and use it to label instances, applying it to the task of opinion expression extraction.
Semi-supervised sequence tagging with bidirectional language models
TLDR: A general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems is presented and applied to sequence labeling tasks, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task-specific gazetteers.
Training neural word embeddings for transfer learning and translation
TLDR: This dissertation hypothesises that neural word embeddings, i.e. representations that use continuous values to represent words in a learned vector space of meaning, are a suitable and efficient approach for learning representations of natural languages that are useful for predicting various aspects related to their meaning, and presents several contributions which make inducing word representations faster and applicable to monolingual and various cross-lingual prediction tasks.
Multilingual POS tagging by a composite deep architecture based on character-level features and on-the-fly enriched Word Embeddings
TLDR: A POS tagging system is proposed, based on a deep neural network made of a static, task-independent pre-trained model for representing word semantics enriched by morphological information, obtained by approximating the Word Embedding representation learned from an unlabelled corpus by the fastText model.
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
  • Gábor Berend
  • Computer Science
  • Transactions of the Association for Computational Linguistics
  • 2017
TLDR: A sequence labeling framework is presented which solely utilizes sparse indicator features derived from dense distributed word representations, obtaining (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages.
Phrase Representations for Multiword Expressions
TLDR: A model that takes advantage of dense word representations to perform phrase tagging by directly identifying and classifying phrases is introduced, and it is shown to outperform the state-of-the-art model for this task.
Neural Networks Architecture for Amazigh POS Tagging
TLDR: Instead of extracting from the sentence a rich set of hand-crafted features which are then fed to a standard classification algorithm, this work drew its inspiration from recent papers about the automatic extraction of word embeddings from large unlabelled data sets to improve the performance of the Amazigh POS tagging system.
Semi-supervised Multitask Learning for Sequence Labeling
TLDR: A sequence labeling framework is presented with a secondary training objective of learning to predict surrounding words for every word in the dataset, which incentivises the system to learn general-purpose patterns of semantic and syntactic composition that are useful for improving accuracy on different sequence labeling tasks.
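As a rough sketch of the idea summarized above, the code below adds an auxiliary word-prediction head to a BiLSTM tagger and combines the two losses. It is an assumed, simplified architecture: in the cited paper, forward and backward states predict the next and previous words separately, which this sketch does not reproduce.

```python
# Sketch of sequence labeling with a secondary word-prediction objective;
# architecture details and hyperparameters are assumptions.
import torch
import torch.nn as nn

class MultitaskTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=100, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.tag_head = nn.Linear(2 * hidden, n_tags)     # primary: tag each token
        self.lm_head = nn.Linear(2 * hidden, vocab_size)  # secondary: predict words

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.tag_head(h), self.lm_head(h)

def joint_loss(tag_logits, lm_logits, tags, next_words, lm_weight=0.1):
    # Total loss = tagging loss + weighted auxiliary word-prediction loss.
    ce = nn.functional.cross_entropy
    return ce(tag_logits.transpose(1, 2), tags) + lm_weight * ce(lm_logits.transpose(1, 2), next_words)

# Toy usage on random data.
model = MultitaskTagger(vocab_size=5000, n_tags=17)
tokens = torch.randint(0, 5000, (2, 8))
tag_logits, lm_logits = model(tokens)
loss = joint_loss(tag_logits, lm_logits,
                  torch.randint(0, 17, (2, 8)), torch.randint(0, 5000, (2, 8)))
```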
A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding
TLDR: This work proposes to use a BLSTM-RNN for a unified tagging solution that can be applied to various tagging tasks including part-of-speech tagging, chunking and named entity recognition, requiring no task-specific knowledge or sophisticated feature engineering.

References

Showing 1-10 of 107 references
Early results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons
TLDR: This work has shown that conditionally-trained models, such as conditional maximum entropy models, handle inter-dependent features of greedy sequence modeling in NLP well.
Semi-Supervised Learning for Natural Language
TLDR: This thesis focuses on two segmentation tasks, named-entity recognition and Chinese word segmentation, and shows that features derived from unlabeled data substantially improve performance, both in terms of reducing the amount of labeled data needed to achieve a certain performance level and in terms of reducing the error using a fixed amount of labeled data.
Deep Learning for Efficient Discriminative Parsing
We propose a new fast purely discriminative algorithm for natural language parsing, based on a “deep” recurrent convolutional graph transformer network (GTN). Assuming a decomposition of a parse tree…
Semi-Supervised Sequential Labeling and Segmentation Using Giga-Word Scale Unlabeled Data
TLDR: Evidence is provided that the use of more unlabeled data in semi-supervised learning can improve the performance of Natural Language Processing tasks such as part-of-speech tagging, syntactic chunking, and named entity recognition.
Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling
TLDR: It is demonstrated that distributional representations of word types, trained on unannotated text, can be used to improve performance on rare words and reduce the sample complexity of sequence labeling.
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
TLDR: A new part-of-speech tagger is presented that demonstrates the following ideas: explicit use of both preceding and following tag contexts via a dependency network representation, broad use of lexical features, and effective use of priors in conditional log-linear models.
Word Representations: A Simple and General Method for Semi-Supervised Learning
TLDR: This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.
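To make the kind of evaluation described above concrete, the hypothetical feature extractor below augments standard surface features of a token with Brown-cluster bit-string prefixes and dense embedding dimensions, which could then be fed to an off-the-shelf NER or chunking baseline; the function name, resources, and values are made up for illustration.

```python
# Sketch of adding word-representation features to a baseline sequence labeler;
# all names and data below are hypothetical.
def token_features(sentence, i, brown_clusters, embeddings):
    """Return a feature dict for token i, combining surface features with
    Brown-cluster bit-string prefixes and dense embedding dimensions."""
    word = sentence[i]
    feats = {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "prev.word": sentence[i - 1].lower() if i > 0 else "<BOS>",
    }
    # Brown clusters: use prefixes of the hierarchical bit string as features.
    bits = brown_clusters.get(word.lower(), "")
    for p in (4, 6, 10):
        feats[f"brown.prefix{p}"] = bits[:p]
    # Dense embeddings: expose each dimension as a real-valued feature.
    for d, v in enumerate(embeddings.get(word.lower(), [])):
        feats[f"emb.{d}"] = float(v)
    return feats

# Toy usage with made-up resources.
clusters = {"london": "0010110110", "in": "110"}
vectors = {"london": [0.12, -0.40, 0.33]}
print(token_features(["Protest", "in", "London"], 2, clusters, vectors))
```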
Joint Parsing and Semantic Role Labeling
TLDR: This paper jointly performs parsing and semantic role labeling, using a probabilistic SRL system to rerank the results of a probabilistic parser, because a locally-trained SRL model can return inaccurate probability estimates.
Shallow Semantic Parsing using Support Vector Machines
TLDR: A machine learning algorithm for shallow semantic parsing based on Support Vector Machines is presented, which shows performance improvements through a number of new features and the ability to generalize to a new test set drawn from the AQUAINT corpus.
Simple Semi-supervised Dependency Parsing
TLDR: This work focuses on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus, and shows that the cluster-based features yield substantial gains in performance across a wide range of conditions.