A unified architecture for natural language processing: deep neural networks with multitask learning


We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words, and the likelihood that the sentence makes sense (grammatically and semantically) using a language model. The entire network is trained jointly on all these tasks using weight-sharing, an instance of multitask learning. All the tasks use labeled data except the language model, which is learnt from unlabeled text and represents a novel form of semi-supervised learning for the shared tasks. We show how both multitask learning and semi-supervised learning improve the generalization of the shared tasks, resulting in state-of-the-art performance.
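
As a rough illustration of the weight-sharing idea described above, the sketch below trains two toy tagging tasks that read from one shared word-embedding table while keeping separate output layers. It is not the paper's convolutional architecture. It assumes a four-word vocabulary, invented label sets, a per-word linear softmax classifier, and a training loop that picks a task and an example at random on every step. All names (`shared_embeddings`, `heads`, `sgd_step`, `train`, `total_loss`) are hypothetical and chosen only for this example.

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "paris"]
DIM = 4  # embedding size (arbitrary choice for this sketch)

# Shared lookup table: one embedding vector per word, read by every task.
shared_embeddings = {
    w: [random.uniform(-0.5, 0.5) for _ in range(DIM)] for w in VOCAB
}

# Toy tasks with invented labels; each task has its own labeled examples.
TASKS = {
    "pos": {"labels": ["DET", "NOUN", "VERB"],
            "data": [("the", "DET"), ("cat", "NOUN"),
                     ("sat", "VERB"), ("paris", "NOUN")]},
    "ner": {"labels": ["O", "LOC"],
            "data": [("the", "O"), ("cat", "O"),
                     ("sat", "O"), ("paris", "LOC")]},
}

# Task-specific output layers: one weight row per label, not shared.
heads = {
    name: [[random.uniform(-0.5, 0.5) for _ in range(DIM)]
           for _ in task["labels"]]
    for name, task in TASKS.items()
}


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probs(task_name, word):
    x = shared_embeddings[word]
    scores = [sum(w_i * x_i for w_i, x_i in zip(row, x))
              for row in heads[task_name]]
    return softmax(scores)


def sgd_step(task_name, word, label, lr=0.1):
    """One cross-entropy gradient step on a task head and the shared embedding."""
    labels = TASKS[task_name]["labels"]
    probs = predict_probs(task_name, word)
    x = shared_embeddings[word]
    head = heads[task_name]
    # Gradient of the loss w.r.t. each score: p_k - 1[k is the gold label].
    grads = [p - (1.0 if lab == label else 0.0) for p, lab in zip(probs, labels)]
    # Gradient w.r.t. the shared embedding, computed before the head changes.
    grad_x = [sum(g * head[k][i] for k, g in enumerate(grads))
              for i in range(DIM)]
    for k, g in enumerate(grads):
        for i in range(DIM):
            head[k][i] -= lr * g * x[i]
    for i in range(DIM):
        x[i] -= lr * grad_x[i]  # the update every other task will also see


def train(steps=5000):
    task_names = sorted(TASKS)
    for _ in range(steps):
        task_name = random.choice(task_names)                   # pick a task
        word, label = random.choice(TASKS[task_name]["data"])   # pick an example
        sgd_step(task_name, word, label)


def total_loss():
    """Summed cross-entropy over every example of every task."""
    return sum(
        -math.log(predict_probs(t, w)[TASKS[t]["labels"].index(lab)])
        for t in TASKS for w, lab in TASKS[t]["data"]
    )


if __name__ == "__main__":
    print("loss before training:", total_loss())
    train()
    print("loss after training:", total_loss())
```

Because both tasks write their gradients into the same `shared_embeddings` entries, whatever one task learns about a word is visible to the other. In this sketch, an additional task trained on unlabeled text could in principle be added as another entry in `TASKS` that updates the same table.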

DOI: 10.1145/1390156.1390177
