Corpus ID: 368182

Character-level Convolutional Networks for Text Classification

@article{Zhang2015CharacterlevelCN,
  title={Character-level Convolutional Networks for Text Classification},
  author={Xiang Zhang and Junbo Jake Zhao and Yann LeCun},
  journal={ArXiv},
  year={2015},
  volume={abs/1509.01626}
}
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks. 
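As a rough illustration of the approach the abstract describes, the sketch below quantizes raw text into one-hot character vectors and feeds them to a stack of 1-D convolutions followed by fully connected layers, loosely following the "small" configuration reported in the paper (roughly 70-character alphabet, 1014-character input, six convolutional and three fully connected layers). It is a minimal PyTorch approximation for orientation only, not the authors' reference implementation; details such as the reversed character order and weight initialization are omitted.

# Minimal sketch of a character-level ConvNet in the spirit of the paper's
# "small" model. Hyperparameters approximate the paper's description.
import torch
import torch.nn as nn

# Lowercase letters, digits, punctuation and newline, as in the paper's alphabet.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}\n"
MAX_LEN = 1014  # fixed input length used in the paper


def quantize(text: str) -> torch.Tensor:
    """One-hot encode a string into a (len(ALPHABET), MAX_LEN) tensor.

    Characters outside the alphabet, and positions past MAX_LEN, stay all-zero,
    mirroring the paper's quantization scheme.
    """
    x = torch.zeros(len(ALPHABET), MAX_LEN)
    for i, ch in enumerate(text.lower()[:MAX_LEN]):
        j = ALPHABET.find(ch)
        if j >= 0:
            x[j, i] = 1.0
    return x


class CharCNN(nn.Module):
    def __init__(self, num_classes: int, num_features: int = 256):
        super().__init__()
        # Six 1-D convolutional layers; width-3 max-pooling after layers 1, 2 and 6.
        self.conv = nn.Sequential(
            nn.Conv1d(len(ALPHABET), num_features, kernel_size=7), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(num_features, num_features, kernel_size=7), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(),
            nn.Conv1d(num_features, num_features, kernel_size=3), nn.ReLU(), nn.MaxPool1d(3),
        )
        # With MAX_LEN = 1014 the convolutional stack leaves 34 positions per feature map.
        self.fc = nn.Sequential(
            nn.Linear(34 * num_features, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, alphabet_size, MAX_LEN)
        h = self.conv(x)
        return self.fc(h.flatten(1))


if __name__ == "__main__":
    model = CharCNN(num_classes=4)  # e.g. a 4-class task such as AG's News
    batch = torch.stack([quantize("character-level convnets read raw text")])
    print(model(batch).shape)  # torch.Size([1, 4])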

Citations

Character-Level Attention Convolutional Neural Networks for Short-Text Classification
TLDR
The experimental results show that the ACNN model significantly improves short-text classification results compared with traditional models such as LSTM and CNN.
Character-Level neural networks for short text classification
TLDR
The evaluations showed that the proposed model outperforms the standard CNN and traditional models on the short-text classification task, and that incorporating the highway networks framework eases training and further improves classification accuracy.
Do Convolutional Networks need to be Deep for Text Classification ?
TLDR
This work shows on 5 standard text classification and sentiment analysis tasks that deep models indeed give better performance than shallow networks when the text input is represented as a sequence of characters, but a simple shallow-and-wide network outperforms deep models such as DenseNet with word inputs.
Character-level Convolutional Network for Text Classification Applied to Chinese Corpus
TLDR
A large-scale Chinese-language dataset is constructed, and the results show that a character-level convolutional neural network works better on the Chinese corpus than on its corresponding pinyin-format dataset.
A Character-Level Method for Text Classification
TLDR
A language model mixing a CNN with a bi-RNN (bidirectional recurrent neural network) classifies text at the character level and performs better than common CNN and LSTM (long short-term memory) classification methods.
Hierarchical Convolutional Attention Networks for Text Classification
TLDR
The method, named Hierarchical Convolutional Attention Networks, is demonstrated to surpass the accuracy of the current state of the art on several classification tasks while being twice as fast to train.
Experiments on Character and Word Level Features for Text Classification Using Deep Neural Network
TLDR
CNN, Bi-RNN, and their combination are tested with character-level and word-level features for text classification on English and Indonesian social media datasets; on small datasets, word-level models outperformed character-level models.
Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks
TLDR
It is confirmed that meaningful representations are extracted by the ConvNets on both an English and a Japanese corpus, and the representations learned by the convolutional neural network from a large-scale dataset are reused in the form of transfer learning.
Character-level text classification via convolutional neural network and gated recurrent unit
TLDR
This work introduces fully convolutional layers to substantially reduce the number of parameters in the text classification model and incorporates an error-minimization extreme learning machine into the proposed model to further improve classification accuracy.
Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers
TLDR
A neural network architecture that utilizes both convolution and recurrent layers to efficiently encode character inputs is proposed, validated on eight large-scale document classification tasks, and compared with character-level convolution-only models.
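The citation above pairs convolutional and recurrent layers to encode characters. The sketch below illustrates that general idea: a couple of character-level convolutions with pooling shorten the sequence, and a GRU reads the reduced sequence so that only its final hidden state feeds the classifier. All layer sizes here are illustrative assumptions, not that paper's actual configuration.

# Rough sketch of a convolution-then-recurrent character encoder.
import torch
import torch.nn as nn


class ConvGRUClassifier(nn.Module):
    """Illustrative encoder: convolutions shorten the character sequence, a GRU encodes it."""

    def __init__(self, alphabet_size: int, num_classes: int,
                 channels: int = 128, hidden: int = 128):
        super().__init__()
        # Two convolution/pooling stages reduce the sequence length 4x
        # before the recurrent layer, keeping the GRU cheap to run.
        self.conv = nn.Sequential(
            nn.Conv1d(alphabet_size, channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(channels, channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.gru = nn.GRU(channels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, alphabet_size, seq_len) one-hot encoded characters
        h = self.conv(x)                   # (batch, channels, seq_len // 4)
        h = h.transpose(1, 2)              # (batch, seq_len // 4, channels)
        _, last = self.gru(h)              # final hidden state: (1, batch, hidden)
        return self.out(last.squeeze(0))   # (batch, num_classes)


if __name__ == "__main__":
    model = ConvGRUClassifier(alphabet_size=70, num_classes=4)
    scores = model(torch.zeros(2, 70, 256))  # dummy batch: 2 sequences of 256 characters
    print(scores.shape)                      # torch.Size([2, 4])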

References

Showing 1-10 of 35 references
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
TLDR
A straightforward adaptation of CNN from image to text is proposed: a simple but new variation that employs bag-of-word conversion in the convolution layer, and an extension combining multiple convolution layers is explored for higher accuracy.
Convolutional Neural Networks for Sentence Classification
TLDR
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification, and a modification to the architecture is proposed that allows the use of both task-specific and static vectors.
Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts
TLDR
A new deep convolutional neural network is proposed that exploits character- to sentence-level information to perform sentiment analysis of short texts and achieves state-of-the-art results for single-sentence sentiment prediction.
Learning Character-level Representations for Part-of-Speech Tagging
TLDR
A deep neural network is proposed that learns character-level representations of words and associates them with usual word representations to perform POS tagging, producing state-of-the-art POS taggers for two languages.
Natural Language Processing (Almost) from Scratch
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
TLDR
A new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents is proposed.
In Defense of Word Embedding for Generic Text Representation
TLDR
It is shown that by augmenting the word2vec representation with one of a few pooling techniques, results are obtained that surpass or are comparable to the best algorithms in the literature.
Gradient-based learning applied to document recognition
TLDR
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
LSTM: A Search Space Odyssey
TLDR
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling; it observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.