Character-level Convolutional Networks for Text Classification
@article{Zhang2015CharacterlevelCN, title={Character-level Convolutional Networks for Text Classification}, author={Xiang Zhang and Junbo Jake Zhao and Yann LeCun}, journal={ArXiv}, year={2015}, volume={abs/1509.01626} }
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
3,372 Citations
Character-Level Attention Convolutional Neural Networks for Short-Text Classification
- Computer Science, HCC
- 2019
Experimental results show that the ACNN model significantly improves short-text classification results; it is compared with traditional models such as LSTM and CNN.
Character-Level neural networks for short text classification
- Computer Science, 2017 International Smart Cities Conference (ISC2)
- 2017
The evaluations showed that the proposed model outperforms the standard CNN and traditional models on the short-text classification task; it includes the highway networks framework to address the difficulty of training and improve classification accuracy.
Do Convolutional Networks need to be Deep for Text Classification ?
- Computer Science, AAAI Workshops
- 2018
This work shows on 5 standard text classification and sentiment analysis tasks that deep models indeed give better performances than shallow networks when the text input is represented as a sequence of characters, but a simple shallow-and-wide network outperforms deep models such as DenseNet with word inputs.
Character-level Convolutional Network for Text Classification Applied to Chinese Corpus
- Computer Science, ArXiv
- 2016
A large-scale Chinese language dataset is constructed, and the results show that a character-level convolutional neural network works better on the Chinese corpus than on its corresponding pinyin-format dataset.
A Character-Level Method for Text Classification
- Computer Science, 2018 2nd IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC)
- 2018
A language model that mixes a CNN with a bi-RNN (bi-directional recurrent neural network) is proposed to classify text at the character level; it performs better than the common CNN and LSTM (long short-term memory) classification methods.
Hierarchical Convolutional Attention Networks for Text Classification
- Computer Science, Rep4NLP@ACL
- 2018
The method, named Hierarchical Convolutional Attention Networks, surpasses the accuracy of the current state of the art on several classification tasks while being twice as fast to train.
Experiments on Character and Word Level Features for Text Classification Using Deep Neural Network
- Computer Science, 2018 Third International Conference on Informatics and Computing (ICIC)
- 2018
CNN, Bi-RNN, and the combination of both are tested with character-level and word-level features for text classification on English and Indonesian social media datasets; on small datasets, the word-level model outperformed the character-level models.
Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks
- Computer Science, ICAART
- 2017
It is confirmed that the ConvNets extract meaningful representations from both an English corpus and a Japanese corpus, and reuse of the representations learned by the convolutional neural network from a large-scale dataset is attempted in the form of transfer learning.
Character-level text classification via convolutional neural network and gated recurrent unit
- Computer Science, Int. J. Mach. Learn. Cybern.
- 2020
This work introduces fully convolutional layers to substantially reduce the number of parameters in the text classification model and incorporates error minimization extreme learning machine into the proposed model to improve the classification accuracy further.
Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers
- Computer Science, ArXiv
- 2016
A neural network architecture that utilizes both convolution and recurrent layers to efficiently encode character inputs is proposed and validated on eight large scale document classification tasks and compared with character-level convolution-only models.
References
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
- Computer Science, NAACL
- 2015
In addition to a straightforward adaptation of CNN from image to text, a simple but new variation that employs bag-of-word conversion in the convolution layer is proposed, and an extension to combine multiple convolution layers is explored for higher accuracy.
Convolutional Neural Networks for Sentence Classification
- Computer Science, EMNLP
- 2014
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification; a simple modification to the architecture is proposed to allow for the use of both task-specific and static vectors.
Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts
- Computer Science, COLING
- 2014
A new deep convolutional neural network is proposed that exploits character- to sentence-level information to perform sentiment analysis of short texts and achieves state-of-the-art results for single-sentence sentiment prediction.
Learning Character-level Representations for Part-of-Speech Tagging
- Computer Science, ICML
- 2014
A deep neural network is proposed that learns character-level representations of words and associates them with usual word representations to perform POS tagging, producing state-of-the-art POS taggers for two languages.
Natural Language Processing (Almost) from Scratch
- Computer Science, J. Mach. Learn. Res.
- 2011
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity…
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
- Computer Science, CIKM
- 2014
A new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents is proposed.
In Defense of Word Embedding for Generic Text Representation
- Computer Science, NLDB
- 2015
It is shown that by augmenting the word2vec representation with one of a few pooling techniques, results are obtained surpassing or comparable with the best literature algorithms.
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- Computer Science, Neural Networks
- 2005
Gradient-based learning applied to document recognition
- Computer Science, Proc. IEEE
- 1998
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
LSTM: A Search Space Odyssey
- Computer Science, IEEE Transactions on Neural Networks and Learning Systems
- 2017
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. It observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.