Regularizing Text Categorization with Clusters of Words

@inproceedings{Skianis2016RegularizingTC,
  title={Regularizing Text Categorization with Clusters of Words},
  author={Konstantinos Skianis and F. Rousseau and M. Vazirgiannis},
  booktitle={EMNLP},
  year={2016}
}
Regularization is a critical step in supervised learning, not only to address overfitting but also to take into account any prior knowledge we may have on the features and their dependence. In this paper, we explore state-of-the-art structured regularizers and we propose novel ones based on clusters of words from LSI topics, word2vec embeddings and graph-of-words document representation. We show that our proposed regularizers are faster than the state-of-the-art ones and still improve text…
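As a rough illustration of what a regularizer built on clusters of words might look like, the sketch below computes a group-lasso-style penalty in which each cluster forms one group of feature weights. The abstract does not spell out the exact penalty, so the group-lasso form, the function name `group_lasso_penalty`, the `lam` parameter, and the toy weights and clusters are all assumptions made for illustration, not the authors' implementation.

```python
import math


def group_lasso_penalty(weights, groups, lam=1.0):
    """Return lam * sum over groups of sqrt(|g|) * ||w_g||_2.

    weights: dict mapping a word to its model weight (hypothetical values).
    groups:  list of word clusters, each a list of words (hypothetical clusters).
    """
    total = 0.0
    for group in groups:
        # Euclidean norm of the weights that belong to this cluster.
        norm = math.sqrt(sum(weights[word] ** 2 for word in group))
        # Scale by sqrt of the group size so larger clusters are not favored.
        total += lam * math.sqrt(len(group)) * norm
    return total


# Toy example: two clusters, one with nonzero weights and one zeroed out.
weights = {"goal": 3.0, "match": 4.0, "stock": 0.0, "market": 0.0}
clusters = [["goal", "match"], ["stock", "market"]]
penalty = group_lasso_penalty(weights, clusters, lam=0.5)
print(penalty)
```

Because the norm is taken per cluster, a penalty of this kind tends to push whole clusters of words to zero together, which is one way prior knowledge about feature dependence can enter the model.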
9 Citations
Boosting Tricks for Word Mover's Distance
Graph of Words: Boosting Text Mining Tasks with Graphs
  • 3
Graph Convolutional Networks for Text Classification
  • 349
Orthogonal Matching Pursuit for Text Classification
  • 2
Fusing Global Domain Information and Local Semantic Information to Classify Financial Documents

References

SHOWING 1-10 OF 54 REFERENCES
Linguistic Structured Sparsity in Text Categorization
  • 55
Topical Word Embeddings
  • 310
Regularized Learning with Networks of Features
  • 56
Graph-based term weighting for text categorization
  • 38
Efficient Estimation of Word Representations in Vector Space
  • 17,539
Graph-of-word and TW-IDF: new approach to ad hoc IR
  • 116
Latent Dirichlet Allocation
  • 26,746
Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification
  • 1,922
Software Framework for Topic Modelling with Large Corpora
  • 3,051