Document classification

Known as: Topic spotting, Text categorisation, Classification 
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a… 

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2017
Hierarchical attention networks have recently achieved remarkable performance for document classification in a given language… 
Highly Cited
2010
The regularization principles [31] lead to approximation schemes for dealing with various learning problems, e.g., the regularization of… 
Highly Cited
2008
The recent growth in network usage has motivated the creation of new malicious code for various purposes, including economic ones… 
Highly Cited
2007
We consider feature selection for text classification both theoretically and empirically. Our main result is an unsupervised… 
Highly Cited
2006
Spam filtering poses a special problem in text categorization, of which the defining characteristic is that filters face an… 
Highly Cited
2004
This paper deals with automatic classification of Arabic web documents. Such a classification is very useful for affording… 
Highly Cited
2004
Web-page classification is much more difficult than pure-text classification due to a large variety of noisy information embedded… 
Highly Cited
2003
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers. The… 
Highly Cited
2001
The paper presents a novel approach to unsupervised text summarization. The novelty lies in exploiting the diversity of concepts… 
Highly Cited
1999
Grouping images into (semantically) meaningful categories using low level visual features is a challenging and important problem…