Document classification
Known as: Topic spotting, Text categorisation, Classification
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a…
Wikipedia
Related topics
48 relations
Artificial neural network
Categorization
Concept mining
Controlled vocabulary
Broader (2)
Machine learning
Natural language processing
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2016
Hierarchical Attention Networks for Document Classification
Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, E. Hovy
North American Chapter of the Association for…
2016
Corpus ID: 6857205
We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it…
Highly Cited
2014
Distributed Representations of Sentences and Documents
Quoc V. Le, Tomas Mikolov
International Conference on Machine Learning
2014
Corpus ID: 2407601
Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts…
Review
2010
A Review of Machine Learning Algorithms for Text-Documents Classification
B. Baharudin, Lam Hong Lee, Khairullah Khan
2010
Corpus ID: 14774186
With the increasing availability of electronic documents and the rapid growth of the World Wide Web, the task of automatic…
Review
2010
A Survey on Transfer Learning
Sinno Jialin Pan, Qiang Yang
IEEE Transactions on Knowledge and Data…
2010
Corpus ID: 740063
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same…
Highly Cited
2002
One-Class SVMs for Document Classification
L. Manevitz, M. Yousef
Journal of Machine Learning Research
2002
Corpus ID: 15112547
We implemented versions of the SVM appropriate for one-class classification in the context of information retrieval. The…
Highly Cited
2001
Latent Dirichlet Allocation
D. Blei, A. Ng, Michael I. Jordan
Journal of Machine Learning Research
2001
Corpus ID: 3177797
Highly Cited
2000
Text Classification from Labeled and Unlabeled Documents using EM
K. Nigam, A. McCallum, S. Thrun, Tom Michael Mitchell
Machine-mediated learning
2000
Corpus ID: 686980
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training…
Highly Cited
2000
An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
N. Cristianini, J. Shawe-Taylor
2000
Corpus ID: 14727192
From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning…
Highly Cited
2000
Centroid-Based Document Classification: Analysis and Experimental Results
Eui-Hong Han, G. Karypis
European Conference on Principles of Data Mining…
2000
Corpus ID: 6340813
In this paper we present a simple linear-time centroid-based document classification algorithm that, despite its simplicity and…
Highly Cited
1998
Classification of text documents
Yonghong Li, Anil K. Jain
Proceedings. Fourteenth International Conference…
1998
Corpus ID: 8805137
We investigate four different classification methods for document classification: the naive Bayes classifier, nearest neighbor…