Document classification

Known as: Topic spotting, Text categorisation, Classification

Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a…

Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.

2013

Finding Opinion Strength Using Rule-Based Parsing for Arabic Sentiment Analysis

Shereen Oraby, Y. El-Sonbaty, M. A. El-Nasr
Mexican International Conference on Artificial…
2013
Corpus ID: 14091437

With increasing interest in sentiment analysis research and opinionated web content always on the rise, focus on analysis of text…

Review

2010

Automated assessment of ESOL free text examinations

Ted Briscoe, Ben Medlock, Øistein E. Andersen
2010
Corpus ID: 16253657

In this report, we consider the task of automated assessment of English as a Second Language (ESOL) examination scripts written…

2009

Training Data Cleaning for Text Classification

Andrea Esuli, F. Sebastiani
International Conference on the Theory of…
2009
Corpus ID: 14123558

In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain…

2008

An Extended Document Frequency Metric for Feature Selection in Text Categorization

Feature selection plays an important role in text categorization. Many sophisticated feature selection methods such as…

2004

Spontaneous handwriting recognition and classification

A. Rossi, Alfons Juan-Císcar, E. Vidal
Proceedings of the 17th International Conference…
2004
Corpus ID: 13386034

Finite-state models are used to implement a handwritten text recognition and classification system for a real application…

Highly Cited

2004

PoBOC: An Overlapping Clustering Algorithm, Application to Rule-Based Classification and Textual Data

Guillaume Cleuziou, Lionel Martin, Christel Vrain
European Conference on Artificial Intelligence
2004
Corpus ID: 15945079

This paper presents the clustering algorithm PoBOC (Pole-Based Overlapping Clustering). It has two main characteristics: the…

2003

A Corpus-Independent Feature Set for Style-Based Text Categorization

Moshe Koppel, Navot Akiva, Ido Dagan
2003
Corpus ID: 14441055

We suggest a corpus-independent feature set appropriate for style-based text categorization problems. To achieve this, we…

2003

Automatic Keyword Extraction for News Finder

J. Martínez-Fernández, Ana M. García-Serrano, Paloma Martínez, Julio Villena-Román
Adaptive Multimedia Retrieval
2003
Corpus ID: 87937

Newspapers are one of the most challenging domains for information retrieval systems: new articles appear every day written in…

2002

Pourquoi les n-grammes permettent de classer des textes? Recherche de mots-clefs pertinents à l'aide des n-grammes caractéristiques

R. Jalam, J.-H. Chauchat
2002
Corpus ID: 60245705

Why do N-grams constitute an effective tool for text categorization? How do we pass from the purely formal aspect of the text to…

2001

Research and Implementation of Text Categorization System Based on VSM

Pang Jian
2001
Corpus ID: 64309129

In recent years, information processing has become more and more important for us to get useful information. Text categorization…
