Corpus ID: 9975110

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Kotaro Hashimoto and Takashi Yukawa
In the present paper, a term weighting classification method using the chi-square statistic is proposed and evaluated in the classification subtask of the NTCIR-6 Patent Retrieval Task. In this task, large numbers of patent applications are classified into F-term categories. Therefore, a patent classification system requires high classification speed as well as high classification accuracy. The chi-square statistic can calculate the frequency of word appearance in the F-term and the frequency of…
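As a rough illustration of chi-square term weighting, the sketch below scores each term against one category using the standard 2x2 contingency-table form of the statistic. The abstract above is truncated, so this is the textbook formulation and not necessarily the authors' exact weighting; the function names `chi_square` and `term_weights` and the toy documents are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def term_weights(docs, category):
    """Score every term against one category.

    docs: list of (set_of_terms, set_of_category_labels) pairs.
    Returns a dict mapping term -> chi-square score.
    """
    n_docs = len(docs)
    n_cat = sum(1 for _, cats in docs if category in cats)
    df = Counter()      # document frequency of each term overall
    df_cat = Counter()  # document frequency of each term inside the category
    for terms, cats in docs:
        for term in set(terms):
            df[term] += 1
            if category in cats:
                df_cat[term] += 1
    weights = {}
    for term in df:
        a = df_cat[term]
        b = df[term] - a
        c = n_cat - a
        d = n_docs - n_cat - b
        weights[term] = chi_square(a, b, c, d)
    return weights


# Toy data (hypothetical): two documents per category.
docs = [
    ({"laser", "beam"}, {"X"}),
    ({"laser"}, {"X"}),
    ({"gear"}, {"Y"}),
    ({"gear", "beam"}, {"Y"}),
]
weights_x = term_weights(docs, "X")
```

Note that the statistic is symmetric: a term that appears only outside the category scores as high as one that appears only inside it, so a classifier built on these weights would also need the direction of the association.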
Adapting Support Vector Machines for F-term-based Classification of Patents
This article describes an SVM-based system and several techniques developed to adapt SVMs to the specific features of the F-term patent classification task, and presents experimental results that demonstrate the benefits of the latest approach.
Integrating Query Translation and Text Classification in a Cross-Language Patent Access System
A cross-language patent retrieval and classification system is presented that integrates query translation, using various free web translators on the internet, with document classification. The results indicate that the performance of the cross-lingual text classification reached almost the same level as that of the monolingual text classification.
Automated Multi-Label Classification of the Dutch Speeches from the Throne
This thesis presents an automated Dutch multi-label classification system that uses a traditional bag-of-words document representation, an extensive set of human coded examples, and an exhaustive topic coding system to automatically classify each sentence into none, one or several categories.
Patent-Related Tasks at NTCIR
This chapter provides a reference summary of the efforts undertaken in NTCIR, helping the reader understand the challenges addressed, the datasets created and the solutions observed.


Overview of Classification Subtask at NTCIR-6 Patent Retrieval Task
This paper describes Classification Subtask at NTCIR-5 Patent Retrieval Task. We perform two subtasks for patent classification using a multi-dimensional classification structure called “F-term (File…
A re-examination of text categorization methods
The results show that SVM, kNN and LLSF significantly outperform NNet and NB when the number of positive training instances per category is small, and that all the methods perform comparably when the categories have over 300 instances.
Term-Weighting Approaches in Automatic Text Retrieval
This paper summarizes the insights gained in automatic term weighting, and provides baseline single term indexing models with which other more elaborate content analysis procedures can be compared.
An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Traversing itemset lattices with statistical metric pruning
A method of estimating a tight upper bound on the statistical metric associated with any superset of an itemset is presented, along with the novel use of the resulting upper bounds to prune unproductive supersets while traversing itemset lattices.
A Mathematical Theory of Communication
This paper opened the new field of information theory. Before this paper, most people believed that the only way to make the error probability of transmission as small as desired is to reduce the…
Overview of Classification Subtask at NTCIR-5 Patent Retrieval Task. Proc. NTCIR-5 Workshop Meeting, 2005.
Japan Patent Office. Administration of Patent. 2006.