An information-theoretic perspective of tf-idf measures
This paper presents a mathematical definition of the "probability-weighted amount of information" (PWI), a measure of specificity of terms in documents that is based on an information-theoretic view…
  • 547
  • 41
Scheduling of Genetic Algorithms in a Noisy Environment
In this paper, we develop new methods for adjusting configuration parameters of genetic algorithms operating in a noisy environment. Such methods are related to the scheduling of resources for tests…
  • 118
  • 9
A Fast Linkage Detection Scheme for Multi-Source Information Integration
Record linkage refers to techniques for identifying records associated with the same real-world entities. Record linkage is not only crucial in integrating multi-source databases that have been…
  • 97
  • 8
A Conditional Variational Framework for Dialog Generation
Deep latent variable models have been shown to facilitate the response generation for open-domain dialog systems. However, these latent variables are highly randomized, leading to uncontrollable…
  • 75
  • 8
NTCIR-12 MathIR Task Overview
We present an overview of the NTCIR-12 MathIR Task, dedicated to information access for mathematical content. The MathIR task makes use of two corpora. The first corpus contains excerpts from…
  • 50
  • 8
Linguistic Techniques to Improve the Performance of Automatic Text Categorization
This paper presents a method for incorporating natural language processing into existing text categorization procedures. Three aspects are considered in the investigation: (i) a method for weighting…
  • 73
  • 7
Dynamic Control of Genetic Algorithms in a Noisy Environment
In this paper, we present efficient algorithms for adjusting configuration parameters of genetic algorithms that operate in a noisy environment. Assuming that the population size is given, we address…
  • 52
  • 7
A Language Model based Evaluator for Sentence Compression
We herein present a language-model-based evaluator for deletion-based sentence compression, and view this task as a series of deletion-and-evaluation operations using the evaluator. More…
  • 15
  • 7
MCAT Math Retrieval System for NTCIR-12 MathIR Task
This paper describes the participation of our MCAT search system in the NTCIR-12 MathIR Task. We introduce three granularity levels of textual information, a new approach for generating dependency…
  • 16
  • 6
NTCIR-10 Math Pilot Task Overview
This paper presents an overview of a new pilot task, the NTCIR Math Task, which is specifically dedicated to information access to mathematical content. In particular, the paper summarizes the…
  • 54
  • 5