Human evaluations of machine translation are extensive but expensive. They can take months to finish and involve human labor that cannot be reused. We propose a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal...
We define a new image feature called the color correlogram and use it for image indexing and comparison. This feature distills the spatial correlation of colors, and is both effective and inexpensive for content-based image retrieval. The correlogram robustly tolerates large changes in appearance and shape caused by changes in viewing positions, camera...
We define a new image feature called the color correlogram and use it for image indexing and comparison. This feature distills the spatial correlation of colors and, when computed efficiently, turns out to be both effective and inexpensive for content-based image retrieval. The correlogram is robust in tolerating large changes in appearance and shape caused...
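As a rough illustration of what "spatial correlation of colors" can mean, the sketch below computes an autocorrelogram-style feature: for each quantized color c and distance k, an estimate of the probability that a pixel k steps away from a pixel of color c also has color c. This is a simplified sketch under stated assumptions, not the authors' algorithm: the function name `auto_correlogram`, the input layout (a 2D list of color indices), and the restriction to horizontal and vertical offsets are all choices made here for brevity.

```python
def auto_correlogram(img, num_colors, distances):
    """Simplified autocorrelogram (illustrative assumption, not the paper's code).

    img: 2D list of quantized color indices in range(num_colors).
    Returns {k: [p_0, ..., p_{m-1}]}, where p_c estimates the probability
    that a pixel k steps away (horizontally or vertically) from a pixel
    of color c also has color c.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    result = {}
    for k in distances:
        same = [0] * num_colors   # neighbor pairs sharing color c
        total = [0] * num_colors  # all in-bounds neighbor pairs starting at color c
        for y in range(h):
            for x in range(w):
                c = img[y][x]
                # Assumption: only the four axis-aligned offsets at distance k.
                for dy, dx in ((0, k), (0, -k), (k, 0), (-k, 0)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total[c] += 1
                        if img[ny][nx] == c:
                            same[c] += 1
        result[k] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return result


uniform = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
print(auto_correlogram(uniform, 2, [1]))  # color 0 always neighbors color 0
print(auto_correlogram(checker, 2, [1]))  # no pixel neighbors its own color
```

Two images with identical color histograms, such as a checkerboard and a half-and-half split, produce different values here, which is the kind of spatial information a plain histogram discards.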
Much is known about the design of automated systems to search broadcast news, but it has only recently become possible to apply similar techniques to large collections of spontaneous speech. This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large...
OBJECTIVE: To investigate the relationship between β2-adrenoceptor (β2-AR) expression in inflammatory cells and airflow limitation in patients with chronic obstructive pulmonary disease (COPD). METHODS: According to the severity of COPD, 37 patients with stable COPD were divided into three groups. Samples of peripheral blood and induced sputum...
Our English-Chinese cross-language IR system is trained from parallel corpora; we investigate its performance as a function of training corpus size for three different training corpora. We find that the performance of the system as trained on the three parallel corpora can be related by a simple measure, namely the out-of-vocabulary rate of query words.
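The out-of-vocabulary rate mentioned above can be sketched in a few lines. The abstract does not say whether the rate is counted over word tokens or distinct word types, nor how the vocabulary is built, so this sketch assumes token counts and a vocabulary made of the words seen in the training corpus; `oov_rate` and the sample data are hypothetical.

```python
def oov_rate(query_words, vocabulary):
    """Fraction of query word tokens that do not appear in the vocabulary.

    Assumption: the rate is token-based and the vocabulary is the set of
    words observed in the parallel training corpus.
    """
    words = list(query_words)
    if not words:
        return 0.0  # convention chosen here for an empty query set
    missing = sum(1 for w in words if w not in vocabulary)
    return missing / len(words)


# Hypothetical data: three words seen in training, four query tokens.
training_vocab = {"trade", "bank", "policy"}
query_tokens = ["trade", "policy", "tariff", "bank"]
print(oov_rate(query_tokens, training_vocab))  # 0.25
```

Under this reading, a larger training corpus covers more query words, lowers the rate, and gives one number by which systems trained on different corpora can be compared.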
Automatic topic segmentation is an important technology for multimedia archival and retrieval systems. In this paper we present an algorithm for topic segmentation which uses a combination of machine learning, statistical natural language processing, and information retrieval techniques. The performance of this algorithm is measured by considering the...
We investigate important differences between two styles of document clustering in the context of Topic Detection and Tracking. Converting a Topic Detection system into a Topic Tracking system exposes fundamental differences between these two tasks that are important to consider in both the design and the evaluation of TDT systems. We also identify features...