Combining multiple low-level visual features is a proven and effective strategy for a range of computer vision tasks. However, limited attention has been paid to combining such features with information from other modalities, such as audio and videotext, for large-scale analysis of web videos. In our work, we rigorously analyze and combine a large set of…
In this letter, we propose a so-called probabilistic non-local means (PNLM) method for image denoising. Our main contributions are: 1) we point out defects of the weight function used in the classic NLM; 2) we successfully derive all theoretical statistics of patch-wise differences for Gaussian noise; and 3) we employ this prior information and formulate…
Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. Active sample selection aims to reduce the labor, time, and expense incurred in producing such resources, attaining a given performance benchmark with the smallest possible…
Many feature extraction approaches for off-line handwriting recognition (OHR) rely on accurate binarization of gray-level images. However, high-quality binarization of most real-world documents is extremely difficult due to the varying characteristics of noise artifacts common in such documents. Unlike most of these features, Gabor features do not require…
In this paper we discuss the design and performance of the BBN Call Director product for automatic call routing and the methodology for its deployment. The component technologies for the BBN Call Director are a statistical n-gram speech recognizer and a statistical topic identification system that, together, provide the framework for processing natural…
We present a novel architecture for providing automated telephone Directory Assistance (DA). The architecture couples a large-vocabulary, statistical n-gram speech recognition engine with a statistical retrieval system. The use of a statistical n-gram allows for the recognition of unconstrained spoken queries while the statistical retrieval engine allows…
Offline handwriting recognition of free-flowing Arabic text is a challenging task due to the plethora of factors that contribute to the variability in the data. In this paper, we address some of these sources of variability, and present experimental results on a large corpus of handwritten documents. Specific techniques such as the application of…