Shin'ichi Satoh

Recently, similarity queries on feature vectors have been widely used to perform content-based retrieval of images. Applying this technique to large databases requires multidimensional index structures that support nearest neighbor queries efficiently. The SS-tree was proposed for this purpose and is known to outperform other index ...
The automatic extraction and recognition of news captions and annotations can be of great help in locating topics of interest in digital news video libraries. To achieve this goal, we present a technique, called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we address problems, ...
We have developed Name-It, a system that associates faces and names in news videos. The system is given news videos, which include image sequences and transcripts obtained from audio tracks or closed caption texts. The system can then either infer possible name candidates for a given face, or locate a face in news videos by name. To accomplish this task, ...
We are developing a cooking navigation system, which helps even a novice user to cook several recipes in parallel without failure, while improving an advanced user's skill further. To realize this, the system optimizes the cooking procedure considering the following restrictions: (1) Duration of cooking, (2) Accuracy of cooking, and (3) Learning effect, by ...
Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the ...
Eye location is an important visual cue for face image processing tasks such as alignment before face recognition, gaze tracking, and expression analysis. In this paper, a novel eye detection algorithm is presented, which integrates the characteristics of single-eye and eye-pair images to develop a hybrid classifier under the learning paradigm. The low ...
Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN. We take advantage of the ...
The Affective Impact of Movies task aims to detect violent videos and the affective impact of those videos on viewers [9]. This is a challenging task not only because of the diversity of video content but also due to the subjectiveness of human emotion. In this paper, we present a unified framework that can be applied to both subtasks: (i) induced affect detection, ...