Ryota Hinami

This paper addresses the problem of joint detection and recounting of abnormal events in videos. Recounting of abnormal events, i.e., explaining why they are judged to be abnormal, is an unexplored but critical task in video surveillance, because it helps human observers quickly judge if they are false alarms or not. To describe the events in the …
Text recognition in natural scene images is a challenging task that has recently been garnering increased research attention. In this paper, we propose a method for recognizing text by utilizing the layout consistency of a text string. We estimate the layout (four lines of a text string) using initial character extraction and recognition results. On the …
We address the problem of open-vocabulary object retrieval and localization, which is to retrieve and localize objects from a very large-scale image database immediately by a textual query (e.g., a word or phrase). We first propose Query-Adaptive R-CNN, a simple yet strong framework for open-vocabulary object detection. Query-Adaptive R-CNN is a simple …
TV ratings play an important role in the analysis of advertising, risk management, and social trends. The ratings reflect the interests of audiences, so valuable knowledge could be discovered by analyzing ratings in combination with multimedia content, such as broadcast video and transcripts. This article establishes a general framework for mining audience …