Speaker Retrieval for TV Show Videos by Associating Audio Speaker Recognition Result to Visual Faces


Person retrieval based solely on visual face recognition is hard because of the well-known problems of illumination, pose, size, and expression variation, which can exceed the variation due to identity. Fortunately, videos are often accompanied by other modalities, such as audio and text. In this paper, we propose a framework to associate who and when information…


