Nommage non supervisé des personnes dans les émissions de télévision. Utilisation des noms écrits, des noms prononcés ou des deux ?

@article{Poignant2014NommageNS,
  title={Nommage non supervis{\'e} des personnes dans les {\'e}missions de t{\'e}l{\'e}vision. Utilisation des noms {\'e}crits, des noms prononc{\'e}s ou des deux ?},
  author={Johann Poignant and Laurent Besacier and Georges Qu{\'e}not},
  journal={Document Num{\'e}rique},
  year={2014},
  volume={17},
  pages={37--60}
}
Identifying people in television broadcasts is a valuable tool for indexing this type of video, but using biometric models is not a viable option without prior knowledge of the people present in the videos. Pronounced or written names can provide a list of hypothesis names. We propose a comparison of the potential of these two modalities (pronounced or written names) for extracting the names of the people speaking and/or appearing…

Multimodal person discovery in broadcast TV: lessons learned from MediaEval 2015

TLDR
Quantitative and qualitative comparisons of participants' submissions are provided, and the reasons why all systems failed on particular shots are investigated, paving the way for promising future research directions.

Naming multi-modal clusters to identify persons in TV broadcast

TLDR
This paper proposes a method to take advantage of written names during the diarization process, in order to both name clusters and prevent the fusion of two clusters named differently.

References

Nommage non-supervisé des personnes dans les émissions de télévision : une revue du potentiel de chaque modalité

TLDR
Compares the potential of these two modalities (pronounced or written names) for extracting the names of the people speaking and/or appearing, given that biometric models are not a viable option without prior knowledge of the people present in the videos.

Reconnaissance Automatique de Locuteurs à l'aide de Fonctions de Croyance

The topic of this article is the automatic extraction of the identity of the speaker (first name and surname) present in audio recordings. Based on the results of a transcription system…

Partitioning and transcription of broadcast news data

TLDR
This paper reports on recent work in transcribing broadcast news data, including the problem of partitioning the data into homogeneous segments prior to word recognition, using Gaussian mixture models and an agglomerative clustering algorithm.

Naming every individual in news video monologues

TLDR
The person-naming problem is formulated into a learning framework which predicts the most likely name for each person based on the features, and refines the predictions using the constraints, and outperforms a non-learning alternative by a large amount.

Identification of Speakers by Name Using Belief Functions

TLDR
Improvements are presented for a method that extracts speaker identities from automatic transcripts and assigns them to speaker turns; the detected full names are used as candidates for these assignments.

Speaker Diarization: About whom the Speaker is Talking ?

TLDR
A solution for identifying speakers by extracting the full names pronounced in French broadcast news, using a merging method to associate a full name with a speaker cluster instead of the anonymous label provided by the diarization.

Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast

TLDR
Three methods for propagating the overlaid names to the speech turns are compared, taking into account the co-occurrence duration between the speaker clusters and the names provided by the video OCR and using a task-adapted variant of the TF-IDF information retrieval coefficient.

Name-It: Naming and Detecting Faces in Video by the Integration of Image and Natural Language Processing

TLDR
The proposed Name-It system, a system that associates faces and names in news videos, takes full advantage of advanced image and natural language processing and effectively extracts names by using lexical/grammatical analysis and knowledge of the news video topics structure.

Training and Evaluation of POS Taggers on the French MULTITAG Corpus

TLDR
Three standard POS taggers are trained and evaluated under the same conditions on the French MULTITAG corpus; this POS-tagged corpus provides a richer tagset than the usual ones, including, for example, gender and number distinctions.

Naming faces in broadcast news video by image google

TLDR
A novel approach to naming faces by exploiting extra knowledge obtained from image google is presented; it assumes that the faces of important persons appear many times in web images and can easily be retrieved from image google.