Dominique Genoud

This paper describes a multimodal approach for speaker verification. The system consists of two classifiers, one using visual features, the other using acoustic features. A lip tracker is used to extract visual information from the speaking face which provides shape and intensity features. We describe an approach for normalizing and mapping different ...
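The abstract above is cut off before it names the normalization and mapping method, so the following is only a generic sketch of how scores from an acoustic and a visual classifier might be put on a common scale and combined. It assumes z-score normalization with statistics from development data, followed by a weighted sum. The function names, the `weight` parameter, and the choice of z-normalization are all illustrative assumptions, not the paper's method.

```python
from statistics import mean, stdev

# Hypothetical helpers: none of these names come from the paper.

def znorm_params(dev_scores):
    """Estimate mean and sample standard deviation from development scores."""
    return mean(dev_scores), stdev(dev_scores)

def znorm(score, params):
    """Map a raw classifier score to zero mean and unit variance."""
    mu, sigma = params
    return (score - mu) / sigma

def fuse(acoustic, visual, acoustic_params, visual_params, weight=0.5):
    """Weighted sum of the two normalized scores (weight applies to the acoustic score)."""
    return (weight * znorm(acoustic, acoustic_params)
            + (1.0 - weight) * znorm(visual, visual_params))

# Toy development scores on very different scales (made-up numbers).
acoustic_params = znorm_params([0.0, 2.0, 4.0])
visual_params = znorm_params([10.0, 20.0, 30.0])
fused = fuse(4.0, 30.0, acoustic_params, visual_params)
```

Normalizing first keeps the classifier with the larger raw score range from dominating the sum; the weight could then be tuned on development data.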
A key problem for field applications in speaker verification is the issue of a priori threshold setting. In the context of the CAVE project several methods for estimating speaker-independent and speaker-dependent decision thresholds were compared. Relevant parameters are estimated from development data only, i.e. without resorting to additional client data. ...
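To illustrate what setting a threshold a priori from development data can mean, here is a minimal sketch that picks a single speaker-independent threshold at the point where the false acceptance and false rejection rates are closest on development scores. The abstract does not say which estimation methods were compared, so this equal-error-rate criterion, the function names, and the toy scores are assumptions chosen for illustration only.

```python
def eer_threshold(client_scores, impostor_scores):
    """Return the candidate threshold where FAR and FRR are closest on development data.

    A trial is accepted when its score is >= the threshold.
    """
    candidates = sorted(set(client_scores) | set(impostor_scores))
    best_t, best_gap = candidates[0], float("inf")
    for t in candidates:
        # False rejection rate: genuine clients scoring below the threshold.
        frr = sum(s < t for s in client_scores) / len(client_scores)
        # False acceptance rate: impostors scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

def accept(score, threshold):
    """Decision rule applied at test time with the threshold fixed beforehand."""
    return score >= threshold

# Made-up development scores, pooled over all speakers.
dev_clients = [2.0, 3.0, 4.0, 5.0]
dev_impostors = [-1.0, 0.0, 1.0, 2.5]
threshold = eer_threshold(dev_clients, dev_impostors)
```

The threshold is computed once on development data and then frozen, so no scores from the deployed clients are needed to set it.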
Unlike big cities, small towns call for cultural preservation in addition to revitalization. IoT technologies could potentially serve this need. This article develops an IoT architecture and selects the best IoT enabling technologies, services, applications, and standards toward this goal. In this article, we focus on the opportunities and ...
This article describes the POLYCOST database, which is dedicated to speaker recognition applications over telephone lines. The characteristics of the database are: a large corpus with varied content (> 100 speakers), English spoken by non-native speakers, read digits and free speech, recording over ...
The aim of this paper is to describe how the combination of speaker verification algorithms with a priori decision thresholds can improve the overall robustness of a real application. The evaluation is performed in the context of a field application where each client is verified from a 7-digit PIN code. This paper demonstrates that it is possible to increase the ...
We propose multi-modal person verification using voice and images as a solution to the secured access problem. The necessary I/O devices are now standard, cheaply available and, most importantly, constitute the two most important human communication modalities. The visual part currently involves i) matching of a coarse grid containing Gabor phase information ...
This paper describes a multimodal approach for speaker verification. The system consists of two classifiers, one using visual features and the other using acoustic features. A lip tracker is used to extract visual information from the speaking face which provides shape and intensity features. We describe an approach for normalizing and mapping different ...
The issue of a priori threshold setting in speaker verification is a key problem for field applications. In the context of the CAVE project, we compared several methods for estimating speaker-independent and speaker-dependent decision thresholds. Relevant parameters are estimated from development data only, i.e. without resorting to additional client data. The ...