Multilingual Audio-Visual Smartphone Dataset And Evaluation

@article{Mandalapu2021MultilingualAS,
  title={Multilingual Audio-Visual Smartphone Dataset And Evaluation},
  author={Hareesh Mandalapu and P. N. Aravinda Reddy and Raghavendra Ramachandra and K. Sreenivasa Rao and Pabitra Mitra and S. R. Mahadeva Prasanna and Christoph Busch},
  journal={IEEE Access},
  year={2021},
  volume={PP},
  pages={1-1}
}
Smartphones have been employed with biometric-based verification systems to provide security in highly sensitive applications. Audio-visual biometrics are gaining popularity because of their usability, and their multi-modal nature makes them challenging to spoof. In this work, we present an audio-visual smartphone dataset captured with five different recent smartphones. This new dataset contains 103 subjects captured in three different sessions considering different real-world scenarios…

RUSAVIC Corpus: Russian Audio-Visual Speech in Cars

This work presents a new audio-visual speech corpus recorded in a car environment and designed for noise-robust speech recognition, with the use of automated lip-reading to improve the performance of audio-based speech recognition in the presence of severe acoustic noise caused by road traffic.

References

Smartphone Multi-modal Biometric Authentication: Database and Evaluation

This work presents a new multimodal biometric dataset (face, voice, and periocular) acquired using a smartphone and reports the performance evaluation of the baseline biometric verification and Presentation Attack Detection (PAD) on the newly collected dataset.

Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey

A comprehensive survey on existing state-of-the-art audio-visual recognition techniques, publicly available databases for benchmarking, and Presentation Attack Detection (PAD) algorithms is made.

VALID: A New Practical Audio-Visual Database, and Comparative Results

The new large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy “real world” office scenario with no control over illumination or acoustic noise, to highlight the degrading effect of an uncontrolled illumination environment and the importance of this database for deploying real-world applications.

A survey of mobile face biometrics

A comparison of features for synthetic speech detection

Comparative results indicate that features representing spectral information in the high-frequency region, dynamic information of speech, and detailed information related to subband characteristics are considerably more useful for the synthetic speech detection task.

MobBIO: A multimodal database captured with a portable handheld device

The MobBIO database was created using an Asus EeePad Transformer tablet, with mobile biometric systems in mind, and is composed of three modalities: iris, face and voice.

The BANCA Database and Evaluation Protocol

A protocol for evaluating verification algorithms on the BANCA database, a new large, realistic and challenging multi-modal database intended for training and testing multi-modal verification systems, is described.

Bi-modal authentication in mobile environments using session variability modelling

This work examines inter-session variability modelling (ISV) and joint factor analysis (JFA) for both face and speaker authentication and evaluates the system on the largest bi-modal mobile authentication database available, the MOBIO database, with over 61 hours of audio-visual data captured by 150 people in uncontrolled environments on a mobile phone.

Deep Multimodal Speaker Naming

This paper proposes a novel convolutional neural network (CNN) based learning framework to automatically learn the fusion function of both face and audio cues, and shows that, without using face tracking, facial landmark localization or subtitles/transcripts, the system with robust multimodal feature extraction is able to achieve state-of-the-art speaker naming performance.

Presentation Attack Detection Methods for Face Recognition Systems

This paper describes the various aspects of face presentation attacks, including different types of face artifacts, state-of-the-art PAD algorithms and an overview of the respective research labs working in this domain, vulnerability assessments and performance evaluation metrics, the outcomes of competitions, the availability of public databases for benchmarking new PAD algorithms in a reproducible manner, and a summary of the relevant international standardization in this field.