This paper presents an analysis of the speaker discrimination power of vocal source related features, in comparison to the conventional vocal tract related features. The vocal source features, named wavelet octave coefficients of residues (WOCOR), are extracted by pitch-synchronous wavelet transform of the linear predictive (LP) residual signals. Using a…
This paper addresses the problem of speaker segmentation in telephone conversations. The segmentation is done in three steps: 1) preliminary segmentation to hypothesize speaker turning points; 2) clustering of segments; and 3) re-segmentation to determine the speaker identity of each segment. It is found that vocal source related features are more…
As low-cost video transmission becomes popular, video-based bi-modal (audio and visual) authentication has great potential in various applications that require access control over handheld terminals. In this paper, we propose to use the averaged mouth image (AMI) for speaker verification. The AMI is computed by averaging properly aligned mouth images…