Combining cross-stream and time dimensions in phonetic speaker recognition


Recent studies show that phonetic sequences from multiple languages can provide effective features for speaker recognition. So far, only pronunciation dynamics in the time dimension, i.e., n-gram modeling on each of the phone sequences, have been examined. In the JHU 2002 Summer Workshop, we explored modeling the statistical pronunciation dynamics across…
DOI: 10.1109/ICASSP.2003.1202764


