M. Grimaldi, F. Cummings. Speaker identification using instantaneous frequencies
- Grimaldi, Cummings
- IEEE Transactions On Audio, Speech And Language…
This paper assesses the stability of the human voice on the basis of Fast Fourier Transform (FFT) and cepstrogram results, which are subjected to statistical analysis. The results presented are the first part of a study on the usefulness of these analyses for voice quality assessment and for identifying persons and voice commands in a noisy environment. The purpose of the study is to select indicators that are characteristic of the voice of one particular person. The paper describes an algorithm for selecting test data from a single recording of a human voice, together with the results of a statistical analysis of the stability with which a voiced vowel is uttered. The proposed algorithm extracts voiced elements of a desired length from the recording while excluding noise and silence. To assess the stability of the waveforms, the recording is then divided into several to several tens of fragments, each a few tens of milliseconds long. Each fragment is analysed independently of the others, and the result is a measure of the error of inference algorithms for identifying the person and voice commands. Particular attention is given to comparing the results of the statistical analyses obtained after splitting the recording into blocks of the same duration and into blocks containing the same number of micro phonemes. Because the study uses frequency analysis, the stability of both the fundamental frequency and the formants can be determined with the same statistical apparatus.
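
The fragment-wise analysis described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes a simple RMS gate to discard silent frames, a fixed 40 ms block length, and the strongest bin of a naive DFT as a stand-in for a frequency estimate. It covers only the equal-duration split, not the split by number of micro phonemes. The function names, the `rms_floor` threshold, and the synthetic 125 Hz test tone are all hypothetical choices made for illustration.

```python
import cmath
import math
import statistics


def split_frames(samples, frame_len):
    """Cut the signal into consecutive, non-overlapping frames of frame_len samples."""
    count = len(samples) // frame_len  # a trailing partial frame is dropped
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def rms(frame):
    """Root-mean-square level of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def peak_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest non-DC bin of a naive DFT."""
    n = len(frame)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


def frame_stability(samples, sample_rate, frame_ms=40, rms_floor=0.05):
    """Peak frequency of each non-silent frame, plus the mean and spread across frames."""
    frame_len = sample_rate * frame_ms // 1000
    # Assumption: a plain energy gate stands in for the paper's noise/silence exclusion.
    voiced = [f for f in split_frames(samples, frame_len) if rms(f) >= rms_floor]
    if not voiced:
        raise ValueError("no frame exceeds rms_floor")
    freqs = [peak_frequency(f, sample_rate) for f in voiced]
    return freqs, statistics.mean(freqs), statistics.pstdev(freqs)


# Synthetic input: two silent 40 ms frames followed by a steady 125 Hz tone.
SAMPLE_RATE = 4000  # Hz, kept low so the naive DFT stays fast
silence = [0.0] * 320
tone = [math.sin(2 * math.pi * 125 * t / SAMPLE_RATE) for t in range(1600)]
freqs, mean_hz, spread_hz = frame_stability(silence + tone, SAMPLE_RATE)
print(len(freqs), mean_hz, spread_hz)
```

For a real recording, the spread across frames (here a population standard deviation) is the kind of quantity that could serve as a stability indicator. The frequency resolution of this sketch is limited to `sample_rate / frame_len`, so a cepstral or interpolated estimate would be needed for finer measurements.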