• Corpus ID: 44470289

Pitch and spectral analysis of speech based on an auditory synchrony model

@inproceedings{Seneff1985PitchAS,
  title={Pitch and spectral analysis of speech based on an auditory synchrony model},
  author={Stephanie Seneff},
  year={1985}
}
  • S. Seneff
  • Published 1985
  • Engineering, Computer Science
There has been substantial interest in the last few decades in the problem of training computers to recognize human speech. Despite the concentrated efforts of conscientious teams of researchers, however, the solution remains elusive unless the task is kept so restricted as to be uninteresting. These discouraging results may be due in part to the fact that researchers in the past paid little attention to models of human auditory signal processing to guide the design of speech… 
A joint synchrony/mean-rate model of auditory speech processing
TLDR
A bank of critical-band filters defines the initial spectral analysis, and filter outputs are processed by a model of the nonlinear transduction stage in the cochlea, which accounts for such features as saturation, adaptation and forward masking.
The acoustic features of speech sounds in a model of auditory processing: vowels and voiceless fricatives
TLDR
The acoustic features of three classes of complex sounds (complex tones, vowels and voiceless fricatives) were analyzed and the model suggests that the most distinctive acoustic feature is the location of the high-frequency edge of the signal spectrum.
A computational model for the peripheral auditory system: Application of speech recognition research
  • S. Seneff
  • Computer Science
    ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing
  • 1986
TLDR
A computer system for speech analysis based on human auditory processing is described, featuring a bank of 40 independent channels that includes a linear critical band filter followed by a model for the transformation from basilar membrane motion to nerve fiber response, incorporating such nonlinear effects as half-wave rectification, adaptation, spontaneous response, and saturation.
A Comparative Study of Computational Models of Auditory Peripheral System
A deep study about the computational models of the auditory peripheral system from three different research groups: Carney, Meddis and Hemmert, is presented here. The aim is to find out which model
A Comparative Study of Computational Models of Peripheral Auditory System
A deep study about the computational models of the auditory peripheral system from three different research groups: Carney, Meddis and Hemmert, is presented here. The aim is to find out which model
Analysis of dynamics of vocal tract system using zero time windowing method
Introduction to the research problem: Speech signals are the output of a dynamic production mechanism varying continuously with time. The process of speech production is dictated by the linguistic and
Auditory-based speech processing based on the average localized synchrony detection
  • A. Ali, J. V. Spiegel, P. Mueller
  • Computer Science
    2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
  • 2000
TLDR
A new auditory-based speech processing system based on the biologically rooted property of average localized synchrony detection (ALSD), which detects periodicity in the speech signal at Bark-scaled frequencies while reducing the response's spurious peaks and sensitivity to implementation mismatches, presents a consistent and robust representation of the formants.
Robust auditory-based speech processing using the average localized synchrony detection
A new auditory-based speech processing system based on the biologically rooted property of the average localized synchrony detection (ALSD) is proposed. The system detects periodicity in the speech
Robust classification of stop consonants using auditory-based speech processing
  • A. Ali, J. V. Spiegel, P. Mueller
  • Computer Science
    2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)
  • 2001
TLDR
The results showed a consistent improvement of 3% in the place detection over the generalized synchrony detector system under identical circumstances on clean and noisy speech, illustrating the superior ability of the ALSD to suppress the spurious peaks and produce a consistent and robust formant (peak) representation.
A PERCEPTUAL REPRESENTATION OF AUDIO
TLDR
A transformation of sound into a representation with properties specifically oriented towards simulations of source separation is presented, along with an explanation of how the principles of source separation will be applied as the next step towards a fully functional source separator.

References

SHOWING 1-10 OF 91 REFERENCES
Measurement of pitch in speech: an implementation of Goldstein's theory of pitch perception.
TLDR
A harmonics sieve was introduced to determine whether components are rejected or accepted at a candidate pitch, and a simple criterion, based on the components accepted and rejected, led to the decision on which candidate pitch was to be finally selected.
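The accept/reject idea in this summary can be illustrated with a minimal sketch. It assumes a list of already-measured component frequencies and a short list of candidate pitches; the function names, the 4% tolerance, and the scoring rule are hypothetical simplifications chosen for illustration, not the criterion used in the paper.

```python
def sieve_score(components_hz, candidate_hz, tolerance=0.04):
    """Score one candidate pitch by passing components through a harmonic sieve.

    A component is accepted when it lies within a relative tolerance of an
    integer multiple of the candidate; otherwise it is rejected.
    The scoring rule below is an illustrative assumption.
    """
    filled = set()  # harmonic numbers whose sieve mesh caught a component
    for f in components_hz:
        h = round(f / candidate_hz)  # nearest harmonic number
        if h >= 1 and abs(f - h * candidate_hz) <= tolerance * h * candidate_hz:
            filled.add(h)
    if not filled:
        return 0.0
    # Reward accepted components; penalize rejected components and sieves
    # that need a high harmonic number (this discourages subharmonic candidates).
    return len(filled) / (len(components_hz) + max(filled))


def pick_pitch(components_hz, candidates_hz, tolerance=0.04):
    """Return the candidate pitch with the highest sieve score."""
    return max(candidates_hz, key=lambda c: sieve_score(components_hz, c, tolerance))
```

For slightly mistuned harmonics of 200 Hz, such as `[201.0, 399.0, 602.0, 797.0]`, the candidate 200 Hz accepts every component in its first four meshes, while 100 Hz leaves half of its meshes empty and 400 Hz rejects two components.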
“Periodicity” Pitch and “Place” Pitch
In the year of the founding of the Acoustical Society of America, Wever and Bray observed periodic “volleys” of impulses in the auditory nerve. Since that time, the theory of pitch
On the use of autocorrelation analysis for pitch detection
TLDR
Several types of (nonlinear) preprocessing which can be used to effectively spectrally flatten the speech signal are presented and an algorithm for adaptively choosing a frame size for an autocorrelation pitch analysis is discussed.
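As a rough illustration of the approach summarized here, the sketch below applies center clipping, one widely used nonlinear way to flatten the spectrum, before a plain autocorrelation peak search. The function names, the 30% clipping ratio, and the 60 to 400 Hz search range are assumptions made for this example; the adaptive frame-size selection mentioned in the summary is not covered.

```python
import math


def center_clip(frame, ratio=0.3):
    """Zero samples below ratio * peak and shift the rest toward zero."""
    peak = max(abs(x) for x in frame)
    threshold = ratio * peak
    clipped = []
    for x in frame:
        if x > threshold:
            clipped.append(x - threshold)
        elif x < -threshold:
            clipped.append(x + threshold)
        else:
            clipped.append(0.0)
    return clipped


def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Returns None when no positive peak is found (e.g. a silent frame).
    """
    clipped = center_clip(frame)
    n = len(clipped)
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, min(lag_max, n - 1) + 1):
        value = sum(clipped[i] * clipped[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None
```

Center clipping removes much of the low-amplitude formant structure, so the autocorrelation peak at the pitch period tends to stand out more clearly than it would for the raw waveform.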
Speech coding in the auditory nerve: II. Processing schemes for vowel-like sounds.
  • B. Delgutte
  • Mathematics, Medicine
    The Journal of the Acoustical Society of America
  • 1984
Several processing schemes by which phonetically important information for vowels can be extracted from responses of auditory-nerve fibers are analyzed. The schemes are based on power spectra of
Speech coding in the auditory nerve: I. Vowel-like sounds.
Discharge patterns of auditory-nerve fibers in anesthetized cats were recorded in response to a set of nine steady-state, two-formant vowels presented at 60 and 75 dB SPL. The largest components in
A computational model of filtering, detection, and compression in the cochlea
TLDR
This model cleanly separates these effects into time-invariant linear filtering based on a simple cascade/parallel filterbank network of second-order sections, plus transduction and compression based on half-wave rectification with a nonlinear coupled automatic gain control network.
Representation of speech-like sounds in the discharge patterns of auditory-nerve fibers.
TLDR
Results demonstrate that a conceptualization of some basic properties of responses to simple acoustic stimuli is useful in interpreting qualitatively how certain characteristics of speech-like sounds can be coded.
Verification of the optimal probabilistic basis of aural processing in pitch of complex tones.
TLDR
Optimum processor theory fully accounts for the multicomponent pitch data on the basis of similar errors in estimating component stimulus frequencies as reported earlier, thus providing further evidence for the optimum probabilistic basis of aural signal processing in pitch of complex tones.
A functional model of peripheral auditory system in speech processing
TLDR
This model reproduces, with a good approximation, the harmonic and transient behavior of the peripheral auditory system and allows one to study the transmission of acoustical information towards the higher-level brain centers.
On the “Residue” and Auditory Pitch Perception
The present chapter on the residue phenomenon is set up in an unorthodox way. Instead of giving a cursory review it presents the historical development in great detail. It is more a textbook on the