Eric D. Scheirer

We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input. We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals, and combined them in several multidimensional classification frameworks. We …
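The abstract above describes feature-based speech/music discrimination. As a minimal sketch of what two such conceptually distinct features might look like (the feature names and formulas here are illustrative, not necessarily the 13 features the paper uses):

```python
import numpy as np

def zero_crossing_rate(frame):
    # Fraction of adjacent-sample sign changes in one frame.
    # Speech tends to show high frame-to-frame variance in this
    # measure (voiced/unvoiced alternation); music less so.
    signs = np.sign(frame)
    return float(np.mean(signs[:-1] != signs[1:]))

def spectral_centroid(frame, sr):
    # Amplitude-weighted mean frequency of the magnitude spectrum
    # ("brightness"), in Hz.
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    return float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))
```

In a classifier such per-frame features would be pooled over a window (e.g. mean and variance) and fed to a multidimensional decision framework such as a Gaussian classifier or nearest-neighbor rule.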
A method is presented for using a small number of bandpass filters and banks of parallel comb filters to analyze the tempo of, and extract the beat from, musical signals of arbitrary polyphonic complexity and containing arbitrary timbres. This analysis is performed causally, and can be used predictively to guess when beats will occur in the future. Results …
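The core idea of the comb-filter approach can be sketched in a few lines: drive a bank of feedback comb filters, one per candidate beat period, with an onset envelope, and pick the resonator that accumulates the most energy. This is a toy illustration, not the paper's actual filterbank (which uses delay-dependent gains and multiple subbands); `alpha` and the BPM range are arbitrary choices.

```python
import numpy as np

def comb_filter_tempo(onset_env, sr, alpha=0.9, bpm_range=(60, 180)):
    """Estimate tempo (BPM) from an onset envelope sampled at `sr` Hz.

    One feedback comb filter per candidate beat period (in samples);
    the filter whose delay matches the beat spacing resonates most.
    """
    min_delay = int(sr * 60.0 / bpm_range[1])
    max_delay = int(sr * 60.0 / bpm_range[0])
    best_delay, best_energy = None, -1.0
    for delay in range(min_delay, max_delay + 1):
        y = np.zeros(len(onset_env))
        for n in range(len(onset_env)):
            fb = alpha * y[n - delay] if n >= delay else 0.0
            y[n] = (1.0 - alpha) * onset_env[n] + fb
        energy = float(np.dot(y, y))
        if energy > best_energy:
            best_delay, best_energy = delay, energy
    return 60.0 * sr / best_delay
```

Because the filters are causal, the same resonator state can be read out at any moment, which is what allows the predictive "when will the next beat fall" use described in the abstract.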
When human listeners are confronted with musical sounds, they rapidly and automatically orient themselves in the music. Even musically untrained listeners have an exceptional ability to make rapid judgments about music from very short examples, such as determining the music’s style, performer, beat, complexity, and emotional impact. However, there are …
The MPEG-4 standard, which will be published in October 1998, contains extensive provisions for sound synthesis as well as traditional methods of audio compression. At the heart of MPEG-4 Structured Audio, the sound-synthesis framework, is a new music-synthesis language called SAOL. This language, based on the Music-N model, is specified and defined fully …
While previous generations of the MPEG multimedia standard have focused primarily on coding and transmission of content digitally sampled from the real world, MPEG-4 contains extensive support for structured, synthetic and synthetic/natural hybrid coding methods. An overview is presented of the “Structured Audio” and “AudioBIFS” components of MPEG-4, which …
A computer system is described which performs polyphonic transcription of known solo piano music by using high-level musical information to guide a signal-processing system. This process, which we term expressive performance extraction, maps a digital audio representation of a musical performance to a MIDI representation of the same performance using the …
The application of a new technique for sound-scene analysis to the segmentation of complex musical signals is presented. This technique operates by discovering common modulation behavior among groups of frequency subbands in the autocorrelogram domain. The algorithm can be demonstrated to locate perceptual events in time and frequency when it is executed on …
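The grouping step of such a scheme can be illustrated with a deliberately simplified stand-in: cluster subbands whose amplitude envelopes are strongly correlated. (The paper's method works on modulation behavior in the autocorrelogram domain; the greedy correlation grouping and the `threshold` value here are purely illustrative.)

```python
import numpy as np

def group_by_envelope_correlation(band_envelopes, threshold=0.8):
    """Greedily group subbands whose amplitude envelopes correlate
    above `threshold`. `band_envelopes` is an (n_bands, n_samples)
    array; returns a list of lists of band indices."""
    n = len(band_envelopes)
    corr = np.corrcoef(band_envelopes)  # n x n correlation matrix
    groups, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = [i]
        assigned.add(i)
        for j in range(i + 1, n):
            if j not in assigned and corr[i, j] >= threshold:
                group.append(j)
                assigned.add(j)
        groups.append(group)
    return groups
```

Bands that share a common modulator end up in one group, which is the cue the segmentation exploits: a new perceptual event announces itself as a change in which bands move together.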
The MPEG-4 standard defines numerous tools that represent the state-of-the-art in representation, transmission, and decoding of multimedia data. Among these is a new type of audio standard, termed “Structured Audio”. The MPEG-4 standard for structured audio allows for the efficient, flexible description of synthetic music and sound effects, and the use of …
We present an overview of the AudioBIFS system, part of the Binary Format for Scene Description (BIFS) tool in the MPEG-4 International Standard. AudioBIFS is the tool that integrates the synthetic and natural sound coding functions in MPEG-4. It allows the flexible construction of soundtracks and sound scenes using compressed sound, sound synthesis, …