• Corpus ID: 53842346

Making music through real-time voice timbre analysis: machine learning and timbral control

Dan Stowell
People can achieve rich musical expression through vocal sound – see for example human beatboxing, which achieves a wide timbral variety through a range of extended techniques. Yet the vocal modality is under-exploited as a controller for music systems. If we can analyse a vocal performance suitably in real time, then this information could be used to create voice-based interfaces with the potential for intuitive and fulfilling levels of expressive control. Conversely, many modern techniques… 
Vocal Control of Sound Synthesis Personalized by Unsupervised Machine Listening and Learning
A user-driven adaptive method to control the sonic response of digital musical instruments using information extracted from the timbre of the human voice is described; it employs a minimum-loss mapping that hides synthesizer-specific parameters and maps the vocal interaction directly to perceptual characteristics of the generated sound.
A Voice Interface for Sound Generators: adaptive and automatic mapping of gestures to sound
This paper proposes the exploitation of vocal gesture as an extension or alternative to traditional physical controllers, which uses dynamic aspects of vocal sound to control variations in the timbre of the synthesized sound.
Vocal imitation for query by vocalisation
The ability of musicians to vocalise synthesised and percussive sounds is investigated, and the suitability of different audio features for predicting the perceptual similarity between vocal imitations and imitated sounds is evaluated.
Hands-Free Accessible Digital Musical Instruments: Conceptual Framework, Challenges, and Perspectives
A modular and adaptable conceptual framework is discussed for the design of accessible digital musical instruments targeted at performers with motor impairments and the use of unconventional interaction channels, musical control mappings, multisensory feedback, design, evaluation, and adaptation are discussed.
VocalSketch: Vocally Imitating Audio Concepts
This work collects thousands of crowd-sourced vocal imitations of a large set of diverse sounds, along with data on the crowd's ability to correctly label these vocal imitations, to help the research community understand which audio concepts can be effectively communicated with this approach.
Paralinguistic mechanisms of production in human "beatboxing": a real-time magnetic resonance imaging study.
Real-time magnetic resonance imaging was used to examine mechanisms of sound production by an American male beatbox artist, offering insights into the paralinguistic use of phonetic primitives and the ways in which they are coordinated in this style of musical performance.
Vocal imitation of synthesised sounds varying in pitch, loudness and spectral centroid.
The results show that experienced musicians can imitate pitch, loudness, and spectral centroid accurately, and that imitation accuracy is generally preserved when the imitated stimuli combine two, non-necessarily congruent features.
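Spectral centroid, one of the three imitated features above, is simply the amplitude-weighted mean frequency of the magnitude spectrum. A minimal sketch (function name and frame setup are illustrative, not drawn from the paper):

```python
import numpy as np

def spectral_centroid(frame, sample_rate):
    """Spectral centroid of one audio frame: the amplitude-weighted
    mean frequency of the magnitude spectrum, in Hz."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    if spectrum.sum() == 0:
        return 0.0
    return float((freqs * spectrum).sum() / spectrum.sum())

# A pure tone puts (nearly) all its energy at one frequency, so its
# centroid sits there. 441 Hz * 2000 / 44100 = 20 full periods, so the
# frame contains a whole number of cycles and there is no leakage.
sr = 44100
t = np.arange(2000) / sr
tone = np.sin(2 * np.pi * 441 * t)
print(round(spectral_centroid(tone, sr)))  # → 441
```

For a real voice the centroid would be tracked frame by frame; brighter, noisier vocal timbres push it upward.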
A Self-Organizing Gesture Map for a Voice-Controlled Instrument Interface
An unsupervised offline method that learns how to reduce and map the gestural data to a generic instrument parameter control space, applied to the existing Voice-Controlled Interface for musical instruments, yielding appreciable usability improvements.
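The unsupervised gesture-to-parameter mapping idea can be illustrated with a toy self-organizing map: each node of a 2-D grid holds a weight vector in voice-feature space, and after training, the best-matching node's coordinates serve as a generic control value. All dimensions, schedules, and names below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def train_som(features, grid=(8, 8), epochs=30, seed=0):
    """Train a tiny self-organizing map over feature vectors."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, features.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)  # (rows, cols, 2)
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)                    # decaying rate
        sigma = max(rows, cols) / 2 * (1 - epoch / epochs) + 0.5
        for x in features:
            # best-matching unit = node whose weights are closest to x
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # neighbourhood pull: nodes near the BMU move toward x
            g = np.exp(-np.sum((coords - bmu) ** 2, axis=-1)
                       / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def to_control(x, weights):
    """Map one feature vector to its grid coordinate (the control value)."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

# Toy 'vocal features' (e.g. centroid, energy, flux), already normalised.
rng = np.random.default_rng(1)
feats = rng.random((200, 3))
w = train_som(feats)
print(to_control(feats[0], w))  # a (row, col) pair in the 8x8 grid
```

The appeal of this scheme is that the performer never sees synthesizer parameters: similar vocal gestures land on nearby grid nodes, so the learned map is smooth by construction.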
Interactive music: Balancing creative freedom with musical development.
This thesis is about interactive music—a musical experience that involves participation from the listener but is itself a composed piece of music—and the Interactive Music Systems (IMSs) that create…
Vocal Vibrations: A Multisensory Experience of the Voice
The motivations, the technical implementation, and the experience design of the Vocal Vibrations public installation are described, which encourages visitors to explore the physicality and expressivity of their voices in a rich musical context.


Timbre Space as a Musical Control Structure
A system for taking subjective measures of perceptual contrast between sound objects and using this data as input to computer programs for additive synthesis, allowing manipulation of the evolving spectral energy distribution and various temporal features of the tones.
Control parameters for musical instruments: a foundation for new mappings of gesture to sound
A new way of thinking about musical tones is described, specifically in the context of how features of a sound might be controlled by computer musicians, and how those features might be most appropriately mapped onto musical controllers.
Larynxophone: using voice as a wind controller
This paper presents a straightforward method to create convincing wind instrument audio tracks, avoiding intermediate MIDI layers and easing expression control by relying on the musician's ability to mimic, by singing or humming, the desired wind instrument performance.
Extending voice-driven synthesis to audio mosaicing
A system for controlling audio mosaicing with a voice signal, which can be interpreted as a further step in voice-driven sound synthesis; the system works in pseudo-realtime, analyzing voice input and sending control messages to the concatenative synthesis module.
A Smarter Way to Find Pitch
A fast, accurate and robust method for finding the continuous pitch in monophonic musical sounds using a special normalised version of the Squared Difference Function coupled with a peak picking algorithm is described.
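The normalised squared-difference approach can be sketched in a few lines. This is an illustrative simplification, not the paper's exact algorithm (the original adds parabolic peak interpolation and a tunable clarity threshold):

```python
import numpy as np

def nsdf_pitch(frame, sample_rate, fmin=60.0):
    """Estimate pitch via a normalised squared-difference function:
    nsdf[tau] = 2*r(tau) / m(tau), where r is the autocorrelation of
    the frame with its tau-shifted copy and m is the summed energy of
    the two compared segments. A simple peak pick then chooses the
    first strong maximum, guarding against octave errors."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    max_tau = int(sample_rate / fmin)
    nsdf = np.zeros(max_tau)
    for tau in range(1, max_tau):
        a, b = x[: n - tau], x[tau:]
        r = np.dot(a, b)
        m = np.dot(a, a) + np.dot(b, b)
        nsdf[tau] = 2 * r / m if m > 0 else 0.0
    # candidate lags: local maxima of the NSDF curve
    peaks = [t for t in range(2, max_tau - 1)
             if nsdf[t] > nsdf[t - 1] and nsdf[t] >= nsdf[t + 1]]
    if not peaks:
        return 0.0
    best = max(nsdf[t] for t in peaks)
    # first peak within 80% of the global best wins
    tau = next(t for t in peaks if nsdf[t] >= 0.8 * best)
    return sample_rate / tau

# 220.5 Hz at 44.1 kHz gives an exact 200-sample period.
sr = 44100
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 220.5 * t)
print(nsdf_pitch(frame, sr))  # → 220.5
```

Because the denominator normalises for energy, the function peaks near 1.0 at true period multiples regardless of signal level, which is what makes the subsequent peak picking robust.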
Exploring Vibrato-Motivated Acoustic Features for Singer Identification
  • T. Nwe, Haizhou Li
  • IEEE Transactions on Audio, Speech, and Language Processing, 2007
Acoustic features that reflect vibrato information are explored in order to identify singers of popular music, and an enhanced vocal detection method that allows vocal segments to be selected with high confidence is proposed.
Towards autonomous agents for live computer music : realtime machine listening and interactive music systems
A set of interactive systems developed for a range of musical styles and instruments, all of which attempt to participate in a concert by means of audio signal analysis alone are described, and the possibilities for future advances allied to artificial intelligence and signal processing technology are considered.
Vocal music is characterized by a melodically salient singing voice accompanied by one or more instruments. With a pitched instrument background, multiple periodicities are simultaneously present and…
We propose a real-time system which is targeted to music production in the context of vocal recordings. The aim is to transform the singer's voice characteristics in order to achieve a sweet sounding…
This work presents a novel framework for music synthesis, based on the perceptual structure analysis of pre-existing musical signals, for example taken from a personal MP3 database, and proposes novel applications such as music cross-synthesis or time-domain audio compression, enabled through simple sound similarity measures and clustering.