Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
@article{BoulangerLewandowski2012ModelingTD,
  title   = {Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription},
  author  = {Nicolas Boulanger-Lewandowski and Yoshua Bengio and Pascal Vincent},
  journal = {arXiv: Learning},
  year    = {2012}
}
We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to…
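To make "distribution estimators conditioned on a recurrent neural network" concrete, below is a minimal sketch in PyTorch. It is not the paper's model: the paper conditions an RBM or NADE on the RNN state, whereas this illustration simplifies the per-timestep conditional to a factorized Bernoulli over 88 binary piano-roll pitches. All names here (PianoRollRNN, N_PITCHES) are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (assumption: factorized Bernoulli conditional instead of the
# paper's RBM/NADE conditional). An RNN hidden state parameterizes the
# distribution over the next binary piano-roll frame at each time step.
import torch
import torch.nn as nn

N_PITCHES = 88  # piano-roll dimensionality (one binary unit per key; illustrative)


class PianoRollRNN(nn.Module):
    def __init__(self, hidden_size=128):
        super().__init__()
        self.rnn = nn.GRU(N_PITCHES, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, N_PITCHES)

    def forward(self, rolls):
        # rolls: (batch, time, N_PITCHES) binary piano-roll; predict frame t+1
        # from frames up to t, so shift inputs and targets by one step.
        inputs, targets = rolls[:, :-1], rolls[:, 1:]
        hidden, _ = self.rnn(inputs)
        logits = self.out(hidden)
        # Negative log-likelihood of the factorized next-frame distribution.
        return nn.functional.binary_cross_entropy_with_logits(logits, targets)

    @torch.no_grad()
    def sample(self, steps=64):
        # Generate a piano roll by ancestral sampling, one frame at a time.
        frame = torch.zeros(1, 1, N_PITCHES)
        state = None
        frames = []
        for _ in range(steps):
            hidden, state = self.rnn(frame, state)
            probs = torch.sigmoid(self.out(hidden))
            frame = torch.bernoulli(probs)
            frames.append(frame.squeeze(0))
        return torch.cat(frames, dim=0)  # (steps, N_PITCHES)


if __name__ == "__main__":
    model = PianoRollRNN()
    fake_batch = torch.bernoulli(torch.full((4, 32, N_PITCHES), 0.05))
    loss = model(fake_batch)
    loss.backward()
    print("NLL per frame/pitch:", loss.item())
    print("sampled roll shape:", tuple(model.sample(16).shape))
```

The design choice that the paper targets, and that this simplification gives up, is modeling correlations between simultaneous notes: a factorized Bernoulli treats pitches as conditionally independent given the RNN state, whereas an RBM or NADE conditional can represent multimodal chord distributions.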
601 Citations
High-dimensional sequence transduction
- Computer Science · 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2013
A probabilistic model based on a recurrent neural network is introduced that learns realistic output distributions given the input, and an efficient algorithm is devised to search for the global mode of that distribution.
Coupled Recurrent Models for Polyphonic Music Composition
- Computer Science · ISMIR
- 2019
This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music, and borrows ideas from both convolutional and recurrent neural models for the conditional distributions.
Polyphonic Music Generation by Modeling Temporal Dependencies Using a RNN-DBN
- Computer Science · ICANN
- 2014
The technique, RNN-DBN, combines the memory state of an RNN, which provides temporal information, with a multi-layer DBN, which provides a high-level representation of the data, making it well suited to sequence generation.
A hybrid recurrent neural network for music transcription
- Computer Science · 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2015
This work uses recurrent neural networks and their variants as music language models, presents a generative architecture for combining these models with the predictions of a frame-level acoustic classifier, and compares different neural network architectures for acoustic modeling.
Polyphonic Music Sequence Transduction with Meter-Constrained LSTM Networks
- Computer Science · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
This paper proposes a new method to post-process the output of a multi-pitch detection model using recurrent neural networks, and shows that using musically relevant time steps improves system performance despite the choice of a basic representation.
A Dual Classification Approach to Music Language Modeling
- Computer Science
- 2016
An original architecture is introduced that poses the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation as a dual-classification task rather than as the estimation of a multimodal probability distribution.
Modelling Symbolic Music: Beyond the Piano Roll
- Computer Science · ACML
- 2016
A representation is introduced which reduces polyphonic music to a univariate categorical sequence, making it possible to apply state-of-the-art natural language processing techniques, namely the long short-term memory sequence model.
Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music
- Computer Science · ArXiv
- 2020
The proposed approach incorporates music theory concepts such as transposition and uses data transformations to introduce semantic meaning and improve the quality of the generated melodies, with quality assessed by measuring the tonality of the resulting compositions.
Rethinking Recurrent Latent Variable Model for Music Composition
- Computer Science · 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)
- 2018
This work presents a model for capturing musical features and creating novel sequences of music, the Convolutional-Variational Recurrent Neural Network, which uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music.
Generating Polyphonic Music Using Tied Parallel Networks
- Computer Science · EvoMUSART
- 2017
A neural network architecture is presented which enables prediction and composition of polyphonic music in a manner that preserves the translation invariance of the dataset; it attains high performance on a musical prediction task and successfully creates note sequences that possess measure-level musical structure.
References
SHOWING 1-10 OF 29 REFERENCES
Finding temporal structure in music: blues improvisation with LSTM recurrent networks
- Computer Science · Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing
- 2002
Long short-term memory (LSTM) has succeeded in domains where other RNNs have failed, such as timing and counting and the learning of context-sensitive languages, and it is shown that LSTM is also a good mechanism for learning to compose music.
A Discriminative Model for Polyphonic Piano Transcription
- Computer Science · EURASIP J. Adv. Signal Process.
- 2007
A discriminative model for polyphonic piano transcription is presented; a frame-level transcription accuracy of 68% is achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
Neural Network Music Composition by Prediction: Exploring the Benefits of Psychoacoustic Constraints and Multi-scale Processing
- Computer Science · Connect. Sci.
- 1994
An extension of the transition-table approach is described, using a recurrent autopredictive connectionist network called CONCERT, which is trained on a set of pieces with the aim of extracting stylistic regularities and incorporates psychologically grounded representations of pitch, duration and harmonic structure.
A Classification-Based Polyphonic Piano Transcription Approach Using Learned Feature Representations
- Computer Science · ISMIR
- 2011
This paper applies deep belief networks to musical data, evaluates the learned feature representations on classification-based polyphonic piano transcription, and suggests a way of training classifiers jointly for multiple notes to improve training speed and classification performance.
Polyphonic music modeling with random fields
- Computer Science · MULTIMEDIA '03
- 2003
This paper discusses an application of random fields to the problem of creating accurate yet flexible statistical models of polyphonic music, and shows that random fields not only outperform Markov chains but are also much more robust to overfitting.
Bayesian Music Transcription
- Computer Science
- 1997
The aim of this thesis is to integrate the vast amount of prior musical knowledge into a consistent and transparent computational framework and to demonstrate the feasibility of such an approach in moving closer to a practical solution to music transcription.
Learning Multilevel Distributed Representations for High-Dimensional Sequences
- Computer Science · AISTATS
- 2007
A new family of non-linear sequence models that are substantially more powerful than hidden Markov models or linear dynamical systems are described, and their performance is demonstrated using synthetic video sequences of two balls bouncing in a box.
Pitch Detection in Polyphonic Music using Instrument Tone Models
- Computer Science · 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
- 2007
A hidden Markov model (HMM) based system is proposed to detect the pitch of an instrument in polyphonic music, using an instrument tone model and a hypothesis selection method that chooses pitch hypotheses with sufficiently high salience as pitch candidates.
A hierarchy of recurrent networks for speech recognition
- Computer Science · NIPS 2009
- 2009
This approach unifies RBM-based methods for sequential data modeling with the Echo State Network, a powerful approach to black-box system identification.