Corpus ID: 175089

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription

@article{BoulangerLewandowski2012ModelingTD,
  title={Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription},
  author={Nicolas Boulanger-Lewandowski and Yoshua Bengio and Pascal Vincent},
  journal={arXiv: Learning},
  year={2012}
}
We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to… 
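At its core, the model is a recurrent network whose hidden state conditions a distribution estimator over each binary piano-roll frame. The sketch below is a rough illustration only, not the authors' code: it assumes PyTorch, substitutes independent Bernoulli outputs per pitch for the RBM/NADE conditional estimators used in the paper, and uses illustrative names and sizes (RNNPianoRollModel, NUM_PITCHES, HIDDEN_SIZE).

import torch
import torch.nn as nn

# Illustrative constants (not from the paper): 88 piano keys, arbitrary hidden size.
NUM_PITCHES = 88
HIDDEN_SIZE = 128

class RNNPianoRollModel(nn.Module):
    """Hypothetical sketch: the RNN state parameterizes the distribution of the next frame."""

    def __init__(self, num_pitches=NUM_PITCHES, hidden_size=HIDDEN_SIZE):
        super().__init__()
        self.rnn = nn.GRU(num_pitches, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, num_pitches)  # logits, one per pitch

    def forward(self, frames):
        # frames: (batch, time, num_pitches) binary piano roll
        h, _ = self.rnn(frames)
        return self.out(h)  # logits predicting the following frame at each step

    def neg_log_likelihood(self, frames):
        # Predict frame t+1 from frames 1..t; independent Bernoulli per pitch
        # stands in for the paper's RBM/NADE conditional estimators.
        logits = self.forward(frames[:, :-1])
        return nn.functional.binary_cross_entropy_with_logits(
            logits, frames[:, 1:], reduction="mean")

    @torch.no_grad()
    def sample(self, steps, seed_frame):
        # Autoregressive generation: sample a frame, feed it back in.
        frame, state, rolls = seed_frame.view(1, 1, -1), None, []
        for _ in range(steps):
            h, state = self.rnn(frame, state)
            frame = torch.bernoulli(torch.sigmoid(self.out(h)))
            rolls.append(frame.squeeze(1))
        return torch.stack(rolls, dim=1)  # (1, steps, num_pitches)

if __name__ == "__main__":
    model = RNNPianoRollModel()
    roll = torch.bernoulli(torch.full((4, 32, NUM_PITCHES), 0.05))  # toy piano-roll data
    loss = model.neg_log_likelihood(roll)
    loss.backward()
    print(loss.item(), model.sample(steps=16, seed_frame=roll[0, 0]).shape)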

Citations

High-dimensional sequence transduction
TLDR
A probabilistic model based on a recurrent neural network that is able to learn realistic output distributions given the input is introduced and an efficient algorithm to search for the global mode of that distribution is devised.
Coupled Recurrent Models for Polyphonic Music Composition
TLDR
This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music, and borrows ideas from both convolutional and recurrent neural models for the conditional distributions.
Polyphonic Music Generation by Modeling Temporal Dependencies Using a RNN-DBN
TLDR
The technique, RNN-DBN, combines the memory state of an RNN, which supplies temporal information, with a multi-layer DBN, which provides a high-level representation of the data, making it well suited to sequence generation.
A hybrid recurrent neural network for music transcription
TLDR
This work uses recurrent neural networks and their variants as music language models, presents a generative architecture for combining these models with predictions from a frame-level acoustic classifier, and compares different neural network architectures for acoustic modeling.
Polyphonic Music Sequence Transduction with Meter-Constrained LSTM Networks
TLDR
This paper proposes a new method to post-process the output of a multi-pitch detection model using recurrent neural networks, and shows that using musically-relevant time steps improves system performance despite the choice of a basic representation.
A Dual Classification Approach to Music Language Modeling
TLDR
An original architecture is introduced that poses the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation as a dual-classification task rather than one with a multimodal probability distribution.
Modelling Symbolic Music: Beyond the Piano Roll
TLDR
A representation that reduces polyphonic music to a univariate categorical sequence is introduced, enabling the application of state-of-the-art natural language processing techniques, namely the long short-term memory sequence model.
Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music
TLDR
The proposed approach considers music-theory concepts such as transposition and uses data transformations to introduce semantic meaning and improve the quality of the generated melodies, evaluated by measuring the tonality of the musical compositions.
Rethinking Recurrent Latent Variable Model for Music Composition
TLDR
This work presents a model for capturing musical features and creating novel sequences of music, called the Convolutional-Variational Recurrent Neural Network, which uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music.
Generating Polyphonic Music Using Tied Parallel Networks
TLDR
A neural network architecture is presented that enables prediction and composition of polyphonic music in a manner preserving the translation invariance of the dataset; it attains high performance at a musical prediction task and creates note sequences that possess measure-level musical structure.

References

Showing 1-10 of 29 references
Finding temporal structure in music: blues improvisation with LSTM recurrent networks
D. Eck, J. Schmidhuber. Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002.
TLDR
Long short-term memory (LSTM) has succeeded in similar domains where other RNNs have failed, such as timing and counting and the learning of context sensitive languages, and it is shown that LSTM is also a good mechanism for learning to compose music.
Probabilistic models for melodic prediction
A Discriminative Model for Polyphonic Piano Transcription
TLDR
A discriminative model for polyphonic piano transcription is presented; a frame-level transcription accuracy of 68% is achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
Neural Network Music Composition by Prediction: Exploring the Benefits of Psychoacoustic Constraints and Multi-scale Processing
M. Mozer. Connect. Sci., 1994.
TLDR
An extension of this transition-table approach is described, using a recurrent autopredictive connectionist network called CONCERT that is trained on a set of pieces with the aim of extracting stylistic regularities, and that incorporates psychologically grounded representations of pitch, duration and harmonic structure.
A Classification-Based Polyphonic Piano Transcription Approach Using Learned Feature Representations
TLDR
This paper applies deep belief networks to musical data, evaluates the learned feature representations on classification-based polyphonic piano transcription, and suggests a way of training classifiers jointly for multiple notes to improve training speed and classification performance.
Polyphonic music modeling with random fields
TLDR
This paper discusses an application of random fields to the problem of creating accurate yet flexible statistical models of polyphonic music, and shows that random fields not only outperform Markov chains but are also much more robust against overfitting.
Bayesian Music Transcription
TLDR
The aim of this thesis is to integrate this vast amount of prior knowledge in a consistent and transparent computational framework and to demonstrate the feasibility of such an approach in moving us closer to a practical solution to music transcription.
Learning Multilevel Distributed Representations for High-Dimensional Sequences
TLDR
A new family of non-linear sequence models that are substantially more powerful than hidden Markov models or linear dynamical systems is described, and their performance is demonstrated using synthetic video sequences of two balls bouncing in a box.
Pitch Detection in Polyphonic Music using Instrument Tone Models
Yipeng Li, Deliang Wang. 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007.
TLDR
A hidden Markov model (HMM) based system is proposed to detect the pitch of an instrument in polyphonic music using an instrument tone model, together with a hypothesis selection method that chooses pitch hypotheses with sufficiently high salience as pitch candidates.
A hierarchy of recurrent networks for speech recognition
TLDR
This approach unifies RBM-based approaches for sequential data modeling and the Echo State Network, a powerful approach for black-box system identification.