MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

@inproceedings{Dong2017MuseGANMS,
  title={MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment},
  author={Hao-Wen Dong and Wen-Yi Hsiao and Li-Chia Yang and Yi-Hsuan Yang},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2018}
}
Generating music has a few notable differences from generating images and videos. First, music is an art of time, necessitating a temporal model. Second, music is usually composed of multiple instruments/tracks, each with its own temporal dynamics, but collectively they unfold over time interdependently. Lastly, musical notes are often grouped into chords, arpeggios or melodies in polyphonic music, so imposing a chronological ordering on the notes is not naturally suitable. In this…
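
These properties are why such models typically operate on a multi-track piano-roll tensor rather than an ordered note-event sequence: all tracks share one time grid and can be generated jointly. A minimal sketch of the representation in Python; the track list and tensor shapes are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

# Hypothetical multi-track piano-roll: one binary matrix per track,
# with bars subdivided into fixed time steps (all shapes illustrative).
N_TRACKS, N_BARS, STEPS_PER_BAR, N_PITCHES = 5, 4, 48, 84
tracks = ["bass", "drums", "guitar", "piano", "strings"]

# pianoroll[t, b, s, p] == 1 means track t plays pitch p at step s of bar b.
pianoroll = np.zeros((N_TRACKS, N_BARS, STEPS_PER_BAR, N_PITCHES), dtype=np.uint8)

# A C-major triad held by the "piano" track for the first half of bar 0.
piano = tracks.index("piano")
for pitch in (36, 40, 43):  # offsets into the assumed 84-pitch range
    pianoroll[piano, 0, :24, pitch] = 1

# Because every track shares the same time grid, a GAN can generate all
# tracks jointly as one tensor instead of emitting one note event at a time.
print(pianoroll.shape)  # (5, 4, 48, 84)
```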

Music Generation Using Generative Adversarial Networks

A user study concludes that the music segments generated by the implemented system are not noise but are in fact musically pleasing.

Coarse-To-Fine Framework For Music Generation via Generative Adversarial Networks

Under such a two-step procedure, the chords generated in the first step form a basic framework of the music, which theoretically and practically improves melody generation in the second step.
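
A minimal sketch of such a coarse-to-fine pipeline, assuming two stand-in generators with hypothetical interfaces (layer sizes and the one-pitch-per-bar-step simplification are mine, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ChordGenerator(nn.Module):
    """Stage 1: map noise to a chord sequence (hypothetical stand-in)."""
    def __init__(self, z_dim=32, n_chords=24, seq_len=8):
        super().__init__()
        self.seq_len, self.n_chords = seq_len, n_chords
        self.net = nn.Linear(z_dim, seq_len * n_chords)

    def forward(self, z):
        logits = self.net(z).view(-1, self.seq_len, self.n_chords)
        return logits.argmax(dim=-1)            # one chord id per bar

class MelodyGenerator(nn.Module):
    """Stage 2: generate melody conditioned on the stage-1 chords."""
    def __init__(self, z_dim=32, n_chords=24, n_pitches=128):
        super().__init__()
        self.embed = nn.Embedding(n_chords, 16)
        self.gru = nn.GRU(16 + z_dim, 64, batch_first=True)
        self.head = nn.Linear(64, n_pitches)

    def forward(self, z, chords):
        cond = self.embed(chords)                           # (B, T, 16)
        zrep = z.unsqueeze(1).expand(-1, cond.size(1), -1)  # broadcast noise
        h, _ = self.gru(torch.cat([cond, zrep], dim=-1))
        return self.head(h).argmax(dim=-1)                  # pitch per step

z = torch.randn(2, 32)
chords = ChordGenerator()(z)            # coarse: harmonic skeleton first
melody = MelodyGenerator()(z, chords)   # fine: melody fills in the skeleton
```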

Quantized GAN for Complex Music Generation from Dance Videos

We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos. Our proposed framework takes dance video frames and…

Multi-Genre Music Transformer - Composing Full Length Musical Piece

The objective of the project is to implement a Multi-Genre Transformer that learns to produce music pieces through a more adaptive learning process, taking on the more challenging task in which the genre or form of the composition is also considered.
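
One common way to realize genre-aware generation, and a plausible reading of this project rather than its documented design, is to prepend a genre control token to the Transformer's event sequence. A minimal sketch with an assumed vocabulary; a real autoregressive decoder would also apply a causal attention mask:

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary: music-event tokens plus one control token per genre.
N_EVENTS, GENRES = 512, ["classical", "jazz", "pop", "rock"]
vocab_size = N_EVENTS + len(GENRES)
genre_id = {g: N_EVENTS + i for i, g in enumerate(GENRES)}

embed = nn.Embedding(vocab_size, 128)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(128, vocab_size)

# Prepending a genre token conditions every later prediction on the genre.
events = torch.randint(0, N_EVENTS, (1, 31))
tokens = torch.cat([torch.tensor([[genre_id["jazz"]]]), events], dim=1)
logits = head(encoder(embed(tokens)))   # next-token logits, genre-aware
print(logits.shape)  # torch.Size([1, 32, 516])
```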

Musical Composition Style Transfer via Disentangled Timbre Representations

This paper presents the first deep learning models for rearranging music of arbitrary genres that take a piece of polyphonic musical audio as input and predict its musical score as output, and it investigates disentanglement techniques such as adversarial training to separate latent factors.
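
Adversarial disentanglement of this kind is often implemented with an auxiliary classifier that tries to recover the unwanted factor (here, timbre/instrument identity) from the latent code while the encoder learns to defeat it. A minimal sketch using the gradient-reversal trick, with assumed feature and class dimensions; this is one common realization, not necessarily the paper's exact setup:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on backward."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg()

# Assumed dimensions: 256-d audio features, 64-d latent, 10 instrument classes.
encoder = nn.Linear(256, 64)        # stand-in for the paper's encoder
adversary = nn.Linear(64, 10)       # tries to read the instrument from z
ce = nn.CrossEntropyLoss()

feats = torch.randn(8, 256)
instrument = torch.randint(0, 10, (8,))

z = encoder(feats)
# Training the adversary to classify instruments while reversing its gradient
# into the encoder pushes timbre information out of the latent code z.
adv_loss = ce(adversary(GradReverse.apply(z)), instrument)
adv_loss.backward()
```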

POP909: A Pop-song Dataset for Music Arrangement Generation

POP909 is a dataset containing multiple versions of the piano arrangements of 909 popular songs, created by professional musicians, with annotations of tempo, beat, key, and chords; the tempo curves are hand-labeled and the other annotations are produced by MIR algorithms.
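
For illustration only, a hypothetical in-memory view of one POP909 entry; the field names, types, and path are assumptions, not the dataset's actual file layout:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Pop909Song:
    """Hypothetical record for one song; fields mirror the annotations above."""
    song_id: int
    midi_path: str                            # multi-track piano arrangement
    tempo_curve: List[Tuple[float, float]]    # (time_sec, bpm), hand-labeled
    beats: List[float]                        # beat times, from MIR algorithms
    key: str                                  # e.g. "C:maj", MIR-derived
    chords: List[Tuple[float, float, str]]    # (start, end, chord label)

song = Pop909Song(
    song_id=1, midi_path="POP909/001/001.mid",
    tempo_curve=[(0.0, 72.0)], beats=[0.0, 0.83],
    key="C:maj", chords=[(0.0, 1.66, "C:maj")],
)
```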

MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE

Experiments show that MuseMorphose outperforms recurrent neural network-based baselines on numerous widely used metrics for style transfer tasks; the work brings Transformers and VAEs together to construct a single model that exhibits the strengths of both.

Musicality-Novelty Generative Adversarial Nets for Algorithmic Composition

A new model called the novelty game is presented to maximize the minimal distance between a machine-composed music sample and any human-composed music sample in a novelty space, where all well-known human-composed pieces are far from each other.
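
The objective reduces to a max-min game: push a generated sample's embedding away from its nearest human-composed neighbor in the novelty space. A minimal sketch, where the pre-computed embeddings and the Euclidean distance are assumptions:

```python
import torch

def novelty_loss(gen_embed, human_embeds):
    """Negative distance to the nearest human piece in novelty space.

    gen_embed:    (D,) embedding of a machine-composed sample
    human_embeds: (N, D) embeddings of human-composed pieces
    Minimizing this loss maximizes the minimal distance, i.e. novelty.
    """
    dists = torch.norm(human_embeds - gen_embed, dim=1)  # (N,)
    return -dists.min()

human = torch.randn(100, 16)       # assumed pre-computed novelty embeddings
generated = torch.randn(16, requires_grad=True)
loss = novelty_loss(generated, human)
loss.backward()   # gradient pushes away from the nearest human neighbor
```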

Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders

The empirical analysis on a large-scale benchmark shows that the model has higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders, and it is also able to create realistic interpolations between two musical sequences, smoothly changing the dynamics of the different tracks.
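
Such interpolations are typically produced by blending latent codes and decoding each blend; a minimal sketch assuming a trained encoder/decoder pair (the `model.encode`/`model.decode` interface is hypothetical):

```python
import numpy as np

def interpolate_latents(z_a, z_b, n_steps=8):
    """Linear interpolation between two latent codes (endpoints included)."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return [(1 - a) * z_a + a * z_b for a in alphas]

# Hypothetical usage with a trained adversarial autoencoder:
#   z_a, z_b = model.encode(seq_a), model.encode(seq_b)
#   for z in interpolate_latents(z_a, z_b):
#       play(model.decode(z))   # dynamics change smoothly along the path
z_a, z_b = np.random.randn(32), np.random.randn(32)
path = interpolate_latents(z_a, z_b)
print(len(path), path[0].shape)  # 8 (32,)
```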

MuseBar: Alleviating Posterior Collapse in VAE Towards Music Generation

MuseBar is a cost-effective bar-wise regulation scheme that requires no extra parameters or additional modules and improves the quality of the latent space in terms of mutual information and Kullback–Leibler divergence.
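
A plausible reading of bar-wise regulation, sketched here under assumed shapes rather than the paper's exact schema, is to apply the VAE's KL term per bar instead of once per song, so the decoder cannot ignore individual bars' latents:

```python
import torch

def barwise_kl(mu, logvar):
    """KL(q || N(0, I)) computed separately for each bar, then averaged.

    mu, logvar: (batch, n_bars, latent_dim), one Gaussian posterior per bar.
    Regulating each bar's KL keeps every bar's latent informative and
    counteracts posterior collapse.
    """
    kl_per_bar = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    return kl_per_bar.mean()            # average over bars and batch

mu = torch.randn(4, 16, 32)             # 4 songs, 16 bars, 32-d latent per bar
logvar = torch.randn(4, 16, 32)
print(barwise_kl(mu, logvar))
```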
...

References

Showing 1–10 of 35 references.

MidiNet: A Convolutional Generative Adversarial Network for Symbolic-Domain Music Generation

This work proposes a novel conditional mechanism to exploit available prior knowledge, so that the model, a convolutional generative adversarial network (GAN), can generate melodies either from scratch, by following a chord sequence, or by conditioning on the melody of previous bars.
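
A minimal sketch of such a conditional mechanism: the generator accepts noise plus optional chord and previous-bar conditions, with zero vectors standing in for absent conditions so one network covers all three modes. Shapes and layers are illustrative, not MidiNet's actual convolutional architecture:

```python
import torch
import torch.nn as nn

class ConditionalBarGenerator(nn.Module):
    """Noise + optional chord / previous-bar conditions -> one bar."""
    def __init__(self, z_dim=100, chord_dim=13, bar_shape=(16, 128)):
        super().__init__()
        self.chord_dim, self.bar_shape = chord_dim, bar_shape
        bar_dim = bar_shape[0] * bar_shape[1]
        self.net = nn.Sequential(
            nn.Linear(z_dim + chord_dim + bar_dim, 512),
            nn.ReLU(),
            nn.Linear(512, bar_dim),
            nn.Sigmoid(),               # binary piano-roll activations
        )

    def forward(self, z, chord=None, prev_bar=None):
        b = z.size(0)
        # Zero vectors stand in for "no condition" -> generation from scratch.
        if chord is None:
            chord = torch.zeros(b, self.chord_dim)
        if prev_bar is None:
            prev_bar = torch.zeros(b, *self.bar_shape)
        x = torch.cat([z, chord, prev_bar.flatten(1)], dim=1)
        return self.net(x).view(b, *self.bar_shape)

g = ConditionalBarGenerator()
bar1 = g(torch.randn(2, 100))                  # from scratch
bar2 = g(torch.randn(2, 100), prev_bar=bar1)   # conditioned on previous bar
```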

Deep Learning Techniques for Music Generation - A Survey

This paper surveys and analyzes different ways of using deep learning (deep artificial neural networks) to generate musical content, based on many existing deep learning systems for music generation selected from the relevant literature.

DeepBach: a Steerable Model for Bach Chorales Generation

DeepBach, a graphical model aimed at modeling polyphonic music and specifically hymn-like pieces, is introduced, which is capable of generating highly convincing chorales in the style of Bach.

Song From PI: A Musically Plausible Network for Pop Music Generation

We present a novel framework for generating pop music. Our model is a hierarchical Recurrent Neural Network, where the layers and the structure of the hierarchy encode our prior knowledge about how pop music is composed.
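
A toy sketch of how a hierarchy can encode such prior knowledge, with accompaniment RNNs conditioned on a melody RNN's states; the specific layer roles here are assumptions for illustration, not the paper's exact design:

```python
import torch
import torch.nn as nn

class HierarchicalPopRNN(nn.Module):
    """Toy hierarchy: a melody RNN drives chord and drum RNNs beneath it."""
    def __init__(self, n_pitches=128, n_chords=24, n_drums=32, hid=64):
        super().__init__()
        self.melody = nn.GRU(n_pitches, hid, batch_first=True)
        self.chords = nn.GRU(hid, hid, batch_first=True)  # sees melody state
        self.drums = nn.GRU(hid, hid, batch_first=True)
        self.to_pitch = nn.Linear(hid, n_pitches)
        self.to_chord = nn.Linear(hid, n_chords)
        self.to_drum = nn.Linear(hid, n_drums)

    def forward(self, melody_onehot):
        m, _ = self.melody(melody_onehot)   # (B, T, hid)
        c, _ = self.chords(m)               # accompaniment follows the melody
        d, _ = self.drums(m)
        return self.to_pitch(m), self.to_chord(c), self.to_drum(d)

x = torch.zeros(1, 32, 128)                 # 32 steps of one-hot melody input
pitch_logits, chord_logits, drum_logits = HierarchicalPopRNN()(x)
```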

Music transcription modelling and composition using deep learning

This work builds and trains LSTM networks on approximately 23,000 music transcriptions expressed in a high-level vocabulary (ABC notation) and uses them to generate new transcriptions, creating music transcription models useful in particular contexts of music composition.
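
Since ABC notation is plain text, such models can be trained as character-level language models. A minimal sketch with a toy corpus; all sizes are illustrative:

```python
import torch
import torch.nn as nn

# A tiny ABC excerpt; a real corpus would hold ~23,000 transcriptions.
corpus = "X:1\nT:Example\nK:D\n|:DFA d2A|B2G A2F:|\n"
vocab = sorted(set(corpus))
idx = {ch: i for i, ch in enumerate(vocab)}

class CharLSTM(nn.Module):
    def __init__(self, n_chars, hid=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, 32)
        self.lstm = nn.LSTM(32, hid, batch_first=True)
        self.head = nn.Linear(hid, n_chars)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

model = CharLSTM(len(vocab))
x = torch.tensor([[idx[c] for c in corpus[:-1]]])   # input characters
y = torch.tensor([[idx[c] for c in corpus[1:]]])    # next-character targets
logits, _ = model(x)
loss = nn.CrossEntropyLoss()(logits.flatten(0, 1), y.flatten())
loss.backward()   # train to predict the next ABC character
```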

Temporal Generative Adversarial Nets with Singular Value Clipping

A generative model that can learn a semantic representation of unlabeled videos and is capable of generating videos is proposed, along with a novel method to train it stably in an end-to-end manner.
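
The clipping named in the title can be sketched directly: after parameter updates, project each weight matrix so its largest singular value is at most 1, keeping the discriminator approximately 1-Lipschitz. A minimal sketch of that projection step:

```python
import torch

@torch.no_grad()
def clip_singular_values(weight, max_sv=1.0):
    """Project a weight matrix so its largest singular value is <= max_sv."""
    u, s, vh = torch.linalg.svd(weight, full_matrices=False)
    s = s.clamp(max=max_sv)
    weight.copy_(u @ torch.diag(s) @ vh)

w = torch.randn(64, 64) * 3.0
print(torch.linalg.matrix_norm(w, ord=2))   # spectral norm, initially > 1
clip_singular_values(w)
print(torch.linalg.matrix_norm(w, ord=2))   # now <= 1
```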

WaveNet: A Generative Model for Raw Audio

WaveNet, a deep neural network for generating raw audio waveforms, is introduced; it is shown that it can be efficiently trained on data with tens of thousands of samples per second of audio, and can be employed as a discriminative model, returning promising results for phoneme recognition.
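
WaveNet's core building block is the dilated causal convolution, which doubles the receptive field at each layer while never looking at future samples. A stripped-down sketch without WaveNet's gating, residual, and skip connections:

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution that only looks at past samples (left padding)."""
    def __init__(self, ch, dilation):
        super().__init__()
        self.pad = dilation    # (kernel_size - 1) * dilation, with kernel 2
        self.conv = nn.Conv1d(ch, ch, kernel_size=2, dilation=dilation)

    def forward(self, x):
        return self.conv(nn.functional.pad(x, (self.pad, 0)))

# Dilations 1, 2, 4, ... grow the receptive field exponentially with depth.
layers = nn.Sequential(*[CausalConv1d(16, 2 ** i) for i in range(8)])
x = torch.randn(1, 16, 1024)    # (batch, channels, samples)
print(layers(x).shape)          # torch.Size([1, 16, 1024]), length preserved
```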

MoCoGAN: Decomposing Motion and Content for Video Generation

This work introduces a novel adversarial learning scheme utilizing both image and video discriminators, and shows that MoCoGAN allows one to generate videos with the same content but different motion, as well as videos with different content and the same motion.
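
The decomposition can be sketched as a latent code split into a content vector held fixed across a clip and per-frame motion vectors produced by an RNN; all dimensions here are illustrative:

```python
import torch
import torch.nn as nn

class MotionContentLatent(nn.Module):
    """Fixed content code + RNN-generated motion codes, one per frame."""
    def __init__(self, content_dim=64, motion_dim=16, noise_dim=16):
        super().__init__()
        self.content_dim, self.noise_dim = content_dim, noise_dim
        self.rnn = nn.GRU(noise_dim, motion_dim, batch_first=True)

    def forward(self, batch, n_frames):
        content = torch.randn(batch, self.content_dim)      # fixed per clip
        eps = torch.randn(batch, n_frames, self.noise_dim)  # per-frame noise
        motion, _ = self.rnn(eps)                           # (B, T, motion_dim)
        content = content.unsqueeze(1).expand(-1, n_frames, -1)
        return torch.cat([content, motion], dim=-1)         # (B, T, 80)

z = MotionContentLatent()(batch=2, n_frames=16)
# Feeding z[:, t] to an image generator yields frame t; holding the content
# part fixed while resampling motion gives "same content, different motion".
print(z.shape)  # torch.Size([2, 16, 80])
```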

MorpheuS: Generating Structured Music with Constrained Patterns and Tension

The MorpheuS music generation system presented here can generate polyphonic pieces with a given tension profile and long- and short-term repeated pattern structures, which is particularly useful in a game or film music context.

A Neural Parametric Singing Synthesizer

A new model for singing synthesis based on a modified version of the WaveNet architecture is presented, which allows conveniently modifying pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and generation times.