Symbolic Music Genre Transfer with CycleGAN

  • Gino Brunner, Yuyi Wang, Roger Wattenhofer, Sumu Zhao
  • Published 20 September 2018
  • Computer Science
  • 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Deep generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) have recently been applied to style and domain transfer for images, and in the case of VAEs, music. […] In order to improve the fidelity of the transformed music, we add additional discriminators that cause the generators to keep the structure of the original music mostly intact, while still achieving strong genre transfer. Visual and audible results further show the potential of our approach…

Music Style Transfer with Vocals Based on CycleGAN
This paper extracts the CQT features and Mel spectrogram features of music, then uses CycleGAN to transfer the styles of the CQT-feature and Mel-spectrogram images, and finally realizes the style transfer of music.
A GAN Model With Self-attention Mechanism To Generate Multi-instruments Symbolic Music
A new GAN model with a self-attention mechanism, DMB-GAN, is proposed; it can extract more temporal features of music to generate multi-instrument music stably, and it introduces switchable normalization to stabilize network training.
Groove2Groove: One-Shot Music Style Transfer With Supervision From Synthetic Data
Groove2Groove is presented, a one-shot style transfer method for symbolic music, focusing on the case of accompaniment styles in popular music and jazz, using an encoder-decoder neural network for the task, along with a synthetic data generation scheme to supply it with parallel training examples.
Supervised Symbolic Music Style Translation Using Synthetic Data
This study focuses on symbolic music with the goal of altering the 'style' of a piece while keeping its original 'content', and develops the first fully supervised algorithm for this task.
ChordGAN: Symbolic Music Style Transfer with Chroma Feature Extraction
ChordGAN seeks to learn the rendering of harmonic structures into notes by embedding chroma feature extraction within the training process, and can be utilized as a tool for musicians to study compositional techniques for different styles using same chords and automatically generate music from lead sheets.
Crossing You in Style: Cross-modal Style Transfer from Music to Visual Arts
It is demonstrated that the proposed framework can generate diverse image style representations from a music piece, and these representations can unveil certain art forms of the same era.
Attributes-Aware Deep Music Transformation
This work proposes a novel method that enables attributes-aware music transformation from any set of musical annotations, without requiring complicated derivative implementation, and can provide explicit control over any continuous or discrete annotation.
An Unsupervised Methodology for Musical Style Translation
This paper presents an unsupervised methodology for musical style transfer, which is capable of translating the style of symbolic music from the source domain to the target domain while mostly preserving the content and structure of input data.
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
This paper attempts to provide an overview of various composition tasks under different music generation levels, covering most of the currently popular music generation tasks using deep learning.
Neural Symbolic Music Genre Transfer Insights
The preliminary results show that spectral normalization improves audible quality, while self-attention hurts content retention due to its non-locality.


MidiNet: A Convolutional Generative Adversarial Network for Symbolic-Domain Music Generation
This work proposes a novel conditional mechanism to exploit available prior knowledge, so that the model can generate melodies either from scratch, by following a chord sequence, or by conditioning on the melody of previous bars, making it a generative adversarial network (GAN).
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Three models for symbolic multi-track music generation under the framework of generative adversarial networks (GANs) are proposed; they differ in their underlying assumptions and, accordingly, their network architectures, and are referred to as the jamming model, the composer model, and the hybrid model.
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music
Hierarchical Variational Autoencoders for Music
This work develops recurrent variational autoencoders trained to reproduce short musical sequences, demonstrates their use as a creative device via both random sampling and data interpolation, and shows that scheduled sampling significantly improves reconstruction accuracy.
A Universal Music Translation Network
This method is based on a multi-domain wavenet autoencoder, with a shared encoder and a disentangled latent space that is trained end-to-end on waveforms, allowing it to translate even from musical domains that were not seen during training.
Modeling Temporal Tonal Relations in Polyphonic Music Through Deep Networks With a Novel Image-Based Representation
Experimental results show that the tonnetz representation produces musical sequences that are more tonally stable and contain more repeated patterns than sequences generated by pianoroll-based models, a finding that is directly useful for tackling current challenges in music and AI such as smart music generation.
Generating Polyphonic Music Using Tied Parallel Networks
A neural network architecture which enables prediction and composition of polyphonic music in a manner that preserves translation-invariance of the dataset and attains high performance at a musical prediction task and successfully creates note sequences which possess measure-level musical structure.
DeepBach: a Steerable Model for Bach Chorales Generation
DeepBach, a graphical model aimed at modeling polyphonic music and specifically hymn-like pieces, is introduced, which is capable of generating highly convincing chorales in the style of Bach.
StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation
A unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network, which leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain.
A First Look at Music Composition using LSTM Recurrent Neural Networks
Long Short-Term Memory is shown to be able to play the blues with good timing and proper structure as long as one is willing to listen, and once the network has found the relevant structure it does not drift from it.