AI Song Contest: Human-AI Co-Creation in Songwriting

Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie J. Cai
Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song with AI, the challenges they faced, and how they leveraged and repurposed existing characteristics… 

I Keep Counting: An Experiment in Human/AI Co-creative Songwriting
Musical co-creativity aims at making humans and computers collaborate to compose music. As an MIR team in computational musicology, we experimented with co-creativity when writing our entry to the
Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit
Digital Audio Workstations (DAWs) such as Audacity, ProTools and GarageBand are the primary platforms used in the recording, editing, mixing, and producing of sound art, including music. Making deep
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
This paper attempts to provide an overview of various composition tasks under different music generation levels, covering most of the currently popular music generation tasks using deep learning.
An interactive music infilling interface for pop music composition
This work builds a Max patch for an interactive music infilling application with different levels of control, including track density/polyphony/occupation rate and bar tonal tension control.
What to Play and How to Play it: Guiding Generative Music Models with Multiple Demonstrations
This work proposes and evaluates an approach to incorporating multiple user-provided inputs, each demonstrating a complementary set of musical characteristics, to guide the output of a generative model for synthesizing short music performances or loops, and argues for the interaction paradigm of mapping by demonstration as a promising approach to working with deep learning models.
AI as Social Glue: Uncovering the Roles of Deep Generative AI during Social Music Composition
The findings reveal that AI may play important roles in influencing human social dynamics during creativity, including: 1) implicitly seeding a common ground at the start of collaboration, 2) acting as a psychological safety net in creative risk-taking, 3) providing a force for group progress, 4) mitigating interpersonal stalling and friction, and 5) altering users’ collaborative and creative roles.
Wordcraft: Story Writing With Large Language Models
This work built Wordcraft, a text editor in which users collaborate with a generative language model to write a story, and shows that large language models enable novel co-writing experiences.
Compositional Steering of Music Transformers
This paper builds on lightweight fine-tuning methods, such as prefix tuning and bias tuning, to propose a novel contrastive loss that enables us to steer music transformers over arbitrary combinations of logical features, with a relatively small number of extra parameters.
Artsheets for Art Datasets
A checklist of questions customized for use with art datasets is provided in order to help guide assessment of the ways that dataset design may either perpetuate or shift exclusions found in repositories of art data.
Image Co-Creation by Non-Programmers and Generative Adversarial Networks
A new course for non-programmer MA students in human-computer interaction, aimed at training them to author content using generative models, finds that students met their creative needs mostly by exploring at the dataset level (as opposed to the model architecture).


Novice-AI Music Co-Creation via AI-Steering Tools for Deep Generative Models
AI-steering tools were developed that not only increased users' trust, control, comprehension, and sense of collaboration with the AI, but also contributed to a greater sense of self-efficacy and ownership of the composition relative to the AI.
Approachable Music Composition with Machine Learning at Scale
The first AI-powered Google Doodle, the Bach Doodle, lets users create their own melody and have it harmonized in the style of Bach by the machine learning model Coconet, through a simplified sheet-music-based interface.
Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions
A Pop Music Transformer is built that composes pop piano music with better rhythmic structure than existing Transformer models, by improving the way a musical score is converted into the data fed to a Transformer model.
Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset
By using notes as an intermediate representation, a suite of models capable of transcribing, composing, and synthesizing audio waveforms with coherent musical structure on timescales spanning six orders of magnitude is trained, a process the authors call Wave2Midi2Wave.
Conditional LSTM-GAN for Melody Generation from Lyrics
A novel deep generative model, a conditional Long Short-Term Memory (LSTM)–Generative Adversarial Network for melody generation from lyrics, is proposed; it contains a deep LSTM generator and a deep LSTM discriminator, both conditioned on lyrics.
BandNet: A Neural Network-based, Multi-Instrument Beatles-Style MIDI Music Composition Machine
A recurrent neural network (RNN)-based MIDI music composition machine that is able to learn musical knowledge from existing Beatles' songs and generate music in the style of the Beatles with little human intervention is proposed.
Learning to Generate Music With Sentiment
A generative deep learning model that can be directed to compose music with a given sentiment; it also obtains good prediction accuracy when used for sentiment analysis of symbolic music.
Magenta Studio: Augmenting Creativity with Deep Learning in Ableton Live
The field of Musical Metacreation (MuMe) has produced impressive results for both autonomous and interactive creativity, recently aided by modern deep learning frameworks. However, there are few
Counterpoint by Convolution
The model is an instance of orderless NADE, which allows more direct ancestral sampling; however, the authors find that Gibbs sampling greatly improves sample quality, which they demonstrate to be due to some conditional distributions being poorly modeled.
Music Transformer
It is demonstrated that a Transformer with the modified relative attention mechanism can generate minute-long compositions with compelling structure, generate continuations that coherently elaborate on a given motif, and in a seq2seq setup generate accompaniments conditioned on melodies.