Conditional End-to-End Audio Transforms

@inproceedings{Haque2018ConditionalEA,
  title={Conditional End-to-End Audio Transforms},
  author={Albert Haque and Michelle Guo and Prateek Kishor Verma},
  booktitle={INTERSPEECH},
  year={2018}
}
  • Albert Haque, Michelle Guo, Prateek Kishor Verma
  • Published in INTERSPEECH 2018
  • Computer Science, Engineering
  • We present an end-to-end method for transforming audio from one style to another. [...] Architecturally, our method is a fully-differentiable sequence-to-sequence model based on convolutional and hierarchical recurrent neural networks. It is designed to capture long-term acoustic dependencies, requires minimal post-processing, and produces realistic audio transforms. Ablation studies confirm that our model can separate speaker and instrument properties from acoustic content at different receptive…

    Citations

    Publications citing this paper.

    SING: Symbol-to-Instrument Neural Generator

    Speech-to-Singing Conversion in an Encoder-Decoder Framework

    A Universal Music Translation Network

    Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

