Creative Text-to-Audio Generation via Synthesizer Programming

Singh*, N., Cherep*, M., & Shand, J. (2023, December). Creative Text-to-Audio Generation via Synthesizer Programming. In NeurIPS Machine Learning for Audio Workshop.

Abstract

Sound designers have long harnessed the power of abstraction to distill and highlight the semantic essence of real-world auditory phenomena, akin to how simple sketches can vividly convey visual concepts. However, current neural audio synthesis methods lean heavily toward capturing acoustic realism. We introduce a novel open-source method centered on meaningful abstraction. Our approach takes a text prompt and iteratively refines the parameters of a virtual modular synthesizer to produce sounds with high semantic alignment, as predicted by a pretrained audio-language model. Our results underscore the distinctiveness of our method compared with both real recordings and state-of-the-art generative models.
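The loop described in the abstract amounts to black-box optimization: score a candidate parameter setting with an audio-language model, then nudge the parameters toward higher alignment. The sketch below is purely illustrative and is not the authors' implementation: the `semantic_alignment` function stands in for a pretrained audio-language model's text-audio similarity score (replaced here with a toy objective), and simple hill climbing stands in for whatever optimizer the paper uses.

```python
import random

def semantic_alignment(params, prompt):
    # Placeholder for a pretrained audio-language model's similarity score.
    # In the real method this would render audio from the synthesizer
    # parameters and compare it to the text prompt; here a toy quadratic
    # objective (hypothetical target setting) stands in for that score.
    target = [0.3, 0.7, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def refine(prompt, n_params=3, iters=500, step=0.1, seed=0):
    # Hill-climbing over normalized synth parameters in [0, 1]:
    # propose a small random perturbation, keep it if the score improves.
    rng = random.Random(seed)
    params = [rng.random() for _ in range(n_params)]
    best = semantic_alignment(params, prompt)
    for _ in range(iters):
        candidate = [min(1.0, max(0.0, p + rng.uniform(-step, step)))
                     for p in params]
        score = semantic_alignment(candidate, prompt)
        if score > best:
            params, best = candidate, score
    return params, best
```

Any scoring model exposing a text-audio similarity function could be dropped into `semantic_alignment`; the optimizer only needs scores, not gradients.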