Riffusion

Music-generating machine learning model
Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music from images of sound rather than directly from audio.[1] It was created by fine-tuning Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms.[1] The resulting model takes a text prompt and generates a spectrogram image file, which can then be converted into an audio file through an inverse Fourier transform.[2] While these clips are only several seconds long, the model can also interpolate between outputs in latent space to blend different clips together.[1][3] This is accomplished using a functionality of the Stable Diffusion model known as img2img.[4]
Developer(s) | Seth Forsgren, Hayk Martiros
---|---
Initial release | December 15, 2022
Repository | github
Written in | Python
Type | Text-to-image model
License | MIT License
Website | riffusion
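As a sketch of the image-to-audio step described above: a spectrogram image encodes only the magnitude of the signal, so the phase it discards must be estimated before inversion, commonly with the Griffin-Lim algorithm. The snippet below assumes the image stores a mel spectrogram in decibels; the pixel-to-dB mapping, sample rate, and STFT parameters are illustrative assumptions, not Riffusion's published values.

```python
# Sketch: recover audio from a spectrogram image using librosa's
# Griffin-Lim-based mel inversion. All numeric parameters below are
# illustrative assumptions, not Riffusion's actual settings.
import numpy as np
import librosa
import soundfile as sf
from PIL import Image

def spectrogram_image_to_audio(path, sr=44100, n_fft=2048, hop_length=512):
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    img = np.flipud(img)                 # row 0 of the image is the highest frequency
    db = img / 255.0 * 80.0 - 80.0       # assumed mapping: pixel value -> dB in [-80, 0]
    power = librosa.db_to_power(db)      # dB -> mel power spectrogram
    # Griffin-Lim iteratively estimates the phase the image does not store
    return librosa.feature.inverse.mel_to_audio(
        power, sr=sr, n_fft=n_fft, hop_length=hop_length, n_iter=32)

audio = spectrogram_image_to_audio("riff.png")   # hypothetical model output
sf.write("riff.wav", audio, 44100)
```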
The resulting music has been described as "otherworldly" ("de otro mundo"),[5] although unlikely to replace human-made music.[5] The model was made available on December 15, 2022, with the code also freely available on GitHub.[2] It is one of many models derived from Stable Diffusion.[4]
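The latent-space blending mentioned in the lead can be sketched with the Hugging Face diffusers library. One common way to implement such interpolation (not necessarily Riffusion's exact img2img approach) is spherical interpolation ("slerp") of the initial latent noise, so successive frames morph smoothly from one clip toward another. The checkpoint name, slerp helper, and parameter values here are illustrative assumptions.

```python
# Sketch: blending two generated clips by interpolating the initial
# latent noise given to Stable Diffusion. This illustrates latent-space
# interpolation in general, not Riffusion's exact pipeline; the
# checkpoint name and parameters are assumptions.
import torch
from diffusers import StableDiffusionPipeline

def slerp(t, v0, v1):
    """Spherical linear interpolation between two noise tensors."""
    omega = torch.acos(((v0 / v0.norm()) * (v1 / v1.norm())).sum().clamp(-1.0, 1.0))
    return (torch.sin((1.0 - t) * omega) * v0 + torch.sin(t * omega) * v1) / torch.sin(omega)

pipe = StableDiffusionPipeline.from_pretrained("riffusion/riffusion-model-v1")
shape = (1, pipe.unet.config.in_channels, 64, 64)  # latent shape for a 512x512 image
noise_a, noise_b = torch.randn(shape), torch.randn(shape)

# Each frame is one spectrogram image; adjacent frames share most of
# their latent noise, so the decoded audio changes gradually.
for i, t in enumerate(torch.linspace(0.0, 1.0, steps=5)):
    latents = slerp(float(t), noise_a, noise_b)
    image = pipe("funk bassline with a jazzy saxophone", latents=latents,
                 num_inference_steps=30).images[0]
    image.save(f"frame_{i:02d}.png")
```

Each frame could then be converted to audio with a routine like the inversion sketch above and crossfaded into a continuous track.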
Riffusion is one of several AI text-to-music generators. In December 2022, Mubert[6] similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on MusicLM, its own text-to-music generator.[7][8]