Riffusion
Music-generating machine learning model
Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music from text prompts by producing images of sound (spectrograms) rather than audio directly.[1]
| Developer(s) | Seth Forsgren, Hayk Martiros |
| --- | --- |
| Initial release | December 15, 2022 |
| Repository | github |
| Written in | Python |
| Type | Text-to-image model |
| License | MIT License |
| Website | riffusion |
Generated spectrogram from the prompt "bossa nova with electric guitar" (top), and the resulting audio after conversion (bottom)
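The conversion step pictured above turns a generated spectrogram image back into a waveform. The following is a minimal sketch of such an inversion using librosa's Griffin-Lim phase reconstruction; the parameter values and the pixel-to-magnitude mapping are assumptions for illustration, not Riffusion's exact pipeline.

```python
# Sketch: invert a mel-spectrogram image back to audio with Griffin-Lim.
# Illustrative only; parameters and scaling are assumed, not Riffusion's own.
import numpy as np
import librosa
import soundfile as sf

sr = 44100          # sample rate (assumed)
n_fft = 2048        # FFT window size (assumed)
hop_length = 512    # hop between analysis frames (assumed)
n_mels = 256        # mel bands, i.e. image height (assumed)

# Suppose `image` is a generated spectrogram as a 2-D array of pixel
# intensities in [0, 255], with frequency on the vertical axis.
# Here a random placeholder stands in for a model output.
image = np.random.rand(n_mels, 1024) * 255

# Map pixel intensities back to mel-spectrogram magnitudes
# (a simple linear scaling; the real mapping depends on how the
# spectrogram was encoded as an image).
mel = image / 255.0

# Invert the mel spectrogram to a linear-frequency magnitude spectrogram,
# then recover a waveform with Griffin-Lim phase reconstruction.
stft_mag = librosa.feature.inverse.mel_to_stft(
    mel, sr=sr, n_fft=n_fft, power=1.0
)
audio = librosa.griffinlim(stft_mag, hop_length=hop_length, n_fft=n_fft)

sf.write("output.wav", audio, sr)
```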
The resulting music has been described as "de otro mundo" ("otherworldly"),[2] though considered unlikely to replace human-made music.[2] The model was made available on December 15, 2022, with the code also freely available on GitHub.[3] It was one of many models derived from Stable Diffusion.[4]
Riffusion is classified within a subset of AI text-to-music generators. In December 2022, Mubert[5] similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on their own text-to-music generator called MusicLM.[6][7]
Seth Forsgren and Hayk Martiros formed a startup, also called Riffusion, and raised $4 million in venture capital funding in October 2023.[8][9]