Audio-Diffusion
audio-diffusion is an open-source AI tool that applies diffusion models to synthesizing music rather than images. It uses the Hugging Face diffusers package for music generation.
About
audio-diffusion is an open-source project that uses the Hugging Face diffusers package to synthesize music with diffusion models. Where diffusion models are usually applied to image generation, this tool generates mel spectrograms, which are then converted into audio. Users can train models conditioned on text or audio encodings, generate variations of existing audio, and 'remix' tracks through a form of style transfer. It supports DDPM and DDIM models, including latent audio diffusion for faster training and inference, and allows interpolation between audio clips in latent 'noise' space. The project provides scripts for generating mel spectrogram datasets, training models, and encoding audio for conditional generation.
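To make the idea of interpolating in latent 'noise' space more concrete, here is a minimal sketch using spherical linear interpolation (slerp), a common way to blend two Gaussian noise samples while roughly preserving their scale. It is an illustration only: the function name `slerp`, the array shapes, and the use of NumPy are assumptions for this example and are not taken from the project's code.

```python
import numpy as np


def slerp(x0, x1, alpha):
    """Spherically interpolate between two noise arrays of the same shape.

    alpha = 0 returns x0, alpha = 1 returns x1. (Hypothetical helper.)
    """
    a = x0.ravel()
    b = x1.ravel()
    # Angle between the two noise samples, treated as flat vectors
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    sin_theta = np.sin(theta)
    if np.isclose(sin_theta, 0.0):
        # Nearly parallel inputs: fall back to plain linear interpolation
        return (1.0 - alpha) * x0 + alpha * x1
    w0 = np.sin((1.0 - alpha) * theta) / sin_theta
    w1 = np.sin(alpha * theta) / sin_theta
    return w0 * x0 + w1 * x1


# Two random noise samples standing in for the noise behind two audio clips
# (the 1 x 64 x 64 shape is an arbitrary choice for this sketch)
rng = np.random.default_rng(0)
noise_a = rng.standard_normal((1, 64, 64))
noise_b = rng.standard_normal((1, 64, 64))

# A sample halfway between the two
blend = slerp(noise_a, noise_b, 0.5)
```

In a full pipeline, a blended noise sample like `blend` would be passed through the model's denoising process to produce a spectrogram that sits between the two source clips.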
Pricing & Plans
Open Source
Free