European Open Source AI Index

AudioLDM

by Centre for Vision, Speech and Signal Processing

General audio generation model.
Audio
Full
https://huggingface.co/cvssp/audioldm2
AudioLDM2
CC-BY-NC-SA-4.0
UK-based research centre.
https://www.surrey.ac.uk/centre-vision-speech-signal-processing
August 2023
Availability
Base Model Data
Datasets mentioned in source paper.
https://arxiv.org/pdf/2308.05734
End User Model Data
Datasets mentioned in source paper.
https://arxiv.org/pdf/2308.05734
Base Model Weights
Weights made available on HuggingFace.
https://huggingface.co/cvssp/audioldm2
End User Model Weights
Weights made available on HuggingFace.
https://huggingface.co/cvssp/audioldm2
Training Code
Repo with training code available.
https://github.com/haoheliu/audioldm2
Documentation
Code Documentation
Repo primarily contains information regarding inference.
https://github.com/haoheliu/audioldm2
Hardware Architecture
Hardware setup described at a high level in source paper.
https://arxiv.org/pdf/2308.05734
Preprint
Preprint available on arXiv.
https://arxiv.org/pdf/2308.05734
Paper
Paper published by IEEE.
https://ieeexplore.ieee.org/abstract/document/10530074
Modelcard
Model card primarily contains inference information.
https://huggingface.co/cvssp/audioldm2
Datasheet
All datasets sourced.
https://arxiv.org/pdf/2308.05734
Access
Licenses
CC-BY-NC-SA-4.0
https://huggingface.co/cvssp/audioldm2
Supported by the Centre for Language Studies and the Dutch Research Council. Website design & development © 2024 by BSTN. This version of the index generated 09 April 2026, website content last updated 11 March 2026.