European Open Source AI Index
AudioLDM
by Centre for Vision, Speech and Signal Processing
About the model:
General audio generation model.
Model type:
Audio
Model performance class:
Full
Link to the model:
https://huggingface.co/cvssp/audioldm2
Base models:
(undefined)
End model:
AudioLDM2
End model license:
CC-BY-NC-SA-4.0
About the organisation:
UK-based research centre.
Link to the organisation:
https://www.surrey.ac.uk/centre-vision-speech-signal-processing
Model release date:
August 2023
Availability
Base Model Data
Datasets mentioned in source paper.
https://arxiv.org/pdf/2308.05734
End User Model Data
Datasets mentioned in source paper.
https://arxiv.org/pdf/2308.05734
Base Model Weights
Weights made available on HuggingFace.
https://huggingface.co/cvssp/audioldm2
End User Model Weights
Weights made available on HuggingFace.
https://huggingface.co/cvssp/audioldm2
Training Code
Repo with training code available.
https://github.com/haoheliu/audioldm2
Documentation
Code Documentation
Repo primarily contains information regarding inference.
https://github.com/haoheliu/audioldm2
Hardware Architecture
Hardware setup described at a high level in source paper.
https://arxiv.org/pdf/2308.05734
Preprint
Preprint available on arXiv.
https://arxiv.org/pdf/2308.05734
Paper
Paper published in IEEE.
https://ieeexplore.ieee.org/abstract/document/10530074
Modelcard
Model card primarily contains inference information.
https://huggingface.co/cvssp/audioldm2
Datasheet
All datasets sourced.
https://arxiv.org/pdf/2308.05734
Access
Licenses
CC-BY-NC-SA-4.0
https://huggingface.co/cvssp/audioldm2
Supported by the Centre for Language Studies and the Dutch Research Council. Website design & development © 2024 by BSTN. This version of the index generated 09 April 2026, website content last updated 11 March 2026.