AudioLDM

by Centre for Vision, Speech and Signal Processing

About the model:

General audio generation model.

Model type:

Audio

Model performance class:

Full

Link to the model:

https://huggingface.co/cvssp/audioldm2

Base models:

AudioLDM2

End model:

AudioLDM2

End model license:

CC-BY-NC-SA-4.0

About the organisation:

UK-based research centre.

Link to the organisation:

https://www.surrey.ac.uk/centre-vision-speech-signal-processing

Model release date:

August 2023

Availability

Base Model Data

Datasets mentioned in source paper.

https://arxiv.org/pdf/2308.05734

End User Model Data

Datasets mentioned in source paper.

https://arxiv.org/pdf/2308.05734

Base Model Weights

Weights made available on HuggingFace.

https://huggingface.co/cvssp/audioldm2

End User Model Weights

Weights made available on HuggingFace.

https://huggingface.co/cvssp/audioldm2

Training Code

Repo with training code available.

https://github.com/haoheliu/audioldm2

Documentation

Code Documentation

Repo primarily contains information regarding inference.

https://github.com/haoheliu/audioldm2

Hardware Architecture

Hardware setup described at a high level in the source paper.

https://arxiv.org/pdf/2308.05734

Preprint

Preprint available on arXiv.

https://arxiv.org/pdf/2308.05734

Paper

Paper published in IEEE/ACM Transactions on Audio, Speech, and Language Processing.

https://ieeexplore.ieee.org/abstract/document/10530074

Modelcard

Model card primarily contains inference information.

https://huggingface.co/cvssp/audioldm2

Datasheet

Sources given for all datasets in the source paper.

https://arxiv.org/pdf/2308.05734

Access

Licenses

Weights: CC-BY-NC-SA-4.0. Code: CC-BY-NC-SA-4.0. Data: licensing varies by dataset; all datasets appear to be permissively licensed.

https://huggingface.co/cvssp/audioldm2 https://github.com/haoheliu/audioldm2 https://arxiv.org/pdf/2308.05734

Supported by the Centre for Language Studies and the Dutch Research Council. Website design & development © 2024 by BSTN. This version of the index generated 12 June 2026, website content last updated 12 June 2026.