European Open Source AI Index

SantaCoder

by BigCode

Early coder model.
Code
Full
https://huggingface.co/bigcode/santacoder
SantaCoder
BigCode OpenRAIL-M v1 license agreement
Open scientific collaboration for creating LLMs.
https://huggingface.co/bigcode
December 2022
Availability
Base Model Data
Model trained on filtered data from The Stack v1.1.
https://arxiv.org/pdf/2301.03988
https://huggingface.co/datasets/bigcode/the-stack
End User Model Data
Model trained on filtered data from The Stack v1.1.
https://arxiv.org/pdf/2301.03988
https://huggingface.co/datasets/bigcode/the-stack
Base Model Weights
Weights made available on HuggingFace.
https://huggingface.co/bigcode/santacoder
End User Model Weights
Weights made available on HuggingFace.
https://huggingface.co/bigcode/santacoder
Training Code
Training was done with BigCode's open-source fork of Megatron-LM.
https://github.com/bigcode-project/Megatron-LM
Documentation
Code Documentation
Project is well-documented.
https://github.com/bigcode-project/Megatron-LM
Hardware Architecture
Training setup described in detail.
https://huggingface.co/bigcode/santacoder#training
Preprint
Preprint made available on arXiv.
https://arxiv.org/pdf/2301.03988
Paper
Paper presented at the Deep Learning for Code (DL4C) workshop at ICLR 2023.
https://iclr.cc/virtual/2023/14995
Modelcard
Model card contains the requisite information.
https://huggingface.co/bigcode/santacoder
Datasheet
Detailed data sheet available.
https://huggingface.co/datasets/bigcode/the-stack
Access
Package
No package found.
API and Meta Prompts
HuggingFace inference API available.
https://huggingface.co/bigcode/santacoder
Licenses
BigCode OpenRAIL-M v1 license agreement, not an OSI-approved license.
https://huggingface.co/bigcode/santacoder#license
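Since no dedicated package is listed, the published weights would typically be loaded straight from the Hugging Face repository above. The following is a minimal sketch, assuming the `transformers` and `torch` packages are installed and that the checkpoint needs `trust_remote_code=True`, as in the model card's own example. The helper names `load_santacoder` and `complete` are hypothetical and used only for illustration.

```python
"""Sketch: load SantaCoder from the Hugging Face Hub and complete a prompt."""

# Repository name taken from the model URL listed in this entry.
MODEL_ID = "bigcode/santacoder"


def load_santacoder(model_id: str = MODEL_ID):
    """Download the tokenizer and weights (hypothetical helper)."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: the checkpoint ships custom modelling code, so remote code
    # must be trusted. Review that code before enabling this in production.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model


def complete(prompt, tokenizer, model, max_new_tokens=32):
    """Generate a continuation of `prompt` (hypothetical helper)."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("def print_hello_world():", *load_santacoder())` downloads the weights on first use and returns the prompt followed by the generated continuation. Any use of the model remains subject to the BigCode OpenRAIL-M v1 license noted above.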
Supported by the Centre for Language Studies and the Dutch Research Council. Website design & development © 2024 by BSTN. This version of the index generated 10 March 2026, website content last updated 11 March 2026.