European Open Source AI Index
AlchemistCoder
by Shanghai AI Laboratory
About the model:
Open model trained by harmonizing different data sources. Multiple versions exist with different base models.
Model type:
Code
Model performance class:
Full
Link to the model:
https://huggingface.co/internlm/AlchemistCoder-DS-6.7B
Base models:
DeepSeek-Coder-6.7B-Base
End model:
AlchemistCoder-DS-6.7B
End model license:
Apache-2.0
About the organisation:
National-level Chinese research institute.
Link to the organisation:
https://www.shlab.org.cn/
Model release date:
May 2024
Availability
Base Model Data
GitHub is mentioned as a primary source of code data. The rest of the data mixture is described only in general terms.
https://arxiv.org/pdf/2401.14196
End User Model Data
The model is trained on both existing open-source data and synthetic data. The open-source data is outlined in the paper, but the generated synthetic data is not provided.
https://arxiv.org/pdf/2405.19265
Base Model Weights
Weights available through HuggingFace.
https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base
End User Model Weights
Weights available through HuggingFace.
https://huggingface.co/internlm/AlchemistCoder-DS-6.7B
Training Code
A repository exists that is presented as containing the source code, but it contains no code.
https://github.com/InternLM/AlchemistCoder/
Documentation
Code Documentation
No code available.
https://github.com/InternLM/AlchemistCoder/
Hardware Architecture
No hardware architecture outlined.
Preprint
Preprint made available on arXiv.
https://arxiv.org/pdf/2405.19265
Paper
Paper published at NeurIPS.
https://dl.acm.org/doi/abs/10.5555/3737916.3737987
Modelcard
Model card contains some information, mainly describing the model and providing usage instructions.
https://huggingface.co/internlm/AlchemistCoder-DS-6.7B
Datasheet
Datasheets are available for some data sources, but the synthetic data is not publicly available.
https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1
https://huggingface.co/datasets/codefuse-ai/CodeExercise-Python-27k
https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1
Access
Licenses
Model licensed under Apache-2.0.
https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md
Supported by the Centre for Language Studies and the Dutch Research Council. Website design & development © 2024 by BSTN. This version of the index generated 09 April 2026, website content last updated 11 March 2026.