Proprietary generative AI models like ChatGPT are easy to access, but designed in ways that make transparent and responsible use impossible. Widely advertised "open" solutions like Llama are open in weights only, providing no access to training code or to the all-important instruction-tuning data. This guide offers some recommendations for models that can be used in open scholarship and teaching.
Parameter descriptions:
Base Model Data
Are datasources for training the base model comprehensively documented and freely made available?
End User Model Data
Are datasources for training the model that the enduser interacts with comprehensively documented and freely made available?
Base Model Weights
Are the weights of the base models made freely available?
End User Model Weights
Are the weights of the model that the enduser interacts with made freely available?
Training Code
Is the source code of datasource processing, model training and tuning comprehensively and freely made available?
Model Data
Are datasources for training the model comprehensively documented and freely made available?
Model Weights
Are the weights of the model that the enduser interacts with made freely available?
Watermarking
Are watermarking techniques comprehensively documented and shared?
Prompt Moderation
Is prompt moderation comprehensively documented and shared?
Code Documentation
Is the source code of datasource processing, model training and tuning comprehensively documented?
Architecture Documentation
Is the hardware architecture used for datasource processing and model training comprehensively documented?
Preprint
Are archived preprint(s) available that detail all major parts of the system including datasource processing, model training and tuning steps?
Paper
Are peer-reviewed scientific publications available that detail all major parts of the system including datasource processing, model training and tuning steps?
Modelcard
Is a model card in standardized format available that provides comprehensive insight on model architecture, training, fine-tuning, and evaluation?
Datasheet
Is a datasheet as defined in "Datasheets for Datasets" (Gebru et al. 2021) available?
Package
Is a packaged release of the model available on a software repository (e.g. the Python Package Index, Homebrew)?
API
Is an API available that provides unrestricted access to the model (other than security and CDN restrictions)?
Licenses
Is the project fully covered by Open Source Initiative (OSI)-approved licenses, including all data sources and training pipeline code?
Models (each listed with its base model):
WizardLM 13B v1.2 by Microsoft & Peking University (base model: LLaMA2-13B)
Qwen 1.5 by Alibaba Cloud (base model: QwenLM)
Phi 3 Instruct by Microsoft (base model: Phi3)
Mistral NeMo Instruct by Mistral AI (base model: Mistral NeMo)
DeepSeek R1 by DeepSeek (base model: DeepSeek-V3-Base)
Falcon-40B-instruct by Technology Innovation Institute (base model: Falcon 40B)
BELLE by KE Technologies (base model: LLaMA & BLOOMZ)
WizardLM-7B by Microsoft & Peking University (base model: LLaMA-7B)
Minerva-7B by Sapienza Natural Language Processing Group (base model: Minerva-7B-base-v1.0)
Geitje Ultra 7B by Bram van Roy (base model: Mistral 7B)
Falcon-180B-chat by Technology Innovation Institute (base model: Falcon 180B)
Yi 34B Chat by 01.AI (base model: Yi 34B)
Mixtral 8x7B Instruct by Mistral AI (base model: Mistral)
UltraLM by OpenBMB (base model: LLaMA2)
Llama 3.1 by Facebook Research (base model: Meta Llama 3)
Orca 2 by Microsoft Research (base model: LLaMA2)
Koala 13B by BAIR (base model: unspecified)
Stanford Alpaca by Stanford University CRFM (base model: LLaMA)
Xwin-LM by Xwin-LM (base model: LLaMA2)
Gemma 7B Instruct by Google DeepMind (base model: Gemma)
StableVicuna-13B by CarperAI (base model: LLaMA)
Nanbeige2-Chat by Nanbeige LLM lab (base model: unknown)
Command R+ by Cohere AI (base model: unspecified)
Stable Beluga 2 by Stability AI (base model: LLaMA2)
Llama 3.3 by Meta Llama (base model: Llama 3.3 70B)
LLaMA2 Chat by Facebook Research (base model: LLaMA2)
Solar 70B by Upstage AI (base model: LLaMA2)
Llama 3 Instruct by Facebook Research (base model: Meta Llama 3)
One of the most important LLM-related skills students need today is critical AI literacy. Anyone can follow a 10-minute YouTube tutorial on prompt engineering, or read the latest research papers on jailbreaking ChatGPT, Gemini, or similar proprietary models. But critical AI literacy requires more. It should be possible to inspect the training data of a model; to understand how exactly its fine-tuning makes it appear so docile and helpful; and to test prompts and output in easily accessible ways.
In a Venn diagram, truly open models and models with utility in education overlap to a large degree, but not fully. For instance, a model like BLOOMZ is admirably open on all fronts, but at 176B parameters it can also be prohibitively heavy to deploy in an educational setting. Three models currently stand out for their high degree of openness, exemplary documentation, and ease of access for demonstration purposes: OLMo Instruct by Allen AI, Amber Chat by LLM360, and Pythia Chat by TogetherComputer. All three are small to mid-range models that nonetheless provide the basic behaviour that users have come to expect from instruction-tuned LLMs.
These 7B models are relatively "small" in terms of parameters, but make up for it in openness and accessibility. Running the largest models typically takes a lot of compute, which is why they tend to be provided only through (paid) APIs. Smaller models are easier to deploy in educational and research settings. The three models featured in this guide stand out for the documentation and code available for doing so, and for being relatively lightweight and easy to deploy locally.
There are countless guides online for running LLMs locally using command line tools. Depending on the educational setting and the level of students, this may be all you need. Since command line users are typically savvy enough to figure out their preferred setup, we won't provide instructions here. All three models highlighted here can be easily run through ollama or llama.cpp.
There are also some solutions for educational settings more geared towards point-and-click interfaces. We can recommend LM Studio, available for Mac, Linux and Windows, as a quick way to get you started. LM Studio makes downloading models very easy: you can search for model names, pick the version you want, and download it. After downloading, the model becomes locally available.
LM Studio offers tools for a range of users, from novices to developers. For novices, it will be useful to play with some basic settings like temperature and top-k or top-p sampling, and to test the effect of different system prompts.
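For readers who want to see what the sampling settings mentioned above do under the hood, here is a minimal sketch in plain Python. It is a classroom illustration, not LM Studio's actual implementation: the function names, the toy vocabulary, and the logit values are all made up for the example, and the truncation shown is the common "top-k" variant (keep only the k highest-scoring tokens).

```python
import math
import random


def token_probabilities(logits, temperature=1.0, top_k=None):
    """Turn raw scores (logits) into sampling probabilities.

    logits: dict mapping each candidate token to its raw score.
    temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    top_k: if given, only the k highest-scoring tokens remain candidates.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rank candidates from highest to lowest score.
    ranked = sorted(logits.items(), key=lambda item: item[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    # Dividing by the temperature rescales the gaps between scores.
    scaled = [score / temperature for _, score in ranked]
    # Softmax, with the maximum subtracted for numerical stability.
    highest = max(scaled)
    weights = [math.exp(value - highest) for value in scaled]
    total = sum(weights)
    return {token: weight / total for (token, _), weight in zip(ranked, weights)}


def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Draw one token at random according to the adjusted probabilities."""
    rng = rng or random
    probabilities = token_probabilities(logits, temperature, top_k)
    tokens = list(probabilities)
    return rng.choices(tokens, weights=list(probabilities.values()), k=1)[0]


# Hypothetical scores for the next word after "I would like ..."
toy_logits = {"the": 2.0, "a": 1.0, "banana": -1.0}

if __name__ == "__main__":
    for temperature in (0.5, 1.0, 2.0):
        print(temperature, token_probabilities(toy_logits, temperature))
    print(sample_next_token(toy_logits, temperature=1.0, top_k=2))
```

Running the script prints the probability of each toy token at three temperatures, which makes a useful classroom exercise: students can predict how the distribution will shift before comparing their guess against the output, and then try the same settings on a real model in LM Studio.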