On Hugging Face: "Viking is being trained on a 2 trillion token mixed dataset of English, Finnish, Swedish, Danish, Norwegian, Icelandic and code. Full details will be published soon."
Access
Licenses
Apache 2.0, though it is unclear whether the license covers both the weights and the code.