A new family of Norwegian-centric models

May 26, 2026.

Today we are happy to announce Borealis, a new family of Norwegian-centric instruction-tuned language models from the National Library of Norway.

Borealis is released in five sizes, from 270M to 27B parameters, with both full-release and open-release variants. The models are based on the Gemma 3 family and are tuned for Norwegian, Bokmal, Nynorsk, and English assistant use cases, including writing, summarization, question answering, and language-quality assessment.

You can try the demo here.

Why Borealis?

The goal of Borealis is simple: make useful Norwegian language models available in several practical sizes, with transparent documentation, reproducible model artifacts, and formats that people can actually run.

This release also marks an important step for lawful Norwegian language-model development. The full Borealis models are the first Borealis release to incorporate a small amount of data made available through the agreement between rights-holder organizations in Norway and the Norwegian government. So far, we use only a limited supervised fine-tuning subset from this material: around 10,000 tasks for title and ingress generation.

The open models do not include material from that agreement. Their SFT dataset is available as NbAiLab/aurora-sft-open. The full models use NbAiLab/aurora-sft, whose only difference from the open dataset is the addition of those 10k newspaper-derived tasks.

Models

All models are available on Hugging Face. GGUF repositories are available for llama.cpp, Ollama, and other local inference tools.

The full models are released under NB-license, an adaptation of Apache 2.0 with additional use-based restrictions related to training-data recreation and end-user access to licensed press publications. The open models are released under the Gemma license and do not include material from the press-publication agreement.

The release collection is here: NbAiLab/borealis.

Evaluation

Borealis evaluation results on selected tasks (best scoring prompt among zero to 5-shot).

We evaluate Borealis with NorEvalMMLU-English, and our own nb-gpt-bench suite, which will be described in an upcoming paper. The table above shows selected results using the best score among 0- to 5-shot settings.

The short version is that Borealis is strongest where we wanted it to be strong: Norwegian language tasks, Norwegian knowledge, instruction following, and safety-aligned assistant behavior.

Some things stand out to me:

Alignment and safety

Borealis models are aligned using prompt baking and weighted merging of SFT and aligned models. In practice, we distill the behavior induced by a system prompt into the model weights, then merge the resulting adapter with a scaling factor that preserves usefulness while improving safety behavior. The tradeoff is that, in most cases, the performance of the released aligned models is slightly worse than their unaligned counterparts.

This matters because we want the models to be pleasant and useful to run without requiring everyone to carry around the exact same system prompt. It is not a guarantee of perfect behavior. Borealis can still hallucinate, be wrong, or produce inappropriate output, and it should not be used in safety-critical settings without additional evaluation and safeguards.

Running the models

The safetensors repositories work with Transformers and vLLM. The GGUF repositories work with llama.cpp and Ollama.

For Transformers:

import torch from transformers
import AutoProcessor, Gemma3ForConditionalGeneration
model_id = "NbAiLab/borealis-27b" processor = AutoProcessor.from_pretrained(model_id) model = Gemma3ForConditionalGeneration.from_pretrained( model_id, device_map="auto", torch_dtype=torch.bfloat16, )

For vLLM:

vllm serve NbAiLab/borealis-27b --served-model-name borealis-27b

For llama.cpp:

llama-server -hf NbAiLab/borealis-27b-gguf --port 8080

For Ollama:

ollama run hf.co/NbAiLab/borealis-27b-gguf

You can replace 27b with any of the other sizes, or use the borealis-open-* repositories if you want the open-data variant.

Documentation, license, and authenticity

Each model repository includes a model card, a Model Documentation Form, a License FAQ, and signed release artifacts.

The full models use NB’s Borealis license, which is adapted from Apache 2.0 with additional use-based restrictions. In particular, users must not intentionally use the model to recreate training data, and must not use the model or its output to provide end-user services whose primary purpose is to give access to licensed press publications in the training data.

The open models use the Gemma license. They are intended for users who want the same Borealis recipe without the additional newspaper-derived SFT tasks and without the full-model license restrictions tied to the press-publication agreement.

The model-runtime artifacts are signed by the National Library of Norway. After downloading a repository, authenticity and file integrity can be checked with:

More verification instructions are available at ai.nb.no/verify.

Thanks

Borealis is a joint effort across several teams at the National Library of Norway. Javier de la Rosa led the release, but this work depends on many people: Rolv-Arild Braaten, Magnus Breder Birkenes, Lucas Charpentier, Pawel Cyrta, Tita Enstad, Markus Sverdvik Heiervang, Arne Martinus Lindstad, Marthe Loken Midtgaard, Marie Roald, Marie Rosok, Thea Tollersrud, Angelina Zanardi, Olaus Ingskog Bergstrom, Yngvil Beyer, Svein Arne Brygfjeld, and Wilfred Ostgulen.

Thanks also to the Gemma team at Google for releasing Gemma 3, to Sigma2 for facilitating access to compute, to those who provided feedback on the preview release, and to everyone working to make Norwegian language technology more open, useful, and legally robust.