The NB AI-lab trains models for various purposes. They are most often trained on material from NB’s digital collection.
Pre-trained language models
Pre-trained in a self-supervised fashion on enormous datasets, modern language models allow their weights to be adjusted for specific supervised downstream tasks at a fraction of the cost of training from scratch, with impressive results. The NB AI-lab has released some of the best-performing models yet for Norwegian and other Scandinavian languages.
NB-BERT-base is a general BERT-base model built on the large digital collection at the National Library of Norway.
This model has the same architecture as the multilingual cased BERT model, and is trained on a wide variety of Norwegian text (both Bokmål and Nynorsk) from the last 200 years.
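As a BERT-base model, NB-BERT-base can be loaded directly with the Hugging Face transformers library, for example for masked-token prediction. The sketch below assumes the repository id `NbAiLab/nb-bert-base`; check the NB AI-lab's Hugging Face page for the exact name.

```python
# Minimal sketch: querying NB-BERT-base as a fill-mask model.
# Assumes the Hugging Face repo id "NbAiLab/nb-bert-base" and
# requires `pip install transformers torch`.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="NbAiLab/nb-bert-base")

# Print the model's top candidates for the masked token.
for candidate in fill_mask("Nasjonalbiblioteket ligger i [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```

The same checkpoint can instead be loaded with `AutoModel`/`AutoTokenizer` when fine-tuning on a downstream task.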
Based on NB-BERT-base, the NB AI-lab has also created fine-tuned versions for a variety of tasks.
One example is an NB-BERT-base model fine-tuned for the named entity recognition (NER) task using the NorNE dataset.
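The fine-tuned NER model can be used through the same pipeline interface. The repository id `NbAiLab/nb-bert-base-ner` below is an assumption; verify it against the lab's published model list.

```python
# Minimal sketch: named entity recognition with the fine-tuned model.
# Assumes the Hugging Face repo id "NbAiLab/nb-bert-base-ner" and
# requires `pip install transformers torch`.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="NbAiLab/nb-bert-base-ner",
    aggregation_strategy="simple",  # merge word pieces into whole entities
)

# Print each detected entity with its predicted type and confidence.
for entity in ner("Nasjonalbiblioteket ligger i Oslo."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 2))
```

Setting `aggregation_strategy="simple"` groups subword tokens back into full entity spans, which is usually what downstream consumers want.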