Retrain the multi-language NER model (ner_ontonotes_bert_mult) with a dataset in a different language

I have successfully installed the multi-language NER model (ner_ontonotes_bert_mult).
I want to retrain this model with new data in the Albanian language (in the same format as suggested in the documentation). Is it possible to retrain the multi-language NER model from DeepPavlov with data in a different language, or does retraining only work with English data?

Yes, you can fine-tune the model on any language that was used for Multilingual BERT training (bert/multilingual.md at master · google-research/bert · GitHub).

Great! Thanks for the response. Also, is it a problem if my dataset doesn't contain all the tags?
I am facing the same error as in this case. My dataset will contain only a subset of the 18 available tags listed here:

UPDATE: I successfully retrained the ner_ontonotes_bert_mult model with a dataset in the Albanian language. Since my dataset didn't contain all the tags, I removed the "fit_on": ["y"], line from the tag_vocab component in the config.
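
For anyone who wants to make the same change in code rather than by hand, here is a minimal sketch. It assumes the config is a dict with a chainer -> pipe list of components and that the tag vocabulary component has "id": "tag_vocab". The example_config below is a hypothetical, heavily trimmed stand-in, not the real ner_ontonotes_bert_mult config, and the drop_fit_on helper is my own name, not part of DeepPavlov.

```python
import copy


def drop_fit_on(config, component_id="tag_vocab"):
    """Return a copy of `config` where the component with the given id
    no longer has a "fit_on" key. The input config is left untouched."""
    patched = copy.deepcopy(config)
    for component in patched["chainer"]["pipe"]:
        if isinstance(component, dict) and component.get("id") == component_id:
            # pop with a default so a missing key is not an error
            component.pop("fit_on", None)
    return patched


# Hypothetical, trimmed stand-in for the real config (assumption for illustration).
example_config = {
    "chainer": {
        "pipe": [
            {
                "id": "tag_vocab",
                "class_name": "simple_vocab",
                "fit_on": ["y"],
                "in": ["y"],
                "out": ["y_ind"],
            }
        ]
    }
}

patched_config = drop_fit_on(example_config)
print(patched_config["chainer"]["pipe"][0])
```

In practice you would load the real config file with json.load, patch it the same way, and train from the patched dict. My understanding (not confirmed in this thread) is that without fit_on the tag vocabulary is not rebuilt from the new data, so the original 18-tag dictionary is kept.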

Great! Glad you succeeded.