How was ner_ontonotes_bert_mult built?

I’m trying to use your model ner_ontonotes_bert_mult to implement few-shot transfer to Russian (literature domain), and I’m having trouble understanding how your model was built originally.
If I’m not mistaken, it was built by fine-tuning the multilingual BERT model on the English part of the OntoNotes dataset. But how was it generalized to cover 104 languages?
I would be grateful if you could share the paper you relied on; it would be very helpful for my research.

Thank you

Hey @alhassanha, thank you very much for your interest!

The idea behind multilingual BERT is more or less described here.

In short, M-BERT enables cross-lingual transfer between languages: a model fine-tuned on data in one language (here, the English part of OntoNotes) can then be applied to the other languages covered by M-BERT’s pretraining. More information can be found in our paper Exploring the BERT Cross-Lingual Transfer for Reading Comprehension.
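As a concrete illustration of that transfer, here is a minimal sketch (assuming the standard DeepPavlov Python API and that the ner_ontonotes_bert_mult config and its weights can be downloaded on your machine):

```python
# Minimal sketch: load the multilingual NER model, which was fine-tuned on
# English OntoNotes only, and apply it directly to Russian text. The
# cross-lingual transfer comes from M-BERT's multilingual pretraining.
from deeppavlov import build_model, configs

# download=True fetches the pretrained weights on first use
ner_model = build_model(configs.ner.ner_ontonotes_bert_mult, download=True)

# Russian input, even though the fine-tuning data was English
tokens, tags = ner_model(['Лев Толстой написал роман "Война и мир" в Ясной Поляне.'])
print(list(zip(tokens[0], tags[0])))
```

For few-shot transfer to your literature domain, this config could serve as the starting point for further fine-tuning on a small Russian NER dataset.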


Thank you @Vasily :raised_hands:. I had been looking for this paper for days.
It will be very helpful!