NER, fine-tuning

How much data do I need to fine-tune an NER model based on multilingual BERT?

We have not yet carried out an extensive study of how training set size affects this model. However, I would start with a few dozen examples per class.
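To check whether a dataset meets that rough starting point, it can help to count entity mentions per class before training. The snippet below is a minimal sketch, assuming the data is already loaded as lists of BIO tags (one list per sentence). The function names, the sample data, and the `min_examples=24` threshold are illustrative assumptions, not part of any library or a tested recommendation.

```python
from collections import Counter


def count_entities_per_class(tag_sequences):
    """Count entity mentions per class in BIO-tagged sentences."""
    counts = Counter()
    for tags in tag_sequences:
        prev = "O"
        for tag in tags:
            if tag.startswith("B-"):
                # A B- tag always opens a new mention.
                counts[tag[2:]] += 1
            elif tag.startswith("I-") and prev[2:] != tag[2:]:
                # An I- tag with no preceding tag of the same class
                # is treated as the start of a new mention.
                counts[tag[2:]] += 1
            prev = tag
    return counts


def underrepresented(counts, min_examples=24):
    """Return classes with fewer than min_examples mentions (assumed threshold)."""
    return sorted(cls for cls, n in counts.items() if n < min_examples)


# Hypothetical toy data, only to show the call pattern.
sample = [
    ["B-PER", "I-PER", "O", "B-LOC"],
    ["O", "B-ORG", "I-ORG", "O"],
    ["B-PER", "O", "B-LOC", "I-LOC"],
]
counts = count_entities_per_class(sample)
print(dict(counts))
print(underrepresented(counts, min_examples=2))
```

Classes reported by `underrepresented` are the ones where I would collect more annotated examples first, and then adjust the threshold based on validation scores.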