BERT SQUAD no answer

Hi!

Yes, the squad_bert_infer.json model was not trained on data with no-answer questions, but examples with no answer can still appear during training: if a paragraph is too long, we cut the input to 384 subtokens (question + paragraph + special tokens), and the answer span may end up outside the part that is kept. So the squad_bert model is able to deal with no-answer questions (it uses the [CLS] token as the no-answer prediction), but it is better to train it specifically on this kind of data.
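
To illustrate how truncation can turn a normal example into a no-answer one, here is a minimal sketch. It is not the actual DeepPavlov preprocessing code: the function name `answer_targets`, the assumed input layout `[CLS] question [SEP] paragraph [SEP]`, and the simple "keep the first tokens" truncation are assumptions made for the example.

```python
MAX_SEQ_LEN = 384  # subtoken budget: question + paragraph + special tokens


def answer_targets(question_tokens, context_tokens, ans_start, ans_end,
                   max_len=MAX_SEQ_LEN):
    """Return (start, end) target positions in the model input.

    Hypothetical helper. ans_start and ans_end are inclusive indices
    into context_tokens. Assumed input layout:
        [CLS] question [SEP] paragraph [SEP]
    """
    # Three special tokens: [CLS], [SEP] after the question, final [SEP].
    budget = max(0, max_len - len(question_tokens) - 3)
    kept_context = context_tokens[:budget]

    # The paragraph starts after [CLS], the question, and the first [SEP].
    offset = len(question_tokens) + 2

    if ans_end < len(kept_context):
        # The whole answer survived truncation.
        return offset + ans_start, offset + ans_end

    # The answer was cut off: point both targets at [CLS] (position 0),
    # which plays the role of "no answer".
    return 0, 0


question = ["q"] * 10
paragraph = ["t"] * 500

print(answer_targets(question, paragraph, 5, 7))      # answer is kept
print(answer_targets(question, paragraph, 400, 402))  # answer is cut off
```

With a 10-subtoken question, only the first 371 paragraph subtokens fit, so any answer ending after that point is trained as a [CLS] (no-answer) example.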

If you need a model trained on data that contains no-answer examples, you can try the multi_squad_noans_infer.json config. This model is based on R-Net, and the data used for training is described here: http://docs.deeppavlov.ai/en/master/features/models/squad.html#squad-with-contexts-without-correct-answers . You can also train a BERT-based model on this data.

We don’t have a pre-trained BERT model for the SQuAD 2.0 dataset, but you can train one yourself: all you need is to write a dataset_reader for SQuAD 2.0, or to convert the SQuAD 2.0 dataset to the same format as SQuAD 1.1 and use the squad_bert.json config for training.
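
For the conversion route, a rough sketch could look like the one below. The function name `squad2_to_squad1_format` is made up for the example. The important assumption is how an unanswerable question is encoded: here it gets an empty answer text with `answer_start` set to -1. Please check this against the dataset reader and preprocessor you actually use, since they may expect a different convention.

```python
import json


def squad2_to_squad1_format(squad2):
    """Convert a loaded SQuAD 2.0 dict to the SQuAD 1.1 layout.

    Hypothetical helper. Unanswerable questions are encoded as
    {"text": "", "answer_start": -1}; this convention is an assumption.
    """
    converted = {"version": "1.1", "data": []}
    for article in squad2["data"]:
        new_paragraphs = []
        for paragraph in article["paragraphs"]:
            new_qas = []
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    # No correct answer in this context.
                    answers = [{"text": "", "answer_start": -1}]
                else:
                    answers = [
                        {"text": a["text"], "answer_start": a["answer_start"]}
                        for a in qa["answers"]
                    ]
                new_qas.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "answers": answers,
                })
            new_paragraphs.append({
                "context": paragraph["context"],
                "qas": new_qas,
            })
        converted["data"].append({
            "title": article.get("title", ""),
            "paragraphs": new_paragraphs,
        })
    return converted


# Tiny in-memory example instead of a real dataset file.
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "Paris is the capital of France.",
            "qas": [
                {"id": "1", "question": "What is the capital of France?",
                 "is_impossible": False,
                 "answers": [{"text": "Paris", "answer_start": 0}]},
                {"id": "2", "question": "What is the capital of Spain?",
                 "is_impossible": True, "answers": []},
            ],
        }],
    }],
}

print(json.dumps(squad2_to_squad1_format(sample), indent=2))
```

For the real dataset you would load the SQuAD 2.0 JSON file with `json.load`, run the conversion, and write the result with `json.dump` to the path that your training config reads from.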