How to validate a trained BioBERT model with the SQuAD dataset?

Shafaq · February 2, 2021, 6:17pm

Hi

I am trying to train a BioBERT model on the SQuAD dataset. I am facing two problems:
1- The answers returned for a given context are not accurate.
2- I am not sure whether the model is properly trained.

Following are my questions/contexts and answers from the newly trained model.

x = bot(['Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first case was identified in Wuhan, China, in December 2019. It has since spread worldwide, leading to an ongoing pandemic.'], ['What is coronavirus?'])
print(x)

y = bot(['DeepPavlov is an open-source conversational AI library built on TensorFlow and Keras. DeepPavlov is designed for development of production ready chatbots and complex conversational systems, research in the area of NLP and, particularly, of dialog systems.'], ['What is deeppavlov?'])
print(y)

z = bot(['Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.'], ['What is machine learning?'])
print(z)

a = bot(['Diabetes is a disease in which your blood glucose, or blood sugar, levels are too high. Glucose comes from the foods you eat. Insulin is a hormone that helps the glucose get into your cells to give them energy.'], ['What is diabetes?'])
print(a)
Results:
[['ongoing pandemic'], [242], [1.2818810939788818]]

[['built on TensorFlow and Keras'], [55], [1.2005016803741455]]

[['It is a branch of artificial intelligence based on'], [88], [1.1758620738983154]]

[['levels are too high. Glucose'], [67], [1.076728105545044]]

Any feedback will be of great help!
Thank you!
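For readers interpreting the results above: each output appears to be a triple of answer text, start position, and score. The sketch below assumes the second element is the character offset of the answer inside the context (an assumption, although it is consistent with the first result) and checks that the returned span really occurs there. The helper name `span_matches` is hypothetical and not part of DeepPavlov.

```python
def span_matches(context: str, answer: str, start: int) -> bool:
    """Return True if `answer` occurs in `context` exactly at character offset `start`."""
    return context[start:start + len(answer)] == answer


# Context and output copied from the first call in the post.
context = (
    "Coronavirus disease 2019 (COVID-19) is a contagious disease caused by "
    "severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). "
    "The first case was identified in Wuhan, China, in December 2019. "
    "It has since spread worldwide, leading to an ongoing pandemic."
)
result = [["ongoing pandemic"], [242], [1.2818810939788818]]

# Unpack the first (and only) item of the batch.
answer, start, score = result[0][0], result[1][0], result[2][0]

print(span_matches(context, answer, start))
```

A check like this only confirms that the answer span is well formed. It says nothing about whether the span is a good answer to the question, which is what the EM and F1 metrics measure.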

Shafaq · February 2, 2021, 6:19pm

This is the config file:

{
  "dataset_reader": {
    "class_name": "squad_dataset_reader",
    "data_path": "{DOWNLOADS_PATH}/squad/"
  },
  "dataset_iterator": {
    "class_name": "squad_iterator",
    "seed": 1337,
    "shuffle": true
  },
  "chainer": {
    "in": ["context_raw", "question_raw"],
    "in_y": ["ans_raw", "ans_raw_start"],
    "pipe": [
      {
        "class_name": "bert_preprocessor",
        "vocab_file": "{DOWNLOADS_PATH}/biobert_models/biobert_v1.1_pubmed/vocab.txt",
        "do_lower_case": false,
        "max_seq_length": 384,
        "in": ["question_raw", "context_raw"],
        "out": ["bert_features"]
      },
      {
        "class_name": "squad_bert_mapping",
        "do_lower_case": false,
        "in": ["context_raw", "bert_features"],
        "out": ["subtok2chars", "char2subtoks"]
      },
      {
        "class_name": "squad_bert_ans_preprocessor",
        "do_lower_case": false,
        "in": ["ans_raw", "ans_raw_start", "char2subtoks"],
        "out": ["ans", "ans_start", "ans_end"]
      },
      {
        "class_name": "squad_bert_model",
        "bert_config_file": "{DOWNLOADS_PATH}/biobert_models/biobert_v1.1_pubmed/bert_config.json",
        "pretrained_bert": "{DOWNLOADS_PATH}/biobert_models/biobert_v1.1_pubmed/model.ckpt",
        "save_path": "{MODELS_PATH}/squad_biobert/model",
        "load_path": "{MODELS_PATH}/squad_biobert/model",
        "keep_prob": 0.5,
        "learning_rate": 2e-05,
        "learning_rate_drop_patience": 2,
        "learning_rate_drop_div": 2.0,
        "in": ["bert_features"],
        "in_y": ["ans_start", "ans_end"],
        "out": ["ans_start_predicted", "ans_end_predicted", "logits"]
      },
      {
        "class_name": "squad_bert_ans_postprocessor",
        "in": ["ans_start_predicted", "ans_end_predicted", "context_raw", "bert_features", "subtok2chars"],
        "out": ["ans_predicted", "ans_start_predicted", "ans_end_predicted"]
      }
    ],
    "out": ["ans_predicted", "ans_start_predicted", "logits"]
  },
  "train": {
    "show_examples": false,
    "test_best": false,
    "validate_best": true,
    "log_every_n_batches": 250,
    "val_every_n_batches": 500,
    "batch_size": 10,
    "pytest_max_batches": 2,
    "pytest_batch_size": 5,
    "validation_patience": 10,
    "metrics": [
      {
        "name": "squad_v1_f1",
        "inputs": ["ans", "ans_predicted"]
      },
      {
        "name": "squad_v1_em",
        "inputs": ["ans", "ans_predicted"]
      },
      {
        "name": "squad_v2_f1",
        "inputs": ["ans", "ans_predicted"]
      },
      {
        "name": "squad_v2_em",
        "inputs": ["ans", "ans_predicted"]
      }
    ],
    "tensorboard_log_dir": "{MODELS_PATH}/squad_biobert/logs"
  },
  "metadata": {
    "variables": {
      "ROOT_PATH": "C:/Users/Amjad Enterprises/.deeppavlov",
      "DOWNLOADS_PATH": "{ROOT_PATH}/downloads",
      "MODELS_PATH": "{ROOT_PATH}/models"
    },
    "requirements": [
      "{DEEPPAVLOV_PATH}/requirements/tf.txt",
      "{DEEPPAVLOV_PATH}/requirements/bert_dp.txt"
    ],
    "download": [
      {
        "url": "https://github.com/naver/biobert-pretrained/releases/download/v1.1-pubmed/biobert_v1.1_pubmed.tar.gz",
        "subdir": "{DOWNLOADS_PATH}/biobert_models"
      },
      {
        "url": "https://github.com/naver/biobert-pretrained/releases/download/v1.1-pubmed/biobert_v1.1_pubmed.tar.gz",
        "subdir": "{MODELS_PATH}"
      }
    ]
  }
}

yurakuratov · February 3, 2021, 10:53am

During training, the model reports EM and F1 metrics on the validation set. Did you check them?
On the SQuAD dataset, the model should reach about 80 EM and 88 F1 (see the docs).
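For context on what these two metrics measure, here is a minimal sketch of SQuAD-style exact match (EM) and token-level F1 for a single prediction. It follows the normalization used by the official SQuAD v1.1 evaluation script as I understand it (lowercase, strip punctuation, drop English articles, collapse whitespace). DeepPavlov's `squad_v1_em` and `squad_v1_f1` may differ in detail, and the reference answer below is hypothetical, not taken from SQuAD.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, remove punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


# Prediction from the first post; the reference answer is a hypothetical example.
prediction = "levels are too high. Glucose"
reference = "a disease in which your blood glucose levels are too high"

print(exact_match(prediction, reference))
print(round(f1_score(prediction, reference), 4))
```

These functions return values between 0 and 1 for one example. The figures reported during training are on a 0 to 100 scale and averaged over the whole validation set.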

Shafaq · February 3, 2021, 4:55pm

Here are the numbers.
{"valid": {"eval_examples_count": 10570, "metrics": {"squad_v1_f1": 88.4918, "squad_v1_em": 80.8828, "squad_v2_f1": 88.2996, "squad_v2_em": 80.7001}

They are the same as you said!

yurakuratov · February 5, 2021, 10:20am

These numbers are suspiciously close to the numbers that we get with default BERT-base. Are these numbers for BioBERT after training or for BERT-base?

Shafaq · February 5, 2021, 11:03am

I apologise for the inconvenience. I mistakenly copied the scores from the BERT-base model.

Give me some time and I will post the scores for the BioBERT model.

Shafaq · February 6, 2021, 5:37pm

These are the scores for the BioBERT model.

{"valid": {"eval_examples_count": 10570, "metrics": {"squad_v1_f1": 6.6215, "squad_v1_em": 0.4841, "squad_v2_f1": 6.6023, "squad_v2_em": 0.4825}, "time_spent": "0:06:37", "epochs_done": 0, "batches_seen": 0, "train_examples_seen": 0, "impatience": 0, "patience_limit": 10}}
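One thing worth noting in the report above is that `epochs_done`, `batches_seen`, and `train_examples_seen` are all 0. Going by the field names alone (an interpretation, not something confirmed in this thread), these metrics may come from a validation pass that ran before any training batch was processed. The sketch below parses the report and checks those counters; the helper name `was_trained` is hypothetical.

```python
import json

# Validation report copied from the post above.
report_line = (
    '{"valid": {"eval_examples_count": 10570, "metrics": {"squad_v1_f1": 6.6215, '
    '"squad_v1_em": 0.4841, "squad_v2_f1": 6.6023, "squad_v2_em": 0.4825}, '
    '"time_spent": "0:06:37", "epochs_done": 0, "batches_seen": 0, '
    '"train_examples_seen": 0, "impatience": 0, "patience_limit": 10}}'
)


def was_trained(report: dict) -> bool:
    """Return True if the report shows at least one training batch or example."""
    valid = report["valid"]
    return valid["batches_seen"] > 0 or valid["train_examples_seen"] > 0


report = json.loads(report_line)
print(was_trained(report))
print(report["valid"]["metrics"]["squad_v1_f1"])
```

If the check returns False, a later report with non-zero counters would be the one to compare against the expected EM and F1.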
