I fine-tuned a NER model on some data with this config:
config_for_train['dataset_reader']['data_path'] = TRAIN_FILES_DIR #train, val, test
config_for_train['chainer']['pipe'][1]['save_path'] = tag_save_path #new model dir
config_for_train['chainer']['pipe'][2]['return_probas'] = False
config_for_train['chainer']['pipe'][2]['save_path'] = model_save_path #new model dir
config_for_train['chainer']['pipe'][2]['out'] = ['y_pred_ind']
new model folder:
So I changed the standard OntoNotes BERT config with the code below and built the model from the resulting config:
self.config_dict["chainer"]["pipe"][1]["load_path"] = str(MODELS_DIR) + \
f"/{model_id}/tag.dict"
self.config_dict["chainer"]["pipe"][2]["load_path"] = str(MODELS_DIR) + \
f"/{model_id}/model"
When I try to find entities in "United States, country in North America that is a federal republic of 50 states and was founded in 1776",
it returns something strange:
[[['United', 'States', ',', 'country', 'in', 'North', 'America', 'that', 'is', 'a', 'federal', 'republic', 'of', '50', 'states', 'and', 'was', 'founded', 'in', '1776']], [['I-EVENT', 'I-CARDINAL', 'I-CARDINAL', 'I-LAW', 'I-EVENT', 'I-EVENT', 'I-EVENT', 'I-EVENT', 'I-MONEY', 'I-CARDINAL', 'I-EVENT', 'I-EVENT', 'I-MONEY', 'I-EVENT', 'I-CARDINAL', 'I-MONEY', 'I-EVENT', 'I-EVENT', 'I-EVENT', 'I-EVENT']]]
The standard OntoNotes NER BERT model returns:
[[['United', 'States', ',', 'country', 'in', 'North', 'America', 'that', 'is', 'a', 'federal', 'republic', 'of', '50', 'states', 'and', 'was', 'founded', 'in', '1776']], [['B-GPE', 'I-GPE', 'O', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'O', 'O', 'O', 'O', 'O', 'B-CARDINAL', 'O', 'O', 'O', 'O', 'O', 'B-DATE']]]
I think the problem may be related to tag.dict: the predicted labels look like valid OntoNotes tags, just assigned to the wrong tokens.
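One way to check this suspicion is to compare the tag.dict written during fine-tuning with the one shipped with the standard model. If the two files list the tags in a different order, the predicted indices could be decoded into the wrong labels. Below is a minimal sketch, assuming each line of tag.dict holds a tag as its first tab-separated field (I have not verified the exact file format). The helper names `read_tags` and `compare_tag_files` are hypothetical and are not part of DeepPavlov.

```python
from pathlib import Path


def read_tags(path):
    """Return the tags from a vocabulary file, in file order.

    Assumption: one entry per line, with the tag as the first
    tab-separated field (anything after the first tab is ignored).
    """
    tags = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # skip blank lines
            tags.append(line.split("\t")[0])
    return tags


def compare_tag_files(path_a, path_b):
    """Report tags present in only one file and positions where the order differs."""
    a, b = read_tags(path_a), read_tags(path_b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        # Compared position by position up to the length of the shorter file.
        "order_mismatches": [
            (i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y
        ],
    }
```

Calling `compare_tag_files(original_tag_dict_path, finetuned_tag_dict_path)` with the two file paths (both placeholders here) returns a dict. A non-empty `order_mismatches` or `only_in_*` list would suggest that the tag-to-index mapping used at inference differs from the one used in training.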