I don't understand how to train a NER model on my own tags; it seems I'm missing some steps. I followed the recommendations from here: ner_few_shot_ru | Fine-tuning the model · Issue #1071 · deeppavlov/DeepPavlov · GitHub
This is my code:
import json

from deeppavlov import configs, build_model, train_model

with configs.ner.ner_rus_bert.open(encoding='utf8') as f:
    ner_config = json.load(f)

ner_config['dataset_reader']['data_path'] = 'contents/my_data/'  # directory with train.txt, valid.txt and test.txt files
ner_config['metadata']['variables']['NER_PATH'] = 'contents/'
ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]]  # do not download the pretrained ontonotes model

ner_model = train_model(ner_config, download=True)
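For context, my train.txt, valid.txt and test.txt in contents/my_data/ follow the two-column CoNLL-style format that, as far as I understand, the config's dataset reader expects: one token and its tag per line, with a blank line between sentences. A minimal sketch of how I generate such a file (B-TERM is a hypothetical stand-in for my own tag; my real tag set has only two tags, which matches the [2, 768] head in the error below):

import os

os.makedirs('contents/my_data', exist_ok=True)

# Illustration only, not my real data: "token<space>tag" per line,
# blank line between sentences; B-TERM/O stand in for my custom tags.
sample = """Президент O
посетил O
Москву B-TERM
. O

Это O
тест O
. O
"""

with open('contents/my_data/train.txt', 'w', encoding='utf8') as f:
    f.write(sample)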
Nevertheless, when I run the training code above, I get this error:
2024-05-14 11:17:30.315 INFO in 'deeppavlov.core.data.utils'['utils'] at line 97: Downloading from http://files.deeppavlov.ai/v1/ner/ner_rus_bert_torch_new.tar.gz to /root/.deeppavlov/models/ner_rus_bert_torch_new.tar.gz
100%|██████████| 1.44G/1.44G [01:13<00:00, 19.6MB/s]
2024-05-14 11:18:44.831 INFO in 'deeppavlov.core.data.utils'['utils'] at line 284: Extracting /root/.deeppavlov/models/ner_rus_bert_torch_new.tar.gz archive into /root/.deeppavlov/models/ner_rus_bert_torch
2024-05-14 11:19:22.229 WARNING in 'deeppavlov.core.trainers.fit_trainer'['fit_trainer'] at line 66: TorchTrainer got additional init parameters ['pytest_max_batches', 'pytest_batch_size'] that will be ignored:
2024-05-14 11:19:23.721 INFO in 'deeppavlov.core.data.simple_vocab'['simple_vocab'] at line 104: [saving vocabulary to /root/.deeppavlov/models/ner_rus_bert_torch/tag.dict]
Some weights of the model checkpoint at DeepPavlov/rubert-base-cased were not used when initializing BertForTokenClassification: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at DeepPavlov/rubert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-05-14 11:19:29.348 WARNING in 'deeppavlov.core.models.torch_model'['torch_model'] at line 96: Unable to place component TorchTransformersSequenceTagger on GPU, since no CUDA GPUs are available. Using CPU.
2024-05-14 11:19:30.838 ERROR in 'deeppavlov.core.common.params'['params'] at line 108: Exception in <class 'deeppavlov.models.torch_bert.torch_transformers_sequence_tagger.TorchTransformersSequenceTagger'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/deeppavlov/core/common/params.py", line 102, in from_params
    component = obj(**dict(config_params, **kwargs))
  File "/usr/local/lib/python3.10/dist-packages/deeppavlov/models/torch_bert/torch_transformers_sequence_tagger.py", line 173, in __init__
    super().__init__(model, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/deeppavlov/core/models/torch_model.py", line 84, in __init__
    self.load()
  File "/usr/local/lib/python3.10/dist-packages/deeppavlov/models/torch_bert/torch_transformers_sequence_tagger.py", line 253, in load
    super().load(fname)
  File "/usr/local/lib/python3.10/dist-packages/deeppavlov/core/models/torch_model.py", line 144, in load
    self.model.load_state_dict(model_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BertForTokenClassification:
    size mismatch for classifier.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
    size mismatch for classifier.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([2]).
RuntimeError                              Traceback (most recent call last)
in <cell line: 11>()
      9 ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]]  # do not download the pretrained ontonotes model
     10 
---> 11 ner_model = train_model(ner_config, download=True)

9 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1669 
   1670         if len(error_msgs) > 0:
-> 1671             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   1672                 self.__class__.__name__, "\n\t".join(error_msgs)))
   1673         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for BertForTokenClassification:
    size mismatch for classifier.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
    size mismatch for classifier.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([2]).
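My guess is that train_model builds the tagger with a classifier head sized for my tag.dict (2 tags) and then tries to load the downloaded ner_rus_bert checkpoint, whose head was trained for 7 tags, hence the shape mismatch. As a workaround I tried deleting the saved torch checkpoint before training, so that training starts from the bare rubert-base-cased weights; this is just my attempt, using the paths from the log above, and I'm not sure it is the intended approach:

from pathlib import Path

# My assumption: the [7, 768] tensors come from the downloaded
# ner_rus_bert checkpoint extracted into the directory shown in the
# log, and model.pth.tar is the weights file the model tries to load.
ckpt = Path('/root/.deeppavlov/models/ner_rus_bert_torch/model.pth.tar')
if ckpt.exists():
    ckpt.unlink()  # drop the 7-tag checkpoint so nothing mismatched is loaded

# download=False so the tarball is not fetched and re-extracted over my changes
ner_model = train_model(ner_config, download=False)

Is deleting the checkpoint the right way to fine-tune on a custom tag set, or should I instead change MODEL_PATH or the 'download' section of the config so that the old checkpoint is never placed there?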