I have trained a model from scratch on my custom dataset. Now I want to add some more tags, and I don’t want to build a new model from scratch. Is there any way to retrain my previous model with the new dataset?
Since the number of tags has changed (increased), you will have to re-train the model from scratch (from pre-trained BERT weights). You can mix your previous dataset with the new one and merge their sets of labels.
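A minimal sketch of the merging step, assuming a hypothetical toy format where each dataset is a list of (tokens, tags) pairs (the real dataset format depends on your reader/iterator):

```python
# hypothetical toy format: each dataset is a list of (tokens, tags) pairs
old_data = [(["John", "lives", "here"], ["B-PER", "O", "O"])]
new_data = [(["Acme", "Corp"], ["B-ORG", "I-ORG"])]

# combined training data: simply concatenate the two datasets
combined = old_data + new_data

# merged label set: the union of tags seen in both datasets
labels = sorted({tag for _, tags in combined for tag in tags})
print(labels)  # ['B-ORG', 'B-PER', 'I-ORG', 'O']
```

The merged label list then defines the output size of the classification head when you re-train.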
Suppose my previous model is trained on 10K unique tags and I just want to add 1K new tags. Do I have to train it again with all 10K + 1K tags? Is there any way to train it with only the 1K new tags?
It is possible in theory, but it is not supported by DeepPavlov. You could reset the classification head, initialize it randomly, and then start training on the 1K new tags. This approach requires some manipulation of the model checkpoint and is not straightforward.
Can you give me an idea of how to do it? @yurakuratov
OK, the model consists of its body (transformer layers) and a classification head.
I need to note that this will break the prediction power of your model for all of your previous 10K tags. It will re-use some knowledge from the body, but will completely forget how to predict the old tags.
Here is an example of how to do something like this:
```python
from deeppavlov import build_model, configs

model = build_model(configs.ner.ner_ontonotes_bert_torch, download=True)

# take the pytorch part of the model
pt_model = model.pipe.model

# pt_model has a classifier head
pt_model.classifier
# which is Linear(in_features=768, out_features=YOUR_10K_TAGS, bias=True)
```
So, you can set a random classification head:

```python
pt_model.classifier = torch.nn.Linear(...)
```

and save `pt_model` separately. Then you can use this model as initialization for your training.
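To make the head-reset step concrete, here is a self-contained sketch. The `TokenClassifier` class below is a hypothetical stand-in for the DeepPavlov pytorch model (in practice you would use `pt_model = model.pipe.model` as above), and `NEW_NUM_TAGS` and the checkpoint filename are assumptions:

```python
import torch

# hypothetical stand-in for the DeepPavlov pytorch model;
# in practice: pt_model = model.pipe.model
class TokenClassifier(torch.nn.Module):
    def __init__(self, hidden_size=768, num_tags=10000):
        super().__init__()
        self.classifier = torch.nn.Linear(hidden_size, num_tags)

pt_model = TokenClassifier()

NEW_NUM_TAGS = 1000  # assumed size of the new tag set
hidden = pt_model.classifier.in_features  # 768 for BERT-base

# reset: replace the old 10K-way head with a randomly initialized 1K-way one
pt_model.classifier = torch.nn.Linear(hidden, NEW_NUM_TAGS)

# save the modified checkpoint to use as initialization for training
torch.save(pt_model.state_dict(), "model_with_new_head.pth")
```

The body weights are kept as-is; only the final linear layer is re-initialized, which is why the model forgets the old tags but can still re-use the body's representations.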
But I would definitely suggest training the model on the combined dataset.