How to add gradient accumulation to my model?

MuhammedTech · April 20, 2021, 5:24pm

I am following your tutorial, dp_tutorials/Tutorial_3_RU_Fine_tuning_BERT_classifier.ipynb (master branch of deepmipt/dp_tutorials on GitHub).

However, I keep running out of memory, so I decided to use gradient accumulation to reduce memory usage. How can I add it to my model? I can't find any tutorial that explains it well.

Thank you

yurakuratov · April 21, 2021, 7:42am

Hi!

We do not have built-in gradient accumulation for our models, but it is straightforward to add.
You should pass the full batch to the model.train_on_batch / model.__call__ methods and split it into sub-batches inside these methods. I would suggest inheriting from TorchTransformersClassifierModel and providing your own implementations of train_on_batch and __call__.
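
To make the splitting concrete, here is a minimal sketch in plain PyTorch. It is not DeepPavlov code: the function name train_on_batch_accumulated, the sub_batch_size parameter and the toy linear model are assumptions for illustration only. In a subclass of TorchTransformersClassifierModel, a loop like this would go inside your overridden train_on_batch, using whatever model, optimizer and loss that class holds.

```python
import torch
from torch import nn


def train_on_batch_accumulated(model, optimizer, loss_fn, features, labels, sub_batch_size):
    """Do one optimizer step on a full batch, processed in smaller sub-batches.

    Hypothetical helper: assumes loss_fn averages over the examples it receives.
    """
    model.train()
    optimizer.zero_grad()
    total = features.shape[0]
    total_loss = 0.0
    for start in range(0, total, sub_batch_size):
        x = features[start:start + sub_batch_size]
        y = labels[start:start + sub_batch_size]
        loss = loss_fn(model(x), y)
        # Weight each sub-batch by its share of the full batch, so the
        # accumulated gradient matches the gradient of the full-batch mean loss.
        scaled = loss * (x.shape[0] / total)
        # backward() adds to the existing .grad buffers; only one sub-batch's
        # activations are kept in memory at a time.
        scaled.backward()
        total_loss += scaled.item()
    # A single parameter update for the whole batch.
    optimizer.step()
    return total_loss


# Toy usage: a linear classifier stands in for the BERT model.
torch.manual_seed(0)
toy_model = nn.Linear(4, 3)
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
toy_features = torch.randn(10, 4)
toy_labels = torch.randint(0, 3, (10,))
toy_loss = train_on_batch_accumulated(
    toy_model, toy_optimizer, nn.CrossEntropyLoss(),
    toy_features, toy_labels, sub_batch_size=4,
)
```

The scaling by sub-batch share matters because the last sub-batch can be smaller than the others; with a mean-reduced loss, dividing every sub-batch loss by the number of sub-batches would slightly over-weight it. For __call__ (inference) no accumulation is needed: you can run the sub-batches one by one under torch.no_grad() and concatenate the predictions.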
