I am following this tutorial of yours: dp_tutorials/Tutorial_3_RU_Fine_tuning_BERT_classifier.ipynb at master · deepmipt/dp_tutorials · GitHub
However, I keep running out of memory, so I would like to use gradient accumulation: train on smaller batches to reduce memory usage while keeping the same effective batch size. How can I add it to the model from the tutorial? I can't find any tutorial that explains it well.
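To make clear what I mean by gradient accumulation, here is a minimal plain-Python sketch of my understanding. It is not taken from the tutorial and does not use any DeepPavlov API. The toy model `y = w * x` and the names `grad_mse`, `sgd_step_accumulated` and `micro_batch_size` are my own placeholders, and it assumes the batch size is divisible by the micro-batch size.

```python
# Framework-agnostic sketch of gradient accumulation (hypothetical names,
# toy model y = w * x with mean squared error; not DeepPavlov code).

def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def sgd_step_accumulated(w, xs, ys, micro_batch_size, lr):
    """One SGD step whose gradient is accumulated over micro-batches.

    Only one micro-batch is processed at a time, which is where the
    memory saving would come from in a real model.
    """
    starts = range(0, len(xs), micro_batch_size)
    accum_steps = len(starts)
    accumulated = 0.0
    for start in starts:
        mb_x = xs[start:start + micro_batch_size]
        mb_y = ys[start:start + micro_batch_size]
        # Scale each micro-batch gradient so the sum equals the mean
        # gradient over the full batch (assumes equal micro-batch sizes).
        accumulated += grad_mse(w, mb_x, mb_y) / accum_steps
    # The weight is updated once, after all micro-batches are seen.
    return w - lr * accumulated


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]

# One step on the full batch of 8 versus 4 accumulated micro-batches of 2.
w_full = 0.5 - 0.01 * grad_mse(0.5, xs, ys)
w_accum = sgd_step_accumulated(0.5, xs, ys, micro_batch_size=2, lr=0.01)
print(abs(w_full - w_accum) < 1e-9)
```

What I can't figure out is where the equivalent of this loop (accumulate over several small batches, then apply one update) would go in the tutorial's model.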
Thank you