I am fine-tuning a model on a custom dataset. At training start, I get the warning "None of the inputs have requires_grad=True. Gradients will be None". I made this warning disappear by adding `use_reentrant=False` to the three `checkpoint()` calls in transformer.py. Interestingly, this also improved performance in the train/val loss and in cross-modal retrieval, simply by setting `use_reentrant=False`!
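A minimal, self-contained sketch of the situation (the module and names here are illustrative stand-ins, not the actual transformer.py code): checkpointing a segment whose inputs don't require grad triggers this warning under the default reentrant implementation, while `use_reentrant=False` does not.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Illustrative stand-in for a checkpointed tower; not the real transformer.py code.
tower = nn.Sequential(nn.Linear(16, 16), nn.GELU(), nn.Linear(16, 16))
for p in tower.parameters():
    p.requires_grad_(False)  # e.g. a locked image/text tower

x = torch.randn(4, 16)  # plain data input: requires_grad is False

y = checkpoint(tower, x, use_reentrant=True)   # warns: "None of the inputs have requires_grad=True..."
y = checkpoint(tower, x, use_reentrant=False)  # no warning
```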
The problem does not occur when I remove the arguments that lock parts of the model from my training command. It might be related to the following warning from the PyTorch docs (https://pytorch.org/docs/stable/checkpoint.html):

> If use_reentrant=True is specified, at least one of the inputs needs to have requires_grad=True if grads are needed for model inputs, otherwise the checkpointed part of the model won't have gradients. At least one of the outputs needs to have requires_grad=True as well. Note that this does not apply if use_reentrant=False is specified.

Do you know what the underlying issue is?
Hmm, I would have thought this works as long as you don't lock the full image or text towers... but perhaps not; it may not be a good idea to checkpoint the parts of the model that have gradients disabled.

We should probably set use_reentrant=False, but it's never been clear to me what the downside of that is. The PyTorch docs mention many pluses of =False, so why was =True the default? Hohumm.
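For what it's worth, the failure mode the docs warn about can be reproduced in isolation. With `use_reentrant=True`, the checkpointed segment is a custom autograd Function whose output only requires grad if one of its tensor inputs does; parameters captured inside the segment don't count. So a segment fed by non-grad inputs silently contributes no parameter gradients even when it contains trainable weights, which would explain the train/val improvement with `use_reentrant=False`. A hedged, self-contained sketch (all names illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

def block_grad(use_reentrant: bool):
    torch.manual_seed(0)
    block = nn.Linear(4, 4)               # trainable params inside the checkpointed segment
    scale = nn.Parameter(torch.ones(()))  # a second trainable path, so the loss
                                          # itself still requires grad either way
    x = torch.randn(2, 4)                 # data input: requires_grad is False
    out = checkpoint(block, x, use_reentrant=use_reentrant)
    (out.sum() * scale).backward()
    return block.weight.grad

print(block_grad(True))   # None: the segment's gradients were silently dropped
print(block_grad(False))  # a real gradient tensor
```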