Hi!
Thanks for this little piece of juicy code!
Just out of curiosity: I noticed that your implementation uses `nn.LayerNorm` with the default denominator constant `eps=1e-5`, whereas other implementations (DINO [here] and ViT in `timm` [here]) explicitly set this parameter to `eps=1e-6`.
I know it is a small detail, but small details are sometimes very important for getting better models.
Do you think the model is sensitive to this kind of parameter change? Have you ever tried it or noticed a difference?
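
To make the question concrete, here is a minimal plain-Python sketch of the normalization step (not code from this repository, and without the learnable affine part). It assumes the usual LayerNorm formula `(x - mean) / sqrt(var + eps)`; the helper names `layer_norm` and `max_abs_diff` and the sample inputs are made up for illustration. It suggests that the choice of `eps` should only matter when the per-token variance is close to `eps` itself:

```python
import math

def layer_norm(x, eps):
    """Normalize a 1-D list of floats the way LayerNorm does (no affine part)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)  # biased variance
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def max_abs_diff(x):
    """Largest element-wise gap between the eps=1e-5 and eps=1e-6 outputs."""
    a = layer_norm(x, eps=1e-5)
    b = layer_norm(x, eps=1e-6)
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical inputs: same shape, very different scale.
wide = [-2.0, -1.0, 0.0, 1.0, 2.0]   # variance 2.0, eps is negligible
narrow = [v * 1e-3 for v in wide]    # variance 2e-6, comparable to eps

diff_wide = max_abs_diff(wide)       # tiny: eps barely changes the output
diff_narrow = max_abs_diff(narrow)   # large: eps dominates the denominator

print(f"wide input:   {diff_wide:.2e}")
print(f"narrow input: {diff_narrow:.2e}")
```

In PyTorch the corresponding change would be passing `eps=1e-6` to `nn.LayerNorm`. Whether activations in the real model ever get close to that low-variance regime is exactly what I am unsure about.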
Thanks!