Hyperparameter Sweep and NaNs #16
Hi Joakim,
Feel free to reach out with more details about your training setup if the issue persists! Best regards,
Thank you for the insights, Yifei! When comparing my config with yours, they are pretty much identical. Best
I'm working on reproducing your results for EViT, ATS, DynamicViT, etc. However, I often run into NaNs about 1/3 to 1/2 of the way through training. It doesn't matter whether I reserve the prior features or scatter onto a zero matrix. I use the config from SViT with no adjustments to the optimizer.
Did you observe similar behavior, and what hyperparameters did you use to train the different models: just one fixed set (i.e. lr = 1e-5), or did you do a sweep per method?
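To make the question concrete, here is a minimal sketch of what I mean by a per-method sweep with a NaN guard. Everything in it is a placeholder: `run_with_nan_guard`, `sweep`, and `toy_step` are hypothetical names, and `toy_step` is a stand-in for a real training step, not code from this repository. It only records, for each method and learning rate, the last finite loss and the step at which the loss first became non-finite.

```python
import math


def run_with_nan_guard(train_step, lr, num_steps):
    """Run up to num_steps and stop at the first non-finite loss.

    Returns (last_finite_loss, diverged_at_step); diverged_at_step is None
    if the run finished without a NaN/inf loss.
    """
    last_finite = None
    for step in range(num_steps):
        loss = train_step(lr, step)
        if not math.isfinite(loss):
            return last_finite, step
        last_finite = loss
    return last_finite, None


def sweep(methods, learning_rates, num_steps):
    """Try every learning rate for every method and collect the outcomes."""
    results = {}
    for name, train_step in methods.items():
        for lr in learning_rates:
            results[(name, lr)] = run_with_nan_guard(train_step, lr, num_steps)
    return results


def toy_step(lr, step):
    # Hypothetical stand-in for a real training step: it pretends the loss
    # turns into NaN halfway through whenever the learning rate is too large.
    if lr > 1e-4 and step >= 5:
        return float("nan")
    return 1.0 / (step + 1)


if __name__ == "__main__":
    for key, outcome in sweep({"toy": toy_step}, [1e-5, 1e-3], 10).items():
        print(key, outcome)
```

In a real run, `train_step` would wrap one optimizer step of the actual model, and the dictionary of results would show which methods need their own learning rate.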