LearnableSqueezer #94
base: master
Conversation
```julia
# Gradient test (parameters)
T = Float64
C = LearnableSqueezer(k[1:N]...) |> device; C.stencil_pars.data = InvertibleNetworks.CUDA.cu(C.stencil_pars.data)
```
Those explicit calls to CUDA will break CI. We need to use `device` here, even if it's not optimal.
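For reference, a minimal sketch of the device-agnostic pattern suggested here, assuming `device` is the test-suite helper that moves arrays to the GPU when one is available and is a no-op otherwise:

```julia
# Sketch only: rely on the `device` helper instead of calling CUDA directly,
# so CI machines without a GPU still run this test.
T = Float64
C = LearnableSqueezer(k[1:N]...) |> device  # parameters land on GPU iff one is available

# No InvertibleNetworks.CUDA.cu(...) call: the same line works on CPU-only CI.
```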
One issue that I have with `|> device` is the automatic conversion to Float32; here I was using Float64 to avoid numerical cancellation spoiling the gradient test. Anyway, I will rewrite the test so that it is more consistent with the rest of the tests.
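To illustrate the cancellation concern, here is a standalone sketch with a hypothetical scalar function standing in for the network loss (not the package's actual gradient test):

```julia
using Printf

f(x)  = sin(3x) + x^2          # toy "loss"
df(x) = 3cos(3x) + 2x          # its analytic gradient

for T in (Float32, Float64)
    x, dx = T(0.7), T(1.0)
    for h in T.(10.0 .^ (-1:-1:-6))
        # First-order Taylor remainder: should shrink like O(h^2), but in
        # Float32 the subtraction f(x + h*dx) - f(x) cancels most significant
        # digits once h is small, and the remainder plateaus at round-off level.
        r = abs(f(x + h*dx) - f(x) - h*df(x)*dx)
        @printf("%s  h=%.0e  remainder=%.3e\n", T, h, r)
    end
end
```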
Related to this, but perhaps for later: I think it might be a good idea to make every InvertibleNetwork subtype parametric. It would make these conversions easier and allow half-precision nets to be included more naturally, unless there are more important reasons not to do that.
> Related to this, but perhaps for later: I think it might be a good idea to make every InvertibleNetwork subtype parametric.

Yes, completely agree!
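A hypothetical illustration of the parametric-subtype idea (a standalone sketch: the abstract type, layer, and field names below are stand-ins, not the package's actual definitions):

```julia
abstract type AbstractInvertibleNetwork end

# Precision is carried in the type parameter...
struct ToyLayer{T<:AbstractFloat} <: AbstractInvertibleNetwork
    weights::Matrix{T}
end

# ...so changing precision is one generic conversion instead of ad-hoc,
# per-field cu(...)/Float32.(...) calls scattered through the tests:
ToyLayer{T}(L::ToyLayer) where {T<:AbstractFloat} = ToyLayer{T}(Matrix{T}(L.weights))

net32 = ToyLayer{Float32}(ToyLayer(rand(Float64, 4, 4)))  # Float32 copy
net16 = ToyLayer{Float16}(net32)                          # half-precision copy
```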
@grizzuti In my experience, the random seed for the gradient test can sometimes be finicky, so I have set the tests to rerun a few times before calling it a failure. I think this is particularly relevant when the test fails on a single Julia version. I will try this on this branch, and if that 1.6 version passes we should merge.
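A hedged sketch of the rerun-before-failing idea (the helper name and retry count are hypothetical, not the actual test-suite code):

```julia
using Test, Random

# Retry a stochastic test with fresh seeds before declaring failure.
function passes_eventually(test_fn; ntries=3)
    for attempt in 1:ntries
        Random.seed!(attempt)        # different random draw each attempt
        test_fn() && return true
    end
    return false
end

# E.g. a gradient test that occasionally draws an unlucky perturbation:
@test passes_eventually(() -> rand() > 0.05)
```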
Added a "learnable squeezer" layer, typically used in invertible U-nets (see Etmann et al., 2020, https://arxiv.org/abs/2005.05220).
Some other very minor changes, the most important of which is removing the type "InvertibleLayer" and replacing it with "InvertibleNetwork". I didn't really see a need for a separate "InvertibleLayer" type.
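For context, a hedged usage sketch of the new layer. The constructor arguments follow the `LearnableSqueezer(k...)` call in the test snippet above (one stencil size per spatial dimension), but the exact signature, the `.forward`/`.inverse` call style, and the 4D input shape are assumptions:

```julia
using InvertibleNetworks

# Assumed: one stencil size per spatial dimension, e.g. a 2×2 squeeze.
C = LearnableSqueezer(2, 2)

X = randn(Float32, 16, 16, 4, 1)   # (nx, ny, nchannels, nbatch)
Y = C.forward(X)                    # spatial dims shrink, channels grow; invertible
X̂ = C.inverse(Y)                   # recovers X up to floating-point error
```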