
Question about the scale and shift operation in the Instance Normalization layer #6

Open
vd001 opened this issue Aug 31, 2018 · 2 comments

Comments

@vd001

vd001 commented Aug 31, 2018

Hi, I tried to replace the Instance Normalization (IN) layer with "MVN_layer + Scale_layer" in Caffe as in issue #4, but found the network hard to converge. When I remove every Scale layer following MVN (i.e. use the MVN layer only), the network converges.
My questions are:
1. If I replace IN with MVN layers only in Caffe, does it hurt the generalization or transfer ability of IBN-Net, or is the Scale layer really important?
2. What makes the network hard to converge when the MVN layer is followed by Scale layers?
Thanks again!
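
For reference, here is a minimal PyTorch sketch (not the Caffe setup discussed here) of the decomposition in question: instance normalization with affine parameters amounts to per-sample, per-channel mean/variance normalization (the MVN part) followed by a learnable scale and shift.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 64, 32, 32)                        # N, C, H, W

# Built-in instance norm with learnable scale/shift enabled
inorm = nn.InstanceNorm2d(64, affine=True)
y_ref = inorm(x)

# Manual equivalent: normalize over the spatial dims ("MVN"), then scale and shift
mean = x.mean(dim=(2, 3), keepdim=True)
var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
x_hat = (x - mean) / torch.sqrt(var + inorm.eps)      # the MVN part
y_manual = inorm.weight.view(1, -1, 1, 1) * x_hat + inorm.bias.view(1, -1, 1, 1)

print(torch.allclose(y_ref, y_manual, atol=1e-5))     # True
```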

@XingangPan
Owner

@vd001 I may not be able to answer your question since I haven't tried IBN-Net without scale layers. In PyTorch the scale layer does not interfere with convergence.
You may give it a try and see if the model works well without scale layers.
BTW, you may check whether the settings of the Scale layers are correct. For example, 'scale' and 'shift' should be initialized to 1 and 0 respectively, and they should have a proper learning rate.
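
As a point of comparison (PyTorch, not the Caffe configuration in question): nn.InstanceNorm2d with affine=True creates exactly these two parameters and initializes them to 1 and 0, which is the behaviour a Caffe Scale layer after MVN would need to reproduce.

```python
import torch.nn as nn

# With affine=True, InstanceNorm2d carries a learnable scale ("weight") and
# shift ("bias"), initialized to 1 and 0 respectively.
inorm = nn.InstanceNorm2d(64, affine=True)
print(inorm.weight.data.unique())   # tensor([1.])
print(inorm.bias.data.unique())     # tensor([0.])
```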

@lihui52

lihui52 commented Feb 19, 2019

@vd001 I have encountered the same problem as you. Have you solved it?
