Hi, I tried to replace the Instance Normalization (IN) layer with "MVN_layer + Scale_layer" in Caffe, as in issue #4, but found the network hard to converge. When I remove every Scale layer following MVN (i.e., use the MVN layer only), the network converges.
My questions are:
1. If I replace IN with MVN layers only in Caffe, does it hurt the generalization or transfer ability of IBN-Net, or is the Scale layer really important?
2. What makes the net hard to converge when the MVN layer is followed by Scale layers?

Thanks again!
@vd001 I may not be able to answer your question, since I haven't tried IBN-Net without Scale layers. In PyTorch the scale (affine) parameters do not interfere with convergence.
You may give it a try and see if the model works well without Scale layers.
BTW, you may check whether the settings of the Scale layers are correct. For example, the scale and shift should be initialized to 1 and 0, and they should have a proper learning rate.
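For reference, here is a minimal sketch of what such an MVN + Scale pair could look like in prototxt. The layer and blob names (`conv1`, `conv1_mvn`) are placeholders, and the `lr_mult`/`decay_mult` values are assumptions you would tune for your own solver; the key points are `across_channels: false` (so MVN normalizes each channel of each sample, like IN), a scale filler of 1, and a bias filler of 0:

```
layer {
  name: "conv1_mvn"
  type: "MVN"
  bottom: "conv1"   # hypothetical input blob
  top: "conv1_mvn"
  mvn_param {
    normalize_variance: true
    across_channels: false  # per-channel normalization, as in IN
    eps: 1e-5
  }
}
layer {
  name: "conv1_scale"
  type: "Scale"
  bottom: "conv1_mvn"
  top: "conv1_scale"
  param { lr_mult: 1 decay_mult: 0 }  # learnable scale (gamma)
  param { lr_mult: 1 decay_mult: 0 }  # learnable shift (beta)
  scale_param {
    axis: 1                      # one scale/shift per channel
    bias_term: true
    filler { value: 1.0 }        # scale initialized to 1
    bias_filler { value: 0.0 }   # shift initialized to 0
  }
}
```

If the fillers are left at their defaults the scale starts at 0, which zeroes out the activations and can easily stall training, so this initialization is worth double-checking first.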