Fix incorrect initialization of GenericFusedScaleMaskSoftmax #1586

Open · wants to merge 2 commits into master
Conversation

@qmdnls commented on Feb 15, 2023

Summary

Fixes an error in the initialization of GenericFusedScaleMaskSoftmax that caused generic_scaled_masked_softmax to never actually be called.

Fixes an import error in GenericScaledMaskedSoftmax.backward that had also gone unnoticed.

Also adds a test for GenericFusedScaleMaskSoftmax, similar to the existing one for FusedScaleMaskSoftmax, to prevent regressions.

Details

Currently, when GenericFusedScaleMaskSoftmax is initialized, we set self.scaled_masked_softmax_fusion to the generic_scaled_masked_softmax function. I believe this is done in error: the function called by the fused forward path is self.fused_softmax_func, not self.scaled_masked_softmax_fusion (which is supposed to be the boolean flag indicating whether to use the fused version).

In practice this means that GenericFusedScaleMaskSoftmax never actually uses generic_scaled_masked_softmax, and that is_kernel_available will return True even when the module is initialized with scaled_masked_softmax_fusion = False, since the flag has been overwritten with a function object, which is always truthy.
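
To make the failure mode concrete, here is a minimal, self-contained sketch of the buggy and fixed initialization. The class and attribute names follow the PR description; the bodies are simplified stand-ins, not the actual upstream code.

```python
# Simplified stand-ins; names follow the PR description, the bodies are
# not the real fused-kernel implementations.

def generic_scaled_masked_softmax(inputs, mask, scale):
    """Stand-in for the generic fused scale-mask-softmax kernel wrapper."""
    raise NotImplementedError


class GenericFusedScaleMaskSoftmax:
    def __init__(self, scaled_masked_softmax_fusion):
        # Buggy version: the boolean flag was overwritten with a function
        # object, which is always truthy, so is_kernel_available() reported
        # True even when the caller passed scaled_masked_softmax_fusion=False:
        #
        #     self.scaled_masked_softmax_fusion = generic_scaled_masked_softmax
        #
        # Fixed version: keep the flag a flag, and point the attribute that
        # the fused forward path actually calls at the generic kernel.
        self.scaled_masked_softmax_fusion = scaled_masked_softmax_fusion
        self.fused_softmax_func = generic_scaled_masked_softmax

    def is_kernel_available(self):
        # With the fix this honors the user-supplied flag again; before the
        # fix, the overwritten attribute made this unconditionally truthy.
        return bool(self.scaled_masked_softmax_fusion)

    def forward_fused_softmax(self, inputs, mask, scale):
        # The fused path dispatches through fused_softmax_func, which is why
        # assigning the kernel to scaled_masked_softmax_fusion never took
        # effect: nothing in the forward path ever called that attribute.
        return self.fused_softmax_func(inputs, mask, scale)
```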

GenericFusedScaleMaskSoftmax incorrectly sets
self.scaled_masked_softmax_fusion to the generic_scaled_masked_softmax
function. Its forward function calls self.fused_softmax_func,
however, which means generic_scaled_masked_softmax is never called
and the standard FusedScaleMaskSoftmax path is used instead.

Also fixed a wrong import in GenericScaledMaskedSoftmax.backward.
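
For the second fix, a hedged sketch of the shape of the backward import correction; the CUDA extension module names and the backward call signature below are assumptions based on the naming convention of the surrounding classes, not a verbatim diff.

```python
import torch


class GenericScaledMaskedSoftmax(torch.autograd.Function):
    @staticmethod
    def backward(ctx, output_grads):
        # Before: the backward imported the non-generic extension (a leftover
        # from ScaledMaskedSoftmax). After: import the generic kernel's
        # extension instead. Both module names here are assumptions.
        import generic_scaled_masked_softmax_cuda  # was: scaled_masked_softmax_cuda

        softmax_results, scale_t = ctx.saved_tensors
        input_grads = generic_scaled_masked_softmax_cuda.backward(
            output_grads, softmax_results, scale_t[0]
        )
        # One gradient per forward input: (inputs, mask, scale).
        return input_grads, None, None
```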