[AMDGPU] [SelectionDAG] [GlobalIsel] select with constant combine into binaryOp with zext/sext #121145
Comments
@llvm/issue-subscribers-backend-amdgpu Author: Vikash Gupta (vg0204)
Suggested solutions:
Solution-0: A target-specific combine that folds the BinOp back into the zext, with the zext treated as a select (Select Cond, 1, 0). It should run after the generic combines have finished, post DAG legalization.
Solution-1: A target-specific hook that prevents the select-with-constants combines that produce a binOp with a zext/sext operand on AMDGPU, eliminating the problem at the source. However, I am not yet sure how to implement this cleanly without affecting other targets' pipelines.
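To make Solution-0 concrete, here is a minimal sketch of the rewrite on toy data structures. The types `AddOfZext` and `SelectOfConsts` and the function `foldAddIntoSelect` are hypothetical stand-ins invented for illustration; they are not LLVM classes or APIs, and a real combine would also have to check that the resulting constants are still cheap to materialize.

```cpp
#include <cstdint>

// Hypothetical stand-in for the node "add (zext Cond), Imm". Not an LLVM type.
struct AddOfZext {
  std::int64_t imm;
};

// Hypothetical stand-in for the node "select Cond, TrueVal, FalseVal".
struct SelectOfConsts {
  std::int64_t trueVal;
  std::int64_t falseVal;
};

// Sketch of the fold: view (zext Cond) as (select Cond, 1, 0) and
// distribute the add into both arms of the select.
constexpr SelectOfConsts foldAddIntoSelect(AddOfZext n) {
  return SelectOfConsts{1 + n.imm, 0 + n.imm};
}

// Reference semantics of both forms, used to check that the fold
// preserves the computed value.
constexpr std::int64_t evaluate(AddOfZext n, bool cond) {
  return static_cast<std::int64_t>(cond) + n.imm; // bool -> 0/1 models zext
}

constexpr std::int64_t evaluate(SelectOfConsts s, bool cond) {
  return cond ? s.trueVal : s.falseVal;
}
```

Applied to `add (zext Cond), 6`, this sketch yields `select Cond, 7, 6`, which is the form the generic combine started from.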
In GlobalISel, all combines are explicitly opt-in. This doesn't need a new hook, and it is already a generic combine in the DAG. We can just add it to the default sets.
Can you say specifically what you are referring to as already being a generic combine in the DAG that can just be added to the default set? Are you referring to an already existing combine, or to the one I mentioned in Solution-1?
In both instruction selectors (SelectionDAG and GlobalISel), the generic combines turn various selects of constants into binary ops with a zext/sext operand.
On many architectures, materializing a zext/sext may be cheaper than materializing a select, which makes the above combine worthwhile.
On AMDGPU, however, both zext/sext and select (for f32 with inline constants) materialize into v_cndmask_b32_e64, so the combine increases the cost by introducing an additional binary instruction. Seen from a different perspective: because zext/sext and select canonically boil down to the same machine instruction on AMDGPU, the combine effectively undoes the folding of the binOp into the select. For example:
Select Cond, 7, 6 --> add ( zext Cond ), 6
materializes with an additional binary instruction instead of as a single select. The binOp-into-select fold is missed here because the select has been eliminated, yet (zext Cond) materializes the same as (Select Cond, 1, 0). So for AMDGPU:
add ( zext Cond ), 6 <==> add ( Select Cond, 1, 0 ), 6
after instruction selection is done. This shows that introducing the zext (via the select combine) caused the binOp fold into the select to be skipped, leaving the additional binary instruction. It is the root cause of SWDEV-505394, as it increases the code length.
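The value equivalence behind this combine can be checked with a short, self-contained sketch. The function names below are made up for illustration, and the sext variant is an assumed analogue of the zext example rather than something quoted from the issue; the code only models the arithmetic, not instruction selection or cost.

```cpp
#include <cstdint>

// Form before the generic combine: select Cond, 7, 6
constexpr std::int32_t selectForm(bool cond) { return cond ? 7 : 6; }

// Form after the generic combine: add (zext Cond), 6
constexpr std::int32_t addZextForm(bool cond) {
  return static_cast<std::int32_t>(cond) + 6; // bool -> 0/1 models zext of i1
}

// Assumed sext analogue: select Cond, 5, 6 versus add (sext Cond), 6
constexpr std::int32_t selectFormSext(bool cond) { return cond ? 5 : 6; }

constexpr std::int32_t addSextForm(bool cond) {
  return (cond ? -1 : 0) + 6; // true -> -1 models sext of i1
}

// Both forms compute the same value for either condition.
static_assert(selectForm(true) == addZextForm(true), "zext form, cond = 1");
static_assert(selectForm(false) == addZextForm(false), "zext form, cond = 0");
static_assert(selectFormSext(true) == addSextForm(true), "sext form, cond = 1");
static_assert(selectFormSext(false) == addSextForm(false), "sext form, cond = 0");
```

Since the values are identical, the only difference is cost: if the zext itself lowers to a select-like instruction, the add form needs one more instruction than the single select.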