Pull requests: ROCm/flash-attention
Forked from Dao-AILab/flash-attention
[Work in progress] Add FP8 support in fwd_prefill
#115 opened Jan 2, 2025 by brunomazzottiamd • Draft
[Do not merge] vllm layout varlen [label: WIP, work in progress]
#106 opened Dec 3, 2024 by rocking5566 • Draft
Added Benchmark for Rotary Decode Kernel + Performance Speed Up for Rotary Kernel
#102 opened Nov 22, 2024 by alexkranias-amd
GPUAI-1250 - Flash Attention v2.04 two modules layer_norm cannot be used fixed
#52 opened Apr 3, 2024 by xiaoxiangAMD
GPUAI-1250 - Flash Attention v2.04 module rotary cannot be used code fixed
#47 opened Mar 1, 2024 by xiaoxiangAMD