uops.info Alder Lake-P latency for SHLX #33

tavianator · 2024-12-31T20:10:21Z

https://uops.info/html-instr/SHLX_R64_R64_R64.html#ADL-P lists SHLX as having 3-cycle latency for both operands. This is in contrast to Intel's docs and InstLatx64's measurements. So what gives?

I looked into it and found something strange. If you run the specified nanoBench command, you'll see:

# lscpu
...
  Model name:             12th Gen Intel(R) Core(TM) i7-1280P
...
# ./nanoBench.sh -f -unroll 100 -warm_up_count 10 -config configs/cfg_AlderLakeP_common.txt -cpu 0 \
    -asm "SHLX R9, R8, R10; MOVSX R8, R9D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D" \
...
Core cycles: 8.00
...

Here R10 has some arbitrary value. Let's try setting it to 1:

# ./nanoBench.sh -f -unroll 100 -warm_up_count 10 -config configs/cfg_AlderLakeP_common.txt -cpu 0 \
    -asm "SHLX R9, R8, R10; MOVSX R8, R9D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D" \
    -asm_init "MOV R10, 1"
...
Core cycles: 8.00
...
# ./nanoBench.sh -f -unroll 100 -warm_up_count 10 -config configs/cfg_AlderLakeP_common.txt -cpu 0 \
    -asm "SHLX R9, R8, R10; MOVSX R8, R9D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D; MOVSX R8, R8D" \
    -asm_init "MOV R10D, 1"
...
Core cycles: 6.00
...
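
The cycle counts above can be turned into a SHLX latency estimate with a little arithmetic. This is a rough Python sketch. It assumes each of the five MOVSX instructions in the dependency chain contributes exactly 1 cycle, which is an illustrative assumption and not something measured above. The helper name `shlx_latency` is hypothetical.

```python
def shlx_latency(chain_cycles, movsx_count=5, movsx_latency=1):
    """Back out the SHLX share of one loop-carried dependency chain.

    chain_cycles  -- "Core cycles" reported by nanoBench for one iteration
    movsx_count   -- number of MOVSX instructions in the chain (5 in the commands above)
    movsx_latency -- ASSUMED latency of each MOVSX, in cycles
    """
    return chain_cycles - movsx_count * movsx_latency


# Cycle counts copied from the three runs above.
runs = {
    "R10 arbitrary": 8.00,
    "MOV R10, 1": 8.00,
    "MOV R10D, 1": 6.00,
}

for label, cycles in runs.items():
    print(f"{label}: SHLX latency ~ {shlx_latency(cycles)} cycles")
```

If the 1-cycle MOVSX assumption does not hold on a given core, pass a different `movsx_latency`.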

So it actually matters how the count register is initialized. It seems that if R10 was last written by a 64-bit op, SHLX shows 3c latency, but if it was last written by a 32-bit op (which implicitly zeroes the top half), the latency drops to 1c. Other 32-bit writes like MOV R10D, R10D also work.
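
MOV R10, 1 and MOV R10D, 1 leave the same architectural value in R10, so the latency difference cannot come from the shift count itself. Here is a minimal Python model of x86-64 register writes to illustrate that. The helpers `write64` and `write32` are hypothetical names, and the starting value of R10 is made up for the example.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def write64(old, value):
    """Model a 64-bit register write: all 64 bits are replaced."""
    return value & MASK64


def write32(old, value):
    """Model a 32-bit register write on x86-64: the low 32 bits are
    replaced and the upper 32 bits are zeroed (zero-extension)."""
    return value & MASK32


r10 = 0xDEADBEEFCAFEF00D  # made-up "arbitrary" starting value

after_mov64 = write64(r10, 1)      # MOV R10, 1
after_mov32 = write32(r10, 1)      # MOV R10D, 1
after_self32 = write32(r10, r10)   # MOV R10D, R10D

print(hex(after_mov64), hex(after_mov32), hex(after_self32))
```

The first two results are equal, so the only difference between those two runs is which instruction form produced the value.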

I'm not sure how the nanoBench commands on uops.info are generated, so I'm reporting this here.

Relevant discussion here:


tavianator · 2025-01-04T19:10:05Z

More details here: https://tavianator.com/2025/shlxplained.html

It's related to the "small immediate add renaming" optimization introduced in Alder Lake.
