
Commit

PR#431 porting from vvenc.
Refactor AArch64 Interpolation Filter 16x16 implementation (#431)

* Move InterpolationFilter{ARM.h => _neon.cpp}

Since this header is only used in one place and would not share any code
with an eventual SVE implementation, simply move it to a .cpp file
similar to MCTF.cpp.

* Refactor simdFilter16xX_N8_neon

Using the vsrcv temporary array rather than simple local variables
caused LLVM to emit unnecessary load/store instructions in the inner
loops. Refactoring this to make the dependency between loop iterations
more explicit lets the compiler generate much tighter code.
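
To illustrate the kind of change described above, here is a simplified scalar
sketch, not the actual code from this commit. It assumes an 8-tap filter over a
sliding window of rows; `Row`, `filterWithArray`, `filterWithLocals` and the
coefficient values used to exercise it are hypothetical stand-ins (in the real
code each row would be a NEON vector register rather than an `int32_t`). The
first version keeps the window in a temporary array that is shifted on every
iteration; the second carries the same window in named locals so the
loop-carried dependency is explicit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for one vector register holding a row of samples.
using Row = int32_t;

// Version A: the sliding window lives in a temporary array that is shifted
// every iteration. A compiler may keep such an array in memory and emit
// loads/stores for the shift.
std::vector<int32_t> filterWithArray(const std::vector<Row>& src,
                                     const std::array<int32_t, 8>& coeff)
{
  std::vector<int32_t> dst;
  if (src.size() < 8) return dst;

  Row win[8];
  for (int i = 0; i < 7; i++) win[i + 1] = src[i];  // preload the first 7 rows

  for (std::size_t row = 7; row < src.size(); row++)
  {
    for (int i = 0; i < 7; i++) win[i] = win[i + 1];  // shift the window by one row
    win[7] = src[row];                                // bring in the newest row

    int32_t sum = 0;
    for (int i = 0; i < 8; i++) sum += coeff[i] * win[i];
    dst.push_back(sum);
  }
  return dst;
}

// Version B: the same window held in named locals. The values carried from
// one iteration to the next are explicit, so they can stay in registers.
std::vector<int32_t> filterWithLocals(const std::vector<Row>& src,
                                      const std::array<int32_t, 8>& coeff)
{
  std::vector<int32_t> dst;
  if (src.size() < 8) return dst;

  Row s0 = src[0], s1 = src[1], s2 = src[2], s3 = src[3];
  Row s4 = src[4], s5 = src[5], s6 = src[6];

  for (std::size_t row = 7; row < src.size(); row++)
  {
    const Row s7 = src[row];  // the only new load per iteration

    const int32_t sum = coeff[0] * s0 + coeff[1] * s1 + coeff[2] * s2 +
                        coeff[3] * s3 + coeff[4] * s4 + coeff[5] * s5 +
                        coeff[6] * s6 + coeff[7] * s7;
    dst.push_back(sum);

    // Rotate the window: each local takes the value of its successor.
    s0 = s1; s1 = s2; s2 = s3; s3 = s4; s4 = s5; s5 = s6; s6 = s7;
  }
  return dst;
}
```

Both functions compute the same output; whether a given compiler actually
generates different code for them depends on the target and optimisation
level, so the benefit should be confirmed by inspecting the disassembly.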

Running a video encoding job on a Neoverse V2 machine using the
--preset=fast setting shows a ~1.8% improvement in reported FPS.
georges-arm authored and Proudsalsa committed Oct 21, 2024
1 parent 3c9c671 commit 3d091bd
Showing 2 changed files with 287 additions and 413 deletions.