forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[X86] Use SWAR techniques for some vector i8 shifts
SSE & AVX do not include instructions for shifting i8 vectors. Instead, they must be synthesized via other instructions. If pairs of i8 vectors share a shift amount, we can use SWAR techniques to substantially reduce the amount of code generated. Say we were going to execute this shift right: x >> {0, 0, 0, 0, 4, 4, 4, 4, 0, 0, 0, 0, ...} LLVM would previously generate: vpxor %xmm1, %xmm1, %xmm1 vpunpckhbw %ymm0, %ymm1, %ymm2 vpunpckhbw %ymm1, %ymm0, %ymm3 vpsllw $4, %ymm3, %ymm3 vpblendd $204, %ymm3, %ymm2, %ymm2 vpsrlw $8, %ymm2, %ymm2 vpunpcklbw %ymm0, %ymm1, %ymm3 vpunpcklbw %ymm1, %ymm0, %ymm0 vpsllw $4, %ymm0, %ymm0 vpblendd $204, %ymm0, %ymm3, %ymm0 vpsrlw $8, %ymm0, %ymm0 vpackuswb %ymm2, %ymm0, %ymm0 Instead, we can reinterpret a pair of i8 elements as an i16 and shift use the same shift amount. The only thing we need to do is mask out any bits which crossed the boundary from the top i8 to the bottom i8. This SWAR-style technique achieves: vpsrlw $4, %ymm0, %ymm1 vpblendd $170, %ymm1, %ymm0, %ymm0 vpand .LCPI0_0(%rip), %ymm0, %ymm0 This is implemented for both left and right logical shift operations. Arithmetic shifts are less well behaved here because the shift cannot also perform the sign extension for the lower 8 bits.
- Loading branch information
Showing
4 changed files
with
240 additions
and
5 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters