[x86] Generate AVX512 fixed-point instructions #7129

Merged: rootjalex merged 13 commits into main from rootjalex/x86-fp-cleanup on Oct 31, 2022

@rootjalex (Member) commented Oct 26, 2022

This PR adds support for generating saturating_(add | sub) and pmulh(rs) on Skylake and Cannonlake (i.e. for AVX512BW). It also increases simd_op_check test coverage of fixed-point operations on those archs.
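
For readers unfamiliar with these fixed-point operations, here is a minimal scalar sketch of what a single lane computes. It models the x86 instruction semantics (`padds`/`psubus`, `pmulhw`, `pmulhrsw`) and is not Halide's API or the code in this PR; the function names are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed 16-bit saturating add: widen, add, then clamp to the int16 range.
inline int16_t saturating_add_i16(int16_t a, int16_t b) {
    const int32_t sum = int32_t(a) + int32_t(b);
    return static_cast<int16_t>(
        std::min<int32_t>(std::max<int32_t>(sum, INT16_MIN), INT16_MAX));
}

// Unsigned 8-bit saturating subtract: results below zero clamp to 0.
inline uint8_t saturating_sub_u8(uint8_t a, uint8_t b) {
    return a > b ? static_cast<uint8_t>(a - b) : uint8_t(0);
}

// pmulhw-style multiply: keep the high 16 bits of the 32-bit signed product.
// Relies on >> being an arithmetic shift for negative values (guaranteed
// since C++20, and the behavior of mainstream compilers before that).
inline int16_t mulhi_i16(int16_t a, int16_t b) {
    const int32_t prod = int32_t(a) * int32_t(b);
    return static_cast<int16_t>(prod >> 16);
}

// pmulhrsw-style multiply: Q15 product with round-to-nearest.
// The single overflowing input pair (-32768, -32768) wraps to -32768 here,
// as the raw instruction does; a saturating variant would clamp it instead.
inline int16_t mulhrs_i16(int16_t a, int16_t b) {
    const int32_t prod = int32_t(a) * int32_t(b);
    return static_cast<int16_t>(((prod >> 14) + 1) >> 1);
}

// A few hand-checked values.
inline void check_examples() {
    assert(saturating_add_i16(30000, 10000) == 32767);  // clamped high
    assert(saturating_sub_u8(10, 20) == 0);             // clamped at zero
    assert(mulhi_i16(16384, 16384) == 4096);            // (2^28) >> 16
    assert(mulhrs_i16(16384, 16384) == 8192);           // 0.5 * 0.5 in Q15
}
```

The vector instructions apply the same computation independently to every lane; the AVX512BW forms only widen the register to 512 bits.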

I also did a bit of clean-up on the way:

I did not add abs to codegen because it doesn't appear that LLVM currently exposes non-masked versions of AVX512 abs variants.
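
To illustrate the masked versus non-masked distinction, the sketch below models a masked packed abs on eight 16-bit lanes in plain C++. It is an assumption-laden model, not LLVM's intrinsic or Halide's codegen: the function name, the lane count, and the choice to return unsigned lanes are all illustrative. The point it shows is that a masked operation with an all-ones mask behaves like the unmasked one.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 8;  // illustrative width, not a real register size

// Masked abs: lane i takes |a[i]| when bit i of mask is set, and keeps
// passthru[i] otherwise. Results are unsigned so that -32768 maps to 32768.
inline std::array<uint16_t, kLanes> masked_abs_i16(
    const std::array<uint16_t, kLanes> &passthru,
    uint8_t mask,
    const std::array<int16_t, kLanes> &a) {
    std::array<uint16_t, kLanes> out = passthru;
    for (int i = 0; i < kLanes; i++) {
        if (mask & (1u << i)) {
            const int32_t wide = a[i];  // widen so negation cannot overflow
            out[i] = static_cast<uint16_t>(wide < 0 ? -wide : wide);
        }
    }
    return out;
}
```

With `mask == 0xFF` every lane is computed and the pass-through operand is ignored, which is how a masked form can stand in for a missing unmasked one.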

Fixes #7002

@rootjalex rootjalex requested a review from abadams October 26, 2022 22:49
Comment thread src/CodeGen_X86.cpp
Comment thread src/CodeGen_X86.cpp Outdated
@steven-johnson (Contributor)

Several legit failures here

@rootjalex (Member, Author)

Can't quite figure out why the JIT doesn't like ssse3.pabs instructions. I see them used in LLVM tests (e.g. here). Gonna revert the use of those for now, but will still change the .ll to use llvm.abs.

@rootjalex (Member, Author)

Ugh, same deal with the avx2.pabs instructions (despite showing up in LLVM tests here). I will revert that change and add a comment, but I don't know why these intrinsics in particular are an issue.

@rootjalex (Member, Author)

Just updated the AVX512_Skylake pabs generation, with a fix to complete_x86_target thanks to @abadams

@rootjalex (Member, Author)

Only test failure appears unrelated

Comment thread src/CodeGen_X86.cpp Outdated
@steven-johnson (Contributor) left a comment

LGTM, tests pass on my AVX512 Linux box

@rootjalex rootjalex merged commit 5da5dfd into main Oct 31, 2022
@rootjalex rootjalex deleted the rootjalex/x86-fp-cleanup branch October 31, 2022 18:36
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* clean-up abs and saturating_pmulhrs, fix AVX512 saturating_ ops

* add test coverage for AVX512 fp ops

* generate vpabs on AVX512

* faster AVX2 lowering of saturating_pmulhrs

Development

Successfully merging this pull request may close these issues.

Saturating instructions not generated on AVX512
