
Fix vectorization with predicates #617

Merged
inducer merged 2 commits into main from fix_vectorization_with_preds
May 10, 2022


@kaushikcfd
Collaborator

@kaushikcfd kaushikcfd commented May 7, 2022

TODO:

  • The added regression test fails with a CL kernel compilation error. Not (yet) sure how to teach loopy.codegen about such predicates.

@kaushikcfd force-pushed the fix_vectorization_with_preds branch from 76c969f to e72894e on May 8, 2022 00:07
@kaushikcfd
Collaborator Author

FWIW, the issue here seems to be that CodeGenerationState.try_vectorized is a broken interface, as it isn't in control of the predicates.

@inducer
Owner

inducer commented May 9, 2022

> try_vectorized is broken

IMO, float4-style vectorization (in CL, but also in gcc vector extensions) does not have a good way to express masking (i.e. lane-varying control flow). I'd suggest we don't try to support it and scalarize instead.
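To make the masking problem concrete, here is a minimal pure-Python sketch, not loopy's code generator. The names `VEC_WIDTH`, `vector_update_unmasked`, and `scalarized_update` are hypothetical and exist only for illustration. A single float4-style statement writes all four lanes at once, so a predicate that holds for only some lanes has nowhere to go; scalarizing gives each lane its own `if`.

```python
VEC_WIDTH = 4  # assumed vector width, as in a float4


def vector_update_unmasked(out, x):
    """Model of one float4-style statement: every lane is written."""
    out[0:VEC_WIDTH] = [2 * v for v in x[0:VEC_WIDTH]]
    return out


def scalarized_update(out, x, n):
    """Scalarized form: the predicate `lane < n` is checked per lane."""
    for lane in range(VEC_WIDTH):
        if lane < n:  # lane-varying predicate
            out[lane] = 2 * x[lane]
    return out
```

With `n = 3`, the scalarized version leaves the last lane untouched, while the unmasked vector statement overwrites it.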

@sv2518
Contributor

sv2518 commented May 9, 2022

TJ didn't support vectorisation of predicates either I think.

@kaushikcfd force-pushed the fix_vectorization_with_preds branch from e72894e to 361b6ff on May 9, 2022 19:41
@kaushikcfd
Collaborator Author

kaushikcfd commented May 9, 2022

> TJ didn't support vectorisation of predicates either I think.

Yep!

> IMO, float4-style vectorization (in CL, but also in gcc vector extensions) does not have a good way to express masking (i.e. lane-varying control flow). I'd suggest we don't try to support it and scalarize instead.

In a personal meeting, we agreed to xfail this particular regression test and hope to land a fix for #615 once #372 lands.


Workaround: Currently, in the proposed PyOP2 patch (OP2/PyOP2#654), instead of leaving just these instructions unvectorized, we turn off vectorization for all instructions of such kernels.
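A rough sketch of that kernel-level decision, assuming a hypothetical `Insn` record with a `predicates` set. This is not the PyOP2 patch itself, and `should_vectorize` and `per_insn_vectorizable` are made-up helper names:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Insn:
    """Hypothetical stand-in for a kernel instruction."""
    id: str
    predicates: frozenset = field(default_factory=frozenset)


def should_vectorize(insns):
    """Workaround: skip vectorization if any instruction has a predicate."""
    return not any(insn.predicates for insn in insns)


def per_insn_vectorizable(insns):
    """Finer-grained alternative: only predicated instructions stay scalar."""
    return {insn.id: not insn.predicates for insn in insns}
```

The first helper is coarser but sidesteps the code generator entirely; the second corresponds to leaving just the predicated instructions unvectorized.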

@kaushikcfd force-pushed the fix_vectorization_with_preds branch from 361b6ff to 8cda9fd on May 9, 2022 19:53
@kaushikcfd kaushikcfd requested a review from inducer May 9, 2022 19:54
@inducer force-pushed the fix_vectorization_with_preds branch from 8cda9fd to 0701507 on May 10, 2022 00:05
@inducer inducer enabled auto-merge (rebase) May 10, 2022 00:05
@inducer
Owner

inducer commented May 10, 2022

Thx!

@inducer inducer merged commit 696c6a9 into main May 10, 2022
@inducer inducer deleted the fix_vectorization_with_preds branch May 10, 2022 01:01