Stronger chain detection in LoopCarry pass #8016
Also fixed a bug that occurred when indices with different types are compared. BTW, as far as I know, this pass is only used in the Hexagon and Xtensa backends.
All tests are green now.
See the comment up at line 250. It's not safe to use `can_prove` on a boolean Expr after doing `substitute_in_all_lets`. To make it safe to call, you have to call `common_subexpression_elimination` on the Expr first. Note that this gets called on every pair of indices, so it has quadratic complexity in the IR size. I worry that this will stall for very large unrolled stencils. It's worth writing a test of a very large case. If it does indeed stall, we might need a better algorithm. One could, for example, hash the expressions and look for hash collisions, where by "hash" I mean substitute in some arbitrary values for the variables and constant-fold, and then only do `can_prove` on exprs that have the same hash.
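The hashing idea in the comment above can be sketched on a toy expression type. This is only an illustration under stated assumptions, not Halide code: `Node`, `fingerprint`, and `candidate_pairs` are hypothetical names, the expression type supports only constants, variables, `+`, and `*`, and evaluation uses wrapping 64-bit arithmetic. A matching fingerprint does not prove equality; it only selects the pairs that are worth handing to the expensive check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy index expression (hypothetical stand-in for a real IR Expr).
struct Node;
using ExprPtr = std::shared_ptr<const Node>;
struct Node {
    enum Kind { Const, Var, Add, Mul } kind;
    std::uint64_t value;  // used by Const
    std::string name;     // used by Var
    ExprPtr a, b;         // used by Add and Mul
};

ExprPtr constant(std::uint64_t v) { return std::make_shared<Node>(Node{Node::Const, v, "", nullptr, nullptr}); }
ExprPtr var(const std::string &n) { return std::make_shared<Node>(Node{Node::Var, 0, n, nullptr, nullptr}); }
ExprPtr add(ExprPtr x, ExprPtr y) { return std::make_shared<Node>(Node{Node::Add, 0, "", std::move(x), std::move(y)}); }
ExprPtr mul(ExprPtr x, ExprPtr y) { return std::make_shared<Node>(Node{Node::Mul, 0, "", std::move(x), std::move(y)}); }

using Env = std::unordered_map<std::string, std::uint64_t>;

// "Hash" an expression by substituting arbitrary values for its variables
// and folding it down to a single number (unsigned arithmetic wraps).
std::uint64_t fingerprint(const ExprPtr &e, const Env &env) {
    assert(e != nullptr);
    switch (e->kind) {
    case Node::Const: return e->value;
    case Node::Var:   return env.at(e->name);  // throws if the variable has no value
    case Node::Add:   return fingerprint(e->a, env) + fingerprint(e->b, env);
    case Node::Mul:   return fingerprint(e->a, env) * fingerprint(e->b, env);
    }
    return 0;
}

// Group indices by fingerprint and return only the pairs that share one.
// Only these pairs would need the expensive equality proof.
std::vector<std::pair<std::size_t, std::size_t>> candidate_pairs(
    const std::vector<ExprPtr> &indices, const Env &env) {
    std::map<std::uint64_t, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < indices.size(); i++) {
        buckets[fingerprint(indices[i], env)].push_back(i);
    }
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (const auto &entry : buckets) {
        const std::vector<std::size_t> &ids = entry.second;
        for (std::size_t i = 0; i < ids.size(); i++) {
            for (std::size_t j = i + 1; j < ids.size(); j++) {
                pairs.emplace_back(ids[i], ids[j]);
            }
        }
    }
    std::sort(pairs.begin(), pairs.end());  // deterministic order
    return pairs;
}
```

With the five indices `x + 1`, `1 + x`, `x * 2`, `x + x`, and `x + 2`, only two of the ten possible pairs share a fingerprint, so only those two would reach the prover. Bucketing is linear in the number of indices (plus the map lookups), and the quadratic work is confined to each bucket.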
Thanks a lot, this is very helpful! I changed it to apply CSE first and only then run `can_prove`. I also added a test that triggers `loop_carry` on a loop with a large number of indices, and the compilation time seems to be fine.
I don't think the test failures are related.
* Stronger chain detection in LoopCarry
* Make sure that types are the same
* Add a comment
* Run CSE before calling can_prove
* Test for loop carry
* clang-tidy
* Add missing override
* Update comments
`can_prove` is stronger than `graph_equal` because it doesn't require index expressions to be exactly the same, only to evaluate to the same value. I kept the `graph_equal` check because it's faster and should run before the more expensive check.

In one of the internal workloads, I see that with this change, what was previously split into three different chains of 4, 2, and 3 values is correctly combined into one long chain of length 9.
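To illustrate the ordering described above (cheap exact comparison first, value-based comparison second), here is a minimal sketch on a toy expression type. It is not Halide's implementation: `structurally_equal` and `provably_equal` are hypothetical stand-ins for `graph_equal` and `can_prove`, the expressions are limited to constants, variables, and `+`, and the "proof" is just a comparison of normalized linear forms.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy index expression (hypothetical): constants, variables, and addition.
struct Node;
using ExprPtr = std::shared_ptr<const Node>;
struct Node {
    enum Kind { Const, Var, Add } kind;
    std::int64_t value;  // used by Const
    std::string name;    // used by Var
    ExprPtr a, b;        // used by Add
};

ExprPtr constant(std::int64_t v) { return std::make_shared<Node>(Node{Node::Const, v, "", nullptr, nullptr}); }
ExprPtr var(const std::string &n) { return std::make_shared<Node>(Node{Node::Var, 0, n, nullptr, nullptr}); }
ExprPtr add(ExprPtr x, ExprPtr y) { return std::make_shared<Node>(Node{Node::Add, 0, "", std::move(x), std::move(y)}); }

// Cheap check: the two trees have the same shape and the same leaves.
bool structurally_equal(const ExprPtr &x, const ExprPtr &y) {
    assert(x != nullptr && y != nullptr);
    if (x == y) return true;  // same node, trivially equal
    if (x->kind != y->kind) return false;
    switch (x->kind) {
    case Node::Const: return x->value == y->value;
    case Node::Var:   return x->name == y->name;
    case Node::Add:   return structurally_equal(x->a, y->a) && structurally_equal(x->b, y->b);
    }
    return false;
}

// Normalized form: sum of coeff[name] * name, plus a constant offset.
struct Linear {
    std::map<std::string, std::int64_t> coeff;
    std::int64_t offset = 0;
};

Linear linearize(const ExprPtr &e) {
    Linear r;
    switch (e->kind) {
    case Node::Const: r.offset = e->value; break;
    case Node::Var:   r.coeff[e->name] = 1; break;
    case Node::Add: {
        r = linearize(e->a);
        Linear rhs = linearize(e->b);
        r.offset += rhs.offset;
        for (const auto &kv : rhs.coeff) r.coeff[kv.first] += kv.second;
        break;
    }
    }
    return r;
}

// More expensive check: the expressions evaluate to the same value for all
// variable assignments, decided here by comparing normalized forms.
bool provably_equal(const ExprPtr &x, const ExprPtr &y) {
    Linear lx = linearize(x), ly = linearize(y);
    return lx.offset == ly.offset && lx.coeff == ly.coeff;
}

// Try the cheap check first; fall back to the expensive one only if needed.
bool same_index(const ExprPtr &x, const ExprPtr &y) {
    return structurally_equal(x, y) || provably_equal(x, y);
}
```

For example, `(x + 1) + 1` and `x + 2` fail the structural check but pass the value-based one, so a chain that would otherwise break at that pair can continue.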