[Strategy] Support for Int8 schedules - CUDA/x86 #5031
icemelon merged 8 commits into apache:master
Conversation
@vinx13 Adding you as well, because I have padded the C dim for GPU using Legalize so the DP4A schedules can be used. Otherwise, we would have to put a check in the strategy.
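A minimal sketch (not the PR's exact code) of the idea described above: during legalization, zero-pad the input-channel dim of an int8 conv2d to a multiple of 4 so the DP4A-based schedule applies. The NCHW/OIHW layouts and the `(attrs, inputs, types)` legalize signature are assumptions for illustration.

```python
from tvm import relay

def _legalize_int8_conv2d(attrs, inputs, types):
    data, kernel = inputs
    data_type = types[0]
    if data_type.dtype not in ("int8", "uint8"):
        return None  # leave non-int8 convs to the default lowering

    in_channels = data_type.shape[1].value  # C of NCHW (layout assumed)
    pad_c = (4 - in_channels % 4) % 4
    if pad_c == 0:
        return None  # already a multiple of 4, nothing to do

    # Zero-padding C on both data and kernel keeps the convolution result
    # unchanged while satisfying the DP4A schedule's divisibility requirement.
    data = relay.nn.pad(data, pad_width=((0, 0), (0, pad_c), (0, 0), (0, 0)))
    kernel = relay.nn.pad(kernel, pad_width=((0, 0), (0, pad_c), (0, 0), (0, 0)))
    new_attrs = {k: attrs[k] for k in attrs.keys()}
    return relay.nn.conv2d(data, kernel, **new_attrs)
```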
icemelon left a comment
I think this line https://github.com/apache/incubator-tvm/pull/5031/files#diff-bf1d7b23844ba1082c770babaa524806R178 should pass both the final output (outs[0].op) and the conv output to _schedule_conv2d_NCHWc_int8. Otherwise len(s[output].op.axis) == 5 would always be true, right? Correct me if I'm wrong.
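A sketch, under assumptions, of the change this comment asks for: the top-level schedule passes both the packed 5-D conv op and the final output op (outs[0].op, which may be 4-D) to _schedule_conv2d_NCHWc_int8 rather than inferring which is which from len(s[output].op.axis). The op tag and the inner schedule's signature are illustrative, not the PR's exact names.

```python
from tvm import te
from topi.util import traverse_inline  # topi was a separate package at the time

def schedule_conv2d_NCHWc_int8(cfg, outs):
    s = te.create_schedule([x.op for x in outs])

    def _callback(op):
        if "conv2d_NCHWc_int8" in op.tag:
            conv = op.output(0)
            # hand over both tensors explicitly instead of guessing by rank
            _schedule_conv2d_NCHWc_int8(cfg, s, conv, outs[0].op)

    traverse_inline(s, outs[0].op, _callback)
    return s
```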
Could you add a few tests for conv2d_nchw_int8 in topi/tests/python/test_topi_conv2d_int8.py? Otherwise, LGTM.
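A hedged sketch of the kind of test being requested: run the int8 NCHW conv on CUDA and compare against a NumPy reference. The topi function names follow the conventions of the time (conv2d_nchw_int8, schedule_conv2d_nchw_int8) but should be treated as assumptions, not the PR's exact code.

```python
import numpy as np
import tvm
from tvm import te
import topi
import topi.testing

def verify_conv2d_nchw_int8(batch, in_c, size, out_c, kernel, stride, pad):
    A = te.placeholder((batch, in_c, size, size), name="A", dtype="int8")
    W = te.placeholder((out_c, in_c, kernel, kernel), name="W", dtype="int8")

    a_np = np.random.randint(-64, 64, size=(batch, in_c, size, size)).astype("int8")
    w_np = np.random.randint(-64, 64, size=(out_c, in_c, kernel, kernel)).astype("int8")
    # NumPy reference in int32 accumulation
    c_np = topi.testing.conv2d_nchw_python(
        a_np.astype("int32"), w_np.astype("int32"), stride, pad)

    ctx = tvm.gpu(0)
    with tvm.target.cuda():
        C = topi.cuda.conv2d_nchw_int8(A, W, stride, pad, 1, "int32")
        s = topi.cuda.schedule_conv2d_nchw_int8([C])
    a = tvm.nd.array(a_np, ctx)
    w = tvm.nd.array(w_np, ctx)
    c = tvm.nd.array(np.zeros(topi.util.get_const_tuple(C.shape), dtype="int32"), ctx)
    func = tvm.build(s, [A, W, C], "cuda")
    func(a, w, c)
    tvm.testing.assert_allclose(c.asnumpy(), c_np, rtol=1e-5)
```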
I think padding channels would be helpful. It would be good to have comparison results (channel padding + int8 template vs. the direct template).
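A small sketch of how such a comparison could be timed with TVM's time_evaluator, assuming `f_padded` and `f_direct` are modules built from the two variants (building them is omitted here; both names are hypothetical).

```python
def mean_runtime(func, args, ctx, number=100):
    # Average wall time of `number` runs of a built TVM module
    evaluator = func.time_evaluator(func.entry_name, ctx, number=number)
    return evaluator(*args).mean

# e.g. mean_runtime(f_padded, (a, w, c), ctx) vs. mean_runtime(f_direct, (a, w, c), ctx)
```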
Thanks @anijain2305 @kevinthesun @vinx13. This is now merged.
Commits:
* [CUDA] Op strategy changes for Int8 schedules.
* Applying Haichen's suggestions.
* Make 4D output work for task extraction.
* Make x86 work.
* Fix lint.
* Lint fixes.
* Tests, comments, out channel a multiple of 4.
* Topi test.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-96.us-west-2.compute.internal>
The recently introduced op strategy currently has some issues with task extraction in AutoTVM. This PR fixes them for x86/CUDA.
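A minimal sketch of the AutoTVM task-extraction flow this PR fixes, assuming the standard API of the time: with the op strategy registered, extraction from a Relay module should pick up the int8 conv2d templates for the given CUDA/x86 target.

```python
from tvm import relay, autotvm

def extract_conv2d_tasks(mod, params, target):
    # Returns the list of tunable conv2d tasks found in the module
    return autotvm.task.extract_from_program(
        mod["main"], target=target, params=params,
        ops=(relay.op.get("nn.conv2d"),))
```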
@kevinthesun @icemelon9