#2615 introduced a performance regression in the Mali GPU backend. Because some loops can no longer be vectorized, the run time of resnet-18 increases from the original 130 ms to 750 ms. Rolling back to the commit just before #2615 fixes the problem and reproduces our released benchmark.
Users reported this bug on the forum:
https://discuss.tvm.ai/t/bad-performance-after-using-tvm-on-nanopc-t4/2006/3
https://discuss.tvm.ai/t/unable-to-reproduce-benchmark-results-on-rk3399-mali-t860/2340/7
@yzhliu has a temporary fix. Please follow up.
cc @icemelon9