[microTVM][RISCV] Tensorization for conv_2d_nchw_int8 with RVV extension#14836
katebern-grovety wants to merge 10 commits into apache:main
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot
# 2) Check target.
current_target = target.Target.current(allow_none=False)
has_attr = "+v" in current_target.mattr
Could you please look up if your schedules would also be supported by the sub-extensions zve32x, zve32f, zve64x, zve64f, zve64d? If this is the case, please extend this check to be aware of these mattr flags.
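If the schedules do work on those sub-extensions, a minimal sketch of the extended check could look like this (the Zve* flag names follow the RVV 1.0 spec; `has_rvv_support` is a hypothetical helper name, and `mattr` stands in for `current_target.mattr`):

```python
# Hypothetical helper: treat any RVV sub-extension as vector-capable,
# not just the full "+v" flag.
RVV_MATTR_FLAGS = ("+v", "+zve32x", "+zve32f", "+zve64x", "+zve64f", "+zve64d")

def has_rvv_support(mattr):
    """Return True if any RVV (sub-)extension flag appears in mattr."""
    return any(flag in mattr for flag in RVV_MATTR_FLAGS)

print(has_rvv_support(["+64bit", "+zve32x"]))  # True
print(has_rvv_support(["+64bit", "+c"]))       # False
```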
#ifdef __cplusplus
extern "C"
#endif
int32_t dot_uint8_int8_int32_body(uint8_t* data, int8_t* kernel, int32_t* output) {{
Do you by any chance have some benchmark results on how this performs compared to the RVV auto-vectorizer provided by RISC-V LLVM/GCC?
@@ -0,0 +1,85 @@
# Licensed to the Apache Software Foundation (ASF) under one
I tried to upstream a patch like this before, and it was rejected because the AOTTestRunner is being removed at some point in favor of the MicroTVM Project API interface. (See #12534 (comment))
I have actually had a MicroTVM Project API template for Spike for a long time now (and use it regularly). I still need to clean it up before upstreaming: https://github.com/PhilippvK/microtvm-spike-template
I added a Project API template for RISC-V with Spike, based on CRT.
Thank you for coming up with that change. Would it be possible to split it out into a separate PR with its own tests, etc.?
That would make reviewing the individual changes much easier.
Yes, sure. I'll create a new PR for the Project API template.
TARGET_CFLAGS = --target=riscv64-unknown-linux-gnu -march=rv64gcv -static
RUNNER = spike
RUNNER_OPT = --isa=rv64gcv $(shell which pk)
I would like to have support for rv32gc here as well.
RUNNER = spike
RUNNER_OPT = --isa=rv64gcv $(shell which pk)

PKG_CFLAGS = ${PKG_COMPILE_OPTS} ${TARGET_CFLAGS} -O2 \
Is there some reason for using -O2 instead of -O3 or -Os?
No, there was no reason
DMLC_CORE=$(TVM_ROOT)/3rdparty/dmlc-core
TOOLCHAIN_PATH=$(shell dirname $(shell which riscv64-unknown-linux-gnu-gcc))/..

CC = clang-16
Could both Clang-16 and RISC-V GCC be supported here?
has_attr = "+v" in current_target.mattr
is_arch_support = "v" in current_target.arch[2:]
if not is_arch_support and "march" in current_target.attrs:
    is_arch_support = "v" in current_target.attrs["march"]
I was not aware that -march= is actually exposed on the TVM target. This is great, because it would also work with non-LLVM TVM targets.
s[kernel_vec].reorder(oc_chunk, oh, ic_chunk, ow, ic_block, oc_block)
oc_bn = cfg["tile_oc"].size[-1]
if oc_bn > 1:
    s[kernel_vec].vectorize(oc_block)
You are adding LLVM-exclusive schedule primitives but only testing them on TVM's C backend (with tir.disable_vectorize=1). We eventually also need to test whether your vector intrinsics interfere with the auto-vectorization performed by LLVM. Testing this should be possible without compiling and running for Spike, since errors would already show up during tvm.build().
(force-pushed from b6db5aa to f7fb425)
target_include_directories(tvm_model PRIVATE ${CMAKE_SOURCE_DIR}/include crt_config crt/include)
target_compile_options(tvm_model PRIVATE -Wno-error=unused-variable -Wno-error=missing-braces -Wno-error=unused-const-variable -Wno-unused-variable)
set_target_properties(tvm_model PROPERTIES LINKER_LANGUAGE C)
target_link_libraries(main PRIVATE tvm_model)
I think you would need to add m (aka libm) here to support Softmax layers.
spike_args = ["spike", f"--isa={march}", os.path.join(toolchain_path, target, "bin/pk")]
self._proc = subprocess.Popen(
    spike_args + [self.BUILD_TARGET],
    stdin=subprocess.PIPE,
I was wondering if you also ran into the issue with broken STDIN support in recent versions of Spike mentioned here: riscv-software-src/riscv-isa-sim#1400
Are you using a version of Spike which does not have that problem, or did you find another workaround?
On the latest version of Spike, when using instruction counting, I got the error
"An illegal instruction was executed!"
Now I'm using Spike version 1.1.0, and this error does not occur there.
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_C_COMPILER "clang")
set(CMAKE_CXX_COMPILER "clang++")
set(FLAGS "--target=${TARGET} --sysroot=${TOOLCHAIN_PATH}/sysroot --gcc-toolchain=${TOOLCHAIN_PATH} -march=${MARCH} -static")
I think we should also expose -mabi to the user.
toolchain_path = options.get("toolchain_path")
target = options.get("target")
march = options.get("march")
spike_args = ["spike", f"--isa={march}", os.path.join(toolchain_path, target, "bin/pk")]
The --isa argument of Spike will not always be the same as the -march used with clang. I would prefer having isa exposed as a project option (defaulting to the same value as march) so it can be overridden if required.
I'll add isa as a project option.
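A rough sketch of how the server could resolve such an option, assuming the option names isa and march discussed above (the helper name `build_spike_command` is hypothetical):

```python
def build_spike_command(options, build_target):
    # "isa" is the proposed new project option; when the user does not set
    # it, fall back to "march" so Spike's --isa matches clang's -march.
    march = options.get("march", "rv64gc")
    isa = options.get("isa") or march
    return ["spike", f"--isa={isa}", build_target]

print(build_spike_command({"march": "rv64gcv"}, "model.elf"))
# ['spike', '--isa=rv64gcv', 'model.elf']
```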
default="rv64gc",
help="Sets the value of target architecture.",
),
],
To compare how this performs at different vector lengths, we should make it configurable, either by exposing vlen and elen as project options or by adding a generic spike_extra_args option to pass arbitrary user-defined arguments on the command line. See here: https://github.com/PhilippvK/microtvm-spike-template/blob/main/template_project/microtvm_api_server.py#L136-L149
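The generic variant could look roughly like this (`spike_extra_args` and `build_spike_cmdline` are hypothetical names; `--varch=vlen:N,elen:N` is one example flag accepted by some Spike versions for setting the vector length):

```python
def build_spike_cmdline(options, build_target):
    # "spike_extra_args" is a hypothetical project option that forwards
    # arbitrary user-defined flags (e.g. ["--varch=vlen:256,elen:64"])
    # to the Spike command line, between --isa and the binary to run.
    extra_args = options.get("spike_extra_args") or []
    isa = options.get("march", "rv64gc")
    return ["spike", f"--isa={isa}"] + list(extra_args) + [build_target]
```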
(force-pushed from f15d664 to fe48a1b)
Dot-product tensorization for conv_2d_nchw_int8 with the RISC-V V extension, implemented using LLVM intrinsics.
Tested with the AOT test runner on Spike.