[microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h#13363
Merged
asparkhi merged 6 commits intoapache:mainfrom Jan 9, 2023
Merged
[microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h#13363asparkhi merged 6 commits intoapache:mainfrom
asparkhi merged 6 commits intoapache:mainfrom
Conversation
Collaborator
|
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot |
Member
Author
|
The include is removed in #13242, so we can wait for that to land 😸 |
540681a to
aaa7786
Compare
Mousius
commented
Nov 25, 2022
183ee05 to
f98eb78
Compare
f98eb78 to
1aa5d09
Compare
This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS
1aa5d09 to
85cfa55
Compare
The packing definitions aren't implemented as ACLE intrinsics nor is there a simple way to convince a C compiler to generate them.
85cfa55 to
ab667a5
Compare
Member
|
@Mousius any update on this? |
Contributor
We have started looking at it. I am able to reproduce the hang locally: |
Introduce `memcpy` to explain to the compiler that we're changing the alignment of `int16_t` to `int32_t`. What this appears to actually do is encourage the compiler to use three loads rather than one double load plus a regular load. The padded array is aligned as an `int16_t`, it isn't guaranteed to behave like an `int32_t` aligned array. One of the side effects of the type punning from `int16_t*` to `int32_t*` is that we're effectively lying to the compiler that this is correctly aligned and it can use instructions which load multiple `int32_t`s at the same time - this does not work 😿 Co-authored-by: Ashutosh Parkhi <ashutosh.parkhi@arm.com>
Member
Author
fzi-peccia
pushed a commit
to fzi-peccia/tvm
that referenced
this pull request
Mar 27, 2023
…3363) * [microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS * Remove CMSIS __STATIC_FORCEINLINE macro * Replace more intrinsics with ACLE variants * Use builtins for intrinsics missing in older GCC * Re-use common_includes to propagate shared functions The packing definitions aren't implemented as ACLE intrinsics nor is there a simple way to convince a C compiler to generate them. * Properly align memory access for Introduce `memcpy` to explain to the compiler that we're changing the alignment of `int16_t` to `int32_t`. What this appears to actually do is encourage the compiler to use three loads rather than one double load plus a regular load. The padded array is aligned as an `int16_t`, it isn't guaranteed to behave like an `int32_t` aligned array. One of the side effects of the type punning from `int16_t*` to `int32_t*` is that we're effectively lying to the compiler that this is correctly aligned and it can use instructions which load multiple `int32_t`s at the same time - this does not work 😿 Co-authored-by: Ashutosh Parkhi <ashutosh.parkhi@arm.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This attempts to replace the CMSIS-NN header with a more portable alternative and avoid dependence on CMSIS