[TVM][CUDA] NVIDIA GPU Int8 Support #1503
Conversation
@vinx13 since you are working on related stuff, can you take a look?
```cpp
// directly 4 8 bit int in integer.
os << "int"; return;
enable_int8_ = true;
os << "char4"; return;
```
@tqchen do we need to support other lane sizes? e.g. int4 if lanes == 16
Yes, this would actually be very helpful. @vinx13, can you elaborate on what we need to support and the value translation rules here?
The rule here would be:
lanes() == 8 => int2 (aligned to 8 bytes)
lanes() == 16 => int4 (aligned to 16 bytes)
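The mapping above can be sketched as a small helper; this is a hypothetical illustration (the function name `Int8VectorType` is not from the actual TVM codegen), showing how an int8 vector's lane count selects the CUDA built-in vector type with matching total size and alignment:

```cpp
#include <string>
#include <stdexcept>

// Hypothetical sketch of the lane-to-type rule discussed above (not the
// actual TVM codegen): an int8 vector with `lanes` elements is emitted
// as the CUDA built-in type whose total size is `lanes` bytes.
std::string Int8VectorType(int lanes) {
  switch (lanes) {
    case 1:  return "char";   // scalar 8-bit int
    case 4:  return "char4";  // 4 x int8 packed into 32 bits
    case 8:  return "int2";   // 8 x int8, 8-byte aligned
    case 16: return "int4";   // 16 x int8, 16-byte aligned
    default: throw std::invalid_argument("unsupported lanes for int8");
  }
}
```

Using the wider `int2`/`int4` types lets the compiler emit a single 64-bit or 128-bit load instruction instead of several narrow ones.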
@nishi-t can you act on @vinx13's comment? Being able to perform a full vector load will be very helpful for getting full performance. Specifically, we want to be able to load 4 words from memory at a time; unfortunately there is no char16 struct, so we have two solutions:
int2 and int4 might be an easier path for now if we only need save/load and get NVIDIA's native support
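A host-side sketch of the idea, under the assumption that we only need opaque save/load: since CUDA has no char16, sixteen int8 values can ride in one 16-byte, 16-byte-aligned `int4`-shaped value, so one vector transaction moves all of them. The `Int4` struct below mirrors the layout of CUDA's built-in `int4` for illustration; device code would use the real type.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative stand-in for CUDA's built-in int4: four 32-bit ints,
// 16 bytes total, 16-byte aligned.
struct alignas(16) Int4 { int32_t x, y, z, w; };

// Copy 16 int8 values with a single 16-byte "vector" load and store,
// instead of 16 separate byte accesses.
void CopyInt8x16(const int8_t* src, int8_t* dst) {
  Int4 v;
  std::memcpy(&v, src, sizeof(Int4));  // one 16-byte load
  std::memcpy(dst, &v, sizeof(Int4));  // one 16-byte store
}
```

On the GPU, the analogous pattern (reinterpreting an aligned int8 pointer as `int4*`) is what lets the hardware issue a single 128-bit memory instruction.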
Sounds good. Can you add a test case to cover the load, possibly via a vectorized shared memory load?
Looks good from my side.
@tqchen I addressed the comments. Please review again.
This PR adds int8 support for NVIDIA GPUs.