[QNN] Conv2D operator #3580
Conversation
Force-pushed from 4f6e6bf to 52a3c2b
While the Requantize and Legalize passes are going through final details, it will be useful to prefetch and look at the QNN conv2d lowering. Please review and let me know your comments.
Force-pushed from c35d314 to 020c123
zhenhuaw-me left a comment:
Thank you for the ping. Suggesting to have if {NHWC} elif {NCHW} else {assert} when handling the different memory layouts. And I am not sure dividing the compute into 4 terms is optimization-friendly, but that's out of this PR's scope I think.
P.S. I only looked into part of this PR and will recheck once the requantize op is merged. Feel free to ping.
| input_scale of the input quantized tensors. The zero point of the output
| quantized tensor is 0. By default, the dtype of output is int32. Please also
| refer to Requantize operator to understand how to scale back the int32
| output to (u)int8.
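To summarize the relationship the docstring above describes: the convolution accumulates in int32 with scale $s_{out} = s_{input} \cdot s_{weight}$ and zero point 0, and the Requantize operator then rescales that int32 result back to the desired (u)int8 output quantization.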
| out_dtype="int32"):
| r"""Quantized 2D convolution.
|
| This operator convolves quantized weight with quantized data. The scale of
Can it be "convolves quantized data with quantized weight"? I know they are basically the same though...
Yup, that sounds better.
| r"""Quantized 2D convolution. | ||
|
|
||
| This operator convolves quantized weight with quantized data. The scale of | ||
| the output quantized tensor is the product of the weight_scale and |
Do we need to align the terms weight and kernel, i.e. settle on one of them rather than mixing both?
| namespace tvm {
| namespace relay {
If we are moving code like this to headers, would it be better to have a dedicated PR which involves no extra functionality?
| const auto in_shape = get_shape(0);
| int batch_size, in_channels;
| // NCHW layout
Do we need to check it? Maybe simply if NCHW, else if NHWC, else assert?
Personally, I'd like to discuss the layout handling a bit (it won't block the PR). To me, I always prefer code like if condition 1: path 1; elif condition 2: path 2; else (condition 3): path 3 or assert, even if the input guarantees that condition 3 won't happen. I think having path 1 inside an if section is easier to read, and the final assert catches an unexpected code typo. Anyway, that is out of the scope of this PR :).
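A minimal sketch of the if / else-if / assert shape being suggested (attribute and helper names here, such as data_layout and GetDim, are illustrative rather than this PR's exact API):

```cpp
// Sketch only: dispatch on the data layout and fail loudly on anything else.
if (param->data_layout == "NCHW") {
  batch_size  = GetDim(in_shape, 0);  // N
  in_channels = GetDim(in_shape, 1);  // C
} else if (param->data_layout == "NHWC") {
  batch_size  = GetDim(in_shape, 0);  // N
  in_channels = GetDim(in_shape, 3);  // C
} else {
  LOG(FATAL) << "qnn.conv2d lowering supports only NCHW/NHWC data layouts";
}
```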
| const auto kernel_shape = get_shape(1);
| int out_channels, kernel_h, kernel_w;
| // OIHW layout
Similar to the input layout handling.
| Array<IndexExpr> pad_w({param->padding[1], param->padding[1]});
|
| Array<Array<IndexExpr>> pad_width;
| pad_width = {pad_n, pad_c, pad_h, pad_w};
Layout check and handling?
Force-pushed from bc5b780 to 699e34c
@u99127 @FrozenGene @jackwish @tqchen This is ready for review.
@tqchen
Pinging again in case this was missed @u99127 @FrozenGene @jackwish @tqchen
Force-pushed from 03cc5a1 to cadb1e5
Force-pushed from cadb1e5 to 8baa53e
zhenhuaw-me left a comment:
CI reports that there are merge conflicts, so still requesting changes...
It is great that we are reaching this; the conditional optimization is really beneficial. Thank you for the impressive work @anijain2305!
| }
| auto reduced_t3 = Sum(Cast(weight, Int(32)), axes_t3, false, false);
|
| // Find the newshape depending on NCHW/NHWC layout.
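For context (my paraphrase of the lowering, not a quote): reduced_t3 corresponds to the term that pairs the input zero point with the weights, roughly $z_{data} \cdot \sum_{c,r,s} W$, reduced over the input-channel and kernel axes and then reshaped so that it broadcasts over the convolution output in either the NCHW or NHWC layout.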
Force-pushed from f5cedbd to 434f40d
@jackwish Thanks for the good words :) I incorporated your comments. Can you please review again?
zhenhuaw-me left a comment:
LGTM. Glad to participate in this, thank you for the great work!
Force-pushed from 434f40d to 7773c32
zhiics left a comment:
Overall LGTM, only left a nit comment.
| return CallNode::make(op, {fill_value}, Attrs(attrs), {});
| }
|
| static inline Expr Conv2D(Expr data, Expr weight, Array<IndexExpr> strides,
It looks like this is the same as MakeConv2d, right?
If so, should we just keep one signature instead of having duplication? I am not strongly against this because it's obviously used by other cases as well.
Yeah, I kept it to follow the other use cases. I guess this repetition might be because TVM typically wants to avoid header/implementation linking problems. Will keep it as Conv2D for now.
I don't have a strong feeling about this; actually, I'm not sure why we prefer copying this (and the others) here instead of adding declarations.
I think linking should be fine. We can put TVM_DLL if needed. But anyway, we can keep it this way for now.
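For reference, a minimal sketch of the alternative being discussed (declare once in a shared header, define once in a .cc file); the header placement and the abbreviated parameter list below are illustrative, not the exact signature in the tree:

```cpp
// In a shared header (illustrative), declare the builder once; TVM_DLL marks
// it for export when the definition lives in another object file.
TVM_DLL Expr MakeConv2D(Expr data, Expr weight, Array<IndexExpr> strides
                        /*, ... remaining conv2d attributes ... */);

// In exactly one .cc file, keep the single definition that constructs the
// CallNode for the "nn.conv2d" op; other files include the header instead of
// carrying their own static inline copy.
```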
| /*!
|  * Copyright (c) 2019 by Contributors
|  * \file nn.cc
|  */
| WorkloadType GetWorkload(const Array<tvm::relay::Type>& arg_types, const QnnConv2DAttrs* param) {
|   // Get conv parameters.
|   auto get_shape = [&](const Type& type) {
we don't actually need to capture anything here, right?
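Roughly what the comment above is pointing at (a sketch; the lambda body is my guess at the intent, not the PR's exact code):

```cpp
// An empty capture list is enough when the lambda only touches its parameter.
auto get_shape = [](const Type& type) {
  const auto* tensor_type = type.as<TensorTypeNode>();
  CHECK(tensor_type != nullptr) << "Expected a tensor type";
  return tensor_type->shape;
};
```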
| // Since this is integer division (floor), we can first multiply the data by the pool_size and
| // then perform avg_pool2d. Reversing this causes inaccuracy due to floor division.
| auto scaled_hw_t2 = Multiply(casted_t2, MakeConstantScalar(Int(32), kernel_h * kernel_w));
| Array<IndexExpr> padding;
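A tiny worked example of why that ordering matters (numbers are purely illustrative): for a 2x2 window holding values that sum to 7, averaging first gives floor(7 / 4) * 4 = 4, while multiplying each element by 4 first and then average-pooling gives floor(28 / 4) = 7, i.e. exactly the sliding-window sum the lowering needs.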
Can we just use Array<IndexExpr> padding({0, 0})?
| from tvm.relay.testing import create_workload
| from tvm.contrib import graph_runtime
|
| def run_infer_type(expr):
run_infer_type could be obtained from tvm.relay.testing as well.
Force-pushed from 7773c32 to f2e7ec4
Rebasing. Empty commit. Clang-format styling.
Force-pushed from f2e7ec4 to d3a54c2
Lowering of the QNN Conv2D operation. We break the convolution into 4 terms as described in Option 1 here. Other relevant discussion is at #2351.
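For readers coming to this thread cold, the 4-term split referenced above is the standard expansion of a convolution between zero-point-offset tensors; writing it out (my paraphrase, with $D$ the quantized data, $W$ the quantized weight, and $z_d$, $z_w$ their zero points):

$$
\mathrm{conv}(D - z_d,\, W - z_w)
= \mathrm{conv}(D, W)
- z_w \sum_{c,r,s} D
- z_d \sum_{c,r,s} W
+ z_d\, z_w\, C_{in}\, k_h\, k_w
$$

Here the sums run over the input channels and the $k_h \times k_w$ kernel window; term 2 is the sliding-window sum over the data (the spatial part of which is what the avg_pool2d-times-pool-size trick quoted earlier computes), term 3 reduces to a per-output-channel constant (the Sum over the weight), and term 4 is a single scalar.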