[RFC][Quantization] Designing and lowering of quantized ops #3512
shoubhik wants to merge 9 commits into apache:master
Conversation
This is the continuation of pull request apache#3367. In this PR I want to discuss the design and implementation of:
- Quantize op: fp32 -> i8/u8
- Dequantize op: i8/u8 -> fp32

I have added test cases to verify the correctness of the ops.
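For reference, a minimal scalar sketch of the affine quantize/dequantize arithmetic (an illustration only, not the PR's Relay lowering; the helper names QuantizeToInt8/DequantizeFromInt8 and the zero_point parameter are assumptions here):

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Sketch: quantize fp32 -> i8. std::round gives round-away-from-zero,
// matching the rounding behavior discussed in the RFC.
int8_t QuantizeToInt8(float x, float scale, int32_t zero_point) {
  int32_t q = static_cast<int32_t>(std::round(x / scale)) + zero_point;
  // Clamp to the quantized type's representable range.
  q = std::max<int32_t>(q, std::numeric_limits<int8_t>::min());
  q = std::min<int32_t>(q, std::numeric_limits<int8_t>::max());
  return static_cast<int8_t>(q);
}

// Sketch: dequantize i8 -> fp32, the inverse affine mapping.
float DequantizeFromInt8(int8_t q, float scale, int32_t zero_point) {
  return scale * static_cast<float>(static_cast<int32_t>(q) - zero_point);
}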
1. Correcting docs.
2. Reordering Clip and Cast in the dequantize op for stability.
Is this ready for review? Have we converged on the design in the quantization RFC?
We have made good progress on the Quantization RFC, achieving clarity and convergence on many points.
const auto scale = MakeConstantScalar(Float(32), attrs->output_scale);
const int32_t min_val = get_qmin(out_dtype);
const int32_t max_val = get_qmax(out_dtype);
auto scale_data = Cast(Round(Divide(data, scale)), Int(32));
As discussed in the RFC, this should be round-away-from-zero, i.e. std::round in C++. What about this Round here? I don't see a definition of this Round.
This is the Relay Round operator, which translates to the LLVM round intrinsic:
https://llvm.org/docs/LangRef.html#llvm-round-intrinsic
The comment at that link says it has the same behavior as the libm round function, which is defined here:
https://sourceware.org/newlib/libm.html#lround
This is round-away-from-zero.
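A small standalone check of this (illustration only): std::round rounds halfway cases away from zero, while std::nearbyint under the default FE_TONEAREST mode rounds them to even.

#include <cmath>
#include <cstdio>

int main() {
  // round-away-from-zero: halfway cases move away from zero.
  std::printf("%.1f %.1f\n", std::round(2.5), std::round(-2.5));         // 3.0 -3.0
  // round-to-nearest-even for comparison (default rounding mode).
  std::printf("%.1f %.1f\n", std::nearbyint(2.5), std::nearbyint(-2.5)); // 2.0 -2.0
  return 0;
}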
@FrozenGene and @tqchen, any other major comments for the PR?
Mainly organizational issues; please make things consistent with what was discussed in #3531.
1. Correcting the file paths as suggested in the reviews.
2. Fixing lint issues.
@liangfu I made the changes you suggested.
DataType out_dtype;

TVM_DECLARE_ATTRS(QuantizeAttrs, "relay.attrs.QuantizeAttrs") {
  TVM_ATTR_FIELD(out_dtype)
I have seen that PR #3531 accepts quantized tensor data types of 8/16 bits; are we going to align?
namespace tvm {
namespace relay {

inline bool IsInt8(const DataType& dtype) {
hmmmmmm, do we really need such utils?
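For context, this is roughly what those predicates amount to (a sketch; only the Float(32) comparison is visible in the diff, the other bodies are my guesses from the signatures):

namespace tvm {
namespace relay {

// One-line dtype predicates under discussion.
inline bool IsInt8(const DataType& dtype) { return dtype == Int(8); }
inline bool IsUint8(const DataType& dtype) { return dtype == UInt(8); }
inline bool IsFloat32(const DataType& dtype) { return dtype == Float(32); }
inline bool IsQuantizedType(const DataType& dtype) {
  return IsInt8(dtype) || IsUint8(dtype);
}

}  // namespace relay
}  // namespace tvm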
  return dtype == Float(32);
}

inline bool IsQuantizedType(const DataType& dtype) {
Yes, I am waiting for #3531 to get merged; it seems to be in flux for now. Once it is merged, I will make the changes in this one.
case QuantizeOpType::Quantize:
  return IsFloat32(in_dtype);
case QuantizeOpType::Dequantize:
  return IsQuantizedType(in_dtype);
This type check goes against your API definition.
const DataType& in_dtype) {
  switch (op_type) {
    case QuantizeOpType::Quantize:
      return IsQuantizedType(in_dtype);
This type check goes against your API definition. Also @anijain2305, do we need to align?
I think if someone defines an OpType and conversions, they are responsible for including that code in one unified PR. Otherwise, don't include the OpType definition or conversions.
I agree, let's finish #3531 and then I will update this one.
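For what it's worth, a sketch of how the input/output checks could mirror the stated API (the helper names IsValidOpInputType/IsValidOpOutputType are hypothetical; IsFloat32/IsQuantizedType as in the utils above):

// Sketch: Quantize consumes fp32 and produces a quantized type;
// Dequantize consumes a quantized type and produces fp32.
inline bool IsValidOpInputType(QuantizeOpType op_type, const DataType& in_dtype) {
  switch (op_type) {
    case QuantizeOpType::Quantize:
      return IsFloat32(in_dtype);
    case QuantizeOpType::Dequantize:
      return IsQuantizedType(in_dtype);
    default:
      return false;
  }
}

inline bool IsValidOpOutputType(QuantizeOpType op_type, const DataType& out_dtype) {
  switch (op_type) {
    case QuantizeOpType::Quantize:
      return IsQuantizedType(out_dtype);
    case QuantizeOpType::Dequantize:
      return IsFloat32(out_dtype);
    default:
      return false;
  }
}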
  }
}

inline const int32_t GetQmin(const DataType& dtype) {
The implementation seems to go against PR #3531, whose code is simpler.
} else if (IsUint32(dtype)) {
  return std::numeric_limits<uint32_t>::min();
}
LOG(FATAL) << "Type not supported\n";
Maybe include this in an else branch.
return -1;
}

inline const int32_t GetQmax(const DataType& dtype) {
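Putting the two review suggestions together, these helpers could look like this (a sketch restricted to the 8-bit types so the int32_t return is always exact; not the PR's final code):

// Sketch: qmin/qmax via std::numeric_limits, with the unsupported-type
// fatal moved into the trailing else branch as suggested above.
inline int32_t GetQmin(const DataType& dtype) {
  if (IsInt8(dtype)) {
    return std::numeric_limits<int8_t>::min();
  } else if (IsUint8(dtype)) {
    return std::numeric_limits<uint8_t>::min();
  } else {
    LOG(FATAL) << "Type not supported";
    return -1;  // unreachable
  }
}

inline int32_t GetQmax(const DataType& dtype) {
  if (IsInt8(dtype)) {
    return std::numeric_limits<int8_t>::max();
  } else if (IsUint8(dtype)) {
    return std::numeric_limits<uint8_t>::max();
  } else {
    LOG(FATAL) << "Type not supported";
    return -1;  // unreachable
  }
}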
Regarding 16/32-bit quantization, I have opened a discussion in RFC #3591.