
[TFLite] TFLite FP16 Post Quantization Support #5823

@FrozenGene

Description

TensorFlow Lite now supports converting weights to 16-bit floating point values during model conversion from TensorFlow to TensorFlow Lite's flat buffer format. This results in a 2x reduction in model size.
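
For reference, this is how FP16 post-training quantization is enabled with the TFLite converter (the SavedModel path here is just a placeholder):

```python
import tensorflow as tf

# Convert a TF SavedModel to TFLite with FP16 post-training quantization.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Restrict quantized constants to float16: weights are stored as FP16,
# roughly halving the model size.
converter.target_spec.supported_types = [tf.float16]
tflite_fp16_model = converter.convert()

with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_fp16_model)
```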

However, the conversion inserts a new Dequantize op in front of compute ops (like Conv2D) to dequantize the FP16 weights back to FP32, like this:
[screenshot: TFLite graph with a Dequantize node converting the FP16 weights to FP32 before Conv2D]
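
One way to confirm the pattern is to check which tensors in the converted model were stored as FP16; a quick check, assuming the `tflite_fp16_model` buffer from the snippet above:

```python
import numpy as np
import tensorflow as tf

# Load the converted model and list the tensors kept in FP16.
interpreter = tf.lite.Interpreter(model_content=tflite_fp16_model)
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    if detail["dtype"] == np.float16:
        # Weight tensors stored as FP16; a Dequantize op feeds
        # their FP32 consumers (e.g. Conv2D).
        print(detail["index"], detail["name"], detail["shape"])
```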

TVM doesn't support this behavior yet. The main things we need to do:

  • Support the float16 type inside the TFLite parser
  • Extend dequantize to support FP16-to-FP32 conversion (see the sketch after this list)
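
A minimal sketch of how the two items could fit together in the Relay frontend, assuming the FP16 DEQUANTIZE op is lowered as a plain cast since no scale or zero point is involved (the helper name `convert_dequantize_fp16` and the shapes are illustrative, not the actual TVM implementation):

```python
from tvm import relay

def convert_dequantize_fp16(weight):
    # FP16 post-quantized weights carry no scale/zero point, so the
    # TFLite DEQUANTIZE op reduces to a plain cast to float32.
    return relay.cast(weight, dtype="float32")

# Illustrative use: an FP16 weight feeding a float32 conv2d,
# mirroring the Dequantize -> Conv2D pattern in the graph above.
data = relay.var("data", shape=(1, 3, 224, 224), dtype="float32")
w_fp16 = relay.var("weight", shape=(64, 3, 7, 7), dtype="float16")
out = relay.nn.conv2d(data, convert_dequantize_fp16(w_fp16),
                      kernel_size=(7, 7), channels=64)
```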

Related issue: #5774
