[Cherry-Pick][TI-consistent] support quant use pow2scale(#7308)#7303
yuanlehome merged 6 commits into PaddlePaddle:release/2.5
Thanks for your contribution!
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## release/2.5 #7303 +/- ##
==============================================
Coverage ? 69.95%
==============================================
Files ? 390
Lines ? 54429
Branches ? 8583
==============================================
Hits ? 38077
Misses ? 13614
Partials ? 2738
/re-run ci_xpu
fastdeploy-bot left a comment
🤖 AI Code Review | 2026-04-11
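📋 Review Summary
PR overview: Adds the FD_FP8_QUANT_WITH_POW2SCALE environment variable so that FP8 quantization can use pow2scale mode.
Scope of changes: fastdeploy/envs.py, model_executor/layers/moe/, model_executor/layers/quantization/
Impact tag: [Quantization]
📝 PR Convention Check
The PR title lacks a valid [Tag]. [Cherry-Pick] and [TI-consistent] are not in the official tag list.
Suggested titles (can be copied directly):
[Quantization] [Cherry-Pick] support quant use pow2scale(#7308)
[RL] [Cherry-Pick] support quant use pow2scale(#7308)
Suggested description: Add a complete Modifications list (the specific number of call sites changed in each file) and Accuracy test results.
Issues
No blocking issues.
Overall assessment
The change logic is clear. The new environment variable FD_FP8_QUANT_WITH_POW2SCALE gives unified control over the pow2scale mode of FP8 quantization, which makes it easier to align training and inference. Using the or operator ensures that pow2scale is forcibly enabled whenever the environment variable is True, which matches the expected behavior.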
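As a rough illustration of the or-based override described in this review, the sketch below shows how an environment flag can force pow2scale on without being able to turn it off. The helper names `env_flag` and `resolve_use_pow2scale`, and the set of accepted truthy strings, are assumptions made for this example; they are not taken from fastdeploy/envs.py.

```python
import os


def env_flag(name: str, default: str = "0") -> bool:
    """Hypothetical helper: interpret an environment variable as a boolean."""
    return os.getenv(name, default).strip().lower() in ("1", "true", "yes")


def resolve_use_pow2scale(layer_default: bool) -> bool:
    """Combine a per-layer setting with the global environment flag.

    Because the two are joined with `or`, the environment variable can only
    switch pow2scale on; it never disables a layer that already uses it.
    """
    return layer_default or env_flag("FD_FP8_QUANT_WITH_POW2SCALE")
```

With this shape, a call site that previously passed a fixed per-layer value keeps its behavior when the variable is unset, and every call site switches to pow2scale when it is set.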
Motivation
To meet the need for training-inference consistency alignment, FD now supports selecting pow2scale mode for quantization, controlled by FD_FP8_QUANT_WITH_POW2SCALE.
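For readers unfamiliar with the term, the following is a minimal sketch of what a power-of-two scale usually means in FP8 quantization. It assumes the E4M3 format (largest finite value 448) and assumes that pow2scale rounds the scale up to the next power of two; the PR does not spell out the exact rounding rule, and the function `fp8_scale` is hypothetical rather than part of FastDeploy.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def fp8_scale(amax: float, use_pow2scale: bool) -> float:
    """Return a dequantization scale for a block whose max magnitude is `amax`.

    Assumption: pow2scale rounds the plain scale UP to a power of two, so the
    scaled values still fit in the FP8 range.
    """
    scale = max(amax, 1e-10) / FP8_E4M3_MAX  # guard against amax == 0
    if use_pow2scale:
        scale = 2.0 ** math.ceil(math.log2(scale))
    return scale


# Quantized values would then be computed as x / scale before the FP8 cast.
plain = fp8_scale(500.0, use_pow2scale=False)  # 500 / 448
pow2 = fp8_scale(500.0, use_pow2scale=True)    # rounded up to 2.0
```

A power-of-two scale changes only the exponent when values are multiplied or divided by it, which is one reason training and inference stacks may want to agree on this mode.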
Modifications
Added the FD_FP8_QUANT_WITH_POW2SCALE parameter.
Usage or Command
Accuracy Tests
Checklist
Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
Run pre-commit before commit.
For the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.