[XPU] handle inplace return value for XPU. #7463

Open
Jiajun-Ji wants to merge 1 commit into PaddlePaddle:develop from Jiajun-Ji:speculate_set_value_by_flags_and_idx_test

@Jiajun-Ji
Contributor

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Unit test fix.

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Copilot AI review requested due to automatic review settings April 17, 2026 08:30
@paddle-bot

paddle-bot bot commented Apr 17, 2026

Thanks for your contribution!

Contributor

Copilot AI left a comment

Pull request overview

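This PR mainly fixes how the unit test handles the return value of the custom operator speculate_set_value_by_flags_and_idx on XPU: the XPU operator updates its input tensor in place, so the unit test should not rely on the return value and then call .numpy() on it.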

Changes:

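  • The XPU branch no longer captures the return value of speculate_set_value_by_flags_and_idx and instead reads pre_ids_all directly after the in-place update.
  • The non-XPU branch keeps its original logic: it takes the Tensor returned by the operator and converts it to NumPy for the comparison check.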
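
The branching described in the two changes above can be sketched roughly as follows. This is a minimal illustration, not the test code from the PR: FakeTensor, fake_op_inplace, fake_op_returning, and run_and_fetch are hypothetical stand-ins built on NumPy, and the assumption that the in-place variant returns None is made only for the sake of the example.

```python
import numpy as np


class FakeTensor:
    """Hypothetical stand-in for a framework tensor; wraps a NumPy array."""

    def __init__(self, data):
        self._data = np.array(data)

    def numpy(self):
        # Returns the underlying buffer, so in-place writes stay visible.
        return self._data


def fake_op_inplace(pre_ids_all, new_ids, step_idx):
    """Stand-in for an XPU-style op: writes into pre_ids_all, returns None."""
    buf = pre_ids_all.numpy()
    rows = np.arange(buf.shape[0])
    buf[rows, step_idx] = new_ids
    return None


def fake_op_returning(pre_ids_all, new_ids, step_idx):
    """Stand-in for a non-XPU op: updates in place and returns the tensor."""
    fake_op_inplace(pre_ids_all, new_ids, step_idx)
    return pre_ids_all


def run_and_fetch(pre_ids_all, new_ids, step_idx, is_xpu):
    """Fetch the result as NumPy, choosing the read path per platform."""
    if is_xpu:
        # Ignore the return value; read the tensor that was updated in place.
        fake_op_inplace(pre_ids_all, new_ids, step_idx)
        return pre_ids_all.numpy()
    else:
        out = fake_op_returning(pre_ids_all, new_ids, step_idx)
        # The .numpy() call stays inside this branch, where `out` is a tensor.
        return out.numpy()
```

Keeping the .numpy() call inside each branch means the in-place path never touches a return value it does not own, which is the pattern the review comments describe.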


@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 AI Code Review | 2026-04-17 16:44 CST

📋 Review Summary

PR overview: Fixes how the speculate_set_value_by_flags_and_idx unit test on the XPU platform handles the in-place return value
Change scope: custom_ops/xpu_ops/test/
Impact tags: XPU, OP

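📝 PR Convention Check

The PR title contains the [XPU] tag ✅; the Motivation section of the description only says "unit test fix". Consider adding a more specific explanation, for example:

Suggested description (can be copied directly):

Motivation: The XPU version of the speculate_set_value_by_flags_and_idx operator modifies pre_ids_all in place, and the C++ function returns void, so the return value should not be used. Fix the unit test to read the result directly from pre_ids_all, and move the GPU branch's .numpy() call into the else block to correct the indentation logic.

Modifications: The XPU branch no longer captures the return value and instead reads pre_ids_all directly after the in-place modification; the GPU branch's out_xpu.numpy() is indented into the else block so that the XPU branch does not reference an undefined return value.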

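Issues

No blocking issues found.

Code Analysis

Verified the C++ implementation of the XPU operator (speculate_set_value_by_flags.cc):

  • The function signature is void SpeculateSetValueByFlagsAndIdx(...), confirming there is no return value
  • PD_BUILD_STATIC_OP declares the in-place semantics with .SetInplaceMap({{"pre_ids_all", "pre_ids_all_out"}})
  • The modified test correctly ignores the return value and obtains the result directly from pre_ids_all.numpy()

Also confirmed that the GPU version (exposed through pybind in cpp_extensions.cc) likewise returns void, but the GPU test branch imports the static op version (which has an in-place return) via efficientllm.ops.gpu, so keeping the out_xpu = ... assignment in the else branch is correct.

Overall Assessment

The change logic is correct and accurately handles the difference in in-place semantics of the XPU operator. The out_xpu.numpy() indentation fix also eliminates a potential variable reference problem on the XPU path.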

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@6847891). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7463   +/-   ##
==========================================
  Coverage           ?   74.14%           
==========================================
  Files              ?      398           
  Lines              ?    54977           
  Branches           ?     8613           
==========================================
  Hits               ?    40762           
  Misses             ?    11484           
  Partials           ?     2731           
Flag Coverage Δ
GPU 74.14% <ø> (?)
