[Cherry-Pick][Speculate Decoding][Engine] Cherry-pick #7166, #7349, #7402, #7445 to release/online/20260415 #7447

Merged
freeliuzc merged 4 commits into PaddlePaddle:release/online/20260415 from lonelygsh:cherry-pick-to-release-20260415
Apr 16, 2026

@lonelygsh lonelygsh commented Apr 16, 2026

Motivation

Cherry-pick four PRs that have been merged, or are pending merge, into the develop branch onto the release/online/20260415 branch. They cover Speculate Decoding bug fixes and the Interrupt Reasoning control-command feature.

Modifications

This PR includes the changes from the following four PRs:

#7166 - [Speculative Decoding] fix mtp stop_seqs and limit thinking bugs
Fixes incorrect step_idx semantics in the limit_thinking and set_stop_value kernels.
#7349 - [Speculate Decoding] Fix step_idx semantics in reasoning_phase_token_constraint and speculate set_value kernels
Fixes a bug in the reasoning_phase_token_constraint kernel.
#7402 - [Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler
Fixes incorrect arguments passed to reasoning_phase_token_constraint in SpeculativeSampler.
#7445 - [Interrupt reasoning] Add interrupt_requests control command support
Adds an interrupt_requests command to the control-command channel of the PD-disaggregated architecture, so that a running inference request can be interrupted from outside.

Usage or Command

No new usage; see the original PRs for details.

Accuracy Tests

No changes to model forward computation or kernel numerical accuracy are involved, so no accuracy tests are needed.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

lonelygsh and others added 4 commits April 16, 2026 21:16
…_stop_value kernels (PaddlePaddle#7166)

- speculate_limit_thinking_content_length: update current_base_step to
  step_idx+1 (step_idx now records history count before current round);
  remove incorrect step_idx decrement on accept_num truncation; mark
  step_idx param as const.
- speculate_set_stop_value_multi_seqs: fix can_stop gate to use
  step_idx_now+accept_num>=min_token_limit; fix skip check and pre_ids_idx
  formula (remove stale -accept_num offset); use <= condition so accept_idx
  maps directly to the accepted token that ends the stop sequence; fix
  accept_tokens index (remove -1).
- Update unit tests for speculate_set_stop_value_multi_seqs kernel.
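
The step_idx convention described in the commit message above can be illustrated with a small pure-Python reference model. This is only a sketch under stated assumptions, not the CUDA kernel: the function name `find_stop_accept_idx` and its list-based arguments are hypothetical, and it assumes that step_idx equals the number of tokens generated before the current round, that the gate is `step_idx + accept_num >= min_token_limit`, and that the returned index points directly at the accepted token that ends the stop sequence.

```python
from typing import List, Optional


def find_stop_accept_idx(
    pre_ids: List[int],        # tokens generated in earlier rounds (history)
    accept_tokens: List[int],  # tokens accepted in the current round
    stop_seq: List[int],       # stop sequence to look for
    min_token_limit: int,      # minimum total tokens before stopping is allowed
) -> Optional[int]:
    """Hypothetical reference model, not the real kernel.

    Returns the index into accept_tokens of the accepted token that
    completes stop_seq, or None if no stop is triggered this round.
    """
    step_idx = len(pre_ids)        # assumed: history count before this round
    accept_num = len(accept_tokens)

    # Gate: stopping is only allowed once history plus accepted tokens
    # reach the minimum length.
    if not stop_seq or step_idx + accept_num < min_token_limit:
        return None

    combined = pre_ids + accept_tokens
    for accept_idx in range(accept_num):
        end = step_idx + accept_idx + 1   # exclusive end position in combined
        start = end - len(stop_seq)
        if start < 0:
            continue                      # not enough tokens yet to match
        if combined[start:end] == stop_seq:
            return accept_idx             # maps directly to the ending token
    return None
```

For example, with history `[5, 6, 7]`, accepted tokens `[8, 9, 10]`, and stop sequence `[7, 8]`, the match spans the history boundary and ends at the first accepted token, so the model returns index 0. No `-1` or `-accept_num` correction is needed under this convention.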
…el (PaddlePaddle#7349)

Co-authored-by: guanshihui <guanshihui@baidu.com>
@paddle-bot

paddle-bot bot commented Apr 16, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Apr 16, 2026
@lonelygsh lonelygsh changed the title [cherry pick]cherry pick 7166 7349 7402 7445 [Cherry-Pick]cherry pick develop of pr 7166 7349 7402 7445 Apr 16, 2026
@lonelygsh lonelygsh changed the title [Cherry-Pick]cherry pick develop of pr 7166 7349 7402 7445 [Cherry-Pick][Speculate Decoding][Engine] Cherry-pick #7166, #7349, #7402, #7445 to release/online/20260415 Apr 16, 2026
@freeliuzc freeliuzc merged commit 0ac91bf into PaddlePaddle:release/online/20260415 Apr 16, 2026
10 of 14 checks passed
