fix(agent): keep parallel tool-result messages contiguous on OpenAI Chat (Databricks image fix) #770

Merged: tlongwell-block merged 1 commit into main from dawn/databricks-parallel-image-fix on May 28, 2026

@tlongwell-block
Collaborator

What

Fix tool-result framing for parallel image-returning tool calls on the OpenAI Chat path so requests via Databricks model serving stop being rejected.

Why

Reported by Wes in #sprout-bugs: agents using Databricks-routed Claude (Opus 4.6/4.7) fail when asked to view multiple images in a thread. The error from Databricks is Anthropic-shaped, even though the agent speaks OpenAI Chat:

llm: 400 Bad Request: tool_use ids were found without tool_result
blocks immediately after: toolu_bdrk_017tcizkcWrCFzQGQyci3DVr.
Each tool_use block must have a corresponding tool_result block
in the next message.

The toolu_bdrk_ prefix gives it away: Databricks (config.rs:119: OpenAiApi::Chat, // Databricks invocations is chat-shaped) translates OpenAI Chat back into Anthropic on the way to the model, and the translation breaks on this pattern.

Root cause

openai_body emitted images as a role:"user" message immediately after each tool result's role:"tool" message. With one image-returning tool call that's fine. With two parallel calls — exactly the case in Wes's screenshot, two stacked sprout-mcp__view_image invocations — the wire becomes:

role:"tool"  (A text)
role:"user"  (A image)     ← splits the run
role:"tool"  (B text)      ← Databricks: where's B's tool_result?
role:"user"  (B image)

Databricks' OpenAI→Anthropic translator folds consecutive role:"tool" messages into one Anthropic user message of tool_result blocks. The intervening role:"user" (image) ends the run, so the second tool result lands in a separate user message. That leaves tool_use B without a tool_result in the user turn that immediately follows the assistant message, and Anthropic rejects the request.
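The folding behaviour described above can be modelled with a small sketch. This is a toy model and not Databricks' actual translator: the `ChatMsg` and `AnthropicTurn` types and both functions are hypothetical names invented for illustration, and the folding rule is assumed from the behaviour observed in this bug.

```rust
/// Hypothetical, simplified OpenAI Chat message: only what matters for pairing.
#[derive(Debug, Clone, PartialEq)]
enum ChatMsg {
    Assistant { tool_call_ids: Vec<String> },
    Tool { tool_call_id: String },
    /// Any role:"user" message, e.g. one carrying an image.
    User,
}

/// Hypothetical, simplified Anthropic turn. A plain user turn has no ids.
#[derive(Debug, Clone, PartialEq)]
enum AnthropicTurn {
    Assistant { tool_use_ids: Vec<String> },
    User { tool_result_ids: Vec<String> },
}

/// Assumed folding rule: a run of consecutive role:"tool" messages becomes one
/// user turn of tool_result blocks; any other message ends the run.
fn fold_chat_to_anthropic(msgs: &[ChatMsg]) -> Vec<AnthropicTurn> {
    let mut turns: Vec<AnthropicTurn> = Vec::new();
    let mut in_tool_run = false;
    for msg in msgs {
        match msg {
            ChatMsg::Tool { tool_call_id } => {
                if in_tool_run {
                    // Extend the user turn opened by the first tool message of the run.
                    if let Some(AnthropicTurn::User { tool_result_ids }) = turns.last_mut() {
                        tool_result_ids.push(tool_call_id.clone());
                    }
                } else {
                    turns.push(AnthropicTurn::User {
                        tool_result_ids: vec![tool_call_id.clone()],
                    });
                    in_tool_run = true;
                }
            }
            ChatMsg::User => {
                turns.push(AnthropicTurn::User { tool_result_ids: Vec::new() });
                in_tool_run = false;
            }
            ChatMsg::Assistant { tool_call_ids } => {
                turns.push(AnthropicTurn::Assistant {
                    tool_use_ids: tool_call_ids.clone(),
                });
                in_tool_run = false;
            }
        }
    }
    turns
}

/// Returns every tool_use id that is not answered in the turn immediately
/// after its assistant turn (the condition the 400 error complains about).
fn unpaired_tool_uses(turns: &[AnthropicTurn]) -> Vec<String> {
    let mut missing = Vec::new();
    for (i, turn) in turns.iter().enumerate() {
        if let AnthropicTurn::Assistant { tool_use_ids } = turn {
            let answered: Vec<String> = match turns.get(i + 1) {
                Some(AnthropicTurn::User { tool_result_ids }) => tool_result_ids.clone(),
                _ => Vec::new(),
            };
            for id in tool_use_ids {
                if !answered.contains(id) {
                    missing.push(id.clone());
                }
            }
        }
    }
    missing
}
```

Under this model, the interleaved tool/user/tool/user sequence reports B as unpaired, while a tool/tool/user sequence reports nothing.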

Fix

In openai_body, defer image content into a pending_images accumulator while emitting role:"tool" messages. Flush it as a single trailing role:"user" message, carrying every image from the batch, before the next non-ToolResult history item (or at end of history). This mirrors the existing Anthropic body's pending/flush pattern.

OpenAI Chat semantics unchanged: each tool's text result still lands in its own role:"tool" message in order; images still ride on a role:"user" message after their text results. Only the grouping changes — one user message per run of tool results instead of one per result.
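The defer-and-flush approach can be sketched as follows. This is a minimal illustration and not the code in crates/sprout-agent/src/llm.rs: only the name `pending_images` comes from the description above, while `HistoryItem`, `WireMsg`, `flush_images`, and `build_messages` are hypothetical stand-ins for the real types and for openai_body.

```rust
/// Hypothetical history model; the real agent's types may differ.
#[derive(Debug, Clone, PartialEq)]
enum HistoryItem {
    UserText(String),
    AssistantToolCalls(Vec<String>),
    ToolResult { call_id: String, text: String, images: Vec<String> },
}

/// Hypothetical wire message, standing in for the JSON the agent sends.
#[derive(Debug, Clone, PartialEq)]
enum WireMsg {
    User { text: Option<String>, image_urls: Vec<String> },
    Assistant { tool_call_ids: Vec<String> },
    Tool { tool_call_id: String, content: String },
}

/// Emit one role:"user" message with all deferred images, if there are any.
fn flush_images(pending: &mut Vec<String>, out: &mut Vec<WireMsg>) {
    if !pending.is_empty() {
        out.push(WireMsg::User {
            text: None,
            image_urls: std::mem::take(pending),
        });
    }
}

fn build_messages(history: &[HistoryItem]) -> Vec<WireMsg> {
    let mut out = Vec::new();
    let mut pending_images: Vec<String> = Vec::new();
    for item in history {
        // Anything that is not a tool result ends the current run, so the
        // deferred images must be flushed before it is emitted.
        if !matches!(item, HistoryItem::ToolResult { .. }) {
            flush_images(&mut pending_images, &mut out);
        }
        match item {
            HistoryItem::ToolResult { call_id, text, images } => {
                // The text goes out immediately; images wait for the run to end.
                out.push(WireMsg::Tool {
                    tool_call_id: call_id.clone(),
                    content: text.clone(),
                });
                pending_images.extend(images.iter().cloned());
            }
            HistoryItem::UserText(text) => out.push(WireMsg::User {
                text: Some(text.clone()),
                image_urls: Vec::new(),
            }),
            HistoryItem::AssistantToolCalls(ids) => out.push(WireMsg::Assistant {
                tool_call_ids: ids.clone(),
            }),
        }
    }
    // End of history: flush whatever the final run of tool results deferred.
    flush_images(&mut pending_images, &mut out);
    out
}
```

Flushing at the top of the loop and once more after it covers both cases named above: a run followed by another history item, and a run that ends the history.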

Test

openai_parallel_image_tool_results_stay_contiguous constructs the two-parallel-images case and asserts both role:"tool" messages are adjacent with a single trailing role:"user" carrying both images. Confirmed failing on the prior implementation (7-message interleaved shape) before applying the fix.
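The contiguity property the test checks can be expressed as a small shape predicate over message roles. This is a sketch and not the actual test body: `has_split_tool_run` is a hypothetical helper, and it assumes every role:"tool" message should directly follow either the assistant message or another role:"tool" message.

```rust
/// Returns true if any role:"tool" message directly follows a role:"user"
/// message, i.e. a user message has split a run of tool results.
fn has_split_tool_run(roles: &[&str]) -> bool {
    roles
        .windows(2)
        .any(|pair| pair[0] == "user" && pair[1] == "tool")
}
```

A regression test in this style would extract the role of each emitted message and assert that the predicate is false for the parallel-image history.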

All 45 sprout-agent lib tests pass. Clippy and fmt clean.

Scope

  • Single file: crates/sprout-agent/src/llm.rs
  • One function changed (openai_body); other paths (Anthropic native, OpenAI Responses) already handled this correctly and are untouched.

Reported by Wes in #sprout-bugs: agents using Databricks model serving
to view images via `sprout-mcp__view_image` fail with

  llm: 400 Bad Request: tool_use ids were found without tool_result
  blocks immediately after: toolu_bdrk_... Each tool_use block must
  have a corresponding tool_result block in the next message.

Databricks routes Anthropic models through an OpenAI-Chat-shaped
frontend and translates `role:"tool"` back into Anthropic
`tool_result` blocks on the way to the model. Anthropic requires every
`tool_use` in one assistant turn to be answered by a single
immediately-following user message that carries all of the
corresponding `tool_result` blocks.

`openai_body` previously emitted, for each tool result containing an
image:

  role:"tool"  (text)
  role:"user"  (image_url)

With a single tool call this is survivable. With two or more
parallel calls where any returns an image (the exact case in Wes's
screenshot, with two stacked `view_image` invocations) the wire
becomes:

  role:"tool"  (A)
  role:"user"  (image A)
  role:"tool"  (B)   ← Databricks translator can't fold this back
                       into the same Anthropic user message
  role:"user"  (image B)

Anthropic then rejects because `tool_use B` has no `tool_result`
immediately after.

Fix: defer image-carrying user content into a `pending_images`
accumulator while emitting tool results; flush it as a single trailing
user message before the next non-`ToolResult` history item (or at end
of history). All `role:"tool"` messages for a run of adjacent tool
results stay contiguous, and the translator has a clean batch to fold.

Mirrors the existing Anthropic body's `pending`/`flush` pattern.
Native OpenAI Chat semantics are unchanged: order within the run is
preserved, every image still rides on a `role:"user"` message after
its text result.

Regression test `openai_parallel_image_tool_results_stay_contiguous`
constructs the two-parallel-images case and asserts both `role:"tool"`
messages are adjacent with a single trailing `role:"user"` containing
both images. Confirmed failing on the prior implementation.

Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
@tlongwell-block tlongwell-block requested a review from a team as a code owner May 28, 2026 16:50
@tlongwell-block tlongwell-block merged commit 61297ac into main May 28, 2026
16 checks passed
@tlongwell-block tlongwell-block deleted the dawn/databricks-parallel-image-fix branch May 28, 2026 19:25