Feat: refine tools #906

Open
suluyana wants to merge 56 commits into modelscope:main from suluyana:feat/tools

Conversation

@suluyana
Collaborator

No description provided.

suluyan and others added 30 commits February 6, 2026 15:03
…date reporter delivery flow, and improve the quality check module (54.51)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
- Add snapshot.py: isolated git repo under output_dir/.ms_agent_snapshots,
  stores message_count per commit for history truncation on rollback
- Auto-snapshot on every user turn via on_task_begin (enable_snapshots=True by default)
- Add list_snapshots()/rollback() to Agent base and LLMAgent:
  rollback restores files, truncates saved history, clears _read_cache
- Refactor filesystem_tool.py: remove replace_file_contents, rewrite
  edit_file with old_string→new_string exact replace, quote-normalization
  fallback, smart delete, trailing-whitespace strip, staleness check,
  multi-type read (images/binary), read dedup cache
- Add smoke tests (20 cases, all offline)
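The exact-replace behaviour with a quote-normalization fallback described above could look roughly like the following. This is a minimal sketch, not the code in filesystem_tool.py: the helper names `normalize_quotes` and `exact_replace` are hypothetical, and it assumes the fallback only maps curly quotes to ASCII quotes.

```python
# Hypothetical helper names; not taken from filesystem_tool.py.
_QUOTE_MAP = str.maketrans({
    '\u2018': "'", '\u2019': "'",  # curly single quotes
    '\u201c': '"', '\u201d': '"',  # curly double quotes
})


def normalize_quotes(text: str) -> str:
    """Map curly quotes to ASCII quotes (1:1, so the length is unchanged)."""
    return text.translate(_QUOTE_MAP)


def exact_replace(content: str, old_string: str, new_string: str) -> str:
    """Replace exactly one occurrence of old_string, or raise ValueError."""
    count = content.count(old_string)
    if count == 1:
        return content.replace(old_string, new_string, 1)
    if count > 1:
        raise ValueError(f'old_string matches {count} times; add more context')

    # Fallback: match after quote normalization. Because the mapping is 1:1,
    # offsets found in the normalized text are valid in the original text.
    norm_content = normalize_quotes(content)
    norm_old = normalize_quotes(old_string)
    if norm_content.count(norm_old) != 1:
        raise ValueError('old_string not found (or ambiguous) after normalization')
    start = norm_content.index(norm_old)
    return content[:start] + new_string + content[start + len(old_string):]
```

Requiring a unique match keeps the edit unambiguous; the fallback only helps when the model emits straight quotes for a file that contains typographic ones (or the reverse).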
- AgentTool now handles dynamic mode (split_to_sub_task) internally,
  replacing the standalone SplitTask class. Backward compat preserved:
  configs with tools.split_task auto-register the built-in dynamic spec.
- Fix execution_mode missing from split_to_sub_task schema (was silently
  ignored before; now exposed as enum field with sequential/parallel).
- Increase max_subtask_output_chars default from 2048 to 8192.
- Add disallowed_tools to _AgentToolSpec to prevent recursive tool calls
  in sub-agents.
- Add sub-agent transcript persistence: in-process runs write messages
  to output_dir/subagents/<agent_tag>.jsonl for debugging.
- Add TaskManager (ms_agent/utils/task_manager.py): agent-level registry
  for background tasks with notification queue. LLMAgent initializes it
  in run_loop, wires it into AgentTool instances, and drains notifications
  at the top of each while-loop iteration. Supports future BashTool
  background mode via the same interface.
- diversity.py: replace SplitTask dependency with inline _run_tasks_sequential
  helper using LLMAgent directly.
When a spec has run_in_background=true, call_tool fires off the
subprocess and returns immediately with {status: async_launched,
task_id, tool_name}. A background asyncio watcher task polls the
result queue and calls task_manager.complete/fail when the process
exits. LLMAgent drains the TaskManager notification queue at the
top of each run_loop iteration, injecting <task-notification> XML
into the conversation so the model sees the result on the next turn.

run_in_background is opt-in per agent_tools definition:
  agent_tools:
    definitions:
      - tool_name: my_agent
        config_path: my_agent.yaml
        run_in_background: true
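The launch-and-notify flow in this commit message can be illustrated with a small, self-contained sketch. Everything here is an assumption for illustration: the `TaskManager` class is a hypothetical stand-in for ms_agent/utils/task_manager.py, the `<task-notification>` attributes are invented, and the watcher simply awaits process exit instead of polling a result queue as the commit describes.

```python
import asyncio
import sys
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TaskManager:
    """Hypothetical stand-in for the real TaskManager."""
    status: Dict[str, str] = field(default_factory=dict)
    notifications: List[str] = field(default_factory=list)

    def register(self, task_id: str) -> None:
        self.status[task_id] = 'running'

    def _notify(self, task_id: str, state: str, body: str) -> None:
        self.status[task_id] = state
        self.notifications.append(
            f'<task-notification task_id="{task_id}" status="{state}">'
            f'{body}</task-notification>')

    def complete(self, task_id: str, output: str) -> None:
        self._notify(task_id, 'completed', output)

    def fail(self, task_id: str, error: str) -> None:
        self._notify(task_id, 'failed', error)

    def drain(self) -> List[str]:
        """Return and clear pending notifications (one call per loop turn)."""
        pending, self.notifications = self.notifications, []
        return pending


async def _watch(proc, task_id: str, manager: TaskManager) -> None:
    # Wait for the subprocess to exit, then record the outcome.
    out, err = await proc.communicate()
    if proc.returncode == 0:
        manager.complete(task_id, out.decode().strip())
    else:
        manager.fail(task_id, err.decode().strip())


async def launch_in_background(
        argv: List[str], tool_name: str,
        manager: TaskManager) -> Tuple[dict, asyncio.Task]:
    """Start argv as a subprocess and return immediately with a task handle."""
    task_id = uuid.uuid4().hex[:8]
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)
    manager.register(task_id)
    # Keep a reference to the watcher so it is not garbage-collected.
    watcher = asyncio.create_task(_watch(proc, task_id, manager))
    reply = {'status': 'async_launched', 'task_id': task_id,
             'tool_name': tool_name}
    return reply, watcher
```

In this sketch the agent loop would call `manager.drain()` at the top of each iteration and append any returned strings to the conversation, which mirrors the drain step the commit message describes.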
Exposes two tools to the model when tools.task_control is configured:
- list_tasks: show all background tasks with status and duration
- cancel_task: kill a running task by task_id

TaskControlTool receives the TaskManager reference via set_task_manager(),
which LLMAgent already calls for all extra_tools in run_loop. Enable with:

  tools:
    task_control: {}
suluyan and others added 24 commits April 3, 2026 11:37
Add SubAgentStreamWriter (ms_agent/utils/stream_writer.py) that appends
each new message to a JSONL file as soon as it arrives, so the parent
agent or an external observer can tail -f to watch a sub-agent run
step-by-step instead of waiting for it to finish.

Key details:
- JSONL format: header -> message* -> footer, one JSON object per line
- Deduplication via last_written_count: each chunk carries the full
  accumulated history; only newly added messages are written
- Thread-safe (threading.Lock) and flush-on-every-line for tail -f support
- Works for both inline-async and subprocess execution paths
- event_queue is now created when either _chunk_cb or the writer is active
- Opt-in via config: agent_stream_file: true (or
  tools.agent_tools.enable_stream_file: true)
- File path: {output_dir}/subagents/{call_id}.stream.jsonl
- A descriptive note is appended to the tool result so the parent LLM
  understands the file is an incremental execution trace, not tool output
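The header/message/footer layout and the `last_written_count` deduplication listed above can be sketched as follows. The class name `StreamWriterSketch` and the record field names (`type`, `call_id`, `message`, `status`) are assumptions for illustration and may differ from the real SubAgentStreamWriter.

```python
import json
import threading
from typing import Any, Dict, List


class StreamWriterSketch:
    """Hypothetical stand-in for SubAgentStreamWriter; field names assumed."""

    def __init__(self, path: str, call_id: str):
        self._lock = threading.Lock()
        self._last_written_count = 0
        self._fh = open(path, 'a', encoding='utf-8')
        self._write({'type': 'header', 'call_id': call_id})

    def _write(self, record: Dict[str, Any]) -> None:
        # One JSON object per line, flushed immediately so `tail -f` sees it.
        self._fh.write(json.dumps(record, ensure_ascii=False) + '\n')
        self._fh.flush()

    def on_chunk(self, history: List[Dict[str, Any]]) -> None:
        """Each chunk carries the full history; write only the new tail."""
        with self._lock:
            for message in history[self._last_written_count:]:
                self._write({'type': 'message', 'message': message})
            self._last_written_count = max(self._last_written_count,
                                           len(history))

    def close(self, status: str = 'completed') -> None:
        with self._lock:
            self._write({'type': 'footer', 'status': status})
            self._fh.close()
```

Tracking a count rather than comparing message contents keeps the per-chunk cost proportional to the number of new messages, at the price of assuming the history is append-only.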

Also includes AgentTool refactor: replace ThreadPoolExecutor with native
asyncio subprocess spawning, add sync_timeout_s + escape-to-background
support, TaskControlTool improvements, and related smoke tests.

Entire-Checkpoint: 37377e309a88
- Add WorkspacePolicyKernel (allow-roots from output_dir), ArtifactManager for large shell outputs, TaskManager with shell background support and asyncio process kill, and WorkspaceSearchTool (grep_files/glob_files).
- Wire TaskManager into LLMAgent (prepare_tools, cleanup, task notifications in step).
- Extend LocalCodeExecutionTool with policy checks, artifact spill, run_in_background shell, and sh -lc wrapping.
- ToolManager registers WorkspaceSearchTool by default and injects __call_id for shell_executor.
- Add tests for workspace policy.
- Document the implementation map in shell-grep-glob-workspace-policy.md.
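An allow-roots check of the kind attributed to WorkspacePolicyKernel might be sketched as below. The function name `is_within_roots` is hypothetical, and the sketch assumes the policy reduces to "the resolved path must sit inside one of the allowed roots"; the real kernel may enforce more than this.

```python
import os
from typing import Iterable


def is_within_roots(path: str, allow_roots: Iterable[str]) -> bool:
    """Return True if path resolves to a location inside one of allow_roots."""
    # realpath resolves symlinks and '..' segments before the comparison.
    target = os.path.realpath(path)
    for root in allow_roots:
        root_real = os.path.realpath(root)
        try:
            if os.path.commonpath([target, root_real]) == root_real:
                return True
        except ValueError:
            # Raised on Windows when the paths are on different drives.
            continue
    return False
```

Comparing with `os.path.commonpath` avoids the prefix pitfall of a bare `startswith`, where `/out_sibling` would be accepted for the root `/out` unless a separator is appended first.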

Made-with: Cursor
…ools

- Replace removed file_system tools (list_files, delete_file_or_dir) with workspace_search (glob/grep) and/or code_executor (shell, file_operation).
- deep_research: update prompts and callbacks for read_file offset/limit and edit_file.
- fin_research: aggregator adds python_env shell/file_operation; collector exposes shell and file_operation in sandbox; file_system keeps read/write/edit.
- code_genesis: prompts use glob_files/shell for listing; orchestrator_callback uses os.makedirs instead of removed create_directory().
- singularity: registers workspace_search.

Made-with: Cursor
Remove WorkspaceSearchTool and register grep and glob on the file_system
server alongside read/write/edit. Add read/edit/write include aliases and
optional grep_head_limit, glob_max_files, and grep_timeout_s on file_system.

Update project YAML and prompts to drop workspace_search blocks; document
the mapping in shell-grep-glob-workspace-policy.md. Add tests for include
aliases and grep/glob filtering.

Made-with: Cursor
Add Tavily HTTP client, search/extract schema, WebSearchTool integration,
optional large-result spill, researcher/searcher Tavily YAML presets, and
run_benchmark env hooks for RESEARCHER_CONFIG / BENCH paths.

Made-with: Cursor
…ack)

WebSearchTool imports fetch_single_text_with_meta; add tiered fetch helpers
and optional Playwright fallback module used by jina_reader.

Made-with: Cursor
Replace bench-specific filesystem_tool with feat/git version; accept
behavior differences vs prior tavily bench worktree.

Made-with: Cursor
AgentTool overhaul: stream files, TaskControlTool, TaskManager wiring,
SplitTask removal from default tool path. Excludes decision_chain_transparency.

Made-with: Cursor
Align edit/write with disk-backed validation (Claude Code style): remove
_check_staleness and post-write cache pops that caused redundant read_file
round-trips and noisy errors.

test: use tool_manager.extra_tools in rollback read_cache smoke (matches
LLMAgent.rollback).

Made-with: Cursor
…nfigs

- Subagent snapshot defaults and snapshot repo hook bypass
- FileSystemTool read_file path alias; grep newline guard
- Evidence write_note optional title; report commit_outline coercion and report_generator load_index
- Reporter todo_list; Tavily-only searcher yaml; exp_nosnap configs
- Searcher JSON parse resilience in callback

Made-with: Cursor
Resolve conflicts in llm_agent (omit bench-only knowledge_search init),
llm/utils (union UI-stripped keys), and tool_manager (keep LocalSearchTool
and TaskControlTool).

Made-with: Cursor
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant enhancements to the agent framework, including support for background task management, improved local search capabilities via sirchmunk, and refined file system operations. My review highlights a critical issue with the read_file tool's output format, which breaks compatibility with edit_file, as well as several opportunities to improve logging consistency, state management during rollbacks, and memory efficiency in file operations.

Comment thread ms_agent/tools/filesystem_tool.py Outdated
output_dir_real = os.path.realpath(self.output_dir)
is_in_output_dir = target_path_real.startswith(
    output_dir_real + os.sep) or target_path_real == output_dir_real
results[path] = ''.join(f'{start_lineno + i}\t{line}' for i, line in enumerate(selected))
Contributor

critical

The read_file tool now prepends line numbers to the output. However, edit_file (line 906) uses exact string matching against the raw file content. If an agent uses a snippet from read_file as the old_string for an edit, it will fail because the line numbers are not present in the actual file. It is recommended to return raw content to maintain compatibility with editing tools.

Suggested change
- results[path] = ''.join(f'{start_lineno + i}\t{line}' for i, line in enumerate(selected))
+ results[path] = ''.join(selected)
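The mismatch the reviewer describes is easy to reproduce in isolation. The snippet below is only an illustration with made-up file content; it reuses the formatting expression from the diff to show that a line copied from numbered output is not a substring of the raw file, while the same line with the prefix stripped is.

```python
# Made-up file content for illustration.
raw_lines = ['def f():\n', '    return 1\n']
start_lineno = 1

# Same formatting expression as in the diff under review.
numbered = ''.join(
    f'{start_lineno + i}\t{line}' for i, line in enumerate(raw_lines))
raw = ''.join(raw_lines)

# A snippet copied verbatim from the numbered output: '2\t    return 1'
snippet = numbered.splitlines()[1]

found_with_prefix = snippet in raw                       # exact match fails
found_without_prefix = snippet.split('\t', 1)[1] in raw  # succeeds once stripped
```

Returning raw content, as suggested, removes the need for callers to strip the prefix before building `old_string`.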

Comment on lines +1095 to +1096
if not spec.run_in_process:
    self._save_transcript(result, runtime_agent_tag)
Contributor

medium

The _save_transcript call is gated by if not spec.run_in_process. This prevents sub-agents running in isolated processes from saving their execution logs, even though the results are available in the main process. Transcripts should be saved for all execution modes to ensure consistent logging.

Suggested change
- if not spec.run_in_process:
-     self._save_transcript(result, runtime_agent_tag)
+ result = await runner()
+ self._save_transcript(result, runtime_agent_tag)

Comment on lines +1109 to +1110
if not spec.run_in_process:
    self._save_transcript(result, runtime_agent_tag)
Contributor

medium

The _save_transcript call is gated by if not spec.run_in_process. This prevents sub-agents running in isolated processes from saving their execution logs. Transcripts should be saved for all execution modes to ensure consistent logging.

Suggested change
- if not spec.run_in_process:
-     self._save_transcript(result, runtime_agent_tag)
+ result = await runner()
+ self._save_transcript(result, runtime_agent_tag)

except FileNotFoundError:
    results[path] = f'Read file <{path}> failed: FileNotFound'
except Exception as e:
    results[path] = f'Read file <{path}> failed, error: ' + str(e)
Contributor

medium

Reading the entire file into memory with fp.read_text() in the grep fallback can be inefficient for large files. Consider reading the file line-by-line to reduce memory usage.
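A line-by-line variant of the fallback might look like the sketch below. The function name `grep_lines` and its signature are hypothetical and not taken from the PR; the only point is that iterating over the file handle keeps one line in memory at a time.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple


def grep_lines(fp: Path, pattern: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for each line of fp matching pattern."""
    regex = re.compile(pattern)
    # Iterating the handle streams the file instead of loading it whole.
    with fp.open('r', encoding='utf-8', errors='replace') as handle:
        for lineno, line in enumerate(handle, start=1):
            if regex.search(line):
                yield lineno, line.rstrip('\n')
```

One trade-off to keep in mind: a streaming search cannot match patterns that span multiple lines, which a whole-file `read_text()` search can.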

Comment on lines +372 to +388
    def rollback(self, commit_hash: str) -> bool:
        """Restore output_dir to snapshot and truncate message history."""
        from ms_agent.utils.snapshot import restore_snapshot

        ok, message_count = restore_snapshot(self.output_dir, commit_hash)
        if not ok:
            return False
        # Truncate saved history to the message count at snapshot time
        _, saved_messages = read_history(self.output_dir, self.tag)
        if saved_messages and message_count < len(saved_messages):
            save_history(self.output_dir, self.tag, self.config, saved_messages[:message_count])
        # Clear read cache on FileSystemTool so stale entries don't block edits
        if self.tool_manager is not None:
            for tool in self.tool_manager.extra_tools:
                if hasattr(tool, '_read_cache'):
                    tool._read_cache.clear()
        return True
Contributor

medium

The rollback method restores the state on disk and truncates the saved history file, but it does not update the messages list currently held in memory by the active run_loop. If a tool triggers a rollback, the agent will continue its current execution loop with the old, non-truncated history until the next task begins. Consider returning the truncated history or providing a mechanism to refresh the running loop's state.
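One possible mechanism for refreshing the running loop, sketched under the assumption that run_loop and rollback can share the same list object, is to truncate the in-memory history in place. The helper name `truncate_history` is hypothetical and is not part of the PR.

```python
from typing import Dict, List

Message = Dict[str, str]


def truncate_history(messages: List[Message], message_count: int) -> List[Message]:
    """Truncate messages in place to message_count entries and return it.

    Mutating the list (rather than rebinding a new slice) means any caller
    that holds a reference to the same list sees the truncated history.
    """
    del messages[max(0, message_count):]
    return messages
```

If the loop holds a copy rather than the same list, in-place truncation is not enough and the truncated history would have to be returned to the loop explicitly, which is the other option the comment raises.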

suluyan added 2 commits April 28, 2026 15:22
This reverts commit 67668e9.