refactor: rename txt to text, document to file, and broaden usage tracking by larryro · Pull Request #800 · tale-project/tale

larryro · 2026-03-16T09:33:21Z

Summary

Rename the txt tool to text and the document agent to file for clarity
Rename subAgentUsage to toolsUsage and track all tool calls (not just delegation tools), with backwards-compatible deprecation aliases in the API
Fix stale references, add missing MIME types, and clean up text tool type mappings

Test plan

Verify text tool handles all previously supported file types
Verify file agent delegation works end-to-end
Confirm toolsUsage is populated for all tool calls in message metadata
Confirm deprecated subAgentUsage field still returns data for existing clients
Run typecheck and lint with no errors

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added LLM model selection for document and file processing across PDF, DOCX, PPTX, and text files.
- Expanded text file support to include markdown, JSON, YAML, XML, HTML, CSS, code files, and other text-based formats.
- Enhanced tool usage metrics now include token counts, processing duration, model information, and input/output details.
Improvements
- Rebranded "Document Assistant" to "File Assistant" with broader file handling capabilities.
- Reorganized file tool categories for improved usability.

Broaden usage tracking from delegation-only sub-agents to all tool calls, adding input/output fields to the usage schema. Rename SubAgentDetailsDialog to ToolDetailsDialog and update validators, types, translations, and OpenAPI spec with backwards-compatible deprecation aliases.

…on tools

- Update chat agent instructions to reference "file agent" instead of "document agent"
- Fix tool-details-dialog display name mapping (document_assistant → file_assistant)
- Remove text/csv from text tool mimeTypes to avoid overlap with excel tool
- Replace text/* wildcard with explicit file extensions and MIME types
- Add missing MIME types for .sql, .graphql extensions
- Remove unused model parameter from Python vision extractors
- Update stale comments in generate_response.ts and create_delegation_tool.ts

coderabbitai · 2026-03-16T09:43:49Z

📝 Walkthrough

This PR adds a model parameter throughout the file parsing pipeline (crawler endpoints, file parser services, and OpenAI LLM calls), enabling callers to specify which LLM model to use for text processing. Concurrently, it refactors terminology across the platform from "SubAgent" to "Tool", renames the "txt" tool to "text" with expanded format support (markdown, JSON, HTML, etc.), and renames the "Document Agent" to "File Agent". Tool categorization is reorganized by introducing a new "Files" category while restructuring the Documents and Knowledge sections. Tool usage metadata is expanded to capture model, provider, token counts, and input/output details. These changes span crawler endpoints, file parser services, chat UI components, agent definitions, and message metadata structures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~45 minutes

Possibly related PRs

refactor(platform): rename website scan operations, add embedding workpool, and improve streaming #574 — Adds explicit model parameter through crawler vision parsing and OpenAI LLM calls (process_pages_with_llm), directly overlapping with this PR's model threading.
feat(platform): add TXT file analysis tool and improve sub-agent thread linking #271 — Modifies services/platform/convex/agent_tools/files/helpers/analyze_text.ts and related parsing tool call sites, which this PR also updates with the model parameter.
feat(chat): display sub-agent model information in message info dialog #332 — Modifies message metadata structures and chat UI to surface per-tool model information and metadata, aligned with this PR's tool usage expansion.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title concisely and accurately summarizes the three primary changes: renaming txt to text, document to file, and broadening usage tracking from sub-agents to all tool calls.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)

services/crawler/app/services/vision/openai_client.py (1)

289-294: 🧹 Nitpick | 🔵 Trivial

Consider reusing the VisionClient instance instead of creating a new AsyncOpenAI client per call.

A new AsyncOpenAI client is instantiated on every call to process_pages_with_llm, while the VisionClient class at the bottom of this file maintains a singleton pattern with _get_client(). This is likely intentional (different timeout: 180s vs 120s), but if call frequency is high, reusing a client or extracting this to a shared helper could reduce connection overhead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/crawler/app/services/vision/openai_client.py` around lines 289 -
294, process_pages_with_llm currently creates a new AsyncOpenAI client on each
call (AsyncOpenAI instantiation with timeout=180.0) which duplicates connection
overhead instead of reusing the VisionClient singleton; refactor to reuse a
shared client by either (a) adding a parameterized client getter on VisionClient
(e.g., VisionClient._get_client(timeout=180)) that returns a cached AsyncOpenAI
instance per timeout or (b) extracting a shared helper factory used by
process_pages_with_llm to obtain the client; update process_pages_with_llm to
call that helper instead of instantiating AsyncOpenAI directly and preserve the
intended resolved_model logic (resolved_model = model or
settings.get_fast_model()) and the different timeout behavior.

services/crawler/app/services/file_parser_service.py (1)

54-74: 🧹 Nitpick | 🔵 Trivial

Update docstring to document the new model parameter.

The model parameter was added to parse_pdf_with_vision but the docstring (lines 65-74) doesn't document it. The same applies to the other vision methods (parse_docx_with_vision, parse_pptx_with_vision, parse_file_with_vision).

📝 Suggested docstring update for parse_pdf_with_vision

         Args:
             file_bytes: Raw PDF bytes
             filename: Filename for logging
             user_input: Optional user instruction for AI extraction
             process_images: Whether to extract and describe embedded images
             ocr_scanned_pages: Whether to OCR pages with low text content
+            model: Optional LLM model name for text processing (defaults to fast model)
 
         Returns:
             Extraction result with full_text and metadata

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@services/crawler/app/services/file_parser_service.py` around lines 54 - 74,
The docstrings for parse_pdf_with_vision, parse_docx_with_vision,
parse_pptx_with_vision, and parse_file_with_vision are missing documentation for
the new model parameter; update each method's docstring to add a brief "model"
entry in the Args section (e.g., model: Optional[str] = None — name of the
vision/LLM model to use for extraction/analysis, defaults to the
service-configured model), keeping wording consistent with existing parameter
docs and noting default/behavior when None.

services/platform/convex/agent_tools/files/helpers/analyze_text.ts (1)

1-6: 🧹 Nitpick | 🔵 Trivial

Update outdated documentation comment.

The comment on line 2 mentions "using the fast model," but this PR changes the function to accept an explicit model parameter, making the model configurable by the caller. Consider updating the documentation to reflect this.

📝 Suggested documentation update

 /**
- * Helper for analyzing text files using the fast model.
+ * Helper for analyzing text files using a specified LLM model.
  * Handles encoding detection, chunking for large files, and LLM analysis.
  * Uses ctx.storage.get() for direct Convex storage access (like analyze_image.ts).
  * Uses Agent framework with saveMessages: 'none' to avoid creating visible thread messages.
  */

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/agent_tools/files/helpers/analyze_text.ts` around
lines 1 - 6, Update the top-of-file documentation to remove the hardcoded "fast
model" reference and document the new configurable model parameter (e.g., model)
accepted by the analyzeText function; mention that the caller supplies the model
and that the helper will use that model for LLM analysis, and keep existing
notes about encoding detection, chunking, ctx.storage.get(), and saveMessages:
'none' intact.

services/platform/convex/lib/agent_chat/internal_actions.ts (1)

574-595: ⚠️ Potential issue | 🟠 Major

Regression: attachment context is dropped before generation.

The beforeGenerateHook receives attachments (passed by buildHooksFromConfig at line 469), but the handler at line 574 does not destructure them, and line 594 returns promptContent: undefined unconditionally. This prevents attached files from being processed into prompt content, breaking file-attachment functionality.

💡 Proposed fix (restore attachment-derived prompt content)

-    const { threadId, promptMessage, contextMessagesTokens } = args;
+    const { threadId, promptMessage, attachments, contextMessagesTokens } = args;

@@
-    return {
-      promptContent: undefined,
-      contextExceedsBudget,
-    };
+    const promptContent =
+      attachments && attachments.length > 0
+        ? await processAttachments(ctx, attachments, promptMessage)
+        : undefined;
+
+    return {
+      promptContent,
+      contextExceedsBudget,
+    };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/lib/agent_chat/internal_actions.ts` around lines 574
- 595, The handler currently drops attachments by not reading args.attachments
and always returning promptContent: undefined; fix it by destructuring
attachments from args alongside threadId/promptMessage/contextMessagesTokens,
then restore the attachment-derived prompt content before return (e.g., if
attachments exist, produce promptContent from args.attachments using the same
conversion logic used elsewhere / by buildHooksFromConfig and return that value
instead of undefined), keeping the contextExceedsBudget logging via
beforeGenerateDebugLog and preserving existing variables like threadId and
contextMessagesTokens.

services/platform/convex/lib/attachments/process_attachments.ts (1)

60-65: 🧹 Nitpick | 🔵 Trivial

Make the config contract consistently require model.

ProcessAttachmentsConfig marks model optional, while the exported function immediately strengthens it to ProcessAttachmentsConfig & { model: string }. Making the interface itself required is clearer and removes a misleading public shape.

Proposed refactor

 export interface ProcessAttachmentsConfig {
   maxDocumentLength?: number;
   debugLog?: (message: string, data?: Record<string, unknown>) => void;
   toolName?: string;
-  model?: string;
+  model: string;
 }
@@
 export async function processAttachments(
   ctx: ActionCtx,
   attachments: FileAttachment[],
   userText: string | undefined,
-  config: ProcessAttachmentsConfig & { model: string },
+  config: ProcessAttachmentsConfig,
 ): Promise<ProcessedAttachments> {

Also applies to: 83-88

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/lib/attachments/process_attachments.ts` around lines
60 - 65, ProcessAttachmentsConfig currently declares model as optional but the
exported function expects a guaranteed model; change the interface
ProcessAttachmentsConfig to require model: string (remove the optional mark) and
update any related type intersections (e.g., the exported function signature
that used ProcessAttachmentsConfig & { model: string }) to just use
ProcessAttachmentsConfig. Ensure callers and any defaulting logic (referenced in
the file around the exported function and related helper types) are updated to
pass a model string or handle validation so the new required property is
satisfied.

services/platform/app/features/chat/components/message-info-dialog.tsx (1)

223-234: ⚠️ Potential issue | 🟠 Major

The list still uses the pre-rename tool formatter.

formatAgentName only knows document_assistant, so a file_assistant row will render as the raw identifier here while ToolDetailsDialog now titles it as File. Reuse one formatter in both places.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/features/chat/components/message-info-dialog.tsx`
around lines 223 - 234, The UI is using the old formatter formatAgentName in
message-info-dialog.tsx which doesn't handle the renamed tool ids (e.g.,
file_assistant), so create or expose a single shared formatter (e.g.,
formatToolName or export the formatter used by ToolDetailsDialog) and replace
formatAgentName usages with that shared function; update message-info-dialog.tsx
(the render mapping that calls formatAgentName) to import and call the shared
formatter so both the list and ToolDetailsDialog produce identical
human-friendly titles for tools like document_assistant and file_assistant.
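
The shared formatter suggested above could look roughly like the following. This is a minimal sketch, not the project's code: the name formatToolName comes from the review prompt, but the lookup table, the decision to keep the legacy document_assistant key, and the title-case fallback are all assumptions made for illustration.

```typescript
// Hypothetical display-name table. Only "File" for file_assistant is taken
// from the review text; the legacy entry is an assumption so that metadata
// stored before the rename still renders a friendly title.
const TOOL_DISPLAY_NAMES: Record<string, string> = {
  file_assistant: 'File',
  document_assistant: 'File',
};

// Single formatter intended to be imported by both the usage list and the
// details dialog, so the two views cannot drift apart again.
function formatToolName(toolName: string): string {
  const known = TOOL_DISPLAY_NAMES[toolName];
  if (known !== undefined) return known;
  // Assumed fallback: turn "some_tool_name" into "Some Tool Name".
  return toolName
    .split('_')
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join(' ');
}
```

Keeping the table and the fallback in one module means a future rename only needs one new entry rather than a change in every component that renders a tool name.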

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 89d75a5e-f413-4551-b2bc-4f331953b88a

📥 Commits

Reviewing files that changed from the base of the PR and between fe84a99 and abd40d7.

⛔ Files ignored due to path filters (1)

services/platform/convex/_generated/api.d.ts is excluded by !**/_generated/**

📒 Files selected for processing (36)

services/crawler/app/routers/docx.py
services/crawler/app/routers/pdf.py
services/crawler/app/routers/pptx.py
services/crawler/app/services/file_parser_service.py
services/crawler/app/services/vision/openai_client.py
services/platform/app/features/chat/components/message-info-dialog.tsx
services/platform/app/features/chat/components/tool-details-dialog.tsx
services/platform/app/features/chat/hooks/queries.ts
services/platform/app/features/custom-agents/components/tool-selector.tsx
services/platform/convex/agent_tools/delegation/create_delegation_tool.ts
services/platform/convex/agent_tools/files/docx_tool.ts
services/platform/convex/agent_tools/files/helpers/analyze_text.ts
services/platform/convex/agent_tools/files/helpers/get_agent_model.ts
services/platform/convex/agent_tools/files/helpers/parse_file.ts
services/platform/convex/agent_tools/files/internal_actions.ts
services/platform/convex/agent_tools/files/pdf_tool.ts
services/platform/convex/agent_tools/files/pptx_tool.ts
services/platform/convex/agent_tools/files/text_tool.ts
services/platform/convex/agent_tools/tool_names.ts
services/platform/convex/agent_tools/tool_registry.ts
services/platform/convex/agents/chat/agent.ts
services/platform/convex/agents/file/agent.ts
services/platform/convex/custom_agents/system_defaults.ts
services/platform/convex/lib/agent_chat/internal_actions.ts
services/platform/convex/lib/agent_completion/on_agent_complete.ts
services/platform/convex/lib/agent_response/generate_response.ts
services/platform/convex/lib/agent_response/types.ts
services/platform/convex/lib/agent_response/validators.ts
services/platform/convex/lib/attachments/process_attachments.ts
services/platform/convex/message_metadata/internal_mutations.ts
services/platform/convex/streaming/schema.ts
services/platform/convex/streaming/validators.ts
services/platform/lib/shared/__tests__/file-types-tool-mapping.test.ts
services/platform/lib/shared/file-types.ts
services/platform/messages/en.json
services/platform/public/openapi.json

coderabbitai · 2026-03-16T09:43:53Z

+      <ToolDetailsDialog
+        isOpen={selectedTool !== null}
+        onOpenChange={(open) => !open && setSelectedTool(null)}
+        usage={selectedTool}


⚠️ Potential issue | 🟡 Minor

Reset the nested tool dialog when the parent closes.

selectedTool outlives isOpen, so the child dialog can survive the parent close or reopen immediately on the next open. Clear the selection from the parent close path instead of only when the child closes itself.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/app/features/chat/components/message-info-dialog.tsx` around lines 289 - 292, The nested ToolDetailsDialog can remain open because selectedTool outlives the parent dialog; update the parent dialog's close path to clear selection so the child cannot survive a parent close — specifically, ensure the parent's onOpenChange handler (the prop controlling isOpen) calls setSelectedTool(null) when open becomes false (in addition to or instead of only relying on ToolDetailsDialog's onOpenChange), so that ToolDetailsDialog receives isOpen={selectedTool !== null} and will always reset when the parent closes.
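
One way to express the parent close path described above is a small handler factory. This is a framework-agnostic sketch under stated assumptions: ToolUsage is a trimmed stand-in type, createParentOpenChangeHandler is a hypothetical name, and the two setters stand in for the component's React state setters.

```typescript
// Hypothetical stand-in for the usage item shown in the nested dialog.
type ToolUsage = { toolName: string };

// Builds the parent dialog's onOpenChange handler. The setters are injected
// so the sketch does not depend on React; in the component they would be the
// state setters for the parent's open flag and for selectedTool.
function createParentOpenChangeHandler(
  setOpen: (open: boolean) => void,
  setSelectedTool: (tool: ToolUsage | null) => void,
): (open: boolean) => void {
  return (open) => {
    if (!open) {
      // Clear the selection so the nested dialog cannot outlive the parent
      // or reappear the next time the parent opens.
      setSelectedTool(null);
    }
    setOpen(open);
  };
}
```

With this in place, the child's own onOpenChange can stay as it is; the parent handler simply guarantees that selectedTool is null whenever the parent is closed.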

coderabbitai · 2026-03-16T09:43:53Z

+    systemAgentSlug: 'file',
+    name: 'file-assistant',
+    displayName: 'File Assistant',
    description:
-      'Reads and creates documents (PDF, Word, Excel, etc.). Can also create visual images by rendering HTML/CSS/SVG — use it for illustrations, charts, infographics, diagrams, and web page screenshots.',
-    systemInstructions: DOCUMENT_AGENT_INSTRUCTIONS,
-    toolNames: ['pdf', 'image', 'docx', 'pptx', 'txt', 'excel'],
+      'Reads and creates files (PDF, Word, Excel, etc.). Can also create visual images by rendering HTML/CSS/SVG — use it for illustrations, charts, infographics, diagrams, and web page screenshots.',
+    systemInstructions: FILE_AGENT_INSTRUCTIONS,
+    toolNames: ['pdf', 'image', 'docx', 'pptx', 'text', 'excel'],


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if FILE_AGENT_INSTRUCTIONS mentions .csv for text tool
rg -n '\.csv' services/platform/convex/agents/file/agent.ts

Repository: tale-project/tale

Length of output: 179

🏁 Script executed:

find . -name "file-types.ts" -type f

Repository: tale-project/tale

Length of output: 105

🏁 Script executed:

rg -n "\.csv|TOOL_FILE_MAP" services/platform/convex -A 3 -B 3 | head -100

Repository: tale-project/tale

Length of output: 3072

🏁 Script executed:

cat services/platform/lib/shared/file-types.ts

Repository: tale-project/tale

Length of output: 13439

Update agent instructions and tool descriptions to remove .csv from text tool references.

The .csv extension was correctly removed from the text tool mapping in TOOL_FILE_MAP (now exclusively mapped to excel tool), but the agent instructions still mention .csv for the text tool:

FILE_AGENT_INSTRUCTIONS in services/platform/convex/agents/file/agent.ts:52

textTool description in services/platform/convex/agent_tools/files/text_tool.ts:4, :86

Error message in services/platform/convex/agent_tools/files/helpers/analyze_text.ts:338

Remove .csv from all these text tool references to prevent the AI agent from incorrectly attempting CSV parsing with the text tool.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/custom_agents/system_defaults.ts` around lines 85 - 91, Update the agent instructions and tool descriptions to remove any reference to the ".csv" extension for the text tool: edit FILE_AGENT_INSTRUCTIONS in services/platform/convex/agents/file/agent.ts to stop instructing the agent to treat .csv via the text tool, update the textTool description in services/platform/convex/agent_tools/files/text_tool.ts (both the header and the detailed description) to list only plain text-like extensions (e.g., .txt, .md) or generic "text" instead of ".csv", and change the error message in services/platform/convex/agent_tools/files/helpers/analyze_text.ts to not mention ".csv" when suggesting the text tool; ensure CSV handling remains mapped to the excel tool (leave TOOL_FILE_MAP as-is) and adjust phrasing so the agent will route CSVs to the excel tool rather than the text tool.
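
The routing rule behind this comment can be illustrated with a lookup from file extension to tool. The sketch below is hypothetical: EXTENSIONS_BY_TOOL is a trimmed stand-in, not the real TOOL_FILE_MAP (whose shape is not shown in this review), and the extension lists are examples only. The point it illustrates is that .csv resolves to the excel tool alone once it is removed from the text tool's list.

```typescript
// Hypothetical, trimmed-down mapping for illustration. The actual
// TOOL_FILE_MAP in services/platform/lib/shared/file-types.ts is larger and
// may be structured differently.
const EXTENSIONS_BY_TOOL: Record<string, readonly string[]> = {
  text: ['.txt', '.md', '.json', '.yaml', '.sql', '.graphql'],
  excel: ['.xlsx', '.csv'],
};

// Returns every tool that claims the file's extension. An extension claimed
// by more than one tool would show up here as an array of length two.
function toolsForFile(filename: string): string[] {
  const dot = filename.lastIndexOf('.');
  if (dot === -1) return [];
  const ext = filename.slice(dot).toLowerCase();
  return Object.entries(EXTENSIONS_BY_TOOL)
    .filter(([, exts]) => exts.includes(ext))
    .map(([tool]) => tool);
}
```

A check like this could also back a unit test asserting that no extension is claimed by two tools, which is the overlap the PR set out to remove; the agent instructions and tool descriptions then only need to agree with the map.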

coderabbitai · 2026-03-16T09:43:53Z

+  toolsUsage?: Array<{
    toolName: string;
    model?: string;
    provider?: string;
    inputTokens?: number;
    outputTokens?: number;
    totalTokens?: number;
    durationMs?: number;
+    input?: string;
+    output?: string;
  }>;


⚠️ Potential issue | 🟠 Major

Preserve the deprecated subAgentUsage field in the public response type.

Removing it here makes the shared response contract backward-incompatible, even though the PR objective says older clients should keep receiving the alias during the transition. The runtime validator in services/platform/convex/lib/agent_response/validators.ts should mirror the same alias.

Proposed compatibility fix

 export interface GenerateResponseResult {
   threadId: string;
   text: string;
   usage?: {
     inputTokens?: number;
     outputTokens?: number;
     totalTokens?: number;
     reasoningTokens?: number;
     cachedInputTokens?: number;
   };
   finishReason?: string;
   durationMs: number;
   timeToFirstTokenMs?: number;
   toolCalls?: Array<{ toolName: string; status: string }>;
+  /** @deprecated Use toolsUsage instead. */
+  subAgentUsage?: Array<{
+    toolName: string;
+    model?: string;
+    provider?: string;
+    inputTokens?: number;
+    outputTokens?: number;
+    totalTokens?: number;
+    durationMs?: number;
+    input?: string;
+    output?: string;
+  }>;
   toolsUsage?: Array<{
     toolName: string;
     model?: string;
     provider?: string;
     inputTokens?: number;
     outputTokens?: number;
     totalTokens?: number;
     durationMs?: number;
     input?: string;
     output?: string;
   }>;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/lib/agent_response/types.ts` around lines 140 - 150, Reintroduce the deprecated alias subAgentUsage into the public response type so the contract stays backward-compatible: add an optional subAgentUsage?: Array<{ toolName: string; model?: string; provider?: string; inputTokens?: number; outputTokens?: number; totalTokens?: number; durationMs?: number; input?: string; output?: string; }>; alongside toolsUsage in services/platform/convex/lib/agent_response/types.ts, and update the runtime validator in services/platform/convex/lib/agent_response/validators.ts (the validator that validates toolsUsage) to accept and normalize the alias (i.e., validate the same shape for subAgentUsage as for toolsUsage and continue to support either field at runtime).
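
On the consuming side, "accept and normalize the alias" might be sketched as follows. This is an illustration only: the item shape is copied from the diff above, while UsageFields and readToolsUsage are hypothetical names, and the choice to prefer toolsUsage when both fields are present is an assumption.

```typescript
// Item shape as shown in the diff for toolsUsage.
interface ToolUsageItem {
  toolName: string;
  model?: string;
  provider?: string;
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
  durationMs?: number;
  input?: string;
  output?: string;
}

// Hypothetical slice of the response type carrying both fields.
interface UsageFields {
  toolsUsage?: ToolUsageItem[];
  /** @deprecated Use toolsUsage instead. */
  subAgentUsage?: ToolUsageItem[];
}

// Assumed normalization rule: prefer the new field, fall back to the
// deprecated alias, and return an empty list when neither is present.
function readToolsUsage(fields: UsageFields): ToolUsageItem[] {
  return fields.toolsUsage ?? fields.subAgentUsage ?? [];
}
```

Funnelling every read through one helper like this would let the alias be dropped later by changing a single function instead of each call site.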

coderabbitai · 2026-03-16T09:43:53Z

    durationMs: v.optional(v.number()),
    timeToFirstTokenMs: v.optional(v.number()),
-    subAgentUsage: v.optional(v.array(subAgentUsageItemValidator)),
+    toolsUsage: v.optional(v.array(toolUsageItemValidator)),


⚠️ Potential issue | 🟠 Major

Persist the deprecated alias while the schema still exposes it.

messageMetadataTable still defines subAgentUsage, but this mutation no longer writes it on insert or patch. Older readers of that field will stop seeing fresh data even though the schema advertises compatibility.

Proposed compatibility fix

 args: {
@@
   timeToFirstTokenMs: v.optional(v.number()),
   toolsUsage: v.optional(v.array(toolUsageItemValidator)),
   contextWindow: v.optional(v.string()),
   contextStats: v.optional(contextStatsValidator),
 },
@@
   durationMs: args.durationMs ?? existing.durationMs,
   timeToFirstTokenMs: args.timeToFirstTokenMs ?? existing.timeToFirstTokenMs,
   toolsUsage: args.toolsUsage ?? existing.toolsUsage,
+  subAgentUsage: args.toolsUsage ?? existing.subAgentUsage,
   contextWindow: contextWindow ?? existing.contextWindow,
   contextStats: args.contextStats ?? existing.contextStats,
 });
@@
   reasoning: args.reasoning,
   providerMetadata: args.providerMetadata,
   durationMs: args.durationMs,
   timeToFirstTokenMs: args.timeToFirstTokenMs,
   toolsUsage: args.toolsUsage,
+  subAgentUsage: args.toolsUsage,
   contextWindow,
   contextStats: args.contextStats,
 });

Also applies to: 61-62, 82-82

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/message_metadata/internal_mutations.ts` at line 27, The mutation stopped populating the deprecated alias field subAgentUsage even though the table schema still exposes it; update the insert and patch handlers in internal_mutations.ts that write to messageMetadataTable so they persist subAgentUsage alongside toolsUsage (map the same array/value used for toolsUsage into subAgentUsage) whenever toolsUsage is present, and do the same mapping in any validators or transformation steps referenced (see the toolsUsage: v.optional(...) validator and the mutations around lines referenced 61-62 and 82-82) so older readers continue to see fresh data.
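As a standalone illustration of the write-side mirroring described in this comment, the patch semantics can be isolated in a pure function. `UsageItem`, `StoredUsage`, and `mergeUsageForPatch` are hypothetical names, and the real mutation writes many more fields than shown here.

```typescript
// Hypothetical, reduced shapes; the real table row has more fields.
interface UsageItem {
  toolName: string;
  totalTokens?: number;
}

interface StoredUsage {
  toolsUsage?: UsageItem[];
  subAgentUsage?: UsageItem[];
}

// Mirror fresh toolsUsage into the deprecated alias; keep existing values
// when the patch does not carry new usage data.
function mergeUsageForPatch(
  args: { toolsUsage?: UsageItem[] },
  existing: StoredUsage,
): StoredUsage {
  return {
    toolsUsage: args.toolsUsage ?? existing.toolsUsage,
    subAgentUsage: args.toolsUsage ?? existing.subAgentUsage,
  };
}
```

Both fields then stay in sync whenever new usage arrives, and a patch without usage data leaves whatever was stored before untouched.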

- Rename document→file in AGENT_CONTEXT_CONFIGS key
- Update txt→text in document_find and document_retrieve tool descriptions
- Fix formatAgentName mapping: document_assistant→file_assistant
- Harden getAgentModelId to handle both modelId and model properties
- Guard aggregateChunkResults against empty chunkResults array
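The last two items in this commit can be sketched as follows. Only the function names `getAgentModelId` and `aggregateChunkResults` come from the commit message; the parameter and result shapes (`AgentLike`, `ChunkResult`) are assumptions made for illustration, and the real implementations may differ.

```typescript
// Assumed shape: some agent objects expose `modelId`, others `model`.
type AgentLike = { modelId?: unknown; model?: unknown };

// Return the first non-empty string among modelId and model.
function getAgentModelId(agent: AgentLike): string | undefined {
  if (typeof agent.modelId === "string" && agent.modelId.length > 0) {
    return agent.modelId;
  }
  if (typeof agent.model === "string" && agent.model.length > 0) {
    return agent.model;
  }
  return undefined;
}

// Assumed per-chunk result shape.
interface ChunkResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

function aggregateChunkResults(chunkResults: ChunkResult[]): ChunkResult {
  // Guard: reduce() without an initial value throws on an empty array.
  if (chunkResults.length === 0) {
    return { text: "", inputTokens: 0, outputTokens: 0 };
  }
  return chunkResults.reduce((acc, chunk) => ({
    text: acc.text + "\n" + chunk.text,
    inputTokens: acc.inputTokens + chunk.inputTokens,
    outputTokens: acc.outputTokens + chunk.outputTokens,
  }));
}
```

Returning a zeroed result for an empty input is one possible choice; throwing a descriptive error would be an equally reasonable guard, depending on what callers expect.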

Add UsageAccumulator to collect input/output tokens and duration across all OpenAI Vision API calls (OCR, image description, LLM processing) for PDF, DOCX, and PPTX extractors. Surface usage data through the crawler response and remap to camelCase in the platform parse_file helper.
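The camelCase remapping on the platform side might look like the sketch below. The snake_case keys in `CrawlerUsage` and the helper name `remapUsage` are assumptions; the commit message does not list the exact fields returned by the crawler.

```typescript
// Assumed crawler payload keys (snake_case); real field names may differ.
interface CrawlerUsage {
  input_tokens?: number;
  output_tokens?: number;
  duration_ms?: number;
  model?: string;
}

// Platform-side shape, matching the usage fields used elsewhere in this PR.
interface PlatformUsage {
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
  durationMs?: number;
  model?: string;
}

// Hypothetical helper: remap snake_case usage to camelCase and derive a total.
function remapUsage(raw: CrawlerUsage | undefined): PlatformUsage | undefined {
  if (!raw) return undefined;
  const inputTokens = raw.input_tokens;
  const outputTokens = raw.output_tokens;
  // Only report a total when at least one token count is present.
  const totalTokens =
    inputTokens !== undefined || outputTokens !== undefined
      ? (inputTokens ?? 0) + (outputTokens ?? 0)
      : undefined;
  return {
    inputTokens,
    outputTokens,
    totalTokens,
    durationMs: raw.duration_ms,
    model: raw.model,
  };
}
```

Keeping the remap in a single helper means the rest of the platform code only ever sees the camelCase shape.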


larryro added 4 commits March 16, 2026 15:16

refactor: rename txt tool to text and document agent to file

236e124

refactor: extract usage data from all tool results, not just delegati…

51bb92b

…on tools

greptile-apps Bot reviewed Mar 16, 2026

View reviewed changes

coderabbitai Bot requested changes Mar 16, 2026

View reviewed changes

larryro added 3 commits March 16, 2026 18:30

feat: log model parameter in file parsing analytics

e88e184

larryro merged commit fbdfdd9 into main Mar 16, 2026
17 checks passed

larryro deleted the refactor/rename-txt-to-text-and-document-to-file branch March 16, 2026 11:18

This was referenced Mar 24, 2026

Rename agent tool properties and switch to OpenAI-compatible provider #846

Merged

feat(platform): inline file tools and apply public URL rewriting #891

Merged

yannickmonney pushed a commit that referenced this pull request Apr 8, 2026

refactor: rename txt to text, document to file, and broaden usage tra…

643c437

…cking (#800)

coderabbitai Bot mentioned this pull request Apr 11, 2026

perf(platform,rag): optimize document comparison response time #1398

Merged

6 tasks


refactor: rename txt to text, document to file, and broaden usage tracking #800
larryro merged 7 commits into main from refactor/rename-txt-to-text-and-document-to-file
