refactor: rename txt to text, document to file, and broaden usage tracking #800

Merged
larryro merged 7 commits into main from refactor/rename-txt-to-text-and-document-to-file
Mar 16, 2026
Conversation

@larryro commented Mar 16, 2026 (Collaborator)

Summary

  • Rename the txt tool to text and the document agent to file for clarity
  • Rename subAgentUsage to toolsUsage and track all tool calls (not just delegation tools), with backwards-compatible deprecation aliases in the API
  • Fix stale references, add missing MIME types, and clean up text tool type mappings

Test plan

  • Verify text tool handles all previously supported file types
  • Verify file agent delegation works end-to-end
  • Confirm toolsUsage is populated for all tool calls in message metadata
  • Confirm deprecated subAgentUsage field still returns data for existing clients
  • Run typecheck and lint with no errors

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added LLM model selection for document and file processing across PDF, DOCX, PPTX, and text files.
    • Expanded text file support to include markdown, JSON, YAML, XML, HTML, CSS, code files, and other text-based formats.
    • Enhanced tool usage metrics now include token counts, processing duration, model information, and input/output details.
  • Improvements

    • Rebranded "Document Assistant" to "File Assistant" with broader file handling capabilities.
    • Reorganized file tool categories for improved usability.

larryro added 4 commits March 16, 2026 15:16
Broaden usage tracking from delegation-only sub-agents to all tool calls,
adding input/output fields to the usage schema. Rename SubAgentDetailsDialog
to ToolDetailsDialog and update validators, types, translations, and OpenAPI
spec with backwards-compatible deprecation aliases.
- Update chat agent instructions to reference "file agent" instead of "document agent"
- Fix tool-details-dialog display name mapping (document_assistant → file_assistant)
- Remove text/csv from text tool mimeTypes to avoid overlap with excel tool
- Replace text/* wildcard with explicit file extensions and MIME types
- Add missing MIME types for .sql, .graphql extensions
- Remove unused model parameter from Python vision extractors
- Update stale comments in generate_response.ts and create_delegation_tool.ts

coderabbitai Bot commented Mar 16, 2026 (Contributor)

Walkthrough

This PR adds a model parameter throughout the file parsing pipeline (crawler endpoints, file parser services, and OpenAI LLM calls), enabling callers to specify which LLM model to use for text processing. Concurrently, it refactors terminology across the platform from "SubAgent" to "Tool", renames the "txt" tool to "text" with expanded format support (markdown, JSON, HTML, etc.), and renames the "Document Agent" to "File Agent". Tool categorization is reorganized by introducing a new "Files" category while restructuring the Documents and Knowledge sections. Tool usage metadata is expanded to capture model, provider, token counts, and input/output details. These changes span crawler endpoints, file parser services, chat UI components, agent definitions, and message metadata structures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title concisely and accurately summarizes the three primary changes: renaming txt to text, document to file, and broadening usage tracking from sub-agents to all tool calls. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
services/crawler/app/services/vision/openai_client.py (1)

289-294: 🧹 Nitpick | 🔵 Trivial

Consider reusing the VisionClient instance instead of creating a new AsyncOpenAI client per call.

A new AsyncOpenAI client is instantiated on every call to process_pages_with_llm, while the VisionClient class at the bottom of this file maintains a singleton pattern with _get_client(). This is likely intentional (different timeout: 180s vs 120s), but if call frequency is high, reusing a client or extracting this to a shared helper could reduce connection overhead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/crawler/app/services/vision/openai_client.py` around lines 289 -
294, process_pages_with_llm currently creates a new AsyncOpenAI client on each
call (AsyncOpenAI instantiation with timeout=180.0) which duplicates connection
overhead instead of reusing the VisionClient singleton; refactor to reuse a
shared client by either (a) adding a parameterized client getter on VisionClient
(e.g., VisionClient._get_client(timeout=180)) that returns a cached AsyncOpenAI
instance per timeout or (b) extracting a shared helper factory used by
process_pages_with_llm to obtain the client; update process_pages_with_llm to
call that helper instead of instantiating AsyncOpenAI directly and preserve the
intended resolved_model logic (resolved_model = model or
settings.get_fast_model()) and the different timeout behavior.
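
The caching idea in option (a) can be illustrated with a small sketch. The file under review is Python, so this TypeScript version is only an analogy: `LlmClient`, `getClient`, and `resolveModel` are hypothetical names, and the cache is keyed by timeout so that the 120s and 180s callers each keep their own shared instance.

```typescript
// Hypothetical stand-in for an API client; the reviewed code uses Python's AsyncOpenAI.
interface LlmClient {
  readonly timeoutSeconds: number;
}

// One cached client per distinct timeout value.
const clientCache = new Map<number, LlmClient>();

// Counts constructions, only to make the reuse visible in this sketch.
let clientsCreated = 0;

function getClient(timeoutSeconds: number): LlmClient {
  const cached = clientCache.get(timeoutSeconds);
  if (cached !== undefined) {
    return cached;
  }
  clientsCreated += 1;
  const client: LlmClient = { timeoutSeconds };
  clientCache.set(timeoutSeconds, client);
  return client;
}

// Mirrors the fallback quoted in the review: an explicit model wins, else the fast model.
function resolveModel(model: string | undefined, fastModel: string): string {
  return model || fastModel;
}
```

Whether this is worth doing depends on how often the function is called; for infrequent calls the per-call client may be perfectly adequate.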
services/crawler/app/services/file_parser_service.py (1)

54-74: 🧹 Nitpick | 🔵 Trivial

Update docstring to document the new model parameter.

The model parameter was added to parse_pdf_with_vision but the docstring (lines 65-74) doesn't document it. The same applies to the other vision methods (parse_docx_with_vision, parse_pptx_with_vision, parse_file_with_vision).

📝 Suggested docstring update for parse_pdf_with_vision
         Args:
             file_bytes: Raw PDF bytes
             filename: Filename for logging
             user_input: Optional user instruction for AI extraction
             process_images: Whether to extract and describe embedded images
             ocr_scanned_pages: Whether to OCR pages with low text content
+            model: Optional LLM model name for text processing (defaults to fast model)
 
         Returns:
             Extraction result with full_text and metadata
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/crawler/app/services/file_parser_service.py` around lines 54 - 74,
The docstrings for parse_pdf_with_vision, parse_docx_with_vision,
parse_pptx_with_vision, and parse_file_with_vision are missing documentation for
the new model parameter; update each method's docstring to add a brief "model"
entry in the Args section (e.g., model: Optional[str] = None — name of the
vision/LLM model to use for extraction/analysis, defaults to the
service-configured model), keeping wording consistent with existing parameter
docs and noting default/behavior when None.
services/platform/convex/agent_tools/files/helpers/analyze_text.ts (1)

1-6: 🧹 Nitpick | 🔵 Trivial

Update outdated documentation comment.

The comment on line 2 mentions "using the fast model," but this PR changes the function to accept an explicit model parameter, making the model configurable by the caller. Consider updating the documentation to reflect this.

📝 Suggested documentation update
 /**
- * Helper for analyzing text files using the fast model.
+ * Helper for analyzing text files using a specified LLM model.
  * Handles encoding detection, chunking for large files, and LLM analysis.
  * Uses ctx.storage.get() for direct Convex storage access (like analyze_image.ts).
  * Uses Agent framework with saveMessages: 'none' to avoid creating visible thread messages.
  */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/agent_tools/files/helpers/analyze_text.ts` around
lines 1 - 6, Update the top-of-file documentation to remove the hardcoded "fast
model" reference and document the new configurable model parameter (e.g., model)
accepted by the analyzeText function; mention that the caller supplies the model
and that the helper will use that model for LLM analysis, and keep existing
notes about encoding detection, chunking, ctx.storage.get(), and saveMessages:
'none' intact.
services/platform/convex/lib/agent_chat/internal_actions.ts (1)

574-595: ⚠️ Potential issue | 🟠 Major

Regression: attachment context is dropped before generation.

The beforeGenerateHook receives attachments (passed by buildHooksFromConfig at line 469), but the handler at line 574 does not destructure them, and line 594 returns promptContent: undefined unconditionally. This prevents attached files from being processed into prompt content, breaking file-attachment functionality.

💡 Proposed fix (restore attachment-derived prompt content)
-    const { threadId, promptMessage, contextMessagesTokens } = args;
+    const { threadId, promptMessage, attachments, contextMessagesTokens } = args;

@@
-    return {
-      promptContent: undefined,
-      contextExceedsBudget,
-    };
+    const promptContent =
+      attachments && attachments.length > 0
+        ? await processAttachments(ctx, attachments, promptMessage)
+        : undefined;
+
+    return {
+      promptContent,
+      contextExceedsBudget,
+    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/lib/agent_chat/internal_actions.ts` around lines 574
- 595, The handler currently drops attachments by not reading args.attachments
and always returning promptContent: undefined; fix it by destructuring
attachments from args alongside threadId/promptMessage/contextMessagesTokens,
then restore the attachment-derived prompt content before return (e.g., if
attachments exist, produce promptContent from args.attachments using the same
conversion logic used elsewhere / by buildHooksFromConfig and return that value
instead of undefined), keeping the contextExceedsBudget logging via
beforeGenerateDebugLog and preserving existing variables like threadId and
contextMessagesTokens.
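
A minimal sketch of the restored behaviour follows, under several assumptions: every type here is a simplified stand-in for the real Convex types, the budget check is invented for illustration, and the attachment processor is injected and returns a plain string. Note that the `processAttachments` signature quoted elsewhere in this review takes a fourth `config` argument with a `model`, which the three-argument call in the proposed fix omits, so the sketch passes one.

```typescript
// Simplified, hypothetical stand-ins for the real Convex types.
interface FileAttachment {
  fileId: string;
  fileName: string;
}

interface BeforeGenerateArgs {
  threadId: string;
  promptMessage: string;
  attachments?: FileAttachment[];
  contextMessagesTokens: number;
}

interface BeforeGenerateResult {
  promptContent: string | undefined;
  contextExceedsBudget: boolean;
}

// Injected so the sketch stays self-contained; the real function also takes ctx.
type AttachmentProcessor = (
  attachments: FileAttachment[],
  userText: string | undefined,
  config: { model: string },
) => Promise<string>;

async function beforeGenerate(
  args: BeforeGenerateArgs,
  processAttachments: AttachmentProcessor,
  options: { model: string; tokenBudget: number },
): Promise<BeforeGenerateResult> {
  const { promptMessage, attachments, contextMessagesTokens } = args;

  // Assumed budget check; the real computation is not shown in the review.
  const contextExceedsBudget = contextMessagesTokens > options.tokenBudget;

  // Only build prompt content when something is actually attached.
  const promptContent =
    attachments && attachments.length > 0
      ? await processAttachments(attachments, promptMessage, {
          model: options.model,
        })
      : undefined;

  return { promptContent, contextExceedsBudget };
}
```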
services/platform/convex/lib/attachments/process_attachments.ts (1)

60-65: 🧹 Nitpick | 🔵 Trivial

Make the config contract consistently require model.

ProcessAttachmentsConfig marks model optional, while the exported function immediately strengthens it to ProcessAttachmentsConfig & { model: string }. Making the interface itself required is clearer and removes a misleading public shape.

Proposed refactor
 export interface ProcessAttachmentsConfig {
   maxDocumentLength?: number;
   debugLog?: (message: string, data?: Record<string, unknown>) => void;
   toolName?: string;
-  model?: string;
+  model: string;
 }
@@
 export async function processAttachments(
   ctx: ActionCtx,
   attachments: FileAttachment[],
   userText: string | undefined,
-  config: ProcessAttachmentsConfig & { model: string },
+  config: ProcessAttachmentsConfig,
 ): Promise<ProcessedAttachments> {

Also applies to: 83-88

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/lib/attachments/process_attachments.ts` around lines
60 - 65, ProcessAttachmentsConfig currently declares model as optional but the
exported function expects a guaranteed model; change the interface
ProcessAttachmentsConfig to require model: string (remove the optional mark) and
update any related type intersections (e.g., the exported function signature
that used ProcessAttachmentsConfig & { model: string }) to just use
ProcessAttachmentsConfig. Ensure callers and any defaulting logic (referenced in
the file around the exported function and related helper types) are updated to
pass a model string or handle validation so the new required property is
satisfied.
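
One way callers could satisfy the stricter contract is sketched below. The interface matches the shape proposed above; `buildAttachmentsConfig` is a hypothetical helper, not something in the PR, and it simply fails fast when no model can be resolved.

```typescript
interface ProcessAttachmentsConfig {
  maxDocumentLength?: number;
  debugLog?: (message: string, data?: Record<string, unknown>) => void;
  toolName?: string;
  model: string; // required, as in the proposed refactor
}

// Hypothetical helper: resolves the model once so every caller passes a complete config.
function buildAttachmentsConfig(
  agentModel: string | undefined,
  overrides: Partial<ProcessAttachmentsConfig> = {},
): ProcessAttachmentsConfig {
  const model = overrides.model ?? agentModel;
  if (!model) {
    throw new Error("processAttachments requires a model");
  }
  return { ...overrides, model };
}
```

With `model` required on the interface, a call site that forgets it becomes a compile-time error instead of relying on the intersection type at the function boundary.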
services/platform/app/features/chat/components/message-info-dialog.tsx (1)

223-234: ⚠️ Potential issue | 🟠 Major

The list still uses the pre-rename tool formatter.

formatAgentName only knows document_assistant, so a file_assistant row will render as the raw identifier here while ToolDetailsDialog now titles it as File. Reuse one formatter in both places.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/features/chat/components/message-info-dialog.tsx`
around lines 223 - 234, The UI is using the old formatter formatAgentName in
message-info-dialog.tsx which doesn't handle the renamed tool ids (e.g.,
file_assistant), so create or expose a single shared formatter (e.g.,
formatToolName or export the formatter used by ToolDetailsDialog) and replace
formatAgentName usages with that shared function; update message-info-dialog.tsx
(the render mapping that calls formatAgentName) to import and call the shared
formatter so both the list and ToolDetailsDialog produce identical
human-friendly titles for tools like document_assistant and file_assistant.
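
A shared formatter along these lines might look like the sketch below. The name `formatToolName`, the lookup table, and the title-case fallback are assumptions for illustration; mapping the legacy `document_assistant` id to the same label is also a guess at the intended behaviour for stored metadata.

```typescript
// Display names keyed by tool id. The legacy id is kept (an assumption) so that
// metadata stored before the rename still renders with a friendly label.
const TOOL_DISPLAY_NAMES: Record<string, string> = {
  file_assistant: "File",
  document_assistant: "File",
};

function formatToolName(toolName: string): string {
  const known = TOOL_DISPLAY_NAMES[toolName];
  if (known) {
    return known;
  }
  // Fallback: turn snake_case identifiers into Title Case words.
  return toolName
    .split("_")
    .filter((word) => word.length > 0)
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}
```

Importing one function in both `message-info-dialog.tsx` and `tool-details-dialog.tsx` would keep the list rows and the dialog title from drifting apart again.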
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@services/platform/app/features/chat/components/message-info-dialog.tsx`:
- Around line 289-292: The nested ToolDetailsDialog can remain open because
selectedTool outlives the parent dialog; update the parent dialog's close path
to clear selection so the child cannot survive a parent close — specifically,
ensure the parent's onOpenChange handler (the prop controlling isOpen) calls
setSelectedTool(null) when open becomes false (in addition to or instead of only
relying on ToolDetailsDialog's onOpenChange), so that ToolDetailsDialog receives
isOpen={selectedTool !== null} and will always reset when the parent closes.

In `@services/platform/convex/custom_agents/system_defaults.ts`:
- Around line 85-91: Update the agent instructions and tool descriptions to
remove any reference to the ".csv" extension for the text tool: edit
FILE_AGENT_INSTRUCTIONS in services/platform/convex/agents/file/agent.ts to stop
instructing the agent to treat .csv via the text tool, update the textTool
description in services/platform/convex/agent_tools/files/text_tool.ts (both the
header and the detailed description) to list only plain text-like extensions
(e.g., .txt, .md) or generic "text" instead of ".csv", and change the error
message in services/platform/convex/agent_tools/files/helpers/analyze_text.ts to
not mention ".csv" when suggesting the text tool; ensure CSV handling remains
mapped to the excel tool (leave TOOL_FILE_MAP as-is) and adjust phrasing so the
agent will route CSVs to the excel tool rather than the text tool.

In `@services/platform/convex/lib/agent_response/types.ts`:
- Around line 140-150: Reintroduce the deprecated alias subAgentUsage into the
public response type so the contract stays backward-compatible: add an optional
subAgentUsage?: Array<{ toolName: string; model?: string; provider?: string;
inputTokens?: number; outputTokens?: number; totalTokens?: number; durationMs?:
number; input?: string; output?: string; }>; alongside toolsUsage in
services/platform/convex/lib/agent_response/types.ts, and update the runtime
validator in services/platform/convex/lib/agent_response/validators.ts (the
validator that validates toolsUsage) to accept and normalize the alias (i.e.,
validate the same shape for subAgentUsage as for toolsUsage and continue to
support either field at runtime).

In `@services/platform/convex/message_metadata/internal_mutations.ts`:
- Line 27: The mutation stopped populating the deprecated alias field
subAgentUsage even though the table schema still exposes it; update the insert
and patch handlers in internal_mutations.ts that write to messageMetadataTable
so they persist subAgentUsage alongside toolsUsage (map the same array/value
used for toolsUsage into subAgentUsage) whenever toolsUsage is present, and do
the same mapping in any validators or transformation steps referenced (see the
toolsUsage: v.optional(...) validator and the mutations around lines referenced
61-62 and 82-82) so older readers continue to see fresh data.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 89d75a5e-f413-4551-b2bc-4f331953b88a

📥 Commits

Reviewing files that changed from the base of the PR and between fe84a99 and abd40d7.

⛔ Files ignored due to path filters (1)
  • services/platform/convex/_generated/api.d.ts is excluded by !**/_generated/**
📒 Files selected for processing (36)
  • services/crawler/app/routers/docx.py
  • services/crawler/app/routers/pdf.py
  • services/crawler/app/routers/pptx.py
  • services/crawler/app/services/file_parser_service.py
  • services/crawler/app/services/vision/openai_client.py
  • services/platform/app/features/chat/components/message-info-dialog.tsx
  • services/platform/app/features/chat/components/tool-details-dialog.tsx
  • services/platform/app/features/chat/hooks/queries.ts
  • services/platform/app/features/custom-agents/components/tool-selector.tsx
  • services/platform/convex/agent_tools/delegation/create_delegation_tool.ts
  • services/platform/convex/agent_tools/files/docx_tool.ts
  • services/platform/convex/agent_tools/files/helpers/analyze_text.ts
  • services/platform/convex/agent_tools/files/helpers/get_agent_model.ts
  • services/platform/convex/agent_tools/files/helpers/parse_file.ts
  • services/platform/convex/agent_tools/files/internal_actions.ts
  • services/platform/convex/agent_tools/files/pdf_tool.ts
  • services/platform/convex/agent_tools/files/pptx_tool.ts
  • services/platform/convex/agent_tools/files/text_tool.ts
  • services/platform/convex/agent_tools/tool_names.ts
  • services/platform/convex/agent_tools/tool_registry.ts
  • services/platform/convex/agents/chat/agent.ts
  • services/platform/convex/agents/file/agent.ts
  • services/platform/convex/custom_agents/system_defaults.ts
  • services/platform/convex/lib/agent_chat/internal_actions.ts
  • services/platform/convex/lib/agent_completion/on_agent_complete.ts
  • services/platform/convex/lib/agent_response/generate_response.ts
  • services/platform/convex/lib/agent_response/types.ts
  • services/platform/convex/lib/agent_response/validators.ts
  • services/platform/convex/lib/attachments/process_attachments.ts
  • services/platform/convex/message_metadata/internal_mutations.ts
  • services/platform/convex/streaming/schema.ts
  • services/platform/convex/streaming/validators.ts
  • services/platform/lib/shared/__tests__/file-types-tool-mapping.test.ts
  • services/platform/lib/shared/file-types.ts
  • services/platform/messages/en.json
  • services/platform/public/openapi.json

Comment on lines +289 to +292
<ToolDetailsDialog
isOpen={selectedTool !== null}
onOpenChange={(open) => !open && setSelectedTool(null)}
usage={selectedTool}

⚠️ Potential issue | 🟡 Minor

Reset the nested tool dialog when the parent closes.

selectedTool outlives isOpen, so the child dialog can survive the parent close or reopen immediately on the next open. Clear the selection from the parent close path instead of only when the child closes itself.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/features/chat/components/message-info-dialog.tsx`
around lines 289 - 292, The nested ToolDetailsDialog can remain open because
selectedTool outlives the parent dialog; update the parent dialog's close path
to clear selection so the child cannot survive a parent close — specifically,
ensure the parent's onOpenChange handler (the prop controlling isOpen) calls
setSelectedTool(null) when open becomes false (in addition to or instead of only
relying on ToolDetailsDialog's onOpenChange), so that ToolDetailsDialog receives
isOpen={selectedTool !== null} and will always reset when the parent closes.
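
The state transition being asked for can be modelled without React. In this sketch `DialogState` and the two handlers are hypothetical; in the component they would correspond to the parent's `onOpenChange` and the existing `ToolDetailsDialog` `onOpenChange`.

```typescript
interface ToolUsage {
  toolName: string;
}

interface DialogState {
  isOpen: boolean; // parent dialog
  selectedTool: ToolUsage | null; // drives the nested dialog
}

// Parent close path: closing the parent always clears the nested selection.
function onParentOpenChange(state: DialogState, open: boolean): DialogState {
  return open
    ? { ...state, isOpen: true }
    : { isOpen: false, selectedTool: null };
}

// Child close path: unchanged from the current behaviour.
function onToolDialogOpenChange(state: DialogState, open: boolean): DialogState {
  return open ? state : { ...state, selectedTool: null };
}

// Mirrors isOpen={selectedTool !== null} on the nested dialog.
function isToolDialogOpen(state: DialogState): boolean {
  return state.selectedTool !== null;
}
```

Because the parent handler resets `selectedTool`, reopening the parent later starts with the nested dialog closed.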

Comment on lines +85 to +91
 systemAgentSlug: 'file',
 name: 'file-assistant',
 displayName: 'File Assistant',
 description:
-  'Reads and creates documents (PDF, Word, Excel, etc.). Can also create visual images by rendering HTML/CSS/SVG — use it for illustrations, charts, infographics, diagrams, and web page screenshots.',
-systemInstructions: DOCUMENT_AGENT_INSTRUCTIONS,
-toolNames: ['pdf', 'image', 'docx', 'pptx', 'txt', 'excel'],
+  'Reads and creates files (PDF, Word, Excel, etc.). Can also create visual images by rendering HTML/CSS/SVG — use it for illustrations, charts, infographics, diagrams, and web page screenshots.',
+systemInstructions: FILE_AGENT_INSTRUCTIONS,
+toolNames: ['pdf', 'image', 'docx', 'pptx', 'text', 'excel'],

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if FILE_AGENT_INSTRUCTIONS mentions .csv for text tool
rg -n '\.csv' services/platform/convex/agents/file/agent.ts

Repository: tale-project/tale

Length of output: 179


🏁 Script executed:

find . -name "file-types.ts" -type f

Repository: tale-project/tale

Length of output: 105


🏁 Script executed:

rg -n "\.csv|TOOL_FILE_MAP" services/platform/convex -A 3 -B 3 | head -100

Repository: tale-project/tale

Length of output: 3072


🏁 Script executed:

cat services/platform/lib/shared/file-types.ts

Repository: tale-project/tale

Length of output: 13439


Update agent instructions and tool descriptions to remove .csv from text tool references.

The .csv extension was correctly removed from the text tool mapping in TOOL_FILE_MAP (it is now mapped exclusively to the excel tool), but the agent instructions, tool descriptions, and an error message still mention .csv for the text tool:

  • FILE_AGENT_INSTRUCTIONS in services/platform/convex/agents/file/agent.ts:52
  • textTool description in services/platform/convex/agent_tools/files/text_tool.ts:4, :86
  • Error message in services/platform/convex/agent_tools/files/helpers/analyze_text.ts:338

Remove .csv from all these text tool references to prevent the AI agent from incorrectly attempting CSV parsing with the text tool.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/custom_agents/system_defaults.ts` around lines 85 -
91, Update the agent instructions and tool descriptions to remove any reference
to the ".csv" extension for the text tool: edit FILE_AGENT_INSTRUCTIONS in
services/platform/convex/agents/file/agent.ts to stop instructing the agent to
treat .csv via the text tool, update the textTool description in
services/platform/convex/agent_tools/files/text_tool.ts (both the header and the
detailed description) to list only plain text-like extensions (e.g., .txt, .md)
or generic "text" instead of ".csv", and change the error message in
services/platform/convex/agent_tools/files/helpers/analyze_text.ts to not
mention ".csv" when suggesting the text tool; ensure CSV handling remains mapped
to the excel tool (leave TOOL_FILE_MAP as-is) and adjust phrasing so the agent
will route CSVs to the excel tool rather than the text tool.
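
The routing the instructions should agree with can be sketched as an extension lookup. The table below is a hypothetical, heavily abbreviated stand-in and does not reproduce the real `TOOL_FILE_MAP`; `toolForFile` is likewise an illustrative name.

```typescript
type FileTool = "text" | "excel";

// Hypothetical, abbreviated mapping. The point is only that .csv appears under
// the excel tool and nowhere under the text tool.
const EXTENSION_TO_TOOL: Record<string, FileTool> = {
  ".txt": "text",
  ".md": "text",
  ".json": "text",
  ".csv": "excel",
  ".xlsx": "excel",
};

function toolForFile(fileName: string): FileTool | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return undefined; // no extension, nothing to route on
  }
  return EXTENSION_TO_TOOL[fileName.slice(dot).toLowerCase()];
}
```

If the prompts and error messages are generated from the same table rather than written by hand, they cannot advertise an extension the tool no longer accepts.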

Comment on lines +140 to 150
toolsUsage?: Array<{
toolName: string;
model?: string;
provider?: string;
inputTokens?: number;
outputTokens?: number;
totalTokens?: number;
durationMs?: number;
input?: string;
output?: string;
}>;

⚠️ Potential issue | 🟠 Major

Preserve the deprecated subAgentUsage field in the public response type.

Removing it here makes the shared response contract backward-incompatible, even though the PR objective says older clients should keep receiving the alias during the transition. The runtime validator in services/platform/convex/lib/agent_response/validators.ts should mirror the same alias.

Proposed compatibility fix
 export interface GenerateResponseResult {
   threadId: string;
   text: string;
   usage?: {
     inputTokens?: number;
     outputTokens?: number;
     totalTokens?: number;
     reasoningTokens?: number;
     cachedInputTokens?: number;
   };
   finishReason?: string;
   durationMs: number;
   timeToFirstTokenMs?: number;
   toolCalls?: Array<{ toolName: string; status: string }>;
+  /** @deprecated Use toolsUsage instead. */
+  subAgentUsage?: Array<{
+    toolName: string;
+    model?: string;
+    provider?: string;
+    inputTokens?: number;
+    outputTokens?: number;
+    totalTokens?: number;
+    durationMs?: number;
+    input?: string;
+    output?: string;
+  }>;
   toolsUsage?: Array<{
     toolName: string;
     model?: string;
     provider?: string;
     inputTokens?: number;
     outputTokens?: number;
     totalTokens?: number;
     durationMs?: number;
     input?: string;
     output?: string;
   }>;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/lib/agent_response/types.ts` around lines 140 - 150,
Reintroduce the deprecated alias subAgentUsage into the public response type so
the contract stays backward-compatible: add an optional subAgentUsage?: Array<{
toolName: string; model?: string; provider?: string; inputTokens?: number;
outputTokens?: number; totalTokens?: number; durationMs?: number; input?:
string; output?: string; }>; alongside toolsUsage in
services/platform/convex/lib/agent_response/types.ts, and update the runtime
validator in services/platform/convex/lib/agent_response/validators.ts (the
validator that validates toolsUsage) to accept and normalize the alias (i.e.,
validate the same shape for subAgentUsage as for toolsUsage and continue to
support either field at runtime).
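
The normalization step described in this prompt could look roughly like the sketch below. It is plain TypeScript rather than a Convex validator; `ToolUsageItem` copies the shape from the diff above, while `UsagePayload` and `normalizeToolsUsage` are hypothetical names used only for illustration and do not appear in the PR.

```typescript
// Shape copied from the toolsUsage / subAgentUsage entries in the diff above.
interface ToolUsageItem {
  toolName: string;
  model?: string;
  provider?: string;
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
  durationMs?: number;
  input?: string;
  output?: string;
}

// Hypothetical payload type: incoming data may carry either field.
interface UsagePayload {
  toolsUsage?: ToolUsageItem[];
  /** @deprecated Use toolsUsage instead. */
  subAgentUsage?: ToolUsageItem[];
}

// Hypothetical helper: prefer the new field, fall back to the deprecated
// alias, and return both populated so old and new readers see the same data.
function normalizeToolsUsage(payload: UsagePayload): UsagePayload {
  const usage = payload.toolsUsage ?? payload.subAgentUsage;
  if (usage === undefined) {
    return {};
  }
  return { toolsUsage: usage, subAgentUsage: usage };
}
```

When both fields are present, this sketch treats `toolsUsage` as the source of truth, on the assumption that the deprecated alias should never diverge from it.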

durationMs: v.optional(v.number()),
timeToFirstTokenMs: v.optional(v.number()),
subAgentUsage: v.optional(v.array(subAgentUsageItemValidator)),
toolsUsage: v.optional(v.array(toolUsageItemValidator)),

⚠️ Potential issue | 🟠 Major

Persist the deprecated alias while the schema still exposes it.

messageMetadataTable still defines subAgentUsage, but this mutation no longer writes it on insert or patch. Older readers of that field will stop seeing fresh data even though the schema advertises compatibility.

Proposed compatibility fix
   args: {
@@
     timeToFirstTokenMs: v.optional(v.number()),
     toolsUsage: v.optional(v.array(toolUsageItemValidator)),
     contextWindow: v.optional(v.string()),
     contextStats: v.optional(contextStatsValidator),
   },
@@
         durationMs: args.durationMs ?? existing.durationMs,
         timeToFirstTokenMs:
           args.timeToFirstTokenMs ?? existing.timeToFirstTokenMs,
         toolsUsage: args.toolsUsage ?? existing.toolsUsage,
+        subAgentUsage: args.toolsUsage ?? existing.subAgentUsage,
         contextWindow: contextWindow ?? existing.contextWindow,
         contextStats: args.contextStats ?? existing.contextStats,
       });
@@
       reasoning: args.reasoning,
       providerMetadata: args.providerMetadata,
       durationMs: args.durationMs,
       timeToFirstTokenMs: args.timeToFirstTokenMs,
       toolsUsage: args.toolsUsage,
+      subAgentUsage: args.toolsUsage,
       contextWindow,
       contextStats: args.contextStats,
     });

Also applies to: 61-62, 82-82

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/message_metadata/internal_mutations.ts` at line 27,
The mutation stopped populating the deprecated alias field subAgentUsage even
though the table schema still exposes it; update the insert and patch handlers
in internal_mutations.ts that write to messageMetadataTable so they persist
subAgentUsage alongside toolsUsage (map the same array/value used for toolsUsage
into subAgentUsage) whenever toolsUsage is present, and do the same mapping in
any validators or transformation steps referenced (see the toolsUsage:
v.optional(...) validator and the mutations around lines referenced 61-62 and
82-82) so older readers continue to see fresh data.

larryro added 3 commits March 16, 2026 18:30
- Rename document→file in AGENT_CONTEXT_CONFIGS key
- Update txt→text in document_find and document_retrieve tool descriptions
- Fix formatAgentName mapping: document_assistant→file_assistant
- Harden getAgentModelId to handle both modelId and model properties
- Guard aggregateChunkResults against empty chunkResults array
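
The last two bullets describe defensive checks whose code is not shown in this excerpt. As a rough sketch only: `resolveModelId` and `aggregateChunks` below are stand-ins for `getAgentModelId` and `aggregateChunkResults`, and the `AgentConfigLike` and `ChunkResult` shapes are assumptions, not the project's real types.

```typescript
// Assumed config shape: some entries carry `modelId`, others `model`.
interface AgentConfigLike {
  modelId?: string;
  model?: string;
}

// Stand-in for the hardened lookup: accept either property, prefer `modelId`.
function resolveModelId(config: AgentConfigLike): string | undefined {
  return config.modelId ?? config.model;
}

// Assumed per-chunk result; the real type is not visible in this excerpt.
interface ChunkResult {
  text: string;
  totalTokens: number;
}

// Stand-in for the guarded aggregation: return an empty aggregate up front
// instead of reducing over (or indexing into) an empty array.
function aggregateChunks(chunkResults: ChunkResult[]): ChunkResult {
  if (chunkResults.length === 0) {
    return { text: "", totalTokens: 0 };
  }
  return {
    text: chunkResults.map((r) => r.text).join("\n"),
    totalTokens: chunkResults.reduce((sum, r) => sum + r.totalTokens, 0),
  };
}
```
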
Add UsageAccumulator to collect input/output tokens and duration across
all OpenAI Vision API calls (OCR, image description, LLM processing)
for PDF, DOCX, and PPTX extractors. Surface usage data through the
crawler response and remap to camelCase in the platform parse_file helper.
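
A minimal sketch of the accumulate-then-remap flow described in this commit message, written in TypeScript for illustration. The real `UsageAccumulator` lives in the extractors and may be written in another language; the snake_case field names (`input_tokens`, `output_tokens`, `duration_ms`) and the `toCamelCaseUsage` helper are assumptions.

```typescript
// Assumed snake_case usage record, as a crawler response might report it.
interface RawUsage {
  input_tokens: number;
  output_tokens: number;
  duration_ms: number;
}

// camelCase shape consumed on the platform side (assumed field names).
interface Usage {
  inputTokens: number;
  outputTokens: number;
  durationMs: number;
}

// Illustrative accumulator: sums usage across several API calls.
class UsageAccumulator {
  private total: RawUsage = { input_tokens: 0, output_tokens: 0, duration_ms: 0 };

  // Missing fields count as zero so a partial report does not produce NaN.
  add(call: Partial<RawUsage>): void {
    this.total.input_tokens += call.input_tokens ?? 0;
    this.total.output_tokens += call.output_tokens ?? 0;
    this.total.duration_ms += call.duration_ms ?? 0;
  }

  // Return a copy so callers cannot mutate the running totals.
  snapshot(): RawUsage {
    return { ...this.total };
  }
}

// Hypothetical remap helper, standing in for the step in parse_file.
function toCamelCaseUsage(raw: RawUsage): Usage {
  return {
    inputTokens: raw.input_tokens,
    outputTokens: raw.output_tokens,
    durationMs: raw.duration_ms,
  };
}
```

Each OCR, image-description, or LLM call would report its usage through `add`, and the totals would be remapped once at the boundary between the crawler response and the platform.
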
@larryro larryro merged commit fbdfdd9 into main Mar 16, 2026
17 checks passed
@larryro larryro deleted the refactor/rename-txt-to-text-and-document-to-file branch March 16, 2026 11:18