[pull] develop from baserow:develop #215
Merged
pull[bot] merged 5 commits into code:develop on Mar 20, 2026
…ata; add google and anthropic models compatibility (#4951)

* chore(deps): replace udspy with pydantic-ai and opentelemetry-sdk

Replace the udspy dependency with pydantic-ai-slim (with openai, groq, anthropic, bedrock providers) and opentelemetry-sdk for structured telemetry collection.

* fix(sentry): exclude pydantic_ai from auto-enabling integrations

sentry-sdk's pydantic_ai integration patches ToolManager._call_tool which was removed in pydantic-ai >= 1.x (now execute_tool_call), causing import-time errors.

* feat(settings): add dev log file mirroring and allow embeddings URL in tests

- Add BASEROW_LOG_FILE support in dev settings to mirror logs (including loguru output) to a file, useful for AI-assisted debugging.
- Allow BASEROW_EMBEDDINGS_API_URL to be overridden via env in test settings for search_user_docs eval tests.

* feat(assistant): add message_history field to AssistantChat

Add a BinaryField to store serialized pydantic-ai message history (JSON bytes) for multi-turn conversation context, replacing the previous udspy-based conversation state.

* refactor(assistant): port to pydantic-ai agent framework

Replace udspy with pydantic-ai as the agent framework for the AI assistant.

Key changes:
- Add Agent definitions with typed deps (AssistantDeps) and dynamic toolsets for runtime tool loading
- Add deps module with AssistantDeps, ToolHelpers, and EventBus for streaming events to the UI
- Add history module for serializing/deserializing pydantic-ai message history to the database
- Add model_profiles for provider-specific configuration (Anthropic, OpenAI, Groq, Bedrock)
- Add toolset module with ToolGroup base class replacing the udspy tool registry pattern
- Add shared/ with formula_utils and sub-agent helpers
- Add tool_types.py per tool module for pydantic-ai ToolDefinition
- Port all tool modules (core, database, navigation, automation, search_user_docs) from udspy decorators to pydantic-ai Tool instances
- Port assistant orchestrator, handler, and prompts
- Remove signatures.py (replaced by pydantic-ai output types)

* refactor(assistant): update telemetry for pydantic-ai

Rework telemetry collection to use pydantic-ai's message history format and OpenTelemetry SDK for structured span/event recording, replacing the previous udspy-based telemetry hooks.

* test(assistant): update unit tests for pydantic-ai port

Rewrite assistant unit tests to use pydantic-ai's testing utilities (TestModel, FunctionModel) instead of udspy mocks. Add new test files for core tools, navigation tools, and search docs tools. Remove obsolete skip file.

* test(assistant): add LLM eval test suite

Add end-to-end eval tests that run the real agent against a live LLM to verify tool selection, schema compatibility, and output quality. Includes evals for: navigation, core builders, database tables/rows, sample rows, automation workflows, search_user_docs, and cross-cutting structured output validation. Tests are marked with @pytest.mark.eval and skipped by default. Configure via EVAL_LLM_MODEL or EVAL_LLM_MODELS env vars.
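
The eval suite commit above names two environment variables but not how they combine. As a rough sketch only: the function name `resolve_eval_models`, the comma-separated format of EVAL_LLM_MODELS, and its precedence over EVAL_LLM_MODEL are all assumptions for illustration, not taken from the Baserow code.

```python
import os
from typing import Mapping, Optional


def resolve_eval_models(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the model names an eval run should target.

    Assumption: EVAL_LLM_MODELS is a comma-separated list and, when set,
    takes precedence over the single-valued EVAL_LLM_MODEL.
    """
    env = os.environ if env is None else env
    # Split the plural variable and drop empty entries and stray whitespace.
    many = env.get("EVAL_LLM_MODELS", "")
    models = [name.strip() for name in many.split(",") if name.strip()]
    if models:
        return models
    # Fall back to the single-model variable, if present.
    single = env.get("EVAL_LLM_MODEL", "").strip()
    return [single] if single else []
```

With neither variable set the function returns an empty list, which a test runner could treat as "nothing to evaluate".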
* docs: add eval guide and update AI assistant installation docs

- Add docs/development/ai-assistant-evals.md with instructions for running the eval suite, configuring models, and writing new evals.
- Update docs/installation/ai-assistant.md to reflect pydantic-ai provider configuration replacing the previous udspy setup.

* fix(assistant): fix test patch paths, optional filter args, and eval marker

- Fix mock patch paths from `assistant.agent` to `assistant.agents`
- Make ListTablesFilterArg fields optional to prevent LLM validation errors
- Surface field_errors in create_fields tool result
- Simplify EvalToolTracker to use message history inspection
- Register `eval` pytest marker and skip evals by default

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move testing docs to docs/testing/ and add PR test plan

Move ai-assistant-evals.md from docs/development/ to docs/testing/, add ai-assistant-test-plan.md with manual and automated test steps for the pydantic-ai port PR.

* refactor(assistant): extract row models to types/rows.py, rename utils to helpers

- Move FieldDefinition, row model builders, and get_link_row_hints to new types/rows.py module with dict-of-callables dispatch replacing match/case
- Simplify update model: fields are optional (omit = don't change), removing the __NO_CHANGE__ sentinel
- Move get_table_rows_tools into tools.py as _build_row_tools since it builds pydantic-ai Tool objects
- Rename utils.py to helpers.py for clarity, remove dead list_tables
- Add docstrings with :param/:returns to all public functions, add proper type annotations throughout

* refactor(assistant): flatten field/view/filter types into single models

- Replace per-type config classes in fields.py with a single flat FieldItemCreate model using optional type-specific fields and a model_validator for type aliases
- Simplify view_filters.py and views.py type hierarchies similarly
- Update table.py types and corresponding tests

* fix(assistant): improve telemetry span processor and minor tweaks

- Replace SpanExporter with SpanProcessor for real-time span handling
- Remap child tool spans past 'running tools' grouping span
- Parse JSON string arguments in tool call parts
- Add data_brief parameter to sample rows prompt
- Disable reasoning_effort for groq models temporarily
- Rename _fix_formula to _fix_formula_field in helpers

* refactor(assistant): use ISO strings for dates and add dateutil fallback

Replace Date/Datetime Pydantic model objects with compact ISO 8601 strings. Add lenient parsing via dateutil.parser fallback when fromisoformat fails.

* feat(assistant): add DO/EXPLAIN agent modes with switch_mode tool

Introduce AgentMode enum (DO/EXPLAIN) and ModeAwareToolset that filters available tools based on mode. DO mode (default) exposes all action tools except search_user_docs. EXPLAIN mode exposes only read-only tools (list_*, navigate) plus search_user_docs for answering Baserow feature questions. The switch_mode tool allows bidirectional switching.

* refactor(assistant): flatten automation node types and add $formula: convention

Replace 13+ per-type action node classes (RouterNodeCreate, SendEmailActionCreate, etc.) with a single flat ActionNodeCreate model using @model_validator for per-type validation and dict-dispatched functions for ORM conversion and formula generation.

Add the $formula: prefix convention: values prefixed with '$formula:' are sent to the LLM formula generator, plain values become literal formulas. Also detect raw formula expressions (get(), concat(), etc.) written inline.

Add trigger validation: periodic triggers now require periodic_interval (with automatic folding of flat fields), row triggers require rows_triggers_settings.
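
The $formula: convention described above can be sketched as a small classifier. This is an illustration under stated assumptions, not the code from the PR: `classify_value`, `RAW_FORMULA_SKETCH_RE`, and the returned labels are hypothetical, and the regex only covers the two function names the commit message mentions (get and concat).

```python
import re

FORMULA_PREFIX = "$formula:"

# Hypothetical pattern: a value that opens with a known formula function call.
# The real RAW_FORMULA_RE in the PR may recognise many more functions.
RAW_FORMULA_SKETCH_RE = re.compile(r"^\s*(get|concat)\s*\(")


def classify_value(value: str) -> tuple[str, str]:
    """Label a node value as 'generate', 'raw', or 'literal'.

    'generate': carries the $formula: prefix, so the remainder is a
                description meant for the formula generator.
    'raw':      looks like a formula expression written inline.
    'literal':  anything else, to be stored as a plain literal.
    """
    if value.startswith(FORMULA_PREFIX):
        return ("generate", value[len(FORMULA_PREFIX):].strip())
    if RAW_FORMULA_SKETCH_RE.match(value):
        return ("raw", value.strip())
    return ("literal", value)
```

Checking the prefix before the regex keeps an explicit `$formula:` request from being mistaken for an inline expression.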
* build: bump pydantic-ai-slim to 0.1.66 and anthropic to 0.84.0

* feat(assistant): add RetryingModel for transient provider error recovery

Wraps pydantic-ai model instances to automatically retry on transient errors (rate limits, timeouts, server errors) with exponential backoff. Handles both streaming and non-streaming calls.

* feat(assistant): add AgentMode system with ModeAwareToolset and switch_mode

Introduce domain modes (DATABASE, APPLICATION, AUTOMATION, EXPLAIN) that control which tools are visible to the agent. ModeAwareToolset filters the combined toolset per-mode, registries generate per-mode manifests, and switch_mode lets the agent transition between domains. Each mode gets a cross-mode summary so the agent knows what other modes offer.

* refactor(assistant): integrate RetryingModel, event-based streaming, and JSON retry

Replace direct model usage with RetryingModel for resilience. Rewrite streaming to use run_stream_events for proper text/reasoning/tool event handling. Add JSON-tool-call-as-text detection with automatic retry. Auto-detect starting mode from UI context. Update model_profiles with max_tokens settings.

* refactor(assistant): improve shared formula utils and add formula language reference

Add RAW_FORMULA_RE for detecting raw formula expressions, needs_formula() for $formula: prefix and raw formula detection, literal_or_placeholder() for ORM value creation, and a shared formula language prompt. Improve formula generator to track remaining unresolved fields across retries.

* refactor(assistant): improve database tools with routing rules and type fixes

Add per-module routing rules via get_routing_rules(). Extract ToolInputError to helpers. Fix field type validators, improve row model handling, and refine view filter types. Update agents and prompts for better tool guidance.

* refactor(assistant): improve automation tools with routing rules and formula handling

Add per-module routing rules for automation. Improve node type handling with better formula context support. Refine automation agents and prompts.

* fix(assistant): improve telemetry span processor with real-time remapping

Enhance SpanProcessor with JSON arg parsing, real-time span remapping, and improved trace output handling. Update tests to cover new behavior.

* test(assistant): update tests for mode system, retry logic, and type refactors

Add tests for switch_mode, mode-aware manifests, JSON retry logic. Add test_assistant_automation_node_tools and test_assistant_database_field_tools. Update existing database/automation/core tests for refactored types and new tool signatures. Update eval_utils for new deps structure.

* fix: lint

* chore(frontend): ignore .claude dir in vite watcher and fix Nitro EMFILE

- Add .claude/ to vite server watch ignore list alongside node_modules and .git to avoid unnecessary file watching in worktrees.
- Configure Nitro devStorage to use fs-lite driver to prevent chokidar from watching the entire repo root, which causes EMFILE on macOS in large monorepos.

* fix(tests): fix test settings and seat usage test isolation with xdist

- Ensure pytest always finds backend/pytest.ini by passing -c pytest.ini explicitly, fixing DJANGO_SETTINGS_MODULE=dev when running from root
- Preserve existing TEST dict keys when setting MIGRATE in test settings
- Add transaction=True to seat usage tests to prevent data leaking from TransactionTestCase tests running on the same xdist worker

* fix(assistant): fix field types, formula regex, validator guard, and consolidate evals

Fix multiple_select returning None instead of [], link_row description typo, formula regex missing greater_than_or_equal/less_than_or_equal variants, and guard against overwriting original validators on repeated prepare_tools calls. Consolidate sample rows and navigation evals into the database tables eval file and remove the meta tool-call history test.
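
The RetryingModel commit above describes retrying transient provider errors with exponential backoff. A minimal, framework-free sketch of that idea follows; it does not wrap a pydantic-ai model, and `TransientError`, `call_with_backoff`, and the default delays are hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for rate-limit, timeout, or server errors."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with doubling delays.

    The delay before retry k (0-based) is base_delay * 2**k. The last
    failure is re-raised so callers still see the underlying error.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    # Only reachable when max_attempts < 1, since the loop never ran.
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; production code would typically also add jitter and cap the maximum delay.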
* fix(assistant): improve tool return types, filter aliases, and eval infrastructure

- Return consistent dict types from all tools instead of plain strings
- Add operator aliases for view filters so LLMs can use natural names
- Fix boolean filter operator (is → equal)
- Remove reasoning format from UTILITY model profiles (pollutes structured output)
- Add ModelRetry for workflow creation and formula agent errors
- Add EvalChecklist for soft assertions with pass/fail scoring
- Add EVAL_RETRIES support for flake detection in eval tests
- Suppress loguru DEBUG noise during evals

* refactor(assistant): extract table creation helpers and remove unused model profile

- Extract _create_empty_tables and _create_table_fields from create_tables
- Filter out duplicate primary field in field creation to avoid model mistakes
- Remove unused gpt-oss-20b model profile
- Always attempt sample rows regardless of field errors

* fix(assistant): strip <think> tags and unify streaming as reasoning chunks

Models like MiniMax-M2.5 emit <think>...</think> tags inline. Handle ThinkingPart/ThinkingPartDelta events from pydantic-ai and extract inline thinking from text parts as a fallback. Stream all content as AiReasoningChunk during the agent run; the final answer is emitted as AiMessageChunk by _emit_answer.

* fix(assistant): simplify streaming and add collapsible reasoning UI

Replace _accumulate_text/_extract_thinking with a single _get_content_delta helper that forwards text/thinking deltas. Accumulate reasoning_so_far and strip <think> tags before sending to frontend (which replaces content on each chunk). Add collapsible reasoning bubble (max 250px with fade mask and chevron toggle).
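
Stripping inline <think> tags from accumulated text, as the streaming fixes above describe, could look roughly like the following. The helper name `strip_think` and both regexes are assumptions for illustration; the sketch also drops a trailing unclosed <think> block, since a closing tag may not have arrived yet while a response is still streaming.

```python
import re

# Complete <think>...</think> blocks, matched non-greedily across newlines.
_CLOSED_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
# An opening tag with no closing tag yet: drop everything up to the end.
_OPEN_THINK_RE = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove inline thinking blocks from model output.

    Closed blocks are removed first so that the unclosed-tag pattern can
    only match a genuinely dangling <think> at the tail of the text.
    """
    without_closed = _CLOSED_THINK_RE.sub("", text)
    return _OPEN_THINK_RE.sub("", without_closed).strip()
```

Because the frontend is said to replace content on each chunk, running a function like this over the full accumulated text on every update is simpler than trying to track tag state across deltas.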
* fix(assistant): bridge legacy UDSPY_LM_* env vars to pydantic-ai config

* docs(assistant): improve ai-assistant.md and add AWS_REGION_NAME backward compat

- Add both Bedrock auth methods (boto3 creds + bearer token)
- Add section 6 with pydantic-ai model overview link and provider list
- Restructure migration table: unchanged / bridged / new variables
- Fix AWS_BEARER_TOKEN_BEDROCK incorrectly listed as removed
- Bridge AWS_REGION_NAME to AWS_DEFAULT_REGION in settings for backward compat

* minor doc/evals fixes

* fix(assistant): strip unclosed <think> tags during streaming

Models behind Groq emit <think> tags as text content rather than using the native thinking protocol. During streaming, the closing </think> tag may not have arrived yet, causing raw thinking content to leak to the frontend. Also strip think tags from tool thought fields and reset reasoning on tool results.

* docs(assistant): use provider:model format and refresh eval docs

- Update all docs to use pydantic-ai provider:model format (colon separator)
- Fix mixed-up provider descriptions in configuration.md
- Refresh eval docs: replace assert_no_tool_errors with EvalChecklist pattern
- Add embeddings URL for local vs Docker in ai-assistant-evals.md
- Skip KB sync post_migrate signal during tests
- Fix temperature type in model_profiles.py
- Refactor justfile PYTHONPATH for test recipe

* fix(assistant): clean up navigation and improve tool types

- Remove unused WorkspaceNavigationRequestType
- Narrow exception catch in navigate tool to ObjectDoesNotExist
- Guard id field in CreateRowModel.from_django_orm
- Use id__in for batch filtering in ListTablesFilterArg

* Revert "chore(frontend): ignore .claude dir in vite watcher and fix Nitro EMFILE"

This reverts commit 4be1be1.

* fix: ai-assistant-test-plan.md tool smoke test prompt for list_builders

* fix: Posthog env var names in docs/testing/ai-assistant-test-plan.md

* fix: wrong doc reference
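
The backward-compat commit above bridges AWS_REGION_NAME to AWS_DEFAULT_REGION. One way such a bridge might be written is sketched below; `bridge_region` is a hypothetical helper, and the rule that an explicitly set AWS_DEFAULT_REGION wins over the legacy variable is an assumption, not something the commit message states.

```python
from typing import MutableMapping


def bridge_region(env: MutableMapping[str, str]) -> None:
    """Copy the legacy AWS_REGION_NAME into AWS_DEFAULT_REGION in place.

    Assumption: an AWS_DEFAULT_REGION that is already set takes priority,
    so existing deployments using the new name are left untouched.
    """
    legacy = env.get("AWS_REGION_NAME")
    if legacy and not env.get("AWS_DEFAULT_REGION"):
        env["AWS_DEFAULT_REGION"] = legacy
```

In a settings module this could be called once with `os.environ` before any Bedrock client is created; the UDSPY_LM_* bridge mentioned in the same group of commits would follow the same pattern with its own variable names, which are not listed here.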
Bumps [flatted](https://github.com/WebReflection/flatted) from 3.4.1 to 3.4.2.
- [Commits](WebReflection/flatted@v3.4.1...v3.4.2)

---
updated-dependencies:
- dependency-name: flatted
  dependency-version: 3.4.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )