
[pull] develop from baserow:develop #215

Merged: pull[bot] merged 5 commits into code:develop from baserow:develop on Mar 20, 2026


pull[bot] commented Mar 20, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


silvestrid and others added 5 commits March 20, 2026 09:25
…ata; add google and anthropic models compatibility (#4951)

* chore(deps): replace udspy with pydantic-ai and opentelemetry-sdk

Replace the udspy dependency with pydantic-ai-slim (with openai, groq,
anthropic, bedrock providers) and opentelemetry-sdk for structured
telemetry collection.

* fix(sentry): exclude pydantic_ai from auto-enabling integrations

sentry-sdk's pydantic_ai integration patches ToolManager._call_tool
which was removed in pydantic-ai >= 1.x (now execute_tool_call),
causing import-time errors.

* feat(settings): add dev log file mirroring and allow embeddings URL in tests

- Add BASEROW_LOG_FILE support in dev settings to mirror logs (including
  loguru output) to a file, useful for AI-assisted debugging.
- Allow BASEROW_EMBEDDINGS_API_URL to be overridden via env in test
  settings for search_user_docs eval tests.

* feat(assistant): add message_history field to AssistantChat

Add a BinaryField to store serialized pydantic-ai message history
(JSON bytes) for multi-turn conversation context, replacing the
previous udspy-based conversation state.

* refactor(assistant): port to pydantic-ai agent framework

Replace udspy with pydantic-ai as the agent framework for the AI
assistant. Key changes:

- Add Agent definitions with typed deps (AssistantDeps) and dynamic
  toolsets for runtime tool loading
- Add deps module with AssistantDeps, ToolHelpers, and EventBus for
  streaming events to the UI
- Add history module for serializing/deserializing pydantic-ai message
  history to the database
- Add model_profiles for provider-specific configuration (Anthropic,
  OpenAI, Groq, Bedrock)
- Add toolset module with ToolGroup base class replacing the udspy
  tool registry pattern
- Add shared/ with formula_utils and sub-agent helpers
- Add tool_types.py per tool module for pydantic-ai ToolDefinition
- Port all tool modules (core, database, navigation, automation,
  search_user_docs) from udspy decorators to pydantic-ai Tool instances
- Port assistant orchestrator, handler, and prompts
- Remove signatures.py (replaced by pydantic-ai output types)

* refactor(assistant): update telemetry for pydantic-ai

Rework telemetry collection to use pydantic-ai's message history
format and OpenTelemetry SDK for structured span/event recording,
replacing the previous udspy-based telemetry hooks.

* test(assistant): update unit tests for pydantic-ai port

Rewrite assistant unit tests to use pydantic-ai's testing utilities
(TestModel, FunctionModel) instead of udspy mocks. Add new test files
for core tools, navigation tools, and search docs tools. Remove
obsolete skip file.

* test(assistant): add LLM eval test suite

Add end-to-end eval tests that run the real agent against a live LLM
to verify tool selection, schema compatibility, and output quality.

Includes evals for: navigation, core builders, database tables/rows,
sample rows, automation workflows, search_user_docs, and cross-cutting
structured output validation.

Tests are marked with @pytest.mark.eval and skipped by default.
Configure via EVAL_LLM_MODEL or EVAL_LLM_MODELS env vars.

* docs: add eval guide and update AI assistant installation docs

- Add docs/development/ai-assistant-evals.md with instructions for
  running the eval suite, configuring models, and writing new evals.
- Update docs/installation/ai-assistant.md to reflect pydantic-ai
  provider configuration replacing the previous udspy setup.

* fix(assistant): fix test patch paths, optional filter args, and eval marker

- Fix mock patch paths from `assistant.agent` to `assistant.agents`
- Make ListTablesFilterArg fields optional to prevent LLM validation errors
- Surface field_errors in create_fields tool result
- Simplify EvalToolTracker to use message history inspection
- Register `eval` pytest marker and skip evals by default

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move testing docs to docs/testing/ and add PR test plan

Move ai-assistant-evals.md from docs/development/ to docs/testing/,
add ai-assistant-test-plan.md with manual and automated test steps
for the pydantic-ai port PR.

* refactor(assistant): extract row models to types/rows.py, rename utils to helpers

- Move FieldDefinition, row model builders, and get_link_row_hints to
  new types/rows.py module with dict-of-callables dispatch replacing
  match/case
- Simplify update model: fields are optional (omit = don't change),
  removing the __NO_CHANGE__ sentinel
- Move get_table_rows_tools into tools.py as _build_row_tools since it
  builds pydantic-ai Tool objects
- Rename utils.py to helpers.py for clarity, remove dead list_tables
- Add docstrings with :param/:returns to all public functions, add
  proper type annotations throughout

* refactor(assistant): flatten field/view/filter types into single models

- Replace per-type config classes in fields.py with a single flat
  FieldItemCreate model using optional type-specific fields and a
  model_validator for type aliases
- Simplify view_filters.py and views.py type hierarchies similarly
- Update table.py types and corresponding tests

* fix(assistant): improve telemetry span processor and minor tweaks

- Replace SpanExporter with SpanProcessor for real-time span handling
- Remap child tool spans past 'running tools' grouping span
- Parse JSON string arguments in tool call parts
- Add data_brief parameter to sample rows prompt
- Disable reasoning_effort for groq models temporarily
- Rename _fix_formula to _fix_formula_field in helpers

* refactor(assistant): use ISO strings for dates and add dateutil fallback

Replace Date/Datetime Pydantic model objects with compact ISO 8601
strings. Add lenient parsing via dateutil.parser fallback when
fromisoformat fails.
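
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `parse_iso_lenient` is hypothetical, and it assumes `python-dateutil` is installed for the lenient path.

```python
from datetime import datetime


def parse_iso_lenient(value: str) -> datetime:
    """Parse an ISO 8601 string, falling back to dateutil for looser input.

    Hypothetical helper name; the real implementation may differ.
    """
    try:
        # Strict path: handled entirely by the standard library.
        return datetime.fromisoformat(value)
    except ValueError:
        # Lenient path: python-dateutil accepts many more spellings.
        # Imported lazily so the strict path needs no third-party package.
        from dateutil import parser as dateutil_parser

        return dateutil_parser.parse(value)
```

Trying `fromisoformat` first keeps the common case cheap and strict, and only input that an LLM formats loosely pays for the slower, more permissive parser.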

* feat(assistant): add DO/EXPLAIN agent modes with switch_mode tool

Introduce AgentMode enum (DO/EXPLAIN) and ModeAwareToolset that filters
available tools based on mode. DO mode (default) exposes all action
tools except search_user_docs. EXPLAIN mode exposes only read-only
tools (list_*, navigate) plus search_user_docs for answering Baserow
feature questions. The switch_mode tool allows bidirectional switching.
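
As a rough sketch of the filtering rule above (not the actual `ModeAwareToolset`, which wraps pydantic-ai toolsets), the visibility decision could look like this. The function `visible_tools` is hypothetical, and keeping `switch_mode` visible in both modes is an assumption drawn from the "bidirectional switching" wording.

```python
from enum import Enum


class AgentMode(Enum):
    DO = "do"
    EXPLAIN = "explain"


# Assumption: these non-list tools stay available in EXPLAIN mode.
EXPLAIN_EXTRA = {"navigate", "search_user_docs", "switch_mode"}


def visible_tools(mode: AgentMode, tool_names: list[str]) -> list[str]:
    """Return the tool names exposed to the agent in the given mode."""
    if mode is AgentMode.DO:
        # DO mode: everything except the documentation search tool.
        return [name for name in tool_names if name != "search_user_docs"]
    # EXPLAIN mode: read-only list_* tools plus the explicit extras.
    return [
        name
        for name in tool_names
        if name.startswith("list_") or name in EXPLAIN_EXTRA
    ]
```

For example, given `list_tables`, `create_tables`, `navigate`, `search_user_docs`, and `switch_mode`, DO mode hides only `search_user_docs`, while EXPLAIN mode hides only `create_tables`.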

* refactor(assistant): flatten automation node types and add $formula: convention

Replace 13+ per-type action node classes (RouterNodeCreate, SendEmailActionCreate,
etc.) with a single flat ActionNodeCreate model using @model_validator for per-type
validation and dict-dispatched functions for ORM conversion and formula generation.

Add $formula: prefix convention — values prefixed with '$formula:' are sent to the
LLM formula generator, plain values become literal formulas. Also detect raw formula
expressions (get(), concat(), etc.) written inline.
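
A minimal sketch of how the `$formula:` convention might be classified, under stated assumptions: the regular expression below is illustrative and covers only `get` and `concat`, the function `classify_value` and its three labels are hypothetical, and treating inline raw formulas as a separate "raw" case is a guess at the intended behavior.

```python
import re

FORMULA_PREFIX = "$formula:"
# Illustrative pattern only: a known function name followed by "(".
RAW_FORMULA_RE = re.compile(r"^\s*(?:get|concat)\s*\(")


def classify_value(value: str) -> str:
    """Decide how a node field value should be turned into a formula."""
    if value.startswith(FORMULA_PREFIX):
        # Natural-language description: hand off to the formula generator.
        return "generate"
    if RAW_FORMULA_RE.match(value):
        # The model wrote a formula expression inline.
        return "raw"
    # Anything else is wrapped as a literal formula.
    return "literal"
```

An explicit prefix keeps the common case (plain literal values) free of an extra LLM call, and the regex catches models that skip the prefix and write formula syntax directly.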

Add trigger validation: periodic triggers now require periodic_interval (with
automatic folding of flat fields), row triggers require rows_triggers_settings.

* build: bump pydantic-ai-slim to 0.1.66 and anthropic to 0.84.0

* feat(assistant): add RetryingModel for transient provider error recovery

Wraps pydantic-ai model instances to automatically retry on transient
errors (rate limits, timeouts, server errors) with exponential backoff.
Handles both streaming and non-streaming calls.
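
The retry policy can be illustrated with a generic, synchronous sketch. It does not reproduce `RetryingModel` itself, which wraps pydantic-ai model instances and also covers streaming; `TransientError`, `call_with_retry`, and the default parameters are all hypothetical.

```python
import time


class TransientError(Exception):
    """Stand-in for rate-limit, timeout, or server errors from a provider."""


def call_with_retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the original error.
                raise
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2**attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. A production version would likely add jitter and honor any retry-after hint from the provider.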

* feat(assistant): add AgentMode system with ModeAwareToolset and switch_mode

Introduce domain modes (DATABASE, APPLICATION, AUTOMATION, EXPLAIN) that
control which tools are visible to the agent. ModeAwareToolset filters
the combined toolset per-mode, registries generate per-mode manifests,
and switch_mode lets the agent transition between domains. Each mode
gets a cross-mode summary so the agent knows what other modes offer.

* refactor(assistant): integrate RetryingModel, event-based streaming, and JSON retry

Replace direct model usage with RetryingModel for resilience. Rewrite
streaming to use run_stream_events for proper text/reasoning/tool event
handling. Add JSON-tool-call-as-text detection with automatic retry.
Auto-detect starting mode from UI context. Update model_profiles with
max_tokens settings.
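
One way the JSON-tool-call-as-text detection could work is sketched below. The commit message does not give the payload shape, so the `name` key and the function `looks_like_tool_call` are assumptions made purely for illustration.

```python
import json


def looks_like_tool_call(text: str, tool_names: set[str]) -> bool:
    """Return True if plain-text output appears to be a serialized tool call.

    Assumes a payload of the form {"name": "<tool>", ...}; the real check
    may use a different shape.
    """
    try:
        payload = json.loads(text.strip())
    except json.JSONDecodeError:
        # Ordinary prose is not valid JSON.
        return False
    return isinstance(payload, dict) and payload.get("name") in tool_names
```

When such output is detected, the caller can ask the model to retry with a proper tool call instead of showing the raw JSON to the user.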

* refactor(assistant): improve shared formula utils and add formula language reference

Add RAW_FORMULA_RE for detecting raw formula expressions, needs_formula()
for $formula: prefix and raw formula detection, literal_or_placeholder()
for ORM value creation, and a shared formula language prompt. Improve
formula generator to track remaining unresolved fields across retries.

* refactor(assistant): improve database tools with routing rules and type fixes

Add per-module routing rules via get_routing_rules(). Extract ToolInputError
to helpers. Fix field type validators, improve row model handling, and
refine view filter types. Update agents and prompts for better tool
guidance.

* refactor(assistant): improve automation tools with routing rules and formula handling

Add per-module routing rules for automation. Improve node type handling
with better formula context support. Refine automation agents and prompts.

* fix(assistant): improve telemetry span processor with real-time remapping

Enhance SpanProcessor with JSON arg parsing, real-time span remapping,
and improved trace output handling. Update tests to cover new behavior.

* test(assistant): update tests for mode system, retry logic, and type refactors

Add tests for switch_mode, mode-aware manifests, JSON retry logic.
Add test_assistant_automation_node_tools and test_assistant_database_field_tools.
Update existing database/automation/core tests for refactored types and
new tool signatures. Update eval_utils for new deps structure.

* fix: lint

* chore(frontend): ignore .claude dir in vite watcher and fix Nitro EMFILE

- Add .claude/ to vite server watch ignore list alongside node_modules
  and .git to avoid unnecessary file watching in worktrees.
- Configure Nitro devStorage to use fs-lite driver to prevent chokidar
  from watching the entire repo root, which causes EMFILE on macOS in
  large monorepos.

* fix(tests): fix test settings and seat usage test isolation with xdist

- Ensure pytest always finds backend/pytest.ini by passing -c pytest.ini
  explicitly, fixing DJANGO_SETTINGS_MODULE=dev when running from root
- Preserve existing TEST dict keys when setting MIGRATE in test settings
- Add transaction=True to seat usage tests to prevent data leaking from
  TransactionTestCase tests running on the same xdist worker

* fix(assistant): fix field types, formula regex, validator guard, and consolidate evals

Fix multiple_select returning None instead of [], link_row description
typo, formula regex missing greater_than_or_equal/less_than_or_equal
variants, and guard against overwriting original validators on repeated
prepare_tools calls. Consolidate sample rows and navigation evals into
the database tables eval file and remove the meta tool-call history test.

* fix(assistant): improve tool return types, filter aliases, and eval infrastructure

- Return consistent dict types from all tools instead of plain strings
- Add operator aliases for view filters so LLMs can use natural names
- Fix boolean filter operator (is → equal)
- Remove reasoning format from UTILITY model profiles (pollutes structured output)
- Add ModelRetry for workflow creation and formula agent errors
- Add EvalChecklist for soft assertions with pass/fail scoring
- Add EVAL_RETRIES support for flake detection in eval tests
- Suppress loguru DEBUG noise during evals

* refactor(assistant): extract table creation helpers and remove unused model profile

- Extract _create_empty_tables and _create_table_fields from create_tables
- Filter out duplicate primary field in field creation to avoid model mistakes
- Remove unused gpt-oss-20b model profile
- Always attempt sample rows regardless of field errors

* fix(assistant): strip <think> tags and unify streaming as reasoning chunks

Models like MiniMax-M2.5 emit <think>...</think> tags inline. Handle
ThinkingPart/ThinkingPartDelta events from pydantic-ai and extract
inline thinking from text parts as a fallback. Stream all content as
AiReasoningChunk during the agent run; the final answer is emitted
as AiMessageChunk by _emit_answer.

* fix(assistant): simplify streaming and add collapsible reasoning UI

Replace _accumulate_text/_extract_thinking with a single _get_content_delta
helper that forwards text/thinking deltas. Accumulate reasoning_so_far and
strip <think> tags before sending to frontend (which replaces content on
each chunk). Add collapsible reasoning bubble (max 250px with fade mask
and chevron toggle).

* fix(assistant): bridge legacy UDSPY_LM_* env vars to pydantic-ai config

* docs(assistant): improve ai-assistant.md and add AWS_REGION_NAME backward compat

- Add both Bedrock auth methods (boto3 creds + bearer token)
- Add section 6 with pydantic-ai model overview link and provider list
- Restructure migration table: unchanged / bridged / new variables
- Fix AWS_BEARER_TOKEN_BEDROCK incorrectly listed as removed
- Bridge AWS_REGION_NAME to AWS_DEFAULT_REGION in settings for backward compat
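
The region bridging in the last bullet might look like the following sketch. The helper name `bridge_aws_region` is hypothetical, and letting an explicitly set `AWS_DEFAULT_REGION` take precedence is an assumption rather than something the commit message states.

```python
import os


def bridge_aws_region(environ=os.environ):
    """Copy the legacy AWS_REGION_NAME into AWS_DEFAULT_REGION if unset."""
    legacy = environ.get("AWS_REGION_NAME")
    # Assumption: an explicitly configured AWS_DEFAULT_REGION wins.
    if legacy and not environ.get("AWS_DEFAULT_REGION"):
        environ["AWS_DEFAULT_REGION"] = legacy
    return environ
```

Running this once at settings load time would let existing deployments keep their old variable while boto3 picks up the region from its usual name.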

* minor doc/evals fixes

* fix(assistant): strip unclosed <think> tags during streaming

Models behind Groq emit <think> tags as text content rather than using
the native thinking protocol. During streaming, the closing </think>
tag may not have arrived yet, causing raw thinking content to leak to
the frontend. Also strip think tags from tool thought fields and reset
reasoning on tool results.
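
A minimal sketch of stripping both closed and still-open `<think>` blocks from accumulated streaming text. The function name `strip_think` and both patterns are illustrative, not taken from the commit.

```python
import re

# A complete <think>...</think> block (non-greedy, may span lines).
CLOSED_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
# An opening tag whose closing tag has not arrived yet: drop to end of text.
UNCLOSED_THINK_RE = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove inline thinking from text accumulated so far."""
    text = CLOSED_THINK_RE.sub("", text)
    text = UNCLOSED_THINK_RE.sub("", text)
    return text.strip()
```

Closed blocks are removed first so that the unclosed pattern only ever sees a trailing, incomplete block. This sketch does not handle a tag that is itself split across chunks (for example a chunk ending in `<thi`), which a streaming implementation would also need to consider.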

* docs(assistant): use provider:model format and refresh eval docs

- Update all docs to use pydantic-ai provider:model format (colon separator)
- Fix mixed-up provider descriptions in configuration.md
- Refresh eval docs: replace assert_no_tool_errors with EvalChecklist pattern
- Add embeddings URL for local vs Docker in ai-assistant-evals.md
- Skip KB sync post_migrate signal during tests
- Fix temperature type in model_profiles.py
- Refactor justfile PYTHONPATH for test recipe

* fix(assistant): clean up navigation and improve tool types

- Remove unused WorkspaceNavigationRequestType
- Narrow exception catch in navigate tool to ObjectDoesNotExist
- Guard id field in CreateRowModel.from_django_orm
- Use id__in for batch filtering in ListTablesFilterArg

* Revert "chore(frontend): ignore .claude dir in vite watcher and fix Nitro EMFILE"

This reverts commit 4be1be1.

* fix: ai-assistant-test-plan.md tool smoke test prompt for list_builders

* fix: Posthog env var names in docs/testing/ai-assistant-test-plan.md

* fix: wrong doc reference

Bumps [flatted](https://github.com/WebReflection/flatted) from 3.4.1 to 3.4.2.
- [Commits](WebReflection/flatted@v3.4.1...v3.4.2)

---
updated-dependencies:
- dependency-name: flatted
  dependency-version: 3.4.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

pull[bot] locked and limited conversation to collaborators Mar 20, 2026
pull[bot] added the ⤵️ pull label Mar 20, 2026
pull[bot] merged commit 0f42093 into code:develop Mar 20, 2026