Skip to content

fix(platform): seed new orgs from builtin catalog, not default workspace#1751

Merged
larryro merged 5 commits into
mainfrom
fix/scaffold-from-builtin-catalog
May 27, 2026
Merged

fix(platform): seed new orgs from builtin catalog, not default workspace#1751
larryro merged 5 commits into
mainfrom
fix/scaffold-from-builtin-catalog

Conversation

@larryro

@larryro larryro commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Why

scaffoldNewOrganization sourced from domain.resolve('default') — the
default org's writable workspace. That dir doubles as both the seeded
shipped-templates location and a mutable workspace, so anything created
while using the default org (test workflows, scratch agents) got
permanently copied into every newly-scaffolded org. Users saw stale
"Default / testing 2 / testing this" workflows in the Create automation
→ From template dialog for orgs they never touched.

For agents and providers this was worse: they use raw <slug>/
per-org subdirs with no @ marker, so copyTree's @-skip didn't
catch them — scaffolding a new org would recursively copy other
tenants'
agent/provider subdirs into the new org. Already flagged in
the copyTree comment as a known leak awaiting a domain-aware fix.

Diagnosis confirmed it is not a live cross-org query leak:
listWorkflows reads only its own dir, @-prefixed foreign paths fail
slug validation, and resolveOrgSlug throws rather than defaulting.

What

Source the scaffold from the immutable builtin catalog instead of the
default org's workspace.

  • services/platform/Dockerfile — add five *_BUILTIN_DIR ENV
    lines (stage 4 + stage 5 squash). The platform entrypoint's env-sync
    loop pushes all platform env vars into Convex's deployment env, which
    is what Node actions actually read via process.env.
  • services/platform/convex/organizations/scaffold.ts
    • Extend the DOMAINS table with builtinEnv per domain.
    • In scaffoldNewOrganization, read
      process.env[builtinEnv] ?? domain.resolve('default'). The fallback
      preserves behavior for local bun dev (where *_BUILTIN_DIR is
      unset and examples/{domain} is intentionally the catalog) and
      degrades gracefully on rollback to a pre-fix platform image.
    • In copyTree, swap statlstat and skip symbolic links —
      defence-in-depth so a symlink ever planted in any catalog dir can't
      escape. Matches the existing precedent at
      services/platform/convex/skills/file_utils.ts:249.
  • services/platform/convex/organizations/scaffold.test.ts — new
    file, 7 vitest cases covering catalog-source isolation, the agents
    cross-tenant leak, symlink rejection, dev fallback, the
    @/.history/*.secrets.json skips, per-domain idempotency, and the
    default-org early return.

Out of scope (explicit)

  • No data backfill. The per-domain dirHasFiles guard keeps
    scaffolding idempotent, so existing orgs keep whatever junk they
    already have. Only orgs created after this lands start clean. A
    cleanup pass for existing orgs is a separate task.
  • The default org itself. Its template list will still show whatever
    has been created in it — that's its own workspace and the leak only
    ever was about propagation to other orgs.
  • branding and retention domains. Branding is global by design
    (all readers hardcode 'default'); retention uses a different per-org
    shape — a single {slug}.json per org with explicit fallback to
    default.json at read time — so neither exhibits this leak class.

Notes on degradation modes

  • Rollback to a pre-fix platform image strips the five
    *_BUILTIN_DIR vars via the entrypoint's "remove vars not seen on
    platform" step. Scaffold then falls back to
    domain.resolve('default') — the original pre-fix behavior.
    Intentional graceful degradation; no flapping in normal forward-only
    deploys. ENV_SYNC_DENYLIST (would block push too) and
    ORPHAN_DERIVED (removes matching values) are both the wrong tools
    for the job — left as-is.
  • Missing-domain backfill (e.g. when skills was added to
    DOMAINS): backfill_skill_scaffolding re-invokes
    scaffoldNewOrganization; with the fix, domains that don't already
    exist for an org are now seeded from the catalog instead of the
    default workspace. Strict improvement.
  • Dev parity. bun dev doesn't set *_BUILTIN_DIR, so the leak
    can still occur locally if a developer creates content in their
    local default org. Acceptable: dev is single-developer and
    examples/{domain} is intentionally the catalog there. Flagged so
    the next person doesn't chase a "but it works in prod" ghost.

Pre-PR checklist

  • Ran bun run check (format, lint, typecheck, all tests). Only
    failure is the preexisting @tale/docs link-check (3 broken
    links from fr/legal/subprocessors.md to a not-yet-translated
    fr/legal/data-processing-agreement) — verified to fail on a
    clean main with the same 3 failed / 200 passed counts;
    unrelated to this change.
  • Updated services/platform/messages/{en,de,fr}.jsonN/A
    (no user-facing strings touched).
  • Updated /docs/{en,de,fr}/ for every user-visible change —
    N/A (internal scaffolding behavior; no UI/CLI surface change).
  • Every touched docs page has a real opening and closing — N/A
    (no docs pages touched).
  • Ran bun run --filter @tale/docs lint and
    bun run --filter @tale/docs test — covered by bun run check;
    preexisting failure noted above.
  • Updated README.md / README.de.md / README.fr.mdN/A.

Summary by CodeRabbit

  • New Features

    • Enhanced organization scaffolding with improved builtin catalog support and fallback behavior.
  • Bug Fixes

    • Improved symlink handling to prevent unintended dereferencing during organization setup.
  • Chores

    • Added environment variable configuration for builtin catalog directories.
    • Expanded test coverage for organization scaffolding scenarios.

Review Change Stack

scaffoldNewOrganization sourced from `domain.resolve('default')` — the
default org's writable workspace. That dir doubles as both the seeded
shipped-templates location and a mutable workspace, so anything created
while using the default org (test workflows, scratch agents) got
permanently copied into every newly-scaffolded org. For agents and
providers — which use raw `<slug>/` per-org subdirs with no `@` marker —
this also leaked other tenants' subdirs into new orgs, since copyTree's
`@`-skip didn't cover them.

Source the scaffold from the immutable builtin catalog instead:

- Add `*_BUILTIN_DIR` ENV declarations to the platform Dockerfile
  (stage 4 + stage 5 squash). The platform entrypoint already pushes
  all platform env vars into Convex's deployment env, which is what
  Node actions actually read.
- Extend the DOMAINS table in scaffold.ts with a `builtinEnv` per
  domain and read `process.env[builtinEnv] ?? domain.resolve('default')`
  in scaffoldNewOrganization. The fallback preserves current behavior
  for local `bun dev` (where `*_BUILTIN_DIR` is unset and
  `examples/{domain}` is intentionally the catalog) and degrades
  gracefully on rollback to a pre-fix platform image.
- In copyTree, swap `stat` → `lstat` and skip symbolic links — defence
  in depth so a symlink ever planted in any catalog dir can't escape.

No data backfill: the per-domain `dirHasFiles` guard keeps scaffolding
idempotent, so existing orgs keep whatever they already have. Only orgs
created after this change start clean.

branding and retention are intentionally not in scope: branding is
global by design (all readers hardcode 'default') and retention uses a
different per-org shape (single `{slug}.json` with explicit fallback to
`default.json` at read time), so neither exhibits the leak this targets.

Tests cover catalog-source isolation, the agents cross-tenant leak,
symlink rejection, dev fallback, the @/dot/secrets skips, per-domain
idempotency, and the default-org early return.
@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This PR introduces immutable per-domain builtin catalog directories for organization scaffolding. Docker environment variables now declare five *_BUILTIN_DIR paths for agents, providers, integrations, workflows, and skills. The scaffoldNewOrganization function prefers these catalogs when the corresponding environment variable is set, otherwise falls back to the default org's directory for graceful degradation and local development. File copying logic is hardened to detect and skip symbolic links using lstat instead of stat, preventing unintended filesystem dereferencing. A comprehensive test suite validates scaffolding across multiple scenarios including catalog seeding, cross-tenant isolation, fallback behavior, and idempotency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: seeding new organizations from builtin catalog instead of the default workspace.
Description check ✅ Passed The description comprehensively covers Why, What, Out of scope, Notes on degradation modes, and Pre-PR checklist. All template sections are addressed with sufficient detail and rationale.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/scaffold-from-builtin-catalog

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

Warning

Billing warning: we have not been able to collect payment for this subscription for more than 72 hours. Please update the payment method or pay any pending invoices in Billing to avoid service interruption.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/platform/convex/organizations/scaffold.test.ts`:
- Around line 28-29: The test uses unsafe assertion casts around
scaffoldNewOrganization and passes "{} as never" into scaffoldHandler; remove
those casts and instead provide properly typed mocks so the handler and its
calls type-check. Import or reference the ActionConfig/handler input types used
by scaffoldNewOrganization, declare scaffoldHandler by typing
scaffoldNewOrganization with that type (or use TypeScript's "satisfies" to
assert the proper shape) rather than "as unknown as ActionConfig", and replace
every "{} as never" passed into scaffoldHandler(...) with concrete mock objects
that match the handler input/Context type (create small typed fixtures for the
request, ctx, and params expected by scaffoldHandler). Ensure all uses of
scaffoldHandler(...) use those typed mocks so no assertion casts remain.

In `@services/platform/convex/organizations/scaffold.ts`:
- Around line 260-261: The code sets sourceDir using
process.env[domain.builtinEnv] ?? domain.resolve('default'), which treats an
empty string as valid and can cause copyTree to read an unintended path; change
the selection to treat empty or whitespace-only env values as unset (e.g., check
process.env[domain.builtinEnv] is non-empty after trim and only then use it,
otherwise call domain.resolve('default')) and update scaffold.test.ts to add a
test case where the *_BUILTIN_DIR env var is set to '' (and to whitespace) to
assert the fallback to domain.resolve('default'); reference the symbols
domain.builtinEnv, domain.resolve('default'), sourceDir, copyTree, and
scaffold.test.ts when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f2c3cc60-32a0-409e-8dec-e6fffba7d5da

📥 Commits

Reviewing files that changed from the base of the PR and between cb40ea3 and 6864cb3.

📒 Files selected for processing (3)
  • services/platform/Dockerfile
  • services/platform/convex/organizations/scaffold.test.ts
  • services/platform/convex/organizations/scaffold.ts

Comment on lines +28 to +29
const scaffoldHandler = (scaffoldNewOrganization as unknown as ActionConfig)
.handler;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify current assertion usage in this test file.
rg -nP '\sas\s+(unknown|never|[A-Z])|as unknown as' services/platform/convex/organizations/scaffold.test.ts

Repository: tale-project/tale

Length of output: 615


Remove as/as unknown as/as never casts from scaffold.test.ts; type the mocks directly

services/platform/convex/organizations/scaffold.test.ts still contains disallowed TypeScript assertion casts:

  • Line 28: scaffoldNewOrganization as unknown as ActionConfig
  • Lines 94, 119, 144, 162, 187, 206, 218: {} as never passed into scaffoldHandler(...)
Problematic cast
const scaffoldHandler = (scaffoldNewOrganization as unknown as ActionConfig)
  .handler;

Replace these with a typed mock surface/contract (e.g., define the handler input type and use satisfies/proper mock objects) so .handler and scaffoldHandler(...) calls type-check without any as assertions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/platform/convex/organizations/scaffold.test.ts` around lines 28 -
29, The test uses unsafe assertion casts around scaffoldNewOrganization and
passes "{} as never" into scaffoldHandler; remove those casts and instead
provide properly typed mocks so the handler and its calls type-check. Import or
reference the ActionConfig/handler input types used by scaffoldNewOrganization,
declare scaffoldHandler by typing scaffoldNewOrganization with that type (or use
TypeScript's "satisfies" to assert the proper shape) rather than "as unknown as
ActionConfig", and replace every "{} as never" passed into scaffoldHandler(...)
with concrete mock objects that match the handler input/Context type (create
small typed fixtures for the request, ctx, and params expected by
scaffoldHandler). Ensure all uses of scaffoldHandler(...) use those typed mocks
so no assertion casts remain.

Comment on lines +260 to +261
const sourceDir =
process.env[domain.builtinEnv] ?? domain.resolve('default');

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify env-based source selection and whether empty-value behavior is covered by tests.
rg -n "process\\.env\\[domain\\.builtinEnv\\].*\\?\\?|builtinEnv|sourceDir" services/platform/convex/organizations/scaffold.ts
rg -n "empty|unset|BUILTIN_DIR" services/platform/convex/organizations/scaffold.test.ts

Repository: tale-project/tale

Length of output: 1509


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- services/platform/convex/organizations/scaffold.ts (source selection) ----"
sed -n '230,310p' services/platform/convex/organizations/scaffold.ts

echo "---- services/platform/convex/organizations/scaffold.test.ts (env-related tests) ----"
sed -n '1,260p' services/platform/convex/organizations/scaffold.test.ts

Repository: tale-project/tale

Length of output: 10300


Guard against empty *_BUILTIN_DIR values when selecting sourceDir.

process.env[domain.builtinEnv] ?? domain.resolve('default') treats '' as a valid value, so sourceDir can become '' and copyTree will read from an unintended path. scaffold.test.ts only covers the “unset” fallback case (it deletes env vars), not the empty-string case.

Proposed fix
-      const sourceDir =
-        process.env[domain.builtinEnv] ?? domain.resolve('default');
+      const builtinDir = process.env[domain.builtinEnv];
+      const sourceDir =
+        builtinDir && builtinDir.trim().length > 0
+          ? builtinDir
+          : domain.resolve('default');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const sourceDir =
process.env[domain.builtinEnv] ?? domain.resolve('default');
const builtinDir = process.env[domain.builtinEnv];
const sourceDir =
builtinDir && builtinDir.trim().length > 0
? builtinDir
: domain.resolve('default');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/platform/convex/organizations/scaffold.ts` around lines 260 - 261,
The code sets sourceDir using process.env[domain.builtinEnv] ??
domain.resolve('default'), which treats an empty string as valid and can cause
copyTree to read an unintended path; change the selection to treat empty or
whitespace-only env values as unset (e.g., check process.env[domain.builtinEnv]
is non-empty after trim and only then use it, otherwise call
domain.resolve('default')) and update scaffold.test.ts to add a test case where
the *_BUILTIN_DIR env var is set to '' (and to whitespace) to assert the
fallback to domain.resolve('default'); reference the symbols domain.builtinEnv,
domain.resolve('default'), sourceDir, copyTree, and scaffold.test.ts when making
the change.

larryro added 4 commits May 27, 2026 13:13
…edges

Consolidate the 5 per-domain *_BUILTIN_DIR envs into one root-with-derivation
TALE_CONFIG_BUILTIN_DIR, mirroring the existing TALE_CONFIG_DIR/<domain>/
pattern used for the writable side; rename the FS layout
/app/<domain>-builtin/ → /app/builtin/<domain>/ to match
/app/data/<domain>/. One env knob, one source of truth.

Harden scaffold idempotency and observability:

- dirHasFiles now only ignores atomicWrite .tmp orphans; .history/ counts
  as occupied so re-scaffold (e.g. via backfill migration) cannot silently
  write the catalog on top of a user's surviving edit trail.

- env-set-but-missing catalog path now logs a loud console.error with the
  env name and resolved path; was previously a silent zero-seed at
  copyTree's ENOENT branch and invisible to operators on image version
  skew.

Tests cover .history-only target, .tmp orphan retry, and missing-catalog
misconfig.
…fresh → --override

Default tale deploy now docker-cp's host workspace into the convex container
without clobbering files the operator has edited through the UI. Mirrors the
.history-based skip logic the convex entrypoint's seed loops have always
used, so the two write paths apply the same "user edit wins" rule.

Coverage: agents, workflows, providers (config files; encrypted secrets
were already always-preserved), branding. Integrations has no history
mechanism today and stays at the unconditional overwrite — separate work.

Flag rename --fresh → --override: the old name described the convex
entrypoint behavior (force re-seed builtin) but missed that docker cp
always clobbered regardless. New semantics are unified — --override
bypasses .history-based protection on both layers. Encrypted provider
secrets and UI-uploaded skill bundles stay preserved even under --override
(no host source to re-derive from).

Refactor: stageProvidersWithoutConflictingSecrets +
stageSkillsWithoutConflictingBundles collapse into one stageWithSkips
helper driven by a per-reason SkipEntry list. New listContainerHistorySlugs
reuses the same find-in-container pattern. Preserved-message logging now
groups by reason so operators see which protection kicked in and the
correct escape hatch per reason.

tale start --fresh is unchanged — its scope is only the convex entrypoint
(no docker cp from host CWD), and "fresh seed" still describes it
accurately.
The flag's apparent purpose — re-seed builtin configs after an upgrade —
is already automatic: the convex entrypoint's seed marker is version-scoped
(/app/data/.seeded-${TALE_VERSION}), so a new version always re-runs the
seed and picks up new builtin templates, with per-file skip-if-exists
preserving local edits.

What remained was a niche "reset builtin-originated config to factory while
keeping custom files" operation behind a misleadingly-named flag that
silently overwrote local dev edits — the same footgun class just removed
from tale deploy. Operators who want that reset can delete the specific
files (or git checkout) and restart; the underlying FORCE_SEED env stays
available via compose override for the rare power-user case.

Removes the flag, its plumbing through start.ts and the dev compose
generator, and the documented line in all three READMEs.
…override

Replace the per-slug `.history` skip machinery with a two-mode model:

- default `tale deploy`: do not push host config at all. The convex
  container self-seeds builtin defaults on first start and UI edits stay
  authoritative; host config is pushed only on explicit `--override`.
- `tale deploy --override`: overwrite container config from the host
  workspace, always preserving `*.secrets.json` files and `.history/`
  directories. Since `docker cp` is additive, preservation is just
  excluding those paths from the staged copy via an fs.cp filter — no
  container queries needed.

This deletes buildSkipsForDir, HISTORY_SLUG_TO_RELPATH, logPreserved,
findInContainer and the listContainer* helpers, and resolves several
review findings at once: providers' inert `.history` guard, the
unprotected integrations/per-org paths, the skills special-case, and
host-committed `.history/` polluting the container edit trail. Also fixes
the stage temp-dir leak on copy failure (register before I/O).

Drop the FORCE_SEED coupling from `--override`: it re-seeded from the
builtin catalog (a different source than the host workspace) and could
resurrect files deleted locally. `--override` is now purely host-
authoritative. The entrypoint FORCE_SEED branches are left in place
(now dead-on-deploy, harmless).

scaffold:
- fix EACCES double-log: a non-ENOENT stat error no longer also logs a
  false "does not exist" line.
- harden copyTree: agents/providers are flat domains and never recurse
  into subdirs, making the raw-slug cross-tenant leak structurally
  impossible on any source path (catalog or dev fallback), with no
  change to dev ergonomics.
@larryro larryro merged commit 95438c8 into main May 27, 2026
31 of 32 checks passed
@larryro larryro deleted the fix/scaffold-from-builtin-catalog branch May 27, 2026 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant