feat(platform): env-var source for provider API keys (#1711) by larryro · Pull Request #1832 · tale-project/tale

larryro · 2026-06-05T03:38:04Z

Closes #1711.

What

Adds an opt-in secretsEnv field at provider and model level in the public <provider>.json config, so operators whose secrets live in Kubernetes Secrets / Vault / cloud secret managers can inject an LLM provider API key as an environment variable instead of round-tripping it through a *.secrets.json file.

The env-var name is not a secret, so it lives in the committable config (survives tale deploy --override, which leaves *.secrets.json untouched).

Two of the original blockers in the issue already resolved on main: SOPS is optional now (#1672), and the entrypoint syncs all platform env vars to Convex (so process.env.OPENROUTER_API_KEY already reaches Node actions). The "dependent change" to ENV_VARS_TO_SYNC is therefore not needed.

Resolution order

model.secretsEnv → provider.secretsEnv → file modelKeys[id] → file apiKey. Each tier is skipped when it yields nothing, so a configured-but-empty variable falls back to the file.

Strictly additive: env is consulted only when secretsEnv is set, so existing file-only deployments are byte-for-byte unchanged. Env values are trimmed (trailing-newline footgun); file values pass through raw (consistent with the Python loader).

Security — required allowlist

Making env-only providers reachable is a net-new exfiltration primitive: a config-write actor could point secretsEnv at a deployment secret (SOPS_AGE_KEY, BETTER_AUTH_SECRET, …) — all synced into the Node action's process.env — and have it sent as a bearer token to a provider baseUrl (the chat path does not re-check host policy).

Env reads are therefore gated by a required operator allowlist TALE_PROVIDER_SECRET_ENV_ALLOWLIST (comma-separated, empty = locked), enforced identically in TS and Python. Names are capped at 40 chars to match the platform→Convex env-name sync limit.

Recommended follow-up (out of scope here): also call checkProviderHostPolicy(baseUrl) at the chat resolution sites to close a pre-existing file-key gap. The allowlist already neutralizes the env-specific primitive.

Changes

secret_resolver.ts (new, pure): allowlist-gated envSecret / resolveApiKey / providerHasEnvKey / envSecretStatus + warn-once on configured-but-unresolved.
file_actions.ts: loadAllProviders keeps env-only providers (secrets: null); a shared resolveModelApiKey at the 3 resolution sites throws a failover-eligible MissingApiKeyError; the two direct-read paths log genuine decrypt failures instead of swallowing them; listProviders folds provider-level env into hasApiKey and model-level env into per-model hasApiKeyOverride (a keyless sibling model is never falsely unblocked); readProvider returns a tri-state envSecretStatus.
Python providers.py: secrets_env on both dataclasses; _resolve_api_key (3-tier, returns "" not None); all three accessors use it.
Settings UI: provider-level and per-model secretsEnv fields (wired through form / hydration / submit so editing a model never wipes the binding) + an env-source indicator (Set / Not set / Not allowlisted) in the API Key section.
i18n (en/de/fr; de-CH intentionally skipped), docs (providers.md incl. the <name>.config.json → <name>.json filename fix, secrets-with-sops.md, environment-reference.md), and .env.example.

Provenance

This PR was shaped by a two-round, 15-lens adversarial review of the plan before any code was written; the security allowlist, the composer-gate fix, the per-model hydration fix, the 40-char cap, and the docs filename fix all came out of that review.

Tests

TS: secret_resolver.test.ts (allowlist gate, precedence, trim semantics, status), schema secretsEnv cases, MissingApiKeyError failover classification.
Python: test_providers.py (env-only, allowlist locked, env-overrides-file, empty-env fallback, model-env precedence, trim, empty-key contract).
Verified locally: tsc ✓, oxlint ✓, format:check (incl. ruff) ✓, i18n parity ✓, 387 provider-area vitest ✓, 8 pytest ✓, 139 docs tests ✓.

Summary by CodeRabbit

New Features
- Providers can now source API keys from environment variables via new secretsEnv configuration, controlled by the TALE_PROVIDER_SECRET_ENV_ALLOWLIST environment variable allowlist.
- Settings UI updated to display and configure environment variable-based API key sources for providers and individual models.
Documentation
- Updated configuration guides for environment variable API key sourcing (English, German, French).
- Provider configuration file naming updated from <name>.config.json to <name>.json.

coderabbitai · 2026-06-05T03:49:32Z

📝 Walkthrough

Walkthrough

This pull request adds environment-variable allowlist support for provider API keys, addressing the need to source secrets from external systems without writing cleartext files. The implementation includes: a schema contract with secretsEnv fields at provider and model levels validated against a 40-character identifier pattern; allowlist-driven resolution in TypeScript (secret_resolver.ts) and Python (providers.py) with precedence order (model env → provider env → file keys); schema validation with corresponding tests; updated file actions to treat secrets as nullable and apply defensive parsing for env-only providers; a new MissingApiKeyError for failover-eligible scenarios; frontend UI components for editing and displaying env-var status; localization strings in three languages; and comprehensive documentation across configuration guides explaining the allowlist, file format changes (*.config.json → *.json), and process-start vs. hot-reload behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

tale-project/tale#1732: Restructures the provider settings UI in provider-detail-drawer.tsx, overlapping with this PR's changes to the same component's API key section rendering and model form state.
tale-project/tale#1729: Updates provider API-key resolution and fetchConfiguredProviderModels flow in file_actions.ts, directly related to this PR's refactoring of those same functions for env-key support.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 65.52% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main feature: adding environment variable sourcing for provider API keys, with the issue number (`#1711`) for traceability.
Description check	✅ Passed	The PR description comprehensively covers the what, why, security considerations, resolution order, changes, and test verification. It addresses all key aspects expected in a description template for a complex feature.
Linked Issues check	✅ Passed	The PR fulfills issue `#1711`'s core requirements: opt-in env-var fallback for provider API keys with a configurable allowlist, supports Kubernetes/Vault/cloud secret managers, maintains SOPS as primary, and implements resolution order (model env → provider env → file).
Out of Scope Changes check	✅ Passed	All changes are directly aligned with the feature scope: env-var secret resolution, allowlist enforcement, settings UI updates, i18n, documentation, tests, and schema updates. No unrelated refactoring or scope creep detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/provider-secrets-env

Warning

Billing warning: we have not been able to collect payment for this subscription for more than 72 hours. Please update the payment method or pay any pending invoices in Billing to avoid service interruption.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/tale_shared/src/tale_shared/config/providers.py (1)

166-175: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate secretsEnv against the shared provider-config contract.

These assignments accept any JSON string verbatim, but this PR's contract limits secretsEnv to a 40-character identifier. A hand-edited <provider>.json can therefore be rejected by Convex and still be accepted by the Python services, so the same config file resolves differently across runtimes.

Suggested fix

+import re
@@
 _ALLOWLIST_ENV = "TALE_PROVIDER_SECRET_ENV_ALLOWLIST"
+_SECRETS_ENV_RE = re.compile(r"^[A-Z][A-Z0-9_]{0,39}$")
+
+
+def _parse_secrets_env(value: object) -> str | None:
+    if value is None:
+        return None
+    if isinstance(value, str) and _SECRETS_ENV_RE.fullmatch(value):
+        return value
+    logger.warning("Ignoring invalid secretsEnv value: %r", value)
+    return None
@@
                     description=m.get("description", ""),
                     dimensions=m.get("dimensions"),
-                    secrets_env=m.get("secretsEnv"),
+                    secrets_env=_parse_secrets_env(m.get("secretsEnv")),
                 )
             )
@@
                 supports_structured_outputs=data.get("supportsStructuredOutputs", False),
                 api_key=api_key,
                 defaults=defaults,
-                secrets_env=data.get("secretsEnv"),
+                secrets_env=_parse_secrets_env(data.get("secretsEnv")),
             )
         )

Also applies to: 190-200

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/tale_shared/src/tale_shared/config/providers.py` around lines 166 -
175, The ModelConfig construction currently accepts any JSON value for
secretsEnv; validate this value before assigning it to ModelConfig.secrets_env:
if m.get("secretsEnv") is present ensure it is a string of exactly 40 characters
and matches the allowed identifier pattern (e.g. only alphanumeric plus -/_ as
your contract requires), otherwise raise a ValueError (or skip/transform per
project convention). Update the same validation in the other model-parsing block
referenced (the code around the other ModelConfig creation at lines ~190-200) so
both paths enforce the shared provider-config contract; refer to ModelConfig and
the secrets_env/secretsEnv keys to find and change the two locations.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/tale_shared/src/tale_shared/config/providers.py`:
- Around line 54-80: _env_secret currently logs a warning every time a
secretsEnv lookup is unresolved which floods logs; change it to warn only once
per unique (name, reason) combination: add a module-level tracking set (e.g.
_env_secret_warnings: set[tuple[str, str]] = set()) and, inside _env_secret,
before each logger.warning call (the "not in allowlist" warning and the
"empty/unset" warning), check if (name, "not_allowlisted") or (name, "empty") is
in that set and only call logger.warning and then add the tuple if it wasn't
present; keep the rest of _env_secret logic unchanged so
get_chat_model/get_embedding_model/get_vision_model get the same fallback
behavior but without per-request log spam.

In `@services/platform/convex/providers/file_actions.ts`:
- Around line 1949-1969: The probe loop currently resolves per-model credentials
but always targets config.baseUrl; update the probe dispatch to honor the
per-model endpoint by using the model-level override when present (i.e., use
model.baseUrl ?? config.baseUrl) wherever probes are constructed or invoked (the
block iterating config.models that calls resolveApiKey and subsequently
dispatches probes), ensuring consistency with resolveModelData/resolveModelByTag
which already use definition.baseUrl ?? provider.config.baseUrl; adjust any
probe helper calls and the endpoint passed to them so each model is probed
against its effective baseUrl.
- Around line 291-304: The console.warn in the env-fallback branch currently
logs raw decrypt/parse error text via the local reason variable which may leak
secrets; update the branch around providerHasEnvKey(result.data) so it does not
emit the raw reason — either call the existing sanitization helper used
elsewhere in this file (e.g., mask/sanitize function used for decrypt/parse
errors) and log that sanitizedReason, or remove reason from the warn and log
only a generic message referencing providerName and that the secrets file could
not be read; update the console.warn invocation and any local variables (reason)
accordingly to ensure no secret fragments are written to logs.

In `@services/platform/lib/shared/schemas/providers.ts`:
- Around line 488-502: The EnvSecretStatus interface lacks runtime validation;
define a Zod schema (e.g., envSecretStatusSchema) that matches EnvSecretStatus
(fields: name?: string, allowlisted: boolean, resolved: boolean), export it
alongside the interface, and then validate backend action responses in the
client code (e.g., in queries.ts) using envSecretStatusSchema.safeParse() and
handle parse failures (throw/log or return a safe fallback) before casting/using
the data.

---

Outside diff comments:
In `@packages/tale_shared/src/tale_shared/config/providers.py`:
- Around line 166-175: The ModelConfig construction currently accepts any JSON
value for secretsEnv; validate this value before assigning it to
ModelConfig.secrets_env: if m.get("secretsEnv") is present ensure it is a string
of exactly 40 characters and matches the allowed identifier pattern (e.g. only
alphanumeric plus -/_ as your contract requires), otherwise raise a ValueError
(or skip/transform per project convention). Update the same validation in the
other model-parsing block referenced (the code around the other ModelConfig
creation at lines ~190-200) so both paths enforce the shared provider-config
contract; refer to ModelConfig and the secrets_env/secretsEnv keys to find and
change the two locations.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 701c7655-0a26-47a1-a7a0-3e6df21c8af5

📥 Commits

Reviewing files that changed from the base of the PR and between 296841c and 6110776.

📒 Files selected for processing (26)

.env.example
docs/de/self-hosted/configuration/environment-reference.md
docs/de/self-hosted/configuration/providers.md
docs/de/self-hosted/configuration/secrets-with-sops.md
docs/en/self-hosted/configuration/environment-reference.md
docs/en/self-hosted/configuration/providers.md
docs/en/self-hosted/configuration/secrets-with-sops.md
docs/fr/self-hosted/configuration/environment-reference.md
docs/fr/self-hosted/configuration/providers.md
docs/fr/self-hosted/configuration/secrets-with-sops.md
packages/tale_shared/src/tale_shared/config/providers.py
packages/tale_shared/tests/test_providers.py
services/docs/app/content/frontmatter.json
services/platform/app/features/settings/providers/components/provider-detail-drawer.tsx
services/platform/app/features/settings/providers/components/provider-edit-panel.tsx
services/platform/app/features/settings/providers/hooks/queries.ts
services/platform/convex/providers/errors.test.ts
services/platform/convex/providers/errors.ts
services/platform/convex/providers/file_actions.ts
services/platform/convex/providers/secret_resolver.test.ts
services/platform/convex/providers/secret_resolver.ts
services/platform/lib/shared/schemas/providers.test.ts
services/platform/lib/shared/schemas/providers.ts
services/platform/messages/de.json
services/platform/messages/en.json
services/platform/messages/fr.json

coderabbitai · 2026-06-05T03:49:35Z

+def _env_secret(name: str | None) -> str | None:
+    """Resolve an env-var name to its trimmed value, honoring the allowlist.
+
+    Returns None when the name is missing, not allowlisted, or the env var is
+    empty/whitespace. Trailing-newline normalization (a common Vault/k8s
+    injection footgun) is applied to env values here.
+    """
+    if not name:
+        return None
+    raw = os.environ.get(_ALLOWLIST_ENV, "")
+    allowlist = {n.strip() for n in raw.split(",") if n.strip()}
+    if name not in allowlist:
+        logger.warning(
+            "secretsEnv %r is not in %s (empty allowlist disables the env key "
+            "source) — falling back to the secrets file",
+            name,
+            _ALLOWLIST_ENV,
+        )
+        return None
+    value = os.environ.get(name, "").strip()
+    if not value:
+        logger.warning(
+            "secretsEnv %r is set but the env var is empty/unset — falling back to the secrets file",
+            name,
+        )
+        return None
+    return value


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Don't warn on every unresolved secretsEnv lookup.

get_chat_model(), get_embedding_model(), and get_vision_model() call through this resolver on runtime paths, so a configured-but-empty or non-allowlisted env var will emit one warning per request/job even when the file fallback is working. The TypeScript side already uses warn-once behavior; the Python side should do the same to avoid log flooding.

Suggested fix

_ALLOWLIST_ENV = "TALE_PROVIDER_SECRET_ENV_ALLOWLIST" +_WARNED_UNRESOLVED: set[tuple[str, str]] = set() @@ def _env_secret(name: str | None) -> str | None: @@ if name not in allowlist: - logger.warning( - "secretsEnv %r is not in %s (empty allowlist disables the env key " - "source) — falling back to the secrets file", - name, - _ALLOWLIST_ENV, - ) + warning_key = (name, "not_allowlisted") + if warning_key not in _WARNED_UNRESOLVED: + _WARNED_UNRESOLVED.add(warning_key) + logger.warning( + "secretsEnv %r is not in %s (empty allowlist disables the env key " + "source) — falling back to the secrets file", + name, + _ALLOWLIST_ENV, + ) return None value = os.environ.get(name, "").strip() if not value: - logger.warning( - "secretsEnv %r is set but the env var is empty/unset — falling back to the secrets file", - name, - ) + warning_key = (name, "empty_or_unset") + if warning_key not in _WARNED_UNRESOLVED: + _WARNED_UNRESOLVED.add(warning_key) + logger.warning( + "secretsEnv %r is set but the env var is empty/unset — falling back to the secrets file", + name, + ) return None

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/tale_shared/src/tale_shared/config/providers.py` around lines 54 - 80, _env_secret currently logs a warning every time a secretsEnv lookup is unresolved which floods logs; change it to warn only once per unique (name, reason) combination: add a module-level tracking set (e.g. _env_secret_warnings: set[tuple[str, str]] = set()) and, inside _env_secret, before each logger.warning call (the "not in allowlist" warning and the "empty/unset" warning), check if (name, "not_allowlisted") or (name, "empty") is in that set and only call logger.warning and then add the tuple if it wasn't present; keep the rest of _env_secret logic unchanged so get_chat_model/get_embedding_model/get_vision_model get the same fallback behavior but without per-request log spam.

coderabbitai · 2026-06-05T03:49:35Z

    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
-      // ENOENT on the secrets file means the operator has a provider
-      // config but no API key yet — the common "I just created an org"
-      // case. Classify so the UI can point at Settings → Providers.
+      // No usable secrets file. Before skipping, check whether the config can
+      // resolve a key from the environment (issue #1711): provider-level
+      // `secretsEnv` or any model-level `secretsEnv`, gated by the operator
+      // allowlist. If so, keep the provider with `secrets: null` — the
+      // per-resolution `resolveApiKey` reads the env value. Otherwise skip as
+      // before. ENOENT (the common "config but no key yet" case) drives the
+      // UI's Settings → Providers hint.
+      if (providerHasEnvKey(result.data)) {
+        console.warn(
+          `Provider "${providerName}": no secrets file, using env key source.`,
+          reason,
+        );


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize secrets-read failures before logging them.

This new env-fallback branch still logs the raw decrypt/parse error text. Those failures are already sanitized elsewhere in this file because they can carry secret material, so emitting reason here can leak key fragments into server logs.

Suggested fix

} catch (err) { const reason = err instanceof Error ? err.message : String(err); + const safeReason = sanitizeError(err); // No usable secrets file. Before skipping, check whether the config can // resolve a key from the environment (issue `#1711`): provider-level // `secretsEnv` or any model-level `secretsEnv`, gated by the operator @@ if (providerHasEnvKey(result.data)) { console.warn( `Provider "${providerName}": no secrets file, using env key source.`, - reason, + safeReason, ); providers.push({ name: providerName, config: result.data, secrets: null,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/platform/convex/providers/file_actions.ts` around lines 291 - 304, The console.warn in the env-fallback branch currently logs raw decrypt/parse error text via the local reason variable which may leak secrets; update the branch around providerHasEnvKey(result.data) so it does not emit the raw reason — either call the existing sanitization helper used elsewhere in this file (e.g., mask/sanitize function used for decrypt/parse errors) and log that sanitizedReason, or remove reason from the warn and log only a generic message referencing providerName and that the secrets file could not be read; update the console.warn invocation and any local variables (reason) accordingly to ensure no secret fragments are written to logs.

coderabbitai · 2026-06-05T03:49:35Z

    for (const model of config.models) {
-      const apiKey = secrets.modelKeys?.[model.id] ?? secrets.apiKey;
+      // Resolve the per-model key (env source preferred). On nothing resolved,
+      // skip this model with a clear reason BEFORE any probe dispatch — keeps
+      // empty keys out of the probe helpers and the apiKey-keyed listingCache,
+      // and never throws the whole action for a partially-configured provider.
+      const apiKey = resolveApiKey({
+        modelSecretsEnv: model.secretsEnv,
+        providerSecretsEnv: config.secretsEnv,
+        fileModelKey: secrets?.modelKeys?.[model.id],
+        fileApiKey: secrets?.apiKey,
+      });
+      if (!apiKey) {
+        skipped.push({
+          modelId: model.id,
+          reason:
+            model.secretsEnv || config.secretsEnv
+              ? 'No API key resolved (secretsEnv unset/empty or not allowlisted, and no file key)'
+              : 'No API key configured',
+        });
+        continue;
+      }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor model.baseUrl when probing each model.

This loop now resolves credentials per model, but every probe still targets config.baseUrl. Models with a baseUrl override will be tested against the wrong endpoint, even though runtime resolution in resolveModelData / resolveModelByTag uses definition.baseUrl ?? provider.config.baseUrl.

Suggested fix

for (const model of config.models) { + const baseUrl = model.baseUrl ?? config.baseUrl; + checkProviderHostPolicy(baseUrl); + // Resolve the per-model key (env source preferred). On nothing resolved, // skip this model with a clear reason BEFORE any probe dispatch — keeps // empty keys out of the probe helpers and the apiKey-keyed listingCache, // and never throws the whole action for a partially-configured provider. @@ if (isChat) { probes.push( runProbe( - buildProbeUrl(config.baseUrl, 'chat/completions'), + buildProbeUrl(baseUrl, 'chat/completions'), apiKey, { ...mergedProviderOptions, @@ } else if (isEmbedding) { probes.push( runProbe( - buildProbeUrl(config.baseUrl, 'embeddings'), + buildProbeUrl(baseUrl, 'embeddings'), apiKey, { ...mergedProviderOptions, model: model.id, input: 'hi' }, model.id, @@ } else if (isTranscription) { - probes.push(runTranscriptionProbe(config.baseUrl, apiKey, model.id)); + probes.push(runTranscriptionProbe(baseUrl, apiKey, model.id)); } else if (isTextToSpeech) { @@ probes.push( runTtsProbe( - config.baseUrl, + baseUrl, apiKey, model.id, probeVoice, @@ } else if (isImageGeneration) { @@ probes.push( - runImageListingProbe(config.baseUrl, apiKey, model.id, listingCache), + runImageListingProbe(baseUrl, apiKey, model.id, listingCache), );

Also applies to: 1987-2057

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/platform/convex/providers/file_actions.ts` around lines 1949 - 1969, The probe loop currently resolves per-model credentials but always targets config.baseUrl; update the probe dispatch to honor the per-model endpoint by using the model-level override when present (i.e., use model.baseUrl ?? config.baseUrl) wherever probes are constructed or invoked (the block iterating config.models that calls resolveApiKey and subsequently dispatches probes), ensuring consistency with resolveModelData/resolveModelByTag which already use definition.baseUrl ?? provider.config.baseUrl; adjust any probe helper calls and the endpoint passed to them so each model is probed against its effective baseUrl.

coderabbitai · 2026-06-05T03:49:35Z

+
+/**
+ * Resolution status of a `secretsEnv` name for the settings UI (issue #1711).
+ * Carries no key material — only whether the name is configured, allowlisted,
+ * and currently resolves. Single canonical shape shared by the server
+ * (`readProvider` / `secret_resolver.ts`) and the provider settings UI.
+ */
+export interface EnvSecretStatus {
+  /** The configured env-var name, if any. */
+  name?: string;
+  /** Whether `name` appears in `TALE_PROVIDER_SECRET_ENV_ALLOWLIST`. */
+  allowlisted: boolean;
+  /** Whether the env var currently resolves to a non-empty value. */
+  resolved: boolean;
+}


🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider using a Zod schema for runtime validation.

EnvSecretStatus is defined as a plain TypeScript interface rather than a Zod schema. Since this is consumed from the backend action response (which declares v.any()), there's no runtime validation of the shape. While acceptable for display-only data, a Zod schema would provide runtime type safety.

♻️ Optional: Define as Zod schema

-export interface EnvSecretStatus { - /** The configured env-var name, if any. */ - name?: string; - /** Whether `name` appears in `TALE_PROVIDER_SECRET_ENV_ALLOWLIST`. */ - allowlisted: boolean; - /** Whether the env var currently resolves to a non-empty value. */ - resolved: boolean; -} +export const envSecretStatusSchema = z.object({ + /** The configured env-var name, if any. */ + name: z.string().optional(), + /** Whether `name` appears in `TALE_PROVIDER_SECRET_ENV_ALLOWLIST`. */ + allowlisted: z.boolean(), + /** Whether the env var currently resolves to a non-empty value. */ + resolved: z.boolean(), +}); + +export type EnvSecretStatus = z.infer<typeof envSecretStatusSchema>;

Then validate in queries.ts with envSecretStatusSchema.safeParse() when casting the action response.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/platform/lib/shared/schemas/providers.ts` around lines 488 - 502, The EnvSecretStatus interface lacks runtime validation; define a Zod schema (e.g., envSecretStatusSchema) that matches EnvSecretStatus (fields: name?: string, allowlisted: boolean, resolved: boolean), export it alongside the interface, and then validate backend action responses in the client code (e.g., in queries.ts) using envSecretStatusSchema.safeParse() and handle parse failures (throw/log or return a safe fallback) before casting/using the data.

Add an opt-in `secretsEnv` field at provider and model level in the public provider config, so operators with Kubernetes Secrets / Vault / cloud secret managers can inject an LLM provider API key as an environment variable instead of round-tripping it through a *.secrets.json file. Resolution order (each tier tested for a non-empty value, else falls through): model `secretsEnv` -> provider `secretsEnv` -> file `modelKeys[id]` -> file `apiKey`. Strictly additive: env is consulted only when `secretsEnv` is set, so existing file-only deployments are unchanged. Env values are trimmed; file values pass through raw (byte-identical to before, and consistent with the Python loader). Security: env reads are gated by a required operator allowlist `TALE_PROVIDER_SECRET_ENV_ALLOWLIST` (comma-separated, empty = locked), enforced identically in TS and Python. Without it, making env-only providers reachable would let a config-write actor point `secretsEnv` at a deployment secret (SOPS_AGE_KEY, BETTER_AUTH_SECRET, ...) and exfiltrate it as a bearer token via a provider base URL. Names are capped at 40 chars to match the platform->Convex env-name sync limit. - secret_resolver.ts: pure, allowlist-gated envSecret/resolveApiKey/ providerHasEnvKey/envSecretStatus + warn-once on configured-but-unresolved. - file_actions.ts: loadAllProviders keeps env-only providers (secrets: null); shared resolveModelApiKey throws a failover-eligible MissingApiKeyError; the two direct-read paths log real decrypt failures instead of swallowing them; listProviders folds provider-level env into hasApiKey and model-level env into per-model hasApiKeyOverride (so a keyless sibling model is never falsely unblocked); readProvider returns a tri-state envSecretStatus. - Python providers.py: secrets_env on both dataclasses; _resolve_api_key (3-tier, returns "" not None); all three accessors use it. - Settings UI: provider-level and per-model secretsEnv fields (wired through form/hydration/submit so edits never wipe the binding) + an env-source indicator in the API Key section. - i18n (en/de/fr), docs (providers.md incl. the <name>.config.json -> <name>.json filename fix, secrets-with-sops.md, environment-reference.md), and .env.example.

…ver modules

…t allowlist (#1711) Replace the operator-managed `TALE_PROVIDER_SECRET_ENV_ALLOWLIST` (empty = locked) with a hardcoded reserved prefix `TALE_PROVIDER_KEY_`. A `secretsEnv` name must carry the prefix; any other name is rejected fail-closed, so it can never point at a deployment secret (SOPS_AGE_KEY, BETTER_AUTH_SECRET, ...) and have it sent as a bearer token to a provider's baseUrl. This drops the deployment-config step entirely — operators just name their key vars under the namespace. - schemas/providers.ts: add SECRETS_ENV_PREFIX / SECRETS_ENV_REGEX, tighten secretsEnv validation; rename EnvSecretStatus.allowlisted -> allowed - secret_resolver.ts: gate envSecret/envSecretStatus on the prefix; keep all three derivations (TS resolver, zod schema, Python loader) in sync - tale_shared/config/providers.py: prefix check replaces allowlist parsing - errors.ts / file_actions.ts: update unresolved-key messaging - provider settings UI: move the secretsEnv editor out of the edit panel into the API Key dialog as a file/env key-source chooser; badge "Not allowlisted" -> "Invalid prefix" - messages (en/de/fr) + self-hosted docs (en/de/fr): document the prefix - add packages/tale_shared/uv.lock

Declare secretsEnv: TALE_PROVIDER_KEY_OPENROUTER on the default OpenRouter example provider so its API key can be supplied via the reserved-prefix deployment env var from #1711 instead of a SOPS secrets file.

…ned API-key i18n keys The prefix-gating refactor (5d8749b) tightened the secretsEnv schema to require the reserved TALE_PROVIDER_KEY_ prefix and made the provider API-key dialog title generic, but missed two follow-ups that broke the Unit job: - providers.test.ts still asserted pre-prefix names (OPENROUTER_API_KEY, MODEL_KEY, _KEY) were accepted. Rewrite the secretsEnv block to validate prefix gating: accept prefixed names, reject unprefixed / bare-prefix / bad-char names, and exercise the 40-char cap at both boundaries. - settings.providers.addApiKey / replaceApiKey are no longer referenced by source (the dialog now uses providers.apiKey). Remove them from en/de/fr to clear the orphan-key i18n check.

…failover, env-key UX) Two-round review of feat/provider-secrets-env surfaced correctness fixes that survived adversarial verification. Addresses the High + Medium issues plus the genuine Low defects (pre-existing ones folded in): - cascade (High): org deletion ran four ~6K delete sweeps in one mutation, which could cross the per-mutation write budget on a large org, throw, and leave the org permanently un-deletable (GDPR Art 17). cascadeOnOrgDeleted now schedules a self-rescheduling drain that erases at most one table per pass (6000 rows for pure-DB tables, 3000 for storage-bearing ones), guaranteeing completion. Removes the now-dead memories+prefs-only continuation. - providers (Medium): resolveModelByTag threw on the first keyless tag-match instead of trying a sibling model that holds the env key keeping the provider loaded — terminal on the no-failover transcription/TTS paths. It now skips keyless candidates and throws the failover-eligible MissingApiKeyError only when none resolve. Extracted as a pure, tested selectModelByTag. - providers UI (Medium): gate the model dialog Save on secretsEnv validity, mirroring the provider dialog, so an invalid env-var name can't reach the server and fail with an opaque toast. - notifications (Low): unreadCount now enforces org membership like its list/markRead siblings — it previously leaked an arbitrary org's count. - tale_shared (Low): _env_secret tolerates a non-string secretsEnv from a hand-edited config, degrading to the file key instead of raising. - cascade test (Low): fake ctx keys rows by (table, index) so the prefs delete path is actually exercised; adds one-table-per-pass / cap / ordering / termination coverage. - docs (Low): correct the stale `<name>.config.json` provider filename in the en/de/fr overview pages.

…ret UI The API Key dialog presented the env-var override and stored file secret as a mutually-exclusive radio, but the resolver treats them as layers (env shadows file; file survives as the fallback). That mismatch silently ignored a file key entered while an env var resolved — with a false success toast — and hid a shadowed key that then resurfaced on env-clear. Drop the radio: show an effective-source banner (new pure helper computeEffectiveKeyState, mirroring the resolver precedence) plus two independent rows — stored key (replace) and env override (set/clear). The read-only card now always surfaces a stored key, tagged as a fallback when an env var shadows it. UI + i18n only; no backend change. Removing only the stored key is deferred (no clearProviderSecret action yet). Remove orphaned keySource*/replaceApiKeyDescription i18n keys; add effectiveSource*/storedKey*/setEnvVar across en/de/fr.

coderabbitai Bot requested changes Jun 5, 2026

View reviewed changes

larryro added 4 commits June 5, 2026 15:51

chore(platform): regenerate api.d.ts for auth_policy and secret_resol…

01a2b5f

…ver modules

feat(platform): wire OpenRouter example to env-var key source

e3364ec

Declare secretsEnv: TALE_PROVIDER_KEY_OPENROUTER on the default OpenRouter example provider so its API key can be supplied via the reserved-prefix deployment env var from #1711 instead of a SOPS secrets file.

larryro force-pushed the feat/provider-secrets-env branch from e2648b8 to e3364ec Compare June 5, 2026 07:51

larryro added 3 commits June 5, 2026 16:22

larryro merged commit 34b7f39 into main Jun 5, 2026
30 checks passed

larryro deleted the feat/provider-secrets-env branch June 5, 2026 14:05

yannickmonney mentioned this pull request Jun 10, 2026

test(platform): add Playwright E2E suite with mock LLM and CI workflow #1860

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(platform): env-var source for provider API keys (#1711)#1832

feat(platform): env-var source for provider API keys (#1711)#1832
larryro merged 7 commits into
mainfrom
feat/provider-secrets-env

larryro commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 5, 2026

Walkthrough

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Uh oh!

coderabbitai Bot Jun 5, 2026

Uh oh!

coderabbitai Bot Jun 5, 2026

Uh oh!

coderabbitai Bot Jun 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

larryro commented Jun 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Resolution order

Security — required allowlist

Changes

Provenance

Tests

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 5, 2026

Walkthrough

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

larryro commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading