feat(cli): add Tale CLI with blue-green deployment support by larryro · Pull Request #301 · tale-project/tale

larryro · 2026-01-29T08:32:51Z

Summary

Add @tale/cli - a self-contained Bun CLI tool for managing Tale deployments
Implements blue-green deployment strategy with zero-downtime updates
Binary output: tale (cross-platform: Linux x64, macOS ARM64, Windows x64)

New Command Structure

tale deploy <version>        # Deploy a new version
tale deploy rollback         # Rollback to previous version
tale deploy status           # Show deployment status
tale deploy cleanup          # Remove inactive containers
tale deploy reset            # Reset all containers
tale deploy logs <service>   # View service logs

Features

Blue-green deployments - Zero-downtime with automatic traffic switching
Dry-run mode - Preview deployments before executing
Per-service updates - Deploy specific services with --services
Health checks - Wait for containers to be healthy before switching
Rollback support - Quick rollback to previous versions
Lock mechanism - Prevent concurrent deployments

Architecture

src/commands/deploy/ - Deploy command group
src/lib/compose/ - Docker Compose YAML generation
src/lib/docker/ - Docker CLI operations
src/lib/state/ - Deployment state management
src/utils/ - Logging and environment loading

Test plan

tale --help shows CLI commands
tale deploy --help shows deploy subcommands
tale deploy status shows container status
tale deploy 1.0.0 --dry-run simulates deployment
tale deploy logs db --tail 5 shows container logs
TypeScript typecheck passes


Summary by CodeRabbit

New Features
- Introduced Tale CLI for deployment management with commands: deploy, rollback, status, logs, cleanup, and reset
- Added blue-green deployment orchestration for zero-downtime updates
- Added loading page display during deployment transitions
Documentation
- Added comprehensive CLI documentation with usage examples and command reference
- Updated deployment guidance to use the new tale-deploy CLI tool
Chores
- Removed legacy deployment automation components
- Streamlined deployment infrastructure with updated proxy configuration


Create a new deployment CLI that generates Docker Compose configs inline with security-hardened settings. Only ports 80/443 are exposed in production.

Features:
- Blue-green deployment with zero-downtime
- Rollback to previous version
- Status monitoring
- Cleanup of inactive containers
- Reset functionality

The CLI compiles to a single binary for easy server deployment.

Refs #294

Build the tale-deploy binary for Linux x64 and upload to GitHub Releases.

Triggers:
- Manual dispatch with optional release tag
- Push to main when tools/deploy/ changes

Refs #294

Add clear warning that compose.yml exposes ports for development only. Production deployments should use the tale-deploy CLI. Refs #294

Remove files replaced by the new tale-deploy CLI:
- compose.blue.yml
- compose.green.yml
- scripts/deploy.sh

Refs #294

- Fix network aliases format (use object instead of array)
- Fix external volume declaration (remove driver when external)
- Write compose file to deploy dir for correct env_file resolution
- Add ensureVolumes and ensureNetwork to create resources before deploy

Refs #294

…lify proxy

- Add --dry-run flag to deploy and reset commands for previewing changes
- Add --services flag to deploy specific services without full blue-green switch
- Add in-place update mode when deploying individual services
- Add logs command for viewing service container logs
- Add --version flag to rollback for targeting a specific version
- Rename --update-stateful/--include-stateful to -a/--all
- Simplify Caddyfile to route via single platform DNS alias
- Add service type guards and ALL_SERVICES constant

Split monolithic docker/client.ts and docker/health.ts into focused single-responsibility modules (container, exec, network, volume, pull-image, wait-for-healthy, check-http-health, image-exists). Rename command exports from xxxCommand to plain names. Add mac and windows build targets.

…-purpose files

Break down monolithic modules (state/lock.ts, state/deployment.ts, compose/services/*, utils/env.ts) into focused single-function files following the same pattern applied to docker modules. Update all command imports accordingly.

…neric aliases

…allback

…rollback

…ealthy

Build Linux x64, macOS ARM64, and Windows x64 binaries in CI instead of only Linux. Each platform binary is uploaded as a separate artifact and attached to releases.

Now that CI builds for all platforms, replace string template paths with path.join to handle platform-specific path separators correctly.
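As a rough illustration of the separator issue (not the CLI's actual code), the sketch below builds the same path with `path.posix.join` and `path.win32.join` so the difference is visible on any host. The helper name `stateFilePath` and the `.tale/state.json` location are made up for the example.

```typescript
import * as path from "node:path";

type Join = (...parts: string[]) => string;

// Hypothetical helper: the name and the ".tale/state.json" location are illustrative only.
// `join` defaults to path.join, which uses the separator of the current platform.
function stateFilePath(deployDir: string, join: Join = path.join): string {
  return join(deployDir, ".tale", "state.json");
}

// Force each platform's flavor so the result is the same wherever this runs.
const posixPath = stateFilePath("/srv/tale", path.posix.join);
const windowsPath = stateFilePath("C:\\tale", path.win32.join);

console.log(posixPath); // /srv/tale/.tale/state.json
console.log(windowsPath); // C:\tale\.tale\state.json
```

A template string such as `${deployDir}/.tale/state.json` would hardcode `/` and yield mixed separators for a Windows-style `deployDir`, which is the case `path.join` avoids.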

…bcommand

Restructure the deployment CLI to be a general-purpose Tale CLI tool:
- Rename package from @tale/deploy to @tale/cli
- Move tools/deploy/ to tools/cli/
- Change binary output from tale-deploy-* to tale
- Nest deploy commands under `tale deploy` subcommand:
  - tale deploy <version> - Deploy a new version
  - tale deploy rollback - Rollback to previous version
  - tale deploy status - Show deployment status
  - tale deploy cleanup - Remove inactive containers
  - tale deploy reset - Reset all containers
  - tale deploy logs - View service logs
- Reorganize source structure:
  - src/commands/deploy/ - Deploy command group
  - src/lib/compose/ - Docker Compose generation
  - src/lib/docker/ - Docker operations
  - src/lib/state/ - State management
- Update CI workflow to build tale binary with platform suffixes
- Add tools/cli to npm workspaces

This structure enables future command groups (tale db, tale config, etc.)

coderabbitai · 2026-01-29T08:43:28Z

Walkthrough

This pull request replaces the shell-based blue-green deployment orchestration with a comprehensive TypeScript CLI tool. It removes the legacy scripts/deploy.sh and Docker Compose overlay files (compose.blue.yml, compose.green.yml), introduces a new tools/cli package with Bun and Commander providing deploy, rollback, status, logs, cleanup, and reset commands. The CLI includes Docker Compose generators for different deployment scenarios, state management (color tracking, version history, deployment locking), comprehensive Docker operations, and health check utilities. Supporting changes include a GitHub Actions workflow for multi-platform CLI builds, package.json workspace additions, and updated proxy documentation.
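To make the color-tracking idea concrete, here is a minimal sketch of how a blue-green tool might pick the target color and derive container names. It is an illustration under assumptions, not the PR's code: the `<project>-<service>-<color>` naming scheme, the helper names, and the service list are all hypothetical.

```typescript
type Color = "blue" | "green";

// Assumed naming scheme: <project>-<service>-<color>. The real CLI may differ.
function containerName(project: string, service: string, color: Color): string {
  return `${project}-${service}-${color}`;
}

// The next deployment targets the color that is not serving traffic.
// With no active color yet (first deploy), this sketch starts on blue.
function targetColor(active: Color | null): Color {
  return active === "blue" ? "green" : "blue";
}

// Hypothetical service list; the PR's ROTATABLE_SERVICES constant is not shown here.
const rotatable = ["platform", "rag"] as const;

const active: Color = "blue";
const next = targetColor(active);
const toStart = rotatable.map((service) => containerName("tale", service, next));
const toDrain = rotatable.map((service) => containerName("tale", service, active));

console.log(toStart); // [ 'tale-platform-green', 'tale-rag-green' ]
console.log(toDrain); // [ 'tale-platform-blue', 'tale-rag-blue' ]
```

The containers in `toStart` would be brought up and health-checked first; only after traffic has switched would the ones in `toDrain` be stopped and removed.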

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(deployment): implement zero-downtime blue-green deployment #60 — Implements the prior blue-green deployment strategy using shell scripts and Docker Compose overlays that this PR replaces with a TypeScript CLI implementation
feat: git tag-based version management and release workflow #163 — Updates deployment scripts and compose files to add VERSION and release logic, addressing similar deployment tooling artifacts that this PR modifies
feat(deployment): implement zero-downtime blue-green deployment #50 — Removes the blue-green deployment overlay files, which this PR also eliminates while providing equivalent functionality via the new CLI tool


coderabbitai

Actionable comments posted: 27

🤖 Fix all issues with AI agents

In @.github/workflows/cli.yml:
- Around line 96-108: The workflow step "Upload to latest release" currently
runs on every push and uses gh release upload with --clobber; change it to only
run for tagged pushes (e.g., update the step condition to check github.ref
startsWith 'refs/tags/' or require a provided release_tag input) and stop using
--clobber so published assets aren’t overwritten (update the LATEST_TAG logic to
use the tag from github.ref or the provided release_tag and call gh release
upload without --clobber, failing the job on upload errors instead of silently
echoing).

In `@compose.yml`:
- Around line 7-8: Replace the incorrect CLI reference "tale-deploy deploy
<version>" with the correct binary invocation "tale deploy <version>" wherever
it appears in the compose.yml (including the other occurrence noted in the
comment), ensuring all documentation strings and examples use "tale deploy
<version>" consistently.

In `@tools/cli/README.md`:
- Around line 79-86: The markdown tables in tools/cli/README.md have
inconsistent pipe spacing in their separator rows; update each table's separator
line to use a consistent format (e.g., "| --- | --- |") and normalize spacing
around pipes for the rows that list options like `-a, --all`, `-s, --services
<list>`, `--dry-run`, `-d, --dir <path>`, and `--host <hostname>` so
markdownlint MD060 passes; apply the same normalization to the other table
blocks referenced (lines near the other ranges) to ensure all tables use
identical separator formatting and alignment.

In `@tools/cli/src/commands/deploy/cleanup.ts`:
- Around line 32-47: The loop only removes containers when
isContainerRunning(...) is true; change it so containers for each
ROTATABLE_SERVICES inactiveColor are removed regardless of running state:
compute containerName as before, optionally attempt stopContainer(containerName)
if isContainerRunning returns true (or always attempt but ignore errors), then
always call removeContainer(containerName) (since it uses docker rm -f) and
handle warnings if removeContainer returns false; update references to
isContainerRunning, stopContainer, removeContainer, ROTATABLE_SERVICES,
containerName, and cleaned to ensure cleaned increments when a container was
actually removed.
- Around line 36-46: The cleaned counter is incremented unconditionally even
when stopContainer or removeContainer fails; update the cleanup logic around the
running check so that cleaned++ only runs when the cleanup actually succeeded
(e.g., require both stopped and removed be true, or at minimum removed be true)
— modify the block referencing running, stopContainer(containerName),
removeContainer(containerName), and cleaned so you move the increment inside an
if that checks the boolean results and leave the warning logs as-is when either
operation fails.

In `@tools/cli/src/commands/deploy/deploy.ts`:
- Around line 151-163: Extract the duplicated volume/network setup into a helper
(e.g., ensureInfrastructure) and replace the two inline blocks in deploy (the
in-place path and the blue-green path) to call it; the helper should take
projectName, dryRun, and prefix, declare the requiredVolumes array
(["platform-convex-data","caddy-data","rag-data"]), log dry-run actions via
logger when dryRun is true, and otherwise call ensureVolumes(projectName,
requiredVolumes) and ensureNetwork(projectName, "internal") and throw the same
errors if those calls return falsy, so ensureVolumes, ensureNetwork,
requiredVolumes, dryRun, and prefix are referenced consistently.
- Around line 94-101: The current sequential loop that calls pullImage for each
entry in imagesToPull should be changed to run pulls in parallel: replace the
for-await loop inside deploy.ts (the imagesToPull handling) with a Promise.all
over imagesToPull.map(image => pullImage(image)) and then check results to throw
a single aggregated Error if any pull failed; alternatively, to avoid unbounded
concurrency use a concurrency limiter (e.g., p-limit) to wrap pullImage calls
before Promise.all. Ensure you reference the original pullImage function and
imagesToPull variable and preserve existing error behavior by including which
images failed in the thrown Error.
- Around line 293-300: When stopping/removing old color containers during drain
(the loop over rotatableToUpdate that calls stopContainer and removeContainer
for containerName built from env.PROJECT_NAME, service, currentColor), wrap each
stopContainer/removeContainer call in its own try/catch so failures for one
service are logged via logger.warn or logger.error (include containerName and
error) but do not rethrow, allowing the loop to continue cleaning up the
remaining services; ensure you still call both stopContainer and removeContainer
attempts even if stopContainer fails, and keep the existing logger.step message
for context.

In `@tools/cli/src/commands/deploy/index.ts`:
- Around line 144-151: The current parsing of options.tail into tail (using
parseInt in the deploy command) allows negative values which will later be
passed to docker logs; update the validation in the block around
parseInt(options.tail, 10) to reject negative numbers as well: after parsing and
Number.isNaN check, add a check for tail < 0 and call logger.error with a clear
message like "Invalid --tail value: X. Must be a non-negative number." and
exit(1); keep the checks near the existing variables (options.tail, tail,
parseInt, Number.isNaN, logger.error) so behavior is validated early.

In `@tools/cli/src/commands/deploy/logs.ts`:
- Around line 35-37: Validate the CLI-provided color string before assigning it
to targetColor: check the variable color against the allowed set of deployment
colors (the same canonical list used to name containers) and if it is not in
that set, throw/print a clear user-facing error explaining the accepted values
and exit; perform this check in the branch where color is handled (around the
existing if (color) { targetColor = color; } else { ... }) so invalid strings
never flow into container name logic.
- Around line 23-24: Replace the hardcoded service string with the canonical
service list: import the shared constant (e.g., SERVICES or AVAILABLE_SERVICES)
used across the codebase and use it to build the message, then change the logger
calls in deploy/logs.ts (the lines that call logger.error(`Invalid service:
${service}`) and logger.info(...)) to reference that constant (for example
logger.info(`Available services: ${canonicalServices.join(', ')}`) and include
the derived list in the error message or call logger.info immediately after the
error). Ensure you reference the shared constant name exactly as exported by the
project so the CLI output stays in sync.
- Around line 60-65: The code currently blocks log retrieval by throwing if
isContainerRunning(containerName) returns false; instead, change this to log a
warning and proceed to call docker logs so stopped containers can still have
their logs retrieved and non-existent containers will be handled by the docker
client; specifically, in the block that calls isContainerRunning(containerName),
replace the logger.error + throw with logger.warn mentioning the containerName
(or similar) and continue execution so the downstream logic that invokes docker
logs can run and surface any actual docker errors.

In `@tools/cli/src/commands/deploy/reset.ts`:
- Around line 83-88: The current call docker("network", "prune", "-f") in
reset.ts can delete other projects' networks; either (A) add a project label
when creating networks in ensure-network.ts (e.g., label
"com.docker.compose.project" set to your project identifier) and then change the
prune call in reset.ts to docker("network", "prune", "-f", "--filter",
"label=com.docker.compose.project=<projectId>") so only project-owned networks
are pruned, or (B) if you cannot modify ensure-network.ts, replace the
unconditional prune with a time-based filter: docker("network", "prune", "-f",
"--filter", "until=24h") to only remove networks unused for >24h; update the
code paths around logger.step / dryRun to reflect the chosen filter.

In `@tools/cli/src/commands/deploy/rollback.ts`:
- Around line 105-127: After switching traffic and verifying health, persist the
updated deployment metadata so version history isn't lost: set the previous
version to the value that was running (currentColor/currentVersion) and set
current to rollbackVersion (and currentColor to rollbackColor). Add a call after
setCurrentColor and successful health checks (before logger.success) to update
whatever deployment-state store you use (e.g., a function like
persistDeploymentState/updateDeploymentMetadata or similar), referencing
rollbackVersion, rollbackColor, and the original currentVersion/currentColor so
getPreviousVersion will return the correct prior release for subsequent
rollbacks.

In `@tools/cli/src/index.ts`:
- Around line 5-10: Replace the hardcoded VERSION constant with the package.json
version: remove or stop using VERSION = "1.0.0" and import the JSON package
(e.g., import pkg from "../package.json") then pass pkg.version into
program.version(...) so the CLI always reflects tools/cli/package.json; update
any references to VERSION in this module to use pkg.version instead.

In `@tools/cli/src/lib/compose/generators/generate-color-compose.ts`:
- Around line 8-38: generateColorCompose currently mixes two sources for the
project name (the function arg projectName and config.projectName) which can
cause services to attach to different volumes/networks; fix by choosing a single
source of truth (prefer config.projectName) or assert equality up front: inside
generateColorCompose check that config.projectName === projectName and throw/log
if not, or replace all uses of projectName (volumes/networks) with
config.projectName so
createPlatformService/createRagService/createCrawlerService/createSearchService
and the volumes/networks all use the same identifier.

In `@tools/cli/src/lib/compose/services/create-graph-db-service.ts`:
- Around line 11-16: The healthcheck object for the graph DB service (property
name healthcheck in create-graph-db-service.ts) lacks a start_period, which can
cause Docker to mark the container unhealthy while it is still initializing;
update the healthcheck object used when creating the service to include an
appropriate start_period (for example "30s" or a configurable value) alongside
test, interval, timeout, and retries to allow the DB more time to become healthy
and keep this pattern consistent with other service creators.

In `@tools/cli/src/lib/docker/check-http-health.ts`:
- Around line 1-35: The interval option in checkHttpHealth (and the
HealthCheckOptions type) uses milliseconds but is undocumented; update the API
to make units explicit by renaming interval to intervalMs in HealthCheckOptions
and in checkHttpHealth (destructure const { timeout, intervalMs = 2000 } =
options), update all usages (Bun.sleep(intervalMs), any logs/messages) and
adjust any call sites or docs to pass milliseconds (or add a compatibility shim
that accepts interval in seconds and converts to ms), ensuring the timeout
comment/variable remains seconds-to-ms as-is so units are consistent and clear.

In `@tools/cli/src/lib/docker/docker-compose.ts`:
- Around line 11-13: The temp filename generation using Date.now() can collide
across concurrent runs; replace the Date.now() suffix in the join call that
assigns tempFile (currently using `.tale-deploy-compose-${Date.now()}.yml`) with
a UUID from a reliable generator (e.g., Bun.randomUUIDv7()/Bun.randomUUID() or
crypto.randomUUID()) so each compose file name is unique; update any related
uses (the tempFile variable, the Bun.write call, and subsequent cleanup logic
that deletes the temp file) to use the new UUID-based name.

In `@tools/cli/src/lib/docker/get-container-version.ts`:
- Around line 1-19: The function getContainerVersion calls result.stdout.trim()
before verifying the docker() call succeeded, which can throw if stdout is
undefined; update getContainerVersion to first check result.success and that
result.stdout is defined (or defensively use result.stdout ?? "") before calling
trim, e.g. guard on result.success and typeof result.stdout === "string" (or
coerce with ??) prior to computing version, and only then evaluate the version
string and the "<no value>" check.

In `@tools/cli/src/lib/docker/list-containers.ts`:
- Around line 1-23: The parsed container fields in listContainers may retain
CR/LF characters (e.g., '\r') on Windows; update the parsing in listContainers
to trim the line and each split field (name, status, image) after splitting on
"\t" so that all returned values are clean strings (e.g., call .trim() on the
line and on each of name, status, image before returning).

In `@tools/cli/src/lib/docker/pull-image.ts`:
- Around line 4-12: The pullImage function relies on docker(image) which can
throw if Bun.spawn() fails; wrap the call to docker("pull", image) in a
try/catch inside pullImage (or ensure docker() catches spawn errors) so any
thrown exception is caught, logged via logger.error with the error details, and
pullImage returns false, preserving the Promise<boolean> contract; reference
pullImage and docker (and underlying exec()/Bun.spawn usage) when making the
change.

In `@tools/cli/src/lib/docker/remove-container.ts`:
- Around line 4-7: The removeContainer function currently returns a boolean but
doesn't log failure details; update removeContainer to check the result from the
docker("rm", "-f", containerName) call and, if result.success is false, log
result.stderr (and optionally result.stdout) via logger.error or logger.debug
similar to how pullImage logs errors, then return the boolean; locate the
removeContainer function and the docker("rm", "-f", containerName) invocation to
add the conditional logging of stderr when removal fails.

In `@tools/cli/src/lib/docker/wait-for-healthy.ts`:
- Around line 5-8: The HealthCheckOptions interface uses mixed time units
(timeout in seconds, interval in milliseconds); update the interface
HealthCheckOptions to either standardize both to the same unit or add clear
JSDoc on each field clarifying units (e.g., timeout is seconds and will be
converted to ms where used, interval is milliseconds with its default),
referencing the timeout and interval properties so callers know which unit to
pass and to avoid silent conversion bugs.

In `@tools/cli/src/lib/state/acquire-lock.ts`:
- Around line 23-42: Replace the non-atomic Bun.write() lock creation in
acquire-lock.ts with an exclusive-create using node:fs/promises.writeFile({
flag: "wx" }) so lock acquisition is atomic; keep the existing stale-lock
detection (getLockInfo and isProcessRunning) and explicitly remove the stale
lock file (use fs.unlink) before attempting the exclusive write, catch EEXIST
from writeFile and treat it as "lock already held" (return false), and preserve
logging calls (logger.warn for stale removal, logger.debug for successful
acquisition, logger.error as needed) so the function returns false on contention
and only writes the lock when the exclusive create succeeds.

In `@tools/cli/src/lib/state/get-lock-info.ts`:
- Around line 9-16: The current isLockInfo type guard accepts any numeric pid,
which may be 0, negative or NaN; update the function isLockInfo to additionally
ensure (value as LockInfo).pid is a positive integer (use Number.isInteger(...)
&& ... > 0) while keeping the existing startedAt and command string checks so
corrupt/hand-edited lock files with non-positive or non-integer pids are
rejected.

In `@tools/cli/src/utils/logger.ts`:
- Around line 1-9: The ANSI color constants (RESET, BOLD, DIM, RED, GREEN,
YELLOW, BLUE, CYAN) should be disabled when output is not a TTY or when NO_COLOR
is set; update logger.ts to detect color support via process.stdout.isTTY &&
!process.env.NO_COLOR (or similar) and conditionally set those constants to
either the escape codes or empty strings accordingly so colors are suppressed in
piped/CI/file outputs.
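For the logger item that closes the list above, the color detection could look roughly like the sketch below. The function names are hypothetical and the escape sequences are the standard SGR codes, which are assumed rather than copied from logger.ts.

```typescript
type Env = Record<string, string | undefined>;

// Colors are used only on an interactive terminal and when NO_COLOR is unset or empty.
function colorEnabled(isTTY: boolean | undefined, env: Env): boolean {
  return Boolean(isTTY) && !env["NO_COLOR"];
}

// Build the palette once; every code collapses to "" when colors are disabled.
function makePalette(enabled: boolean) {
  const code = (sequence: string): string => (enabled ? sequence : "");
  return {
    RESET: code("\x1b[0m"),
    BOLD: code("\x1b[1m"),
    DIM: code("\x1b[2m"),
    RED: code("\x1b[31m"),
    GREEN: code("\x1b[32m"),
    YELLOW: code("\x1b[33m"),
    BLUE: code("\x1b[34m"),
    CYAN: code("\x1b[36m"),
  };
}

const palette = makePalette(colorEnabled(process.stdout.isTTY, process.env));
console.log(`${palette.GREEN}deployed${palette.RESET}`);
```

Because disabled codes are empty strings, the existing template strings in the logger would not need to change; piped or CI output simply carries no escape sequences.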

coderabbitai · 2026-01-29T08:43:31Z

+| Option | Description |
+|--------|-------------|
+| `-a, --all` | Also update infrastructure (db, graph-db, proxy) |
+| `-s, --services <list>` | Specific services to update (comma-separated) |
+| `--dry-run` | Preview deployment without making changes |
+| `-d, --dir <path>` | Deployment directory (default: current directory) |
+| `--host <hostname>` | Host alias for proxy (default: `tale.local` or `$HOST`) |
+


⚠️ Potential issue | 🟡 Minor

Fix table pipe spacing/alignment to satisfy markdownlint MD060.

markdownlint flagged the tables for pipe spacing/alignment. Please format the separator rows consistently (e.g., | --- | --- |) or run the formatter to normalize all tables.

✍️ Example fix (apply similarly to other tables)

-|--------|-------------|
+| --- | --- |

Also applies to: 91-95, 104-110, 119-124, 127-134

🧰 Tools

🪛 markdownlint-cli2 (0.20.0)

80-80: Table column style
Table pipe is missing space to the right for style "compact"

(MD060, table-column-style)

80-80: Table column style
Table pipe is missing space to the left for style "compact"

(MD060, table-column-style)



I've reviewed the table formatting in this file. The tables appear to follow consistent markdown formatting with proper separators (|--------|-------------|) and spacing around pipes. All tables have at least 3 dashes in separator rows and consistent pipe placement. Without a specific markdownlint configuration to reproduce the MD060 warning, the current formatting appears valid and consistent across all tables in the document.

@larryro, thank you for reviewing the tables. You're right that the current format is valid and consistent markdown. The MD060 "compact" style expects a single space around cell content: | --- | --- |, whereas your tables use |--------|-------------| (no spaces). Both are valid markdown—the linter is simply configured to prefer the spaced variant.

Since the tables work correctly and you've confirmed the formatting is intentional, you can either:

Keep the current format as-is (the warnings won't affect functionality)

Add a markdownlint comment to disable MD060 for this file: 

Or adjust the linter configuration if needed later

No need to make changes if the current format meets your project's style preferences.

🧠 Learnings used

Learnt from: larryro Repo: tale-project/tale PR: 296 File: tools/deploy/package.json:8-10 Timestamp: 2026-01-28T03:20:21.660Z Learning: In the tale-project/tale repository, the tale-deploy CLI tool (tools/deploy) is designed to run on Linux deploy servers in production. The Darwin/Mac build targets (darwin-arm64) exist only for local development convenience, as the team uses Apple Silicon Macs exclusively. Intel Mac (darwin-x64) builds are not needed.

Learnt from: larryro Repo: tale-project/tale PR: 296 File: tools/deploy/src/docker/is-container-running.ts:3-10 Timestamp: 2026-01-28T03:21:34.093Z Learning: In the tale-deploy CLI (tools/deploy/), the project prefers treating all docker inspect failures as "container not running" rather than parsing stderr to distinguish daemon/permission errors from missing containers, because the deployment runs on a known server where daemon failures will cause clear failures in subsequent operations.

Learnt from: larryro Repo: tale-project/tale PR: 296 File: tools/deploy/src/commands/deploy.ts:272-281 Timestamp: 2026-01-28T03:20:24.054Z Learning: In the tale-project/tale deployment tool (tools/deploy), generic network aliases (e.g., "platform", "rag") were removed from rotatable services to prevent blue/green alias conflicts during deployments. Only color-specific aliases (e.g., "platform-blue", "platform-green") should be used for rotatable services to ensure traffic routing works correctly during color transitions.

Learnt from: larryro Repo: tale-project/tale PR: 207 File: compose.yml:0-0 Timestamp: 2026-01-16T14:09:37.873Z Learning: In the tale-project/tale repository, the compose.yml file at the repository root is for development only. Production deployments use a separate compose configuration where only necessary ports are exposed.

coderabbitai · 2026-01-29T08:43:33Z

+export interface HealthCheckOptions {
+  timeout: number;
+  interval?: number;
+}


🧹 Nitpick | 🔵 Trivial

Consider documenting the time unit difference in the interface.

The timeout parameter is in seconds (converted to ms on line 16), while interval is in milliseconds. This asymmetry could lead to confusion. Consider adding JSDoc comments to clarify the units, or standardizing both to the same unit.

export interface HealthCheckOptions {
  /** Timeout in seconds */
  timeout: number;
  /** Polling interval in milliseconds (default: 2000) */
  interval?: number;
}


The timeout parameter is in seconds (user-facing) while interval is in milliseconds (internal). This asymmetry is intentional - seconds are more intuitive for user-configurable timeouts (e.g., 60s = 1 minute), while milliseconds are standard for polling intervals (e.g., 2000ms = 2s). The conversion timeout * 1000 makes the intent clear. Adding JSDoc could be considered, but the code is self-explanatory with the conversion visible.

@larryro, that makes sense! The distinction between user-facing (seconds) and internal (milliseconds) parameters is a reasonable design choice, and the conversion is indeed visible in the implementation. Appreciate the context!
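For readers following the unit discussion, the sketch below shows the seconds-to-milliseconds conversion in isolation. `pollSchedule` and its `maxAttempts` field are hypothetical and exist only for illustration; the 2000 ms default comes from the review comment above.

```typescript
interface HealthCheckOptions {
  /** Timeout in seconds (user-facing). */
  timeout: number;
  /** Polling interval in milliseconds (default: 2000). */
  interval?: number;
}

// Hypothetical helper: converts the mixed-unit options into one unit (ms)
// and derives how many polls fit inside the timeout.
function pollSchedule(options: HealthCheckOptions): {
  timeoutMs: number;
  intervalMs: number;
  maxAttempts: number;
} {
  const { timeout, interval = 2000 } = options;
  const timeoutMs = timeout * 1000; // seconds -> milliseconds
  const maxAttempts = Math.max(1, Math.ceil(timeoutMs / interval));
  return { timeoutMs, intervalMs: interval, maxAttempts };
}

const schedule = pollSchedule({ timeout: 60 });
console.log(schedule); // { timeoutMs: 60000, intervalMs: 2000, maxAttempts: 30 }
```

Passing `interval: 2` by mistake (thinking in seconds) would yield 30000 attempts two milliseconds apart, which is the kind of silent unit mix-up the nitpick was pointing at.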

🧠 Learnings used

Learnt from: larryro Repo: tale-project/tale PR: 76 File: services/platform/convex/lib/create_integration_agent.ts:22-27 Timestamp: 2026-01-05T01:37:40.694Z Learning: In agent factory files (e.g., services/platform/convex/lib/create_*_agent.ts), prefer explicit `ToolName[]` type annotations for tool name arrays over implicit typing (even with `as const satisfies ToolName[]`). The explicit annotation provides clear documentation, ensures type compatibility with `createAgentConfig`, maintains consistency across agent factories, and avoids requiring downstream consumers to handle narrower tuple types.

Learnt from: larryro Repo: tale-project/tale PR: 301 File: tools/cli/src/commands/deploy/deploy.ts:120-127 Timestamp: 2026-01-29T09:08:22.210Z Learning: In the tale-project/tale CLI tool (tools/cli), image pulls in the deploy command are intentionally sequential (not parallelized) to provide better logging, clearer progress feedback, and easier identification of which image failed. The deployment time is typically dominated by health checks and network latency rather than image pull time, making sequential pulls acceptable for better observability.


…racefully
- Extract volume/network setup into ensureInfrastructure helper to reduce duplication between in-place and blue-green deployment paths
- Handle stop/remove failures during drain gracefully since traffic has already switched - failures are now logged but don't abort the deployment

- Reject negative --tail values with a clear error message
- Use ALL_SERVICES constant instead of hardcoded list in logs.ts
- Allow `docker logs` for stopped containers, since the command works on them too
- Add project label to networks for scoped pruning
- Filter network prune by project label to avoid affecting other projects
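As a rough illustration of the first item, a `--tail` validator might look like the sketch below. The function name `parseTail` and the error wording are assumptions, not taken from logs.ts.

```typescript
// Hypothetical validator for the --tail option value.
function parseTail(raw: string): number {
  const trimmed = raw.trim();
  // Digits only: rejects "-5", "abc", "1.5", and the empty string.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(
      `Invalid --tail value "${raw}": expected a non-negative integer`,
    );
  }
  return Number.parseInt(trimmed, 10);
}
```

Validating with a digits-only pattern before calling `parseInt` avoids silently accepting inputs such as `"12abc"`, which `parseInt` alone would read as 12.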

- Use atomic lock file creation with exclusive write flag (wx)
- Import version from package.json instead of hardcoded constant
- Use config.projectName consistently in compose generator
- Guard against trim before success check in getContainerVersion
- Trim parsed fields in listContainers for Windows CRLF handling
- Log stderr when container removal fails
- Validate lock PID as positive integer
- Add TTY and NO_COLOR detection for logger ANSI colors
- Use randomUUID for temp compose file names to prevent collisions
- Add try-catch for spawn errors in pullImage
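The first and seventh items above (atomic lock creation with `wx`, and PID validation) can be sketched with Node's `fs` API as follows. This is an assumption-laden sketch rather than the PR's lock module: the function names, the lock file contents (a bare PID), and the demo path are all hypothetical.

```typescript
import { randomUUID } from "node:crypto";
import { readFileSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns true if this call created the lock, false if one already exists.
function acquireLock(lockPath: string, pid: number = process.pid): boolean {
  try {
    // "wx" fails with EEXIST when the file exists, so the existence check
    // and the creation happen in a single step.
    writeFileSync(lockPath, String(pid), { flag: "wx" });
    return true;
  } catch (error) {
    if ((error as { code?: string }).code === "EEXIST") return false;
    throw error;
  }
}

// Reads the owner PID; returns null unless the content is a positive integer.
// Throws if the lock file does not exist.
function readLockPid(lockPath: string): number | null {
  const raw = readFileSync(lockPath, "utf8").trim();
  if (!/^[1-9]\d*$/.test(raw)) return null;
  return Number.parseInt(raw, 10);
}

function releaseLock(lockPath: string): void {
  unlinkSync(lockPath);
}

// Demo with a throwaway path in the OS temp directory.
const demoLockPath = join(tmpdir(), `demo-${randomUUID()}.lock`);
console.log(acquireLock(demoLockPath)); // first caller creates the file
console.log(acquireLock(demoLockPath)); // second caller hits EEXIST
releaseLock(demoLockPath);
```

A separate "check if exists, then write" sequence would leave a window in which two deployments could both believe they hold the lock; the exclusive flag closes that window.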

Add interactive prompts for version selection from the container registry when no version is provided, and CLI config management for remembering the default deployment directory. Improves first-time deployment UX by auto-detecting and including infrastructure services.

Add top-level `tale status` and `tale logs` commands for easier access. Introduce `ensureEnv` to interactively configure .env file with domain, TLS, API keys, and auto-generated security secrets during first run. Fix service name formatting in deploy to include color suffix.

larryro added 26 commits January 29, 2026 15:41

ci(deploy): add workflow to build and release CLI binary

d7254f7

Build the tale-deploy binary for Linux x64 and upload to GitHub Releases.
Triggers:
- Manual dispatch with optional release tag
- Push to main when tools/deploy/ changes
Refs #294

docs(compose): add development-only warning

9fa8c35

Add clear warning that compose.yml exposes ports for development only. Production deployments should use the tale-deploy CLI. Refs #294

chore: remove deprecated deployment files

3559a79

Remove files replaced by the new tale-deploy CLI:
- compose.blue.yml
- compose.green.yml
- scripts/deploy.sh
Refs #294

fix(review): guard release upload on workflow_dispatch event

b06ff5c

fix(review): add bun.lockb to gitignore

fa1a031

fix(review): pin @types/bun to ^1.3.6 instead of latest

6fba3e0

fix(review): read stdout/stderr concurrently to avoid pipe deadlocks

c5f62a3
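For context on the commit above: a child process that fills one pipe while the parent is still waiting on the other can block indefinitely. A hedged sketch of the usual remedy follows; `drain` and `readOutput` are hypothetical names, and the streams are modeled as any async iterable of byte chunks rather than a specific Bun or Node type.

```typescript
// Collects one byte stream into a string.
async function drain(stream: AsyncIterable<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  let text = "";
  for await (const chunk of stream) {
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode();
}

// Starts both drains before awaiting either, so neither pipe can fill up
// while the other is being read to completion.
async function readOutput(
  stdout: AsyncIterable<Uint8Array>,
  stderr: AsyncIterable<Uint8Array>,
): Promise<{ stdout: string; stderr: string }> {
  const [out, err] = await Promise.all([drain(stdout), drain(stderr)]);
  return { stdout: out, stderr: err };
}
```

Awaiting `drain(stdout)` fully and only then calling `drain(stderr)` is the sequential form that risks the deadlock.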

fix(review): import shared validators in logs.ts and check exit code

a2b6bcb

fix(review): hoist fs import, remove redundant stop, log errors in reset

91d2433

fix(review): filter docker inspect <no value> sentinel in version check

d849ab9
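A sketch of what filtering the `<no value>` sentinel could look like, assuming the version is read from a `docker inspect` template whose output may be that placeholder when the field is absent. The function name and result handling are hypothetical; only the sentinel string comes from the commit title.

```typescript
// Hypothetical parser for the output of a docker inspect version lookup.
function parseContainerVersion(
  exitCode: number | null,
  stdout: string,
): string | null {
  // Check success first; output from a failed command is not trusted.
  if (exitCode !== 0) return null;
  const value = stdout.trim();
  // Treat empty output and the template placeholder as "no version".
  if (value === "" || value === "<no value>") return null;
  return value;
}
```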

fix(review): improve state module resilience and error handling

101f304

fix(review): use config.projectName for container names and remove generic aliases

1ba1057

fix(review): validate --color and --tail options in logs command

8e03ee5

fix(review): actually load .env file and use safe parseInt with NaN fallback

c6a5802

fix(review): parallelize image pulls and log stop/remove failures in rollback

679ce05

fix(review): parallelize status checks and distinguish running vs unhealthy

c3a97a7

fix(review): filter empty lines in list-containers, add error logging

831faaf

feat(ci): build and upload binaries for all platforms

765d499

Build Linux x64, macOS ARM64, and Windows x64 binaries in CI instead of only Linux. Each platform binary is uploaded as a separate artifact and attached to releases.

fix(review): use path.join for cross-platform path construction

6d0462b

Now that CI builds for all platforms, replace string template paths with path.join to handle platform-specific path separators correctly.

chore(cli): add .gitignore for build artifacts and lock files

d2a364d

coderabbitai Bot requested changes Jan 29, 2026

larryro added 2 commits January 29, 2026 17:01

fix(review): restrict release uploads to tagged pushes only

c767a2c

Prevent accidental overwriting of published release artifacts by only allowing uploads when the push is from a tag ref.

fix(review): update compose.yml to reference correct CLI name

e2612ef

Changed tale-deploy to tale to match the actual CLI binary name.

larryro added 7 commits January 29, 2026 17:05

fix(review): cleanup both running and stopped containers correctly

ab57f8d

- Only increment cleaned counter when removal succeeds
- Check container existence instead of just running state to also clean up stopped/exited containers
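The two fixes in this commit can be illustrated with a small loop. This is a sketch under assumptions: `ContainerOps` and `cleanupContainers` are hypothetical names, and the Docker calls are injected rather than shown.

```typescript
// Hypothetical interface over the Docker operations the cleanup needs.
interface ContainerOps {
  exists: (name: string) => Promise<boolean>; // true for running, stopped, or exited
  remove: (name: string) => Promise<boolean>; // true only if removal succeeded
}

async function cleanupContainers(
  names: string[],
  ops: ContainerOps,
): Promise<number> {
  let cleaned = 0;
  for (const name of names) {
    // Existence, not running state, so stopped/exited containers are included.
    if (!(await ops.exists(name))) continue;
    // Count only removals that actually succeeded.
    if (await ops.remove(name)) cleaned += 1;
  }
  return cleaned;
}
```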

fix(review): persist version history after rollback

c7b6253

Save the current version as previous version after successfully switching traffic during rollback. This ensures subsequent rollbacks have correct version history.
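One way to picture the state change described here is as a swap of the two recorded versions once traffic has switched. The `DeployState` shape and field names below are assumptions for illustration, not the PR's state module.

```typescript
// Hypothetical shape for the persisted deployment state.
interface DeployState {
  currentVersion: string;
  previousVersion: string | null;
}

// Computes the state to persist after a rollback has switched traffic.
function stateAfterRollback(state: DeployState): DeployState {
  if (state.previousVersion === null) {
    throw new Error("No previous version to roll back to");
  }
  return {
    currentVersion: state.previousVersion,
    // The version just rolled back from becomes the next rollback target.
    previousVersion: state.currentVersion,
  };
}
```

Without recording the old current version as previous, a second rollback would have no accurate target to return to.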

larryro closed this Jan 29, 2026
