Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
logpie
added a commit
that referenced
this pull request
Apr 26, 2026
CLUSTER C — bundle/build integrity (closes 1 CRITICAL + 4 IMPORTANT + 1 NOTE): CRITICAL closed: - `app.py:35`: `otto web` served checked-in bundle without verifying it matches source — developer skips `npm run web:build` → silent stale UI (codex-packaging-bundle #1). IMPORTANT closed: - Python build never built frontend; `pip install -e .` shipped whatever static was in tree (#2). Vite plugin emits `build-stamp.json` on every build with source-hash + timestamp + git commit; FastAPI startup verifies freshness via `verify_bundle_freshness()`. - `web:build` ran only Vite; `web:typecheck` advisory; no CI gates (#3). New `web:verify` script chains typecheck → build → committed-check. - Default cache headers underused hashed assets (#4). New `_CacheHeaderStaticFiles` subclass: `no-store` for shell + index.html, `public, max-age=31536000, immutable` for `/static/assets/*`. - Server didn't validate `index.html` referenced JS/CSS exist (#5). Startup parses index.html and asserts every referenced static path resolves; missing → fail-fast with `npm run web:build` guidance. NOTE closed: - `[tool.setuptools.package-data]` was flat (`static/*`, `static/assets/*`); future nested assets (fonts/, images/, locale/) would silently miss wheels. Now `static/**/*` recursive glob. Files: otto/web/bundle.py (new, 263 lines), otto/web/client/vite.config.ts (build-stamp emitter), scripts/build_stamp.py (CLI for manual stamp), scripts/check_bundle_committed.py (git-diff guard for CI), package.json (web:verify script), pyproject.toml (recursive package_data), otto/web/app.py (verify_bundle_freshness call + _CacheHeaderStaticFiles). Tests: tests/test_web_bundle_freshness.py (5 tests) + tests/test_web_cache_headers.py (2 tests, 8 actual checks). 15 new server-layer tests, all green. CLUSTER D — history pagination (closes 1 CRITICAL + 2 IMPORTANT): CRITICAL closed: - `total_rows=247` displayed but only first page rendered; power user with 200+ runs stuck on page 1 (heavy-user, codex-state-management #6, codex-long-string-overflow #3). IMPORTANT closed: - `/api/state` accepted `history_page` + `page_size`; client never sent `history_page` and rendered no controls. - Page-size selector now lives in the UI (10/25/50/100, default 25); server clamps to [1, 200] to refuse stale URLs requesting unbounded slices. Implementation: - `MissionControlFilters.history_page_size: int | None` for per-request override (server clamps to safe range). - App.tsx History pane: pagination footer (Page N of M · X runs · ←/→ · jump-to + page-size selector). URL persists `hp` + `ps` query params. - Filter changes reset page to 1. - Stale deep-link `?hp=99` (out-of-range) → "Page 99 doesn't exist; jump to page 1" with reset button. Files: otto/mission_control/{model.py,serializers.py} (history_page_size plumbing), otto/web/app.py (param wiring), otto/web/client/src/{App.tsx, api.ts,types.ts,styles.css} (pagination UI), tests/browser/test_history_pagination.py (10 paired Playwright tests, all green). Verification: - Browser suite: 25 passed (8 cluster A + 7 cluster B + 10 cluster D + new smoke set from cluster C cache headers via tests/test_web_cache_headers.py) - Default suite: 1091 passed (was 1076; +15 cluster C server tests) - npm run web:typecheck: clean - npm run web:build: 277.82 kB JS / 33.34 kB CSS (rebuilt with stamp) Note: cluster C agent hit an API overload during its final summary step but all files landed cleanly on disk; verified by independently running the new test files before committing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
logpie
added a commit
that referenced
this pull request
Apr 26, 2026
============================================================ 5 IMPORTANT closures ============================================================ 1. W2-IMPORTANT-3 — Watcher cancel-vs-dispatch race Cancel POST landing between begin_command_drain and _dispatch_new was deferred to next tick — by then the queued task had already spawned. Added a `late_drain` callback into _dispatch_new called once at entry AND before each task's spawn evaluation; deduped against already-applied command_ids. Lock semantics in begin_command_drain are idempotent so the second call is safe. 2. Codex info-density #6 — Mission Focus active headline Working state showed only `<N> tasks in flight`. Now sorts live items by elapsed_s ascending, picks the hottest, and assembles `task-id · branch · elapsed · cost · last-event`. Segments dropped when missing or when cost is "…" placeholder. 3. Codex info-density #3 — Status badge tones .task-status was one gray pill regardless of status/stage. Added statusTone() helper mapping status+stage → success/running/warning/danger/neutral. CSS adds tinted bg + text colors per tone using existing palette tokens. 4. Codex error-empty-states #11 — Filter-blind empty state Empty board read "No work queued" even when filters were hiding tasks. Added BoardEmptyReason + computeBoardEmptyReason + filtersAreActive helpers. Banner now shows reason-specific copy: "No matching tasks." + Clear-filters link (filtered) vs "No work queued..." + Queue-job CTA (true-empty). Subtitle and per-column empty also reflect filter awareness. 5. Codex first-time-user #19 — Queued cards clickable TaskCard had disabled={!task.runId}, so queued tasks not yet picked up by watcher had a dead button. Users couldn't open Details. Added onSelectQueued prop; selectQueuedTask state. RunDetailPanel renders "Waiting for watcher" placeholder with title/status/branch/intent/reason + Start-watcher CTA when stopped. useEffect auto-promotes selection to real runId once the watcher dispatches (live row's queue_task_id matches). ============================================================ Tests added (5 new files, 14 total tests) ============================================================ - tests/test_queue_cancel_race.py (3 tests) - tests/browser/test_mission_focus_active_headline.py (3 tests) - tests/browser/test_task_status_badge_tones.py (2 tests) - tests/browser/test_task_board_filter_blind_empty.py (3 tests) - tests/browser/test_queued_card_clickable.py (3 tests) ============================================================ Test counts ============================================================ - Default suite: 1136 passed (was 1133; +3) - Browser suite: 102 passed (was 88; +14) - npm web:typecheck: clean - npm web:build: 305.39 kB JS / 40.27 kB CSS ============================================================ Tally ============================================================ CRITICAL: 22 of 22 closed (100%) IMPORTANT: ~53 of 132 closed (~40%) Followups: - statusTone could extend to history table + live-runs panel pills - TaskBoard "search input" data-testid for stable test selectors - Queued placeholder could surface cancel/remove actions Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
logpie
added a commit
that referenced
this pull request
Apr 26, 2026
Closes findings #6, #7, #8, #9, #10, #12, #14 from the codex-long-string-overflow hunter. Implementation: - Shared utilities .mc-wrap-anywhere + .mc-line-clamp-3 so future free-text fields opt in by class - Toast / status-banner / job-dialog status: max-height + overflow-y:auto so long strings never push siblings (dismiss/submit) offscreen - Confirm dialog body + bulk-row branch: wrap-anywhere - Review packet (#10): replaced silent ellipsis with 3-line clamp + wrap-anywhere; existing title= attribute provides full text on hover - Changed-file <li>: wrap-anywhere; .review-packet code: display:block + pre-wrap so diff_command snippet wraps inside the drawer (inline code runs explicitly re-set to display:inline so override doesn't break) - Diff toolbar (#14): nowrap+ellipsis with title tooltip on right span; min-width:0 on flex children prevents toolbar drift; matches audit's suggested affordance Tests (tests/browser/test_long_string_overflow.py — 7 new): - test_toast_wraps_long_url - test_last_error_banner_clamps_long_message - test_job_dialog_status_does_not_push_submit_offscreen - test_confirm_dialog_wraps_long_branch_names - test_review_packet_long_text_wraps_consistently - test_changed_file_list_wraps_long_paths - test_diff_toolbar_branch_truncates_long_names Each test asserts (a) no document-level horizontal scroll, (b) the overflowing element fits within its parent's width, (c) the long string remains in DOM (CSS overflow, not JS truncation). Test counts: - Default: 1136 passing (no regression) - Browser: 109 passing (was 102; +7) - npm web:typecheck: clean - npm web:build: clean CRITICAL: 22 of 22 closed (100%) IMPORTANT: ~60 of 132 closed (~45%) Followups (deferred): - Long-string findings #2/#3/#4/#5/#11/#13 need React state/render changes (perf + virtualization) — not CSS-only. Bigger cluster. - Future PRs that add free-text fields should apply .mc-wrap-anywhere explicitly rather than deriving fresh rules. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
logpie
added a commit
that referenced
this pull request
Apr 26, 2026
============================================================
W5-CRITICAL-1 — Merge preflight ignored untracked files in project root
============================================================
Real safety bug found by live W5 rerun. User had untracked
DIRTY_FILE.txt; clicked Merge → HTTP 200 + ping.py landed in main.
Fix: merge action handler now consults _project_is_user_dirty (which
already correctly filters Otto-owned untracked files) BEFORE the
merge runs. On dirty: returns HTTP 409 with structured
{reason, files} payload. UI surfaces block in confirm dialog.
Tests:
- tests/test_merge_preflight_dirty_tree.py (server)
- tests/browser/test_merge_blocked_dirty_tree.py (browser)
- blocks on user-untracked file in root
- allows when only Otto-owned untracked
- blocks on tracked-file modification
- blocks on staged changes
- confirm dialog shows block reason + file list
============================================================
5 IMPORTANT — Evidence trustworthiness cluster
============================================================
#3 — Proof drawer cached + no run_id validation
service.py: _proof_provenance helper extracts generated_at, run_id,
session_id, branch, head_sha, file_mtime, sha256, run_id_matches
flag. Client cache invalidates on (run_id|version|sha256|mtime) key
change. New <ProofProvenance> component renders metadata + warning
banner on run_id mismatch.
#4 — Per-round evidence not visible
service.py _certification_summary gains rounds field via new
_certification_round_history helper. Client <CertificationRoundTabs>
component: tab strip with verdict/duration; per-round detail with
diagnosis, failing story IDs, fix commits.
#5 — Live vs Final indicator (already covered cluster B; verified)
#6 — Binary artifacts decoded as garbage UTF-8
service.py artifact_content does MIME detection via magic bytes
(PNG/GIF/JPG/PDF/WEBM/WEBP/MP4) + null-byte sniff; returns
{previewable, mime_type, size_bytes}. New /api/runs/{id}/artifacts/
{i}/raw route serves binary via FileResponse. Client routes image
MIMEs to <img src=raw>, video to <video>, others to "no text
preview · download" card.
#7 — Artifact list missing size/mtime/hash
serializers.serialize_artifact extended with size_bytes, mtime
(ISO UTC), sha256 (full hex; client truncates to 12 chars). SHA
computation capped at 16 MB so list rendering stays cheap.
#8 — Visual evidence has no manifest
certifier writes sibling <artifact>.manifest.json with schema_version,
captured_at, run_id, session_id, round, story_id, story_status,
sha256, size_bytes, viewport, browser, certifier_mode, kind.
Tests (15 new, all green):
- tests/test_proof_provenance.py (9 server tests)
- tests/browser/test_certification_round_tabs.py (4 browser tests)
- tests/browser/test_artifact_binary_preview.py (2 browser tests)
============================================================
Test counts
============================================================
- Default: 1183 (was 1168; +15)
- Browser: 165 (was 153; +12)
- npm web:typecheck: clean
- npm web:build: 327KB JS / 49KB CSS
============================================================
Tally
============================================================
CRITICAL: 25 of 25 closed (100%)
IMPORTANT: ~94 of 132 closed (~71%)
Followups:
- mimetypes for .json may return application/json on some platforms;
if certifier emits custom-extension JSON, extend allowlist
- Manifest viewport/browser fields are placeholders — agent SDK
doesn't expose hook at screenshot capture time; fill once cert
owns capture path
- TUI test_mission_control_status_cells_use_theme_colors: pre-existing
STALE-vs-RUNNING heartbeat test failure unrelated to this work
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR fixes the actionable findings from the retried code-health audit:
Audit artifacts remain local under
audits/2026-04-24-0043/and were not committed to keep the PR focused.Validation
uv run pytest tests/test_paths.py tests/test_manifest.py tests/test_merge_orchestrator.py tests/test_cli_merge.py tests/test_queue_runner.py tests/test_env_bypass.py tests/test_otto_as_user_nightly.py -q-> 158 passeduv run python scripts/e2e_runner.py B1 B5 B13-> 3/3 passeduv run pytest -q --maxfail=10-> 911 passeduv run ruff check otto scripts tests --select F401,F821,F841-> passeduv run python -m compileall -q otto scripts tests-> passeduv run python -m pip check-> passedgit diff --check-> passed