Menubar and CLI hardening from multi-agent audit #257
Merged
Two passes of validators across CLI accuracy, dashboard UX, menubar Swift, performance, security, and end-to-end smoke tests on real session data.

Data-correctness fixes:

- parseLocalDate rejects month/day overflow. JS Date silently rolled Feb 31 to Mar 3, so --from 2026-02-31 --to 2026-03-15 quietly dropped sessions on Feb 28 - Mar 2. Now throws "Invalid date" with a clear reason. Leap-day case covered (2024-02-29 valid, 2025-02-29 rejected).
- CSV/JSON exports use the active currency's natural decimal places. The previous round2 helper produced ¥412.37 in CSV while the dashboard rendered ¥412, so finance teams comparing the two surfaces saw a discrepancy. New roundForActiveCurrency consults Intl.NumberFormat for the right precision (0 for JPY/KRW/CLP, 2 for USD/EUR, etc).
- Copilot toolRequests is Array.isArray-guarded in both modern and legacy event branches. Previously a corrupt session with toolRequests=null or a string aborted the whole file's parse loop and silently dropped every legitimate call after it.
- Codex token_count dedup uses a null sentinel for prevCumulativeTotal so the first event is never confused with a duplicate. Sessions that emit only last_token_usage (no total_token_usage) report cumulativeTotal=0 on every event; with the previous 0-initialized prev, the first event matched the dedup guard and was dropped.
- LiteLLM pricing values are clamped to [0, 1] per token via safePerTokenRate. Defense in depth against a tampered upstream JSON shipping negative or absurdly large per-token costs that would otherwise propagate into all cost totals.

Performance:

- Cursor SQLite parse no longer stalls for minutes on multi-GB DBs. Two changes: the per-conversation user-message buffer uses an index pointer instead of Array.shift() (which was O(n) per call); and a real ROWID cutoff via subquery limits the scan to the most recent 250k bubbles, with a stderr warning, so power users get a partial report rather than a stalled CLI.
- Spawned codeburn CLI subprocesses are terminated when the calling Task is cancelled. Without this, rapid period/provider tab clicks in the menubar cancelled the Task but left the subprocess running to completion, piling up zombie processes.

UX:

- Dashboard period switch flips to loading and clears projects synchronously before reloadData runs, eliminating the frame where the new period label rendered over the old period's projects.
- Optimize findings tab paginates 3-at-a-time with j/k scroll. With 4 new detectors plus 7 originals, 8-10 findings at 6 lines each were scrolling the StatusBar off the top of the alt buffer.
- Custom --from/--to ranges hide the period tab strip and disable the 1-5 / arrow keys so a stray period press no longer abandons the user's explicit range. A "Custom range: X to Y" banner replaces the tab strip.
- OpenCode storage-format warning is per-table-set, rate-limited to once per process, and points the user at OpenCode's migration step or the issue tracker. The previous all-or-nothing check fired the generic "format not recognized" string for any schema mismatch.

Menubar / OAuth:

- Both Claude and Codex bootstrap (Reconnect button) now honour the usageBlockedUntil 429 backoff that refreshIfBootstrapped respects. Spamming Reconnect during sustained rate-limit windows previously hammered the upstream endpoint on every click.
- Codex Retry-After HTTP header is parsed (delta-seconds plus IMF-fixdate fallback) so we don't over-back-off when ChatGPT tells us a shorter window than our 5-minute floor.
- Both credential cache files are written via SafeFile.write (O_CREAT | O_EXCL | O_NOFOLLOW with explicit 0600) so there is no race window where the temp file briefly exists at default umask, and a symlink at the destination cannot redirect the write. Reads now route through SafeFile.read with a 64 KiB cap, closing the symlink-follow gap on Data(contentsOf:).

CI signal:

- TypeScript strict typecheck (tsc --noEmit) is now zero errors. The six errors in src/providers/copilot.ts came from a discriminated-union catch-all branch whose `data: Record<string, unknown>` shape TS picked over the specific event branches when narrowing on `type`. Removed the catch-all; runtime falls through unknown event types via the existing if/else chain.

Tests added: 16 new (now 555 total)

- date-range-filter: month/day/year overflow rejection, leap-day correctness
- currency-rounding: convertCost no-rounding contract, roundForActiveCurrency for USD/JPY/KRW/EUR
- providers/copilot: malformed toolRequests does not abort the parse
- providers/cursor-bubble-dedup: re-parse after token mutation does not double-count, single parse yields one call per bubble
- providers/codex: first event with cumulativeTotal=0 not dropped, consecutive zero-cumulative duplicates still deduped
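For readers unfamiliar with the overflow behaviour behind the parseLocalDate fix, here is a minimal sketch of one common way to reject rolled-over dates: build the Date, then check that year, month, and day survive the round trip. The signature and error wording are assumptions for illustration, not the code from this PR.

```typescript
// Hypothetical sketch: the real parseLocalDate may differ in signature and messages.
function parseLocalDate(input: string): Date {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (!match) {
    throw new Error(`Invalid date "${input}": expected YYYY-MM-DD`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // Local-time constructor; JS Date normalizes overflow instead of failing,
  // e.g. new Date(2026, 1, 31) becomes Mar 3, 2026.
  const date = new Date(year, month - 1, day);

  // Round-trip check: if any component changed, the input named a day
  // that does not exist on the calendar.
  if (
    date.getFullYear() !== year ||
    date.getMonth() !== month - 1 ||
    date.getDate() !== day
  ) {
    throw new Error(`Invalid date "${input}": no such calendar day`);
  }
  return date;
}
```

With this check, `2024-02-29` parses while `2025-02-29` and `2026-02-31` throw instead of rolling into March.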
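The currency-rounding fix relies on Intl.NumberFormat knowing each currency's minor-unit precision. The sketch below shows one way to read that precision via `resolvedOptions()`. The helper names `fractionDigitsFor` and `roundForCurrency` are hypothetical, and the currency is passed explicitly here, whereas the PR's roundForActiveCurrency presumably reads it from application state.

```typescript
// Hypothetical helpers for illustration; not the PR's implementation.
function fractionDigitsFor(currency: string): number {
  const options = new Intl.NumberFormat("en-US", {
    style: "currency",
    currency,
  }).resolvedOptions();
  // Fall back to 2 places if the runtime does not report a value.
  return options.maximumFractionDigits ?? 2;
}

function roundForCurrency(amount: number, currency: string): number {
  const factor = 10 ** fractionDigitsFor(currency);
  return Math.round(amount * factor) / factor;
}
```

Under this sketch, `roundForCurrency(412.37, "JPY")` yields `412`, matching what the dashboard renders, while USD amounts keep two decimal places.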
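The Codex dedup bug comes down to the initial value of the "previous" marker. A minimal sketch of the null-sentinel approach follows; the `TokenCountEvent` shape and the `dedupeTokenCounts` name are assumptions made for the example.

```typescript
// Hypothetical event shape; real Codex events carry more fields.
interface TokenCountEvent {
  cumulativeTotal: number;
}

function dedupeTokenCounts(events: TokenCountEvent[]): TokenCountEvent[] {
  const kept: TokenCountEvent[] = [];
  // null means "no event seen yet", which is distinct from a real total of 0.
  let prevCumulativeTotal: number | null = null;
  for (const event of events) {
    const isDuplicate =
      prevCumulativeTotal !== null &&
      event.cumulativeTotal === prevCumulativeTotal;
    if (isDuplicate) continue;
    kept.push(event);
    prevCumulativeTotal = event.cumulativeTotal;
  }
  return kept;
}
```

Had `prevCumulativeTotal` started at `0`, a session whose events all report `cumulativeTotal=0` would lose its first event too; with the sentinel, the first event is kept and only the consecutive repeats are dropped.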
Two passes of validators across CLI accuracy, dashboard UX, menubar Swift, performance, security, and end-to-end smoke tests against real session data on a power-user machine.

## Data correctness

- `parseLocalDate` rejects month/day overflow. `--from 2026-02-31 --to 2026-03-15` previously rolled to Mar 3 and silently dropped sessions on Feb 28 - Mar 2. Now throws with a clear reason. Leap-day correct (`2024-02-29` valid; `2025-02-29` rejected).
- CSV/JSON exports use the active currency's natural decimal places. `roundForActiveCurrency` rounds to 0 places for JPY/KRW/CLP and 2 for USD/EUR.
- Copilot `toolRequests` is `Array.isArray`-guarded in both modern and legacy event branches. A corrupt session with `toolRequests: null` or `"..."` previously threw inside `.map` and aborted the whole file's parse loop, dropping every legitimate call after the bad event.
- Codex token_count dedup uses a null sentinel for `prevCumulativeTotal`. Sessions that emit only `last_token_usage` report `cumulativeTotal=0` on every event; with the prior 0-initialized prev, the first event matched the dedup guard and was dropped.
- LiteLLM pricing values are clamped to `[0, 1]` per token via `safePerTokenRate`. Defense in depth against a tampered upstream JSON propagating negative or absurd costs.

## Performance

- Cursor SQLite parse no longer stalls on multi-GB DBs. The per-conversation user-message buffer uses an index pointer (instead of `Array.shift()`, which is O(n) per call). Real `ROWID` cutoff via subquery limits the scan to the most recent 250k bubbles with a stderr warning.
- Spawned `codeburn` CLI subprocesses are terminated on Task cancellation. Rapid period/provider tab clicks in the menubar were cancelling the Task but leaving the subprocess running to completion.

## UX

- Dashboard period switch flips to loading and clears `projects` synchronously before `reloadData` runs, eliminating the frame where the new period label rendered over the old period's numbers.
- Custom `--from`/`--to` ranges hide the period tab strip and disable 1-5 / arrow keys so a stray press doesn't silently abandon the user's explicit range. A "Custom range: X to Y" banner replaces the tab strip.

## Menubar / OAuth

- Both Claude and Codex bootstrap (Reconnect button) now honour the `usageBlockedUntil` 429 backoff. Spamming Reconnect during sustained rate-limits no longer hammers the upstream endpoint on every click.
- Codex `Retry-After` HTTP header is parsed (delta-seconds + IMF-fixdate fallback) so we don't over-back-off.
- Both credential cache files are written via `SafeFile.write` (`O_CREAT | O_EXCL | O_NOFOLLOW` with explicit `0600`): no race window at default umask, no symlink-follow at the destination. Reads route through `SafeFile.read` with a 64 KiB cap.

## CI signal

- `tsc --noEmit` is now zero errors. Six pre-existing errors in `src/providers/copilot.ts` came from a permissive catch-all branch in the discriminated union. Removed it; runtime safely falls through unknown event types via the existing `if/else` chain.

## Tests

16 new, 555 total, all passing.

- `date-range-filter`: month/day/year overflow rejection, leap-day correctness
- `currency-rounding`: `convertCost` no-rounding contract; `roundForActiveCurrency` for USD/JPY/KRW/EUR
- `providers/copilot`: malformed `toolRequests` does not abort the parse
- `providers/cursor-bubble-dedup`: re-parse after token mutation does not double-count
- `providers/codex`: first event with `cumulativeTotal=0` not dropped; consecutive zero-cumulative duplicates still deduped

## Validation

- `npm test`: 41 files, 555 tests, 0 failures
- `tsc --noEmit`: 0 errors
- `swift build`: clean

## Test plan

- `tsc --noEmit` clean
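To illustrate the `toolRequests` guard described under Data correctness, here is a hedged sketch of extracting tool names from a possibly corrupt event. The event and `ToolRequest` shapes and the `toolNames` helper are assumptions for the example; the real Copilot parser handles richer events.

```typescript
// Hypothetical shapes for illustration only.
interface ToolRequest {
  name: string;
}

function isToolRequest(value: unknown): value is ToolRequest {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

function toolNames(event: { toolRequests?: unknown }): string[] {
  const raw = event.toolRequests;
  // Without this guard, null or a string would throw on .map (or be
  // iterated character by character) and abort the surrounding parse loop.
  if (!Array.isArray(raw)) return [];
  return raw.filter(isToolRequest).map((request) => request.name);
}
```

A malformed event then contributes no tool calls, and the loop carries on to the events that follow it.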
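The per-token clamp can be sketched in a few lines. This version shares only its name and the `[0, 1]` range with the PR's `safePerTokenRate`; mapping non-numeric or non-finite input to `0` is an assumption made here, and the real helper may treat those cases differently.

```typescript
// Sketch under assumptions: non-numeric and non-finite inputs map to 0.
function safePerTokenRate(value: unknown): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return 0;
  // Clamp into [0, 1] so a tampered pricing file cannot inject
  // negative or absurdly large per-token costs.
  return Math.min(1, Math.max(0, value));
}
```

Realistic rates such as `0.000003` pass through unchanged, so the clamp only affects values no legitimate price list would contain.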
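The index-pointer change in the Cursor parser follows a familiar pattern: keep a head index and advance it, rather than removing the first element on every read. A minimal sketch, with `PointerQueue` as a hypothetical name not taken from the PR:

```typescript
// Hypothetical FIFO buffer that avoids Array.prototype.shift().
class PointerQueue<T> {
  private readonly items: T[] = [];
  private head = 0;

  push(item: T): void {
    this.items.push(item);
  }

  // Returns the oldest unread item by advancing the head index,
  // so no elements are moved on each read.
  shift(): T | undefined {
    if (this.head >= this.items.length) return undefined;
    const item = this.items[this.head];
    this.head += 1;
    return item;
  }

  get length(): number {
    return this.items.length - this.head;
  }
}
```

The trade-off is that consumed items stay in memory until the queue itself is discarded, which is usually acceptable for a buffer scoped to one conversation.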
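The menubar's `Retry-After` handling lives in Swift, but the two-format parse is easy to show in TypeScript. The sketch below tries delta-seconds first and falls back to an HTTP date. `parseRetryAfter` is an illustrative name, and the current time is passed in as a parameter so the result is deterministic.

```typescript
// Illustrative TypeScript sketch; the PR's actual parser is Swift.
// Returns the number of seconds to wait, or null if the header is unusable.
function parseRetryAfter(header: string, nowMs: number): number | null {
  const trimmed = header.trim();

  // Form 1: delta-seconds, a non-negative decimal integer.
  if (/^\d+$/.test(trimmed)) return Number(trimmed);

  // Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const retryAtMs = Date.parse(trimmed);
  if (Number.isNaN(retryAtMs)) return null;

  // Dates in the past mean "retry now", so never return a negative wait.
  return Math.max(0, Math.ceil((retryAtMs - nowMs) / 1000));
}
```

A caller would fall back to its own default backoff only when this returns `null`, which is how a shorter server-provided window can take precedence over a fixed delay.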