Improve episodic memory ranking with reinforcement and decay by coe0718 · Pull Request #2 · ghostwright/phantom

coe0718 · 2026-03-31T00:47:18Z

What Changed

This PR improves episodic memory retrieval without adding a second memory pipeline.

adds a small deterministic ranking helper for episodic memories
blends semantic match with importance, reinforcement from repeated access, and decay over time
increments episodic access_count on recall so reinforcement reflects actual usage
filters stale, low-signal episodic memories before they are injected into prompt context
adds focused tests for ranking behavior and context filtering
documents the updated episodic-memory behavior

Why

This is a first focused step toward the memory strategy discussed in #1.

Phantom already has good capture and search primitives, but episodic retrieval was mostly recency-biased and prompt injection could still include old one-off memories with little durable value. This change keeps TypeScript in the plumbing lane: ranking, decay, retrieval, and prompt shaping. It does not move semantic reasoning out of the Agent SDK or judges.

How I Tested

Commands run:

/home/coemedia/.bun/bin/bun test src/memory/__tests__/ranking.test.ts
/home/coemedia/.bun/bin/bun test src/memory/__tests__/episodic.test.ts
/home/coemedia/.bun/bin/bun test src/memory/__tests__/context-builder.test.ts
bash -lc 'export PATH="$HOME/.bun/bin:$PATH" && /home/coemedia/.bun/bin/bun test'
/home/coemedia/.bun/bin/bun run typecheck
./node_modules/.bin/biome check src/memory/ranking.ts src/memory/episodic.ts src/memory/context-builder.ts src/memory/__tests__/ranking.test.ts src/memory/__tests__/episodic.test.ts src/memory/__tests__/context-builder.test.ts

Notes:

Full test suite passes once Bun is on PATH for subprocess-spawn tests.
Full repo lint currently reports pre-existing issues in src/mcp/__tests__/dynamic-handlers.test.ts, unrelated to this PR.

Checklist

Tests pass (bun test)
Lint passes (bun run lint)
Typecheck passes (bun run typecheck)
No secrets or .env files included
Files stay under 300 lines
No Cardinal Rule violations (TypeScript does plumbing only, the Agent SDK does reasoning)
No default exports or barrel files added

mcheemaa

This is a really strong first contribution. Thank you for spending time understanding the codebase before jumping into code - the issue discussion in #1, the phased approach, and the attention to the Cardinal Rule all show a deep read of the project.

A few things I appreciated:

The extraction to ranking.ts is the right refactor. The episodic store was accumulating scoring logic that belongs in a pure-function module. Clean separation.
Catching the updateAccessCounts bug (private method not incrementing access_count) is a sharp find. Reinforcement would never have worked from the recall path without this fix.
The math is solid. Exponential decay with configurable half-lives, log-saturating reinforcement, sensible bypass rules for high-importance episodes.
The context filter is exactly the right scope - a single filter() call that keeps stale low-signal episodes from burning prompt tokens.

Two optional suggestions (neither blocks merge):

In calculateEpisodeContextScore, the weightedAverage call passes a dummy zero for the third argument since the function takes 3 inputs but you only need 2. A brief comment or a 2-arg helper would help readability.
The hoursSince IIFE has a redundant Number.isNaN guard - Date.parse already returns NaN for invalid strings and the Number.isFinite check handles it.

Verified locally (789 tests pass, typecheck and lint clean) and deployed to a test VM. Memory recall is working correctly with the new ranking in place.

Approving as-is. Welcome to Phantom.

coe0718 · 2026-03-31T10:13:04Z

I’ll look into your suggestions and add another PR for it at some point unless someone else gets it first. Sent from my iPhoneOn Mar 31, 2026, at 12:52 AM, Muhammad Ahmed Cheema ***@***.***> wrote: Merged #2 into main. —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: ***@***.***>

…ement-decay Improve episodic memory ranking with reinforcement and decay

Complete security review and implementation of fixes for Nextcloud Talk integration based on comprehensive security audit findings. HIGH PRIORITY fixes (security-critical): - ghostwright#1: Implement replay attack protection with LRU cache (5-minute TTL) - ghostwright#2: Add 64KB request size limit before body buffering - ghostwright#4: Replace Date.now() with crypto.randomUUID() for unique IDs - ghostwright#7: Fix JSON unwrap logic for ActivityStreams Note objects - ghostwright#11: Replace 'Error:' text sniffing with runtime error events Logic and security fixes: - ghostwright#3: Fix msgId/msg name collision in error handling - ghostwright#5: Improve parseConversationId to handle colons in tokens - ghostwright#6: Reject webhooks without target.id instead of silent fallback - ghostwright#8: Normalize emoji to avoid variation selector validation issues - ghostwright#9: Handle 404/409 reaction responses as success conditions - ghostwright#10: Make setReaction return boolean for proper error handling - ghostwright#12: Improve bot loop guard with actorId checking Best practices and polish: - ghostwright#13: Make port configurable instead of hardcoded 3200 - ghostwright#14: Move webhookPath default normalization to constructor - ghostwright#15: Fix health check path precedence (check webhook first) - ghostwright#16: Add exponential backoff retry for 5xx/429 responses - ghostwright#17: Add URL validation and encoding for talkServer config - ghostwright#18: Document HMAC signing asymmetry (inbound vs outbound) - ghostwright#20: Import randomUUID explicitly from node:crypto - ghostwright#21: Add reactions: true to channel capabilities - ghostwright#22: Namespace environment variables with NEXTCLOUD_ prefix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…k adapter Implements full test suite for nextcloud.ts addressing all critical areas identified in the nextcloud-talk-review document. 943 lines of tests covering security, functionality, and edge cases. Test coverage by category: 1. Signature verification (Fix ghostwright#1, ghostwright#18) - Security Critical - Valid HMAC signature acceptance - Invalid HMAC signature rejection - Replay attack protection via nonce cache - Nonce cache size limits (1000 entries, FIFO eviction) - Nonce expiration and periodic pruning (5-minute TTL) - Asymmetric signing (inbound: random+body, outbound: random+content) 2. Request size limits (Fix ghostwright#2) - Security Critical - Content-Length validation before buffering - Double-check after reading (missing Content-Length) - 64 KB limit enforcement (Nextcloud caps at 32k chars) 3. JSON unwrapping (Fix ghostwright#7) - Functionality Critical - ActivityStreams Note objects unwrap correctly - Plain text passes through unchanged - Literal JSON-like text not corrupted (only Note type unwraps) - Invalid JSON fallback to plain text 4. parseConversationId (Fix ghostwright#5) - Correctness Critical - Valid conversationId format parsing - Missing prefix returns null - Tokens containing colons handled correctly (indexOf+slice) - Thread-scoped ID to room token extraction 5. Bot loop guard (Fix ghostwright#12) - Multi-Bot Safety - Application actor filtering (actorType === "Application") - Self-filtering (actorId === config.botId) - Person messages processed normally - Multi-bot room scenarios 6. Retry and backoff (Fix ghostwright#16) - Resilience - 429 rate limiting with Retry-After header - 5xx server errors with exponential backoff + jitter - Network error retry logic - Non-retryable 4xx handling 7. Reaction error handling (Fix ghostwright#9) - 404 on remove treated as success - 409 on add treated as success - 5xx retry for reaction operations 8. URL validation and encoding (Fix ghostwright#17) - talkServer scheme removal (http://, https://) - Trailing slash removal - URL-encoding of roomToken and messageId 9. Target validation (Fix ghostwright#6) - Missing target.id rejection (no silent fallback) 10. Emoji normalization (Fix ghostwright#8) - Variation selector removal (U+26A0 vs U+26A0 U+FE0F) 11. Unique message IDs (Fix ghostwright#4) - crypto.randomUUID() vs Date.now() - Uniqueness across concurrent calls 12. Config normalization (Fix ghostwright#13, ghostwright#14) - webhookPath default in constructor - Configurable port - Session window configuration 13. Health check (Fix ghostwright#15) - Path precedence (webhook before health) 14. Message ID extraction - Numeric and string ID handling - Missing ID handling 15. Time-window session coalescing - Recent session continuation - New session creation - Parent message ID handling 16. Capabilities declaration (Fix ghostwright#21) - reactions: true declared All tests use bun:test with mocked dependencies and follow existing patterns from webhook.test.ts, slack.test.ts, and email.test.ts. Related: nextcloud-talk-review.md Issue ghostwright#19

…rnal refs (B.1.4 fix-and-review #2) Section A: env-only gate too strict on the agent_ready and first_dm_sent heartbeats. SLACK_TRANSPORT === "http" is the actual signal that the agent is in an HTTP-transport deployment with a host metadata gateway listening; METADATA_BASE_URL alone misses the case where the channel factory falls back to the link-local default. Fixed both call sites in src/index.ts and src/channels/slack-introduction.ts; both now also fall back to DEFAULT_METADATA_BASE_URL so an unset env is not a gating failure. Section B: heartbeat.test.ts refactored away from the mock(...) as unknown as FetchImpl cast pattern. The wrapper carries internal reset state that has shifted across Bun minor versions and broke a single sibling test on Bun 1.3.13 (passes locally on 1.3.5). Replaced makeFetchOk / makeFetchStatus / makeFetchThrows with plain async functions typed directly as FetchImpl. No mock() involvement, no cast, no per-runtime state. Should pass on every Bun release. Section D: dashboard URL is now parameterized. composeIntroductionText accepts an optional dashboardUrl, dropping the "manage me" line entirely when unset. sendIntroductionDm reads PHANTOM_DASHBOARD_URL via a resolveDashboardUrl helper that validates with new URL(...) and drops a malformed value with a warning rather than injecting unparseable text into the user-facing DM. Caller in phantom-rootfs sets the env var so cloud deployments get the cloud URL; self-hosters with no env var see a shorter, valid DM. Section D (continued): scrubbed internal milestone references from every comment new in the diff. "Phase B.1.4" prefixes dropped, "the canary user" rephrased to "the user", "phantomd" replaced with "the host metadata gateway" or "the host gateway", "architect plan section 12 (option A)" replaced with an inline description of the contract. The diff against main now matches zero internal-data leak patterns under the orchestrator's grep. Tests: - New slack-introduction tests assert the SLACK_TRANSPORT-based gate: unset / socket = no heartbeat, http = heartbeat with default URL when METADATA_BASE_URL is unset, http = heartbeat with set URL when both are set. Plus three new dashboard-URL tests covering valid URL, unset, and malformed. - New heartbeat tests exercise the same FetchImpl signature without mock(); existing assertions intact. Gates: bun lint clean, bun typecheck clean, 1957 / 1957 tests pass.

Improve episodic memory ranking with reinforcement and decay

2446a3c

mcheemaa mentioned this pull request Mar 31, 2026

[codex] Fix dynamic handler test env cleanup #5

Closed

Merge branch 'main' into memory-ranking-reinforcement-decay

beb9041

mcheemaa self-requested a review March 31, 2026 04:43

mcheemaa approved these changes Mar 31, 2026

View reviewed changes

mcheemaa merged commit bbf89ef into ghostwright:main Mar 31, 2026
1 check passed

This was referenced Mar 31, 2026

Improve semantic memory reinforcement and contradiction recall #3

Open

Add durable startup memory context #4

Open

Ari4ka approved these changes Mar 31, 2026

View reviewed changes

coe0718 mentioned this pull request Mar 31, 2026

[codex] Follow up ranking readability suggestions #15

Merged

mutusfa pushed a commit to mutusfa/phantom that referenced this pull request Apr 14, 2026

Merge pull request ghostwright#2 from coe0718/memory-ranking-reinforc…

4f29385

…ement-decay Improve episodic memory ranking with reinforcement and decay

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve episodic memory ranking with reinforcement and decay#2

Improve episodic memory ranking with reinforcement and decay#2
mcheemaa merged 2 commits intoghostwright:mainfrom
coe0718:memory-ranking-reinforcement-decay

coe0718 commented Mar 31, 2026 •

edited by mcheemaa

Loading

Uh oh!

mcheemaa left a comment

Uh oh!

Uh oh!

coe0718 commented Mar 31, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

coe0718 commented Mar 31, 2026 • edited by mcheemaa Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What Changed

Why

How I Tested

Checklist

Uh oh!

mcheemaa left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coe0718 commented Mar 31, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

coe0718 commented Mar 31, 2026 •

edited by mcheemaa

Loading