Skip to content

feat(email): implement automatic retry for email sending#28

Merged
Israeltheminer merged 3 commits into
mainfrom
enable-dlq
Dec 19, 2025
Merged

feat(email): implement automatic retry for email sending#28
Israeltheminer merged 3 commits into
mainfrom
enable-dlq

Conversation

@Israeltheminer

@Israeltheminer Israeltheminer commented Dec 19, 2025

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features

    • Automatic retry mechanism for email delivery with exponential backoff scheduling (up to 3 attempts) before marking messages as failed.
    • Enhanced tracking and visibility into delivery attempts and error details.
  • Improvements

    • Improved input validation and error handling in workflow processing.
    • Better state management for failed messages with detailed metadata.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai Bot commented Dec 19, 2025

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This PR introduces retry semantics for email messaging across the platform. Changes add an optional retryCount parameter to database schema, conversation message updates, and email-sending functions. Exponential backoff retry logic is implemented for both SMTP and API-based email sends with a maximum of 3 retries before marking messages as failed. Related updates include removing dynamic delay logic from message scheduling to use immediate dispatch (0 delay), enforcing input validation for timestamp calculations, and reorganizing exports in workflow processing records.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25–30 minutes

  • send_message_via_api.ts and send_message_via_smtp.ts: Both files implement similar retry mechanics with exponential backoff, retry scheduling, state mutations, and error branching logic. While the patterns repeat, each implementation contains dense logic with multiple state paths (retry vs. permanent failure) that warrant careful verification of timing, state consistency, and message metadata updates.
  • send_message_via_email.ts: Changes to scheduling behavior (removing dynamic delay, using immediate dispatch) and provider persistence logic require verification of impact on message flow and state consistency.
  • calculate_cutoff_timestamp.ts: New validation logic for negative backoffHours and DST-safe millisecond arithmetic should be verified for correctness and edge cases.

Possibly related PRs

  • refactor(workflow-processing): add atomic claim/lock mechanism and flatten helpers structure #23: Modifies calculateCutoffTimestamp with input validation and DST-safe arithmetic changes, directly related to the same function updated in this PR.
  • talecorp/poc2#172: Extends retry semantics by adding optional retryCount parameters to updateConversationMessageInternal, sendMessageViaSMTPInternal, and sendMessageViaAPIInternal—the foundational infrastructure that this PR builds upon.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 68a416a and b72d343.

📒 Files selected for processing (10)
  • services/platform/convex/conversations.ts (1 hunks)
  • services/platform/convex/email_providers.ts (4 hunks)
  • services/platform/convex/model/conversations/send_message_via_email.ts (4 hunks)
  • services/platform/convex/model/conversations/update_conversation_message.ts (2 hunks)
  • services/platform/convex/model/email_providers/send_message_via_api.ts (4 hunks)
  • services/platform/convex/model/email_providers/send_message_via_smtp.ts (4 hunks)
  • services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts (1 hunks)
  • services/platform/convex/model/workflow_processing_records/index.ts (1 hunks)
  • services/platform/convex/model/workflow_processing_records/is_record_processed.ts (0 hunks)
  • services/platform/convex/schema.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • services/platform/convex/model/workflow_processing_records/is_record_processed.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*

📄 CodeRabbit inference engine (.cursor/rules/workspace_rules.mdc)

Use English only for ALL user-facing content including UI components, labels, buttons, dialogs, forms, toast messages, error messages, success messages, comments, documentation, README files, variable names, function names, and type names

Files:

  • services/platform/convex/email_providers.ts
  • services/platform/convex/model/workflow_processing_records/index.ts
  • services/platform/convex/conversations.ts
  • services/platform/convex/model/conversations/update_conversation_message.ts
  • services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts
  • services/platform/convex/schema.ts
  • services/platform/convex/model/email_providers/send_message_via_api.ts
  • services/platform/convex/model/email_providers/send_message_via_smtp.ts
  • services/platform/convex/model/conversations/send_message_via_email.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/workspace_rules.mdc)

**/*.{ts,tsx,js,jsx}: Use Vercel AI SDK with OpenAI - import from 'ai' and '@ai-sdk/openai', never use raw OpenAI SDK or OpenRouter
Never hallucinate API keys - always use environment variables and existing .env configuration
Use camelCase for function names (e.g., getUserData)
Use SCREAMING_SNAKE_CASE for constants (e.g., API_BASE_URL, MAX_RETRIES)
Use feature flags with enums (TypeScript) or const objects (JavaScript) with UPPERCASE_WITH_UNDERSCORE naming
Implement error handling with try-catch pattern: check for result.error and display descriptive toast messages using result.error as title

Files:

  • services/platform/convex/email_providers.ts
  • services/platform/convex/model/workflow_processing_records/index.ts
  • services/platform/convex/conversations.ts
  • services/platform/convex/model/conversations/update_conversation_message.ts
  • services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts
  • services/platform/convex/schema.ts
  • services/platform/convex/model/email_providers/send_message_via_api.ts
  • services/platform/convex/model/email_providers/send_message_via_smtp.ts
  • services/platform/convex/model/conversations/send_message_via_email.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/workspace_rules.mdc)

**/*.{ts,tsx}: Use kebab-case for file names (e.g., user-profile.tsx)
Use PascalCase for component names (e.g., UserProfile)
Use descriptive messages as toast title (never generic 'Error'), with optional description for additional context only
Follow component structure: 'use client' directive, imports, interface Props, hooks, effects, event handlers, then render
Prioritize data fetching methods in order: Server Actions (preferred), Route Handlers (when needed), Client-side (minimal use)
Use React.memo for expensive components to optimize performance
Use Next.js Image component for all images instead of native img tags
Use dynamic imports for code splitting

Files:

  • services/platform/convex/email_providers.ts
  • services/platform/convex/model/workflow_processing_records/index.ts
  • services/platform/convex/conversations.ts
  • services/platform/convex/model/conversations/update_conversation_message.ts
  • services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts
  • services/platform/convex/schema.ts
  • services/platform/convex/model/email_providers/send_message_via_api.ts
  • services/platform/convex/model/email_providers/send_message_via_smtp.ts
  • services/platform/convex/model/conversations/send_message_via_email.ts
services/*/convex/*.ts

📄 CodeRabbit inference engine (.cursor/rules/workspace_rules.mdc)

Thin wrapper API modules (like services/platform/convex/documents.ts) may export multiple Convex functions as wrappers that delegate to model helpers, but must not contain business logic and must only perform argument/return validation and delegation

Files:

  • services/platform/convex/email_providers.ts
  • services/platform/convex/conversations.ts
  • services/platform/convex/schema.ts
🧠 Learnings (9)
📚 Learning: 2025-12-15T14:01:55.275Z
Learnt from: larryro
Repo: tale-project/tale PR: 18
File: services/platform/convex/workflow/actions/conversation/helpers/update_conversations.ts:7-10
Timestamp: 2025-12-15T14:01:55.275Z
Learning: In Convex action helpers (services/platform/convex/workflow/actions/**/helpers/*.ts), using Record<string, unknown> for update parameters is acceptable when field validation is handled at the mutation level in Convex. This provides flexibility for dynamic field updates while keeping validation centralized at the mutation layer.

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-12-02T08:13:51.424Z
Learnt from: CR
Repo: tale-project/tale PR: 0
File: .cursor/rules/workspace_rules.mdc:0-0
Timestamp: 2025-12-02T08:13:51.424Z
Learning: Applies to services/*/convex/*.ts : Thin wrapper API modules (like `services/platform/convex/documents.ts`) may export multiple Convex functions as wrappers that delegate to model helpers, but must not contain business logic and must only perform argument/return validation and delegation

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-12-02T08:13:24.290Z
Learnt from: CR
Repo: tale-project/tale PR: 0
File: .cursor/rules/convex_rules.mdc:0-0
Timestamp: 2025-12-02T08:13:24.290Z
Learning: Applies to convex/**/convex/*.{ts,tsx} : Thin wrapper API modules (e.g., `services/platform/convex/documents.ts`) may export multiple Convex functions as thin wrappers around model-layer helpers. These files must not contain business logic, should only validate args/returns and delegate to the model layer, must use snake_case for file names and camelCase for each exported function.

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-12-02T08:13:24.290Z
Learnt from: CR
Repo: tale-project/tale PR: 0
File: .cursor/rules/convex_rules.mdc:0-0
Timestamp: 2025-12-02T08:13:24.290Z
Learning: Applies to convex/**/*.{ts,tsx} : When touching existing modules that export multiple functions, split them into separate files so each file exports exactly one function matching the file name, unless the module is an explicitly designated thin wrapper file.

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-12-02T08:13:51.424Z
Learnt from: CR
Repo: tale-project/tale PR: 0
File: .cursor/rules/workspace_rules.mdc:0-0
Timestamp: 2025-12-02T08:13:51.424Z
Learning: Applies to convex/**/*.ts : Split existing Convex modules that export multiple functions into separate files, each exporting exactly one function, except for thin wrapper modules

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-11-30T12:29:39.745Z
Learnt from: CR
Repo: tale-project/poc2 PR: 0
File: .cursor/rules/workspace_rules.mdc:0-0
Timestamp: 2025-11-30T12:29:39.745Z
Learning: Applies to services/**/convex/*.ts : Thin wrapper API modules in services may export multiple Convex functions as thin wrappers that delegate to model helpers, must use snake_case file names and camelCase export names, and must not contain business logic

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-11-30T03:53:00.316Z
Learnt from: CR
Repo: tale-project/poc2 PR: 0
File: .cursor/rules/convex_rules.mdc:0-0
Timestamp: 2025-11-30T03:53:00.316Z
Learning: Applies to convex/**/!(model)/*.ts : Exception: Thin wrapper API modules (e.g., `services/platform/convex/documents.ts`) may export multiple Convex functions if they only validate args/returns and delegate to the model layer; must use snake_case for file names and camelCase for function names

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2025-11-30T12:29:39.745Z
Learnt from: CR
Repo: tale-project/poc2 PR: 0
File: .cursor/rules/workspace_rules.mdc:0-0
Timestamp: 2025-11-30T12:29:39.745Z
Learning: Applies to convex/*.ts : Split existing Convex modules that export multiple functions into separate files with one function per file, except for explicitly designated thin wrapper modules

Applied to files:

  • services/platform/convex/model/workflow_processing_records/index.ts
📚 Learning: 2024-10-12T03:10:42.147Z
Learnt from: larryro
Repo: talecorp/tale PR: 1314
File: features/platform/nest/src/queue/at-most-once-email-notification-queue.ts:9-36
Timestamp: 2024-10-12T03:10:42.147Z
Learning: In the `features/platform/nest/src/queue/at-most-once-email-notification-queue.ts` file, it's important to maintain the relationship between `NotificationType` and its corresponding data in `AtMostOnceEmailNotificationQueueDataType`. Refactoring to a generic approach that loses this relationship is undesirable.

Applied to files:

  • services/platform/convex/model/conversations/send_message_via_email.ts
🧬 Code graph analysis (1)
services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts (2)
services/platform/convex/model/workflow_processing_records/index.ts (1)
  • BACKOFF_NEVER_REPROCESS (9-9)
services/platform/convex/model/workflow_processing_records/constants.ts (1)
  • BACKOFF_NEVER_REPROCESS (20-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build Docker Image
🔇 Additional comments (21)
services/platform/convex/conversations.ts (1)

375-397: LGTM!

The addition of the optional retryCount parameter to updateConversationMessageInternal is consistent with the existing wrapper pattern and correctly delegates to the model layer.

services/platform/convex/schema.ts (1)

452-462: LGTM!

The retryCount field is appropriately added as optional to track retry attempts. The placement near deliveryState is logical, and the descriptive comment aids clarity.

services/platform/convex/model/conversations/update_conversation_message.ts (2)

17-17: LGTM!

The retryCount parameter addition follows the established pattern for optional fields in this function.


34-34: LGTM!

The conditional assignment to updateData correctly mirrors the handling of other optional fields.

services/platform/convex/model/workflow_processing_records/calculate_cutoff_timestamp.ts (2)

14-28: LGTM! Input validation is thorough.

The validation correctly handles edge cases:

  • Rejects non-finite numbers (NaN, Infinity)
  • Rejects negative values except the sentinel BACKOFF_NEVER_REPROCESS

Note: The check backoffHours !== BACKOFF_NEVER_REPROCESS in the first condition (line 16) is technically redundant since -1 is finite, but it provides clear defensive coding.


35-36: LGTM! DST-safe calculation.

Using millisecond arithmetic (Date.now() - backoffHours * 60 * 60 * 1000) avoids DST issues that can occur with Date.setHours().

services/platform/convex/model/conversations/send_message_via_email.ts (4)

1-8: LGTM! Documentation accurately describes the retry behavior.

The updated comments clearly document the retry semantics: immediate dispatch, exponential backoff (30s, 60s, 120s), and failure after 3 retries.


120-143: LGTM! API path correctly configured for retry.

The API send path now:

  • Dispatches immediately (delay: 0)
  • Includes retryCount: 0 for the initial attempt
  • Passes providerId explicitly

145-168: LGTM! SMTP path mirrors API path.

The SMTP send path is consistently updated with immediate dispatch, retryCount: 0, and explicit providerId.


170-176: Good addition: Provider persistence on conversation.

Updating the conversation's providerId when not already set ensures future replies use the correct provider.

services/platform/convex/email_providers.ts (2)

713-743: LGTM! SMTP internal action updated with retry support.

The sendMessageViaSMTPInternal action:

  • Adds optional retryCount parameter for tracking attempts
  • Documentation clearly describes the retry behavior

745-775: LGTM! API internal action mirrors SMTP changes.

The sendMessageViaAPIInternal action is consistently updated with retryCount support and matching documentation.

services/platform/convex/model/email_providers/send_message_via_smtp.ts (4)

24-26: LGTM! Constants follow naming conventions.

MAX_RETRIES and BACKOFF_DELAYS_MS correctly use SCREAMING_SNAKE_CASE as per coding guidelines.


250-273: LGTM! Retry scheduling with exponential backoff.

The implementation correctly:

  • Schedules retry with calculated delay from BACKOFF_DELAYS_MS
  • Passes incremented retryCount to the next attempt
  • Returns { success: false, messageId: '' } to avoid scheduler errors

276-298: LGTM! Permanent failure handling.

After exhausting retries:

  • Logs the permanent failure with total attempts
  • Updates message to failed state with error metadata
  • Returns failure response without throwing

237-248: The metadata merge implementation correctly preserves existing fields while allowing new metadata to overwrite specific keys. Lines 35-43 in update_conversation_message.ts show the spread operator pattern (...existingMetadata, ...args.metadata), which ensures that fields like lastError and nextRetryAt are updated as needed for retry tracking while other metadata fields remain intact.

services/platform/convex/model/workflow_processing_records/index.ts (1)

47-48: Re-exports are valid. The is_record_processed.ts file exists and correctly exports both isRecordProcessed (function) and IsRecordProcessedArgs (interface), confirming the refactored exports in the index file are properly sourced.

services/platform/convex/model/email_providers/send_message_via_api.ts (4)

21-23: LGTM!

Retry constants are well-defined with clear naming following SCREAMING_SNAKE_CASE convention. The backoff delays (30s, 60s, 120s) align with the documentation.


42-45: LGTM!

Clean handling of the optional retryCount parameter with proper default initialization using nullish coalescing.


289-309: LGTM!

The retry scheduling correctly forwards all arguments with the incremented retryCount. Using ctx.scheduler.runAfter ensures the retry is durable and will execute even if the current action completes.


276-286: The metadata handling in lines 276-286 correctly preserves existing fields. The updateConversationMessageInternal mutation implementation in services/platform/convex/model/conversations/update_conversation_message.ts explicitly merges metadata using the spread operator ({ ...existingMetadata, ...(args.metadata) }) before calling db.patch(). No changes needed.

Likely an incorrect or invalid review comment.

@Israeltheminer Israeltheminer merged commit a009c38 into main Dec 19, 2025
2 checks passed
@Israeltheminer Israeltheminer deleted the enable-dlq branch December 19, 2025 14:03
larryro added a commit that referenced this pull request May 17, 2026
…— validator tightening

Closes round-5 findings #27, #28, #34, #35, #36.

- `tts/queries.ts` getMessageChunks return validator narrows `format` and
  `error` from `v.optional(v.string())` to the closed unions built from
  `audioFormatLiterals` and `ttsErrorCodeLiterals`. The schema's writer
  validator already uses those unions; the query was the only seam where
  a future drift could fan out unnoticed.
- `tts/queries.ts` getVoiceModeEffective now falls back to a prefix-only
  `userPreferences` lookup when the thread has no `organizationId`
  (legacy / edge rows). A user who toggled voice ON globally previously
  got silently-off voice on those threads.
- `lib/shared/schemas/providers.ts` `defaultVoice` and `voicesByLocale`
  values now reject all-whitespace strings (`.regex(/\S/)`) so `'   '`
  no longer slips through `.min(1)` and surfaces as UNKNOWN_VOICE at
  synth time.
- `lib/shared/schemas/providers.ts` locale-regex docs explicitly note
  the narrow BCP-47 subset (ISO-639-1 + optional ISO-3166-1 alpha-2);
  script subtags (`zh-Hans`), 3-letter codes (`fil`), and UN region
  codes (`en-419`) are intentionally out of scope. Adds a follow-up
  pointer in the comment so future widening is a deliberate, lockstep
  change with the resolver.
- `lib/shared/schemas/providers.ts` superRefine now uses the `forEach`
  index instead of `data.models.indexOf(model)` (O(n²) → O(n)) and
  points the error `path` at the actually-missing field
  (`voicesByLocale` when the operator only typed an empty map, else
  `defaultVoice`), so the operator's editor jumps to the right line.
larryro added a commit that referenced this pull request May 17, 2026
…— validator tightening

Closes round-5 findings #27, #28, #34, #35, #36.

- `tts/queries.ts` getMessageChunks return validator narrows `format` and
  `error` from `v.optional(v.string())` to the closed unions built from
  `audioFormatLiterals` and `ttsErrorCodeLiterals`. The schema's writer
  validator already uses those unions; the query was the only seam where
  a future drift could fan out unnoticed.
- `tts/queries.ts` getVoiceModeEffective now falls back to a prefix-only
  `userPreferences` lookup when the thread has no `organizationId`
  (legacy / edge rows). A user who toggled voice ON globally previously
  got silently-off voice on those threads.
- `lib/shared/schemas/providers.ts` `defaultVoice` and `voicesByLocale`
  values now reject all-whitespace strings (`.regex(/\S/)`) so `'   '`
  no longer slips through `.min(1)` and surfaces as UNKNOWN_VOICE at
  synth time.
- `lib/shared/schemas/providers.ts` locale-regex docs explicitly note
  the narrow BCP-47 subset (ISO-639-1 + optional ISO-3166-1 alpha-2);
  script subtags (`zh-Hans`), 3-letter codes (`fil`), and UN region
  codes (`en-419`) are intentionally out of scope. Adds a follow-up
  pointer in the comment so future widening is a deliberate, lockstep
  change with the resolver.
- `lib/shared/schemas/providers.ts` superRefine now uses the `forEach`
  index instead of `data.models.indexOf(model)` (O(n²) → O(n)) and
  points the error `path` at the actually-missing field
  (`voicesByLocale` when the operator only typed an empty map, else
  `defaultVoice`), so the operator's editor jumps to the right line.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant