add a document table to persist all the uploaded document names and sizes by alvinsjoy · Pull Request #9 · PointerRAG/Project

alvinsjoy · 2026-04-28T14:57:29Z

No description provided.

coderabbitai · 2026-04-28T14:57:44Z

Warning

Rate limit exceeded

@alvinsjoy has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 42 minutes and 26 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 425181c1-49e9-47a0-ba18-9da00dccab37

📥 Commits

Reviewing files that changed from the base of the PR and between 8892ef6 and 267abff.

📒 Files selected for processing (3)

app/api/upload/route.ts
backend/services/vector_service.py
components/chat-area.tsx

📝 Walkthrough

Walkthrough

Replaces Chat.documentCount with a Document model across backend, API, database, data layer, actions, and UI; ingestion now returns storage_key and file_size, uploads persist Document records (id from storage_key, size as BigInt), and deletions remove Document rows and vector chunks by storage_key.

Changes

Cohort / File(s)	Summary
API Upload Handler `app/api/upload/route.ts`	POST upload now reads `storage_key` and `file_size` from ingestion response and creates a `Document` record (id from `storage_key`, size -> BigInt) instead of incrementing `chat.documentCount`.
Backend Ingestion `backend/api/v1/ingestion_routes.py`, `backend/schemas/ingestion.py`, `backend/services/ingestion_service.py`	Ingestion route generates `storage_key`; service returns `storage_key` and `file_size` (bytes); chunk IDs/metadata use `storage_key` as `source_id`.
Vector/Delete Flow `backend/api/v1/vector_routes.py`, `backend/services/vector_service.py`	Delete endpoints and service switched from filename-based deletion to `storage_key`-based deletion (filter on `source_id`).
Data Layer & Types `lib/data/chat.ts`, `lib/types.ts`	Adds exported `Document` type and `Chat.documents: Document[]`; queries now include related `documents` and derive `documentCount` from Prisma `_count.documents`.
Actions / Deletion `lib/actions/chat.ts`	`deleteDocumentAction` signature changed to accept `documentId` (storage_key); backend DELETE uses `storage_key`; after backend success, deletes `Document` row by primary key instead of decrementing Chat counter.
Frontend UI `components/chat-area.tsx`, `app/chat/loading.tsx`	Chat-area seeds documents from `currentChat.documents`, formats file size, removes document-count subtitle, compacts upload/document chip UI; loading header skeleton made more compact.
Database Schema & Migrations `prisma/schema.prisma`, `prisma/migrations/...add_document_model/migration.sql`, `prisma/migrations/...change_size_to_bigint/migration.sql`, `prisma/migrations/...remove_document_filename/migration.sql`	Removes `Chat.documentCount`, adds `Document` model (name, size as BIGINT, createdAt, chatId FK with cascade), migrates size -> BIGINT, and removes deprecated `filename` column.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant AppAPI as App API (Next.js route)
  participant Python as Ingestion Backend
  participant DB as Prisma/Database
  participant Vector as Vector Service

  Client->>AppAPI: POST /api/upload (file, chatId, auth)
  AppAPI->>Python: POST /ingest (stream file, chat_id, generated storage_key)
  Python-->>AppAPI: 200 { chat_id, filename, storage_key, file_size, ... }
  AppAPI->>DB: prisma.document.create({ id: storage_key, name: filename, size: BigInt(file_size), chatId })
  AppAPI->>Vector: (optional) notify vector or rely on Python to upsert chunks using storage_key
  AppAPI-->>Client: 200 OK (document persisted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

use streaming API route to driectly upload to backend without nextjs server loads, use rich color toasts #8: Modified the app/api/upload/route.ts POST handler for document tracking; this PR replaces counter incrementing with Document record creation using backend-provided metadata.
Dev #3: Related changes to upload/document persistence flow and frontend actions; both PRs touch upload, persistence, and document-state handling.

Poem

🐰
Hop, I carried keys of storage bright,
From count to catalog, each file takes flight.
Chunks know their source, sizes tallied true,
I nibble bugs and leave the code anew. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	❓ Inconclusive	No description was provided by the author, making it impossible to assess relevance to the changeset.	Add a pull request description explaining the motivation, implementation details, and any breaking changes or migration notes.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change—introducing a Document table to persist uploaded document metadata (names and sizes).
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/schemas/ingestion.py (1)

23-31: ⚠️ Potential issue | 🟡 Minor

Update the example to include the new file_size field.

The json_schema_extra example is missing the newly added file_size field. This could cause confusion for API consumers reviewing the generated OpenAPI documentation.

📝 Proposed fix

     class Config:
         json_schema_extra = {
             "example": {
                 "chat_id": "uuid-123",
                 "filename": "architecture_guide.pdf",
+                "file_size": 1048576,
                 "chunks_created": 15,
                 "document_ids": ["arch_guide_chunk_0", "arch_guide_chunk_1"],
                 "pages_processed": 5
             }
         }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@backend/schemas/ingestion.py` around lines 23 - 31, The json_schema_extra
example in the ingestion model is missing the new file_size field; update the
example object inside json_schema_extra (the "example" dict) to include a
file_size key with a realistic numeric value (e.g., file_size: 102400) so the
generated OpenAPI docs show the new property for consumers.

🧹 Nitpick comments (1)

prisma/migrations/20260428143008_add_document_model/migration.sql (1)

88-88: Add uniqueness for document identity per chat.

Without a composite unique key, retries can create duplicate document rows for the same (chatId, filename), which hurts idempotency and document counts.

Index addition

 -- CreateIndex
 CREATE INDEX "Document_chatId_idx" ON "Document"("chatId");
+
+-- CreateIndex
+CREATE UNIQUE INDEX "Document_chatId_filename_key" ON "Document"("chatId", "filename");

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@prisma/migrations/20260428143008_add_document_model/migration.sql` at line
88, The migration currently creates a non-unique index ("Document_chatId_idx")
on Document(chatId) but needs a composite uniqueness constraint to prevent
duplicate documents per chat; modify the migration to add a composite unique
index or UNIQUE constraint on (chatId, filename) for the Document table (e.g.,
create a UNIQUE INDEX or ALTER TABLE ADD CONSTRAINT unique on chatId+filename)
so (chatId, filename) is enforced unique and idempotent — reference the Document
table and the chatId and filename columns when applying the change and
update/remove the existing non-unique index name accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/chat-area.tsx`:
- Around line 57-65: The documents state (documents/useState) is only
initialized once and isn't reset when currentChat changes; update the effect
that currently resets localMessages (the useEffect around localMessages reset)
to also reset documents by calling setDocuments with currentChat?.documents
mapped into UploadedDocument shape (id, name, size via formatFileSize, filename,
uploading:false) whenever currentChat changes so documents reflect the newly
selected chat.

In `@prisma/migrations/20260428143008_add_document_model/migration.sql`:
- Around line 96-97: The migration adds the FK constraint ALTER TABLE "Chat" ADD
CONSTRAINT "Chat_userId_fkey" ... referencing "user", but existing Chat rows
have userId='guest' so the FK will fail; before adding this FK either insert a
corresponding 'guest' row into the newly created "user" table or backfill Chat
rows with a valid user id. Modify the migration to (1) after creating the "user"
table but before the ALTER TABLE, INSERT a 'guest' user record with the same id
used by Chat, or (2) run an UPDATE on the "Chat" table to set userId to an
existing user id for orphaned rows, then add the ALTER TABLE statement. Ensure
the insert/backfill runs in the same migration and precedes the ALTER TABLE
"Chat" ADD CONSTRAINT "Chat_userId_fkey" FOREIGN KEY ("userId") ... statement.
- Line 74: The migration uses "size" INTEGER which can overflow for large files;
update the migration to alter the column type from INTEGER to BIGINT (change
"size" INTEGER NOT NULL to "size" BIGINT NOT NULL) and ensure any related DDL
(constraints/defaults) that reference "size" are preserved; also update the
corresponding Prisma model field (the Document model's size attribute) to
BigInt/BigInt? type so the ORM and DB types stay in sync.

---

Outside diff comments:
In `@backend/schemas/ingestion.py`:
- Around line 23-31: The json_schema_extra example in the ingestion model is
missing the new file_size field; update the example object inside
json_schema_extra (the "example" dict) to include a file_size key with a
realistic numeric value (e.g., file_size: 102400) so the generated OpenAPI docs
show the new property for consumers.

---

Nitpick comments:
In `@prisma/migrations/20260428143008_add_document_model/migration.sql`:
- Line 88: The migration currently creates a non-unique index
("Document_chatId_idx") on Document(chatId) but needs a composite uniqueness
constraint to prevent duplicate documents per chat; modify the migration to add
a composite unique index or UNIQUE constraint on (chatId, filename) for the
Document table (e.g., create a UNIQUE INDEX or ALTER TABLE ADD CONSTRAINT unique
on chatId+filename) so (chatId, filename) is enforced unique and idempotent —
reference the Document table and the chatId and filename columns when applying
the change and update/remove the existing non-unique index name accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f96d3e3c-338f-4660-9bdb-57c9c39e4391

📥 Commits

Reviewing files that changed from the base of the PR and between 6061c86 and eba8ee7.

📒 Files selected for processing (9)

app/api/upload/route.ts
backend/schemas/ingestion.py
backend/services/ingestion_service.py
components/chat-area.tsx
lib/actions/chat.ts
lib/data/chat.ts
lib/types.ts
prisma/migrations/20260428143008_add_document_model/migration.sql
prisma/schema.prisma

coderabbitai · 2026-04-28T15:02:05Z

+    "id" TEXT NOT NULL,
+    "name" TEXT NOT NULL,
+    "filename" TEXT NOT NULL,
+    "size" INTEGER NOT NULL,


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat prisma/migrations/20260428143008_add_document_model/migration.sql

Repository: PointerRAG/Project

Length of output: 2876

🏁 Script executed:

# Check the schema file to understand size field context and consistency find . -name "schema.prisma" -o -name "*schema*" -type f | head -20

Repository: PointerRAG/Project

Length of output: 116

🏁 Script executed:

cat prisma/schema.prisma

Repository: PointerRAG/Project

Length of output: 2532

🏁 Script executed:

# Check if there are any file size limits or constraints defined in the codebase rg -i "file.*size|max.*upload|size.*limit" --type ts --type js --type tsx --type jsx | head -30

Repository: PointerRAG/Project

Length of output: 90

🏁 Script executed:

# Search for file size constraints without specifying file types rg -i "file.*size|max.*upload|size.*limit|INTEGER|BIGINT" prisma/ src/ app/ --max-count 20

Repository: PointerRAG/Project

Length of output: 506

🏁 Script executed:

cat app/api/upload/route.ts

Repository: PointerRAG/Project

Length of output: 3386

🏁 Script executed:

# Search for upload size limits or constraints rg -i "limit|max|size" app/api/upload/ --max-count 15

Repository: PointerRAG/Project

Length of output: 175

Change size field to BIGINT to prevent overflow.

PostgreSQL INTEGER has a maximum value of ~2.14 GB, which is insufficient for document storage. Files larger than this limit would cause integer overflow and data corruption. BIGINT safely handles arbitrarily large file sizes.

Type adjustment

- "size" INTEGER NOT NULL, + "size" BIGINT NOT NULL,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"size" INTEGER NOT NULL,

"size" BIGINT NOT NULL,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@prisma/migrations/20260428143008_add_document_model/migration.sql` at line 74, The migration uses "size" INTEGER which can overflow for large files; update the migration to alter the column type from INTEGER to BIGINT (change "size" INTEGER NOT NULL to "size" BIGINT NOT NULL) and ensure any related DDL (constraints/defaults) that reference "size" are preserved; also update the corresponding Prisma model field (the Document model's size attribute) to BigInt/BigInt? type so the ORM and DB types stay in sync.

Copilot

Pull request overview

Adds first-class document metadata tracking to the chat system by introducing a Document model/table and plumbing file size through the ingestion/upload flow, enabling the UI to render uploaded document chips based on persisted records.

Changes:

Replaces Chat.documentCount column with a Chat -> Document[] relation and counts via Prisma _count.
Extends backend ingestion response to include file_size, and creates Document records on upload + deletes them on document removal.
Updates chat UI to initialize document chips from currentChat.documents and tweaks header/chip styling.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
prisma/schema.prisma	Removes `documentCount`, adds `Document` model and `Chat.documents` relation.
prisma/migrations/20260428143008_add_document_model/migration.sql	Migration to drop `documentCount` and create `Document` table (currently includes additional unrelated table creation/FKs).
lib/types.ts	Adds `Document` type and `Chat.documents`.
lib/data/chat.ts	Fetches documents and document counts via Prisma includes + `_count`.
lib/actions/chat.ts	Deletes `Document` records when removing a document (no longer decrements `documentCount`).
components/chat-area.tsx	Seeds document chips from persisted documents; adjusts header and chip styling.
backend/services/ingestion_service.py	Adds `file_size` to ingestion response.
backend/schemas/ingestion.py	Adds `file_size` field to `IngestResponse`.
app/api/upload/route.ts	Creates `Document` row after successful ingestion using returned `filename` + `file_size`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-28T15:18:08Z

+  const [documents, setDocuments] = useState<UploadedDocument[]>(
+    (currentChat?.documents ?? []).map((d) => ({
+      id: d.id,
+      name: d.name,
+      size: formatFileSize(d.size),
+      filename: d.filename,
+      uploading: false,
+    })),
+  );


documents state is initialized from currentChat?.documents, but it won’t update when navigating to a different chat or after router.refresh() changes currentChat props. This can cause the document chips to show the previous chat’s documents or miss newly persisted documents. Add an effect keyed on currentChat?.id (and/or currentChat?.documents) to reset documents from the latest props when the active chat changes.

Copilot · 2026-04-28T15:18:08Z

+  // Delete the Document record from Prisma
+  await prisma.document.deleteMany({
+    where: { chatId, filename },
  });


deleteDocumentAction deletes Document rows by (chatId, filename). If a user uploads multiple documents with the same filename into the same chat, a single remove action will delete all matching records (and the vector delete endpoint also targets filename). If you intend single-document deletion, consider using a unique storage key per upload (persisted in Document.filename) and deleting by that, or enforce uniqueness on (chatId, filename) to make the current behavior explicit.

Copilot · 2026-04-28T15:18:09Z

+-- CreateTable
+CREATE TABLE "user" (
+    "id" TEXT NOT NULL,
+    "name" TEXT NOT NULL,
+    "email" TEXT NOT NULL,
+    "emailVerified" BOOLEAN NOT NULL,
+    "image" TEXT,
+    "createdAt" TIMESTAMP(3) NOT NULL,
+    "updatedAt" TIMESTAMP(3) NOT NULL,
+
+    CONSTRAINT "user_pkey" PRIMARY KEY ("id")
+);


This migration is creating the Better Auth tables ("user", "session", "account", "verification") even though the repo already has a Better Auth SQL migration that creates them. Applying this Prisma migration after Better Auth migrations will fail with "relation already exists". Regenerate/split the migration so Prisma only adds the new Document table (and any Chat/Message changes), and ensure the auth tables are managed by a single migration source of truth.

Copilot · 2026-04-28T15:18:09Z

+-- AlterTable
+ALTER TABLE "Chat" DROP COLUMN "documentCount",
+ALTER COLUMN "userId" DROP DEFAULT;
+


This migration drops the "Chat".userId default (previously 'guest' in the initial migration) and adds a foreign key to "user". If any existing Chat rows have userId values that don’t exist in "user" (e.g. 'guest'), the FK add will fail at migration time. Consider a data migration step (backfill/delete/ensure a matching user row) or avoid adding the FK until data is consistent.

Copilot · 2026-04-28T15:18:09Z

+  // 5. Create a Document record in Prisma
+  const filename = result?.filename || "unknown";
+  const fileSize = result?.file_size || 0;
+
+  await prisma.document.create({
    data: {
-      documentCount: { increment: 1 },
-      updatedAt: new Date(),
+      name: filename,
+      filename: filename,
+      size: BigInt(fileSize),
+      chatId,


The schema comment says Document.name is the original display name and Document.filename is the backend storage name, but this code sets both to result.filename (which the backend describes as the original filename). Either persist the original name separately from a stable storage key returned by the backend, or rename/adjust the schema fields/comments so they reflect what’s actually stored. Also, silently defaulting to "unknown" / 0 risks writing bad rows if the backend response changes—prefer validating required fields and returning an error instead.

Copilot · 2026-04-28T15:18:09Z

+  await prisma.document.deleteMany({
+    where: { chatId, filename },
  });



Previously document deletes updated Chat.updatedAt (via prisma.chat.update). Now the Document row is deleted but the Chat timestamp won’t change, which can leave the sidebar ordering / "last updated" time stale after document add/remove (getUserChats orders by updatedAt). Consider touching the Chat (e.g., updating updatedAt) when adding/removing documents to preserve the existing UX behavior.

Suggested change

// Touch the chat so updatedAt reflects document removal and chat ordering stays fresh

await prisma.chat.update({

where: { id: chatId },

data: { updatedAt: new Date() },

});

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-28T15:38:21Z

+  name      String   // original display name (e.g. "report.pdf")
+  filename  String   // backend storage name


Document.filename is documented as a backend storage name, but the backend ingestion API and delete endpoint operate on the original upload filename. Either update this field’s meaning/comment to match actual usage, or rename/split the fields so name (display) and filename (storage key) are truly distinct.

Suggested change

name String // original display name (e.g. "report.pdf")

filename String // backend storage name

name String // display name shown in the UI

filename String // original upload filename used by ingestion/delete APIs

Copilot · 2026-04-28T15:38:22Z

+  // 5. Create a Document record in Prisma
+  const filename = result?.filename || "unknown";
+  const fileSize = result?.file_size || 0;
+
+  await prisma.document.create({
    data: {
-      documentCount: { increment: 1 },
-      updatedAt: new Date(),
+      name: filename,
+      filename: filename,
+      size: BigInt(fileSize),
+      chatId,


This stores the same value into both name and filename. Since the backend currently returns the original upload filename, this makes filename redundant and contradicts the schema’s intent of separating display name vs storage key. Consider persisting the original filename to name and using a distinct storage identifier for filename (or drop one of the fields).

Copilot · 2026-04-28T15:38:22Z

+  await prisma.document.create({
    data: {
-      documentCount: { increment: 1 },
-      updatedAt: new Date(),
+      name: filename,
+      filename: filename,
+      size: BigInt(fileSize),
+      chatId,
    },
  });


Document upload no longer updates Chat.updatedAt. Since the chat list is ordered by updatedAt, uploads may not bump the chat in recency ordering (previously this was updated when incrementing documentCount). Consider updating the parent chat’s updatedAt (or using a dedicated “lastActivityAt”) when creating a Document.

Copilot · 2026-04-28T15:38:22Z

    """Successful response for document ingestion."""
    chat_id: str = Field(..., description="Chat ID the documents were ingested into")
    filename: str = Field(..., description="Original filename of the uploaded document")
+    file_size: int = Field(..., description="File size in bytes")
    chunks_created: int = Field(..., description="Number of text chunks created from the document")
    document_ids: List[str] = Field(..., description="List of document IDs for the created chunks")
    pages_processed: int = Field(default=1, description="Number of pages processed (for PDFs)")


IngestResponse now includes required file_size, but the json_schema_extra example below doesn’t include it. Consider adding file_size to the example so the generated OpenAPI docs stay accurate.

Copilot · 2026-04-28T15:38:23Z

+  - You are about to drop the column `documentCount` on the `Chat` table. All the data in the column will be lost.
+
+*/
+-- AlterTable
+ALTER TABLE "Chat" DROP COLUMN "documentCount",


Dropping Chat.documentCount will irreversibly lose existing counts, and switching to a Document table means existing chats with uploaded docs will now show 0 documents unless you backfill Document rows. If this repo has any existing data, consider a data migration/backfill plan (or keep the column temporarily) so document counts don’t regress after deploy.

Suggested change

- You are about to drop the column `documentCount` on the `Chat` table. All the data in the column will be lost.

*/

-- AlterTable

ALTER TABLE "Chat" DROP COLUMN "documentCount",

- Existing `Chat.documentCount` values are preserved in this migration to avoid regressing document counts for pre-existing chats.

*/

-- AlterTable

ALTER TABLE "Chat"

Copilot · 2026-04-28T15:38:23Z

+ALTER TABLE "Chat" ADD CONSTRAINT "Chat_userId_fkey" FOREIGN KEY ("userId") REFERENCES "user"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+


Adding a foreign key from Chat.userId to "user" can fail on deploy if any existing Chat rows have userId values that don’t exist in the auth "user" table (e.g., legacy 'guest' rows). Before adding this constraint, ensure the data is cleaned/backfilled (or create a corresponding user row) so the migration is guaranteed to succeed.

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-28T16:09:19Z

+  // 5. Create a Document record in Prisma
+  //    Use storage_key as the Document ID for direct lookup during deletion
+  const filename = result?.filename || "unknown";
+  const storageKey = result?.storage_key || filename;


Falling back to filename when storage_key is missing can create non-unique Document IDs and break deletion (and can cause Prisma unique constraint failures). Treat storage_key as required (or generate a UUID here) and return an error if the backend didn’t provide it.

Suggested change

const storageKey = result?.storage_key || filename;

const storageKey =

typeof result?.storage_key === "string" ? result.storage_key.trim() : "";

if (!storageKey) {

console.error("Python ingestion response missing storage_key:", result);

return NextResponse.json(

{ error: "Invalid response from backend: missing storage_key" },

{ status: 502 },

);

}

Copilot · 2026-04-28T16:09:20Z

+-- CreateTable
+CREATE TABLE "user" (
+    "id" TEXT NOT NULL,
+    "name" TEXT NOT NULL,
+    "email" TEXT NOT NULL,


This migration creates the auth tables (user, session, etc.). Those tables are already created by better-auth_migrations/2025-11-28T17-20-59.231Z.sql, so re-creating them here will fail on existing databases. Remove these CREATE TABLE statements from the Prisma migration and only apply the incremental changes needed for this PR.

Copilot · 2026-04-28T16:09:20Z

+    "filename" TEXT NOT NULL,
+    "size" INTEGER NOT NULL,


The Document table is created with a filename column and size INTEGER, but later migrations immediately change size to BIGINT and drop filename. Consider squashing/editing this migration to create the table in its final shape (name + size BIGINT, no filename) to avoid unnecessary churn and potential drift.

Suggested change

"filename" TEXT NOT NULL,

"size" INTEGER NOT NULL,

"size" BIGINT NOT NULL,

Copilot · 2026-04-28T16:09:20Z

+      name: d.name,
+      size: Number(d.size),
+      createdAt: new Date(d.createdAt).toISOString(),


Casting BigInt to Number can lose precision for values above Number.MAX_SAFE_INTEGER. If you want to preserve exact sizes, consider returning size as a string (or bigint) through your types and only formatting for display at the UI layer.

Copilot · 2026-04-28T16:09:20Z

        id: `${Date.now()}-${index}-${file.name}`,
        name: file.name,
        size: (file.size / 1024).toFixed(2) + " KB",
-        filename: file.name,
        uploading: true,
      } satisfies UploadedDocument,
    }));


The optimistic document id is generated client-side, but deletes now use doc.id as the backend storage_key/Prisma Document.id. After upload succeeds you need to replace this optimistic id with the server-provided storage_key (or fully re-sync documents from refreshed currentChat.documents), otherwise delete will call the backend/DB with an ID that doesn’t exist.

Copilot · 2026-04-28T16:09:21Z

+        uploading: false,
+      })),
+    );
  }, [messages, currentChat?.id]);


The documents state is only reset when currentChat?.id changes, so after router.refresh() the updated currentChat.documents won't repopulate local state (same chat id). This leaves the UI with stale/optimistic document IDs and breaks later delete calls. Update the effect dependencies (or derive documents directly from props) so local state syncs when the documents list changes.

Suggested change

}, [messages, currentChat?.id]);

}, [messages, currentChat?.id, currentChat?.documents]);

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/actions/chat.ts (1)

161-174: ⚠️ Potential issue | 🔴 Critical

Scope the document delete to the current chat before calling the backend.

chatId ownership is checked, but documentId is never tied back to that chat. A caller can pass a valid chat they own plus a document ID from another chat; the vector delete becomes a no-op for the owned chat, then prisma.document.delete({ id: documentId }) still removes the other record.

🔒 Proposed fix

+  const document = await prisma.document.findFirst({
+    where: { id: documentId, chatId },
+    select: { id: true },
+  });
+
+  if (!document) {
+    throw new Error("Document not found");
+  }
+
   // Tell the Python backend to delete all chunks for this document
-  const url = `${BACKEND_API_BASE}/vector/document/${chatId}?storage_key=${encodeURIComponent(documentId)}`;
+  const url = `${BACKEND_API_BASE}/vector/document/${chatId}?storage_key=${encodeURIComponent(document.id)}`;
   const response = await fetch(url, { method: "DELETE" });
@@
   // Delete the Document record from Prisma by its unique ID
   await prisma.document.delete({
-    where: { id: documentId },
+    where: { id: document.id },
   });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/actions/chat.ts` around lines 161 - 174, The document delete is not
scoped to the chat so a user can delete a document from another chat; before
calling the backend (using BACKEND_API_BASE and fetch) verify ownership by
looking up the document with the current chatId (e.g. await
prisma.document.findFirst({ where: { id: documentId, chatId } })) and throw if
not found; only after that successful check proceed to call the Python backend
and then remove the Prisma record (use prisma.document.delete or
prisma.document.deleteMany with the same id+chatId where clause to ensure the DB
delete is also scoped).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/services/vector_service.py`:
- Around line 247-253: delete_documents_by_storage_key currently calls
get_collection(chat_id) which auto-creates collections; modify
delete_documents_by_storage_key to avoid creating collections by using a
non-creating lookup (e.g., add an optional create=False parameter to
get_collection or add a new helper like find_collection/get_existing_collection)
and return immediately (no-op) if no collection exists, otherwise proceed with
the delete logic; update references to get_collection in
delete_documents_by_storage_key to use the new non-creating behavior so orphaned
empty collections are not created.

---

Outside diff comments:
In `@lib/actions/chat.ts`:
- Around line 161-174: The document delete is not scoped to the chat so a user
can delete a document from another chat; before calling the backend (using
BACKEND_API_BASE and fetch) verify ownership by looking up the document with the
current chatId (e.g. await prisma.document.findFirst({ where: { id: documentId,
chatId } })) and throw if not found; only after that successful check proceed to
call the Python backend and then remove the Prisma record (use
prisma.document.delete or prisma.document.deleteMany with the same id+chatId
where clause to ensure the DB delete is also scoped).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5ea1858a-fec8-453d-9ee3-dbadc7588894

📥 Commits

Reviewing files that changed from the base of the PR and between eba8ee7 and 8892ef6.

📒 Files selected for processing (14)

app/api/upload/route.ts
app/chat/loading.tsx
backend/api/v1/ingestion_routes.py
backend/api/v1/vector_routes.py
backend/schemas/ingestion.py
backend/services/ingestion_service.py
backend/services/vector_service.py
components/chat-area.tsx
lib/actions/chat.ts
lib/data/chat.ts
lib/types.ts
prisma/migrations/20260428150627_change_size_to_bigint/migration.sql
prisma/migrations/20260428155457_remove_document_filename/migration.sql
prisma/schema.prisma

✅ Files skipped from review due to trivial changes (2)

prisma/migrations/20260428150627_change_size_to_bigint/migration.sql
lib/types.ts

🚧 Files skipped from review as they are similar to previous changes (5)

backend/schemas/ingestion.py
app/api/upload/route.ts
prisma/schema.prisma
backend/services/ingestion_service.py
components/chat-area.tsx

…e deletion

add a document table to keep all the uploaded document names and sizes

eba8ee7

alvinsjoy requested a review from Copilot April 28, 2026 14:57

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

Copilot started reviewing on behalf of alvinsjoy April 28, 2026 15:11 View session

use bigint for file size and useeffect for document state management

e1fefe2

Copilot AI reviewed Apr 28, 2026

View reviewed changes

update the loading skeletons similar to changed topbar of chat-area

261b1ee

alvinsjoy requested a review from Copilot April 28, 2026 15:30

Copilot started reviewing on behalf of alvinsjoy April 28, 2026 15:31 View session

alvinsjoy changed the title ~~add a document table to keep all the uploaded document names and sizes~~ add a document table to persist all the uploaded document names and sizes Apr 28, 2026

Copilot AI reviewed Apr 28, 2026

View reviewed changes

use uuids for document operations instead of filenames

8892ef6

alvinsjoy requested a review from Copilot April 28, 2026 16:02

Copilot started reviewing on behalf of alvinsjoy April 28, 2026 16:03 View session

Copilot AI reviewed Apr 28, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

Comment thread backend/services/vector_service.py

fix optimistic ui state bugs and add collection existence check befor…

267abff

…e deletion

alvinsjoy merged commit ddd57f0 into main Apr 28, 2026

+  // Touch the chat so updatedAt reflects document removal and chat ordering stays fresh
+  await prisma.chat.update({
+    where: { id: chatId },
+    data: { updatedAt: new Date() },
+  });

		name String // original display name (e.g. "report.pdf")
		filename String // backend storage name

		ALTER TABLE "Chat" ADD CONSTRAINT "Chat_userId_fkey" FOREIGN KEY ("userId") REFERENCES "user"("id") ON DELETE CASCADE ON UPDATE CASCADE;

-  const storageKey = result?.storage_key || filename;
+  const storageKey =
+    typeof result?.storage_key === "string" ? result.storage_key.trim() : "";
+  if (!storageKey) {
+    console.error("Python ingestion response missing storage_key:", result);
+    return NextResponse.json(
+      { error: "Invalid response from backend: missing storage_key" },
+      { status: 502 },
+    );
+  }

	"filename" TEXT NOT NULL,
	"size" INTEGER NOT NULL,
	"size" BIGINT NOT NULL,

	}, [messages, currentChat?.id]);
	}, [messages, currentChat?.id, currentChat?.documents]);

Conversation

alvinsjoy commented Apr 28, 2026

Uh oh!

coderabbitai Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading