PointerRAG

PointerRAG is a Retrieval-Augmented Generation (RAG) system that lets users chat with their documents. It consists of a Next.js frontend and a FastAPI backend that stores document embeddings in ChromaDB and chat history in PostgreSQL.

Features

Chat Interface: Real-time chat with AI assistance.
Persistent History: Chat sessions and messages are saved in PostgreSQL.
Document Ingestion: Upload PDF, TXT, and Markdown files with structural preservation using block-level text extraction.
RAG Pipeline:
- Automatic text chunking, context cleaning, and generation of dense vector embeddings using BAAI/bge-base-en-v1.5.
- Vector search using ChromaDB, paired with a CrossEncoder (ms-marco-MiniLM) reranking layer that filters out irrelevant context to reduce hallucinations.
- Post-processing layer to format numbered lists and improve structural clarity for generated responses.
- Generative AI via a local Pointer-Generator Network (T5-based) that caches encoder outputs to reduce inference latency and uses extended target-length limits to prevent output truncation.
Backend API: Fast and scalable API built with FastAPI.
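
The chunking and rerank-filtering steps of the RAG pipeline can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the word-based chunking with overlap, the `min_score` threshold, and the `top_k` cutoff are all assumptions, and `word_overlap_score` is a toy stand-in for the CrossEncoder scorer so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks that share `overlap` words (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


def filter_by_rerank_score(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    min_score: float = 0.0,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair, drop low scores, keep the best top_k."""
    scored = [(c, score_fn(query, c)) for c in candidates]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def word_overlap_score(query: str, passage: str) -> float:
    """Toy scorer (assumption): fraction of query words found in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0
```

In a real pipeline, `score_fn` would wrap the reranker's relevance score for each query/passage pair, and only the passages that survive the threshold would be passed to the generator.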

Tech Stack

Frontend

Framework: Next.js 15 (App Router)
Styling: Tailwind CSS, Shadcn UI
Icons: Lucide React

Backend

Framework: FastAPI
Primary Database: PostgreSQL (Chat History)
Vector Database: ChromaDB (Document Embeddings)
ORM/Database: SQLAlchemy (Python) & Prisma (Schema Management)
Embeddings: BAAI/bge-base-en-v1.5 (via Sentence Transformers)
PDF Processing: PyMuPDF (fitz)

Getting Started

Prerequisites

Node.js 18+
Python 3.10+
PostgreSQL (Running locally or hosted)

1. Environment Configuration

Copy the provided .env.example to create .env files in both the root directory and the backend/ directory, then set DATABASE_URL to your PostgreSQL connection string. The example below uses a Supabase-hosted database:

DATABASE_URL="postgresql://postgres:<YOUR_PASSWORD>@db.<YOUR_SUPABASE_REF>.supabase.co:5432/postgres"
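
Before running Prisma or the backend, it can help to sanity-check the connection string. The sketch below is not part of the repository; `check_database_url` is a hypothetical helper that uses only the Python standard library to catch an unreplaced placeholder or a wrong scheme. Note that passwords containing special characters such as `@` or `/` must be percent-encoded in the URL.

```python
from urllib.parse import urlsplit


def check_database_url(url: str) -> dict:
    """Parse a PostgreSQL URL and return its parts (hypothetical helper)."""
    # Angle brackets suggest a placeholder from .env.example was left in place.
    if "<" in url or ">" in url:
        raise ValueError("DATABASE_URL still contains a <PLACEHOLDER>")
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("DATABASE_URL has no host")
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the PostgreSQL default port
        "database": parts.path.lstrip("/") or None,
        "user": parts.username,
        "has_password": parts.password is not None,
    }
```

A typical call would be `check_database_url(os.environ["DATABASE_URL"])` after loading the .env file.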

Prisma Configuration: The project uses Prisma to manage the database schema. After setting your DATABASE_URL, you need to pull the schema and generate the client:

npx prisma db pull
npx prisma generate

2. Backend Setup

Navigate to the project root and install dependencies:

# Activate your virtual environment first
pip install -r backend/requirements.txt

Initialize the Database: Run the initialization script to create the necessary tables (Chat, Message) in Postgres:

python scripts/init_db.py

Start the backend server:

uvicorn backend.main:app --reload

The API will be available at http://127.0.0.1:8000.

Swagger UI: http://127.0.0.1:8000/docs
API Reference: See docs/API_REFERENCE.md
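
To confirm the server is up before starting the frontend, a small readiness check against the Swagger UI path can be used. This is a sketch under stated assumptions: `wait_for_api`, `build_url`, and their retry parameters are illustrative names, not part of the project. It relies only on the fact that FastAPI serves `/docs` by default.

```python
import time
import urllib.request


def build_url(base_url: str, path: str = "/docs") -> str:
    """Join the base URL and path without doubling the slash."""
    return base_url.rstrip("/") + path


def wait_for_api(
    base_url: str = "http://127.0.0.1:8000",
    path: str = "/docs",
    attempts: int = 10,
    delay: float = 0.5,
    timeout: float = 2.0,
) -> bool:
    """Poll the API until it answers with HTTP 200 or the attempts run out."""
    url = build_url(base_url, path)
    # The target is local, so bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for attempt in range(attempts):
        try:
            with opener.open(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP error responses.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_api()` right after launching uvicorn returns `True` once the server responds, or `False` if it never comes up within the retry budget.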

3. Frontend Setup

Install dependencies and start the development server:

npm install
npm run dev

Open http://localhost:3000 in your browser.

Database Management Scripts

The project includes utility scripts in the scripts/ folder to help manage the database:

Initialize Database: Creates tables if they don't exist.
```
python scripts/init_db.py
```
Reset Database: WARNING - Drops all Chat and Message tables and recreates them. Use this to clear all history.
```
python scripts/reset_db.py
```
Test API: Runs a quick verification to ensure the Backend API is working and creating chats correctly.
```
python scripts/test_chat_api.py
```

Documentation

Comprehensive API documentation and usage examples are available in the docs/ folder:

API Reference: Detailed endpoints guide.
Curl Examples: Ready-to-use scripts for testing Ingestion, Search, and Stats.

Project Structure

pointerRAG/
├── app/                  # Next.js App Router pages
├── backend/              # Python FastAPI Backend
│   ├── api/              # API Routes (Ingestion, Chat, Vector)
│   ├── core/             # Configuration & Database Models
│   ├── schemas/          # Pydantic Schemas
│   ├── services/         # Business Logic
│   └── main.py           # Entry point
├── components/           # React Components
├── docs/                 # Documentation & Examples
├── prisma/               # Database Schema (Reference)
├── scripts/              # DB Management Utilities
└── public/               # Static Assets
