A modern web application for interactive conversations with PDF documents using AI
PDFChat is a sophisticated Flask web application that transforms how you interact with PDF documents. Built with a modern tech stack, it allows you to upload PDF files and engage in natural language conversations about their content. The application leverages advanced AI models to understand your questions and provide accurate, contextual responses based on the document's content.
Key Highlights:
- 🤖 AI-powered document understanding
- 💬 Real-time chat interface with conversation history
- 📄 Multi-document support with organized management
- 👤 User authentication and profile management
- 🎨 Modern, responsive UI with Tailwind CSS
- 🔒 Secure file handling and user data protection
The application follows these steps to respond to your questions:
- PDF Loading: The application reads PDF documents and extracts their text content.
- Text Chunking: The extracted text is divided into smaller chunks with LangChain's RecursiveCharacterTextSplitter so that each chunk can be processed effectively.
- Vectorization: The application uses language models to generate vector representations (embeddings) of the text chunks.
- Similarity Matching: When you ask a question, the application compares it with the text chunks and identifies the most semantically similar ones.
- Response Generation: The selected chunks are passed to the language model, which generates a response based on the relevant content from the PDF.
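
As a rough illustration of the chunking, vectorization, and similarity-matching steps, the sketch below uses only the Python standard library. It is not the project's implementation: a fixed-size overlapping splitter stands in for RecursiveCharacterTextSplitter, a bag-of-words count vector stands in for model-generated embeddings, and a linear scan stands in for a FAISS index. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows.

    Simplified stand-in for a recursive splitter: it cuts on character
    counts only and ignores paragraph or sentence boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (a real app would call a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    chunks = [
        "The invoice total is 420 dollars.",
        "Cats sleep most of the day.",
        "Payment is due within 30 days.",
    ]
    question = "What is the invoice total?"
    print(build_prompt(question, top_chunks(question, chunks, k=1)))
```

Only the retrieved chunks reach the model, which keeps the prompt small even for long PDFs. Swapping the toy `embed` function for real embeddings changes the quality of the ranking but not the shape of the pipeline.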
- 🔐 User Authentication: Secure registration, login, and profile management
- 📚 Document Management: Upload, organize, and manage multiple PDF documents
- 💬 Intelligent Chat: Contextual conversations with PDF content using AI
- 📱 Responsive Design: Modern UI that works seamlessly across all devices
- 🔄 Conversation History: Persistent chat sessions with document context
- 👤 User Profiles: Personal dashboard with account settings and preferences
- 🔒 Secure Storage: Safe handling of user documents and data
- ⚡ Real-time Responses: Fast AI-powered answers to your questions
- 📤 Export Features: Download documents and chat transcripts
- 🎨 Modern Interface: Clean design built with Tailwind CSS
- Flask: Modern Python web framework with Blueprint organization
- SQLAlchemy: ORM for database management with SQLite
- Flask-JWT-Extended: JWT token-based authentication
- LangChain: Advanced framework for LLM applications
- FAISS: High-performance vector similarity search
- OpenAI API: GPT models for intelligent document understanding
- Flask-Migrate: Database migration management
- Tailwind CSS: Utility-first CSS framework for modern styling
- Vanilla JavaScript: Clean, modern ES6+ for dynamic interactions
- Responsive Design: Mobile-first approach with adaptive layouts
- Component Architecture: Modular JavaScript for maintainability
- SQLite: Lightweight database for development and small deployments
- File Storage: Secure local file handling with organized uploads
- Environment Configuration: Flexible config management for different environments
Note: When using the OpenAI API, ensure that you have configured your API key correctly in the .env file.
- Python 3.8 or higher
- Git
- OpenAI API key

- Clone the repository (the target directory name is given explicitly so that the following cd works):

      git clone https://github.com/Bagusdevaa/Chat-with-PDF.git PDFChat
      cd PDFChat

- Create and activate a virtual environment:

      # Windows
      python -m venv pdfchat
      pdfchat\Scripts\activate

      # macOS/Linux
      python3 -m venv pdfchat
      source pdfchat/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Environment configuration: Create a .env file in the project root:

      OPENAI_API_KEY=your_openai_api_key_here
      SECRET_KEY=your_secret_key_here
      DATABASE_URL=sqlite:///pdfchat.db
      FLASK_ENV=development

- Initialize the database:

      flask db upgrade

- Build CSS (if modifying styles):

      npm install
      npm run build-css

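
The four .env variables from the environment configuration step could be read along the following lines. This is a minimal sketch, not the contents of the project's config.py: the `Config` class and its `missing` helper are hypothetical, and it assumes the .env file has already been loaded into the process environment (for example by python-dotenv). `SQLALCHEMY_DATABASE_URI` is the setting name Flask-SQLAlchemy reads.

```python
import os


class Config:
    """Hypothetical settings object; the real config.py may differ."""

    def __init__(self, env=None):
        # Accept a plain dict so the class can be exercised without
        # touching the real process environment.
        env = os.environ if env is None else env
        self.OPENAI_API_KEY = env.get("OPENAI_API_KEY")
        self.SECRET_KEY = env.get("SECRET_KEY")
        # Fall back to the SQLite path shown in the .env example.
        self.SQLALCHEMY_DATABASE_URI = env.get("DATABASE_URL", "sqlite:///pdfchat.db")
        self.DEBUG = env.get("FLASK_ENV", "production") == "development"

    def missing(self):
        """Return the names of required settings that are unset or empty."""
        required = ("OPENAI_API_KEY", "SECRET_KEY")
        return [name for name in required if not getattr(self, name)]


if __name__ == "__main__":
    config = Config()
    if config.missing():
        print("Missing settings:", ", ".join(config.missing()))
```

Checking for missing values at startup surfaces a misconfigured API key before the first chat request fails.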
Option 1: Using Python directly

    python server.py

Option 2: Using Flask CLI

    flask run

Option 3: Using the batch file (Windows)

    start_pdfchat.bat

The application will be available at http://localhost:5000
- Register an account at /register
- Log in with your credentials
- Upload a PDF document from the documents page
- Start chatting with your document!
- Create Account: Register with your email and create a secure password
- Upload Documents: Use the document management interface to upload PDF files
- Start Conversations: Click on any document to begin an AI-powered conversation
- Ask Questions: Type natural language questions about your document content
- Manage Profile: Update your account settings and preferences
- Document Library: Organized view of all your uploaded PDFs
- Chat Interface: Intuitive messaging system with AI responses
- Profile Management: Update personal information and account settings
- Conversation History: Access previous chats and continue where you left off
- Responsive Design: Works seamlessly on desktop, tablet, and mobile devices
PDFChat/
├── app/                      # Main application package
│   ├── __init__.py           # Flask app factory and configuration
│   ├── response.py           # Standardized API response utilities
│   ├── controller/           # Business logic controllers
│   │   └── usercontroller.py # User management logic
│   ├── models/               # Database models and data processing
│   │   ├── __init__.py
│   │   ├── user.py           # User authentication model
│   │   ├── documents.py      # Document management model
│   │   ├── conversation.py   # Chat conversation model
│   │   ├── message.py        # Chat message model
│   │   └── pdf_processor.py  # PDF processing and AI integration
│   ├── routes/               # URL route handlers
│   │   ├── __init__.py
│   │   ├── main.py           # Main page routes
│   │   ├── auth.py           # Authentication routes
│   │   └── api.py            # REST API endpoints
│   ├── static/               # Static assets
│   │   ├── css/              # Tailwind CSS files
│   │   ├── js/               # Frontend JavaScript modules
│   │   ├── img/              # Images and graphics
│   │   └── uploads/          # User uploaded files
│   └── templates/            # Jinja2 HTML templates
│       ├── landing.html      # Landing page
│       ├── login.html        # Login page
│       ├── signup.html       # Registration page
│       ├── documents.html    # Document management
│       ├── conversation.html # Chat interface
│       ├── profile.html      # User profile
│       └── *.html            # Other templates
├── migrations/               # Database migration files
├── docs/                     # Documentation and diagrams
├── config.py                 # Application configuration
├── server.py                 # Application entry point
├── requirements.txt          # Python dependencies
├── package.json              # Node.js dependencies for CSS
├── tailwind.config.js        # Tailwind CSS configuration
├── start_pdfchat.bat         # Windows batch startup script
└── README.md                 # Project documentation
- POST /api/auth/register - User registration
- POST /api/auth/login - User login
- GET /api/auth/profile - Get user profile
- PUT /api/auth/profile - Update user profile
- PUT /api/auth/change-password - Change password

- GET /api/documents - List user documents
- POST /api/documents/upload - Upload new document
- GET /api/documents/{id} - Get document details
- DELETE /api/documents/{id} - Delete document

- GET /api/conversations/{document_id} - Get document conversations
- POST /api/conversations - Create new conversation
- GET /api/conversation/{id}/messages - Get conversation messages
- POST /api/conversation/{id}/message - Send message
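
To show how a client might call two of these endpoints, the sketch below builds the HTTP requests with the standard library without sending them. The endpoint paths come from the list above; everything else is an assumption, since the README does not document the request schema: the JSON field names (`email`, `password`, `message`), the helper names, and the use of an `Authorization: Bearer` header (the Flask-JWT-Extended default).

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5000"  # default address from the run instructions


def build_login_request(email, password):
    """Build (but do not send) a login request. Field names are assumed."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_message_request(conversation_id, text, token):
    """Build a request that posts a chat message to a conversation.

    The token is assumed to be a JWT returned by the login endpoint.
    """
    body = json.dumps({"message": text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/conversation/{conversation_id}/message",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

With the server running, either request can be sent with `urllib.request.urlopen(request)`; the shape of the JSON response would need to be checked against the actual route handlers in app/routes/api.py.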
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make your changes and test thoroughly
- Commit your changes: git commit -am 'Add new feature'
- Push to the branch: git push origin feature-name
- Submit a Pull Request
If you modify Tailwind styles:

    npm run build-css
    # or for development with watch mode
    npm run watch-css

When you modify models:

    flask db migrate -m "Description of changes"
    flask db upgrade

- File Validation: Only PDF files are accepted with proper validation
- User Authentication: JWT token-based secure authentication
- Data Privacy: User documents are stored securely and isolated
- API Security: Rate limiting and input validation on all endpoints
- Environment Variables: Sensitive data stored in environment variables
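
The file-validation and isolation points above could look roughly like the following. This is an illustrative sketch rather than the project's actual checks: the function names are hypothetical, and it assumes an upload is accepted only if it has a .pdf extension and begins with the `%PDF-` header that PDF files normally start with.

```python
import os

PDF_MAGIC = b"%PDF-"  # header bytes at the start of a typical PDF file


def is_probably_pdf(filename, first_bytes):
    """Check both the extension and the leading bytes of an upload."""
    extension = os.path.splitext(filename)[1].lower()
    return extension == ".pdf" and first_bytes.startswith(PDF_MAGIC)


def safe_storage_name(user_id, filename):
    """Derive a storage name that cannot escape the upload directory.

    Directory components are dropped, unusual characters are replaced,
    and the name is prefixed with the owner's id to keep users' files apart.
    """
    base = os.path.basename(filename.replace("\\", "/"))
    cleaned = "".join(ch if ch.isalnum() or ch in "._-" else "_" for ch in base)
    cleaned = cleaned.lstrip(".") or "upload.pdf"
    return f"{user_id}_{cleaned}"
```

In a Flask application, `werkzeug.utils.secure_filename` covers the filename-cleaning part; the manual version here only keeps the example free of dependencies.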
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the GPT models
- LangChain community for the excellent framework
- Flask and Python ecosystem contributors
- Tailwind CSS for the utility-first CSS framework
Note: This application requires an OpenAI API key for AI functionality. Make sure to obtain one from OpenAI's website and configure it properly in your environment variables.
