LayXRain/Extract-video-notes


NoteAbstract — Video Course to Structured Notes

A local web application that transforms video lectures into structured, readable notes using local speech recognition (faster-whisper) and AI (Claude / OpenAI).

How It Works

Video File / URL  ➜  FFmpeg (extract audio)  ➜  faster-whisper (transcribe)  ➜  Claude/GPT (generate notes)  ➜  Markdown download

Everything runs locally except the final AI note-generation step, which requires a Claude or OpenAI API key.
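As an illustration of the first stage, here is a minimal sketch of how an FFmpeg audio-extraction command could be assembled. The function names and the choice of 16 kHz mono WAV output are assumptions made for this example, not details taken from processors/video.py.

```python
from pathlib import Path


def audio_path_for(video_path: str) -> str:
    """Hypothetical helper: place the WAV next to the video, same base name."""
    return str(Path(video_path).with_suffix(".wav"))


def build_ffmpeg_command(video_path: str, audio_path: str) -> list[str]:
    """Return an ffmpeg argument list that extracts 16 kHz mono WAV audio."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", video_path,     # input video
        "-vn",                # drop the video stream
        "-ac", "1",           # downmix to mono
        "-ar", "16000",       # resample to 16 kHz (assumed target rate)
        "-c:a", "pcm_s16le",  # 16-bit PCM, i.e. plain WAV
        audio_path,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; building it as a list rather than a shell string avoids quoting problems with file names that contain spaces.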

System Requirements

  • Python 3.10–3.12 (3.11 recommended)
  • FFmpeg (Download here)
  • API key: Claude (Anthropic) or OpenAI (for note generation only)
  • OS: Windows, macOS, or Linux

Quick Start

# 1. Create and activate a virtual environment
python -m venv venv
venv\Scripts\activate       # Windows
# source venv/bin/activate  # macOS / Linux

# 2. Install dependencies
pip install -r requirements.txt

# 3. Start the app
python main.py

Then open http://127.0.0.1:8000 in your browser.

Setup

  1. Open the Settings panel (⚙️ button in the header)
  2. Enter your Claude API key (or OpenAI key)
  3. Choose your preferred ASR model and language defaults
  4. Click Save Settings
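The project structure notes that config.py manages settings through a .env file. A minimal sketch of reading such a file follows; the key names (`CLAUDE_API_KEY`, `OPENAI_API_KEY`, `ASR_MODEL`, `ASR_LANGUAGE`) and the `auto` language default are assumptions for illustration, not the application's actual configuration keys.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    claude_api_key: str = ""
    openai_api_key: str = ""
    asr_model: str = "base"     # the README lists base as the default model
    asr_language: str = "auto"  # hypothetical default


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(text: str) -> Settings:
    """Map parsed values onto Settings, falling back to the defaults."""
    env = parse_env(text)
    defaults = Settings()
    return Settings(
        claude_api_key=env.get("CLAUDE_API_KEY", defaults.claude_api_key),
        openai_api_key=env.get("OPENAI_API_KEY", defaults.openai_api_key),
        asr_model=env.get("ASR_MODEL", defaults.asr_model),
        asr_language=env.get("ASR_LANGUAGE", defaults.asr_language),
    )
```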

Usage

Upload a video file

  • Drag & drop a video file onto the upload zone, or click to browse
  • The job starts automatically after selection

Paste a video URL

  • Paste a YouTube, Bilibili, Vimeo, or any other yt-dlp-supported URL
  • Click Process URL
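For URL input, a download wrapper around yt-dlp might look roughly like the following. This is a sketch under assumptions: the function names, the output template, and the choice to fetch the best audio stream are illustrative and may differ from utils/downloader.py.

```python
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Distinguish an http(s) URL from a local file path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_ydl_options(output_dir: str) -> dict:
    """yt-dlp options: best available audio, one video, quiet output."""
    return {
        "format": "bestaudio/best",
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",
        "noplaylist": True,
        "quiet": True,
    }


def download(url: str, output_dir: str) -> str:
    """Download the media and return the path of the saved file."""
    from yt_dlp import YoutubeDL  # imported lazily so the helpers work without it

    with YoutubeDL(build_ydl_options(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)
```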

Watch progress

  • Real-time progress bar and log output show each stage
  • Cancel at any time with the Cancel button
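One way to combine stage-by-stage progress reporting with cancellation is an asyncio loop that checks a cancel flag between stages. The sketch below is an assumption about how this could work, not the app's actual WebSocket code; the stage names, message shape, and `report` callback are hypothetical.

```python
import asyncio

# Hypothetical stage list mirroring the pipeline described in How It Works.
STAGES = ["extract audio", "transcribe", "generate notes", "export"]


async def run_job(report, cancel: asyncio.Event) -> str:
    """Run each stage in order, reporting progress and honoring cancellation."""
    for index, stage in enumerate(STAGES):
        if cancel.is_set():
            await report({"status": "cancelled", "stage": stage})
            return "cancelled"
        percent = round(100 * index / len(STAGES))
        await report({"status": "running", "stage": stage, "percent": percent})
        await asyncio.sleep(0)  # placeholder for the real work of this stage
    await report({"status": "done", "percent": 100})
    return "done"
```

In a FastAPI app, `report` could be a coroutine that forwards each message over a WebSocket, and the Cancel button could set the event from a separate request handler.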

View & download notes

  • Notes appear automatically when processing completes
  • Structured format: title, summary, chapters, definitions, action items
  • Click Download .md to save as Markdown
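The structured format listed above maps naturally onto a small data model that is rendered to Markdown. The following is a sketch only: the class names, field names, and exact Markdown layout are assumptions, not the output of processors/exporter.py.

```python
from dataclasses import dataclass, field


@dataclass
class Chapter:
    title: str
    body: str


@dataclass
class Notes:
    title: str
    summary: str
    chapters: list[Chapter] = field(default_factory=list)
    definitions: dict[str, str] = field(default_factory=dict)
    action_items: list[str] = field(default_factory=list)


def to_markdown(notes: Notes) -> str:
    """Render notes as Markdown, omitting sections that are empty."""
    lines = [f"# {notes.title}", "", "## Summary", "", notes.summary, ""]
    for chapter in notes.chapters:
        lines += [f"## {chapter.title}", "", chapter.body, ""]
    if notes.definitions:
        lines += ["## Definitions", ""]
        lines += [f"- **{term}**: {meaning}"
                  for term, meaning in notes.definitions.items()]
        lines.append("")
    if notes.action_items:
        lines += ["## Action Items", ""]
        lines += [f"- [ ] {item}" for item in notes.action_items]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```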

ASR Model Sizes

| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | ~75 MB | ⚡⚡⚡⚡ | ★★ | Quick tests, short clips |
| base | ~140 MB | ⚡⚡⚡ | ★★★ | Default (best balance) |
| small | ~460 MB | ⚡⚡ | ★★★★ | Longer lectures, higher quality |
| medium | ~1.5 GB | | ★★★★★ | Maximum accuracy, needs 8 GB+ RAM |

Project Structure

noteabstract/
├── main.py                 # FastAPI app, routes, WebSocket, pipeline
├── config.py               # Settings management (.env)
├── requirements.txt
├── static/
│   └── index.html          # Complete SPA frontend
├── processors/
│   ├── video.py            # FFmpeg audio extraction
│   ├── asr.py              # faster-whisper transcription
│   ├── llm.py              # Claude/OpenAI note generation
│   └── exporter.py         # Markdown export
├── models/
│   └── job.py              # SQLite database models
├── utils/
│   └── downloader.py       # yt-dlp video download wrapper
├── storage/                # Auto-created data directory
└── README.md

Troubleshooting

"FFmpeg not found"

"No API key configured"

  • Add your Claude or OpenAI API key in Settings
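A check like the one behind this message could be as simple as the sketch below. The function name is hypothetical, and preferring Claude when both keys are present is an assumption, not documented behavior.

```python
def choose_provider(claude_api_key: str = "", openai_api_key: str = "") -> str:
    """Pick the note-generation provider from whichever key is configured."""
    if claude_api_key.strip():
        return "claude"  # assumed preference when both keys are set
    if openai_api_key.strip():
        return "openai"
    raise ValueError("No API key configured")
```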

"Not enough memory for model"

  • Switch to a smaller ASR model (tiny or base) in Settings

Download fails

  • Some videos are geo-restricted, private, or require login
  • Try a different URL or download the video manually and upload the file

License

MIT
