node-vector-bench

Benchmarking framework for comparing Node.js vector search libraries that run locally on Linux, Windows, and macOS.

Designed to simulate real-world media management use cases with synthetic datasets that mimic the cluster structure of real embeddings.

THIS IS A WORK IN PROGRESS AND MAY ONLY BE DIRECTIONALLY USEFUL

I'd been trying to find optimal configuration defaults for each library to balance indexing speed, query speed, storage, and recall, but given that the user may not know the final size of the corpus, there's a bit of chicken-and-egg going on here.

The current approach is to run both USearch and LanceDB with default settings and let each library choose its own internal parameters.

In any event, I'm quite sure that the current settings are suboptimal. If you have experience with these libraries and suggestions for better configuration parameters either during index build time or query time, please open an issue or PR!

TL;DR

|                | sqlite-vec | USearch | LanceDB | DuckDB VSS |
|----------------|------------|---------|---------|------------|
| Algorithm      | Brute force | HNSW | IVF_FLAT / IVF_PQ | HNSW |
| Recall         | 100% (exact) | High, consistent across scales | Degrades at scale without tuning | High |
| Query speed    | Slow at scale | Fast | Fast with tuned index | Moderate |
| Build speed    | Fast (insert only) | Slow (graph construction) | Fast (centroid training) | Slow |
| Memory         | Disk-backed (SQLite pager) | Full index in RAM | Memory-mapped on-disk | Aggressive RAM usage |
| Set and forget | Yes | Yes | No (needs per-scale numPartitions/nprobes) | Yes |
| Large scale    | Impractical above ~100k | Works well | Works if tuned | Excluded (OOM risk) |
| Best for       | Small datasets needing exact results | Read-heavy workloads with a build-time budget | Large datasets that exceed RAM | Small datasets only |

Quick start

npm install
npm run prepare      # generate synthetic datasets + ground truth
npm run bench:xs     # 1k vectors, 128d
npm run bench:s      # 10k vectors, 512d
npm run bench:m      # 100k vectors, 512d
npm run bench:l      # 500k vectors, 512d
npm run bench:xl     # 1M vectors, 512d
npm run bench:xxl    # 2M vectors, 512d

Profiles

Dimensions match real-world embedding models used in self-hosted media management.

| Size | Index vectors | Dim | Model class | Held-out queries | k values |
|------|---------------|-----|-------------|------------------|----------|
| xs   | 1k   | 128 | dlib faces            | 200 | 10    |
| s    | 10k  | 512 | CLIP / FaceNet        | 200 | 1, 10 |
| m    | 100k | 512 | CLIP at scale         | 200 | 1, 10 |
| l    | 500k | 512 | CLIP large library    | 200 | 1, 10 |
| xl   | 1M   | 512 | CLIP large collection | 200 | 1, 10 |
| xxl  | 2M   | 512 | CLIP power user       | 200 | 1, 10 |

Custom profiles: create a JSON file in profiles/ and run npm run bench -- profiles/my-profile.json.

Results

Charts are auto-generated after each benchmark run.

Scaling (across dataset sizes)

  • QPS scaling (k=10)
  • Recall scaling (k=10)
  • Build time scaling
  • Index size scaling
  • Peak memory scaling

Per-profile detail (XXL profile shown)

  • QPS (k=10)
  • Recall (k=10)
  • Build time
  • Index size
  • Peak memory

Methodology

  • GMM synthetic data: Vectors are generated using a Gaussian Mixture Model rather than uniform random points on the unit sphere. Cluster counts scale proportionally with dataset size (~250 vectors per cluster) to maintain consistent difficulty across profiles. See profiles/*.json for per-profile cluster counts and src/dataset.ts for the generator. Uniform-on-sphere data is pathological for ANN algorithms — IVF and HNSW both plateau well below 90% recall because there is no exploitable cluster structure.
  • Default configurations: Each library runs with out-of-the-box defaults (no per-scale tuning). This tests what a user gets without manual optimization. Library-specific settings are in the profile JSONs under usearch, lancedb, etc.
  • Held-out queries: Query vectors are held out from the same generated distribution (tail split, like ann-benchmarks' train/test split), so queries share the same cluster structure as the index but are never present in it. This avoids self-match bias while ensuring ANN recall is meaningful.
  • Ground truth: Exact brute-force search via sqlite-vec (L2 distance). Recall is only computed when the ANN library's metric is compatible with L2; mismatches are rejected at startup.
  • MaybePromise runners: Runner methods return T | Promise<T>. Sync runners (sqlite-vec, USearch) avoid async overhead; async runners (LanceDB) return Promises. The harness awaits all calls uniformly.
  • 64-bit PRNG: Uses BigInt-based splitmix64 (full 2^64 period) for vector generation, sufficient for XL datasets (512M+ random draws). A minimal sketch of the algorithm follows this list.
  • Index construction: USearch supports multi-threaded batch insertion via its C++ bindings. LanceDB and sqlite-vec insertions are single-threaded; sqlite-vec uses a transaction wrapper for batch commits.
  • Reproducibility: Deterministic seed (42) for all vector generation. Index and query vectors are split from a single generation pass.
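
The splitmix64 reference above may be unfamiliar; here is a minimal sketch of the algorithm in TypeScript. The constants are the standard splitmix64 ones, but the project's actual generator lives in src/dataset.ts and may differ in structure.

```typescript
// Minimal splitmix64 sketch (illustrative; see src/dataset.ts for the real generator).
// The state advances by a fixed odd constant and is mixed with xor-shifts and
// multiplications, giving a full 2^64 period of 64-bit outputs.
const MASK64 = (1n << 64n) - 1n;

function splitmix64(seed: bigint): () => bigint {
  let state = seed & MASK64;
  return () => {
    state = (state + 0x9e3779b97f4a7c15n) & MASK64;
    let z = state;
    z = ((z ^ (z >> 30n)) * 0xbf58476d1ce4e5b9n) & MASK64;
    z = ((z ^ (z >> 27n)) * 0x94d049bb133111ebn) & MASK64;
    return (z ^ (z >> 31n)) & MASK64;
  };
}

// Map the top 53 bits of each draw to a float in [0, 1).
const next = splitmix64(42n);
const uniform01 = (): number => Number(next() >> 11n) / 2 ** 53;
```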

Why not zvec?

zvec (@zvec/zvec on npm) is Alibaba's in-process vector database built on Proxima. As of February 2026, it only supports Linux (x86_64, ARM64) and macOS (ARM64) — no Windows. PhotoStructure needs to run on all three platforms, so zvec is excluded.

Why not DuckDB + VSS?

DuckDB with its VSS extension provides HNSW indexing via CREATE INDEX ... USING HNSW. We evaluated it (@duckdb/node-api, MIT license) and found three problems:

  1. Memory consumption: DuckDB is an OLAP engine designed to use all available RAM for analytical queries. At 500k vectors (512d), it nearly OOMed a development machine. For a desktop app running on consumer hardware alongside other applications, this is disqualifying.
  2. Query overhead: Each vector search requires serializing the query vector as a SQL array literal ([1.0,2.0,...]::FLOAT[512], roughly 3KB of text per query). The @duckdb/node-api bindings don't yet support binding Float32Array directly to FLOAT[] parameters (issue #182). At 100k vectors, DuckDB VSS achieved 39 QPS vs USearch at 367 QPS. (A sketch of this serialization follows the list.)
  3. Index bloat: HNSW persistence is experimental (SET hnsw_enable_experimental_persistence = true). The on-disk index was 2.5x larger than other libraries (488 MB vs ~200 MB at 100k vectors).
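
To make the per-query serialization cost in point 2 concrete, here is an illustrative sketch of building such a query string. The table and column names (vectors, vec, id) are assumptions for the example, not the benchmark runner's actual schema.

```typescript
// Illustrative only: shows the text-serialization overhead described in point 2,
// not the project's actual DuckDB runner.
function toFloatArrayLiteral(query: Float32Array): string {
  // Every float becomes decimal text, so a 512-d query is roughly 3 KB of SQL.
  return `[${Array.from(query).join(",")}]::FLOAT[${query.length}]`;
}

// Hypothetical schema (table `vectors`, columns `id` and `vec`) for demonstration.
const queryVector = new Float32Array(512).fill(0.123);
const sql = `
  SELECT id, array_distance(vec, ${toFloatArrayLiteral(queryVector)}) AS dist
  FROM vectors
  ORDER BY dist
  LIMIT 10`;
```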

The runner is still included in xs/s/m profiles for reference, but excluded from l/xl/xxl to avoid OOM crashes.

LanceDB insert limitations

LanceDB's Node.js table.add() accepts either row objects (Record<string, unknown>[]) or Apache Arrow tables. However, passing Arrow tables constructed from the project's own apache-arrow module silently fails — searches return empty results. This is because @lancedb/lancedb bundles its own apache-arrow instance, and the IPC serialization path can't handle Data objects from a different module. The instanceof Table duck-type check passes, but the underlying class hierarchies differ.

There are no parallelism, concurrency, or batch-size options exposed through the Node.js API. The NAPI binding serializes everything into a single Arrow IPC buffer per add() call. Concurrent add() calls from JS would serialize at LanceDB's internal write lock.

The runner uses row objects with vectors.subarray() (zero-copy view) fed to Array.from() (unavoidable float-to-number boxing), batched at 10k rows; a sketch of this insert path follows.
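
A hedged sketch of that insert path, assuming the documented @lancedb/lancedb connect()/openTable()/add() calls; the database path, table name, and column names are examples, and the real runner lives in src/runners/.

```typescript
import * as lancedb from "@lancedb/lancedb";

// Illustrative sketch, not the project's runner: insert `count` vectors of `dim`
// floats each (stored back-to-back in one Float32Array) as plain row objects.
async function insertVectors(vectors: Float32Array, count: number, dim: number) {
  const db = await lancedb.connect("./lancedb-data"); // example path
  const table = await db.openTable("vectors");        // example table name
  const BATCH = 10_000;

  for (let start = 0; start < count; start += BATCH) {
    const rows: Record<string, unknown>[] = [];
    for (let i = start; i < Math.min(start + BATCH, count); i++) {
      // subarray() is a zero-copy view; Array.from() boxes the floats into
      // plain numbers, which the row-object form of add() requires.
      const vec = vectors.subarray(i * dim, (i + 1) * dim);
      rows.push({ id: i, vector: Array.from(vec) });
    }
    await table.add(rows); // row objects avoid the cross-module Arrow pitfall above
  }
}
```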

Adding a new library

  1. Create a runner in src/runners/ extending BenchmarkRunner from src/runners/base.ts
  2. Implement setup(), buildIndex(), search(), and cleanup() (sync or async; see the skeleton after this list)
  3. Register it in src/harness.ts
  4. Add it to your profile's libraries array
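
A minimal skeleton for steps 1 and 2, assuming BenchmarkRunner declares these four methods; the parameter lists shown are guesses for illustration, and the actual signatures are defined in src/runners/base.ts.

```typescript
import { BenchmarkRunner } from "./base";

// Hypothetical skeleton: method names come from the steps above, but the real
// abstract signatures (and any constructor arguments) live in src/runners/base.ts.
export class MyLibraryRunner extends BenchmarkRunner {
  setup() {
    // Load native bindings, open files, allocate an empty index.
  }

  buildIndex(vectors: Float32Array, count: number, dim: number) {
    // Insert all `count` vectors (`dim` floats each) into the index.
  }

  search(query: Float32Array, k: number): number[] {
    // Return the ids of the k nearest neighbors; may also return a Promise.
    return [];
  }

  cleanup() {
    // Release native resources and delete temporary files.
  }
}
```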
