LeapOCR JavaScript SDK

npm version License: Apache 2.0 TypeScript

Official JavaScript/TypeScript SDK for LeapOCR - Transform documents into structured data using AI-powered OCR.

Overview

LeapOCR provides document processing with AI-powered data extraction. This SDK offers a JavaScript/TypeScript-native interface for integrating it into Node.js and browser applications.

Installation

npm install leapocr
# or
yarn add leapocr
# or
pnpm add leapocr

Quick Start

Prerequisites

Basic Example

import { LeapOCR } from "leapocr";

// Initialize the SDK with your API key
const client = new LeapOCR({
  apiKey: process.env.LEAPOCR_API_KEY,
});

// Submit a document for processing
const job = await client.ocr.processURL("https://example.com/document.pdf", {
  format: "structured",
  model: "standard-v2",
  schema: {
    type: "object",
    properties: {
      title: { type: "string" },
      total_pages: { type: "number" },
    },
    required: ["title"],
  },
});

// Wait for processing to complete
const result = await client.ocr.waitUntilDone(job.jobId);

// Get the full results
const fullResult = await client.ocr.getJobResult(job.jobId);
console.log("Extracted data:", fullResult.pages);

Key Features

  • TypeScript First - Full type safety with comprehensive TypeScript definitions
  • Multiple Processing Formats - Structured data extraction or markdown output
  • Flexible Model Selection - Choose from standard, pro, or custom AI models
  • Custom Schema Support - Define extraction schemas for your specific use case
  • Built-in Retry Logic - Automatic handling of transient failures
  • Universal Runtime - Works in Node.js and modern browsers
  • Direct File Upload - Efficient multipart uploads for local files

Specify a model with the model field of the processing options. If omitted, it defaults to standard-v2.

Usage Examples

Processing from URL

const client = new LeapOCR({
  apiKey: process.env.LEAPOCR_API_KEY,
});

const job = await client.ocr.processURL("https://example.com/invoice.pdf", {
  format: "structured",
  model: "standard-v2",
  schema: {
    type: "object",
    properties: {
      invoice_number: { type: "string" },
      invoice_date: { type: "string" },
      total_amount: { type: "number" },
    },
    required: ["invoice_number", "total_amount"],
  },
});

const status = await client.ocr.waitUntilDone(job.jobId);

if (status.status === "completed") {
  const result = await client.ocr.getJobResult(job.jobId);
  console.log(`Credits used: ${result.credits_used}`);
  console.log("Data:", result.pages);
}

Processing Local Files

import { LeapOCR } from "leapocr";

const client = new LeapOCR({
  apiKey: process.env.LEAPOCR_API_KEY,
});

const job = await client.ocr.processFile("./invoice.pdf", {
  format: "structured",
  model: "pro-v2",
  schema: {
    type: "object",
    properties: {
      invoice_number: { type: "string" },
      total_amount: { type: "number" },
      invoice_date: { type: "string" },
      vendor_name: { type: "string" },
    },
  },
});

const status = await client.ocr.waitUntilDone(job.jobId);
if (status.status === "completed") {
  const result = await client.ocr.getJobResult(job.jobId);
  console.log("Extracted data:", result.pages);
}

Custom Schema Extraction

const schema = {
  type: "object",
  properties: {
    patient_name: { type: "string" },
    date_of_birth: { type: "string" },
    medications: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          dosage: { type: "string" },
        },
      },
    },
  },
};

const job = await client.ocr.processFile("./medical-record.pdf", {
  format: "structured",
  schema,
});
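
A JSON schema like the one above does not by itself give the extracted object a TypeScript type. One way to bridge the gap is a hand-written interface plus a runtime type guard. The sketch below assumes the extracted object uses the schema's property names and treats every property as required, which the schema above does not declare. MedicalRecord, Medication, isMedication, and isMedicalRecord are hypothetical names for illustration and are not part of the SDK.

```typescript
// Hypothetical types mirroring the schema above; not exported by the SDK.
interface Medication {
  name: string;
  dosage: string;
}

interface MedicalRecord {
  patient_name: string;
  date_of_birth: string;
  medications: Medication[];
}

// True when the value looks like one entry of the medications array.
function isMedication(value: unknown): value is Medication {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.name === "string" && typeof v.dosage === "string";
}

// Runtime check that an unknown value has the shape the schema describes.
// Assumption: all three properties must be present.
function isMedicalRecord(value: unknown): value is MedicalRecord {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const meds = v.medications;
  return (
    typeof v.patient_name === "string" &&
    typeof v.date_of_birth === "string" &&
    Array.isArray(meds) &&
    meds.every(isMedication)
  );
}
```

Once the guard passes, the value can be used as a MedicalRecord without casts. Where exactly the structured object sits inside the job result is not covered here, so check the TypeScript definitions before wiring this in.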

Output Formats

Format       Description          Use Case
structured   Single JSON object   Extract specific fields across entire document
markdown     Text per page        Convert document to readable text

Monitoring Job Progress

// Poll for status updates
const pollInterval = 2000; // 2 seconds
const maxAttempts = 150; // 5 minutes max
let attempts = 0;

while (attempts < maxAttempts) {
  const status = await client.ocr.getJobStatus(job.jobId);

  // progress is optional, so avoid printing "undefined%"
  const progress = status.progress?.toFixed(1) ?? "?";
  console.log(`Status: ${status.status} (${progress}% complete)`);

  if (status.status === "completed") {
    const result = await client.ocr.getJobResult(job.jobId);
    console.log("Processing complete!");
    break;
  }

  await new Promise((resolve) => setTimeout(resolve, pollInterval));
  attempts++;
}
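
The loop above polls at a fixed interval. For long-running jobs, a growing delay can reduce the number of status requests. The following is a minimal sketch of that idea, not the SDK's own polling logic: pollWithBackoff, StatusLike, and the option names are hypothetical, and the only field it relies on is the status string shown in the examples above.

```typescript
// Minimal status shape assumed for this sketch; the SDK's JobStatus may carry more fields.
interface StatusLike {
  status: string;
}

// Poll `getStatus` until `isDone` returns true, doubling the delay after each attempt.
async function pollWithBackoff<T extends StatusLike>(
  getStatus: () => Promise<T>,
  isDone: (s: T) => boolean,
  opts: { initialDelayMs?: number; maxDelayMs?: number; maxAttempts?: number } = {},
): Promise<T> {
  const { initialDelayMs = 1000, maxDelayMs = 10000, maxAttempts = 30 } = opts;
  let delay = initialDelayMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (isDone(status)) return status;
    // Do not sleep after the final attempt; fall through to the error below.
    if (attempt === maxAttempts) break;
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxDelayMs);
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}
```

With the client from the earlier examples, a call might look like pollWithBackoff(() => client.ocr.getJobStatus(job.jobId), (s) => s.status === "completed"). For most cases client.ocr.waitUntilDone is the simpler choice.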

Using Template Slugs

// Process a document using a predefined template
const job = await client.ocr.processFile("./invoice.pdf", {
  templateSlug: "my-invoice-template",
});

const result = await client.ocr.waitUntilDone(job.jobId);
const fullResult = await client.ocr.getJobResult(job.jobId);
console.log("Extracted data:", fullResult.pages);

Deleting Jobs

// Delete a completed job to free up resources
await client.ocr.deleteJob(job.jobId);
console.log("Job deleted successfully");
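
If jobs should always be removed once their results have been read, the delete call can go in a finally block so it also runs when result handling throws. This is a small sketch under that assumption; withCleanup is a hypothetical helper, and the delete function is passed in rather than imported so the snippet stays self-contained.

```typescript
// Run `work` for a job, then delete the job whether or not `work` succeeded.
// Note: if `deleteJob` itself throws, that error replaces any error from `work`.
async function withCleanup<T>(
  jobId: string,
  work: (jobId: string) => Promise<T>,
  deleteJob: (jobId: string) => Promise<void>,
): Promise<T> {
  try {
    return await work(jobId);
  } finally {
    await deleteJob(jobId);
  }
}
```

With the client from the earlier examples, a call might look like withCleanup(job.jobId, (id) => client.ocr.getJobResult(id), (id) => client.ocr.deleteJob(id)).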

For more examples, see the examples/ directory.

Configuration

Custom Configuration

import { LeapOCR } from "leapocr";

const client = new LeapOCR({
  apiKey: "your-api-key",
  baseURL: "https://api.leapocr.com", // optional
  timeout: 30000, // 30 seconds (optional)
});

Environment Variables

export LEAPOCR_API_KEY="your-api-key"
export LEAPOCR_BASE_URL="https://api.leapocr.com"  # optional

Error Handling

The SDK provides typed errors for robust error handling:

import {
  AuthenticationError,
  ValidationError,
  JobFailedError,
  TimeoutError,
  NetworkError,
} from "leapocr";

try {
  const result = await client.ocr.waitUntilDone(job.jobId);
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error("Authentication failed - check your API key");
  } else if (error instanceof ValidationError) {
    console.error("Validation error:", error.message);
  } else if (error instanceof NetworkError) {
    // Retryable: re-run the operation here if appropriate
    console.error("Network error - the operation can be retried");
  } else if (error instanceof JobFailedError) {
    console.error("Processing failed:", error.message);
  } else if (error instanceof TimeoutError) {
    console.error("Operation timed out");
  }
}

Error Types

  • AuthenticationError - Invalid API key or authentication failures
  • AuthorizationError - Permission denied for requested resource
  • RateLimitError - API rate limit exceeded
  • ValidationError - Input validation errors
  • FileError - File-related errors (size, format, etc.)
  • JobError - Job processing errors
  • JobFailedError - Job completed with failure status
  • TimeoutError - Operation timeouts
  • NetworkError - Network/connectivity issues (retryable)
  • APIError - General API errors
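
Since NetworkError is marked retryable, an application may want its own retry loop around individual calls, on top of whatever the SDK retries internally. The following is a hedged sketch of such a wrapper, not part of the SDK: withRetry and its parameters are hypothetical, and the caller decides which errors count as retryable.

```typescript
// Retry an async operation while `isRetryable` says the error is transient.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Give up on non-retryable errors or when attempts are exhausted.
      if (!isRetryable(error) || attempt === maxAttempts) break;
      // Linear backoff: wait a little longer before each new attempt.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
  throw lastError;
}
```

With the error classes listed above, a call might look like withRetry(() => client.ocr.getJobStatus(job.jobId), (e) => e instanceof NetworkError). Errors such as AuthenticationError or ValidationError are better left to fail immediately, since retrying does not change the outcome.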

API Reference

Full API documentation is available in the TypeScript definitions.

Core Methods

// Initialize SDK
new LeapOCR(config: ClientConfig)

// Process documents
client.ocr.processURL(url: string, options?: UploadOptions): Promise<UploadResult>
client.ocr.processFile(filePath: string, options?: UploadOptions): Promise<UploadResult>
client.ocr.processFileBuffer(buffer: Buffer, filename: string, options?: UploadOptions): Promise<UploadResult>
client.ocr.processFileStream(stream: Readable, filename: string, options?: UploadOptions): Promise<UploadResult>

// Job management
client.ocr.getJobStatus(jobId: string, signal?: AbortSignal): Promise<JobStatus>
client.ocr.getJobResult(jobId: string, options?: { page?: number; pageSize?: number; signal?: AbortSignal }): Promise<OCRJobResult>
client.ocr.waitUntilDone(jobId: string, options?: PollOptions): Promise<JobStatus>
client.ocr.deleteJob(jobId: string): Promise<void>
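
The getJobResult signature accepts page and pageSize, but how the last page is signalled is not documented here. The sketch below makes two explicit assumptions: that pages is an array, and that a batch shorter than pageSize means nothing is left. Page numbering is also assumed to start at 1 (adjust firstPage if it does not). collectAllPages and PagedResultLike are hypothetical names, not SDK exports.

```typescript
// Only the `pages` field comes from the examples above; treating it as an
// array and paging from 1 are assumptions made for this sketch.
interface PagedResultLike<P> {
  pages: P[];
}

// Fetch consecutive result pages until one comes back shorter than pageSize.
async function collectAllPages<P>(
  fetchPage: (page: number, pageSize: number) => Promise<PagedResultLike<P>>,
  pageSize = 50,
  firstPage = 1,
): Promise<P[]> {
  const all: P[] = [];
  for (let page = firstPage; ; page++) {
    const batch = await fetchPage(page, pageSize);
    all.push(...batch.pages);
    // A short (or empty) batch is taken to mean there is nothing left.
    if (batch.pages.length < pageSize) return all;
  }
}
```

Under those assumptions, a call might look like collectAllPages((page, pageSize) => client.ocr.getJobResult(job.jobId, { page, pageSize })). Verify the pagination behaviour against the TypeScript definitions before relying on it.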

Processing Options

interface UploadOptions {
  format?: "structured" | "markdown";
  model?: OCRModel;
  schema?: Record<string, any>;
  instructions?: string;
  templateSlug?: string;
}

Development

Prerequisites

  • Node.js 18+
  • pnpm 9+

Setup

# Clone the repository
git clone https://github.com/leapocr/leapocr-js.git
cd leapocr-js

# Install dependencies
pnpm install

# Build the SDK
pnpm build

Common Tasks

pnpm build              # Build all packages
pnpm test               # Run unit tests
pnpm lint               # Run linters
pnpm format             # Format code
pnpm dev                # Development mode with watch

Running Examples

# Set your API key
export LEAPOCR_API_KEY="your-api-key"

# Run basic example
cd examples/basic
pnpm install
pnpm start

# Run advanced example
cd examples/advanced
pnpm install
pnpm start

Contributing

We welcome contributions! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'feat: add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support & Resources


Version: 0.0.5
