DocAI API Documentation

DocAI provides a REST API for extracting structured data from business documents programmatically. This is useful for automation workflows in Make, n8n, Zapier, or custom integrations.

Authentication

API key authentication is planned but not yet enabled (see the beta note above). You can create and manage keys on the Developer Settings page, but protected routes do not accept them today. Use Clerk session authentication instead.

Authorization: Bearer sk_docai_<your_key>   # planned — not accepted yet

API keys start with sk_docai_. Keep them secret — they are shown only once at creation.

Rate Limits

Anonymous requests: 5 per day. Authenticated users: 100 per day.

On limit exceeded, the API returns HTTP 429 with a Retry-After hint.
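A client can honor the 429 response by waiting before it retries. The sketch below assumes the hint arrives as a standard Retry-After HTTP header holding a number of seconds; this page does not specify the exact format, so the helper name `retry_delay` and its fallback default are hypothetical.

```python
def retry_delay(status_code, headers, default_seconds=60.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    Assumption: the hint is a Retry-After header containing seconds.
    """
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Missing or non-numeric hint (for example an HTTP date): fall back.
        return default_seconds
```

With daily quotas the wait may be long, so an automation may prefer to log the delay and stop rather than sleep.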

Error Format

{
  "detail": "File too large. Maximum upload size is 15 MB.",
  "request_id": "abc123"
}
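A small helper can turn this error body into a single log line. This is a minimal sketch that relies only on the two keys shown above (`detail` and `request_id`); the function name `describe_error` and the fallback messages are hypothetical.

```python
import json


def describe_error(body_text):
    """Turn an error response body into a one-line message for logs."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return "unparseable error body"
    if not isinstance(payload, dict):
        return "unparseable error body"
    detail = payload.get("detail", "unknown error")
    request_id = payload.get("request_id")
    # Include request_id when present so the failing call can be traced.
    if request_id:
        return f"{detail} (request_id={request_id})"
    return str(detail)
```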

Sync Document Extraction

POST /api/analyze

Upload a document and get extraction results synchronously. Best for small documents (< 2 pages).

Request

Multipart form data:

Field     Type    Required  Description
file      file    required  PDF, PNG, JPG, TIFF, BMP, or WEBP (max 15 MB)
filename  string  optional  Override the displayed filename

Example

curl -X POST https://docai.example.com/api/analyze \
  -H "Authorization: Bearer sk_docai_..." \
  -F "file=@purchase_order.pdf"

Response

{
  "request_id": "3fa8...",
  "filename": "purchase_order.pdf",
  "document_type": "purchase_order",
  "document_type_confidence": 0.94,
  "summary": "Purchase order from Acme Corp to Widget Co.",
  "language": "en",
  "extracted_fields": [
    {
      "field": "po_number",
      "value": "PO-2026-0042",
      "confidence": 0.97,
      "page": 1,
      "evidence": "PO-2026-0042"
    }
  ],
  "warnings": [],
  "ocr": { "pages": 1, "tokens": 312, "language": "eng+pol" },
  "llm": { "status": "success", "provider": "openai", "model": "gpt-4o-mini" }
}
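Because every extracted field carries a confidence score, a caller may want to route uncertain fields to manual review. The sketch below works on the response shape shown above; the function name `low_confidence_fields` and the 0.9 threshold are illustrative assumptions, not documented defaults.

```python
def low_confidence_fields(result, threshold=0.9):
    """Return names of extracted fields whose confidence is below threshold."""
    return [
        item["field"]
        for item in result.get("extracted_fields", [])
        # A field with no confidence value is treated as uncertain.
        if item.get("confidence", 0.0) < threshold
    ]
```

For the sample response, `low_confidence_fields(result)` returns an empty list because `po_number` has confidence 0.97.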

Streaming Extraction (SSE)

POST /api/analyze/stream

Same as the sync endpoint, but the response is a stream of Server-Sent Events (SSE) that report processing progress. Best for real-time UIs.
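This page does not list the event names or payloads the stream emits, so the sketch below only handles the generic SSE wire format (`event:` and `data:` lines, with a blank line ending each event). The function name `parse_sse` is hypothetical, and the event names used in the usage note are made up for illustration.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one optional space follows the colon
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Feeding it the lines `event: progress`, `data: {"pct": 50}`, and an empty line yields one tuple, `("progress", '{"pct": 50}')`.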

Create Async Job

POST /api/jobs

Upload a document and get a job ID. Processing happens in the background.

curl -X POST https://docai.example.com/api/jobs \
  -H "Authorization: Bearer sk_docai_..." \
  -F "file=@invoice.pdf"

Response

{ "job_id": "abc123", "status": "queued", "request_id": "xyz..." }

Get Job Status

GET /api/jobs/{job_id}

curl https://docai.example.com/api/jobs/abc123 \
  -H "Authorization: Bearer sk_docai_..."

Response

{
  "id": "abc123",
  "status": "completed",
  "progress_pct": 100,
  "document_type": "invoice",
  "pages_processed": 2
}

Possible statuses: queued, processing, completed, failed, cancelled.
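A polling loop should stop on any terminal status, not only completed. The sketch below treats completed, failed, and cancelled as terminal, which is an inference from the list above. The HTTP call is passed in as `fetch_status`, a hypothetical callable that returns the parsed job JSON, and the interval and timeout values are arbitrary.

```python
import time

# Assumption: these three statuses never change once reached.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=120.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until the job reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real delays. Only fetch the result when the returned status is completed.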

Get Job Result

GET /api/jobs/{job_id}/result

Returns the full extraction result once the job status is completed.

curl https://docai.example.com/api/jobs/abc123/result \
  -H "Authorization: Bearer sk_docai_..."

Export JSON

POST /api/export/json

Send an extraction result as the JSON request body to receive a clean export file.

curl -X POST https://docai.example.com/api/export/json \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_docai_..." \
  -d '{"request_id":"...","extracted_fields":[...]}' \
  -o export.json

Export CSV

POST /api/export/csv

curl -X POST https://docai.example.com/api/export/csv \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_docai_..." \
  -d '{"request_id":"...","extracted_fields":[...]}' \
  -o fields.csv

Export XLSX

POST /api/export/xlsx

Returns an Excel workbook with Summary, Fields, Line Items, Warnings, and Metadata sheets.

List History

GET /api/history

curl https://docai.example.com/api/history \
  -H "Authorization: Bearer sk_docai_..."

Delete All History

DELETE /api/history

Permanently deletes all saved analyses for your account.

List API Keys

GET /api/keys

Create API Key

POST /api/keys

curl -X POST https://docai.example.com/api/keys \
  -H "Authorization: Bearer <clerk_session_token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "My automation key"}'

Response

{
  "id": "...",
  "name": "My automation key",
  "key": "sk_docai_...",   // shown ONCE; save it now
  "prefix": "sk_docai_abc",
  "scopes": "extract",
  "created_at": "2026-06-10T12:00:00Z"
}

Revoke API Key

DELETE /api/keys/{key_id}

Create Webhook

POST /api/webhooks

curl -X POST https://docai.example.com/api/webhooks \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "My webhook", "endpoint_url": "https://my.app/hook", "event_types": ["document.completed"]}'

The response includes a secret for signature verification. Save it immediately; it will not be shown again.

Test Webhook

POST /api/webhooks/{webhook_id}/test

Verify Webhook Signature

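Every delivery includes a timestamp header and a signature header. The signature has the form sha256=<hex digest>, where the digest is an HMAC-SHA256 of "<timestamp>.<body>" keyed with the webhook secret: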

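import hashlib
import hmac

def verify(secret: str, body_str: str, timestamp: str, signature_header: str) -> bool:
    """Return True if signature_header matches the expected HMAC-SHA256."""
    # The signed message is the timestamp and the request body joined by a dot.
    msg = f"{timestamp}.{body_str}"
    digest = hmac.new(secret.encode(), msg.encode(), hashlib.sha256).hexdigest()
    # compare_digest compares in constant time, which avoids timing leaks.
    return hmac.compare_digest(f"sha256={digest}", signature_header)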
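To exercise a handler locally, it can help to produce a signature in the same format that verify expects. The sketch below mirrors that scheme in a hypothetical `sign` helper. It also adds an optional freshness check, `is_fresh`, which assumes the timestamp is Unix seconds and uses a 300-second tolerance; both the timestamp unit and the tolerance are assumptions, not documented behavior.

```python
import hashlib
import hmac


def sign(secret, body_str, timestamp):
    """Build a signature header value in the 'sha256=<hex>' format."""
    msg = f"{timestamp}.{body_str}"
    digest = hmac.new(secret.encode(), msg.encode(), hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def is_fresh(timestamp, now, tolerance_seconds=300):
    """Reject deliveries whose timestamp is too far from the current time.

    Assumption: timestamp is a Unix time in seconds.
    """
    try:
        sent = float(timestamp)
    except (TypeError, ValueError):
        return False
    return abs(now - sent) <= tolerance_seconds
```

Checking freshness in addition to the signature limits how long a captured delivery could be replayed.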

Use with Make / n8n / Zapier

Use the HTTP Request module in Make, n8n, or Zapier to call /api/jobs with your API key. Poll /api/jobs/{job_id} until status is completed, then fetch the result.

Alternatively, configure a webhook to receive results automatically — no polling needed.