DocAI API Documentation
DocAI provides a REST API for extracting structured data from business documents programmatically. This is useful for automation workflows in Make, n8n, Zapier, or custom integrations.
Authentication
The programmatic API key flow is planned (see the beta note above). You can already create and manage keys on the Developer Settings page, but protected routes do not accept them yet; use Clerk session authentication instead.
Authorization: Bearer sk_docai_<your_key> # planned — not accepted yet
API keys start with sk_docai_. Keep them secret — they are shown only once at creation.
Rate Limits
Anonymous requests: 5 per day. Authenticated users: 100 per day.
On limit exceeded, the API returns HTTP 429 with a Retry-After hint.
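A client could turn that hint into a wait time along the lines of the sketch below. It assumes the hint arrives as a standard Retry-After HTTP header holding either a number of seconds or an HTTP date; the documentation above does not say which form DocAI uses. The `retry_delay` helper and its 60-second fallback are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, now=None, default=60.0):
    """Return seconds to wait before retrying after an HTTP 429.

    Assumption: `retry_after` is a standard Retry-After header value,
    either delta-seconds ("120") or an HTTP date. Anything missing or
    unparseable falls back to `default`.
    """
    if retry_after is None:
        return default
    value = retry_after.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative delay if the date is already in the past.
    return max(0.0, (when - now).total_seconds())
```

With daily quotas the returned delay may be long, so an automation may prefer to reschedule the run rather than sleep in place.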
Error Format
{
"detail": "File too large. Maximum upload size is 15 MB.",
"request_id": "abc123"
}
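As a minimal sketch, a client might wrap this body in an exception so that `request_id` is available when reporting a problem. `DocAIError` and `parse_error` are hypothetical names, and the fallback message covers responses that are not JSON (for example, an error page from a proxy).

```python
import json


class DocAIError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status, detail, request_id=None):
        super().__init__(f"HTTP {status}: {detail} (request_id={request_id})")
        self.status = status
        self.detail = detail
        self.request_id = request_id


def parse_error(status, body):
    """Build a DocAIError from an error response body (str or bytes)."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    detail = payload.get("detail") or "Unknown error"
    return DocAIError(status, str(detail), payload.get("request_id"))
```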
Sync Document Extraction
POST /api/analyze
Upload a document and get extraction results synchronously. Best for small documents (< 2 pages).
Request
Multipart form data:
| Field | Type | Required | Description |
|---|---|---|---|
| file | file | required | PDF, PNG, JPG, TIFF, BMP, or WEBP (max 15 MB) |
| filename | string | optional | Override the displayed filename |
Example
curl -X POST https://docai.example.com/api/analyze \
-H "Authorization: Bearer sk_docai_..." \
-F "file=@purchase_order.pdf"
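For integrations written in Python, the same upload can be sketched with only the standard library. The block below builds the multipart request without sending it; `build_analyze_request` is a hypothetical helper, and `token` stands for whichever bearer credential your deployment currently accepts.

```python
import mimetypes
import uuid
from urllib.request import Request


def build_analyze_request(base_url, filename, content, token):
    """Build (but do not send) a multipart POST to /api/analyze."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # One form part named "file", matching the field in the table above.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    body = head + content + f"\r\n--{boundary}--\r\n".encode()
    return Request(
        f"{base_url}/api/analyze",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the response body is the JSON document described under Response.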
Response
{
"request_id": "3fa8...",
"filename": "purchase_order.pdf",
"document_type": "purchase_order",
"document_type_confidence": 0.94,
"summary": "Purchase order from Acme Corp to Widget Co.",
"language": "en",
"extracted_fields": [
{
"field": "po_number",
"value": "PO-2026-0042",
"confidence": 0.97,
"page": 1,
"evidence": "PO-2026-0042"
}
],
"warnings": [],
"ocr": { "pages": 1, "tokens": 312, "language": "eng+pol" },
"llm": { "status": "success", "provider": "openai", "model": "gpt-4o-mini" }
}
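Because each extracted field carries its own confidence score, a consumer may want to route low-confidence values to manual review. The sketch below shows one way to do that; `split_by_confidence` is a hypothetical helper and the 0.9 cutoff is an illustrative value, not a DocAI recommendation.

```python
def split_by_confidence(result, threshold=0.9):
    """Partition extracted fields into (accepted, needs_review).

    `accepted` maps field name to value for fields at or above the
    threshold; `needs_review` keeps the full entries (page, evidence)
    so a reviewer can check them against the document.
    """
    accepted, needs_review = {}, []
    for item in result.get("extracted_fields", []):
        if item.get("confidence", 0.0) >= threshold:
            accepted[item["field"]] = item["value"]
        else:
            needs_review.append(item)
    return accepted, needs_review
```

Applied to the response above, `accepted` would contain `po_number` and `needs_review` would be empty.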
Streaming Extraction (SSE)
POST /api/analyze/stream
Same as sync, but streams Server-Sent Events showing processing progress. Best for real-time UIs.
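The event names and payloads of this stream are not documented here, so the sketch below only handles the generic Server-Sent Events framing: `event:` and `data:` fields, with a blank line ending each event. `iter_sse_events` is a hypothetical helper that takes already-decoded text lines from whichever HTTP client you use.

```python
def iter_sse_events(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines.

    Lines starting with ":" are comments (often keep-alives) and are
    skipped. An event with no explicit name defaults to "message".
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

A UI would typically decode each `data` string (JSON is common, though that is an assumption here) and update a progress indicator as events arrive.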
Create Async Job
POST /api/jobs
Upload a document and get a job ID. Processing happens in the background.
curl -X POST https://docai.example.com/api/jobs \
-H "Authorization: Bearer sk_docai_..." \
-F "file=@invoice.pdf"
{ "job_id": "abc123", "status": "queued", "request_id": "xyz..." }
Get Job Status
GET /api/jobs/{job_id}
curl https://docai.example.com/api/jobs/abc123 \
-H "Authorization: Bearer sk_docai_..."
{
"id": "abc123",
"status": "completed",
"progress_pct": 100,
"document_type": "invoice",
"pages_processed": 2
}
Possible statuses: queued, processing, completed, failed, cancelled.
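Of these, completed, failed, and cancelled are terminal, so a polling client can stop on any of them. The sketch below illustrates such a loop; `wait_for_job` and the caller-supplied `fetch_status` function (which is expected to perform the GET above and return the decoded JSON) are hypothetical, and the interval and timeout values are arbitrary.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status and return that payload.

    `fetch_status(job_id)` must return the job status body as a dict.
    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The caller still has to check the returned status: only completed means a result is available to fetch.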
Get Job Result
GET /api/jobs/{job_id}/result
Returns the full extraction result once the job status is completed.
curl https://docai.example.com/api/jobs/abc123/result \
-H "Authorization: Bearer sk_docai_..."
Export JSON
POST /api/export/json
Send an extraction result as the JSON body and receive a clean export file.
curl -X POST https://docai.example.com/api/export/json \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk_docai_..." \
-d '{"request_id":"...","extracted_fields":[...]}' \
-o export.json
Export CSV
POST /api/export/csv
curl -X POST https://docai.example.com/api/export/csv \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk_docai_..." \
-d '{"request_id":"...","extracted_fields":[...]}' \
-o fields.csv
Export XLSX
POST /api/export/xlsx
Returns an Excel workbook with Summary, Fields, Line Items, Warnings, and Metadata sheets.
List History
GET /api/history
curl https://docai.example.com/api/history \
-H "Authorization: Bearer sk_docai_..."
Delete All History
DELETE /api/history
Permanently deletes all saved analyses for your account.
List API Keys
GET /api/keys
Create API Key
POST /api/keys
curl -X POST https://docai.example.com/api/keys \
-H "Authorization: Bearer <clerk_session_token>" \
-H "Content-Type: application/json" \
-d '{"name": "My automation key"}'
{
"id": "...",
"name": "My automation key",
"key": "sk_docai_...", // shown ONCE — save it now
"prefix": "sk_docai_abc",
"scopes": "extract",
"created_at": "2026-06-10T12:00:00Z"
}
Revoke API Key
DELETE /api/keys/{key_id}
Create Webhook
POST /api/webhooks
curl -X POST https://docai.example.com/api/webhooks \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"name": "My webhook", "endpoint_url": "https://my.app/hook", "event_types": ["document.completed"]}'
The response includes a secret for signature verification. Save it now; it won't be shown again.
Test Webhook
POST /api/webhooks/{webhook_id}/test
Verify Webhook Signature
Every delivery includes these headers:
X-DocAI-Event: event type (e.g., document.completed)
X-DocAI-Timestamp: ISO 8601 UTC timestamp
X-DocAI-Signature: sha256=<hmac>
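import hashlib
import hmac

def verify(secret, body_str, timestamp, signature_header):
    # The signed message is "<timestamp>.<raw request body>", keyed with the webhook secret.
    msg = f"{timestamp}.{body_str}"
    digest = hmac.new(secret.encode(), msg.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the expected digest through timing.
    return hmac.compare_digest(f"sha256={digest}", signature_header)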
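A valid signature does not by itself stop a captured delivery from being replayed later. Since the timestamp is part of the signed message, a receiver could also reject deliveries whose X-DocAI-Timestamp is too old. The sketch below is one way to do that; `is_fresh` and the 5-minute window are hypothetical choices, and the accepted timestamp shapes are an assumption based on the "ISO 8601 UTC" description above.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(timestamp, now=None, tolerance=timedelta(minutes=5)):
    """Return True if an X-DocAI-Timestamp value is within `tolerance` of now.

    Assumption: the value looks like "2026-06-10T12:00:00Z" or carries an
    explicit UTC offset. Unparseable values are treated as not fresh.
    """
    text = timestamp.strip()
    if text.endswith("Z"):
        # fromisoformat() only accepts a trailing "Z" on newer Python versions.
        text = text[:-1] + "+00:00"
    try:
        sent = datetime.fromisoformat(text)
    except ValueError:
        return False
    if sent.tzinfo is None:
        sent = sent.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return abs(now - sent) <= tolerance
```

Run this check only after the signature has been verified, so the timestamp being tested is known to be the one that was signed.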
Use with Make / n8n / Zapier
Use the HTTP Request module in Make, n8n, or Zapier to call /api/jobs with your API key. Poll /api/jobs/{job_id} until status is completed (stop if it becomes failed or cancelled), then fetch the result from /api/jobs/{job_id}/result.
Alternatively, configure a webhook to receive results automatically — no polling needed.