Custom Extraction Schemas API Reference

Save, manage, and reuse JSON Schema definitions for structured data extraction

Reusable Schemas

Define extraction schemas once and reference them by ID when calling the Extraction API. Each schema can carry an optional prompt hint, which is used as the extraction prompt unless the extract request supplies its own. Schemas are scoped to your tenant, and schema names must be unique within it.

Schema Management

POST /api/v2/custom-extraction-schemas

Create a new extraction schema. Returns 201 Created. Schema name must be unique within your tenant.

Request

json
{
  "name": "Invoice Line Items",
  "description": "Extracts vendor, line items, and totals from invoice images",
  "schema": {
    "type": "object",
    "properties": {
      "vendor": {"type": "string", "description": "Vendor or company name"},
      "invoice_number": {"type": "string"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "integer"},
            "unit_price": {"type": "number"}
          }
        }
      },
      "total": {"type": "number", "description": "Invoice total amount"}
    },
    "required": ["vendor", "line_items", "total"]
  },
  "prompt_hint": "This is a scanned invoice document"
}
// name: required, 1-255 chars, must be unique within tenant
// description: optional, max 1000 chars
// schema: required, JSON Schema with root type "object"
// Max depth: 3, max 50 properties per object,
// max 200 total nodes, max 32KB
// Allowed types: string, integer, number, boolean, array, object
// Supported composition: anyOf only (no $ref, allOf, oneOf)
// prompt_hint: optional, max 2000 chars — automatically used
// as the prompt when extracting with this schema (if no
// explicit prompt is provided in the extract request)
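
The documented limits can be partly checked on the client before sending a create request. The following is a minimal sketch, not the service's own validator: `precheck_schema` is a hypothetical helper, "32KB" is assumed to mean 32 * 1024 bytes of UTF-8 JSON, and the depth and total-node limits are left out because this page does not say how they are counted.

```python
import json

ALLOWED_TYPES = {"string", "integer", "number", "boolean", "array", "object"}
UNSUPPORTED_KEYWORDS = {"$ref", "allOf", "oneOf"}
MAX_PROPERTIES_PER_OBJECT = 50
MAX_SCHEMA_BYTES = 32 * 1024  # assumption: "32KB" = 32 * 1024 bytes of UTF-8 JSON


def precheck_schema(schema):
    """Return a list of problems; an empty list means none were detected."""
    if not isinstance(schema, dict) or schema.get("type") != "object":
        return ['root schema must have "type": "object"']
    problems = []
    if len(json.dumps(schema).encode("utf-8")) > MAX_SCHEMA_BYTES:
        problems.append("serialized schema is larger than 32KB")

    def visit(node, path):
        if not isinstance(node, dict):
            return
        for keyword in sorted(UNSUPPORTED_KEYWORDS & node.keys()):
            problems.append(f"{path}: unsupported keyword {keyword}")
        node_type = node.get("type")
        # Non-string types (e.g. a list of types) are flagged too; this is
        # a conservative assumption, since the page lists single types only.
        if node_type is not None and (
            not isinstance(node_type, str) or node_type not in ALLOWED_TYPES
        ):
            problems.append(f"{path}: unsupported type {node_type!r}")
        properties = node.get("properties")
        if isinstance(properties, dict):
            if len(properties) > MAX_PROPERTIES_PER_OBJECT:
                problems.append(f"{path}: more than 50 properties")
            for name, child in properties.items():
                visit(child, f"{path}.{name}")
        if "items" in node:
            visit(node["items"], f"{path}[]")
        any_of = node.get("anyOf")
        if isinstance(any_of, list):
            for index, option in enumerate(any_of):
                visit(option, f"{path}.anyOf[{index}]")

    visit(schema, "$")
    return problems
```

An empty result only means these particular checks passed; the server may still reject the schema with CUSTOM_SCHEMA_INVALID_SCHEMA for reasons this sketch does not cover.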

Response

json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Invoice Line Items",
  "description": "Extracts vendor, line items, and totals from invoice images",
  "schema": {
    "type": "object",
    "properties": {
      "vendor": {"type": "string", "description": "Vendor or company name"},
      "invoice_number": {"type": "string"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "integer"},
            "unit_price": {"type": "number"}
          }
        }
      },
      "total": {"type": "number", "description": "Invoice total amount"}
    },
    "required": ["vendor", "line_items", "total"]
  },
  "prompt_hint": "This is a scanned invoice document",
  "field_count": 4,
  "created_by": "u1234567-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-13T10:30:00Z",
  "updated_at": "2026-04-13T10:30:00Z"
}
// field_count: number of top-level properties in the schema
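
A create call might be assembled as in the sketch below, which uses only the Python standard library. The host in `BASE_URL`, the `API_TOKEN` placeholder, and the Bearer authorization header are assumptions for illustration; this page does not document the base URL or the authentication scheme. `build_create_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not from this page
API_TOKEN = "YOUR_API_TOKEN"          # hypothetical credential placeholder


def build_create_request(name, schema, description=None, prompt_hint=None):
    """Build (but do not send) a POST request that creates a schema."""
    body = {"name": name, "schema": schema}
    # description and prompt_hint are optional, so include them only if given.
    if description is not None:
        body["description"] = description
    if prompt_hint is not None:
        body["prompt_hint"] = prompt_hint
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v2/custom-extraction-schemas",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",  # assumed auth scheme
        },
        method="POST",
    )


# Sending it (requires a real host and credentials):
#   with urllib.request.urlopen(request) as response:
#       created = json.load(response)   # response.status should be 201
```

Keeping request construction separate from sending makes the body easy to inspect or log before it goes over the network.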

GET /api/v2/custom-extraction-schemas

List saved extraction schemas for your tenant, newest first.

Request

json
// Query parameters:
?limit=50 // optional, 1-500
&offset=0 // optional, default: 0

Response

json
{
  "items": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "Invoice Line Items",
      "description": "Extracts vendor, line items, and totals from invoice images",
      "field_count": 4,
      "created_by": "u1234567-e29b-41d4-a716-446655440000",
      "created_at": "2026-04-13T10:30:00Z",
      "updated_at": "2026-04-13T10:30:00Z"
    }
  ],
  "total_count": 1,
  "limit": 50,
  "offset": 0,
  "has_more": false
}
// List response returns summaries (no full schema body).
// Use GET /{schema_id} to retrieve the full schema.
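
Walking the full list with `limit`, `offset`, and `has_more` could look like the following sketch. `fetch_page` is a hypothetical callable supplied by the caller: it is assumed to perform the GET request with the given query parameters and return the parsed JSON response in the shape shown above.

```python
def iter_schema_summaries(fetch_page, limit=50):
    """Yield every schema summary by following limit/offset pagination."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        items = page["items"]
        yield from items
        # Stop when the server reports no further pages; the empty-page
        # check guards against looping forever on an unexpected response.
        if not page["has_more"] or not items:
            break
        offset += len(items)
```

Because the list returns summaries only, a caller that needs the schema body would follow up with a GET by ID for each `id` it yields.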

GET /api/v2/custom-extraction-schemas/{schema_id}

Get a saved extraction schema by ID, including the full schema body.

Response

json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Invoice Line Items",
  "description": "Extracts vendor, line items, and totals from invoice images",
  "schema": {
    "type": "object",
    "properties": {
      "vendor": {"type": "string", "description": "Vendor or company name"},
      "invoice_number": {"type": "string"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "integer"},
            "unit_price": {"type": "number"}
          }
        }
      },
      "total": {"type": "number", "description": "Invoice total amount"}
    },
    "required": ["vendor", "line_items", "total"]
  },
  "prompt_hint": "This is a scanned invoice document",
  "field_count": 4,
  "created_by": "u1234567-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-13T10:30:00Z",
  "updated_at": "2026-04-13T10:30:00Z"
}
// Returns 404 if schema not found for your tenant

PUT /api/v2/custom-extraction-schemas/{schema_id}

Update an existing extraction schema. This is a full replacement: name and schema are required, and an optional field (description, prompt_hint) that is omitted or null is cleared.

Request

json
{
  "name": "Invoice Line Items v2",
  "description": "Updated schema with currency field",
  "schema": {
    "type": "object",
    "properties": {
      "vendor": {"type": "string"},
      "invoice_number": {"type": "string"},
      "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "integer"},
            "unit_price": {"type": "number"}
          }
        }
      },
      "total": {"type": "number"}
    },
    "required": ["vendor", "line_items", "total"]
  },
  "prompt_hint": "This is a scanned invoice or receipt"
}
// name: required, 1-255 chars
// description: optional, max 1000 chars (omit or null to clear)
// schema: required, same validation rules as create
// prompt_hint: optional, max 2000 chars (omit or null to clear)
// Note: This is PUT (full replace), not PATCH
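
Because an omitted optional field is cleared, one cautious pattern is to fetch the current record, carry over its writable fields, and then apply the changes. The sketch below illustrates that idea; `build_update_body` and `WRITABLE_FIELDS` are hypothetical names, and the set of writable fields is inferred from the request example above.

```python
# Fields accepted in the PUT body, inferred from the request example above.
WRITABLE_FIELDS = ("name", "description", "schema", "prompt_hint")


def build_update_body(current, **changes):
    """Merge changes into a fetched schema record and return a PUT body.

    Server-managed fields such as id, field_count, and the timestamps are
    dropped. Passing None for description or prompt_hint sends null, which
    the notes above describe as clearing the field.
    """
    unknown = set(changes) - set(WRITABLE_FIELDS)
    if unknown:
        raise ValueError(f"not writable: {sorted(unknown)}")
    body = {
        field: current[field]
        for field in WRITABLE_FIELDS
        if current.get(field) is not None
    }
    body.update(changes)
    return body
```

For example, `build_update_body(current, name="Invoice Line Items v2")` keeps the existing description, schema, and prompt hint while changing only the name.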

Response

json
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Invoice Line Items v2",
  "description": "Updated schema with currency field",
  "schema": { ... },
  "prompt_hint": "This is a scanned invoice or receipt",
  "field_count": 5,
  "created_by": "u1234567-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-13T10:30:00Z",
  "updated_at": "2026-04-13T11:00:00Z"
}

DELETE /api/v2/custom-extraction-schemas/{schema_id}

Delete a saved extraction schema. Requires DELETE permission on IMAGES.

Response

json
{
  "message": "Custom extraction schema deleted successfully"
}
// Returns 404 if schema not found for your tenant

Error Responses

Common Errors

  • 400 CUSTOM_SCHEMA_NAME_EXISTS: A schema with this name already exists in your tenant
  • 400 CUSTOM_SCHEMA_INVALID_SCHEMA: Schema validation failed (depth, node count, unsupported types/keywords, empty body)
  • 400 CUSTOM_SCHEMA_INVALID_NAME: Name is blank or exceeds 255 characters
  • 400 CUSTOM_SCHEMA_INVALID_DESCRIPTION: Description exceeds 1000 characters
  • 400 CUSTOM_SCHEMA_INVALID_PROMPT_HINT: Prompt hint exceeds 2000 characters
  • 403: Missing required permission (EDIT for create/update, VIEW for read, DELETE for delete)
  • 404 CUSTOM_SCHEMA_NOT_FOUND: Schema not found for your tenant
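
Client code can branch on the status and error code listed above. The following is a rough sketch only: this page does not show the error response body, so `suggest_action` (a hypothetical helper) takes the HTTP status and the error code as plain arguments and leaves extracting the code from the response to the caller.

```python
VALIDATION_CODES = {
    "CUSTOM_SCHEMA_INVALID_SCHEMA",
    "CUSTOM_SCHEMA_INVALID_NAME",
    "CUSTOM_SCHEMA_INVALID_DESCRIPTION",
    "CUSTOM_SCHEMA_INVALID_PROMPT_HINT",
}


def suggest_action(status, code=None):
    """Map a documented error to a short hint for the caller."""
    if status == 400 and code == "CUSTOM_SCHEMA_NAME_EXISTS":
        return "choose a different schema name"
    if status == 400 and code in VALIDATION_CODES:
        return "fix the request body and resend"
    if status == 403:
        return "ask for the missing permission"
    if status == 404:
        return "check the schema ID"
    return "unexpected response; inspect it manually"
```

None of the documented errors is transient, so retrying the same request unchanged is unlikely to help.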