
Endpoint

POST /generate/from-upload
Upload an existing document (PDF or image), detect its schema using vision AI, and generate verified photo variations.

Request

This endpoint uses multipart/form-data.
file (file, required)
PDF or image file (PNG, JPG) to upload.
presets (string, default: "full_picture,folded_skewed,blurry")
Comma-separated list of preset names.
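
As a rough illustration of the request format above, the following Python sketch encodes the two form fields by hand using only the standard library. The helper names `encode_multipart` and `build_upload_request` are hypothetical and not part of the API; the base URL is assumed to be the local server used in the example on this page.

```python
import uuid
from urllib import request


def encode_multipart(file_name, file_bytes, file_type, presets):
    """Encode the `file` and `presets` fields as a multipart/form-data body.

    Hypothetical helper: returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="presets"\r\n\r\n'
        # The endpoint expects preset names joined by commas.
        f"{','.join(presets)}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url, file_name, file_bytes, file_type, presets):
    """Build (but do not send) a POST request for /generate/from-upload."""
    body, content_type = encode_multipart(file_name, file_bytes, file_type, presets)
    return request.Request(
        f"{base_url}/generate/from-upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the upload. Omitting the `presets` part entirely should fall back to the documented default, though this sketch always sends it.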

Response

run_id (string)
Unique identifier for this generation run.
detected_schema (object)
Schema detected from the uploaded document.
ground_truth (object)
Flattened dict mapping each field name to its value, used for verification.
photos (array)
Array of verified photo results.
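
A caller will typically want only the photos that passed verification. The sketch below assumes each entry in `photos` carries `image_path` and `verified` keys, as in the example response on this page; the function name `verified_paths` is hypothetical.

```python
def verified_paths(response):
    """Return image paths of photos whose `verified` flag is true.

    Assumes the photo entry shape shown in the example response;
    entries missing the flag are treated as unverified.
    """
    return [
        photo["image_path"]
        for photo in response.get("photos", [])
        if photo.get("verified")
    ]
```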

Example

curl -X POST http://localhost:8000/generate/from-upload \
  -F "file=@/path/to/invoice.pdf" \
  -F "presets=full_picture,blurry"
Response
{
  "run_id": "h8i9j0k1l2",
  "detected_schema": {
    "document_type": "dispatch_guide",
    "header": {
      "doc_number": "00098765",
      "emitter_name": "ACME FOODS S.A."
    },
    "items": [
      {"pos": 1, "description": "HARINA DE TRIGO", "qty": 40, "unit": "UN"}
    ],
    "totals": {"subtotal": 500000, "tax": 95000, "total": 595000},
    "confidence": 0.93
  },
  "ground_truth": {
    "doc_number": "00098765",
    "emitter_name": "ACME FOODS S.A.",
    "item_1_description": "HARINA DE TRIGO",
    "item_1_qty": "40",
    "subtotal": "500000",
    "total": "595000"
  },
  "photos": [
    {
      "image_path": "output/h8i9j0k1l2/photo_full_picture.png",
      "verified": true,
      "attempts": 1
    }
  ]
}
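
The relationship between `detected_schema` and `ground_truth` in the response above can be approximated with a small flattening routine. This is a guess at the convention inferred from the example only (header and totals keys kept as-is, item keys prefixed with `item_<n>_`, values converted to strings), not the service's actual implementation. Numbering items by list order rather than by `pos` is also an assumption, and `flatten_schema` is a hypothetical name.

```python
def flatten_schema(schema):
    """Flatten a detected schema into a field_name -> string value dict.

    Inferred convention (assumption): header and totals fields keep their
    names, each item field becomes item_<n>_<field>, values become strings.
    """
    flat = {}
    for key, value in schema.get("header", {}).items():
        flat[key] = str(value)
    # Items are numbered from 1 by list order (assumption).
    for index, item in enumerate(schema.get("items", []), start=1):
        for key, value in item.items():
            flat[f"item_{index}_{key}"] = str(value)
    for key, value in schema.get("totals", {}).items():
        flat[key] = str(value)
    return flat
```

Note that the example `ground_truth` contains only a subset of the fields this sketch would produce (it omits `tax`, `item_1_pos`, and `item_1_unit`, for instance), so the service evidently selects which fields to verify.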