Document Model
Every document in penquify is a Document object with a DocHeader and a list of DocItem entries. The header contains metadata (doc number, date, emitter, receiver, references) and the items contain line-level data (code, description, qty, unit, price).
Documents are rendered to HTML using Jinja2 templates, then screenshotted to PNG and PDF via Playwright.
The document model is intentionally flat and logistics-focused. It covers dispatch guides (guia de despacho), invoices, purchase orders, and bills of lading.
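As a rough illustration of the flat structure described above, the model could be sketched with plain dataclasses. Only the class names Document, DocHeader, and DocItem come from the text; the field names, types, sample values, and the `total` helper are assumptions made for this example and may not match penquify's actual definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocHeader:
    # Hypothetical field names for the metadata listed in the text.
    doc_number: str
    date: str
    emitter: str
    receiver: str
    references: List[str] = field(default_factory=list)


@dataclass
class DocItem:
    # Hypothetical field names for the line-level data listed in the text.
    code: str
    description: str
    qty: float
    unit: str
    price: float


@dataclass
class Document:
    header: DocHeader
    items: List[DocItem] = field(default_factory=list)


def total(doc: Document) -> float:
    """Sum qty * price over all items (illustrative helper, not a penquify API)."""
    return sum(item.qty * item.price for item in doc.items)


# Made-up sample data for a dispatch guide.
doc = Document(
    header=DocHeader("GD-0001", "2024-01-15", "Acme Ltda", "Bodega Sur", ["PO-42"]),
    items=[
        DocItem("SKU-1", "Pallet wrap", 2, "roll", 10.0),
        DocItem("SKU-2", "Packing tape", 5, "unit", 1.5),
    ],
)
print(total(doc))
```

Keeping the header and items as simple value objects means the same instance can serve both as template input and as the ground truth for later verification.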
Photo Variations
A PhotoVariation describes how a generated photo should look. It controls every aspect of the simulated capture:
- Camera: device model, year, lens equivalent
- Framing: document coverage, background, angle, skew, rotation
- Paper condition: curvature, folds, wrinkles, corner bends
- Artifacts: motion blur, glare, hand shadow, JPEG compression
- Damage: stains, dirt, torn edges
- Failure modes: cropped header, missing areas, overexposure
- Multi-page: staples, stacked sheets
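To make the categories above concrete, here is a minimal sketch of how such a variation might be represented and turned into a text description for an image-generation prompt. The class name PhotoVariation comes from the text; every field name, default value, and the `describe` helper are hypothetical and only cover a subset of the listed options.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhotoVariation:
    # All field names and defaults below are illustrative assumptions.
    camera: str = "generic smartphone"
    angle_deg: float = 0.0
    paper_folds: int = 0
    artifacts: List[str] = field(default_factory=list)      # e.g. "glare", "motion blur"
    damage: List[str] = field(default_factory=list)         # e.g. "coffee stain"
    failure_modes: List[str] = field(default_factory=list)  # e.g. "cropped header"
    stapled: bool = False


def describe(v: PhotoVariation) -> str:
    """Flatten a variation into a comma-separated description string."""
    parts = [f"shot on {v.camera}", f"tilted {v.angle_deg:g} degrees"]
    if v.paper_folds:
        parts.append(f"{v.paper_folds} visible fold(s)")
    parts += v.artifacts + v.damage + v.failure_modes
    if v.stapled:
        parts.append("pages stapled together")
    return ", ".join(parts)


v = PhotoVariation(angle_deg=15, paper_folds=1, artifacts=["glare"], damage=["coffee stain"])
print(describe(v))
# shot on generic smartphone, tilted 15 degrees, 1 visible fold(s), glare, coffee stain
```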
Ground Truth Verification
Penquify’s verification pipeline ensures generated photos actually contain the correct data:
- Blind extraction: A vision model (Gemini 2.5 Flash) reads the generated photo and extracts field values. It never sees the expected values.
- Programmatic comparison: Extracted values are compared against the source schema using normalized string matching. No model is involved in comparison.
- Retry on mismatch: If extracted fields do not match the source (an image generation error), penquify regenerates the photo, retrying up to N times and emphasizing the mismatched fields in the prompt.
- Occlusion is OK: Fields that are intentionally hidden (by crop, stain, blur, etc.) are not treated as errors.
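The comparison and retry steps above could look roughly like the following sketch. It is not penquify's implementation: the normalization rules (accent folding, lowercasing, whitespace collapsing), the function names, the `max_attempts` parameter, and the `generate_and_extract` callback that stands in for image generation plus blind extraction are all assumptions made for illustration.

```python
import re
import unicodedata
from typing import Callable, Dict, List, Optional, Set


def normalize(value: str) -> str:
    """Fold accents, lowercase, and collapse whitespace (assumed rules)."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", text).strip().lower()


def mismatches(expected: Dict[str, str], extracted: Dict[str, str],
               occluded: Set[str]) -> List[str]:
    """Return names of fields whose extracted value differs from the source.

    Fields listed in `occluded` are intentionally hidden and are skipped.
    """
    bad = []
    for name, want in expected.items():
        if name in occluded:
            continue
        if normalize(extracted.get(name, "")) != normalize(want):
            bad.append(name)
    return bad


def generate_verified(expected: Dict[str, str], occluded: Set[str],
                      generate_and_extract: Callable[[List[str]], Dict[str, str]],
                      max_attempts: int = 3) -> Optional[int]:
    """Retry until no field mismatches; return the attempt number, or None."""
    emphasize: List[str] = []
    for attempt in range(1, max_attempts + 1):
        extracted = generate_and_extract(emphasize)  # hypothetical callback
        emphasize = mismatches(expected, extracted, occluded)
        if not emphasize:
            return attempt
    return None


# Stub standing in for "generate a photo, then blind-extract its fields".
expected = {"doc_number": "GD-0001", "receiver": "Bodega Sur", "date": "2024-01-15"}
calls: List[List[str]] = []

def fake_generate_and_extract(emphasize: List[str]) -> Dict[str, str]:
    calls.append(list(emphasize))
    if len(calls) == 1:
        return {"doc_number": "GD-0007", "receiver": "  bodega  SUR "}
    return {"doc_number": "gd-0001", "receiver": "Bodega Sur"}

attempts = generate_verified(expected, {"date"}, fake_generate_and_extract)
print(attempts, calls)
```

In this stub run the first attempt garbles the document number, so the second call receives `["doc_number"]` as the fields to emphasize. The `date` field is marked as occluded, so its absence from the extraction never counts as a mismatch.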
Occlusion Manifest
For each generated photo, penquify produces an occlusion manifest that explains why each field is or isn’t visible:
- visible: field was correctly extracted
- occluded_by_crop(top 10-15%): field hidden by intentional crop
- obscured_by_coffee_stain(upper_right): field covered by stain
- blurred_by_motion(horizontal): field illegible due to motion blur
- distorted_by_extreme_angle: field warped beyond readability
- hallucinated_or_garbled_by_image_gen: image generation error (mismatch)
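As a sketch of how a consumer might read such a manifest, the snippet below assumes it is a mapping from field name to one of the status strings listed above, with an optional detail in parentheses. The manifest's actual shape, the field names, and the `parse_status` and `summarize` helpers are assumptions, not a documented penquify API.

```python
import re
from typing import Dict, List, Optional, Tuple

ERROR_STATUS = "hallucinated_or_garbled_by_image_gen"


def parse_status(status: str) -> Tuple[str, Optional[str]]:
    """Split 'reason(detail)' into (reason, detail); detail may be None."""
    m = re.fullmatch(r"(\w+)(?:\((.*)\))?", status.strip())
    if m is None:
        raise ValueError(f"unrecognized status: {status!r}")
    return m.group(1), m.group(2)


def summarize(manifest: Dict[str, str]) -> Dict[str, List[str]]:
    """Group fields into visible, intentionally occluded, and generation errors."""
    summary: Dict[str, List[str]] = {"visible": [], "occluded": [], "error": []}
    for field_name, status in manifest.items():
        reason, _detail = parse_status(status)
        if reason == "visible":
            summary["visible"].append(field_name)
        elif reason == ERROR_STATUS:
            summary["error"].append(field_name)
        else:
            summary["occluded"].append(field_name)
    return summary


# Hypothetical manifest for one generated photo.
manifest = {
    "doc_number": "occluded_by_crop(top 10-15%)",
    "receiver": "visible",
    "emitter": "obscured_by_coffee_stain(upper_right)",
    "date": "hallucinated_or_garbled_by_image_gen",
}
print(summarize(manifest))
```

Separating intentional occlusion from generation errors this way mirrors the verification rule that hidden fields are not treated as failures.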