The Verification Pipeline
Penquify’s verification system ensures that generated photos contain the correct document data. The key design principle: the extraction model never sees ground truth values.
Blind extraction
A separate Gemini 2.5 Flash call receives the generated photo and a list of field names. It extracts values with confidence scores. It does NOT know the expected values.
Programmatic comparison
Extracted values are compared against the source schema in Python code. No model is involved. Values are normalized (whitespace stripped, lowercased, and `$`, commas, and dots removed).
Classify results
Each field gets a status:
- `match`: extracted value matches ground truth
- `mismatch`: extracted value differs (image generation error)
- `illegible`: model can’t read it (confidence below 0.5)
- `not_visible`: field not in frame (cropped, occluded)
Extraction Prompt
The extraction model receives:
- The generated photo
- A JSON list of field names to look for
For each field, it returns:
- `value`: what it read (or `null`)
- `confidence`: 0.0 to 1.0
- `reason`: `null`, `"blurry"`, `"cropped"`, `"occluded"`, or `"not_in_frame"`
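To make the response shape concrete, here is a minimal parsing sketch. It assumes the model replies with a JSON object keyed by field name, which the text above does not confirm. `ExtractedField` and `parse_extraction` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Reasons listed in the extraction prompt description above.
ALLOWED_REASONS = {None, "blurry", "cropped", "occluded", "not_in_frame"}


@dataclass
class ExtractedField:
    value: Optional[str]
    confidence: float
    reason: Optional[str]


def parse_extraction(raw_json: str, field_names: list[str]) -> dict[str, ExtractedField]:
    """Parse a JSON reply into one ExtractedField per requested field.

    Assumption: the reply is an object keyed by field name.
    """
    data = json.loads(raw_json)
    parsed: dict[str, ExtractedField] = {}
    for name in field_names:
        entry = data.get(name)
        if entry is None:
            # Illustrative choice: a field missing from the reply is treated
            # as unread, with zero confidence and no reason.
            parsed[name] = ExtractedField(None, 0.0, None)
            continue
        reason = entry.get("reason")
        if reason not in ALLOWED_REASONS:
            raise ValueError(f"unexpected reason for {name!r}: {reason!r}")
        value = entry.get("value")
        parsed[name] = ExtractedField(
            value=None if value is None else str(value),
            confidence=float(entry.get("confidence", 0.0)),
            reason=reason,
        )
    return parsed
```

Rejecting unknown `reason` values early keeps the later comparison step from silently misclassifying a field.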
Comparison Logic
Comparison is pure Python, with no model involved:

| Condition | Status |
|---|---|
| `value is None` or `confidence == 0`, and reason is `cropped` or `not_in_frame` | `not_visible` |
| `value is None` or `confidence == 0`, and any other reason | `illegible` |
| `confidence < 0.5` | `illegible` |
| `normalize(extracted) == normalize(expected)` | `match` |
| Otherwise | `mismatch` |
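The table can be read as a chain of checks evaluated from top to bottom. The sketch below is one possible reading, not Penquify's actual code. It assumes the first matching row wins and that "strip whitespace" means leading and trailing whitespace only. The `classify` signature is hypothetical.

```python
from typing import Optional


def normalize(value) -> str:
    """Strip surrounding whitespace, lowercase, and drop "$", commas, and dots."""
    text = str(value).strip().lower()
    for ch in ("$", ",", "."):
        text = text.replace(ch, "")
    return text


def classify(value: Optional[str], confidence: float,
             reason: Optional[str], expected) -> str:
    """Return one of: not_visible, illegible, match, mismatch."""
    # Rows 1 and 2: nothing was read at all.
    if value is None or confidence == 0:
        if reason in ("cropped", "not_in_frame"):
            return "not_visible"
        return "illegible"
    # Row 3: something was read, but with low confidence.
    if confidence < 0.5:
        return "illegible"
    # Rows 4 and 5: compare normalized strings.
    if normalize(value) == normalize(expected):
        return "match"
    return "mismatch"
```

For example, `classify("$1,250.00", 0.9, None, "1250.00")` yields `match`, because both sides normalize to `125000`. Because dots are removed, `12.50` and `1250` also compare equal under this normalization.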
Verification Result
Verified Generation
The `generate_verified_photo()` function combines generation, verification, and retry:
- Generate photo
- Verify against schema
- If mismatches exist and retries remain, regenerate with emphasis on wrong fields
- Return result with `verified: true/false`, attempt count, verification details, and occlusion manifest
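The steps above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the real `generate_verified_photo()` signature. `generate_with_retries`, its `generate` and `verify` callables, and the `max_attempts` parameter are all hypothetical. It also assumes `verified` means that no field ended with a `mismatch` status, and it omits the occlusion manifest because the text does not say where that comes from.

```python
from typing import Callable


def generate_with_retries(
    generate: Callable[[list[str]], object],
    verify: Callable[[object], dict[str, str]],
    max_attempts: int = 3,
) -> dict:
    """Generate, verify, and regenerate with emphasis on mismatched fields.

    generate(emphasis) returns a photo; emphasis lists fields to stress.
    verify(photo) returns a mapping of field name to status string.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    emphasis: list[str] = []
    statuses: dict[str, str] = {}
    photo = None
    attempt = 0
    for attempt in range(1, max_attempts + 1):
        photo = generate(emphasis)
        statuses = verify(photo)
        # Fields that came back wrong get emphasized on the next attempt.
        emphasis = [name for name, status in statuses.items()
                    if status == "mismatch"]
        if not emphasis:
            break

    return {
        "verified": not emphasis,
        "attempts": attempt,
        "verification": statuses,
        "photo": photo,
    }
```

Passing the generator and verifier in as callables keeps the loop testable without calling any model.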