Overview
Penquify was designed for OCR benchmarking. Generate documents with known data, apply controlled degradations, and compare your model’s extractions against ground truth, using the occlusion manifest to tell you which fields are fairly testable.
Benchmarking Workflow
Define documents with known data
Create documents where every field value is known (the ground truth).
Generate photos with controlled variations
Apply specific presets to create photos with known degradation levels.
Example: Benchmark Matrix
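A benchmark matrix pairs every ground-truth document with every degradation preset, so each cell is one photo to generate and score. The sketch below only builds that job list. The document IDs, the `easy`/`medium`/`hard` preset names, and the `BenchmarkJob` type are illustrative assumptions, and the photo generation call itself is left out because it depends on Penquify's API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BenchmarkJob:
    """One cell of the matrix: a document rendered under one preset."""
    document_id: str
    preset: str


def build_matrix(document_ids, presets):
    """Return one job per (document, preset) pair, documents varying slowest."""
    return [BenchmarkJob(doc, preset) for doc, preset in product(document_ids, presets)]


# Hypothetical document IDs and preset names, for illustration only.
documents = ["invoice_001", "invoice_002"]
presets = ["easy", "medium", "hard"]

jobs = build_matrix(documents, presets)
for job in jobs:
    print(f"{job.document_id} x {job.preset}")
```

Keeping the matrix as plain data makes it easy to record a score per cell later and to compare results along either axis (per document or per preset).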
Scoring Your Model
Load the ground truth and occlusion manifest, then score each extracted field against them.
Metrics to Track
| Metric | Description |
|---|---|
| Visible field accuracy | Correct extractions / total visible fields |
| Partial read rate | Partially correct extractions on illegible fields |
| Hallucination rate | Non-null extractions on occluded/not_visible fields |
| Degradation curve | Accuracy across easy -> medium -> hard presets |
| Per-field robustness | Which fields are most sensitive to degradation |
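
The first and third metrics in the table can be computed directly from three field-keyed mappings. This is a minimal sketch, not Penquify's scoring code: the dictionary shapes and the manifest status strings (`visible`, `illegible`, `occluded`, `not_visible`) are assumptions inferred from the metric descriptions above, and the sample values are made up.

```python
def score(ground_truth, manifest, extracted):
    """Compute per-photo metrics from three dicts keyed by field name.

    Assumed shapes (not taken from Penquify's file format):
      ground_truth: field -> true value
      manifest:     field -> "visible" | "illegible" | "occluded" | "not_visible"
      extracted:    field -> model output, or None / missing when nothing was read
    """
    visible = [f for f in ground_truth if manifest.get(f) == "visible"]
    hidden = [f for f in ground_truth if manifest.get(f) in ("occluded", "not_visible")]

    # Exact-match correctness on fields the photo actually shows.
    correct = sum(1 for f in visible if extracted.get(f) == ground_truth[f])
    # Any non-null output on a field the photo does not show counts as hallucinated.
    hallucinated = sum(1 for f in hidden if extracted.get(f) is not None)

    return {
        "visible_field_accuracy": correct / len(visible) if visible else None,
        "hallucination_rate": hallucinated / len(hidden) if hidden else None,
    }


# Made-up sample data for one photo.
ground_truth = {"invoice_number": "INV-1042", "total": "118.50",
                "date": "2024-03-01", "vendor": "Acme"}
manifest = {"invoice_number": "visible", "total": "visible",
            "date": "occluded", "vendor": "not_visible"}
extracted = {"invoice_number": "INV-1042", "total": "113.50",
             "date": "2024-03-07", "vendor": None}

result = score(ground_truth, manifest, extracted)
print(result)
```

Partial read rate is left out because it needs a definition of "partially correct" (for example, a string-similarity threshold) that the table does not specify. A degradation curve follows from averaging `visible_field_accuracy` over all photos generated with each preset.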