What the models actually see.
Inputs, generated scans, task files, CSVs, and prompts for the current paperwork suite. Oracle solutions are intentionally not shown here.
Shown
Source files, generated document images, task instructions, prompts, and known trap categories.
Hidden
ground_truth.json, expected_artifacts.json, manual readings, and calibration notes.
Basic Invoice Folder
A small scanned-invoice folder with a quote distractor, partial payment, and an under-review stamp.
Credit Note And Vendor Hold
A scanned folder with a credit note, partial payment, missing PO, and inactive-vendor warning.
Duplicate Risk Mix
A larger mixed folder combining earlier-looking scans, a previous-invoices file, credit note, quote, and duplicate-risk lookup.
Tax ID Collision
A compact case built around conflicts in vendor identity and tax calculation.
PO Revision
A generated-scan case in which split payments and the latest purchase-order revision decide the outcome.
Messy Intake Folder
A chaotic intake folder where the model must identify active sources, ignore stale files, write manifests, and preserve incoming sources.
Email Attachment Intake
A versioning workflow with email context, revised invoice attachments, old references, and non-invoice screenshots.
Remittance Split
A remittance workflow where one payment has to be mapped across multiple final invoices while ignoring drafts and a proforma.
Credit Offset Packet
A credit-offset packet with duplicate scans, a credit memo, a statement distractor, an inactive vendor, and a cancelled PO.