Case Transparency

What the models actually see.

Inputs, generated scans, task files, CSVs, and prompts for the current paperwork suite. Oracle solutions are intentionally not shown here.

Shown

Source files, generated document images, task instructions, prompts, and known trap categories.

Hidden

ground_truth.json, expected_artifacts.json, manual readings, and calibration notes.

P01 The Paperwork Trial

Basic Invoice Folder

A small scanned-invoice folder with a quote distractor, partial payment, and an under-review stamp.

4 images, 6 text files, Generated invoice images
customer vs vendor, quote is not an invoice, partial payment, under-review stamp

P02 The Paperwork Trial

Credit Note And Vendor Hold

A scanned folder with a credit note, partial payment, missing PO, and inactive-vendor warning.

4 images, 6 text files, Generated invoice images
credit note is not payable, partial payment, missing PO, inactive vendor

P03 The Paperwork Trial

Duplicate Risk Mix

A larger mixed folder combining earlier-looking scans, a previous-invoices file, credit note, quote, and duplicate-risk lookup.

8 images, 7 text files, Generated invoice images
duplicate-risk lookup, mixed folders, quote and credit note distractors, partial payment

P04 The Paperwork Trial

Tax ID Collision

A compact case around vendor identity and tax calculation conflicts.

1 image, 6 text files, Generated invoice images
vendor tax ID conflict, tax rounding mismatch, statement distractor

P05 The Paperwork Trial

PO Revision

A generated scan case where split payments and the latest purchase-order revision decide the outcome.

1 image, 6 text files, Generated invoice images
split payment, cancelled PO revision, quote distractor

W04 Paperwork Workflow

Messy Intake Folder

A chaotic intake folder where the model must identify active sources, ignore stale files, write manifests, and preserve incoming sources.

4 images, 9 text files, Agentic file workflow
old bank export, duplicate vendor file, draft PO list, non-invoice scan

W05 Paperwork Workflow

Email Attachment Intake

A versioning workflow with email context, revised invoice attachments, old references, and non-invoice screenshots.

4 images, 6 text files, Agentic file workflow
superseded invoice, revised attachment, old payment reference, proforma distractor

W06 Paperwork Workflow

Remittance Split

A remittance workflow where one payment has to be mapped across multiple final invoices while ignoring drafts and a proforma.

4 images, 7 text files, Agentic file workflow
single payment split, draft bank export, proforma distractor, active remittance batch

W07 Paperwork Workflow

Credit Offset Packet

A credit-offset packet with duplicate scans, a credit memo, statement distractor, inactive vendor, and cancelled PO.

5 images, 6 text files, Agentic file workflow
credit offset, duplicate scan, statement distractor, inactive vendor