Codex reference: strong workflow closure
Codex is not a local LM Studio run. It is kept as a reference line for what stronger agentic tooling does on the same public cases.
Practical score: 83.3%
Resolved: 7/9
Core pass: 8/9
Visual sample: City Plan SVG passed
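As a quick check on how the headline figures relate, the sketch below converts the two counts into percentages. The page does not define how the practical score is computed. One reading that happens to match the published 83.3% is the mean of the resolved rate and the core pass rate, and that reading is an assumption made here for illustration, not a documented formula. The names `pct` and `assumed_practical` are hypothetical.

```python
from fractions import Fraction

TOTAL_CASES = 9  # size of the public case set, from the figures above

resolved = Fraction(7, TOTAL_CASES)   # 7/9 resolved
core_pass = Fraction(8, TOTAL_CASES)  # 8/9 core pass


def pct(value: Fraction) -> str:
    """Format a fraction as a percentage with one decimal place."""
    return f"{float(value) * 100:.1f}%"


# ASSUMPTION: the practical score is the mean of the two rates.
# The source does not state this; it is only one formula that
# reproduces the published 83.3%.
assumed_practical = (resolved + core_pass) / 2

print("Resolved rate:  ", pct(resolved))
print("Core pass rate: ", pct(core_pass))
print("Assumed practical score:", pct(assumed_practical))
```

Other weightings, such as half credit for one near miss (7.5/9), give the same number, so the arithmetic alone cannot confirm which scoring rule the site uses.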
What Worked
- Best current practical score across the full public case set.
- Strong at preserving protected input folders while producing required artifacts.
- Most failures were narrow near misses rather than broad document misunderstanding.
Where It Broke
- Still failed one case strictly and had proof/evidence misses.
- Not directly comparable to local-only LM Studio runs.
- Useful as a ceiling/reference, not as the point of the site.
Readout
The reference run shows what clean workflow closure looks like; local models are measured against the same artifacts, not against marketing claims.