PBN Research Lab

Tool

Notebook-first paint-by-number R&D: compare pipelines, score outputs, iterate locally with checkpointed runs.



Hypothesis


I can compare paint-by-number (PBN) pipelines (A/B/C) using repeatable metrics, optional human rubric scores, and runs logged to disk, so that improvement is driven by data rather than by eyeballing the latest PNG folder.




Why This Matters to Me


This is lab tooling: judgment-heavy image work benefits from a tight loop (run → see variants → score → log). The goal is to learn what “good PBN” means before any product wrapper exists.




Who It's For


Me (or anyone cloning the repo) running local experiments on their own images and run logs. Not for a public SaaS, automated hosting, or customer support; the PRD keeps those out of scope.




What It Does


  • Local web UI plus CLI to run pipelines, view previews, and read auto metrics
  • Human sliders (e.g. subject clarity, paintability) blended with auto scores per configured weights
  • One-click logging to `assets/output/` (runs, manifests) for traceability
  • Sweeps for budgeted search when the UI is not enough
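
The score-then-log step in the list above could look roughly like the sketch below. It is an illustration under stated assumptions, not the lab's actual code: the function names `blend_score` and `log_run`, the metric names, the weight values, and the `runs/<run_id>/manifest.json` layout are all hypothetical. Only the idea of blending human sliders with auto scores by configured weights, and of writing runs and manifests under `assets/output/`, comes from the description.

```python
import json
import tempfile
from pathlib import Path


def blend_score(auto, human, weights):
    """Weighted average over the configured scores that are present.

    Human scores are optional: a weight whose score is missing is skipped
    and the remaining weights are renormalised.
    """
    scores = {**auto, **human}
    used = {name: w for name, w in weights.items() if name in scores}
    total = sum(used.values())
    if total <= 0:
        raise ValueError("no weighted scores available")
    return sum(scores[name] * w for name, w in used.items()) / total


def log_run(out_dir, run_id, pipeline, auto, human, weights):
    """Write one run manifest as JSON and return its path.

    The directory layout is an assumption made for this sketch.
    """
    run_dir = Path(out_dir) / "runs" / run_id
    run_dir.mkdir(parents=True, exist_ok=True)
    manifest = {
        "run_id": run_id,
        "pipeline": pipeline,
        "auto_metrics": auto,
        "human_scores": human,
        "weights": weights,
        "blended_score": blend_score(auto, human, weights),
    }
    path = run_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return path


if __name__ == "__main__":
    # Hypothetical metric names and values, all on a 0..1 scale.
    auto = {"region_count_fit": 0.8, "edge_smoothness": 0.4}
    human = {"subject_clarity": 0.9, "paintability": 0.5}
    weights = {"region_count_fit": 1, "edge_smoothness": 1,
               "subject_clarity": 2, "paintability": 2}
    # A temporary directory stands in for assets/output/ so the demo
    # leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(log_run(tmp, "run-001", "A", auto, human, weights).read_text())
```

Storing the weights and the raw scores next to the blended value means a run can be re-scored later under different weights without re-running the pipeline.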



Existing Options


| Product / approach | Price | User base | Strength | Limitation |
| --- | --- | --- | --- | --- |
| Ad hoc scripts / notebooks | $0 | universal | Flexible | Drift; hard to compare runs |
| ML experiment trackers | Varies | teams | Strong for model metrics | Overkill; may not map to PBN human judgments |
| Consumer PBN apps | Free–$ | large; unknown | Productized output | Opaque; not a research surface |

Gap: a local-first loop tuned to PBN—variants, side-by-sides, logged judgment—in one place.