Guildhall 0.6.0
Guildhall 0.6.0 turns the construction model into product behavior: project journey, blueprint, framing, trade work, inspection, change orders, punch list, and scoped learning. It also adds the first full policy and learning loop: bounded recovery, typed decision packets, reflection, scoped memory, project playbooks, and a replayable model comparison harness.
What changed
- Failure classification: worker blockers and no-progress paths now produce typed classifications before task state changes.
- Project construction model: Guildhall now treats planning, task shaping, implementation, review, and release readiness as visible construction layers instead of a flat task queue.
- Durable artifact rule: narration-only progress is not enough. Agents must create or update a blueprint, decision, question, diff, verification result, review finding, change order, or learning record.
- Bounded recovery playbooks: common failures map to named recovery paths with allowed tools, path bounds, max turns, and stop conditions.
- Typed handoff packets: review and gate-check stages can inspect compact decision packets instead of asking the worker to restate the whole diff.
- Reflection and learning: completed work, blockers, playbook outcomes, user corrections, and model-lane failures can emit learning candidates.
- Scoped memory: the coordinator routes lessons into project memory, project skills, user/global preferences, product suggestions, model-lane recommendations, or task-audit-only records.
- Memory and habits UI: Settings now exposes project memories, cross-project preferences, project playbooks, and Guildhall product ideas without adding new approval steps to the main task flow.
- Product feedback drafts: product ideas can open a prefilled GitHub issue draft with the suggestion, evidence, project path, and source id.
- Model comparison harness: replay scenarios now record outcome, cost, false escalations, false approvals, playbook results, and packet quality.
- Clearer Thread control surface: imported notes now ask for a task brief instead of vague "shaping," crowded Thread phases collapse into compact operation rows, and task-scoped questions let you ask for context without accidentally answering the question.
- Project memory check-in: setup and existing-work review can show "What Guildhall knows right now" as a current snapshot from files and setup state, while the editable project direction remains the durable owner input.
- Readiness language cleanup: setup blockers now point to "readiness checks" instead of the confusing "Open Ready" label.
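To illustrate the failure classification and bounded recovery playbook items in the list above, here is a minimal sketch of how a typed classification might select a playbook and how its bounds might be enforced. Every type, field, tool name, and matching rule below is an assumption made for illustration. Only the playbook id `repair_touched_file_failure` appears in these notes, and pairing it with a typecheck failure is also a guess rather than Guildhall's documented mapping.

```typescript
// Hypothetical shapes for illustration; not taken from Guildhall's codebase.
type FailureClass = "typecheck_failure" | "missing_tool" | "no_progress" | "unknown";

interface RecoveryPlaybook {
  name: string;
  allowedTools: string[];   // tools the recovery run may call
  pathBounds: string[];     // path prefixes the recovery run may touch
  maxTurns: number;         // hard cap on recovery turns
  stopConditions: string[]; // human-readable reasons to stop early
}

// Assumed mapping from failure class to a named recovery path.
const PLAYBOOKS: Partial<Record<FailureClass, RecoveryPlaybook>> = {
  typecheck_failure: {
    name: "repair_touched_file_failure",
    allowedTools: ["read_file", "edit_file", "run_typecheck"],
    pathBounds: ["src/"],
    maxTurns: 3,
    stopConditions: ["typecheck passes", "edit outside path bounds requested"],
  },
};

// Classify before any task state changes. The string matching is a stand-in
// for whatever signals a real classifier would use.
function classifyFailure(message: string): FailureClass {
  const text = message.toLowerCase();
  if (text.includes("error ts") || text.includes("typecheck")) return "typecheck_failure";
  if (text.includes("command not found")) return "missing_tool";
  if (text.includes("no progress")) return "no_progress";
  return "unknown";
}

function selectPlaybook(failure: FailureClass): RecoveryPlaybook | undefined {
  return PLAYBOOKS[failure];
}

// An action is allowed only if it stays inside every bound at once.
// A real check would also normalize paths before comparing prefixes.
function isActionAllowed(
  playbook: RecoveryPlaybook,
  tool: string,
  path: string,
  turn: number,
): boolean {
  return (
    turn <= playbook.maxTurns &&
    playbook.allowedTools.includes(tool) &&
    playbook.pathBounds.some((prefix) => path.startsWith(prefix))
  );
}
```

In this sketch a failure with no matching playbook returns `undefined`, so the caller has to escalate or stop explicitly instead of retrying without limits.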
Target Reconciliation
The earlier 0.6.0 planning notes also proposed a full project-manager layer: release shaping, active tranche selection, and Work/Release views that explain why each task belongs now versus later.
This release does not claim that full layer yet. It lands the construction and policy substrate that layer needs: visible construction mode, blueprint sanity review, bounded recovery playbooks, typed handoff evidence, scoped learning, and calmer high-volume Thread operations. The remaining project-manager work should build on this foundation rather than ship as fake planning UI.
Why it matters
Earlier releases made Guildhall's state visible. This release makes failures compound into better future behavior without turning every run into a hidden training exercise.
The design keeps a hard separation between:
- project-specific facts and procedures
- user preferences that may apply across projects
- product suggestions for Guildhall itself
That separation matters because learning should make Guildhall calmer and more useful, not spooky.
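One way to picture that separation is a routing function that gives each learning candidate exactly one destination. The sketch below is an assumption-heavy illustration: the scope names paraphrase the destinations listed under "What changed", while the `about`, `isProcedure`, and `confidence` fields and the 0.5 threshold are invented here and are not Guildhall's actual coordinator logic.

```typescript
// Hypothetical types; scope names paraphrase the release notes.
type MemoryScope =
  | "project_memory"
  | "project_skill"
  | "user_preference"
  | "product_suggestion"
  | "model_lane_recommendation"
  | "task_audit_only";

interface LearningCandidate {
  summary: string;
  about: "project" | "user" | "guildhall" | "model_lane"; // assumed field
  isProcedure?: boolean; // assumed: repeatable steps rather than a plain fact
  confidence: number;    // assumed: 0..1 score from reflection
}

// Each candidate lands in exactly one scope, so a project-specific fact
// cannot silently become a cross-project preference.
function routeCandidate(candidate: LearningCandidate): MemoryScope {
  // Weak evidence stays attached to the task audit and goes no further.
  if (candidate.confidence < 0.5) return "task_audit_only";
  if (candidate.about === "guildhall") return "product_suggestion";
  if (candidate.about === "model_lane") return "model_lane_recommendation";
  if (candidate.about === "user") return "user_preference";
  return candidate.isProcedure ? "project_skill" : "project_memory";
}
```

Returning a single scope per candidate is the design choice that matters in this sketch: anything that could apply in two places has to be split into two candidates, which keeps the boundary inspectable.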
Proof
The release branch was walked against the real Looma + Knit project. The proof covered:
- classification of a recoverable focused typecheck failure
- bounded `repair_touched_file_failure` recovery
- project-memory creation and later reuse
- project-skill activation and trigger-scoped injection
- an inert product suggestion from a failed playbook
- imported-note task-brief language and compact high-volume Thread rows
- task-scoped "ask for context" replies that keep the original question open
- project memory check-in copy on the Font Something existing-work review
- readiness-gate recovery after installing `pixi` and correcting the stale app gate from `pnpm run typecheck` to `pnpm run build`
- user inspection, reset, and "use only here" behavior
- full test, docs, and model-comparison validation
Validation
- `pnpm typecheck`
- `pnpm docs:check-help-sync`
- `pnpm docs:build`
- `pnpm model:bakeoff`
- `pnpm test`
- `pnpm test:ui`