Use de-identified or synthetic material first. Do not upload private records, patient data, credentials, or source material you cannot share. The point is to test the shape of the receipt, not to make the first run high-risk.
Doctor / clinical admin
Can a summary keep its uncertainty attached?
Demo: take a de-identified intake note, a clinic policy excerpt, and a reviewer note. Use gather to preserve source fragments and receipts, then use forum to route the review steps into a causal ledger.
Proof to inspect: every sentence in the summary should point back to an intake fragment, policy line, reviewer/tool handoff, or an explicit UNVERIFIABLE limit.
- Break test: remove one source fragment and see whether the digest or ledger still catches the missing support.
- Do not claim: clinical validation, diagnosis, or care advice.
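One way to picture the proof and the break test is a minimal sketch, assuming each summary sentence carries a list of content-derived fragment ids. Every name here (`fragment_id`, `Sentence`, `unsupported`) is hypothetical and is not the gather or forum API; the intake and policy text are synthetic.

```python
import hashlib
from dataclasses import dataclass

UNVERIFIABLE = "UNVERIFIABLE"


def fragment_id(text: str) -> str:
    """Content-derived id (hypothetical scheme): changed text gets a new id."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class Sentence:
    text: str
    support: tuple[str, ...]  # fragment ids, or (UNVERIFIABLE,)


def unsupported(summary: list[Sentence], fragments: dict[str, str]) -> list[str]:
    """Return sentences with no support or with support that is no longer present."""
    flagged = []
    for s in summary:
        if s.support == (UNVERIFIABLE,):
            continue  # an explicit limit is allowed
        if not s.support or any(ref not in fragments for ref in s.support):
            flagged.append(s.text)
    return flagged


# Synthetic material only.
intake = "Patient reports intermittent headaches for two weeks."
policy = "Policy 4.2: headaches lasting over ten days require follow-up."
fragments = {fragment_id(t): t for t in (intake, policy)}

summary = [
    Sentence("Headaches reported for two weeks.", (fragment_id(intake),)),
    Sentence("Follow-up is required under policy.",
             (fragment_id(intake), fragment_id(policy))),
    Sentence("The cause of the headaches is not established.", (UNVERIFIABLE,)),
]

assert unsupported(summary, fragments) == []

# Break test: remove one source fragment and re-run the check.
del fragments[fragment_id(policy)]
print(unsupported(summary, fragments))
```

After the policy fragment is removed, the sentence that leaned on it is the only one flagged. The sentence marked UNVERIFIABLE passes because its limit is stated, not hidden.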
Artist / studio
Can a creative branch stay inspectable?
Demo: start in the studio with a public or self-owned source image. Keep the source asset, prompt, seed, transform branch, critique note, and export gate together instead of treating the final render as a detached image.
Proof to inspect: a later viewer should be able to tell what was made, what was transformed, what was selected, and why the export was allowed.
- Break test: change the source or seed and check whether the artifact still falsely reads as the same work.
- Best fit: artists, creative technologists, provenance researchers, gallery/workflow people.
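The break test above can be sketched as an identity check over the whole branch rather than the final render. This is an illustration under assumptions: the record fields and the `record_id` helper are hypothetical, and the source bytes, prompt, and seed are placeholders.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def record_id(record: dict) -> str:
    """Digest of the record's canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical.encode("utf-8"))


# Placeholder bytes standing in for a public or self-owned source image.
source_bytes = b"placeholder image bytes"

record = {
    "source_sha256": sha256_hex(source_bytes),
    "prompt": "ink wash, high contrast",
    "seed": 1234,
    "transforms": ["crop", "upscale"],
    "critique": "selected for composition",
    "export_allowed": True,
}
original_id = record_id(record)

# Break test: same prompt and transforms, different seed.
reseeded = {**record, "seed": 1235}
print(record_id(reseeded) == original_id)  # False
```

Because the seed and the source digest are part of the hashed record, a changed seed or source yields a different id, so the new artifact cannot pass as the same work.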
Media / newsroom
Can every public sentence find its source?
Demo: gather a small public-source packet for a story: original article, transcript, image source, correction, and conflict note. Then use a ledgered review pass to decide which public claims survive.
Proof to inspect: each published sentence should map to a witnessed source, a conflict note, an editorial decision, or UNVERIFIABLE. The model should not be the only witness to its own source check.
- Break test: swap one quote or image source after the review and see whether the check catches drift.
- Repos: gather, forum, and crucible.
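A rough sketch of the post-review drift check, assuming the review pass stores a digest of each source at decision time. The `snapshot` and `drifted` helpers are hypothetical and do not describe how gather, forum, or crucible store receipts; the packet contents are invented for the example.

```python
import hashlib


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshot(packet: dict[str, str]) -> dict[str, str]:
    """Record one digest per named source at review time."""
    return {name: digest(body) for name, body in packet.items()}


def drifted(reviewed: dict[str, str], packet: dict[str, str]) -> list[str]:
    """Names whose content changed, vanished, or appeared since the review."""
    current = snapshot(packet)
    names = set(reviewed) | set(current)
    return sorted(n for n in names if reviewed.get(n) != current.get(n))


# Invented packet for illustration.
packet = {
    "transcript": 'Speaker: "We will publish the data in May."',
    "image_source": "Photo credited to the agency archive.",
    "correction": "An earlier version misstated the month.",
}
reviewed = snapshot(packet)
assert drifted(reviewed, packet) == []

# Break test: swap one quote after the review.
packet["transcript"] = 'Speaker: "We will publish the data in June."'
print(drifted(reviewed, packet))  # ['transcript']
```

The comparison uses digests taken outside the model, which is the point of the proof: the witness to the source check is the stored record, not the model's recollection of it.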
Token economy / routing
Can the ledger show where model calls bought evidence?
Demo: route a multi-step agent task through forum. Keep the deterministic routing, budget stop, escalation, and delivery pass in the ledger. Use index to map the workspace before asking a stronger model to reason over it.
Proof to inspect: expensive model calls should have a reason: evidence coverage, route uncertainty, intent verification, or a specific unresolved question. Confident prose without a ledger reason is waste.
- Break test: force a cheap route where escalation is needed, or escalate every step, then compare ledger quality and token spend.
- Useful report: the first place where the ledger is too bulky, too thin, or too hard to replay.
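The rule that expensive calls need a reason can be sketched as a small audit over ledger entries. This assumes a hypothetical entry shape (`LedgerEntry`) and hypothetical route names; the four reason labels mirror the ones listed above, and the token counts are made-up numbers, not measurements.

```python
from dataclasses import dataclass
from typing import Optional

# Reason labels taken from the list above; the spelling is an assumption.
ALLOWED_REASONS = {
    "evidence_coverage",
    "route_uncertainty",
    "intent_verification",
    "unresolved_question",
}


@dataclass(frozen=True)
class LedgerEntry:
    step: str
    route: str  # hypothetical route names: "deterministic", "cheap", "strong"
    tokens: int
    reason: Optional[str] = None


def unjustified_spend(ledger: list[LedgerEntry], expensive_route: str = "strong") -> int:
    """Tokens spent on the expensive route without an allowed ledger reason."""
    return sum(
        e.tokens
        for e in ledger
        if e.route == expensive_route and e.reason not in ALLOWED_REASONS
    )


# Illustrative numbers only.
ledger = [
    LedgerEntry("map workspace", "deterministic", 0),
    LedgerEntry("draft answer", "cheap", 800),
    LedgerEntry("resolve conflicting sources", "strong", 4000, "evidence_coverage"),
    LedgerEntry("polish wording", "strong", 2500),
]

total = sum(e.tokens for e in ledger)
print(unjustified_spend(ledger), total)  # 2500 7300
```

Running the same audit on a forced-cheap ledger and on an escalate-everything ledger gives two numbers to set beside ledger quality when comparing the break-test runs.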
Reasoning / public claims
Can the final authority live outside the model?
Demo: write a thesis with measurable claims, falsification conditions, and source substrate. Use crucible as the release-candidate/dev-parallel judgment organ: steelman the claim, measure what holds, report MATCH, DRIFT, or UNVERIFIABLE, then refine the weakest axis.
Proof to inspect: the answer should not be "the model thinks this is right." The answer should be a record a person can inspect: what claim was tested, what evidence supported it, what refuted it, what drifted, and what remained unverified.
- Break test: make one claim too vague to measure and require the system to say UNVERIFIABLE rather than rescue it.
- Public state: crucible is public, lowercase, and release-candidate/dev-parallel. This page does not claim that it is stable or published on PyPI.
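The three verdicts can be illustrated with a toy judge, assuming a claim is measurable only when it states an expected value and a tolerance. This is a sketch of the idea and not crucible's interface: `Claim`, `judge`, and the latency figures are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    MATCH = "MATCH"
    DRIFT = "DRIFT"
    UNVERIFIABLE = "UNVERIFIABLE"


@dataclass(frozen=True)
class Claim:
    text: str
    expected: Optional[float] = None  # None: the claim is too vague to measure
    tolerance: float = 0.0


def judge(claim: Claim, measured: Optional[float]) -> Verdict:
    """Compare a measurement to the claim; never rescue an unmeasurable claim."""
    if claim.expected is None or measured is None:
        return Verdict.UNVERIFIABLE
    if abs(measured - claim.expected) <= claim.tolerance:
        return Verdict.MATCH
    return Verdict.DRIFT


precise = Claim("Median latency is 200 ms.", expected=200.0, tolerance=10.0)
vague = Claim("The system is fast.")

print(judge(precise, 204.0).value)  # MATCH
print(judge(precise, 260.0).value)  # DRIFT
print(judge(vague, 204.0).value)    # UNVERIFIABLE
```

The vague claim returns UNVERIFIABLE even though a measurement exists, because there is no stated condition for the measurement to satisfy or refute.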