LaunchVerdict · live demo (seeded)

Three releases. Three verdicts.

You shipped three times this week. Instead of a 40-widget dashboard, here is one card per release: did it help the flow that matters, or hurt it — and what to do now. One clearly regressed (roll back), one dropped on too thin a sample to trust yet (hold and watch), one improved (keep shipping). All computed live from seeded telemetry by the same engine that runs on real repos.

This week's verdicts
↩ ROLL IT BACK · acme/checkout-web · r-1042
Onboarding completion fell 71%→53% after r-1042
onboarding · before 71% · after 53%
after-rate 95% CI: [48%, 58%]
Cause: changed app/onboarding/Step2.tsx; Novus flagged: Step 2 'Continue' button moved below the fold on 375px viewports
Do this: Revert r-1042, or fix the flagged issue (Step 2 'Continue' button moved below the fold on 375px viewports).
high confidence · Correlation under a known cause (the diff), not proven causation. Short windows and confounders apply.
⏸ HOLD · acme/checkout-web · r-1044
Signup completion fell 80%→45% after r-1044 — hold and watch (thin sample)
signup · before 80% · after 45%
after-rate 95% CI: [32%, 60%]
Cause: changed app/signup/Plan.tsx; Novus flagged: Plan toggle hidden behind a tooltip on first paint
Do this: Don't roll back yet. The observed drop is large, but the sample is too thin to trust. Hold r-1044 and recheck once more users hit the flow, or fix the flagged issue (Plan toggle hidden behind a tooltip on first paint).
low confidence · Correlation under a known cause (the diff), not proven causation. Short windows and confounders apply.
✓ KEEP SHIPPING · acme/checkout-web · r-1039
Search completion rose 60%→70% after r-1039 — keep it
search · before 60% · after 70%
after-rate 95% CI: [67%, 73%]
Cause: changed app/search/ranking.ts
Do this: Keep shipping. No action needed.
high confidence · Correlation under a known cause (the diff), not proven causation. Short windows and confounders apply.
How the call is made

We treat a release as the cut point on the event timeline. For each flow, we compare the completion rate in the 7 days before the cut with the rate in the 7 days after it using a two-proportion z-test, take the most-moved flow, and require both significance (p<0.05) and a meaningful drop before we say roll back. The number is a correlation under a known cause (the diff), not a proof. We label the confidence and never fabricate a call when the data is thin.
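
For readers who want to see the shape of that decision, here is a minimal Python sketch of a pooled two-proportion z-test followed by a verdict rule. It is an illustration under stated assumptions, not the engine's actual code: the names Window, two_proportion_z_test, most_moved and verdict are hypothetical, and so are the MIN_DROP and MIN_SAMPLE thresholds, since the page only says "p<0.05", "a meaningful drop" and "thin". The user counts in the usage lines are invented for the example and are not the demo's seeded data.

```python
import math
from dataclasses import dataclass


@dataclass
class Window:
    """Flow counts for one 7-day window (hypothetical structure)."""
    completed: int  # users who finished the flow
    started: int    # users who entered the flow

    @property
    def rate(self) -> float:
        return self.completed / self.started


def two_proportion_z_test(before: Window, after: Window) -> tuple[float, float]:
    """Return (z, two-sided p-value) using the pooled standard error."""
    total = before.started + after.started
    pooled = (before.completed + after.completed) / total
    se = math.sqrt(pooled * (1 - pooled) * (1 / before.started + 1 / after.started))
    if se == 0:
        return 0.0, 1.0  # both windows are all-complete or all-fail: no evidence
    z = (after.rate - before.rate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p


ALPHA = 0.05      # from the page: p<0.05
MIN_DROP = 0.05   # assumption: 5 percentage points counts as "meaningful"
MIN_SAMPLE = 100  # assumption: fewer after-window users counts as "thin"


def most_moved(flows: dict[str, tuple[Window, Window]]) -> str:
    """Pick the flow whose completion rate changed the most in absolute terms."""
    return max(flows, key=lambda name: abs(flows[name][1].rate - flows[name][0].rate))


def verdict(before: Window, after: Window) -> str:
    """Map one flow's before/after windows to a call."""
    _, p = two_proportion_z_test(before, after)
    delta = after.rate - before.rate
    if delta <= -MIN_DROP:
        # A meaningful drop only becomes a rollback when the sample is
        # large enough and the difference is statistically significant.
        if after.started < MIN_SAMPLE or p >= ALPHA:
            return "HOLD"
        return "ROLL IT BACK"
    return "KEEP SHIPPING"


# Invented counts, chosen only to reproduce the three kinds of card.
print(verdict(Window(710, 1000), Window(212, 400)))   # large sample, big drop
print(verdict(Window(400, 500), Window(18, 40)))      # big drop, thin sample
print(verdict(Window(600, 1000), Window(700, 1000)))  # improvement
```

Checking the sample size before the p-value is a design choice in this sketch: a large drop on a few dozen users can still clear p<0.05, so the thin-sample guard is what keeps such a release on hold rather than triggering a rollback.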