Transparency
pred.io publishes its own quality numbers — every quarter, regardless of whether they flatter the platform. The methodology is committed publicly before any private-beta synthesis lands; this page is where the results show up.
Methodology
Three legs, run continuously:
- Leg 1 — Blinded expert comparison. Each cycle, three senior contributors and the platform each write a synthesis on a topic seed drawn at random from an external committee's pool. Three external paid reviewers (publicly named, rotated annually) rate every entry, blinded, on factual accuracy, novelty of insight, calibration honesty, and decision-usefulness.
- Leg 2 — Resolution-tracked predictions. Every forward-looking claim in a published synthesis is logged with its stated probability and scored when the event resolves. The Brier score is reported against two fixed cohorts — a pre-launch baseline cohort and the currently-active senior contributor population.
- Leg 3 — Retraction rate. Fraction of published syntheses requiring retraction in the window. Public; target less than 2% monthly.
See architecture/03-synthesis-and-llm.md §7 for the full commitment.
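As a rough illustration of Leg 2's scoring, the sketch below computes a Brier score (the mean squared difference between the stated probability and the 0/1 outcome) over a set of resolved claims. The `ResolvedClaim` type and its field names are hypothetical and are not taken from the pred.io codebase. How claims are assigned to the two cohorts is not specified here, so the function simply scores whatever list it is given.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ResolvedClaim:
    """A forward-looking claim after its event has resolved (hypothetical shape)."""
    stated_probability: float  # probability logged at publication, in [0, 1]
    occurred: bool             # outcome recorded at resolution


def brier_score(claims: Sequence[ResolvedClaim]) -> float:
    """Mean squared error between stated probabilities and 0/1 outcomes.

    Lower is better: 0.0 is perfect, and always answering 0.5 scores 0.25.
    """
    if not claims:
        raise ValueError("no resolved claims to score")
    for c in claims:
        if not 0.0 <= c.stated_probability <= 1.0:
            raise ValueError(f"probability out of range: {c.stated_probability}")
    squared_errors = [
        (c.stated_probability - (1.0 if c.occurred else 0.0)) ** 2 for c in claims
    ]
    return sum(squared_errors) / len(squared_errors)


# Illustrative data only, not real platform results.
example = [
    ResolvedClaim(0.8, True),   # squared error 0.04
    ResolvedClaim(0.3, False),  # squared error 0.09
    ResolvedClaim(0.6, False),  # squared error 0.36
]
example_score = brier_score(example)  # (0.04 + 0.09 + 0.36) / 3
```

Running the same function once per cohort would give two comparable numbers, which is one plausible reading of "reported against two fixed cohorts".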
Live retraction rate
Trailing 30 days, live namespace (excludes showcase content). Computed at request time from the synthesis table.
No syntheses published in the trailing window yet.
Target: <2% monthly. An upward trend triggers an LLM-role audit (which model? which prompt version? which Territory?).
How to read these numbers
If the platform's Leg 1 rating falls below the median external-reviewer rating of the contributor-written syntheses for two consecutive cycles, synthesis publishing is paused and the team jointly diagnoses and re-architects. This commitment is part of the published methodology and is not renegotiated cycle by cycle.
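The pause rule can be read as a simple check over the two most recent cycles. The sketch below assumes each entry has already been reduced to a single numeric rating where higher is better; how the four Leg 1 criteria are combined into that number is not specified here, and the `CycleResult` type is hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass(frozen=True)
class CycleResult:
    """Aggregated Leg 1 ratings for one cycle (hypothetical shape)."""
    platform_rating: float                # rating of the platform-written synthesis
    contributor_ratings: Sequence[float]  # ratings of the contributor-written syntheses


def platform_underperformed(cycle: CycleResult) -> bool:
    """True when the platform rates strictly below the contributor median."""
    return cycle.platform_rating < median(cycle.contributor_ratings)


def should_pause_publishing(history: Sequence[CycleResult]) -> bool:
    """True when the two most recent cycles both show underperformance.

    `history` is ordered oldest to newest; fewer than two cycles never pauses.
    """
    if len(history) < 2:
        return False
    return all(platform_underperformed(c) for c in history[-2:])


# Illustrative data only, not real platform results.
two_bad_cycles = [
    CycleResult(3.0, [4.0, 3.5, 4.5]),  # contributor median 4.0
    CycleResult(3.2, [3.5, 3.0, 4.0]),  # contributor median 3.5
]
recovered = [
    CycleResult(3.0, [4.0, 3.5, 4.5]),
    CycleResult(3.8, [3.5, 3.0, 4.0]),
]
```

With this reading, `should_pause_publishing(two_bad_cycles)` is true and `should_pause_publishing(recovered)` is false, since a single cycle at or above the median breaks the streak.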
Questions on methodology: transparency@pred.io.