In March 2026, a major geopolitical event generated $529M in prediction market volume on Polymarket. YES shares on the outcome had traded between $0.01 and $0.05 for weeks while signals were publicly visible — carrier group deployments, diplomatic escalations, intelligence community warnings. One trader made $515,000 in a single day buying shares one hour before news broke.
Omen Analytica — a Bayesian signal detection system designed to surface exactly these asymmetric opportunities — generated zero alerts. Complete miss.
The postmortem identified five cascading failures:
1. The system's 12 scrapers were all tech-focused. No geopolitical wire services, no defense/foreign policy sources. The signal existed in the world but not in the system's inputs.
2. The entity's signal score of 3.0 fell between spike detection (needs prior below 2.5) and sustained detection (needs prior above 5.0). A division bug in one comparison created a dead zone where moderate-but-real signals were invisible.
3. The auto-market discovery system had never fetched geopolitical prediction markets. The only matching market in the database was a FIFA World Cup entry.
4. All 43 tracked entities were tech companies. The system had no concept of countries, militaries, or political leaders as signal-generating entities.
5. The system could detect signal intensity but had no mechanism to compare signals against market prices. It couldn't answer the fundamental question: is the market pricing this correctly?
Omen Analytica is a Bayesian signal detection and prediction market intelligence system. It monitors 12 news and data sources concurrently, computes weekly weighted signal scores per entity, and cross-references those signals against Polymarket prediction markets to find asymmetric bets — situations where signal intensity is high but market prices haven't moved.
Bayesian source credibility: Each source starts with an uninformative Beta(1,1) prior. As predictions resolve, the system updates: alpha = 1 + hits, beta = 1 + misses. Source weight uses the lower bound of a 90% confidence interval — deliberately conservative, penalizing low sample sizes even with high apparent hit rates.
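A minimal sketch of that update rule. The interval here uses a normal approximation to the Beta quantile for self-containedness; the production system may compute the exact Beta lower bound, and the function name is invented for illustration:

```python
import math

def source_weight(hits: int, misses: int, z: float = 1.645) -> float:
    """Lower bound of a ~90% interval on a source's hit rate.

    Starts from an uninformative Beta(1,1) prior and updates with
    resolved predictions: alpha = 1 + hits, beta = 1 + misses.
    """
    a, b = 1 + hits, 1 + misses
    mean = a / (a + b)
    # Exact Beta-posterior variance
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    # Conservative weight: subtract z standard deviations from the mean,
    # which penalizes small samples even with high apparent hit rates
    return max(0.0, mean - z * math.sqrt(var))
```

A source with 2 hits and 0 misses scores far lower than one with 90 hits and 10 misses, even though its raw hit rate is higher: the small sample keeps the interval wide.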
Signal scoring: Weighted mentions per entity per week. Source-specific weights range from 1.5 (general tech news) to 10.0 (product launch events). These weights evolve via the Bayesian feedback loop as predictions resolve.
Two detection modes: Velocity spike (score crosses threshold from below) and sustained trend (above threshold for 2+ consecutive weeks).
| Detector | What It Finds | Accountability |
|---|---|---|
| Convex Bets | Signal/price divergence — high signal, cheap market price | Brier scored |
| Volume Spikes | Sudden trading volume surges on specific markets | Brier scored |
| Cheap Conviction | Markets with high YES volume but prices still under $0.10 | Brier scored |
| Topic Clusters | Multiple entities generating correlated signals simultaneously | Brier scored |
| Extreme Conviction | Markets where 90%+ of volume is on one side | Brier scored |
| Divergence | News intensity disconnected from market price movement | Brier scored |
Every detector is individually scored using Brier scores (mean squared prediction error). Random baseline is 0.25. Any detector performing worse than random for 4+ consecutive weeks has its weight zeroed automatically — and displays a skull icon in the UI. The system publishes its own track record.
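The scoring rule fits in a few lines. The culling check below assumes one aggregate Brier score per detector per week, which is an inference from the text:

```python
def brier(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

RANDOM_BASELINE = 0.25  # always forecasting 0.5 scores exactly 0.25

def should_cull(weekly_briers: list[float], weeks: int = 4) -> bool:
    """Zero a detector's weight after 4+ consecutive worse-than-random weeks.

    Higher Brier is worse, so 'worse than random' means score > 0.25.
    """
    return len(weekly_briers) >= weeks and all(
        s > RANDOM_BASELINE for s in weekly_briers[-weeks:]
    )
```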
Thompson Sampling handles the exploration/exploitation tradeoff for detector weight allocation. New detectors get enough trials to prove themselves, but persistent underperformers are culled.
After the miss, the rebuild applied the same antifragile principles the system was designed to detect:
Via negativa (subtraction): The dead zone bug was a /2 in one comparison. Removing it restored detection for moderate signals. The highest-impact fix was deleting two characters.
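A hypothetical reconstruction of the dead zone, using the thresholds from the postmortem; the exact comparison shape is invented for illustration:

```python
# The stray "/2" shrank the spike window: spike detection only saw
# priors below 2.5, while sustained detection needed priors above 5.0.
SPIKE_CEILING = 5.0 / 2   # buggy: 2.5
SUSTAINED_FLOOR = 5.0

def route_buggy(prior: float) -> str:
    if prior < SPIKE_CEILING:
        return "spike"
    if prior > SUSTAINED_FLOOR:
        return "sustained"
    return "dead zone"        # a score of 3.0 was invisible

def route_fixed(prior: float) -> str:
    # Deleting the "/2" lets the spike check cover everything
    # below the sustained floor, closing the gap
    if prior < SUSTAINED_FLOOR:
        return "spike"
    return "sustained"
```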
Barbell strategy: Core changes (fix the dead zone, build auto-market discovery) were conservative and non-negotiable. Edge changes (geopolitical entity categories, the Taleb Detector, OFAC/GDELT stress signals) were experimental — each could fail without breaking the system.
New stress signal sources:
OFAC, the US Treasury sanctions list. Change velocity in sanctions indicates the state putting skin in the game. Novel use of a public government API as a geopolitical signal.
GDELT, a global event database. Volume and tone monitoring for 6 countries. When event volume spikes and tone drops simultaneously, something is happening.
6 bond and volatility series (high-yield spreads, yield curve inversion, VIX, BBB spreads, financial stress index, 10Y treasury). Market stress indicators that precede geopolitical events.
Composite stress output: LOW / ELEVATED / HIGH / EXTREME.
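One plausible way to fold the component signals into those four bands; the normalization, weighting, and cutoffs below are illustrative, not the system's documented values:

```python
def composite_stress(component_scores: list[float]) -> str:
    """Map normalized component stress scores (each in 0..1) to a band.

    Components would be the sanctions-velocity, event-volume/tone, and
    bond/volatility signals, each scaled to 0..1 upstream (assumed).
    """
    avg = sum(component_scores) / len(component_scores)
    if avg >= 0.75:
        return "EXTREME"
    if avg >= 0.5:
        return "HIGH"
    if avg >= 0.25:
        return "ELEVATED"
    return "LOW"
```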
The critical bug: The accountability loop (trackPredictions()) was defined but never called from the scan path. Every detector had been running with uniform 1/6 weights since launch. The Brier/Thompson scoring chain was starved of data. Fixing this closed the feedback loop for the first time — detectors could actually be culled based on performance.
The fix: wiring trackPredictions() into the scan path so the system could learn from its own results.
Deleting two characters (the stray /2) had more impact than any new feature. The Lindy filter applied to every architecture decision: SQL, string matching, HTTP, cron. All patterns that predate 2010.
Bayesian signal detection uses probability theory to weight information sources based on their track record. Each source starts with no assumed reliability and earns credibility as its signals prove accurate. In this system, sources use Beta-Binomial priors updated as predictions resolve, with the lower bound of a 90% confidence interval as the reliability estimate.
The system compares signal intensity against prediction market prices. When news signals are escalating but market prices remain low (under $0.10), the divergence suggests the market hasn't priced in the information. Six detectors scan for different patterns of this signal/price mismatch.
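The core check behind this signal/price comparison reduces to something like the following; the thresholds are illustrative and the function name is invented:

```python
def convex_bet(signal_score: float, yes_price: float,
               signal_floor: float = 5.0, price_ceiling: float = 0.10) -> bool:
    """Flag a market whose news signal is hot but whose YES price hasn't moved.

    High signal plus a price under the ceiling is the asymmetric setup:
    limited downside (the cheap share) against large upside if the
    signal resolves YES.
    """
    return signal_score >= signal_floor and yes_price < price_ceiling
```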
Omen monitors Polymarket, the largest prediction market platform. Auto-market discovery fuzzy-matches tracked entities to live market questions, so new markets are detected without manual configuration.
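A sketch of the matching step, using the standard library's difflib as a stand-in for whatever matcher the system actually uses; the cutoff is an assumption:

```python
from difflib import SequenceMatcher

def match_markets(entity: str, questions: list[str],
                  cutoff: float = 0.6) -> list[str]:
    """Fuzzy-match a tracked entity name against live market questions."""
    entity_l = entity.lower()
    matches = []
    for q in questions:
        q_l = q.lower()
        # A case-insensitive substring hit or a similarity ratio at or
        # above the cutoff counts as a match
        if entity_l in q_l or SequenceMatcher(None, entity_l, q_l).ratio() >= cutoff:
            matches.append(q)
    return matches
```

Running discovery on a schedule means newly listed markets are picked up on the next pass without anyone editing a config file.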
Each detector is individually scored using Brier scores (mean squared prediction error). Random baseline is 0.25. Detectors performing worse than random for 4+ weeks are automatically zeroed. The system publishes its track record — including failures — on every page load.
Not predict — detect. The system identifies when signal intensity diverges from market pricing, suggesting the market may be mispricing a risk. The Iran postmortem showed that detection depends entirely on having the right inputs: the system had zero geopolitical sources at the time and missed a $529M event.
Approximately $0.50 per month in total infrastructure costs. The system uses serverless PostgreSQL, static hosting, and workflow orchestration on free/hobby tiers. No ML training costs — the Bayesian model is computationally trivial.