Trust Score: 86 — Verified
🔍 Web Verified · 🏛 Established Source (T2)
Ars Technica on X / Twitter · 3d ago
AI models are terrible at betting on soccer—especially xAI Grok arstechnica.com/ai/2026/04/ai-…
Trust Metrics
Claim Accuracy: 88%
Source Quality: 90%
Framing & Tone: 82%
Context: 80%
Analysis Summary
A new study tested top AI models from Google, OpenAI, Anthropic, and xAI on simulated soccer betting over the 2023–24 Premier League season. All of them lost money; xAI's Grok performed worst, going bankrupt in one attempt. The findings come from a report by AI start-up General Reasoning, and the article is transparent that the paper has not yet been peer-reviewed. What matters here: even advanced AI systems fail at long-horizon real-world tasks, which counters hype about AI replacing white-collar work. The coverage itself is accurate and balanced.
Claims Analysis (4)
“AI models are terrible at betting on soccer—especially xAI Grok”
Study shows xAI's Grok 4.20 went bankrupt once and failed other attempts; all frontier models lost money over the season.
“Systems from Google, OpenAI, Anthropic, and xAI struggle with the Premier League”
Article documents losses for Claude, Grok, and Gemini in a simulated 2023–24 Premier League betting scenario. Every model lost money.
“A study tested eight top AI systems in a virtual recreation of the 2023–24 Premier League season”
KellyBench report by General Reasoning tested multiple frontier models with detailed historical data and betting instructions.
“Anthropic's Claude Opus 4.6 fared best, with an average loss of 11 percent”
Directly stated in article with specific performance metric. Claude came closest to breaking even but still lost overall.
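The benchmark's name, KellyBench, suggests stakes were sized with the Kelly criterion, though the report's exact staking rules are not described here; that, and all the numbers below, are assumptions for illustration. A minimal sketch of Kelly-fraction staking, and of how a bettor with over-confident probability estimates steadily loses money:

```python
import random


def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Kelly-optimal stake as a fraction of bankroll.

    p_win:        the bettor's estimated win probability
    decimal_odds: total payout per unit staked (e.g. 3.0 returns 3 on a win)
    """
    b = decimal_odds - 1.0  # net profit per unit staked
    if b <= 0:
        return 0.0
    # Stake only when the perceived edge is positive.
    return max(p_win - (1.0 - p_win) / b, 0.0)


# Hypothetical miscalibrated bettor: believes 55% when the true rate is 40%.
# Every bet then has negative expected value, so the bankroll decays.
random.seed(0)
bankroll = 100.0
true_p, believed_p, odds = 0.40, 0.55, 2.0
for _ in range(200):
    stake = bankroll * kelly_fraction(believed_p, odds)
    bankroll -= stake
    if random.random() < true_p:
        bankroll += stake * odds
print(round(bankroll, 2))
```

With a real edge, Kelly staking maximizes long-run log-wealth; with a miscalibrated model it simply sizes losing bets aggressively, which is one plausible route to the bankruptcies the study reports.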