Alpha Arena Reveals AI Trading Flaws: Western Models Lose 80% of Capital in One Week

ForesightNews Express, 2025/10/27 09:54
By: ForesightNews Express
The market is the ultimate test for AI.


Written by: Juan Galt

Translated by: AididiaoJP, Foresight News


Can AI trade cryptocurrencies? Jay Azhang, a computer engineer and finance professional from New York, is putting the question to the test with Alpha Arena. The project pits the most powerful large language models against one another, each starting with $10,000 in capital, to see which can make the most money trading cryptocurrencies. The models are Grok 4, Claude Sonnet 4.5, Gemini 2.5 Pro, ChatGPT 5, DeepSeek V3.1, and Qwen3 Max.


You might be thinking, "Wow, what a brilliant idea!" You may be surprised to learn that, at the time of writing, three of the six AIs are sitting on losses, while Qwen3 and DeepSeek, two open-source models from China, are in the lead.


That's right: the most powerful closed-source, proprietary AIs run by Western giants like Google and OpenAI have lost over $8,000, or 80% of their crypto trading capital, in just over a week, while their open-source counterparts from the East are in profit.


The most successful trade so far? Qwen3 has stayed profitable and keeps adding gains simply by holding a 20x long position on bitcoin. Grok 4, unsurprisingly, spent most of the competition 10x long on dogecoin; at one point it shared the top spot with DeepSeek, but it is now close to a 20% loss. Maybe Elon Musk should post a dogecoin meme or something to help Grok out of trouble.
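
To see why leverage of this size can make or break an account within days, here is a minimal sketch of the arithmetic. It assumes a simplified model that ignores fees, funding payments, and the exchange's actual liquidation rules; the function name and the sample price moves are illustrative and do not come from Alpha Arena.

```python
def leveraged_return(price_change: float, leverage: float) -> float:
    """Return on margin for a leveraged position, ignoring fees and funding.

    price_change is the fractional move in the asset (0.05 means +5%).
    Positive leverage is a long position, negative leverage is a short.
    The loss is capped at -1.0 (the whole margin), a stand-in for liquidation.
    """
    return max(-1.0, price_change * leverage)


# A 20x long: a 1% rise in the asset is roughly a 20% gain on margin.
print(leveraged_return(0.01, 20))
# The same 20x long loses its entire margin on a 5% drop.
print(leveraged_return(-0.05, 20))
# A 10x long: a 2% drop is roughly a 20% loss on margin.
print(leveraged_return(-0.02, 10))
```

Under these assumptions, a 20x position needs only a 5% adverse move to lose everything, which is well within bitcoin's normal weekly range.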


Meanwhile, Google’s Gemini has been relentlessly bearish, shorting every tradable crypto asset, a stance that echoes Google’s overall crypto policy over the past 15 years.


In the end, Gemini made every possible wrong trade for a whole week straight. It takes skill to do that badly, especially when Qwen3 is simply going long on bitcoin. If this is the best closed-source AI can offer, maybe OpenAI should stay closed-source to spare us the losses.


A New Benchmark for AI


Pitting AI models against each other in a crypto trading arena offers some real insight into how they should be evaluated. First, a model cannot pick up the answers to this test during pre-training, because future prices are unpredictable. Other benchmarks lack that protection: many AI models see some of the test answers during training, so they naturally score well when tested. Some research shows that making slight changes to those tests can lead to large swings in benchmark results.


This controversy raises a question: What is the ultimate test of intelligence? According to Grok 4’s creator and Iron Man enthusiast Elon Musk, predicting the future is the ultimate measure of intelligence.


And we have to admit, there’s nothing more uncertain than the short-term price of cryptocurrencies. In Azhang’s words, “Our goal with Alpha Arena is to make benchmarking closer to the real world, and the market is perfect for this. Markets are dynamic, adversarial, open-ended, and always unpredictable. They challenge AI in ways that static benchmarks cannot. The market is the ultimate test for AI.”


This view of markets is deeply rooted in the libertarian principles that gave birth to bitcoin. Economists such as Murray Rothbard and Milton Friedman argued decades ago that central governments cannot predict markets, and that only individuals who must bear their own losses can make rational economic decisions.


In other words, the market is the hardest thing to predict because it depends on the personal views and decisions of intelligent individuals around the world, making it the best test of intelligence.


Azhang mentions in his project description that trading well is not just about profit, but also about risk-adjusted returns. The risk dimension is crucial, because a single bad trade can wipe out all previous gains, as seen in Grok 4’s portfolio collapse.
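
One common way to express risk-adjusted return is the Sharpe ratio: the average return divided by how much the returns swing. The sketch below is an illustration only. The article does not say which measure Alpha Arena uses, and the two return series are invented for the example.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return per period divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("returns have no variation; the ratio is undefined")
    return statistics.mean(excess) / spread


# Hypothetical daily returns, invented for illustration only.
steady = [0.01, 0.012, 0.008, 0.011, 0.009]   # small, consistent gains
wild = [0.30, -0.25, 0.40, -0.35, 0.15]       # larger average, huge swings

print("steady:", sharpe_ratio(steady))
print("wild:  ", sharpe_ratio(wild))
```

In this made-up example the wild series has the higher average daily return, yet the steady series scores far better once the size of the swings is taken into account.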


There’s another issue: whether these models learn from their experience trading cryptocurrencies. Technically, this is not easy to achieve, because the initial pre-training of AI models is extremely costly. They can be fine-tuned on their own trading history or on others’, and they can keep recent trades in short-term memory, that is, the context window, but that only gets them so far. Ultimately, the right AI trading model may have to truly learn from its own experience, a capability recently announced in academia but still a long way from becoming a product. MIT researchers call these self-adapting language models.


How Do We Know This Isn’t Just Luck?


Another reading of the project and its results so far is that they may be indistinguishable from a “random walk.” A random walk is like rolling dice for every decision. What would that look like on a chart? There is a simulator you can use to answer the question, and the answer is that it wouldn’t look much different.
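
A rough version of such a simulator fits in a few lines. The sketch below is not the simulator mentioned above; it assumes six accounts starting at $10,000 and a fair coin flip on every trade, with the step size, number of steps, and seed chosen arbitrarily for illustration.

```python
import random


def simulate_random_traders(n_traders=6, n_steps=200, start=10_000.0,
                            step_size=0.02, seed=0):
    """Simulate accounts whose every trade is a fair coin flip.

    At each step an account gains or loses step_size of its current value
    with equal probability. Returns one equity curve (a list) per trader.
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n_traders):
        equity = start
        curve = [equity]
        for _ in range(n_steps):
            equity *= 1 + rng.choice((-step_size, step_size))
            curve.append(equity)
        curves.append(curve)
    return curves


for i, curve in enumerate(simulate_random_traders()):
    print(f"trader {i}: final equity ${curve[-1]:,.0f}")
```

Changing the seed reshuffles who ends up ahead. With only six accounts and a short horizon, a clear "leader" and a clear "loser" can emerge from coin flips alone.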


The role of luck in markets has been described in detail by Nassim Taleb in his book “Fooled by Randomness.” He argues that, statistically, it is entirely normal for a trader, say Qwen3, to be lucky for a whole week straight and so appear to have extraordinary reasoning ability. Taleb’s point goes further: there are enough traders on Wall Street that one of them could easily get lucky for 20 years straight, building a godlike reputation, with everyone around convinced the trader is a genius, until the luck runs out.
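
The arithmetic behind this argument is short enough to check directly. The sketch below assumes every trader is an independent coin flipper with a 50% chance of a winning period; the population of one million traders is an illustrative figure, not a number from Taleb or from Alpha Arena.

```python
def prob_some_trader_has_streak(n_traders: int, streak: int,
                                p_win: float = 0.5) -> float:
    """Chance that at least one of n independent traders wins every period."""
    p_single = p_win ** streak          # one trader wins `streak` times in a row
    return 1 - (1 - p_single) ** n_traders


# Six coin-flipping traders, seven winning days in a row.
print(prob_some_trader_has_streak(6, 7))
# One million coin-flipping traders, twenty winning years in a row.
print(prob_some_trader_has_streak(1_000_000, 20))
```

Under these assumptions a perfect week among six traders is unlikely but not rare, and with a large enough population a twenty-year streak becomes more likely than not. A winning record alone therefore says little about skill.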


Therefore, for Alpha Arena to produce valuable data, it needs to run for a long time, and its patterns and results need to be independently replicated, with real capital at risk, before they can be distinguished from a random walk.


Still, it has been interesting to watch open-source, cost-effective models like DeepSeek outperform their closed-source counterparts. Alpha Arena has been a great source of entertainment, going viral on X.com last week. No one can predict where it will go next; we’ll have to see whether the gamble its creator took, giving six chatbots $60,000 to bet on crypto, ultimately pays off.

Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.