Smart Again
A tournament tried to test how well experts could forecast AI progress. They were all wrong.

September 5, 2025


Two of the smartest people I follow in the AI world recently sat down to check in on how the field is going.

One was François Chollet, creator of the widely used Keras library and author of the ARC-AGI benchmark, which tests if AI has reached “general” or broadly human-level intelligence. Chollet has a reputation as a bit of an AI bear, eager to deflate the most boosterish and over-optimistic predictions of where the technology is going. But in the discussion, Chollet said his timelines have gotten shorter recently. Researchers had made big progress on what he saw as the major obstacles to achieving artificial general intelligence, like models’ weakness at recalling and applying things they learned before.

Chollet’s interlocutor — Dwarkesh Patel, whose podcast has become the single most important place for tracking what top AI scientists are thinking — had moved in the opposite direction in reaction to his own reporting. While humans are great at learning continuously or “on the job,” Patel has grown more pessimistic that AI models will gain this skill any time soon.

“[Humans are] learning from their failures. They’re picking up small improvements and efficiencies as they work,” Patel noted. “It doesn’t seem like there’s an easy way to slot this key capability into these models.”

All of which is to say, two very plugged-in, smart people who know the field as well as anyone else can come to perfectly reasonable yet contradictory conclusions about the pace of AI progress.

In that case, how is someone like me, who’s certainly less knowledgeable than Chollet or Patel, supposed to figure out who’s right?

The forecaster wars, three years in

One of the most promising approaches I’ve seen to resolving — or at least adjudicating — these disagreements comes from a small group called the Forecasting Research Institute.

In the summer of 2022, the institute began what it calls the Existential Risk Persuasion Tournament (XPT for short). XPT was intended to “produce high-quality forecasts of the risks facing humanity over the next century.” To do this, the researchers (including Penn psychologist and forecasting pioneer Philip Tetlock and FRI head Josh Rosenberg) surveyed subject matter experts who study threats that could at least conceivably jeopardize humanity’s survival (like AI).

But they also put the same questions to “superforecasters,” a group of people identified by Tetlock and others who have proven unusually accurate at predicting events in the past. The superforecaster group was not made up of experts on existential threats to humanity but rather of generalists from a variety of occupations with solid predictive track records.

On each risk, including AI, there were big gaps between the area-specific experts and the generalist forecasters. The experts were much more likely than the generalists to say that the risk they study could lead to either human extinction or mass deaths. This gap persisted even after the researchers had the two groups engage in structured discussions meant to identify why they disagreed.

The two just had fundamentally different worldviews. In the case of AI, subject matter experts thought the burden of proof should be on skeptics to show why a hyper-intelligent digital species wouldn’t be dangerous. The generalists thought the burden of proof should be on the experts to explain why a technology that doesn’t even exist yet could kill us all.

So far, so intractable. Luckily for us observers, each group was asked not only to estimate long-term risks over the next century, which can’t be confirmed any time soon, but also to predict events in the nearer future. Specifically, they were tasked with forecasting the pace of AI progress in the short, medium, and long run.

In a new paper, the authors — Tetlock, Rosenberg, Simas Kučinskas, Rebecca Ceppas de Castro, Zach Jacobs, Jordan Canedy, and Ezra Karger — go back and evaluate how well the two groups fared at predicting the three years of AI progress since summer 2022.

In theory, this could tell us which group to believe. If the concerned AI experts proved much better at predicting what would happen between 2022 and 2025, perhaps that’s an indication that they have a better read on the longer-run future of the technology, and therefore that we should give their warnings greater credence.

Alas, in the words of Ralph Fiennes, “Would that it were so simple!” It turns out the three-year results leave us without much more sense of who to believe.

Both the AI experts and the superforecasters systematically underestimated the pace of AI progress. Across four benchmarks, the actual performance of state-of-the-art models in summer 2025 was better than either superforecasters or AI experts predicted (though the latter were closer). For instance, superforecasters thought an AI wouldn’t win gold at the International Mathematical Olympiad until 2035; experts thought 2030. It happened this summer.

“Overall, superforecasters assigned an average probability of just 9.7 percent to the observed outcomes across these four AI benchmarks,” the report concluded, “compared to 24.6 percent from domain experts.”

That makes the domain experts look better: they assigned higher odds to what actually happened. But when the authors crunched the numbers across all questions, they concluded that there was no statistically significant difference in aggregate accuracy between the domain experts and the superforecasters. What’s more, there was no correlation between how accurate someone was in projecting the year 2025 and how dangerous they thought AI or other risks were. Prediction remains hard, especially about the future, and especially about the future of AI.

The only trick that reliably worked was aggregating everyone’s forecasts — lumping all the predictions together and taking the median produced substantially more accurate forecasts than any one individual or group. We may not know which of these soothsayers are smart, but the crowds remain wise.
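The mechanics of that aggregation are simple. Here is a minimal sketch in Python; the forecasters, probabilities, and outcomes are invented for illustration, and the Brier score used to grade accuracy is a standard measure for probability forecasts, though not necessarily the exact metric FRI used:

```python
import statistics

# Hypothetical probability forecasts for four yes/no questions
# (rows: forecasters, columns: questions).
forecasts = [
    [0.90, 0.80, 0.95, 0.70],  # a bullish forecaster
    [0.20, 0.10, 0.30, 0.05],  # a bearish forecaster
    [0.70, 0.30, 0.50, 0.50],
    [0.40, 0.50, 0.80, 0.20],
]
outcomes = [1, 0, 1, 0]  # what actually happened (1 = event occurred)

def brier(probs, outcomes):
    """Mean squared error between probabilities and outcomes; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

individual_scores = [brier(f, outcomes) for f in forecasts]

# Aggregate by taking the median forecast on each question.
aggregate = [statistics.median(q) for q in zip(*forecasts)]
aggregate_score = brier(aggregate, outcomes)

print("individuals:", [round(s, 3) for s in individual_scores])
print("median aggregate:", round(aggregate_score, 3))
```

With these made-up numbers, the median aggregate scores better than every individual forecaster, because the bullish and bearish errors partially cancel out.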

Perhaps I should have seen this outcome coming. Ezra Karger, an economist and co-author on both the initial XPT paper and this new one, told me upon the first paper’s release in 2023 that, “over the next 10 years, there really wasn’t that much disagreement between groups of people who disagreed about those longer run questions.” That is, they already knew that the predictions of people worried about AI and people less worried were pretty similar.

So, it shouldn’t surprise us too much that one group wasn’t dramatically better than the other at predicting the years 2022–2025. The real disagreement wasn’t about the near-term future of AI but about the danger it poses in the medium and long run, which is inherently harder to judge and more speculative.

There is, perhaps, some valuable information in the fact that both groups underestimated the rate of AI progress: perhaps that’s a sign that we have all underestimated the technology, and it’ll keep improving faster than anticipated. Then again, the predictions were all made in 2022, before the release of ChatGPT that November. Who do you remember, before that app’s rollout, predicting that AI chatbots would become ubiquitous in work and school? Didn’t we already know that AI made big leaps in capabilities between 2022 and 2025? And does that tell us anything about whether the technology will keep up this pace rather than slow down, which, in turn, would be key to forecasting its long-term threat?

Reading the latest FRI report, I wound up in a similar place to my former colleague Kelsey Piper last year. Piper noted that failing to extrapolate trends, especially exponential trends, out into the future has led people badly astray in the past. The fact that relatively few Americans had Covid in January 2020 did not mean Covid wasn’t a threat; it meant that the country was at the start of an exponential growth curve. A similar kind of failure would lead one to underestimate AI progress and, with it, any potential existential risk.

At the same time, in most contexts, exponential growth can’t go on forever; it maxes out at some point. It’s remarkable that, say, Moore’s law has broadly predicted the growth in microprocessor density accurately for decades — but Moore’s law is famous in part because it’s unusual for trends about human-created technologies to follow so clean a pattern.
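To see why naive extrapolation is so treacherous: an exponential curve and a saturating, S-shaped (logistic) curve can be nearly indistinguishable early on, then tell opposite stories once extrapolated. A toy sketch, with all parameters invented for illustration:

```python
import math

def exponential(t):
    """Unbounded growth: doubles roughly every 1.4 time steps."""
    return math.exp(0.5 * t)

def logistic(t, ceiling=100.0):
    """Same early growth rate, but saturates at a ceiling."""
    return ceiling / (1.0 + math.exp(-0.5 * (t - 9.21)))

# Early on, the two curves are nearly indistinguishable...
for t in (0, 2, 4):
    print(t, round(exponential(t), 2), round(logistic(t), 2))

# ...but extrapolated forward, one explodes while the other flattens.
for t in (12, 20):
    print(t, round(exponential(t), 1), round(logistic(t), 1))
```

Fitting only the early data points gives you almost no power to distinguish the two regimes, which is exactly the forecaster's dilemma.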

“I’ve increasingly come to believe that there is no substitute for digging deep into the weeds when you’re considering these questions,” Piper concluded. “While there are questions we can answer from first principles, [AI progress] isn’t one of them.”

I fear she’s right — and that, worse, mere deference to experts doesn’t suffice either, not when experts disagree with each other on both specifics and broad trajectories. We don’t really have a good alternative to trying to learn as much as we can as individuals and, failing that, waiting and seeing. That’s not a satisfying conclusion to a newsletter — or a comforting answer to one of the most important questions facing humanity — but it’s the best I can do.
