Question
On January 1, 2025, which frontier AI lab will have a publicly available model with the highest score on the MMLU benchmark?
Forecast Timeline
Total Forecasters: 44
Lab | Prediction
---|---
OpenAI (resolved) | 54.5%
Anthropic | 26.7%
Google DeepMind | 16%
Meta | 1.1%
Mistral | 0.5%
Other | 1.2%

Which of these actually happened? OpenAI
Community Peer Score: 17.7
Community Baseline Score: 47.4
Authors:
Opened: Mar 28, 2024
Closes: Jan 1, 2025
Resolves: Jan 2, 2025
Spot Scoring Time: Mar 31, 2024
In the following years, what will be the highest LLM scores on the GPQA Diamond benchmark?
93.1
When will AI achieve competency on multi-choice questions across diverse fields of expertise (MMLU)?
21 Nov 2025
When will a Chinese entity develop a model surpassing GPT-4's few-shot performance on MMLU?
19 Mar 2025