Section 7.1 of the paper. This is the part that turns the report from a survey into a call to action. Worth scanning if you are ever choosing a research direction in this area.
1. Bottlenecks & frictions for scaling
Data wall: can synthetic + interaction data generation keep up? When does third-party experience suffice (10 - Bottleneck — Data Wall)?
Resource demand: when does more compute yield more intelligence? Universally, or only for some task classes? Can quantitative scaling trade off against qualitative algorithmic gains?
Paradigm shifts: what's anticipatable? Do current "missing pieces" inform what shifts to expect?
Neural paradigm: when does scaling become economically unviable? How do hardware/software efficiency gains change this?
Research-gets-harder: by how much would AI need to facilitate AI research to counter human-research saturation (12 - Bottleneck — Research Gets Harder)?
Embodied bottleneck: how do physical non-universality and real-time experiment latencies limit intelligence growth?
Abstraction barrier: is the current paradigm fundamentally bounded by human conceptual frameworks (13 - Bottleneck — Abstraction Barrier)?
2. Quantitative forecasting
Identify and measure the right macro-quantities (cost per FLOP, sector-specific AI productivity)
Build coupled mathematical models of how these factors interact
Ensemble across model families
Establish protocols for continuous updating as data arrives
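The last two items can be read together as Bayesian model averaging: keep one weight per model family and reweight as each observation arrives. The following is a minimal sketch under stated assumptions, not the report's method. The two model families (exponential and saturating logistic growth of some macro-quantity), the Gaussian noise model, and all numbers are hypothetical.

```python
import math

def exponential(t, rate=0.5):
    """Hypothetical family 1: unbounded exponential growth."""
    return math.exp(rate * t)

def logistic(t, cap=10.0, rate=0.5):
    """Hypothetical family 2: growth from 1.0 that saturates at `cap`."""
    return cap / (1.0 + (cap - 1.0) * math.exp(-rate * t))

def gaussian_likelihood(observed, predicted, sigma=1.0):
    """Density of the observation under Gaussian noise around the prediction."""
    z = (observed - predicted) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def update_weights(weights, models, t, observed, sigma=1.0):
    """One Bayesian update: w_i <- w_i * p(observed | model_i), then renormalize."""
    unnormalized = {
        name: weights[name] * gaussian_likelihood(observed, model(t), sigma)
        for name, model in models.items()
    }
    total = sum(unnormalized.values())
    return {name: w / total for name, w in unnormalized.items()}

def ensemble_forecast(weights, models, t):
    """Weighted average of the per-family predictions at time t."""
    return sum(weights[name] * model(t) for name, model in models.items())

models = {"exponential": exponential, "logistic": logistic}
weights = {"exponential": 0.5, "logistic": 0.5}  # uniform prior over families

# Made-up (time, value) observations that flatten out over time.
observations = [(2.0, 2.4), (4.0, 4.6), (6.0, 6.8), (8.0, 8.5)]
for t, y in observations:
    weights = update_weights(weights, models, t, y)

forecast = ensemble_forecast(weights, models, 10.0)
```

With these invented observations the weight shifts almost entirely to the logistic family, so the ensemble forecast stays below the cap. A more robust version would work in log space, because the raw likelihoods can underflow to zero when every family fits badly.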
3. Benchmarking ASI
"Comparing against human performance will not produce useful signal to quantitatively distinguish superhuman AIs and AI innovations."
The proposed approaches:
- Multi-agent benchmarks (zero-sum games, the way chess engines are evaluated)
- Setter-solver: one AI generates the benchmark; another AI is tested against it
- General compression benchmarks (motivated by Universal Induction)
- Indirect: economic productivity, resource efficiency
- Benchmarks distinguishing true qualitative leaps from saturating-metric artifacts
- How to use ASI benchmarks to steer development toward human compatibility
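The chess-engine analogy in the first item works because ratings such as Elo are defined only by results between the rated players, so no human baseline is needed. The sketch below uses the standard Elo expected-score and update formulas. The agent names, starting ratings, and match results are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The update is zero-sum: whatever A gains, B loses.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Hypothetical agents, all starting from the same rating.
ratings = {"agent_x": 1500.0, "agent_y": 1500.0, "agent_z": 1500.0}

# Made-up match log: (player A, player B, score for A).
matches = [
    ("agent_x", "agent_y", 1.0),
    ("agent_y", "agent_z", 0.5),
    ("agent_x", "agent_z", 1.0),
]

for a, b, score_a in matches:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score_a)
```

Because every update is zero-sum, the total rating in the pool is conserved. Only the rating differences carry information, which is the property that lets such a benchmark keep separating agents after all of them exceed human level.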
4. Recursive improvement dynamics
Identify and measure each mechanism (code, hardware, data, division of labor)
Establish recursive improvement scaling laws
Study how far a fixed model can be pushed with test-time compute alone
Develop theory of recursive distillation (AlphaZero-style)
Track research productivity of AI Scientist systems
Could specialization lead to recursive improvement in collectives?
Assume intellectual R&D is fully automated. What frictions remain?
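To see why the shape of a recursive improvement scaling law matters, consider a toy model (an illustrative assumption, not taken from the report) in which capability c grows each step by rate * c ** alpha. The exponent alpha stands in for the returns to self-improvement: below 1 growth is polynomial, at 1 it is exponential, and above 1 it is faster than exponential.

```python
def simulate(alpha, rate=0.1, steps=20, c0=1.0):
    """Iterate c <- c + rate * c**alpha and return the full trajectory.

    alpha < 1: diminishing returns (polynomial growth)
    alpha = 1: constant returns (exponential growth)
    alpha > 1: increasing returns (faster than exponential)
    """
    c = c0
    trajectory = [c]
    for _ in range(steps):
        c = c + rate * c ** alpha
        trajectory.append(c)
    return trajectory

# Final capability after 20 steps for three hypothetical exponents.
final = {alpha: simulate(alpha)[-1] for alpha in (0.5, 1.0, 1.5)}
```

In this framing, measuring the individual mechanisms (code, hardware, data, division of labor) amounts to estimating the effective alpha and rate for each, and the remaining frictions are whatever caps either one. The step count is kept small because the alpha > 1 case overflows floating point if iterated much longer.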
5. Multi-agent scaling
How do groups of AGIs scale relative to individual model scaling?
For which task classes does multi-agent scaling work efficiently?
Develop multi-agent scaling laws: the group-level equivalent of the Kaplan and Chinchilla scaling laws
Group alignment — how to steer collectives; how to harden against epistemic hijacking, hallucinations, self-delusions
Epistemic resilience in mixed human-AI collectives
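The Kaplan and Chinchilla laws model loss as a power law in model size, data, and compute. A multi-agent analogue would fit task error against the number of agents. This is a minimal sketch that assumes a pure power law error(n) = a * n ** (-b) with no irreducible term. The data is synthetic, generated from a known law to check the fit, and is not a measurement.

```python
import math

def fit_power_law(ns, errors):
    """Fit error = a * n**(-b) by ordinary least squares in log-log space.

    Returns (a, b). Assumes all inputs are strictly positive.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic data from a known law (a = 0.5, b = 0.3), hypothetical values.
agent_counts = [1, 2, 4, 8, 16, 32]
errors = [0.5 * n ** -0.3 for n in agent_counts]

a, b = fit_power_law(agent_counts, errors)
```

The fitted exponent b is the quantity of interest: comparing it across task classes would indicate where adding agents pays off and where it saturates. Real data would likely need an additive irreducible-error term, as in the Chinchilla parametric form, which this log-log fit cannot capture.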
6. Theoretical foundations of superintelligence
Modify/extend AIXI for practical analysis
Theoretical bounds on lossy compression and approximation
Is capability jaggedness a fundamental property or an artifact of human comparison?
Predict in advance what ASI will/won't be able to do
Theoretical frameworks for myopic / non-agentic advanced AI
7. AI safety, alignment, sociocultural
Practical implementation of deliberate slowdown (taxation, prohibition)
What makes AIs and groups of AIs easier to align?
Instrumental sub-goal risks at scale (resource acquisition, self-preservation)
If science is automated, what happens to scientific epistemics?
Economic shift from labor to capital — effect on human "empowerment"
What's notable about this list
These are not idle questions. Several already have dedicated DeepMind papers in the references: Chan et al. 2026 (R&D automation measurement), Tomašev et al. 2025/2026 (virtual agent economies, distributional AGI safety, intelligent AI delegation), Trivedi et al. 2026 (cooperative superintelligence benchmarking), Morris et al. 2026 (capability jaggedness).
The report is in part a coordination signal: here is the research agenda we are actively prosecuting, and here are the gaps where outside contribution is most valuable.