Safety in the Inference-Time Compute Paradigm Expression of Interest

Schmidt Sciences

Opens Mar 13 2025 12:00 AM (EDT)

Deadline May 1 2025 07:59 AM (EDT)

Description

Overview

Schmidt Sciences is a philanthropic organization that accelerates scientific knowledge and breakthroughs to support a thriving world. We are pleased to announce a new request for proposals to support research on technical AI safety, focused on the inference-time compute paradigm.

The goal of the AI Safety Science program is to deepen our understanding of the safety properties of systems built with large language models (LLMs) and to develop well-founded, concrete, implementable technical methods for testing, evaluating, and improving the safety of LLMs. The program will improve our understanding of various testing methodologies and drive methods to design safer AI systems, grow the technical AI safety academic community, and ensure that safety theory is informed by practice—and vice versa.

Schmidt Sciences seeks research proposals to advance the science of AI safety, specifically focused on the inference-time compute paradigm. Research proposals should anticipate a budget of up to USD $500,000 for projects between 12-18 months.

Context

Large language models (LLMs) have entered a new scaling paradigm. These models have historically been characterized by scaling laws relating model size, dataset size, and pre-training compute to model performance (Kaplan et al., 2020). However, recent breakthroughs across all frontier AI labs demonstrate a different approach with apparently better performance (e.g., OpenAI’s o1 and o3, DeepSeek’s R1, Google DeepMind’s Flash Thinking, xAI’s Grok 3 (Think), Anthropic’s Claude 3.7 Sonnet). Through reinforcement learning and other techniques, AI labs have developed methods to efficiently allocate more compute during inference time to dramatically improve LLM performance. This has created “reasoning models,” LLMs that leverage additional compute at inference time through RL-optimized chain-of-thought reasoning, recursive refinement, and other methods.

These reasoning models can break down complex problems into steps, verify intermediate conclusions, and explore multiple approaches before producing a final answer (e.g., Guo et al., 2025; Muennighoff et al., 2025; Snell et al., 2024; Geiping et al., 2025). This new inference-time compute paradigm (or test-time compute paradigm) demands rigorous scientific investigation for assuring safety, because we expect novel failure modes, novel opportunities for safeguards, and emergent behaviors to arise when models engage in increasingly complex multi-step reasoning at inference time. In particular, in this Request for Proposals, we seek swiftly conducted novel research to address current unknowns and make progress toward concrete outputs—such as tools, models, or other research artifacts that enable further progress.

Core question

We are interested in funding the crucial work needed to both understand the implications of this paradigm on model safety and how to utilize the inference-time compute paradigm to actively make LLMs safer. (For detailed discussion on how Schmidt Sciences thinks about safety, see our website and research agenda.)

Our core RFP question: What is the most critical technical AI safety challenge or opportunity that has emerged as a result of the inference-time compute paradigm? How would you address it?

Illustrative examples of project ideas

This section is not designed to direct or constrain your creative thinking about the hardest safety problems in inference-time compute. Rather, it provides illustrations of problems that might be considered both challenging and worthy of study. These ideas do not represent the full scope of the inference-time compute paradigm.

Example 1: Enduring Problems and New Risks. Issues like adversarial robustness, contamination, and scalable oversight are prominent and worthwhile areas of safety research, but recent breakthroughs in inference-time compute also warrant rigorous investigation like chain-of-thought faithfulness, new problems in reward gaming, and safe exploration. We encourage applications that investigate longstanding risks, emerging challenges in this paradigm, or both.
- Ex. Chain-of-Thought Faithfulness: As models leverage large-scale RL and inference-time compute, how do we evaluate whether and to what extent their chains of thought remain faithful and monitorable? How does this faithfulness vary with the prompt or context? How can we assess whether and to what extent CoT faithfulness empirically translates into greater downstream safety?
Example 2: Understanding Safety and Designing Safely. Research can focus on scientifically understanding the risks and implications of inference-time compute on model safety (e.g. evaluations science), or make models safer through intentional design changes or by harnessing inference-time compute as an active tool. All directions are valuable for improving AI safety.
- Ex. Safe Prompting: How do different prompting strategies affect the ability to prevent unsafe behaviors in misaligned models, including prompt injection within the chain of thought? Are there inference-time interventions that can effectively disable harmful capabilities?

We encourage applications for research that discover novel failure modes emerging from inference-time compute, demonstrate the replicable nature of recently surfaced problems to certify their validity, design robust evaluations that quantify and measure associated risks, or construct targeted interventions that actively enhance model safety.

Projects should aim to produce tangible research outcomes that advance the scientific understanding of inference-time compute safety—such as theoretical analyses, rigorously validated evaluation designs, mitigation strategies, functional prototype implementations, or reproducible experimental results.

Out of scope topics

Policy, regulatory, and governance frameworks
Purely organizational guidelines around inference-time scaling
Human-focused evaluation or human-based red-teaming research
- Note: Research into the safety-relevant challenges and opportunities associated specifically with inference-time compute in the context of human-LLM dyads is in scope
Broad alignment theory or catastrophic risk research (e.g., CBRN) unrelated to inference-time compute
Pre-training or post-training safety research unrelated to inference-time compute
Conventional formal verification research unrelated to inference-time compute
Any research unrelated to technical AI safety or unrelated to inference-time compute

Application process

This RFP will have two stages: an expression of interest (EoI) and a full proposal.

Stage 1: Expressions of Interest (EoIs)

At this stage, applicants submit up to 500 words addressing the core RFP question. We seek concisely described ideas for rigorous research that can, in a relatively short period of time, lead to progress on significant issues or opportunities in safety. The ideal project would focus on one critical technical safety problem or opportunity relevant for inference-time computation, and propose concrete/tangible methods for addressing it. The response should contain a crisp statement of what you think is the most critical technical challenge or opportunity related to AI safety in the evolving inference-time compute paradigm, why it is the most critical, and brief highlights of how you would tackle that challenge.

We encourage efforts that focus on the full range of available reasoning models, research that aims to produce generalizable insights rather than isolated model evaluations, and efforts that seek to develop a rigorous scientific understanding of this new paradigm.

Submissions must include the following information:

Project title
Principal investigator and institution and collaborators
Problem: What is the most critical technical AI safety challenge or opportunity that has emerged as a result of the inference-time compute paradigm? What makes this challenge or opportunity the most critical?
Approach: How will you address this challenge or opportunity?
Impact: What will we understand about AI safety through inference-time compute after your project is complete?
Estimated project budget

All expressions of interest must be submitted in English. Researchers may submit more than one expression of interest. Applicants can submit up to 500 words in their expression of interest, references do not count toward the world limit. Relevant diagrams or figures can be uploaded, but other types of submissions will not be reviewed.

The deadline for EoIs is Wednesday, April 30, 2025 at 11:59 PM, Anywhere on Earth. Late submissions will not be accepted.

Stage 2: Full proposals

After reviewing EoI submissions, we will invite a subset of applicants to submit full project proposals. Full proposals include more detail on goals, research plans, research outputs, and a detailed budget.

Submissions will be assessed on criteria that include:

Alignment with the RFP scope and focus on inference-time compute
Alignment with Basic Research category of the AI Safety Science research agenda
Likelihood that the proposed research will truly move the needle in AI safety
Novelty of the proposed research
Track record of applicant

All full proposals must be submitted in English.

Project duration

We are looking to fund 12- to 18-month projects. Not all teams are positioned to jump right into this work in Q3 2025 and deliver results in 12-18 months. To that end, if invited to submit a full proposal, applicants will have the opportunity to outline the team’s capabilities and existing infrastructure to start this work quickly.

Project resources

Awards will be up to USD $500,000 per project. Some projects might require much less funding to execute.

In addition to funding, the Safety Science program provides:

Dedicated compute: Through a partnership with the Center for AI Safety, Safety Science offers access to high-performance compute resources.
API access: Through a partnership with OpenAI, Safety Science offers OpenAI API credits to experiment with state-of-the-art reasoning models. (We are developing partnerships with other frontier model providers.)
Community access: Safety Science gathers its researcher community annually for a convening, designed to maximally benefit the community. This typically includes connecting researchers to each other, to other AI safety funders, to AI safety researchers in frontier labs, and to other important stakeholders in the space.

Eligibility

We invite individual researchers, research teams, research institutions, and multi-institution collaborations within university, national laboratory, institute, non-profit research organizations, or agency settings to submit research ideas.

We encourage collaborations across geographic boundaries, particularly outside North America and Western Europe. International applicants are welcome, and there is no requirement to include U.S.-based institutions.

Indirect costs of any project that we fund must be at or below 10% to comply with our policy.

Timeline

March 13: RFP launched
March & April: Office hours
April 30: Expression of Interest deadline
May: Selected applicants invited to submit full proposals

Office hours

We will hold virtual office hours on the following dates and times:

Thursday, March 27, 9:00 - 11:00 AM EDT
Wednesday, April 2, 11:00 AM - 1:00 PM EDT
Thursday, April 9, 3:00 - 5:00 PM EDT

Please sign up for office hours here. Please only sign up for one office hours slot.

Questions

Please send any questions to Ryan Gajarawala at aisafety@schmidtsciences.org