[AIEWF Preview] Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect
Podcast: Latent Space: The AI Engineer Podcast
Published On: Fri May 23 2025
Description:
In an otherwise heavy week packed with Microsoft Build, Google I/O, and OpenAI io, the worst kept secret in biglab land was the launch of Claude 4, particularly the triumphant return of Opus, which many had been clamoring for. We will leave the specific Claude 4 recap to AINews; however, we think that both Gemini’s progress on Deep Think this week and Claude 4 represent the next frontier of progress on inference-time compute/reasoning (at least until GPT-5 ships this summer).

Will Brown’s talk at AIE NYC and open source work on verifiers have made him one of the most prominent voices able to publicly discuss (aka without the vaguepoasting LoRA they put on you when you join a biglab) the current state of the art in reasoning models and where current SOTA research directions lead. We discussed his latest paper on Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment, and he has previewed his AIEWF talk on Agentic RL for those with the temerity to power thru bad meetup audio.

Timestamps
00:00 Introduction to the Podcast and Guests
01:00 Discussion on Claude 4 and AI Models
03:07 Extended Thinking and Tool Use in AI
06:47 Technical Highlights and Model Trustworthiness
10:31 Thinking Budgets and Their Implications
13:38 Controversy Surrounding Opus and AI Ethics
18:49 Reflections on AI Tools and Their Limitations
21:58 The Chaos of Predictive Systems
22:56 Marketing and Safety in AI Models
24:30 Evaluating AI Companies and Their Strategies
25:53 The Role of Academia in AI Evaluations
27:43 Teaching Taste in Research
28:41 Making Educated Bets in AI Research
30:12 Recent Developments in Multi-Turn Tool Use
32:50 Incentivizing Tool Use in AI Models
34:45 The Future of Reward Models in AI
39:10 Exploring Flexible Reward Systems

This is a public episode.
If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe