OpenAI, the company behind ChatGPT, is developing a secretive AI project code-named “Strawberry,” aimed at significantly enhancing AI reasoning capabilities and potentially leading to breakthroughs in autonomous research and problem-solving.
While details are closely guarded, Reuters has obtained information from sources and internal documents suggesting that Strawberry involves a novel approach to training and processing AI models, one expected to enable AI to perform tasks that current systems cannot. Unlike existing models, which primarily generate responses to individual prompts, Strawberry seeks to equip AI with the ability to "plan ahead" and autonomously navigate the internet to conduct what OpenAI calls "deep research." This represents a substantial leap forward, requiring a deeper grasp of context, logic, and multi-step problem-solving.
The pursuit of human-level AI reasoning is a central focus in the industry, with companies like Google, Meta, and Microsoft exploring various techniques. Experts believe that achieving this breakthrough could allow AI to drive scientific discoveries, develop complex software, and tackle challenges that currently require human intuition and planning.
Although OpenAI has not publicly confirmed specifics about Strawberry, a company spokesperson told Reuters, “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is a common practice in the industry, with a shared belief that these systems will improve in reasoning over time.”
Strawberry appears to be an evolution of an earlier OpenAI project known as Q*, which generated internal excitement for its advanced reasoning abilities. Sources who witnessed Q* demos reported its ability to solve complex math and science problems beyond the capabilities of currently available AI.
While the exact mechanisms remain undisclosed, sources suggest that Strawberry involves a specialized form of “post-training”—a process of refining AI models after they’ve been trained on massive datasets. This post-training phase, potentially involving techniques like “fine-tuning” and self-generated training data, is crucial for honing the AI’s reasoning abilities.
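The report does not explain how self-generated data would actually be used, so the sketch below is only a rough illustration of the general idea of "post-training": a model samples candidate solutions to problems, a verifier keeps the ones that check out, and the model is then fine-tuned on its own verified outputs. It uses the open-source Hugging Face transformers library with a small stand-in model; the model name, toy tasks, and string-matching verifier are all placeholder assumptions and do not reflect OpenAI's actual Strawberry pipeline.

```python
# Illustrative sketch only: "post-training" a small language model on data it
# generated itself, in the spirit of self-taught-reasoner-style methods.
# Model name, prompts, and the correctness check are placeholder assumptions;
# this is not OpenAI's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small causal LM stands in for the base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    """Sample several candidate solutions from the current model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs, do_sample=True, top_p=0.95, max_new_tokens=64,
        num_return_sequences=n, pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def is_correct(candidate: str, expected: str) -> bool:
    """Placeholder verifier: keep only candidates containing the known answer."""
    return expected in candidate

# Toy tasks with verifiable answers (assumption: simple math-style problems).
tasks = [("Q: What is 12 * 7? A:", "84"), ("Q: What is 9 + 16? A:", "25")]

# 1) Self-generate training data, keeping only verified solutions.
kept = [c for prompt, answer in tasks
        for c in generate_candidates(prompt) if is_correct(c, answer)]

# 2) Fine-tune ("post-train") the model on its own verified outputs.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for text in kept:
    batch = tokenizer(text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In such loops, the quality of the verifier, not the model, typically limits how far self-generated data can push reasoning ability; whatever Strawberry actually does remains undisclosed.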