AI agents must handle a host of tasks that demand different speeds and levels of reasoning and planning. Ideally, an agent should know when to answer directly from memory and when to engage more complex reasoning capabilities. However, designing agentic systems that can match their mode of thinking to a task's requirements remains a challenge.
In a new paper, researchers at Google DeepMind introduce Talker-Reasoner, an agentic framework inspired by the “two systems” model of human cognition. This framework enables AI agents to find the right balance between different types of reasoning and provide a more fluid user experience.
The two-systems theory, popularized by Nobel laureate Daniel Kahneman, suggests that human thought is driven by two distinct systems. System 1 is fast, intuitive, and automatic. It governs our snap judgments, such as reacting to sudden events or recognizing familiar patterns. System 2, in contrast, is slow, deliberate, and analytical. It enables complex problem-solving, planning, and reasoning.
While often treated as separate, these systems interact continuously. System 1 generates impressions, intuitions, and intentions. System 2 evaluates these suggestions and, if endorsed, integrates them into explicit beliefs and deliberate choices. This interplay allows us to seamlessly navigate a wide range of situations, from everyday routines to challenging problems.
Current AI agents mostly operate in a System 1 mode. They excel at pattern recognition, quick reactions, and repetitive tasks. However, they often fall short in scenarios requiring multi-step planning, complex reasoning, and strategic decision-making—the hallmarks of System 2 thinking.
The Talker-Reasoner framework proposed by DeepMind aims to equip AI agents with both System 1 and System 2 capabilities. It divides the agent into two distinct modules: the Talker and the Reasoner.
The Talker is the fast, intuitive component analogous to System 1. It handles real-time interactions with the user and the environment. It perceives observations, interprets language, retrieves information from memory, and generates conversational responses. The Talker agent usually uses the in-context learning (ICL) abilities of large language models (LLMs) to perform these functions.
The Reasoner embodies the slow, deliberative nature of System 2. It performs complex reasoning and planning. It is primed to perform specific tasks and interacts with tools and external data sources to augment its knowledge and make informed decisions. It also updates the agent’s beliefs as it gathers new information. These beliefs drive future decisions and serve as the memory that the Talker uses in its conversations.
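To make the Reasoner's role concrete, here is a minimal sketch of a belief-state update in Python. The class, field names, and placeholder planning logic are illustrative assumptions, not the paper's actual schema; a real Reasoner would call an LLM and external tools at the step marked below.

```python
from dataclasses import dataclass, field

# Illustrative belief state maintained by the Reasoner; the field names
# are assumptions, not DeepMind's actual interface.
@dataclass
class BeliefState:
    facts: dict = field(default_factory=dict)  # what the agent believes about the user
    plan: list = field(default_factory=list)   # current multi-step plan

def reason(belief: BeliefState, observation: dict) -> BeliefState:
    """Slow, deliberate System 2 step: fold a new observation into the
    belief state and replan. A real Reasoner would invoke an LLM and
    external tools here; this sketch just merges the observation."""
    belief.facts.update(observation)
    belief.plan = [f"address {topic}" for topic in belief.facts]  # placeholder planning
    return belief

belief = reason(BeliefState(), {"sleep_issue": "trouble falling asleep"})
print(belief.plan)  # plan derived from the updated beliefs
```

The key design point is that the updated `BeliefState` is written to shared memory, where it serves as the grounding the Talker draws on in conversation.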
“The Talker agent focuses on generating natural and coherent conversations with the user and interacts with the environment, while the Reasoner agent focuses on performing multi-step planning, reasoning, and forming beliefs, grounded in the environment information provided by the Talker,” the researchers write.
The two modules interact primarily through a shared memory system. The Reasoner updates the memory with its latest beliefs and reasoning results, while the Talker retrieves this information to guide its interactions. This asynchronous communication allows the Talker to maintain a continuous flow of conversation, even as the Reasoner carries out its more time-consuming computations in the background.
“This is analogous to [the] behavioral science dual-system approach, with System 1 always being on while System 2 operates at a fraction of its capacity,” the researchers write. “Similarly, the Talker is always on and interacting with the environment, while the Reasoner updates beliefs informing the Talker only when the Talker waits for it, or can read it from memory.”
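The asynchronous interplay described above can be sketched with two threads sharing a memory dict. All names here are hypothetical stand-ins for the paper's components: the Talker replies immediately from whatever beliefs are currently in memory, while the Reasoner updates them in the background.

```python
import queue
import threading
import time

# Shared memory both modules access; the structure is an assumption.
memory = {"beliefs": "no beliefs yet"}
lock = threading.Lock()
inbox = queue.Queue()  # observations the Talker hands off to the Reasoner

def reasoner():
    """System 2: slow background loop that updates shared beliefs."""
    while True:
        observation = inbox.get()
        if observation is None:      # shutdown signal
            return
        time.sleep(0.1)              # stand-in for slow multi-step reasoning
        with lock:
            memory["beliefs"] = f"belief derived from: {observation}"

def talker(user_utterance: str) -> str:
    """System 1: responds at once from the latest available beliefs,
    never blocking on the Reasoner."""
    inbox.put(user_utterance)        # hand off for background reasoning
    with lock:
        return f"(reply to '{user_utterance}' using {memory['beliefs']})"

worker = threading.Thread(target=reasoner, daemon=True)
worker.start()
first = talker("I can't sleep")      # instant reply, possibly stale beliefs
print(first)
time.sleep(0.3)                      # give the Reasoner time to catch up
second = talker("any tips?")         # now grounded in updated beliefs
print(second)
inbox.put(None)
```

Note how the first reply goes out before the Reasoner has finished: the Talker is "always on," and simply picks up richer beliefs once they land in memory, mirroring the paper's description of System 2 running behind an always-available System 1.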
The researchers tested their framework in a sleep coaching application. The AI coach interacts with users through natural language, providing personalized guidance and support for improving sleep habits. This application requires a combination of quick, empathetic conversation and deliberate, knowledge-based reasoning.
The Talker component of the sleep coach handles the conversational aspect, providing empathetic responses and guiding the user through different phases of the coaching process. The Reasoner maintains a belief state about the user’s sleep concerns, goals, habits, and environment. It uses this information to generate personalized recommendations and multi-step plans. The same framework could be applied to other applications, such as customer service and personalized education.
The DeepMind researchers outline several directions for future research. One area of focus is optimizing the interaction between the Talker and the Reasoner. Ideally, the Talker should automatically determine when a query requires the Reasoner’s intervention and when it can handle the situation independently. This would minimize unnecessary computations and improve overall efficiency.
Another direction involves extending the framework to incorporate multiple Reasoners, each specializing in different types of reasoning or knowledge domains. This would allow the agent to tackle more complex tasks and provide more comprehensive assistance.