New Agent Tops Benchmarks, Competing with OpenAI’s ChatGPT-5 Pro

Dec 12, 2025
Technology

Photo: Sundar Pichai (Collected)

Staff Report: PNN

Google on Thursday unveiled its new Gemini Deep Research agent, built on the Gemini 3 Pro foundation model. The agent is designed not only to generate research reports but also to allow developers to integrate Google’s advanced research capabilities into their own apps. This is enabled through Google’s Interactions API, offering developers more control in the age of agent-based AI.

The new Gemini Deep Research tool can analyze vast datasets and handle complex prompts with large contexts. Google says it is already being used for tasks such as due diligence and pharmaceutical toxicity and safety research.

Google plans to integrate the agent into Google Search, Google Finance, the Gemini app, and its popular NotebookLM soon, making information retrieval easier by having AI perform searches instead of humans.

The Gemini 3 Pro model behind Deep Research is designed to minimize hallucinations in complex, long-running agent-based tasks. Google also created a new open-source benchmark, DeepSearchQA, to test multi-step, complex information retrieval tasks.

The new agent has also been tested on independent benchmarks such as Humanity’s Last Exam and BrowseComp, the latter of which measures browser-based task performance. Results show Google’s agent ranks top on its own benchmark and on Humanity’s Last Exam, while OpenAI’s ChatGPT-5 Pro performs nearly as well in some cases.

However, on the same day, OpenAI released its much-anticipated GPT-5.2 (Garlic) model, which competes directly with Google’s offering. OpenAI claims the new model ranks top on most general and proprietary benchmarks.

Notably, Google announced its new agent before the Garlic model was released, intensifying competition in the AI space.