Jianfeng Gao

Technical Fellow & Corporate Vice President

AsgardBench | three white icons on a blue-to-purple gradient background | first icon shows a laptop screen with an eye in the upper right corner, second icon shows relational nodes, third icon is a security shield with a checkmark

Microsoft Research Blog

AsgardBench: A benchmark for visually grounded interactive planning

March 26, 2026 | Andrea Tupini, Lars Liden, Reuben Tan, Yu Wang, and Jianfeng Gao

Imagine a robot tasked with cleaning a kitchen. It needs to observe its environment, decide what to do, and adjust when things don't go as expected, for example, when the mug it was tasked to wash is already clean, or…

V2GP framework | Three white line icons, showing a target within a rounded square, a checklist, and a robotic arm, on a blue‑to‑green gradient background.

Microsoft Research Blog

GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation

March 26, 2026 | Sehun Jung, HyunJee Song, Dong-Hee Kim, Reuben Tan, Jianfeng Gao, Yong Jae Lee, and Donghyun Kim

Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generates a plan in…

blue and purple gradient background with decorative white icons

Microsoft Research Blog

PlugMem: Transforming raw agent interactions into reusable knowledge

March 10, 2026 | Ke Yang, Michel Galley, Chenglong Wang, Jianfeng Gao, Jiawei Han, and ChengXiang Zhai

It seems counterintuitive: giving AI agents more memory can make them less effective. As interaction logs accumulate, they grow large, fill with irrelevant content, and become increasingly difficult to use. More memory means that agents must search through larger volumes of…

Diagram showing visual, audio, and document icons feeding into a central network icon of connected people, which then leads to a checkmark symbol, all on a blue‑to‑purple gradient background.

Microsoft Research Blog

Argos: Multimodal reinforcement learning with agentic verifier for AI agents

January 20, 2026 | Reuben Tan, Baolin Peng, Zhengyuan Yang, Oier Mees, and Jianfeng Gao

Argos improves multimodal RL by evaluating whether an agent’s reasoning aligns with what it observes over time. The approach reduces visual hallucinations and produces more reliable, data-efficient agents for real-world applications.

Three white line icons on a gradient background transitioning from blue to pink. From left to right: a network or molecule structure with a central circle and six surrounding nodes, a 3D cube, and an open laptop with an eye symbol above it.

Microsoft Research Blog

MindJourney enables AI to explore simulated 3D worlds to improve spatial interpretation

August 20, 2025 | Yuncong Yang, Reuben Tan, Swadheen Shukla, and Jianfeng Gao

MindJourney can enable AI to navigate and interpret 3D environments from limited visual input, potentially improving performance in navigation, planning, and safety-critical tasks.

CollabLLM blog hero | flowchart diagram starting in the upper left corner with an icon of two overlapping chat bubbles; arrow pointing right to an LLM network node icon; branching down to show three simulated users; right arrow to a "Reward" box

Microsoft Research Blog

CollabLLM: Teaching LLMs to collaborate with users

July 15, 2025 | Shirley Wu, Michel Galley, Baolin Peng, Swadheen Shukla, and Jianfeng Gao

Recipient of an ICML 2025 Outstanding Paper Award, CollabLLM improves how LLMs collaborate with users, including knowing when to ask questions and how to adapt tone and communication style to different situations. This approach helps move AI toward more user-centric…

Microsoft Research Blog

Research Focus: Week of April 21, 2025

April 23, 2025

In this issue: our CHI 2025 & ICLR 2025 contributions, plus research on causal reasoning & LLMs; countering LLM jailbreak attacks; and how people use AI vs. AI-alone. Also, SVP of Microsoft Health Jim Weinstein talks rural healthcare innovation.

Microsoft Research Blog

Research Focus: Week of March 24, 2025

March 26, 2025

In this issue, we examine a new conversation segmentation method that delivers more coherent and personalized agent conversation, and we review efforts to improve MLLMs’ understanding of geologic maps. Check out the latest research and other updates.

Gradient background transitioning from blue on the left to pink on the right. In the center, a rectangular box with ‘MAGMA’ written in bold white letters. To the left, an icon of a globe representing Earth. To the right, an icon of a computer monitor displaying a globe. Arrows connect these three elements in a circular flow, indicating interaction or data exchange between Earth, MAGMA, and the computer.

Microsoft Research Blog

Magma: A foundation model for multimodal AI agents across digital and physical worlds

February 25, 2025 | Swadheen Shukla, Jianwei Yang, Reuben Tan, Qianhui Wu, and Jianfeng Gao

Explore Magma, a foundation model that can empower AI assistants to interpret environments, plan actions, and execute tasks across digital and physical spaces. Magma is now available; learn how it advances the field of agentic AI.


Contact Jianfeng Gao

Microsoft Research Lab – Redmond