News and in-depth articles
| Sidharth Sinha, Anson Bastos, Xuchao Zhang, Akshay Nambi, Rujia Wang, and Chetan Bansal
Deploying large language models (LLMs) in real-world, high-stakes settings is harder than it should be. In high-stakes settings like law, medicine, and cloud incident response, performance and reliability can quickly break down because adapting models to domain-specific requirements is a…
| Shraddha Barke, Arnav Goyal, Alind Khare, and Chetan Bansal
As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transparency. When a human makes a mistake, we can usually trace…
In this edition: Privacy enhancements for multiparty deep learning; using smaller, open-source models to provide relevance judgments; new tool uses AI, data to automate innovation and development; Yasuyuki Matsushita named IEEE 2025 Computer Society Fellow.
| Minghua Ma, Gagan Somashekar, Rujia Wang, Chetan Bansal, and Saravan Rajmohan
AIOpsLab is an open-source framework designed to evaluate and improve AI agents for cloud operations. It offers standardized, scalable benchmarks for real-world testing, with the goal of enhancing cloud system reliability.
Holistic motion-capture calibration technique without calibration, manual intervention or custom hardware; Research on AI agents for autonomous clouds; Automating proof-oriented program construction; One-to-many testing for natural language code generation.
New Research | FLASH: Workflow automation agent for diagnosing recurring incidents; METAREFLECTION: Learning instructions for language agents using past reflections; Boosting LLM training efficiency through faster communication between GPUs; and more.
Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. Time-series forecasting is a technique used to predict future values based on previously…
In this edition: Can LLMs transform natural language into formal method postconditions; Semantically aligned question + code generation for automated insight generation; Explaining CLIP performance disparities on blind/low vision data; plus recent news.
| Anjaly Parayil, Ayush Choure, Fiza Husain, Avi Nayak, Piyali Jana, Rujia Wang, Chetan Bansal, and Saravan Rajmohan
Integrating AI into cloud service monitoring improves incident detection accuracy, reduces unnecessary alerts, and enhances overall system reliability. This helps organizations better align with business goals and increase customer satisfaction.