Publication BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation Yan Li, Zezi Zeng, Ziwei Zhou, Xin Gao, Muzhao Tian, Yifan Yang, Ming-Hung Cheng, Qiaomin Dai, Yuqing Yang, Lili Qiu, Zhendong Wang, Zhengyuan Yang, Xue Yang, Lijuan Wang, Ji Li, Chong Luo March 2026
Publication Enabling ab initio geometry optimization of strongly correlated systems with transferable deep quantum Monte Carlo P. Szabó, Zeno Schätzle, Frank Noé March 2026
Publication A Decade-Scale Benchmark Evaluating LLMs’ Clinical Practice Guidelines Detection and Adherence in Multi-turn Conversations Andong Tan, Shuyun Dai, Jinglu Wang, Fengtao Zhou, Yan Lu, Xi (Ada) Wang, Ying-Che Chen, Can Yang, Shujie Liu, Hao Chen March 2026
Publication DFLOP: A Data-driven Framework for Multimodal LLM Training Pipeline Optimization H. An, Sihyun Kim, Chaerim Lim, Hyunjoong Kim, Rathijit Sen, Sangmin Jung, Hye Yoon Lee, Dongwook Kim, Takki Yu, Jinkyu Jeong, Youngsok Kim, Kwanghyun Park March 2026
Publication MegaFlow: Zero-Shot Large Displacement Optical Flow Dingxi Zhang, Fangjinhua Wang, Marc Pollefeys, Haofei Xu March 2026
Publication RESPOND: Responsive Engagement Strategy for Predictive Orchestration and Dialogue Meng-Chen Lee, Costas Panay, Javier Hernandez, Sean Andrist, Dan Bohus, Anatoly Churikov, Andrew D. Wilson March 2026
Publication Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Jeonghye Kim, Xufang Luo, Minbeom Kim, Sangmook Lee, Dohyung Kim, Jiwon Jeon, Dongsheng Li, Yuqing Yang March 2026
Publication Willful Disobedience: Automatically Detecting Failures in Agentic Traces Reshabh K Sharma, Shraddha Barke, Ben Zorn March 2026
Publication The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More Lingjiao Chen, Chi Zhang, Yeye He, Ion Stoica, Matei A. Zaharia, James Zou March 2026