Publication: Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces. Karan Gupta, Pranav Vajreshwari, Yash Pandya, Raghav Magazine, Akshay Nambi, Ahmed Awadallah. ICLR Agents in the Wild | March 2026
Publication: Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use. Aradhye Agarwal, Gurdit Siyan, Yash Pandya, Joykirat Singh, Akshay Nambi, Ahmed Awadallah. ICLR Agents in the Wild: Safety, Security, and Beyond | March 2026
Microsoft Research Blog: Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model. March 4, 2026 | Jyoti Aneja, Michael Harrison, Neel Joshi, Tyler LaBonte, John Langford, Eduardo Salinas
Publication: Phi-4-reasoning-vision-15B Technical Report. Jyoti Aneja, Michael Harrison, Neel Joshi, Tyler LaBonte, John Langford, Eduardo Salinas. MSR-TR-2026-10 | March 2026. Author: Microsoft Research
Publication: Wavelet Predictive Representations for Non-Stationary Reinforcement Learning. Min Wang, Xin Li, Ye He, Yao-Hui Li, Hasnaa Bennis, Riashat Islam, Mingzhong Wang. ICLR 2026 | October 2025
Job opportunity: Research Intern – Foundations of GenAI. Posted: January 12, 2026. Location: New York, NY, US. Research area: Artificial intelligence