News & features
Award | ACM SIGMICRO
Esha Choukse receives 2025 SIGMICRO Early Career Award
Choukse was recognized for her foundational contributions to hardware memory compression and to sustainable and efficient datacenter systems.
Research Focus: Week of September 23, 2024
Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. Time-series forecasting is a technique used to predict future values based on previously…
Research Focus: Week of April 15, 2024
In this issue: New research on appropriate reliance on generative AI; Power management opportunities for LLMs in the cloud; LLMLingua-2 improves task-agnostic prompt compression; Enhancing COMET to embrace under-resourced African languages.
Splitwise improves GPU usage by splitting LLM inference phases
Esha Choukse, Chaojie Zhang, Íñigo Goiri, Aashaka Shah, Saeed Maleki, Rodrigo Fonseca, and Ricardo Bianchini
Expanded LLM use creates new demands on cloud GPU capacity. Splitwise presents an efficient solution by separating the two essential phases of LLM inference, achieving higher throughput within a limited power budget.