{"id":992625,"date":"2023-12-20T09:00:00","date_gmt":"2023-12-20T17:00:00","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/research-focus-week-of-december-18-2023\/"},"modified":"2023-12-19T08:14:32","modified_gmt":"2023-12-19T16:14:32","slug":"research-focus-week-of-december-18-2023","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/research-focus-week-of-december-18-2023\/","title":{"rendered":"Research Focus: Week of December 18, 2023"},"content":{"rendered":"\n<figure class=\"wp-block-pullquote\"><blockquote><p><em class=\"\">Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code\/datasets, new hires and other milestones from across the research community at Microsoft.<\/em><\/p><\/blockquote><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1.png\" alt=\"Research Focus\nDecember 18th, 2023\" class=\"wp-image-992748\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1.png 1400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-655x368.png 655w, 
https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1280x720.png 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"naserex-optimizing-early-exits-via-automl-for-scalable-efficient-inference-in-big-image-streams\">NASerEx: Optimizing Early Exits via AutoML for Scalable Efficient Inference in Big Image Streams<\/h2>\n\n\n\n<p>Deep Neural Networks (DNNs) are essentially stacked transformation functions (layers) that generate progressively complex features\/encoding. This makes them universal approximators and allows for unprecedented success in complex tasks. 
This inferential effectiveness comes at the cost of increased computational complexity, making DNNs hard to scale for operational efficiency in AI applications, especially when running on resource-constrained hardware.<\/p>\n\n\n\n<p>In a recent paper, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/naserex-optimizing-early-exits-via-automl-for-scalable-efficient-inference-in-big-image-streams\/\" target=\"_blank\" rel=\"noreferrer noopener\">NASerEx: Optimizing Early Exits via AutoML for Scalable Efficient Inference in Big Image Streams,<\/a> researchers from Microsoft and their collaborators propose a new framework to address this problem. NASerEx leverages neural architecture search (NAS) with a novel saliency-constrained search space and exit decision metric to learn suitable early exit structures that augment deep neural models for scalable, efficient inference on big image streams. With adaptive inference, the optimized exit-augmented models run ~2.5x faster with ~4x lower aggregate effective FLOPs and no significant accuracy loss.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--1\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/naserex-optimizing-early-exits-via-automl-for-scalable-efficient-inference-in-big-image-streams\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-21d8108ee594aad478409a8aa618b2ee\" id=\"new-research-1\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"insightpilot-an-llm-empowered-automated-data-exploration-system\">InsightPilot: An LLM-Empowered Automated Data Exploration System<\/h2>\n\n\n\n<p>Effective data exploration requires in-depth knowledge of the dataset and the user intent, and expertise in data analysis techniques. 
Unfamiliarity with any of these can create obstacles that make the process time-consuming and overwhelming.<\/p>\n\n\n\n<p>In a recent paper, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/insightpilot-an-llm-empowered-automated-data-exploration-system\/\" target=\"_blank\" rel=\"noreferrer noopener\">InsightPilot: An LLM-Empowered Automated Data Exploration System,<\/a> researchers from Microsoft address this issue. InsightPilot is a large language model (LLM)-based, automated system designed to simplify the data exploration process. It features a set of carefully designed analysis actions that streamline exploration. Given a natural language question, InsightPilot collaborates with the LLM to issue a sequence of analysis actions, explore the data, and generate insights. The authors demonstrate the effectiveness of InsightPilot in a user study and a case study, showing how it can help users gain valuable insights from their datasets.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--2\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/insightpilot-an-llm-empowered-automated-data-exploration-system\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-12d24ec3c423365af157b03da27206ad\" id=\"blog-post\">BLOG POST<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"boosting-cloud-efficiency-harnessing-data-driven-decision-making-and-optimization-techniques\">Boosting Cloud Efficiency: Harnessing Data-Driven Decision-Making and Optimization 
Techniques<\/h2>\n\n\n\n<p>Microsoft&#8217;s cloud system serves as the backbone for the daily operations of hundreds of thousands of organizations, driving productivity and collaboration. The foundational infrastructure demands both high <em>reliability <\/em>and <em>efficiency<\/em>. In a new <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/group\/systems-innovation\/articles\/boosting-cloud-efficiency-harnessing-data-driven-decision-making-and-optimization-techniques\/\">blog post<\/a>, Microsoft\u2019s Systems Innovation team explores some recent innovations to continually enhance hyper-scale cloud capacity efficiency, delivering substantial operational cost savings for customers.<\/p>\n\n\n\n<p><a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/group\/systems-innovation\/\" target=\"_blank\" rel=\"noreferrer noopener\">Systems Innovation<\/a> is a collaboration between Microsoft 365, Microsoft Research and Azure. The research group is focused on leveraging their shared deep workload understanding and combining algorithmic research with AI\/machine learning techniques and hardware innovation to improve operational reliability and efficiency.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/group\/systems-innovation\/articles\/boosting-cloud-efficiency-harnessing-data-driven-decision-making-and-optimization-techniques\/\">Read the blog<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-204f3500ef0b32d03ecf52013a43473b\" id=\"community-challenge\">COMMUNITY CHALLENGE<\/h3>\n\n\n\n<h2 
class=\"wp-block-heading\" id=\"neurips-large-language-model-efficiency-challenge\">NeurIPS Large Language Model Efficiency Challenge<\/h2>\n\n\n\n<p>Large language models (LLMs) trained on large bodies of text can solve tasks with few supervised examples. These few-shot models have shown state-of-the-art success across natural language processing (NLP) tasks, language translation, standardized exams, and coding challenges, as well as in subjective domains such as chatbots. All of these domains involve bootstrapping a single LLM, referred to as a foundation model, with examples of specific knowledge from the associated task.<\/p>\n\n\n\n<p>The process of updating a model with limited domain-specific data is known as fine-tuning. However, the costs of accessing, fine-tuning and querying foundation models to perform new tasks can be high.<\/p>\n\n\n\n<p>To help democratize access to language models, Microsoft and other industry leaders were pleased to sponsor the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/llm-efficiency-challenge.github.io\/index\" target=\"_blank\" rel=\"noopener noreferrer\">NeurIPS Large Language Model Efficiency Challenge,<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> which addressed three major issues:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Lack of transparency around model training methods leads to most models not being reproducible.<\/li>\n\n\n\n<li>The absence of a standard benchmark makes it hard to evaluate these models side by side.<\/li>\n\n\n\n<li>Insufficient access to dedicated hardware prevents widespread availability and usage of these models.<\/li>\n<\/ol>\n\n\n\n<p>The challenge to the community was to adapt a foundation model to specific tasks by fine-tuning on a single GPU, either a 4090 or an A100 (40GB), within a 24-hour time frame, while maintaining high accuracy on those tasks.<\/p>\n\n\n\n<p>Each submission was evaluated for accuracy 
and computational performance tradeoffs at commodity hardware scales. Insights and lessons were distilled into a set of well-documented steps and easy-to-follow tutorials. The machine learning community will have documentation on how to achieve the same performance as the winning entries, which will serve as a starting point for building their own LLM solutions.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/llm-efficiency-challenge.github.io\/leaderboard\" target=\"_blank\" rel=\"noreferrer noopener\">View winners<\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In this issue of Research Focus: Optimized exit-augmented models for scalable efficient inference; NeurIPS LLM Efficiency Challenge; LLM-empowered automated data exploration; Boosting cloud efficiency with data-driven decision-making and 
optimization.<\/p>\n","protected":false},"author":42183,"featured_media":992748,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13561,13556,13563,13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-992625","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-data-platform-analytics","msr-research-area-systems-and-networking","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[714577,811276],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-960x540.png\" class=\"img-object-cover\" alt=\"Research Focus 31\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-300x169.png 300w, 
https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2023\/12\/RF31-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"December 20, 2023","formattedExcerpt":"In this issue of Research Focus: Optimized exit-augmented models for scalable efficient inference; NeurIPS LLM Efficiency Challenge; LLM-empowered automated data exploration; Boosting cloud efficiency with data-driven decision-making and 
optimization.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/992625","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/42183"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=992625"}],"version-history":[{"count":12,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/992625\/revisions"}],"predecessor-version":[{"id":993978,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/992625\/revisions\/993978"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/992748"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=992625"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=992625"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=992625"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=992625"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=992625"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=992625"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun
.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=992625"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=992625"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=992625"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=992625"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=992625"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}