{"id":1032900,"date":"2024-05-15T11:12:21","date_gmt":"2024-05-15T18:12:21","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?p=1032900"},"modified":"2024-05-17T11:50:56","modified_gmt":"2024-05-17T18:50:56","slug":"research-focus-week-of-may-13-2024","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/research-focus-week-of-may-13-2024\/","title":{"rendered":"Research Focus: Week of May 13, 2024"},"content":{"rendered":"\n<figure class=\"wp-block-pullquote\"><blockquote><p><em class=\"\">Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code\/datasets, new hires and other milestones from across the research community at Microsoft.<\/em><\/p><\/blockquote><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1024x576.png\" alt=\"Research Focus: May 13, 2024\" class=\"wp-image-1033017\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-240x135.png 240w, 
https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"injecting-new-knowledge-into-large-language-models-via-supervised-fine-tuning\">Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning&nbsp;<\/h2>\n\n\n\n<p>Large language models (LLMs) have shown remarkable performance in generating text similar to that created by people, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain knowledge remains a challenge, particularly for facts and events that occur after the model\u2019s training knowledge cutoff date.<\/p>\n\n\n\n<p>In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/injecting-new-knowledge-into-large-language-models-via-supervised-fine-tuning\/\" target=\"_blank\" rel=\"noreferrer noopener\">Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning<\/a>, researchers from Microsoft investigate the effectiveness of supervised fine-tuning (SFT) as a method for knowledge injection in LLMs, specifically focusing on recent sporting events. 
They compare different dataset generation strategies\u2014token-based and fact-based scaling\u2014to create training data that helps the model learn new information. Their experiments on GPT-4 demonstrate that while token-based scaling can lead to improvements in Q&A accuracy, it may not provide uniform coverage of new knowledge. Fact-based scaling, on the other hand, offers a more systematic approach to ensure even coverage across all facts. The researchers present a novel dataset generation process that leads to more effective knowledge ingestion through SFT, and results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--1\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/injecting-new-knowledge-into-large-language-models-via-supervised-fine-tuning\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hh\">A Reflection on Human-Notebook Experiences in the Era of AI<\/h2>\n\n\n\n<p>Computational notebooks provide an interactive way to work with data. They have been widely used by data professionals to write code, explore data, and generate visualizations, all in one document. Previous research has revealed unique pain points around the user experience in computational notebooks. 
However, as AI tools like ChatGPT or Copilot have emerged, it is unclear whether these pain points have been reduced or changed, or whether new pain points have arisen. Due to the fast pace of advances in AI technology, most of the development of new AI tools has been primarily driven by technology and not by user experience.<\/p>\n\n\n\n<p>In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/a-reflection-on-human-notebook-experiences-in-the-era-of-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">A Reflection on Human-Notebook Experiences in the Era of AI<\/a>, researchers from Microsoft summarize literature on how new AI technology has impacted human-notebook interaction and human-computer interaction (HCI) paradigms, new challenges and user behavior around using AI assistants, and recent research on AI assistants in computational notebook scenarios. They outline gaps in existing literature and suggest a future focus on improving macro human-notebook experiences throughout a user\u2019s workflow, measuring and quantifying the value of AI systems, and establishing a set of standards and best practices for AI tools.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--2\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/a-reflection-on-human-notebook-experiences-in-the-era-of-ai\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hh\">Jacdac: 
Service-Based Prototyping of Embedded Systems<\/h2>\n\n\n\n<p>The traditional approach to programming embedded systems is monolithic: firmware on a microcontroller contains both application code and the drivers needed to communicate with sensors and actuators, using low-level protocols such as I2C, SPI, and RS232. In comparison, software development for the cloud has moved to a service-based development and operation paradigm: a service provides a discrete unit of functionality that can be accessed remotely by an application, or other service, but is independently managed and updated.<\/p>\n\n\n\n<p>In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/jacdac-pldi2024\/\" target=\"_blank\" rel=\"noreferrer noopener\">Jacdac: Service-Based Prototyping of Embedded Systems<\/a>, researchers from Microsoft propose, design, implement, and evaluate a service-based approach to prototyping embedded systems called\u202f<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/aka.ms\/jacdac\" target=\"_blank\" rel=\"noopener noreferrer\">Jacdac<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Jacdac defines a service specification language, designed especially for embedded systems, along with a host of specifications for a variety of sensors and actuators. With Jacdac, each sensor\/actuator in a system is paired with a low-cost microcontroller that advertises the services that represent the functionality of the underlying hardware over an efficient and low-cost single-wire bus protocol. 
A separate microcontroller executes the user\u2019s application program, which is a client of the Jacdac services on the bus.&nbsp;<\/p>\n\n\n\n<p>Three Jacdac kits, comprising over twenty modules, have been produced by third-party manufacturers: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.kittenbot.cc\/\" target=\"_blank\" rel=\"noopener noreferrer\">KittenBot<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/forwardedu.com\/\">Forward Education<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--3\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/jacdac-pldi2024\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hh\">PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models<\/h2>\n\n\n\n<p>Evaluation of multilingual LLMs is challenging due to a variety of factors: the lack of benchmarks with sufficient linguistic diversity, contamination of LLM pre-training data with popular benchmarks, and the lack of local, cultural nuances in translated benchmarks. 
Hence, it is difficult to extensively evaluate LLMs in a multilingual setting, leading to a lack of fair comparisons between models and difficulties in replicating the evaluation setup used by some models. Recently, several Indic (Indian language) LLMs have been created to help build more locally and culturally relevant LLMs.<\/p>\n\n\n\n<p>In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/pariksha-a-scalable-democratic-transparent-evaluation-platform-for-assessing-indic-large-language-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models<\/a>, researchers from Microsoft present an evaluation framework and the first comprehensive evaluation of Indic LLMs using a combination of human and LLM-based evaluation. The researchers conduct a total of 90,000 human evaluations and 50,000 LLM-based evaluations of 29 models to present leaderboards for 10 Indic languages. Pariksha provides inclusive evaluation by engaging a community of workers who represent India\u2019s large and diverse workforce, and it also serves as a research platform for improving the process of evaluation. For transparency about the process, the evaluation artifacts will be released. 
Conducting Pariksha at regular intervals, the researchers aim to enable models to improve over time with insights and artifacts from their evaluations.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--4\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/pariksha-a-scalable-democratic-transparent-evaluation-platform-for-assessing-indic-large-language-models\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hh\">Tinker, Tailor, Configure, Customize: The Articulation Work of Customizing AI Fairness Checklists<\/h2>\n\n\n\n<p>Many responsible AI resources, such as toolkits, playbooks, and checklists, have been developed to support AI practitioners in identifying, measuring, and mitigating potential fairness-related harms. These resources are often designed to be general purpose, in order to address a variety of use cases, domains, and deployment contexts. However, this can lead to decontextualization, where such resources lack the level of relevance or specificity needed to use them.<\/p>\n\n\n\n<p>To understand how AI practitioners might contextualize one such resource, an AI fairness checklist, for their particular use cases, domains, and deployment contexts, researchers from Microsoft conducted a retrospective contextual inquiry with 13 AI practitioners from seven organizations. 
In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/tinker-tailor-configure-customize-the-articulation-work-of-customizing-ai-fairness-checklists\/\" target=\"_blank\" rel=\"noreferrer noopener\">Tinker, Tailor, Configure, Customize: The Articulation Work of Customizing AI Fairness Checklists<\/a>, they identify how contextualizing this checklist introduces new forms of work for AI practitioners and other stakeholders, while opening up new sites for negotiation and contestation of values in AI. The researchers also identify how the contextualization process may help AI practitioners develop a shared language around AI fairness. They also identify dynamics related to ownership over this process that suggest larger issues of accountability in responsible AI work.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--5\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/tinker-tailor-configure-customize-the-articulation-work-of-customizing-ai-fairness-checklists\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-a584a2137da4151ecbde93fba771f798\" id=\"new-research\">NEW RESEARCH<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"hh\">MS MARCO Web Search: A Large-scale Information-rich Web Dataset with Millions of Real Click Labels<\/h2>\n\n\n\n<p>LLMs are becoming indispensable tools for many creative and information related tasks, but they still come with limitations, including a tendency to fabricate content. 
State-of-the-art algorithms pair the LLM with an external, dynamically updated knowledge base to ground the LLM\u2019s answers and provide up-to-date information. However, these techniques require large amounts of relevant, labeled training data that have not previously been publicly available.&nbsp;<\/p>\n\n\n\n<p>In a recent paper: <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/ms-marco-web-search-a-large-scale-information-rich-web-dataset-with-millions-of-real-click-labels\/\" target=\"_blank\" rel=\"noreferrer noopener\">MS MARCO Web Search: A Large-scale Information-rich Web Dataset with Millions of Real Click Labels<\/a>, presented at the 2024 ACM Web Conference, researchers from Microsoft introduce a novel dataset that closely mimics real-world web document and query distribution. MS MARCO Web Search contains 10 million unique queries across 93 languages with millions of relevant labeled query-document pairs. It uses ClueWeb22\u2019s 10 billion high-quality web pages as the document corpus and provides rich information for various kinds of downstream tasks.&nbsp;<\/p>\n\n\n\n<p>This dataset unlocks several new research directions that previous datasets cannot support well, including generic end-to-end neural indexer models, generic embedding models, and next-generation information access systems with LLMs. MS MARCO Web Search offers a retrieval benchmark with three web-scale retrieval challenge tasks, each with automatic evaluation and a leaderboard. These tasks demand innovation in both machine learning and information retrieval systems. 
The researchers intend for MS MARCO Web Search to lay the groundwork for future advancements in AI and systems research.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/github.com\/microsoft\/MS-MARCO-Web-Search\">View dataset<\/a><\/div>\n\n\n\n<div class=\"wp-block-button is-style-outline is-style-outline--6\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/ms-marco-web-search-a-large-scale-information-rich-web-dataset-with-millions-of-real-click-labels\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h6 has-blue-color has-text-color has-link-color wp-elements-56323e1c55299b34125129c842d8b7cd\" id=\"new-research\">VIDEO<\/h3>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"ai-case-studies-for-natural-science-research-with-bonnie-kruft\">AI Case Studies for Natural Science Research with Bonnie Kruft<\/h2>\n\n\n\n<p>Among the stunning changes and disruptions driven by AI, one of the most significant is the impact on scientific discovery. In her presentation at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/event.technologyreview.com\/emtech-digital-us-2024\/home\" target=\"_blank\" rel=\"noopener noreferrer\">EmTech Digital 2024<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Bonnie Kruft, partner deputy director at Microsoft Research AI for Science, outlined some examples of how generative AI enables groundbreaking research in the natural sciences. 
Recent breakthroughs aided by AI include small molecular inhibitors for treating infectious disease, the discovery of new materials for energy storage, and new drug development.&nbsp;<\/p>\n\n\n\n<p>Catch a <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/video\/ai-case-studies-for-natural-science-research-with-bonnie-kruft\/\" target=\"_blank\" rel=\"noreferrer noopener\">replay of the presentation<\/a>, including a follow-up Q&A with the audience, and hear how researchers are reducing discovery times from years to months. The discussion explores safe and responsible AI practices, how large language models can work with science-based models, and what lies ahead for AI in science.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--7\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/video\/ai-case-studies-for-natural-science-research-with-bonnie-kruft\/\">Watch the video<\/a><\/div>\n<\/div>\n\n\n\n<div style=\"padding-bottom:64px; padding-top:64px\" class=\"wp-block-msr-immersive-section alignfull row has-background has-lighter-gray-background-color has-text-color has-black-color wp-block-msr-immersive-section\">\n\t\n\t<div class=\"container\">\n\t\t<div class=\"wp-block-msr-immersive-section__inner\">\n\t\t\t\t\t<\/div>\n\t<\/div>\n\n\t<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code\/datasets, new hires and other milestones from across the research community at Microsoft. Large language models (LLMs) have shown remarkable performance in generating text similar to that created by people, proving to be a valuable asset across various applications. 
However, adapting [&hellip;]<\/p>\n","protected":false},"author":42735,"featured_media":1033017,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Leonardo Nunes","user_id":"40759"},{"type":"user_nicename","value":"Sara Malvar","user_id":"40753"},{"type":"user_nicename","value":"Bruno Silva","user_id":"42309"},{"type":"user_nicename","value":"Ranveer Chandra","user_id":"33344"},{"type":"user_nicename","value":"Serena Hillman","user_id":"41143"},{"type":"user_nicename","value":"Thomas Ball","user_id":"33895"},{"type":"user_nicename","value":"Peli de Halleux","user_id":"32253"},{"type":"user_nicename","value":"James Devine","user_id":"41632"},{"type":"user_nicename","value":"Michal Moskal","user_id":"37431"},{"type":"user_nicename","value":"Vivek Seshadri","user_id":"36323"},{"type":"user_nicename","value":"Manohar Swaminathan","user_id":"35356"},{"type":"user_nicename","value":"Sunayana Sitaram","user_id":"37287"},{"type":"user_nicename","value":"Hanna Wallach","user_id":"34779"},{"type":"user_nicename","value":"Jennifer Wortman Vaughan","user_id":"32235"},{"type":"user_nicename","value":"Qi Chen","user_id":"36990"},{"type":"user_nicename","value":"Xiubo Geng","user_id":"39075"},{"type":"user_nicename","value":"Corby Rosset","user_id":"41997"},{"type":"user_nicename","value":"Carolyn Buractaon","user_id":"39327"},{"type":"user_nicename","value":"Jingwen Lu","user_id":"40021"},{"type":"user_nicename","value":"Yeyun Gong","user_id":"39186"},{"type":"user_nicename","value":"Nick Craswell","user_id":"33088"},{"type":"user_nicename","value":"Xing Xie","user_id":"34906"},{"type":"user_nicename","value":"Fan Yang","user_id":"31782"},{"type":"user_nicename","value":"Bryan 
Tower","user_id":"36653"},{"type":"user_nicename","value":"Jason (Zengzhong) Li","user_id":"40543"},{"type":"user_nicename","value":"Rangan Majumder","user_id":"38931"},{"type":"user_nicename","value":"Jennifer Neville","user_id":"40946"},{"type":"user_nicename","value":"Harsha Simhadri","user_id":"36146"},{"type":"user_nicename","value":"Manik Varma","user_id":"32791"},{"type":"user_nicename","value":"Mao Yang","user_id":"32798"},{"type":"user_nicename","value":"Bonnie Kruft","user_id":"41919"}],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556,13563,13545,13554,13553,13560,13568],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1032900","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-research-area-data-platform-analytics","msr-research-area-human-language-technologies","msr-research-area-human-computer-interaction","msr-research-area-medical-health-genomics","msr-research-area-programming-languages-software-engineering","msr-research-area-technology-for-emerging-markets","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199562,199565,199571,851467],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144784,144812,372368,714067],"related-projects":[1018536,950052,881235,807097,716050],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Leonardo Nunes","user_id":40759,"display_name":"Leonardo Nunes","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/lnunes\/\" aria-label=\"Visit the profile page for Leonardo Nunes\">Leonardo 
Nunes<\/a>","is_active":false,"last_first":"Nunes, Leonardo","people_section":0,"alias":"lnunes"},{"type":"user_nicename","value":"Sara Malvar","user_id":40753,"display_name":"Sara Malvar","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/saramalvar\/\" aria-label=\"Visit the profile page for Sara Malvar\">Sara Malvar<\/a>","is_active":false,"last_first":"Malvar, Sara","people_section":0,"alias":"saramalvar"},{"type":"user_nicename","value":"Bruno Silva","user_id":42309,"display_name":"Bruno Silva","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/brunosilva\/\" aria-label=\"Visit the profile page for Bruno Silva\">Bruno Silva<\/a>","is_active":false,"last_first":"Silva, Bruno","people_section":0,"alias":"brunosilva"},{"type":"user_nicename","value":"Ranveer Chandra","user_id":33344,"display_name":"Ranveer Chandra","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/ranveer\/\" aria-label=\"Visit the profile page for Ranveer Chandra\">Ranveer Chandra<\/a>","is_active":false,"last_first":"Chandra, Ranveer","people_section":0,"alias":"ranveer"},{"type":"user_nicename","value":"Serena Hillman","user_id":41143,"display_name":"Serena Hillman","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/sehillma\/\" aria-label=\"Visit the profile page for Serena Hillman\">Serena Hillman<\/a>","is_active":false,"last_first":"Hillman, Serena","people_section":0,"alias":"sehillma"},{"type":"user_nicename","value":"Peli de Halleux","user_id":32253,"display_name":"Peli de Halleux","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jhalleux\/\" aria-label=\"Visit the profile page for Peli de Halleux\">Peli de Halleux<\/a>","is_active":false,"last_first":"de Halleux, Peli","people_section":0,"alias":"jhalleux"},{"type":"user_nicename","value":"James Devine","user_id":41632,"display_name":"James Devine","author_link":"<a 
href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/devinejames\/\" aria-label=\"Visit the profile page for James Devine\">James Devine<\/a>","is_active":false,"last_first":"Devine, James","people_section":0,"alias":"devinejames"},{"type":"user_nicename","value":"Manohar Swaminathan","user_id":35356,"display_name":"Manohar Swaminathan","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/swmanohmicrosoft-com\/\" aria-label=\"Visit the profile page for Manohar Swaminathan\">Manohar Swaminathan<\/a>","is_active":false,"last_first":"Swaminathan, Manohar","people_section":0,"alias":"swmanoh@microsoft.com"},{"type":"user_nicename","value":"Sunayana Sitaram","user_id":37287,"display_name":"Sunayana Sitaram","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/susitara\/\" aria-label=\"Visit the profile page for Sunayana Sitaram\">Sunayana Sitaram<\/a>","is_active":false,"last_first":"Sitaram, Sunayana","people_section":0,"alias":"susitara"},{"type":"user_nicename","value":"Hanna Wallach","user_id":34779,"display_name":"Hanna Wallach","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/wallach\/\" aria-label=\"Visit the profile page for Hanna Wallach\">Hanna Wallach<\/a>","is_active":false,"last_first":"Wallach, Hanna","people_section":0,"alias":"wallach"},{"type":"user_nicename","value":"Jennifer Wortman Vaughan","user_id":32235,"display_name":"Jenn Wortman Vaughan","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jenn\/\" aria-label=\"Visit the profile page for Jenn Wortman Vaughan\">Jenn Wortman Vaughan<\/a>","is_active":false,"last_first":"Wortman Vaughan, Jenn","people_section":0,"alias":"jenn"},{"type":"user_nicename","value":"Qi Chen","user_id":36990,"display_name":"Qi Chen","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/cheqi\/\" aria-label=\"Visit the profile page for Qi Chen\">Qi 
Chen<\/a>","is_active":false,"last_first":"Chen, Qi","people_section":0,"alias":"cheqi"},{"type":"user_nicename","value":"Xiubo Geng","user_id":39075,"display_name":"Xiubo Geng","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/xigeng\/\" aria-label=\"Visit the profile page for Xiubo Geng\">Xiubo Geng<\/a>","is_active":false,"last_first":"Geng, Xiubo","people_section":0,"alias":"xigeng"},{"type":"user_nicename","value":"Corby Rosset","user_id":41997,"display_name":"Corby Rosset","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/corbyrosset\/\" aria-label=\"Visit the profile page for Corby Rosset\">Corby Rosset<\/a>","is_active":false,"last_first":"Rosset, Corby","people_section":0,"alias":"corbyrosset"},{"type":"user_nicename","value":"Carolyn Buractaon","user_id":39327,"display_name":"Carolyn Buractaon","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/caburact\/\" aria-label=\"Visit the profile page for Carolyn Buractaon\">Carolyn Buractaon<\/a>","is_active":false,"last_first":"Buractaon, Carolyn","people_section":0,"alias":"caburact"},{"type":"user_nicename","value":"Jingwen Lu","user_id":40021,"display_name":"Jingwen Lu","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jinlu\/\" aria-label=\"Visit the profile page for Jingwen Lu\">Jingwen Lu<\/a>","is_active":false,"last_first":"Lu, Jingwen","people_section":0,"alias":"jinlu"},{"type":"user_nicename","value":"Yeyun Gong","user_id":39186,"display_name":"Yeyun Gong","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/yegong\/\" aria-label=\"Visit the profile page for Yeyun Gong\">Yeyun Gong<\/a>","is_active":false,"last_first":"Gong, Yeyun","people_section":0,"alias":"yegong"},{"type":"user_nicename","value":"Nick Craswell","user_id":33088,"display_name":"Nick Craswell","author_link":"<a 
href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/nickcr\/\" aria-label=\"Visit the profile page for Nick Craswell\">Nick Craswell<\/a>","is_active":false,"last_first":"Craswell, Nick","people_section":0,"alias":"nickcr"},{"type":"user_nicename","value":"Xing Xie","user_id":34906,"display_name":"Xing Xie","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/xingx\/\" aria-label=\"Visit the profile page for Xing Xie\">Xing Xie<\/a>","is_active":false,"last_first":"Xie, Xing","people_section":0,"alias":"xingx"},{"type":"user_nicename","value":"Fan Yang","user_id":31782,"display_name":"Fan Yang","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/fanyang\/\" aria-label=\"Visit the profile page for Fan Yang\">Fan Yang<\/a>","is_active":false,"last_first":"Yang, Fan","people_section":0,"alias":"fanyang"},{"type":"user_nicename","value":"Bryan Tower","user_id":36653,"display_name":"Bryan Tower","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/brtower\/\" aria-label=\"Visit the profile page for Bryan Tower\">Bryan Tower<\/a>","is_active":false,"last_first":"Tower, Bryan","people_section":0,"alias":"brtower"},{"type":"user_nicename","value":"Jason (Zengzhong) Li","user_id":40543,"display_name":"Jason (Zengzhong) Li","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jasol\/\" aria-label=\"Visit the profile page for Jason (Zengzhong) Li\">Jason (Zengzhong) Li<\/a>","is_active":false,"last_first":"Li, Jason (Zengzhong)","people_section":0,"alias":"jasol"},{"type":"user_nicename","value":"Rangan Majumder","user_id":38931,"display_name":"Rangan Majumder","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/ranganm\/\" aria-label=\"Visit the profile page for Rangan Majumder\">Rangan Majumder<\/a>","is_active":false,"last_first":"Majumder, 
Rangan","people_section":0,"alias":"ranganm"},{"type":"user_nicename","value":"Jennifer Neville","user_id":40946,"display_name":"Jennifer Neville","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jenneville\/\" aria-label=\"Visit the profile page for Jennifer Neville\">Jennifer Neville<\/a>","is_active":false,"last_first":"Neville, Jennifer","people_section":0,"alias":"jenneville"},{"type":"user_nicename","value":"Harsha Simhadri","user_id":36146,"display_name":"Harsha Simhadri","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/harshasi\/\" aria-label=\"Visit the profile page for Harsha Simhadri\">Harsha Simhadri<\/a>","is_active":false,"last_first":"Simhadri, Harsha","people_section":0,"alias":"harshasi"},{"type":"user_nicename","value":"Manik Varma","user_id":32791,"display_name":"Manik Varma","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/manik\/\" aria-label=\"Visit the profile page for Manik Varma\">Manik Varma<\/a>","is_active":false,"last_first":"Varma, Manik","people_section":0,"alias":"manik"},{"type":"user_nicename","value":"Bonnie Kruft","user_id":41919,"display_name":"Bonnie Kruft","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/bonniekruft\/\" aria-label=\"Visit the profile page for Bonnie Kruft\">Bonnie Kruft<\/a>","is_active":false,"last_first":"Kruft, Bonnie","people_section":0,"alias":"bonniekruft"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-960x540.png\" class=\"img-object-cover\" alt=\"Research Focus: May 13, 2024\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-960x540.png 960w, 
https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2024\/05\/RF41-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"May 15, 2024","formattedExcerpt":"Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code\/datasets, new hires and other milestones from across the research community at Microsoft. 
Large language models (LLMs) have shown remarkable performance in generating text similar to that created by people, proving&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1032900","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/42735"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1032900"}],"version-history":[{"count":25,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1032900\/revisions"}],"predecessor-version":[{"id":1035336,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1032900\/revisions\/1035336"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/1033017"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1032900"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1032900"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1032900"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1032900"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1032900"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-
us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1032900"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1032900"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1032900"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1032900"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1032900"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1032900"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}