{"id":748432,"date":"2021-05-25T17:47:27","date_gmt":"2021-05-26T00:47:27","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?post_type=msr-blog-post&#038;p=748432"},"modified":"2021-05-25T17:47:27","modified_gmt":"2021-05-26T00:47:27","slug":"when-does-text-prediction-benefit-from-additional-context-an-exploration-of-contextual-signals-for-chat-and-email-messages","status":"publish","type":"msr-blog-post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/articles\/when-does-text-prediction-benefit-from-additional-context-an-exploration-of-contextual-signals-for-chat-and-email-messages\/","title":{"rendered":"When does text prediction benefit from additional context? An exploration of contextual signals for chat and email messages"},"content":{"rendered":"<p><b>By <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/sttrajan\/\" target=\"_blank\" rel=\"noopener\">Stojan Trajanovski<\/a>, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/chatalla\/\" target=\"_blank\" rel=\"noopener\">Chad Atalla<\/a>, Kunho Kim, Vipul Agarwal, <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/milads\/\" target=\"_blank\" rel=\"noopener\">Milad Shokouhi<\/a>, and <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/chrisq\/\" target=\"_blank\" rel=\"noopener\">Chris Quirk<\/a><\/b><\/p>\n<p>Email and chat communication tools are increasingly important for completing daily professional and personal tasks. Given the recent pandemic and shift to remote work, this usage has surged. 
The number of daily active users in Microsoft Teams, the largest business communication and chat platform, has increased from <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/microsoft-365\/blog\/2019\/11\/19\/5-attributes-successful-teams\/\" target=\"_blank\" rel=\"noopener\">20 million<\/a> in pre-pandemic 2019 to <a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/microsoft-365\/blog\/2020\/10\/28\/microsoft-teams-reaches-115-million-dau-plus-a-new-daily-collaboration-minutes-metric-for-microsoft-365\/\" target=\"_blank\" rel=\"noopener\">more than 115 million<\/a> in October 2020 and 145 million in April 2021. Email, meanwhile, remains the main channel for formal communication, and its usage keeps growing. Text prediction is the task of providing real-time suggestions for word or phrase auto-completions. Accurate, low-latency predictions make these communications more efficient. Text prediction services have been deployed across popular communication tools and platforms such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/insider.office.com\/en-us\/blog\/text-predictions-in-word-outlook\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft Outlook Text Predictions<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> or <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/doi.org\/10.1145\/3292500.3330723\" target=\"_blank\" rel=\"noopener noreferrer\">Gmail Smart Compose<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> <a id=\"r1\" href=\"#fn1\">[1]<\/a>.<\/p>\n<p>Modern text prediction algorithms are based on large language models and generally rely on the prefix of a message (the characters typed up to the cursor position) to create predictions. 
We study to what extent <em><b>additional contextual signals<\/b><\/em> improve text predictions for chat and email messages on two of the largest commercial communication platforms, Microsoft Teams and Outlook.<br \/>\nWe examine several signals accompanying the main message: <em>composition time, subject,<\/em> and <em>previous messages<\/em> (see the table below).<\/p>\n<table class=\"aligncenter\" style=\"border-collapse: collapse;border-spacing: inherit\" border=\"1\">\n<tbody>\n<tr>\n<td style=\"width: 264px;padding: inherit;border: 1px solid\"><strong>Contextual signal<\/strong><\/td>\n<td style=\"width: 1124px;padding: inherit;border: 1px solid\"><strong>Details<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 264px;padding: inherit;border: 1px solid\"><em>Composition time<\/em><\/td>\n<td style=\"width: 1124px;padding: inherit;border: 1px solid\">The time at which a message is composed can add value for text prediction by enabling suggestions of relevant date-time words such as &#8220;weekend&#8221; or &#8220;tonight&#8221;.<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 264px;padding: inherit;border: 1px solid\"><em>Subject<\/em><\/td>\n<td style=\"width: 1124px;padding: inherit;border: 1px solid\">Message subjects often state the purpose of a message or summarize its content. In the email scenario, we use <em>subject<\/em> as context. 
In the chat scenario, we use the <em>chat window name<\/em> as a proxy for the subject.<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 264px;padding: inherit;border: 1px solid\"><em>Previous email<\/em><\/td>\n<td style=\"width: 1124px;padding: inherit;border: 1px solid\">Previous messages can provide valuable background information that influences the text of the message being composed. In the email case, we create pairs of messages and replies.<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 264px;padding: inherit;border: 1px solid\"><em>Previous chat messages<\/em><\/td>\n<td style=\"width: 1124px;padding: inherit;border: 1px solid\">Prior message contextualization for the chat scenario is much more complex. Chat conversations typically consist of many small messages sent in quick succession.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We combine and encode these signals with the message body into a single &#8220;contextualized&#8221; string for the language model, using <em>special tokens<\/em> to separate the individual signals (Figure 1).<\/p>\n<div id=\"attachment_748447\" style=\"width: 573px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-748447\" class=\"size-full wp-image-748447\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog1.png\" alt=\"Context extraction and encoding\" width=\"563\" height=\"351\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog1.png 563w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog1-300x187.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog1-16x10.png 16w\" sizes=\"auto, (max-width: 563px) 100vw, 563px\" \/><p id=\"caption-attachment-748447\" class=\"wp-caption-text\">Figure 1. 
<em>Context extraction and encoding.<\/em><\/p><\/div>\n<p>We segment chat histories by <em>message blocks<\/em> and <em>time windows<\/em>. A series of uninterrupted messages sent by one sender is considered as a single <em>message block<\/em>. Messages sent within the past <em>N<\/em> minutes are within a <em>time window<\/em>, which enforces recency as a proxy for relevance. We define three previous message context aggregation modes in the chat scenario (visualized in Figure 2), mimicking prior email context:<\/p>\n<ul>\n<li><em>Ignore-Blocks<\/em>: chat messages from the <em>current sender<\/em>, in the past <em>N<\/em> minutes (e.g., 2, 5, 10 minutes), ignoring any message block boundaries.<\/li>\n<li><em>Respect-Blocks<\/em>: chat messages from the <em>current sender<\/em>, in the past <em>N<\/em> minutes, confined to the <em>most recent message block<\/em>.<\/li>\n<li><em>Both-Senders<\/em>: chat messages from <em>both senders<\/em>, in the past <em>N<\/em> minutes. When the sender turn changes, strings are separated by a space or a special token.<\/li>\n<\/ul>\n<div id=\"attachment_748444\" style=\"width: 946px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-748444\" class=\"size-full wp-image-748444\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog2.png\" alt=\"chat window \" width=\"936\" height=\"357\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog2.png 936w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog2-300x114.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog2-768x293.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog2-16x6.png 16w\" sizes=\"auto, (max-width: 936px) 100vw, 936px\" \/><p id=\"caption-attachment-748444\" 
class=\"wp-caption-text\">Figure 2. <em>Aggregating a 5 min prior chat window in various context modes.<\/em><\/p><\/div>\n<p>For example, the 2-minute <em>Both-Senders<\/em> mode and the 5-minute <em>Ignore-Blocks<\/em> mode aggregate a similar amount of context: 2.5 chat messages on average, with 56-59% of chat messages having at least one message as context. Given the email and chat message length statistics in Figure 3, we expect chat messages to be about 10 times shorter than emails. Specifically, a statistical analysis of chat message lengths (see Figure 3, blue box) shows a mean of 9.15 tokens and a median of 6 tokens, whereas emails (see Figure 3, green box) have a mean of 94 tokens and a median of 53 tokens. We therefore limit chat histories to 20 messages, which is roughly equivalent to an email-reply pair in length.<\/p>\n<p style=\"text-align: center\"><b>\u201c1 email formal content = 10 x chat (informal) messages\u201d<\/b><\/p>\n<div id=\"attachment_748453\" style=\"width: 660px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-748453\" class=\"size-full wp-image-748453\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog3.png\" alt=\"chart, box and whisker chart\" width=\"650\" height=\"296\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog3.png 650w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog3-300x137.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2021\/05\/ContextBlog3-16x7.png 16w\" sizes=\"auto, (max-width: 650px) 100vw, 650px\" \/><p id=\"caption-attachment-748453\" class=\"wp-caption-text\">Figure 3. 
<em>Box-plot statistics for message aggregation in Teams and Outlook.<\/em><\/p><\/div>\n<p>For the ethical considerations behind how we process data, including multiple privacy precautions in accordance with the General Data Protection Regulation (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/ec.europa.eu\/info\/law\/law-topic\/data-protection\/eu-data-protection-rules_en\" target=\"_blank\" rel=\"noopener noreferrer\">GDPR<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>) and \u201cfair block-listing\u201d of denigrative, offensive, controversial, sensitive, and stereotype-prone words and phrases, check out our NAACL paper <a id=\"r2\" href=\"#fn2\">[2]<\/a>.<\/p>\n<p><b>Results.<\/b> Previous message contextualization leads to significant gains for chat messages from Microsoft Teams when an appropriate message aggregation strategy is used. By using a 5-minute time window and messages from both senders, we see a <b>9.4%<\/b> relative increase in the match rate<sup><a id=\"r3\" href=\"#fn3\">1<\/a><\/sup>, and an <b>18.6%<\/b> relative gain in estimated characters accepted. This 5-minute window of prior messages from both senders outperforms the corresponding 2- and 10-minute window configurations. Chat messages are often short and can lack context about a train of thought; thus, an appropriate number of previous messages can give the model the semantics it needs to make a correct prediction. Benefits are comparatively insignificant for subject and composition time as contextual signals in chat messages. In the email scenario based on Microsoft Outlook, we find that time as a contextual signal yields the largest boost, with a 2% relative increase in the match rate, while subject only helps in conjunction with time, and prior messages yield no improvement. We conclude that the different characteristics of chat and email messages impede domain transfer. 
The best contextual text prediction models are custom-trained for each scenario, using the most impactful subset of contextual signals. Future work involves exploring different encodings for contextual signals, such as utilizing hierarchical RNNs to better capture context, or using more advanced architectures such as transformers, generative models or GPT-3.<\/p>\n<h1>References<\/h1>\n<p id=\"fn1\"><a href=\"#r1\">[1]<\/a> M. X. Chen, B. N. Lee, G. Bansal, Y. Cao, S. Zhang, J. Lu, J. Tsay, Y. Wang, A. M. Dai, Z. Chen et al. (2019) <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/doi.org\/10.1145\/3292500.3330723\" target=\"_blank\" rel=\"noopener noreferrer\">Gmail Smart Compose: Real-time assisted writing<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, In Proc. of the 25th ACM SIGKDD Intl. Conf. on Knowledge Discovery &amp; Data Mining, pp. 2287\u20132295.<\/p>\n<p id=\"fn2\"><a href=\"#r2\">[2]<\/a> S. Trajanovski, C. Atalla, K. Kim, V. Agarwal, M. Shokouhi, and C. Quirk (2021) <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.aclweb.org\/anthology\/2021.naacl-industry.1\/\" target=\"_blank\" rel=\"noopener noreferrer\">When does text prediction benefit from additional context? An exploration of contextual signals for chat and email messages<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, In Proc. of NAACL-HLT (Annual Conf. 
of the North American Chapter of the Association for Computational Linguistics &#8211; Industry track papers).<\/p>\n<hr \/>\n<p id=\"fn3\"><a href=\"#r3\"><sup>1<\/sup><\/a>The ratio of the number of matched suggestions and the total number of generated suggestions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Stojan Trajanovski, Chad Atalla, Kunho Kim, Vipul Agarwal, Milad Shokouhi, and Chris Quirk Email and chat communication tools are increasingly important for completing daily professional and personal tasks. Given the recent pandemic and shift to remote work, this usage has surged. The number of daily active users in Microsoft Teams, the largest business communication [&hellip;]<\/p>\n","protected":false},"author":39000,"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-content-parent":644373,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-locale":[268875],"msr-post-option":[],"class_list":["post-748432","msr-blog-post","type-msr-blog-post","status-publish","hentry","msr-locale-en_us"],"msr_assoc_parent":{"id":644373,"type":"group"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/748432","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-blog-post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/39000"}],"version-history":[{"count":37,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/748432\/revisions"}],"predecessor-version":[{"id":748666,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/748432\/revisions\/74
8666"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=748432"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=748432"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=748432"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=748432"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}