{"id":503396,"date":"2018-08-31T05:57:50","date_gmt":"2018-08-31T12:57:50","guid":{"rendered":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/?p=503396"},"modified":"2020-12-11T13:01:39","modified_gmt":"2020-12-11T21:01:39","slug":"thinking-outside-of-the-black-box-of-machine-learning-on-the-long-quest-to-perfecting-automatic-speech-recognition","status":"publish","type":"post","link":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/blog\/thinking-outside-of-the-black-box-of-machine-learning-on-the-long-quest-to-perfecting-automatic-speech-recognition\/","title":{"rendered":"Thinking outside-of-the-black-box of machine learning on the long quest to perfecting automatic speech recognition"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-503399\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-1024x576.png\" alt=\"\" width=\"1024\" height=\"576\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-343x193.png 343w, 
https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement.png 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><br \/>\nSpeech recognition is something we humans do remarkably well, which includes our ability to understand speech even in noisy multi-talker environments. While our natural sophistication at this is something we take for granted, speech recognition researchers continue to pursue refinements and improvements on the frontiers of the research space of automatic speech recognition. Significant technological progress that has been made over decades has shaped automatic speech recognition technology into its current form, which is already powering various Microsoft products, including Cortana, Skype Translator, Presentation Translator, Office Dictation, HoloLens, and Azure Cognitive Services. Yet, there is still a long way to go. Particularly challenging for humans \u2013 and almost impossible for machines \u2013 is zeroing in on one speaker in a noisy multi-talker environment. 
A pair of significant recent advances in the field coming out of Microsoft\u2019s AI investments promises to get us even closer to the day in which AI speech recognition surpasses even the abilities of humans to process and understand the dynamic buzz of words in complex interactions and settings and to perhaps leverage speech in ways previously unimagined.<\/p>\n<p>In papers to be presented at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/interspeech2018.org\/index.html\">Interspeech 2018<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> in Hyderabad, India September 2-6, Microsoft AI researchers outline a pair of significant innovations in the area of overlapped speech recognition and in rethinking established methods of temporal modeling for automatic speech recognition.<\/p>\n<p>\u201c<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/recognizing-overlapped-speech-in-meetings-a-multichannel-separation-approach-using-neural-networks-2\/\">Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks<\/a>\u201d by Microsoft AI and Research researchers Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, and Fil Alleva, approaches the real-world problem of developing a far-field meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. 
While automatic speech recognition technology has made significant progress in recent years thanks to deep learning, when it comes to dealing with speech overlaps, AI still can\u2019t compete with humans, especially in the one realm where humans dominate: zeroing in on one speaker in a noisy multi-talker environment and understanding what the speaker is saying even when his or her voice is overlapped by the chatter of other speakers within earshot \u2013 what the researchers call the cocktail party problem. Current automatic speech recognition systems perform poorly when utterances of two or more speakers overlap.<\/p>\n<p>The challenges that need to be overcome include an unknown and varying number of speakers, unknown speaker identities, unknown speech activity segments, and background noise and reverberation.<\/p>\n<blockquote>\n<p style=\"text-align: center;\"><strong>\u201cSpeech separation or overlapped speech recognition is paramount for far-field conversational speech recognition. It has a wide range of potential applications, such as meeting assistance and medical dialog transcription.\u201d \u2013 Takuya Yoshioka<\/strong><\/p>\n<\/blockquote>\n<p>\u201cIn order to separate overlapped speech in real meeting audio, we have to solve two challenges in a speaker-independent fashion: overlap detection and speech separation. In our paper, we jointly addressed these problems by using a neural network and integrating it with traditional signal processing techniques in a cohesive way,\u201d explained Takuya Yoshioka.<\/p>\n<p>The team came up with a new signal processing module, the unmixing transducer, which converts multi-channel (multi-microphone-sourced) audio signals into a fixed number of separated speech streams, and implemented it using a windowed BLSTM. The team also proposed a novel neural network architecture that effectively leverages beamforming capability. 
Significant gains in meeting transcription performance were obtained, especially in multi-talker segments, compared with a state-of-the-art neural network-based beamformer. The team emphasizes that the new method makes no assumptions about either the total number of meeting attendees or their identities.<\/p>\n<p>In typical meetings, overlapped speaking segments account for 10 percent or more of the speaking time. While this is far too significant to ignore, handling it requires great care because the system must always consider the possibility of overlap. Otherwise, the system ends up inserting many redundant \u2018ghost\u2019 words between correct words.<\/p>\n<p>In this system, the unmixing transducer continuously receives microphone signals and generates a fixed number of time-synchronous audio streams. The acoustic signal of each utterance found in the input \u201cspurts\u201d from one of the output channels. When the number of active speakers is fewer than that of the outputs, the extra channels generate zero-valued signals. The signal from each output channel is segmented and transcribed by a back-end speech recognizer connected to that channel.<\/p>\n<p>Yoshioka applied the method to recordings of the team\u2019s own meetings and, to his surprise and delight, it worked well. The result was unexpected, if only because real-world overlapped speech recognition had remained a persistent challenge in the community; previous methods had been tested only in simplified laboratory settings, and none had succeeded in real-world settings. \u201cThat was the moment I decided to bet on this approach,\u201d said Yoshioka. 
The team has continued to pursue the technology actively, and performance keeps improving.<\/p>\n<p>To their knowledge, it represents the first overlapped speech recognition system that has been demonstrated to work well for actual meetings with no prior assumptions.<\/p>\n<p>\u201cSpeech separation or overlapped speech recognition is paramount for far-field conversational speech recognition,\u201d said Yoshioka. \u201cIt has a wide range of potential applications, such as meeting assistance and medical dialog transcription. As computers begin to sense the world better and get smarter, they will be able to provide us more effective assistance and help us focus on more important things.\u201d<\/p>\n<p>In the accompanying paper, \u201c<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/publication\/layer-trajectory-lstm\/\">Layer Trajectory LSTM<\/a>,\u201d Microsoft AI researcher Jinyu Li and fellow researchers Changliang Liu and Yifan Gong reassessed the potential for innovation in traditional time-based LSTM networks. Jinyu Li described his conceptual approach, saying, \u201cSometimes deep learning is treated as a black box and researchers just keep trying different model structures without taking a couple of steps back and thinking about why the models work \u2013 and what else might be possible.\u201d<\/p>\n<p>Traditional long short-term memory (LSTM) networks, a type of recurrent neural network (RNN) well suited to classifying and making predictions based on time series data such as speech, still leave room for improvement in advanced speech recognition. Traditionally, the AI takes speech and builds a layer-by-layer structure to obtain an abstraction of phonemes that better models the time-varying speech signal. What Li\u2019s team proposes in their paper is to separate the tasks of temporal modeling and phoneme classification, handling them with time-based LSTM and layer-based LSTM, respectively. 
Because every layer of a traditional time-based LSTM carries its own information, a layer-trajectory LSTM can be built to scan all of that information, rather than relying only on the top layer\u2019s time-based output as traditional LSTM models typically do. Layer Trajectory LSTM uses not just the top time-based LSTM layer but the outputs from every layer; that is, it scans along depth rather than time.<\/p>\n<blockquote>\n<p style=\"text-align: center;\"><strong>\u201cWe\u2019re excited about this breakthrough for how it significantly advances LSTM while consistently improving performance across every Microsoft speech recognition product.\u201d \u2013 Jinyu Li<\/strong><\/p>\n<\/blockquote>\n<p>\u201cEvery layer of standard time-based LSTM has its own information; we built Layer Trajectory LSTM to scan all that untapped information for our phoneme classification and prediction instead of using only the top-layer time-based LSTM.\u201d<\/p>\n<p>Like most of their peers in the space, the team had been using traditional time-based LSTM models and had performed many experiments aimed at improving performance, but improving it proved very challenging. Li had been devoting a lot of thinking to the shape of the problem and came to wonder if the real issue wasn\u2019t that LSTM relied on a single time-based LSTM block to perform two very different tasks \u2013 temporal modeling of speech signals and a layer-by-layer handling of phonemes for classification on the layer axis. What if these two very different tasks were handled by separate blocks in the model? It was a eureka moment. After implementing the idea, he observed over the next few days of model training that it yielded very good accuracy.<\/p>\n<p>With each of the two blocks now having its own assigned task and clear goal, they no longer interfered with each other, explained Li. 
\u201cWe didn\u2019t just blindly try different modeling structures; this innovation is based on very clear thinking on what kind of modeling speech recognition should use.\u201d<\/p>\n<p>How innovative is this? \u201cIt\u2019s definitely new. The insight of modeling the time sequence and phonetic classification on separate axes extends the LSTM framework in an important new dimension and has already yielded a huge (10%) improvement in quality,\u201d said Li.<\/p>\n<p>Such task decoupling makes it possible to use modeling units other than LSTM for modeling layer dependency, opening a door for flexible model design.<\/p>\n<p>\u201cThis is very good technology, not only for the meeting scenario, but for all Microsoft far-field speaker applications,\u201d said Li. \u201cCortana, Harman Kardon Invoke with Cortana by Microsoft, Skype Translator \u2013 all these products are experiencing the benefits of our research.\u201d<\/p>\n<p>At the upcoming Interspeech conference, Microsoft researchers and scientists will present many more papers, listed below. We encourage you to look for these papers and meet the people behind them in Hyderabad, September 2-6, and we look forward to seeing this knowledge applied throughout the field in the coming months.<\/p>\n<h3><strong>Microsoft @ Interspeech 2018<\/strong><\/h3>\n<p>If you are in Hyderabad, please take time to chat with us and stop by our booth at location L1. 
And be sure to check our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/aka.ms\/interspeech-2018\">Interspeech event page<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<h3>September 3<\/h3>\n<p><strong>17:50 Entity-Aware Language Model as an Unsupervised Re-ranker<\/strong><br \/>\nWe demonstrate an n-best reranking method to incorporate entity relationships from a knowledge-base into a language model without the need for difficult-to-obtain human-annotated training data for the ranker.<br \/>\nHall 1 Mohammad Sadegh Rasooli and Sarangarajan Parthasarathy<\/p>\n<h3>September 4<\/h3>\n<p><strong>10:00 HoloCompanion: An MR Friend for Everyone<\/strong><br \/>\nMR56 Annam Naresh, Rushabh Gandhi, Mallikarjuna Rao Bellamkonda, Mithun Das Gupta<\/p>\n<p><strong>10:00 Cycle-Consistent Speech Enhancement<\/strong><br \/>\nHall 4 Zhong Meng, Jinyu Li, Yifan Gong and Biing-Hwang (Fred) Juang<\/p>\n<p><strong>10:00 Effect of TTS Generated Audio on OOV Detection and Word Error Rate in ASR for Low-resource Languages<\/strong><br \/>\nMR12_1 Savitha Murthy, Dinkar Sitaram and Sunayana Sitaram<\/p>\n<p><strong>14:30 Paired Phone-Posteriors Approach to ESL Pronunciation Quality Assessment<\/strong><br \/>\nWe propose to incorporate paired phone-posteriors as input features into a neural net model for assessing an ESL learner\u2019s pronunciation quality, which improves the evaluation quality of existing methods and gives learners more effective feedback.<br \/>\nHall 4 Yujia Xiao, Frank Soong and Wenping Hu<\/p>\n<h3>September 5<\/h3>\n<p><strong>10:00 Layer Trajectory LSTM<\/strong><br \/>\nWe 
propose the layer trajectory LSTM (ltLSTM), which builds a layer-LSTM using all the layer outputs from a standard multi-layer time-LSTM. Compared with LSTM and its variants, which work layer-by-layer and time-by-time, ltLSTM joint optimization delivers a 9% error rate reduction, model design flexibility, and effective implementation.<br \/>\nHall 3 Jinyu Li, Changliang Liu and Yifan Gong<\/p>\n<p><strong>10:00 A New Glottal Neural Vocoder for Speech Synthesis<\/strong><br \/>\nWe propose a novel neural network-based vocoder for synthesis, which generates high-quality speech with good CPU cost, outperforming traditional glottal vocoders.<br \/>\nHall 4 Yang Cui, Xi Wang, Lei He and Frank K. Soong<\/p>\n<p><strong>10:00 Homophone Identification and Merging for Code-switched Speech Recognition<\/strong><br \/>\nWe propose a pronunciation-based approach to disambiguate and merge homophones in cross-transcribed multilingual text and a metric to measure authentic word error rate in code-switched speech recognition.<br \/>\nMR Brij Mohan Lal Srivastava and Sunayana Sitaram<\/p>\n<p><strong>17:00 Improved Training for Online End-to-end Speech Recognition Systems<\/strong><br \/>\nHall 4 Suyoun Kim, Michael Seltzer, Jinyu Li and Rui Zhao<\/p>\n<h3>September 6<\/h3>\n<p><strong>10:00 Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks<\/strong><br \/>\nA multi-channel neural network-based separation system is proposed. 
Unlike previous methods, which work only in \u201claboratory settings,\u201d the proposed system enables overlapped speech recognition in real unconstrained meetings.<br \/>\nHall 3 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao and Fil Alleva<\/p>\n<p><strong>10:00 Adversarial Feature-Mapping for Speech Enhancement<\/strong><br \/>\nHall 4 Zhong Meng, Jinyu Li, Yifan Gong and Biing-Hwang (Fred) Juang<\/p>\n<p><strong>10:00 What to Expect from Expected Kneser-Ney Smoothing<\/strong><br \/>\nWe describe practical extensions and applications of Kneser-Ney Smoothing on Expected Counts that allow for training of a KN LM that takes full advantage of fractional n-gram counts.<br \/>\nHall 4 Michael Levit, Sarangarajan Parthasarathy and Shuangyu Chang<\/p>\n<p><strong>16:10 Investigations on Data Augmentation and Loss Functions for Deep Learning Based Speech-Background Separation<\/strong><br \/>\nWe investigate novel SNR-based loss functions and on-the-fly data augmentation for separation of speech from background audio and, as a result, improve on the best published result on the CHiME-2 medium track database.<br \/>\nHall 1 Hakan Erdogan and Takuya Yoshioka<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Speech recognition is something we humans do remarkably well, which includes our ability to understand speech even in noisy multi-talker environments. While our natural sophistication at this is something we take for granted, speech recognition researchers continue to pursue refinements and improvements on the frontiers of the research space of automatic speech recognition. 
Significant technological [&hellip;]<\/p>\n","protected":false},"author":37074,"featured_media":503399,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Takuya Yoshioka","user_id":"36278"},{"type":"user_nicename","value":"Zhuo Chen","user_id":"38589"},{"type":"user_nicename","value":"Xiong Xiao","user_id":"38778"},{"type":"user_nicename","value":"Jinyu Li","user_id":"32312"},{"type":"user_nicename","value":"Yifan Gong","user_id":"34994"}],"msr_hide_image_in_river":0,"footnotes":""},"categories":[194481,194456,194462],"tags":[],"research-area":[13556,13545,13554],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-503396","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-human-centered-computing","category-natural-language-processing","category-speech-and-dialog","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-research-area-human-computer-interaction","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Jinyu Li","user_id":32312,"display_name":"Jinyu Li","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/jinyli\/\" aria-label=\"Visit the profile page for Jinyu Li\">Jinyu Li<\/a>","is_active":false,"last_first":"Li, 
Jinyu","people_section":0,"alias":"jinyli"},{"type":"user_nicename","value":"Yifan Gong","user_id":34994,"display_name":"Yifan Gong","author_link":"<a href=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/people\/ygong\/\" aria-label=\"Visit the profile page for Yifan Gong\">Yifan Gong<\/a>","is_active":false,"last_first":"Gong, Yifan","people_section":0,"alias":"ygong"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement.png\" class=\"img-object-cover\" alt=\"\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement.png 1400w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-300x169.png 300w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-768x432.png 768w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-1024x576.png 1024w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-1066x600.png 1066w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-655x368.png 655w, https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-content\/uploads\/2018\/08\/01_SCS-MS-Research_20180821_1400x788_Site-Placement-343x193.png 343w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"August 31, 2018","formattedExcerpt":"Speech recognition is something we humans do remarkably well, which includes our ability to understand speech even in noisy multi-talker environments. 
While our natural sophistication at this is something we take for granted, speech recognition researchers continue to pursue refinements and improvements on the frontiers&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/503396","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/37074"}],"replies":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=503396"}],"version-history":[{"count":9,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/503396\/revisions"}],"predecessor-version":[{"id":712363,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/503396\/revisions\/712363"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/503399"}],"wp:attachment":[{"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=503396"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=503396"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=503396"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=503396"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=503396"},{"taxonomy":"msr-event-type","embeddable":true,"href":"ht
tps:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=503396"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=503396"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=503396"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=503396"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=503396"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/cm-edgetun.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=503396"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}