Advances in Natural Language Generation for Indian Languages
Much of the recent progress in natural language generation (NLG) has been in the context of English and, more generally, high-resource languages. However, Indian languages have yet to see similar paradigm shifts despite their speaking…
MInference: Million-Tokens Prompt Inference for Long-context LLMs
MInference 1.0 leverages the dynamic sparse nature of LLMs’ attention, which exhibits some static patterns, to speed up the pre-filling for long-context LLMs. It first determines offline which sparse pattern…
EmoCtrl-TTS
Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech. EmoCtrl-TTS is an emotion-controllable zero-shot TTS system that can generate highly emotional speech with non-verbal vocalizations such as laughter and crying for any speaker. EmoCtrl-TTS is purely a…
Research Focus: Week of June 24, 2024
In this issue: RENC makes 5G vRAN servers more energy efficient; CoExplorer uses AI to keep video meetings on track; Automatic bug detection in LLM-powered text-based games; MAIRA-2: Grounded radiology report generation.
E2 TTS
Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS. E2 TTS (Embarrassingly Easy TTS) is a fully non-autoregressive zero-shot text-to-speech (TTS) system capable of generating the voice of any speaker. Despite its extremely simple model architecture and training…
Making Sentence Embeddings Robust to User-Generated Content
This seminar was hosted by Microsoft Research Africa, Nairobi, together with the Microsoft AI for Good team in May 2024. User-generated content (UGC), e.g., social media posts written in “Internet language”, presents a lot of…
Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
Daniela Massiceti delves into the transformative potential of multimodal models such as CLIP for assistive technologies. Specifically focusing on the blind/low-vision community, the talk explores the current distance from realizing this potential and the advancements…