Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models
Publication Table Cell Search for Question Answering Huan Sun, Hao Ma, Xiaodong He, Scott Wen-tau Yih, Yu Su, Xifeng Yan Proceedings of the companion publication of the 25th international conference on World Wide Web | April 2016
Publication Speaker-aware Training of LSTM-RNNs for Acoustic Modelling Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, Yu Zhang April 2016
Publication Multimodal Learning for Image Captioning and Visual Question Answering (talk at UC Berkeley, BVLC) Xiaodong He MSR-TR-2016-16 | April 2016
Publication Prediction-Adaption-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition Yu Zhang, Ekapol Chuangsuwanich, James Glass, Dong Yu April 2016
Publication Integrated Adaptation with Multi-Factor Joint-Learning for Far-field Speech Recognition Yanmin Qian, Tian Tan, Dong Yu, Yu Zhang April 2016
Publication The Dialog State Tracking Challenge Series: A Review Jason Williams, Antoine Raux, Matthew Henderson Dialogue & Discourse | April 2016
Publication Highway Long Short-term Memory RNNs for Distant Speech Recognition Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James Glass April 2016
Publication Deep Beamforming Networks for Multi-Channel Speech Recognition Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John Hershey, Mike Seltzer, Guoguo Chen, Yu Zhang, Michael Mandel, Dong Yu April 2016
Publication An Investigation into Using Parallel Data for Far-Field Speech Recognition Yanmin Qian, Tian Tan, Dong Yu April 2016
Publication Sitka: a collaboration between type design and science Kevin Larson, Matthew Carter Digital Fonts and Reading | Published by World Scientific | 2016