Publication Towards More Unified In-context Visual Understanding Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu CVPR 2024 | December 2023
Publication MM-Reasoner: A Multi-Modal Knowledge-Aware Framework for Knowledge-Based Visual Question Answering Mahmoud Khademi, Ziyi Yang, Felipe Vieira Frujeri, Chenguang Zhu 2023 Empirical Methods in Natural Language Processing | December 2023
Publication Datasets and Foundation Models for Landsat Imagery Adam J. Stewart, Nils Lehmann, Isaac A. Corley, Yi Wang, Yi-Chia Chang, Nassim Ait Ali Braham, Shradha Sehgal, Caleb Robinson, Arindam Banerjee Advances in Neural Information Processing Systems (NeurIPS 2023) | December 2023
Publication MuRF: Multi-Baseline Radiance Fields Haofei Xu, Anpei Chen, Yuedong Chen, Christos Sakaridis, Yulun Zhang, Marc Pollefeys, Andreas Geiger, Fisher Yu CVPR 2024 | December 2023
Publication Segment and Caption Anything Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu CVPR 2024 | November 2023
Publication MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation Yanhui Wang, Jianmin Bao, Wenming Weng, Ruoyu Feng, Dacheng Yin, Tao Yang, Jingxu Zhang, Qi Dai, Chunyu Wang, Kai Qiu, Yuhui Yuan, Xiaoyan Sun, Chong Luo, Baining Guo CVPR 2024 | November 2023
Publication MotionEditor: Editing Video Motion via Content-Aware Diffusion Shuyuan Tu, Qi Dai, Zhi-Qi Cheng, Hang-Rui Hu, Xintong Han, Zuxuan Wu, Yu-Gang Jiang CVPR 2024 | November 2023
Publication Semantic constraints to represent common sense required in household actions for multimodal learning-from-observation robot Katsushi Ikeuchi, Naoki Wake, Kazuhiro Sasabuchi, Jun Takamatsu Int. J. Robotics Res. | November 2023, Vol 43(2): pp. 134-170
Publication MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning Chaoyi Zhang, Kevin Lin, Zhengyuan Yang, Jianfeng Wang, Linjie Li, Chung-Ching Lin, Zicheng Liu, Lijuan Wang CVPR 2024 | November 2023
Publication OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Neng H. Yu CVPR 2024 | November 2023