Cross-document Event Coreference Resolution based on Cross-media Features

  • Tongtao Zhang
  • Hongzhi Li
  • Heng Ji
  • Shih-Fu Chang

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Published by Association for Computational Linguistics

In this paper, we focus on a new problem: event coreference resolution across television news videos. Based on the observation that content from multiple data modalities is complementary, we develop a novel approach to jointly encode effective features from both closed captions and video key frames. Experimental results demonstrate that visual features provide a 7.2% absolute F-score gain over state-of-the-art text-based event extraction and coreference resolution.
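
To make the idea of jointly encoding closed-caption and key-frame features concrete, the following is a minimal sketch and not the authors' actual model. It assumes each event mention is a dictionary with a hypothetical `trigger` string, a `caption_vec` text embedding, and a `frame_vec` visual embedding; the feature set, weights, and threshold are illustrative assumptions chosen only to show how the two modalities could contribute to one pairwise coreference decision.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def joint_features(m1, m2):
    """Build one pairwise feature vector from text and visual cues.

    The three features below are hypothetical stand-ins, not the paper's features.
    """
    return [
        1.0 if m1["trigger"] == m2["trigger"] else 0.0,  # text: same trigger word
        cosine(m1["caption_vec"], m2["caption_vec"]),    # text: caption similarity
        cosine(m1["frame_vec"], m2["frame_vec"]),        # visual: key-frame similarity
    ]


def is_coreferent(m1, m2, weights=(0.3, 0.3, 0.4), threshold=0.5):
    """Weighted sum of the joint features compared against an assumed threshold."""
    score = sum(w * f for w, f in zip(weights, joint_features(m1, m2)))
    return score >= threshold


# Two mentions that agree in both modalities, and one that agrees in neither.
a = {"trigger": "attack", "caption_vec": [1.0, 0.0], "frame_vec": [1.0, 0.0]}
b = {"trigger": "attack", "caption_vec": [1.0, 0.0], "frame_vec": [1.0, 0.0]}
c = {"trigger": "election", "caption_vec": [0.0, 1.0], "frame_vec": [0.0, 1.0]}
same_event = is_coreferent(a, b)
different_event = is_coreferent(a, c)
```

In this sketch the visual similarity is just one more entry in the feature vector, so a text-only baseline corresponds to setting its weight to zero; a learned classifier would normally replace the fixed weights.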