Microsoft Research Social Media Conversation Corpus

A collection of 12,696 Tweet Ids representing 4,232 three-step conversational snippets extracted from Twitter logs. Last published: June 1, 2015.

Download
  • Version: 1.0

    Date Published: 15/07/2024

    File Name: MSRSocialMediaConversationCorpus.zip

    File Size: 108.7 KB

    Each row in the dataset represents a single context-message-response triple that has been evaluated by crowdsourced annotators as scoring an average of 4 or higher on a 5-point Likert scale measuring the quality of the response in context. The data has been randomly binned into tuning (development) and test sets, comprising 2,118 and 2,114 triples respectively.

    The dataset is released to the natural language processing community for academic research purposes only. To access the underlying tweets and related metadata, you will need to call the Twitter API.

    If you use this material in your research, we ask that you cite the following paper: Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan, A Neural Network Approach to Context-Sensitive Generation of Conversational Responses, Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL-HLT 2015), June 2015.

    Additional information about this and related projects may be found at http://research.microsoft.com/en-us/projects/convo/.
  • Supported Operating Systems

    Windows 10, Windows 7, Windows 8

    • Click Download and follow the instructions.
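
The description above says each row is a context-message-response triple of Tweet Ids, split into tuning and test sets. This page does not document the file layout inside the archive, so the following Python sketch assumes, purely for illustration, that each row holds three tab-separated Tweet Ids in conversation order. The names Triple, parse_triples, and unique_tweet_ids are hypothetical helpers, not part of the release. Check the delimiter and column order against the downloaded files before relying on it.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    """One conversational snippet: three Tweet Ids in conversation order."""
    context_id: str
    message_id: str
    response_id: str


def parse_triples(lines: Iterable[str]) -> List[Triple]:
    """Parse rows of three tab-separated Tweet Ids.

    ASSUMPTION: the tab-separated, three-column layout is a guess;
    the page does not specify the actual file format.
    """
    triples = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        # Ids are kept as strings so they can be passed on unchanged.
        triples.append(Triple(*fields))
    return triples


def unique_tweet_ids(triples: Iterable[Triple]) -> Set[str]:
    """Collect every distinct Tweet Id that would need to be looked up."""
    return {tweet_id for triple in triples for tweet_id in triple}


# Sanity check on the counts stated in the description.
TUNING_TRIPLES = 2118
TEST_TRIPLES = 2114
assert TUNING_TRIPLES + TEST_TRIPLES == 4232
assert (TUNING_TRIPLES + TEST_TRIPLES) * 3 == 12696
```

The set returned by unique_tweet_ids is what would be handed to a Twitter API client to retrieve the tweet text and metadata. That retrieval step is left out here because it depends on the API version and credentials in use.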