
Microsoft Research Asia Chinese Word-Segmentation Data Set

A set of manually annotated Chinese word-segmentation data and specifications for training and testing a Chinese word-segmentation system for research purposes. Last published: August 16, 2007.


Download
  • Version: 1.0
    Date Published: 7/15/2024
    File Name: msra-chinese-word-segmentation-data-v1.zip
    File Size: 4.4 MB

    The data was extracted from the People's Daily, which we have licensed for commercial usage. The annotation was done by the Natural Language Computing group within Microsoft Research Asia.
  • Supported Operating Systems

    Windows 10, Windows 7, Windows 8

    • Click Download and follow the instructions.
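
Because the data set is intended for training and testing a word-segmentation system, a common way to use the annotated text is to score a segmenter's output against it. The following is a minimal sketch, not part of the download. It assumes each annotated line separates words with whitespace (the file layout inside the archive is not described on this page), and the function names `word_spans` and `score_segmentation` are hypothetical.

```python
def word_spans(words):
    """Map a list of words to the set of (start, end) character spans they cover."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(gold_line, predicted_line):
    """Return (precision, recall, f1) for one sentence.

    Assumption: both lines hold the same characters, with words
    separated by whitespace. A predicted word counts as correct only
    if both of its boundaries match a word in the annotation.
    """
    gold = word_spans(gold_line.split())
    predicted = word_spans(predicted_line.split())
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if correct == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative sentence, not taken from the data set.
gold_example = "我 喜欢 自然 语言"
predicted_example = "我 喜欢 自然语言"
precision, recall, f1 = score_segmentation(gold_example, predicted_example)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Comparing character spans rather than word strings keeps a repeated word from being credited at the wrong position in the sentence.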