Microsoft Speech Corpus (Indian languages)

This dataset contains conversational and phrasal speech training and test data for the Telugu, Tamil and Gujarati languages.

  • Version: 1.0
    Date Published: 15/07/2024
    File Name: microsoftspeechcorpusindianlanguages.zip
    File Size: 12.3 GB

    The Microsoft Speech Corpus (Indian languages) release contains conversational and phrasal speech training and test data for the Telugu, Tamil and Gujarati languages. The data package includes audio and the corresponding transcripts. The data in this dataset may be used solely for research purposes and shall not be used for commercial purposes. If you publish your findings, you must provide the following attribution: “Data provided by Microsoft and SpeechOcean.com”.
  • Supported Operating Systems

    Windows 7, Windows 8, Windows 10, Windows 11

    • Click Download and follow the instructions.
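
Once the archive is downloaded, it can help to check what it holds before extracting 12.3 GB. The following is a minimal Python sketch that counts archive members by file extension without unpacking anything. The helper name `summarize_archive` is hypothetical, the local path assumes the file was saved under the file name listed above in the current directory, and nothing here assumes a particular folder layout or audio format inside the package, since this page does not describe one.

```python
import os
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count archive members by file extension without extracting anything."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            # Zip member names always use "/" separators, hence PurePosixPath.
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return counts


if __name__ == "__main__":
    # Assumed location: the file name from this page, saved in the current directory.
    path = "microsoftspeechcorpusindianlanguages.zip"
    if os.path.exists(path):
        for suffix, count in sorted(summarize_archive(path).items()):
            print(f"{suffix}\t{count}")
    else:
        print(f"{path} not found; adjust the path to where the download was saved.")
```

The per-extension counts give a quick view of how many audio and transcript files are present, which is one way to sanity-check that the download completed before committing disk space to a full extraction.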