Below is an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
CodeXGLUE
CodeXGLUE is a benchmark dataset and open challenge for code intelligence. It includes a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark…
MPNet
MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-training method for language understanding tasks. It solves the problems of MLM (masked…
Pytorch-Wildlife
At the core of our mission is the desire to create a harmonious space where conservation scientists from all over the globe can unite, where they’re able to share, grow, use datasets and deep learning…
Microsoft AI for Earth Species Classification API
This project contains the training code for the Microsoft AI for Earth Species Classification API, along with the code for our API demo page. This API classifies handheld photos of around 5000…
AI for Earth Engineering and Data Science
After developing an algorithm or machine learning model, researchers face the problem of deploying their model for others to consume, integrating it with data sources, securing its access, and keeping it current. Due to these…
AI for Earth – Creating APIs
These images and examples are meant to illustrate how to build containers for use in the AI for Earth API system.
Multi-species Bioacoustic Classification
Multi-species bioacoustic classification using deep learning algorithms. With audio recordings collected from rainforests in Puerto Rico, we build a deep learning model that combines transfer learning and pseudo-labeling as a data augmentation technique to: 1)…
InnerEye – Deep Learning
This is a deep learning toolbox to train models on medical images (or more generally, 3D images). It integrates seamlessly with cloud computing in Azure.