Below is an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
AirSim Simulator
AirSim is a high-fidelity, extensible simulation platform for data generation, algorithm testing, and reinforcement learning in the development of autonomous agents.
Graph-based code modeling toolkit
A toolkit for reasoning about source code (tasks related to program understanding, synthesis, and verification) using graph neural networks. Developed in partnership with MSR Cambridge. Used by several ongoing projects both inside and outside MSR.
Dataset for Learning Karel Programs
A synthetic dataset of visual programs for the program synthesis task, now a common benchmark in the academic community. This webpage hosts the dataset for synthetically generated Karel programs that are used for training and…
Microsoft Program Synthesis using Examples SDK
A framework/SDK for program synthesis from input-output examples, with pre-built applications for data wrangling, Jupyter integration for data scientists, repetitive code editing, text manipulations, and extraction from webpages. Started at MSR, now developed by a…
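To make the idea of synthesis from input-output examples concrete, here is a minimal sketch in Python. It does not use the SDK or reflect its API or algorithms; it assumes a tiny, hypothetical DSL of string operations (`PRIMITIVES`) and a brute-force enumerative search (`synthesize`), both invented for illustration.

```python
from itertools import product

# Hypothetical mini-DSL of string transformations, invented for this sketch.
# It is not part of any real SDK.
PRIMITIVES = {
    "lower": str.lower,
    "upper": str.upper,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
}


def run(program, text):
    """Apply each named primitive in order to the input text."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples, max_length=3):
    """Return the first (shortest) program consistent with every example.

    Enumerates all primitive sequences up to max_length and returns None
    if no sequence reproduces all the given input-output pairs.
    """
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


# Two examples are enough to pin down "upper-case the last word" in this DSL.
examples = [("Ada Lovelace", "LOVELACE"), ("Alan Turing", "TURING")]
program = synthesize(examples)
print(program)                       # ('upper', 'last_word')
print(run(program, "Grace Hopper"))  # HOPPER
```

Exhaustive enumeration only works for a DSL this small; practical synthesizers rely on far more selective search and ranking, which this sketch does not attempt to show.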
Vowpal Wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. There is a specific focus on reinforcement learning…
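Two of the techniques named above, feature hashing and online learning, can be sketched in a few lines of plain Python. This is an illustrative toy, not Vowpal Wabbit's implementation or API: the class `OnlineLinearModel`, the helper `bucket`, the use of CRC32 as the hash, and the squared-loss update are all assumptions made for the example.

```python
import zlib

NUM_BITS = 10                 # assumed size of the hashed weight table (2**10 slots)
NUM_BUCKETS = 1 << NUM_BITS


def bucket(feature):
    """Map a feature name to a weight index with a deterministic hash."""
    return zlib.crc32(feature.encode("utf-8")) & (NUM_BUCKETS - 1)


class OnlineLinearModel:
    """Linear model trained one example at a time with squared loss."""

    def __init__(self, learning_rate=0.1):
        self.weights = [0.0] * NUM_BUCKETS
        self.learning_rate = learning_rate

    def predict(self, features):
        # features maps feature name -> numeric value
        return sum(self.weights[bucket(f)] * v for f, v in features.items())

    def learn(self, features, label):
        """Take one gradient step and return the loss measured before it."""
        error = self.predict(features) - label
        for f, v in features.items():
            self.weights[bucket(f)] -= self.learning_rate * error * v
        return error ** 2


model = OnlineLinearModel()
example = {"word=good": 1.0}
for _ in range(100):
    model.learn(example, 1.0)  # stream the same example repeatedly
print(round(model.predict(example), 3))
```

Because feature names are hashed straight into a fixed-size weight table, no feature dictionary has to be built or stored, and distinct features may occasionally collide in the same slot.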
Vale
Vale (Verified Assembly Language for Everest) is a tool for constructing formally verified high-performance assembly language code, with an emphasis on cryptographic code. It uses existing verification frameworks, such as Dafny and F*, for formal…
EverCrypt
EverCrypt is a high-performance, cross-platform, formally verified modern cryptographic provider distributed as a combined C/ASM library. EverCrypt packages cryptographic implementations from the HACL* and ValeCrypt projects, and automatically picks the fastest…
MBML Book Sample Code
Supporting code for the Model-Based Machine Learning book. This project contains the sample code and test data for the freely available online book on model-based machine learning published at http://mbmlbook.com/. The…