Research Tools: code, datasets, & models

Tool

CodePlan

CodePlan is a research project that formalizes repository-level coding tasks as planning problems and uses static analysis and large language models (LLMs) to solve them. This replication package is for the paper titled “CodePlan: Repository-level…

GitHub

Tool

Algorithms to Handle Signal Delay in Deep Reinforcement Learning

Algorithms to Handle Signal Delay in Deep Reinforcement Learning aims to address the problem of signal delay in continuous robotic control. Signal delay occurs when there is a lag between an agent’s perception of the…

GitHub Publication

Tool

HoloAssist

A large-scale egocentric human interaction dataset, where two people collaboratively complete physical manipulation tasks.

GitHub Publication

Tool

Orca-2-13B

Orca 2 is a fine-tuned version of Llama 2. It is built for research purposes only and provides single-turn responses in tasks such as reasoning over user-provided data, reading comprehension, math problem solving…

Access Publication

Tool

Orca-2-7B

Orca 2 is a fine-tuned version of Llama 2. It is built for research purposes only and provides single-turn responses in tasks such as reasoning over user-provided data, reading comprehension, math problem solving…

Access Publication

Tool

LLF-Bench

LLF-Bench is a benchmark for evaluating learning agents. It provides a diverse collection of interactive learning problems in which the agent receives language feedback instead of rewards (as in RL) or action feedback (as in…

GitHub Publication

Tool

Fifty Shades of Bias

This repo contains the data and code for the EMNLP’23 paper “‘Fifty Shades of Bias’: Normative Ratings of Gender Bias in GPT Generated English Text”.

GitHub Publication

Tool

Tyger

Tyger is a framework for remote signal processing. It enables reliable transmission of data to remote computational resources, where the data can be processed and transformed as it streams in. It was designed for streaming…

GitHub

Tool

Skeleton-of-Thought (SoT)

This work aims to reduce the end-to-end generation latency of large language models (LLMs). One major cause of high generation latency is the sequential decoding approach adopted by almost all state-of-the-art LLMs.…

GitHub

Tool

SatCLIP

PyTorch implementation, dataset, and pretrained models for the paper “SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery”.

GitHub Publication