MagenticLite: A full-stack agentic experience powered by Small Models
- Harkirat Behl, Microsoft; Weili Shi, Microsoft; Hussein Mozannar, Microsoft
- Microsoft Research Forum | Season 2, Episode 4
What if you could run a capable AI agent without leaning on frontier-scale models? MagenticLite is the next generation of Magentic-UI, an agentic experience reimagined and optimized for small language models. It works across both your browser and your local file system in a single workflow, keeping you in the driver’s seat at every step. In this session, we’ll demo MagenticLite in action and deep dive into the two models powering it: MagenticBrain for planning, coding, and delegation, and Fara-1.5-9B for browser use.
Explore more
- Fara-1.5 and MagenticBrain coming soon to Microsoft Foundry
Transcript
[MUSIC]
[MUSIC FADES INTO SWEEPING SOUND]
YASH LARA: Today, I’m excited to announce a new release from the Microsoft Research AI Frontiers Lab, MagenticLite.
MagenticLite is the next generation of Magentic-UI. It works across both the browser and your local file system in a single workflow, keeping you in the driver’s seat at every step.
Along with MagenticLite, we’re also releasing the two models powering it. MagenticLite is a result of deep collaboration across AI Frontiers—spanning model development, agentic systems, agentic harnesses, and UX. Together, the team has created an end-to-end integration throughout the agentic stack.
We’ll now hear from our amazing colleagues, Weili, Harkirat, and Hussein, who will walk you through MagenticLite and the models, and demo it live.
[MUSIC]
[MUSIC FADES INTO SWEEPING SOUND]
WEILI SHI: Hi, I’m Weili.
HUSSEIN MOZANNAR: I’m Hussein.
HARKIRAT BEHL: And I’m Harkirat.
We are all members of AI Frontiers, a boutique lab inside Microsoft Research.
HUSSEIN MOZANNAR: And for the past few months, we’ve been working closely together to reimagine what an agentic application can look like—one that’s efficient, capable, and useful for getting work done.
WEILI SHI: To do that, we had to work on the full stack, rethinking everything from how we generate training data, how we design the models, how the harness orchestrates it all, to the user experience that makes the whole thing feel like genuine collaboration.
HARKIRAT BEHL: What really moved the needle was going beyond standard benchmarks. We started with hero use cases—real everyday tasks that people care about. We built our own evals around them and used this signal to drive a flywheel of iterative improvements across the whole stack.
HUSSEIN MOZANNAR: The result today is three joint releases.
First is MagenticLite, our experience where you can get a real feel for how these small language models handle real work.
Second is our computer-use agent model, Fara-1.5, a state-of-the-art model for its size class.
And last but not least, Magentic Orchestrator, a model capable of reasoning, delegation, and coding that brings the whole experience together.
Now, Weili is going to show us how MagenticLite works.
WEILI SHI: Let’s dive into MagenticLite.
You may have heard of Magentic-UI, the agentic application we released last year that set the foundation for this work. With MagenticLite, we reworked the agent harness to run efficiently on small language models, making it faster, more lightweight, and no longer reliant on frontier-scale models.
We also refreshed the UX design based on community feedback, making it easier and more natural to work with.
MagenticLite works across both the browser and your local file system to help you get real work done—whether that’s filling out online forms, making appointments on your behalf, managing files on your desktop, or generating simple code.
Let’s see it in action.
I can give MagenticLite access to a folder on my computer. Here, I’m giving it notes from the last Microsoft Build conference. I want the agent to search for what has changed and create an update document to help me prepare for the next conference.
It has successfully accessed my notes and created to-dos. Next, it opened a web browser and started to gather the information I need.
MagenticLite has access to its own browser running in a virtual machine. This helps minimize the risk of data leakage while allowing Fara, our browser-use model, to operate quickly.
Fara performs well at long-running tasks. It looks like Fara has gathered enough information, and the Orchestrator has created a document. Let’s check it out.
MagenticLite did the job. It included updates on all key sections.
Next, I’d like to email this document to my colleague. I’ll let MagenticLite do this for me. I can keep working on other things, and MagenticLite will notify me when my attention is needed.
In this case, it needs my help to log into my email account. I’m taking control of the browser to log in.
Once unblocked, the agent composes the email and sends it when it finishes.
Now that you’ve seen it in action, let’s take a closer look at the models powering MagenticLite—small models that punch above their weight. Next, Harkirat will introduce Magentic Orchestrator.
HARKIRAT BEHL: Thanks, Weili.
If MagenticLite is the app you interact with, and Fara drives the browser, then Magentic Orchestrator is the brain that ties it all together. It is the planner, the coder, and the delegator—all in one model.
Its job is to take a messy request—like “Book me a dentist appointment Tuesday afternoon and add it to my calendar”—and convert it into a concrete plan.
The Orchestrator figures out the steps, picks the right tool or sub-agent for each step, writes code when needed, and recovers when something breaks mid-task.
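The loop Harkirat describes can be sketched roughly as follows. Everything here is illustrative: the `Step` format, the sub-agent names, and the hard-coded plan are stand-ins for what the actual Orchestrator model would generate, not MagenticLite's real API.

```python
from dataclasses import dataclass


@dataclass
class Step:
    description: str
    agent: str  # which sub-agent handles this step, e.g. "browser" or "coder"
    done: bool = False


def make_plan(request: str) -> list[Step]:
    # A real orchestrator model generates this plan from the messy request;
    # we hard-code the dentist-appointment example from the talk.
    return [
        Step("Find a dentist with Tuesday afternoon availability", "browser"),
        Step("Book the appointment", "browser"),
        Step("Add the appointment to the calendar", "coder"),
    ]


def run(request: str, agents: dict, max_retries: int = 1) -> list[str]:
    """Plan, delegate each step to a sub-agent, and retry on failure."""
    log = []
    for step in make_plan(request):
        for _attempt in range(max_retries + 1):
            try:
                agents[step.agent](step.description)  # delegate to sub-agent
                step.done = True
                log.append(f"ok: {step.description}")
                break
            except RuntimeError:  # recover when something breaks mid-task
                log.append(f"retry: {step.description}")
    return log
```

With stub sub-agents, `run("Book me a dentist appointment ...", {"browser": ..., "coder": ...})` walks the three steps and records each outcome; a real system would swap in Fara for the browser steps.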
What’s interesting is that the recipe is quite simple. Orchestration is usually where people reach for the biggest model they can get, but we wanted to show you that you can push all of this into a small model without giving up capability.
The training is standard supervised fine-tuning (SFT), but the key is the data mix: blending complementary styles of data in the right ratio.
The first style is tool-calling data—clear requests, selecting tools, calling them with arguments, and handling responses.
The second is terminal-style data, where an agent performs step-by-step actions—observing results and deciding what to do next.
Mixing these teaches the model when to use tools and when to generate code directly.
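Blending two data styles at a fixed ratio can be sketched as weighted interleaving over the two pools. The 2:1 ratio below is purely illustrative; the talk does not disclose the actual mix used for Magentic Orchestrator.

```python
import random


def blend(tool_calling, terminal_style, ratio=(2, 1), seed=0):
    """Interleave two pools of SFT examples at roughly the given ratio,
    so every training shard sees both styles of data."""
    rng = random.Random(seed)
    total = sum(ratio)
    weights = [ratio[0] / total, ratio[1] / total]
    pools = [list(tool_calling), list(terminal_style)]
    mixed = []
    while any(pools):
        i = rng.choices([0, 1], weights=weights)[0]
        if pools[i]:  # skip a pool once it is exhausted
            mixed.append(pools[i].pop(0))
    return mixed
```

Sampling with a seeded RNG keeps the mix reproducible across training runs; in practice the ratio itself is the hyperparameter being tuned.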
The result is a model that competes with much larger ones, while staying small enough to run locally alongside Fara—and it’s open weight, so it can integrate into your own systems.
With that, Hussein will introduce Fara.
HUSSEIN MOZANNAR: One of our goals in AI Frontiers is to train agentic models to complete computer-use tasks. Our bet is that end-to-end synthetic data generation—without human interaction data—can get us there.
Last November, we released Fara-7B. Today, we’re excited to introduce Fara-1.5, a family of models across three sizes: 4B, 9B, and 27B.
Fara-1.5 sets state-of-the-art results for models in its class.
Fara can handle real-world web tasks like form filling, booking, shopping, and other repetitive actions. It works by capturing screenshots, analyzing them alongside past context, and predicting the next action—like clicking or typing.
It operates in a loop: observe, act, evaluate, and continue.
Its action space includes clicking, keyboard input, memory tools, and the ability to ask the user for approval when needed.
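The observe-act-evaluate loop with that action space can be sketched like this. The method names (`screenshot`, `predict`, `await_approval`) and the action dictionary are stand-ins, not Fara's real interface.

```python
def agent_loop(env, model, max_steps=10):
    """Minimal sketch of a computer-use agent loop:
    observe a screenshot, predict the next action, act, repeat."""
    history = []
    for _ in range(max_steps):
        obs = env.screenshot()                # observe the current screen
        action = model.predict(obs, history)  # predict next action from context
        history.append((obs, action))
        if action["type"] == "ask_user":      # hand control back for approval
            env.await_approval(action)
        elif action["type"] == "done":        # model judges the task complete
            break
        else:
            env.execute(action)               # click, keyboard input, memory ops
    return history
```

The `max_steps` cap is one simple guard for long-running tasks; the `ask_user` branch is where MagenticLite surfaces steps like the email login hand-off shown in the demo.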
On the Mind2Web benchmark, Fara-1.5 nearly doubles performance compared to Fara-7B, improving from 35% to about 65%.
But we didn’t train it just for benchmarks—we trained it for real-world usefulness.
This is enabled by our synthetic data system, FaraGen 2.0, which generates training data using:
- Live and synthetic web environments
- A strong teacher agent
- A user simulator
- Verification systems for correctness, efficiency, and safety
This allows us to scale data generation and train models effectively.
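The pipeline described above (environments, teacher agent, user simulator, verifiers) can be sketched as a generate-then-filter loop. All components here are stubs under assumed interfaces; FaraGen 2.0's internals are not public.

```python
def generate_example(user_sim, teacher, verifiers, env):
    """Sketch of a FaraGen-style step: a user simulator proposes a task,
    a strong teacher agent attempts it, and verifier checks (correctness,
    efficiency, safety) decide whether the trajectory becomes training data."""
    task = user_sim.propose(env)           # simulate a realistic user request
    trajectory = teacher.solve(task, env)  # teacher agent rollout in the env
    if all(check(task, trajectory) for check in verifiers):
        return {"task": task, "trajectory": trajectory}
    return None  # rejected trajectories are discarded, not trained on
```

Running this loop over many live and synthetic environments is what lets data generation scale without human interaction data: quality comes from the verifiers rather than from manual labeling.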
Looking ahead, we plan to expand Fara with always-on capabilities, support for additional environments like Windows and Linux, and deeper integration with terminal workflows.
- Harkirat Behl, Member of Technical Staff
- Weili Shi, Senior Creative Technologist
- Hussein Mozannar, Senior Researcher