Mass Simulations

A Mass Simulation is a collection of Sessions in which the same agent runs the same task multiple times. Because autonomous agents are inherently non-deterministic, a single run is often misleading. Mass Sims let you observe what your agent can do, what it usually does, and where it sometimes fails. Learn to create one here.

Why Mass Sims?

Mass Sims help answer:

What are the possible paths my agent might take?
Where does the agent behave inconsistently?
Are there probabilistic failure modes I’m not catching in single sessions?
What is the agent’s real-world reliability over many runs?

If your agent succeeds only 80–90% of the time, it is not ready for production. Mass Sims reveal where and why the remaining 10–20% of runs fail.
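To make the reliability question concrete, here is a minimal sketch of how a success rate over many runs could be estimated together with its uncertainty. It uses plain Python and a standard Wilson score interval, not the SDK. The `outcomes` list and the `wilson_interval` helper are hypothetical names introduced for illustration only.

```python
import math


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate 95% Wilson score interval for a success rate."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= successes <= runs:
        raise ValueError("successes must be between 0 and runs")
    p = successes / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical outcomes of 20 Sessions of the same task (True = success).
outcomes = [True] * 17 + [False] * 3

rate = sum(outcomes) / len(outcomes)
low, high = wilson_interval(sum(outcomes), len(outcomes))
print(f"observed success rate: {rate:.2f}, plausible range: {low:.2f} to {high:.2f}")
```

With only a handful of runs the interval is wide, which is one reason a single Session, or even a few, says little about real-world reliability.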

Structure

Each Mass Simulation:

Groups many Sessions that run the same task
Computes a unified Workflow Trajectory graph
Clusters Steps across Sessions based on similar states
Shows a probabilistic map of your agent’s possible decisions

MassSim
├── Session 1
│   └── Steps → Events
├── Session 2
│   └── Steps → Events
├── ...
└── Workflow Trajectory Graph (merged from all Sessions)
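The merge into a single graph can be sketched as follows. This is an illustrative approximation in plain Python, not the product's actual algorithm or API: each Session is reduced to a list of hypothetical state labels, and Steps are treated as "the same state" only when their labels match exactly, whereas the real clustering groups Steps by similarity. The function name `build_trajectory_graph` and the `START`/`END` markers are assumptions.

```python
from collections import Counter, defaultdict


def build_trajectory_graph(sessions: list[list[str]]) -> dict[str, dict[str, float]]:
    """Merge per-Session step sequences into transition probabilities."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for steps in sessions:
        # Add artificial START/END markers so entry and exit points are visible.
        path = ["START", *steps, "END"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1

    graph: dict[str, dict[str, float]] = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        graph[src] = {dst: n / total for dst, n in dsts.items()}
    return graph


# Hypothetical Sessions of the same task, each reduced to state labels.
sessions = [
    ["search", "open_result", "answer"],
    ["search", "open_result", "answer"],
    ["search", "refine_query", "open_result", "answer"],
    ["search", "open_result", "give_up"],
]

graph = build_trajectory_graph(sessions)
for src, dsts in graph.items():
    for dst, prob in dsts.items():
        print(f"{src} -> {dst}: {prob:.2f}")
```

Each edge weight is the fraction of observed transitions out of a state, so low-probability branches such as `open_result -> give_up` point to the places where the agent behaves inconsistently.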
