Steps
Learn how Steps represent agent states and transitions within a Session.
🪜 Steps
A Step is a unit of execution within a Session. It captures both a snapshot of the agent’s current state and the action taken to transition to the next state. Steps are the backbone of a session timeline and form the nodes in your workflow graph.
Each Step may contain multiple underlying Events, such as tool calls, API requests, or LLM invocations.
🧬 Step Anatomy
Each Step records:
- The agent’s state (optional string + LLM-generated summary)
- The goal the agent is trying to accomplish (optional string + summary)
- The action taken to transition to the next step (with optional LLM-generated summary)
- Timestamps for start and update time
- Cost for operations within this step
- Whether screenshots were captured
- An evaluation score and evaluation description if provided or computed
More details can be found on programmatically what a Step looks like here
🎬 Screenshots and Visual State
Steps may include a screenshot that visually captures what the agent was seeing or interacting with at that moment. These can be critical for understanding UI-based agents or browser actions.
- Displayed in the Step detail view
- Used for visual debugging
- Stored per-step and accessible via the dashboard
💥 Evaluating Steps
Steps can be scored individually using:
- User-provided scores: log your own logic-based rating of a step
- LLM-generated scores (planned): use limited context to judge a step’s quality
- Default heuristics: highlight stuck or repeated states
Warning: Step-level evaluation in isolation is often difficult without broader task context. We recommend pairing this with session-wide scoring or rubrics when possible.
🔄 Step Transitions & Workflow Graph
Steps link together to form the execution path of your agent:
In your dashboard’s Workflow Graph, Steps are:
- Represented as nodes
- Clustered automatically when the state is repeated (e.g. looping or retries)
- Connected by labeled edges that represent the agent’s actions
🔍 Grouping & Graph Insight
When multiple Steps share the same state and similar actions (e.g., retries or confusion loops), they’re automatically grouped in the graph view. This helps surface:
- Stuck behavior (e.g., trying the same thing repeatedly)
- Revisiting states (e.g., reloading a site, flipping between tabs)
- Structural patterns (e.g., loops, branches, dead ends)
📁 Events Inside Steps
Each Step may contain multiple Events — fine-grained units of work including:
- Tool invocations
- LLM calls
- API requests
- DOM scraping
- etc.
🧪 Debugging with Steps
Steps are the atomic unit of debugging:
- Replay exactly what the agent saw or did
- Inspect the agent’s thought process (via goal + action summaries)
- View logs and evaluations