Steps

A Step is a unit of execution within a Session. It captures both a snapshot of the agent’s current state and the action taken to transition to the next state. Steps are the backbone of a session timeline and form the nodes in your workflow graph. Each Step may contain multiple underlying Events, such as tool calls, API requests, or LLM invocations.

Step Anatomy

Each Step records:
  • The agent’s state (optional string + LLM-generated summary)
  • The goal the agent is trying to accomplish (optional string + summary)
  • The action taken to transition to the next step (with optional LLM-generated summary)
  • Timestamps for start and update time
  • Cost for operations within this step
  • Whether screenshots were captured
  • An evaluation score and evaluation description if provided or computed
More details can be found on programmatically what a Step looks like in the examples or create_step API reference

Screenshots and Visual State

Steps may include a screenshot that visually captures what the agent was seeing or interacting with at that moment. These can be critical for understanding UI-based agents or browser actions.
  • Displayed in the Step detail view
  • Used for visual debugging
  • Stored per-step and accessible via the dashboard

Evaluating Steps

Steps can be scored individually using:
  • User-provided scores: log your own logic-based rating of a step
  • LLM-generated scores (planned): use limited context to judge a step’s quality
  • Default heuristics: highlight stuck or repeated states
lai.update_step(eval_score=0.3, eval_description="Tried to switch tabs but failed.")
Warning: Step-level evaluation in isolation is often difficult without broader task context. We recommend pairing this with session-wide scoring or rubrics when possible.

Step Transitions & Workflow Graph

Steps link together to form the execution path of your agent:
[state₀] --(action)--> [state₁] --(action)--> [state₂] ...
In your dashboard’s Workflow Graph, Steps are:
  • Represented as nodes
  • Clustered automatically when the state is repeated (e.g. looping or retries)
  • Connected by labeled edges that represent the agent’s actions

Grouping & Graph Insight

When multiple Steps share the same state and similar actions (e.g., retries or confusion loops), they’re automatically grouped in the graph view. This helps surface:
  • Stuck behavior (e.g., trying the same thing repeatedly)
  • Revisiting states (e.g., reloading a site, flipping between tabs)
  • Structural patterns (e.g., loops, branches, dead ends)

Events Inside Steps

Each Step may contain multiple Events — fine-grained units of work including:
  • Tool invocations
  • LLM calls
  • API requests
  • DOM scraping
  • etc.

Debugging with Steps

Steps are the atomic unit of debugging:
  • Replay exactly what the agent saw or did
  • Inspect the agent’s thought process (via goal + action summaries)
  • View logs and evaluations