🪜 Steps

A Step is a unit of execution within a Session. It captures both a snapshot of the agent’s current state and the action taken to transition to the next state. Steps are the backbone of a session timeline and form the nodes in your workflow graph.

Each Step may contain multiple underlying Events, such as tool calls, API requests, or LLM invocations.


🧬 Step Anatomy

Each Step records:

  • The agent’s state (optional string + LLM-generated summary)
  • The goal the agent is trying to accomplish (optional string + summary)
  • The action taken to transition to the next step (with optional LLM-generated summary)
  • Timestamps for start and update time
  • Cost for operations within this step
  • Whether screenshots were captured
  • An evaluation score and evaluation description if provided or computed

More details can be found on programmatically what a Step looks like here


🎬 Screenshots and Visual State

Steps may include a screenshot that visually captures what the agent was seeing or interacting with at that moment. These can be critical for understanding UI-based agents or browser actions.

  • Displayed in the Step detail view
  • Used for visual debugging
  • Stored per-step and accessible via the dashboard

💥 Evaluating Steps

Steps can be scored individually using:

  • User-provided scores: log your own logic-based rating of a step
  • LLM-generated scores (planned): use limited context to judge a step’s quality
  • Default heuristics: highlight stuck or repeated states
lai.update_step(eval_score=0.3, eval_description="Tried to switch tabs but failed.")

Warning: Step-level evaluation in isolation is often difficult without broader task context. We recommend pairing this with session-wide scoring or rubrics when possible.


🔄 Step Transitions & Workflow Graph

Steps link together to form the execution path of your agent:

[state₀] --(action)--> [state₁] --(action)--> [state₂] ...

In your dashboard’s Workflow Graph, Steps are:

  • Represented as nodes
  • Clustered automatically when the state is repeated (e.g. looping or retries)
  • Connected by labeled edges that represent the agent’s actions

🔍 Grouping & Graph Insight

When multiple Steps share the same state and similar actions (e.g., retries or confusion loops), they’re automatically grouped in the graph view. This helps surface:

  • Stuck behavior (e.g., trying the same thing repeatedly)
  • Revisiting states (e.g., reloading a site, flipping between tabs)
  • Structural patterns (e.g., loops, branches, dead ends)

📁 Events Inside Steps

Each Step may contain multiple Events — fine-grained units of work including:

  • Tool invocations
  • LLM calls
  • API requests
  • DOM scraping
  • etc.

🧪 Debugging with Steps

Steps are the atomic unit of debugging:

  • Replay exactly what the agent saw or did
  • Inspect the agent’s thought process (via goal + action summaries)
  • View logs and evaluations