Sessions
A Session represents a single run of an AI agent as it attempts to complete a task. It captures the full trace of the agent’s behavior, from initial state to final outcome, and provides deep visibility into each decision, action, and result along the way.

What is a Session?

A Session is a top-level unit of observability. It begins when an agent starts a task and ends when the task is completed, failed, or interrupted. To create a session, see our Quick Start guide or the detailed examples.

Within each Session view in Lucidic, which we call the workflow sandbox, you’ll find:
- A timeline of Steps (agent states and transitions)
- Fine-grained Events (e.g. LLM calls, API responses) inside each Step
- A workflow graph visualizing the session’s trajectory
- A GIF/video replay of the agent’s interaction (if available)
- Evaluation scores, cost metrics, metadata, and more
Sessions are non-deterministic by nature. Re-running the same task may lead to different outcomes due to the agent’s autonomy and stochasticity.
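The lifecycle above (a session starts running, then ends in exactly one terminal outcome) can be sketched as a tiny state machine. This is purely illustrative plain Python; the class and method names are assumptions for the sketch, not the SDK’s API:

```python
from enum import Enum

class SessionStatus(Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    INTERRUPTED = "interrupted"

class Session:
    """Illustrative model: a session starts RUNNING and ends in one terminal state."""

    TERMINAL = {SessionStatus.COMPLETED, SessionStatus.FAILED, SessionStatus.INTERRUPTED}

    def __init__(self, task: str):
        self.task = task
        self.status = SessionStatus.RUNNING

    def end(self, outcome: SessionStatus) -> None:
        if self.status in self.TERMINAL:
            raise RuntimeError("session already ended")
        if outcome not in self.TERMINAL:
            raise ValueError("outcome must be a terminal status")
        self.status = outcome

# A session that runs to completion:
session = Session(task="book a flight")
session.end(SessionStatus.COMPLETED)
```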
Session Structure
Step and Event Hierarchy
A Session is composed of ordered Steps, each representing a unique state and action taken by the agent. Each Step contains multiple Events, which represent granular, atomic operations such as LLM calls or API requests.

Session Dashboard or Workflow Sandbox

Each Session in the workflow sandbox includes:
- Overview metrics: duration, step count, cost, completion status
- Visual replay: full session video or GIF (if captured)
- Interactive workflow graph: clickable nodes for each step
- Step-by-step explorer: deep dive into state/action/event logs
- Evaluation scores: human or LLM-generated
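The Step-and-Event hierarchy described above can be modeled as nested records, with overview metrics such as cost derived by aggregating over events. A minimal sketch in plain Python (these dataclasses are illustrative and are not the Lucidic SDK’s types):

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """A granular, atomic operation inside a Step (e.g. an LLM call)."""
    kind: str          # e.g. "llm_call", "api_response"
    cost_usd: float = 0.0

@dataclass
class Step:
    """One agent state and the action taken from it."""
    state: str
    action: str
    events: list[Event] = field(default_factory=list)

@dataclass
class Session:
    task: str
    steps: list[Step] = field(default_factory=list)

    def total_cost(self) -> float:
        # Overview metric: sum event costs across every step.
        return sum(e.cost_usd for s in self.steps for e in s.events)

session = Session(task="summarize a web page")
session.steps.append(
    Step(state="page_loaded", action="summarize",
         events=[Event("llm_call", cost_usd=0.002), Event("api_response")])
)
```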
Evaluation and Scoring
Sessions can be evaluated via multiple systems:

1. User-Provided Eval Score
Passed in at runtime using the Python SDK. Useful for customized or domain-specific scoring.

2. Rubric-Based Evaluation
Define structured, weighted rubrics with multiple criteria and score definitions.

3. Default Evaluation
Our built-in system outputs:
- A score (0 or 1) based on task completion
- A list of “wrong” or suboptimal steps (automatically identified)
- Graph highlights of failure points
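A weighted rubric like the one in option 2 reduces to a weight-normalized average of per-criterion scores. Here is a minimal sketch of that arithmetic; the rubric shape shown is an assumption for illustration, not the platform’s schema:

```python
def score_rubric(criteria: dict[str, tuple[float, float]]) -> float:
    """Weighted rubric score.

    criteria maps criterion name -> (weight, score), with scores on a 0-1 scale.
    Returns the weight-normalized average.
    """
    total_weight = sum(w for w, _ in criteria.values())
    if total_weight == 0:
        raise ValueError("rubric has no weight")
    return sum(w * s for w, s in criteria.values()) / total_weight

# Hypothetical rubric with three weighted criteria:
rubric = {
    "task_completed": (0.5, 1.0),
    "efficiency":     (0.3, 0.5),
    "safety":         (0.2, 1.0),
}
# 0.5*1.0 + 0.3*0.5 + 0.2*1.0 = 0.85
```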
Debugging with Sessions
Sessions give you rich insight into how your agent operates and where it goes wrong:
- Replay stuck behavior via video
- Identify high-cost operations
- Trace decision-making back to specific prompts or events
- Group failing steps into unified nodes in the graph
- Track revisited states and looping patterns
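The last two points, revisited states and looping patterns, can be detected mechanically from the ordered sequence of step states. A minimal sketch of that idea in plain Python (illustrative only; the actual graph grouping is performed by the platform):

```python
from collections import Counter

def revisited_states(step_states: list[str]) -> dict[str, int]:
    """Return states the agent visited more than once, with visit counts."""
    counts = Counter(step_states)
    return {state: n for state, n in counts.items() if n > 1}

# Hypothetical trace: the agent bounces between search and results.
trace = ["login", "search", "results", "search", "results", "checkout"]
# "search" and "results" each appear twice, suggesting a loop.
```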
Metadata and Tagging
Each Session contains rich metadata, including:
- Task description
- Agent identity and version
- Cost
- Timestamps
- Link to parent Mass Simulation (if applicable)
- Replay data and graph reference