🧪 Workflow Sandbox

The Workflow Sandbox is your deep-dive view into a single agent run — also known as a Session. Learn to create a session here.

This is where observability meets usability: you can replay the session, inspect every step, explore the underlying events, and evaluate performance using rubrics — all in one place.

To get here, click Project -> Agent -> any session in the session history.


🧠 What It Shows

At the top of the Workflow Sandbox, you’ll see:

1. 📝 Task Overview

  • The task string supplied to the agent at the start of the session

2. 📊 Session Stats

  • Session Evaluation - Your overall session evaluation
  • Is Successful - Whether the session was successful or not
  • Cost - The cost of the session
  • Time - The time taken to complete the session
  • Total Steps - The total number of steps taken in the session
  • Repeated Steps - The number of repeated steps in the session
  • Average Steps per Node - The average number of steps per node in the session (a proxy for how many times the agent retried a step)
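Conceptually, these stats are aggregations over the session's step records. Here is a minimal sketch of how they could be derived, using a hypothetical list of step dicts (the field names are assumptions for illustration, not the actual Lucidic schema):

```python
from collections import Counter

# Hypothetical step records for one session (illustrative only;
# field names are assumptions, not the real Lucidic data model).
steps = [
    {"node": "login_page", "cost": 0.002, "duration_s": 3.1},
    {"node": "login_page", "cost": 0.002, "duration_s": 2.8},  # revisit of the same state
    {"node": "dashboard",  "cost": 0.004, "duration_s": 5.0},
]

total_steps = len(steps)
total_cost = sum(s["cost"] for s in steps)
total_time = sum(s["duration_s"] for s in steps)

# A step counts as "repeated" when its node was already visited.
node_counts = Counter(s["node"] for s in steps)
repeated_steps = sum(count - 1 for count in node_counts.values())

# Average steps per node: a proxy for how often the agent retried.
avg_steps_per_node = total_steps / len(node_counts)

print(total_steps, repeated_steps, round(avg_steps_per_node, 2))
```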

📷


🧠 Pro tip: Hover over the session evaluation or the success/failure flag to see the description!

🧮 Rubric Evaluations

Below the overview, you’ll find any evaluation rubrics attached to this session. Read more about rubrics here.

  • Each criterion is scored and explained
  • The Lucidic default rubric runs on every session automatically
  • Custom rubrics can be attached per Agent when running a session
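To make the structure concrete, a rubric result can be thought of as a set of weighted, explained criteria rolled up into an overall score. A rough sketch with a made-up rubric (the field names and weighting scheme are assumptions, not the actual Lucidic rubric format):

```python
# Hypothetical rubric result (illustrative shape only).
rubric = {
    "name": "Default Rubric",
    "criteria": [
        {"name": "Task completion", "score": 9, "weight": 2,
         "explanation": "Agent reached the goal state."},
        {"name": "Efficiency", "score": 6, "weight": 1,
         "explanation": "Two redundant retries on the login step."},
    ],
}

# Overall score as a weighted average of criterion scores.
total_weight = sum(c["weight"] for c in rubric["criteria"])
overall = sum(c["score"] * c["weight"] for c in rubric["criteria"]) / total_weight

for c in rubric["criteria"]:
    print(f'{c["name"]}: {c["score"]}/10 - {c["explanation"]}')
print(f"Overall: {overall:.1f}/10")
```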

🧠 Pro tip: The explanation for each criterion appears directly below its score, so you can see exactly why it was scored that way!

📷


📚 Quick Access: Prompt DB + Session Replay GIF

Under the rubric evaluations, you can open the Prompt DB directly from the session by clicking the prompt name or the Prompt Editor button:

  • Quickly view which prompt was used
  • Test changes or view version history
  • Add tags to quickly iterate on prompts

To the right of the Prompt DB, you can also see a replay GIF of the session:

📷


🧭 Session Replay: Step Trajectory Graph

Next, you’ll see a Session-specific graph — a mini version of the Workflow Trajectory, scoped to just this one session.

  • Nodes: states visited
  • Edges: actions taken
  • Grouped: similar states are clustered (e.g. looping or retrying)

🧠 Pro tip: The step that caused your agent to fail will be highlighted in red! Hover over the error icon in the top right of the node to quickly see a failure explanation.

Note: If you want to turn off error highlighting, click the alarm icon in the bottom left toolbar of the graph.

📷

📷

🧠 Pro tip: Clicking a Step/Event opens a full breakdown of that Step/Event below the graph!


🕰 Timeline Navigation

Two synced timelines appear below:

🪜 Step Timeline

  • Shows the ordered sequence of Steps
  • Clicking a Step:
    • Advances the graph to that state
    • Opens the step detail panel
  • Step detail panel shows:
    • Basic Information: Start Time, Duration, Cost
    • Evaluation: Status, Step Evaluation Score, Evaluation Description
    • State Overview: Goal, Action, State
    • State Screenshot
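The step detail panel groups a fixed set of fields. One way to picture the data behind it is a small record like the following (this shape is an assumption for illustration, not the actual Lucidic model):

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative record mirroring the step detail panel's sections
# (field names are assumptions, not the real Lucidic schema).
@dataclass
class StepDetail:
    # Basic Information
    start_time: str
    duration_s: float
    cost: float
    # Evaluation
    status: str
    eval_score: float
    eval_description: str
    # State Overview
    goal: str
    action: str
    state: str
    # State Screenshot
    screenshot_url: Optional[str] = None

step = StepDetail(
    start_time="2024-01-01T12:00:00Z",
    duration_s=4.2,
    cost=0.003,
    status="success",
    eval_score=8.5,
    eval_description="Action matched the stated goal.",
    goal="Open the settings page",
    action="click('#settings')",
    state="dashboard",
)
print(step.status, step.eval_score)
```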

📷

🧠 Pro tip: Use arrow keys to quickly go from one step to the next.

⚙️ Event Timeline (within Step)

  • Shows all Events that occurred in that step (e.g. LLM calls, tool invocations)
  • Clicking an Event shows:
    • Basic Information: Status, Model, Start Time, Duration, Cost
    • Prettified vs Raw Request/Response (click the wand/text icon in the top right of Input/Output to toggle)
    • Images used (if any) in the request
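The prettified vs. raw toggle is purely a presentation choice over the same payload. A quick sketch of the difference, using a made-up request body:

```python
import json

# A sample (made-up) raw LLM request payload, as it might be sent over the wire.
raw = '{"model":"gpt-4o","messages":[{"role":"user","content":"Summarize the page"}]}'

# Raw view: the payload exactly as sent.
print(raw)

# Prettified view: the same payload, indented for readability.
pretty = json.dumps(json.loads(raw), indent=2)
print(pretty)
```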

🧠 Pro tip: SPEED UP iteration and debugging by avoiding rerunning the same failed run with Time Travels. You'll find them in the Time Travel tab (to the right of “Event Details”, directly under the two timelines), and they are extremely useful for debugging failed runs. Time Travels let you freeze time and rerun the event in isolation to see whether your error/success is reliable. Learn more about how to use them here.

📷
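The core idea behind a Time Travel is replaying one event with its inputs frozen at the moment it originally ran, repeating it to see whether the outcome is stable. A toy sketch of that idea (everything here is hypothetical; `fake_model` stands in for a real LLM client and is not the Lucidic API):

```python
import random

# Hypothetical recorded event: the exact inputs of one LLM call,
# frozen at the moment it originally ran (names are illustrative).
recorded_event = {"prompt": "Classify this page", "temperature": 0.7}

def rerun(event, model_call):
    """Replay a single event in isolation using its frozen inputs."""
    return model_call(event["prompt"], event["temperature"])

# A stand-in model that fails some of the time, so reruns can disagree.
def fake_model(prompt, temperature):
    return "error" if random.random() < 0.3 else "ok"

# Rerun the same event several times to see if the failure is reliable.
random.seed(42)
outcomes = [rerun(recorded_event, fake_model) for _ in range(5)]
print(outcomes)
```

If every rerun produces the same outcome, the failure is deterministic; a mix suggests the error depends on model nondeterminism rather than your prompt or inputs.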

🧠 Pro tip: If your LLM calls look cluttered, use the built-in pretty-text view! Click the wand icon to see your text in a more human-readable format.


🧰 Use Cases

The Workflow Sandbox is where you:

  • Debug a failed run
  • Investigate edge cases
  • Replay odd behavior
  • Evaluate agent consistency
  • Identify root causes of errors
  • Understand how a prompt or model change affected logic