Overview
The dataset functions let you work with test datasets programmatically, enabling systematic validation and regression testing of your AI agents.
Core Functions
get_dataset
Retrieves a dataset configuration and its items.
Parameters
The unique identifier of the dataset to retrieve.
Specific version number to retrieve. If not provided, returns the latest version.
Returns
Returns a dictionary containing:
- id: Dataset identifier
- name: Dataset name
- description: Dataset description
- items: List of dataset items
- tags: Associated tags
- version: Version number
- created_at: Creation timestamp
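To make the return shape concrete, the sketch below builds a dictionary with the fields listed above by hand and reads from it. The sample values, the field types (for example an integer version and a string timestamp), and the summarize_dataset helper are illustrative assumptions, not part of the SDK; in real use the dictionary would come from get_dataset.

```python
from typing import Any


def summarize_dataset(dataset: dict[str, Any]) -> str:
    """Build a one-line summary from a get_dataset-style dictionary."""
    return (
        f"{dataset['name']} v{dataset['version']}: "
        f"{len(dataset['items'])} item(s), tags={sorted(dataset['tags'])}"
    )


# Hand-written sample with the documented fields; every value is made up.
sample_dataset = {
    "id": "ds-001",
    "name": "checkout-regression",
    "description": "Regression cases for the checkout agent",
    "items": [{"id": "item-1", "name": "empty cart"}],
    "tags": ["regression", "checkout"],
    "version": 3,
    "created_at": "2024-01-01T00:00:00Z",
}

print(summarize_dataset(sample_dataset))
```

Logging the name and version together like this is one simple way to record which dataset version a test run used.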
get_dataset_items
Retrieves all items from a specific dataset.
Parameters
The unique identifier of the dataset.
Filter items by specific tags.
Returns
Returns a list of dataset items, each containing:
- id: Item identifier
- name: Test case name
- input: Input data for the test
- expected_output: Expected result (if defined)
- metadata: Additional item metadata
- tags: Item-specific tags
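The sketch below shows one way to consume a list of items with this shape: selecting items by tag locally and separating items that define an expected_output from those that do not. The sample items and both helper functions are hypothetical. The "match any tag" rule is an assumption made for illustration; the source does not say whether the built-in tag filter matches any or all of the given tags.

```python
from typing import Any, Iterable


def filter_items_by_tags(
    items: list[dict[str, Any]], wanted: Iterable[str]
) -> list[dict[str, Any]]:
    """Keep items that carry at least one of the wanted tags (assumed semantics)."""
    wanted_set = set(wanted)
    return [item for item in items if wanted_set & set(item.get("tags", []))]


def has_expectation(item: dict[str, Any]) -> bool:
    """expected_output is optional, so read it with .get() instead of indexing."""
    return item.get("expected_output") is not None


# Hand-written sample items with the documented fields; values are made up.
sample_items = [
    {"id": "i1", "name": "greets user", "input": "hi",
     "expected_output": "hello", "metadata": {}, "tags": ["smoke"]},
    {"id": "i2", "name": "rejects empty input", "input": "",
     "expected_output": None, "metadata": {}, "tags": ["negative"]},
    {"id": "i3", "name": "handles long input", "input": "x" * 50,
     "metadata": {}, "tags": ["smoke", "edge"]},
]

smoke_items = filter_items_by_tags(sample_items, ["smoke"])
checkable = [item["id"] for item in sample_items if has_expectation(item)]
print([item["id"] for item in smoke_items], checkable)
```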
Working with Dataset Items
Initialize Session with Dataset Item
When running dataset items individually, initialize a session for each item so that the two are linked.
Best Practices
- Version Control: Track dataset versions when running tests
- Meaningful Names: Use descriptive names for test cases
- Comprehensive Coverage: Include both positive and negative test cases
- Parallel Execution: Use parallel execution for large datasets to save time
- Result Tracking: Store test results for trend analysis
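Several of these practices can be combined in a small local runner. The sketch below is only an illustration under stated assumptions: run_item, run_dataset, the sample dataset, and the use of str.upper as a stand-in agent are all hypothetical, and nothing here calls the SDK. It runs items in parallel with Python's standard ThreadPoolExecutor and stores the dataset version next to the results so runs can be compared over time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_item(item: dict[str, Any], agent: Callable[[Any], Any]) -> dict[str, Any]:
    """Run one item and compare the output with expected_output, if defined."""
    # In a real run, this is where a session would be initialized for the item.
    output = agent(item["input"])
    expected = item.get("expected_output")
    return {
        "item_id": item["id"],
        "name": item["name"],
        "output": output,
        # None means the item defines no expectation, so nothing is checked.
        "passed": None if expected is None else output == expected,
    }


def run_dataset(
    dataset: dict[str, Any], agent: Callable[[Any], Any], max_workers: int = 4
) -> dict[str, Any]:
    """Run all items in parallel and record the dataset version with the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input items.
        results = list(pool.map(lambda item: run_item(item, agent), dataset["items"]))
    return {"dataset_id": dataset["id"], "version": dataset["version"], "results": results}


# Hand-written sample; str.upper stands in for the agent under test.
sample_dataset = {
    "id": "ds-001",
    "version": 3,
    "items": [
        {"id": "i1", "name": "uppercases greeting", "input": "hello", "expected_output": "HELLO"},
        {"id": "i2", "name": "expectation not met", "input": "bye", "expected_output": "bye"},
        {"id": "i3", "name": "no expectation defined", "input": "ok"},
    ],
}

report = run_dataset(sample_dataset, str.upper)
print(report["version"], [r["passed"] for r in report["results"]])
```

Threads suit agents that spend most of their time waiting on network calls; a CPU-bound agent would likely need a process pool instead.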
Related Functions
- init - Initialize sessions with dataset items
- create_mass_sim - Run large-scale dataset tests
- get_prompt - Retrieve prompts for testing