From create-dataset
Creates evaluation datasets for Dokimos in JSON, CSV, or JSONL formats for LLM evaluation, test data, experiments, and format conversions.
npx claudepluginhub dokimos-dev/dokimos --plugin create-dataset
This skill uses the workspace's default tool permissions.
Create an evaluation dataset for Dokimos. The user will describe the dataset purpose and content via `$ARGUMENTS`.
dokimos-core/src/main/java/dev/dokimos/core/Dataset.java
dokimos-core/src/main/java/dev/dokimos/core/DatasetParser.java
dokimos-core/src/main/java/dev/dokimos/core/Example.java
dokimos-examples/src/main/resources/datasets/
Before creating a dataset, read Dataset.java and DatasetParser.java to understand the supported formats.
JSON is the standard format. Structure:
{
  "name": "Dataset Name",
  "description": "What this dataset evaluates",
  "examples": [
    {
      "input": "What is 2+2?",
      "expectedOutput": "4"
    },
    {
      "inputs": { "input": "complex query", "context": "additional info" },
      "expectedOutputs": { "output": "expected answer" },
      "metadata": { "category": "math", "difficulty": "easy" }
    }
  ]
}
Two forms are supported:
- input and expectedOutput as top-level strings
- inputs, expectedOutputs, and metadata as maps (for multi-field examples)

For CSV, headers must include input. Optional: expectedOutput (or expected_output or output). Extra columns become metadata.
input,expectedOutput,category
What is 2+2?,4,math
Capital of France?,Paris,geography
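The CSV header rules above can be sketched as a small standalone check. This is only an illustration of the stated rules, not Dokimos's own DatasetParser: the class name CsvHeaderCheck and the method metadataColumns are hypothetical, and the sketch assumes case-sensitive header names and unquoted header fields, neither of which the source confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: a standalone check of the CSV header rules described above.
// This is not Dokimos's DatasetParser; class and method names are hypothetical.
public class CsvHeaderCheck {

    // Header names accepted for the expected-output column, per the rules above.
    private static final Set<String> EXPECTED_ALIASES =
            Set.of("expectedOutput", "expected_output", "output");

    // Returns the columns that would be treated as metadata.
    // Throws IllegalArgumentException if the required "input" column is missing.
    // Assumption: header names are matched case-sensitively and are not quoted.
    public static List<String> metadataColumns(String headerLine) {
        List<String> metadata = new ArrayList<>();
        boolean hasInput = false;
        for (String raw : headerLine.split(",")) {
            String name = raw.trim();
            if (name.equals("input")) {
                hasInput = true;
            } else if (!EXPECTED_ALIASES.contains(name)) {
                metadata.add(name); // any other column is carried as metadata
            }
        }
        if (!hasInput) {
            throw new IllegalArgumentException("CSV header must include an 'input' column");
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Header from the sample above: "category" is the only metadata column.
        System.out.println(metadataColumns("input,expectedOutput,category")); // prints [category]
    }
}
```

Running a check like this before handing a file to Dokimos can catch a missing input column early. The real parser remains the authority on what is accepted.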
JSONL: one JSON object per line. Each line has the same structure as an entry in the JSON examples array, but without the wrapper object:
{"input": "What is 2+2?", "expectedOutput": "4"}
{"input": "Capital of France?", "expectedOutput": "Paris"}
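As a rough sketch of producing lines in this shape, the snippet below builds one JSONL line per input/expected-output pair using only the JDK. The class JsonlSketch and its methods are hypothetical helpers written for illustration. They are not part of Dokimos, and a real project would more likely use a JSON library than hand-written escaping.

```java
import java.util.List;

// Illustrative only: builds JSONL lines in the simple input/expectedOutput form.
// JsonlSketch is a hypothetical helper, not a Dokimos class.
public class JsonlSketch {

    // Escapes the characters that must be escaped inside a JSON string.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex escape form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Formats a single example as one JSON object on one line.
    public static String toLine(String input, String expectedOutput) {
        return "{\"input\": \"" + escape(input)
                + "\", \"expectedOutput\": \"" + escape(expectedOutput) + "\"}";
    }

    public static void main(String[] args) {
        List<String[]> examples = List.of(
                new String[] {"What is 2+2?", "4"},
                new String[] {"Capital of France?", "Paris"});
        for (String[] e : examples) {
            System.out.println(toLine(e[0], e[1]));
        }
    }
}
```

Because newlines inside values are escaped, each example stays on a single physical line, which is what the JSONL format requires.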
$ARGUMENTS
src/test/resources/datasets/
dokimos-examples/src/main/resources/datasets/
// From file
// Distinct variable names so the three alternatives compile in the same scope
Dataset jsonDataset = Dataset.fromJson(Path.of("path/to/dataset.json"));
Dataset csvDataset = Dataset.fromCsv(Path.of("path/to/dataset.csv"));
Dataset jsonlDataset = Dataset.fromJsonl(Path.of("path/to/dataset.jsonl"));
// From classpath (in tests)
// Use @DatasetSource("classpath:datasets/my-dataset.json") with dokimos-junit