From sagemaker-ai
Validates dataset formatting and quality for SageMaker model fine-tuning (SFT, DPO, RLVR). Detects file format, checks schema compliance, and reports readiness for training or evaluation.
npx claudepluginhub awslabs/agent-plugins --plugin sagemaker-ai
How this skill is triggered: by the user, by Claude, or both
Slash command
/sagemaker-ai:dataset-evaluation
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Follow the workflow shown below. Locate the dataset, check the file type, and resolve any issues with missing files or wrong file types. Determine the fine-tuning model and fine-tuning strategy. Run scripts/format_detector.py to evaluate whether the file is formatted correctly for the currently selected model and strategy. Summarize the results: is the dataset ready for fine-tuning?
Follow the workflow shown below. Locate the dataset, check the file type, and resolve any issues with missing files or wrong file types. Determine the fine-tuning model and fine-tuning strategy. Run scripts/format_detector.py to evaluate whether the file is formatted correctly for the currently selected model and strategy. Summarize the results: is the dataset ready for fine-tuning?
1. Locate Dataset:
2. Determine strategy and model:
3. Check File Formatting: Run the tool format_detector.py to make sure the file conforms to formatting requirements.
4. Summarize Results: Tell the user whether their data is ready.
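The first workflow step (locating the dataset and resolving missing files or wrong file types) could be sketched roughly as follows. This is only an illustration: the function name `check_dataset_file` and the `ALLOWED_SUFFIXES` allow-list are hypothetical and are not taken from the skill itself, which does not state which file types it accepts.

```python
from pathlib import Path

# Hypothetical allow-list for illustration; the skill does not state
# which file types it actually accepts.
ALLOWED_SUFFIXES = {".jsonl", ".json"}


def check_dataset_file(path_str, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return a list of problems found; an empty list means the file looks usable."""
    path = Path(path_str)
    problems = []

    # A missing file makes every later check meaningless, so stop early.
    if not path.is_file():
        problems.append(f"file not found: {path}")
        return problems

    # Flag file types outside the (assumed) allow-list.
    if path.suffix.lower() not in allowed_suffixes:
        problems.append(f"unexpected file type: {path.suffix or '(none)'}")

    # An empty file cannot contain training records.
    if path.stat().st_size == 0:
        problems.append("file is empty")

    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every issue to the user in a single pass before moving on to the formatting check.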
references/strategy_data_requirements.md
# With the file path argument identified in workflow step 1
python scripts/format_detector.py local_path/to/dataset
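To give a feel for the kind of schema check such a detector might perform, here is a minimal sketch for JSONL input. It is not the contents of scripts/format_detector.py, which are not shown here. The field names in `REQUIRED_KEYS` are placeholders chosen for illustration; the actual per-strategy requirements are those described in references/strategy_data_requirements.md.

```python
import json

# Placeholder field names for illustration only. The real requirements
# are defined in references/strategy_data_requirements.md.
REQUIRED_KEYS = {
    "SFT": {"prompt", "completion"},
    "DPO": {"prompt", "chosen", "rejected"},
}


def validate_jsonl_lines(lines, strategy):
    """Return (line_number, message) pairs for records that fail the check."""
    required = REQUIRED_KEYS[strategy]
    errors = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((lineno, "record is not a JSON object"))
            continue
        # Report every required key the record lacks, in a stable order.
        missing = sorted(required - set(record))
        if missing:
            errors.append((lineno, f"missing keys: {', '.join(missing)}"))
    return errors
```

An empty result would correspond to "ready" in the final summary step; a non-empty one gives the line numbers the user needs to fix.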