From harnessml
Use when starting a new ML project or revisiting the scope of an existing one.
npx claudepluginhub msilverblatt/harness-ml --plugin harnessml

This skill uses the workspace's default tool permissions.
Answer these questions. Write the answers down (in the experiment journal or project notes). These answers guide every decision that follows.
What outcome are you actually predicting? Not just "the target column" but the real-world outcome: "whether a patient will be readmitted within 30 days," "the sale price of a house," "which team wins." The framing determines everything: what features make sense, what metrics matter, and which errors are costly.
What decisions does this model inform? A model that ranks loan applicants has different error costs than one that predicts equipment failure. Understanding the use case tells you whether false positives or false negatives are worse, whether calibration matters more than discrimination, whether you need point predictions or uncertainty intervals.
What does success look like? Not just a metric threshold: what would make this model useful? If the current process is a human guessing with 60% accuracy, 70% might be transformative. If the current model already achieves 0.95 AUC, improving to 0.96 might not justify the effort. Anchor expectations before you start.
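As a sketch of the "write the answers down" step, assuming a markdown experiment journal (the file name and all answers below are hypothetical examples, not part of this skill's API):

```python
from pathlib import Path

# Hypothetical answers to the three scoping questions above.
scoping = {
    "What are we predicting?":
        "Whether a patient will be readmitted within 30 days.",
    "What decisions does this model inform?":
        "Which discharged patients get a follow-up call.",
    "What does success look like?":
        "Beat the current manual triage, which runs at roughly 60% accuracy.",
}

def write_scoping_entry(answers, journal):
    """Append the scoping answers to the experiment journal as markdown."""
    lines = ["## Project scoping", ""]
    for question, answer in answers.items():
        lines += [f"**{question}**", answer, ""]
    text = "\n".join(lines)
    Path(journal).write_text(text, encoding="utf-8")
    return text

entry = write_scoping_entry(scoping, "journal.md")
```

Keeping these answers in the journal, rather than in your head, lets you check later decisions against them.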
configure(action="init", project_dir="...", task_type="...", target_column="...", primary_metric="...")
Choose the task type and primary metric based on your answers above, not by default. If calibration matters, make a calibration metric primary. If ranking matters, use NDCG.
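To see why the primary-metric choice matters, here is a sketch (assuming scikit-learn is available) of two hypothetical models that rank cases identically but differ in calibration. A ranking metric like AUC cannot tell them apart, while a calibration-sensitive metric like the Brier score can:

```python
import numpy as np
from sklearn.metrics import brier_score_loss, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)

# Two hypothetical models: same ordering of predictions, different calibration.
p_sharp = np.clip(y_true * 0.9 + 0.05 + rng.normal(0, 0.02, 200), 0, 1)
p_hedged = 0.5 + (p_sharp - 0.5) * 0.3   # same order, pulled toward 0.5

# AUC is identical: it depends only on the ranking of predictions.
auc_sharp = roc_auc_score(y_true, p_sharp)
auc_hedged = roc_auc_score(y_true, p_hedged)

# The Brier score separates them: it penalizes poorly calibrated probabilities.
brier_sharp = brier_score_loss(y_true, p_sharp)
brier_hedged = brier_score_loss(y_true, p_hedged)
```

If your use case consumes the probabilities themselves (pricing, triage thresholds), the second pair of numbers is the one that matters.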
data(action="ingest", path="...")
data(action="inspect")
data(action="list_features")
Don't rush past this. Read the output carefully and note anything surprising.
This initial inspection should generate your first hypotheses about what might matter. Write them down. Then load the eda skill to go deeper.
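Before loading the eda skill, the same kind of first-pass inspection can be sketched in plain pandas (the dataset and column names here are hypothetical stand-ins for your ingested data):

```python
import io
import pandas as pd

# Toy stand-in for the ingested dataset.
csv = io.StringIO(
    "age,income,region,target\n"
    "34,52000,north,0\n"
    "51,,south,1\n"
    "29,41000,north,0\n"
    "62,78000,,1\n"
)
df = pd.read_csv(csv)

df.info()                                             # dtypes and non-null counts
missingness = df.isna().mean().sort_values(ascending=False)  # fraction missing per column
balance = df["target"].value_counts(normalize=True)          # target class balance
```

Missingness patterns and class balance are exactly the kind of observations that should seed your first hypotheses.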