Publish or update a dataset on Kaggle from a local directory using the Kaggle CLI. Creates a new dataset on first publish, or pushes a new version to an existing one. Use when the user wants to "sync to Kaggle", "publish a Kaggle dataset", "update a Kaggle dataset", or similar.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin loose-tasks

This skill uses the workspace's default tool permissions.
Use the Kaggle CLI (`kaggle`) to publish a local directory as a new Kaggle dataset, or push a new version to one that already exists.
kaggle CLI installed. Check with which kaggle. If missing, install via pip install kaggle (or pipx install kaggle).
API token configured. The CLI accepts either form:
- KAGGLE_API_TOKEN environment variable, or
- ~/.kaggle/access_token file (mode 600), single line containing the token.

If neither is present, ask the user for the token before continuing — do not invent one. Once provided, write it to ~/.kaggle/access_token with chmod 600 so subsequent invocations work without re-prompting:
mkdir -p ~/.kaggle && echo "<TOKEN>" > ~/.kaggle/access_token && chmod 600 ~/.kaggle/access_token
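The two auth forms above can be checked in order with a small POSIX-sh sketch (kaggle_auth_source is a made-up helper name, not part of the Kaggle CLI):

```shell
# Report which Kaggle auth form is configured.
# The env var wins; otherwise fall back to the token file; else "none".
kaggle_auth_source() {
  if [ -n "${KAGGLE_API_TOKEN:-}" ]; then
    echo "env"
  elif [ -s "$HOME/.kaggle/access_token" ]; then
    echo "file"
  else
    echo "none"
  fi
}
```

If this prints "none", that is the moment to prompt the user for a token rather than guessing.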
Authentication sanity check — confirm the token works before doing anything destructive:
kaggle datasets list --user <username> 2>&1 | head -5
A non-empty table = good. An auth error means re-prompt for the token.
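Interpreting that output can be sketched as follows (a hypothetical helper; it assumes auth failures surface as a 401/unauthorized message, which may vary by CLI version):

```shell
# Classify the sanity-check output: empty or 401/unauthorized means bad auth.
auth_ok() {
  out="$1"
  case "$out" in
    "" | *401* | *[Uu]nauthorized*) return 1 ;;
    *) return 0 ;;
  esac
}
```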
Look for dataset-metadata.json in the target directory.
Its id field tells you the slug (<owner>/<dataset-name>).

If updating, also confirm the slug exists on Kaggle:
mkdir -p /tmp/_kchk && kaggle datasets metadata -p /tmp/_kchk <owner>/<dataset-name> 2>&1
If that succeeds, the dataset is real and you're versioning. If it 404s, the slug doesn't exist yet — fall through to the "create new" path.
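That branch can be sketched as a small helper (a sketch only; decide_publish_mode is an invented name, and it assumes kaggle is on PATH):

```shell
# Print "version" if the slug already exists on Kaggle, else "create".
decide_publish_mode() {
  slug="$1"                       # e.g. <owner>/<dataset-name>
  tmp=$(mktemp -d)
  if kaggle datasets metadata -p "$tmp" "$slug" >/dev/null 2>&1; then
    mode="version"                # metadata fetched: dataset exists
  else
    mode="create"                 # 404/error: fall through to create
  fi
  rm -rf "$tmp"
  echo "$mode"
}
```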
Confirm the directory contains the files the user wants to publish, and only those. Kaggle uploads everything in the directory non-recursively by default; pass -r zip (or -r tar) if subdirectories are intentional. Anything you don't want published — drafts, secrets, large irrelevant files — must be excluded before the call.
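To double-check scope, a quick preview of what a default non-recursive publish would pick up (preview_upload is a hypothetical helper):

```shell
# List the top-level files a default (non-recursive) publish would upload;
# files inside subdirectories are skipped unless -r zip/-r tar is passed.
preview_upload() {
  find "$1" -maxdepth 1 -type f
}
```

Show this list to the user before the first publish if there is any doubt about what belongs in the dataset.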
Generate the metadata template inside the target directory:
kaggle datasets init -p <target-dir>
This writes dataset-metadata.json with placeholders.
Edit dataset-metadata.json:
- title — human-readable name (≤50 chars, must be unique within the user's account).
- id — <owner>/<dataset-slug>. The slug is lowercase, hyphenated, ≤50 chars. Must not collide with an existing dataset on the account.
- licenses — usually [{"name": "CC0-1.0"}] for fully open, or whatever the user specifies. Confirm before assuming.
- keywords — optional list, helps discovery.
- subtitle, description — optional but recommended.

Show the user the final metadata file before publishing — let them sanity-check the slug, title, and license.
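A filled-in example of the metadata file — the owner, slug, and every field value here are invented placeholders, so substitute the user's actual values:

```shell
# Write an example dataset-metadata.json (all values hypothetical).
cat > dataset-metadata.json <<'EOF'
{
  "title": "City Air Quality",
  "id": "exampleuser/city-air-quality",
  "licenses": [{"name": "CC0-1.0"}],
  "keywords": ["air quality", "environment"],
  "subtitle": "Hourly pollutant readings for several cities"
}
EOF
```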
Create the dataset:
kaggle datasets create -p <target-dir> # files at top level only
kaggle datasets create -p <target-dir> -r zip # zip the directory tree
kaggle datasets create -p <target-dir> -r tar # tar the directory tree
Use -r zip or -r tar only if there are subdirectories that need to be preserved. Otherwise omit -r.
Add -u (--public) to publish immediately as public. Without it the dataset is private; the user can flip visibility later in the Kaggle UI.
Confirm with the user that the dataset URL works: https://www.kaggle.com/datasets/<owner>/<dataset-slug>.
Ensure dataset-metadata.json is present in the target directory and the id field matches the live dataset slug. If the file is missing, fetch it:
kaggle datasets metadata -p <target-dir> <owner>/<dataset-name>
Refresh the directory contents to whatever should be in the new version. Same caveat: only what should be published belongs there.
Ask the user for a version-notes message. This is required and shows up in the dataset's version history. Examples: "Add deweathered series + cross-city panel", "Fix unit conversion in NO₂ column".
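The requirement can be enforced with a small guard before calling the CLI (push_version is a made-up wrapper, not a Kaggle command):

```shell
# Refuse to push a version without real version notes (the CLI requires -m).
push_version() {
  dir="$1"; notes="$2"
  if [ -z "$notes" ]; then
    echo "error: version notes are required" >&2
    return 1
  fi
  kaggle datasets version -p "$dir" -m "$notes"
}
```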
Push the new version:
kaggle datasets version -p <target-dir> -m "<version-notes>"
kaggle datasets version -p <target-dir> -m "<version-notes>" -r zip
Use -r zip/-r tar if the directory tree has meaningful subdirectories.
Note that -r is shorthand for --dir-mode, and its default (skip) ignores subdirectories entirely — it does not skip unchanged files. Every in-scope file is re-uploaded on each push.
Confirm the push by listing the account's datasets — the updated timestamp for the slug should reflect the new version:
kaggle datasets list --user <owner> 2>&1 | grep <dataset-slug>
Or open the dataset URL — Kaggle will show the new version under the "Data Explorer" → version dropdown after a short ingest delay (usually under a minute).
Gotchas:
- kaggle datasets create fails with a generic error if the slug already exists. Check with kaggle datasets metadata -p /tmp/_chk <owner>/<slug> before creating.
- Without -r zip or -r tar, files in subdirectories of the target dir are not uploaded. If a user reports "my data folder didn't make it", this is almost always why.
- Version pushes require -m. The CLI errors out without it. Don't pass an empty string — use a real description.
- title can be changed later via kaggle datasets metadata; the slug (id) is permanent — confirm with the user before creating.
- -u on create makes it public on first publish. To flip an existing dataset's visibility, the user has to do it in the Kaggle web UI — there's no CLI flag for it.

Once the publish or version-push succeeds, tell the user:
- the dataset URL (https://www.kaggle.com/datasets/<owner>/<dataset-slug>) and whether it is public or private, and
- anything that was deliberately excluded from the upload (e.g. "the directory contained a private/ subfolder which I excluded — let me know if you wanted that included").