From ship
Automates code shipping: commits changes, pushes branch, creates/updates PR, monitors GitHub CI/CD checks, fixes failures/review comments/conflicts until merge-ready.
npx claudepluginhub heliohq/ship --plugin ship
```bash
SHIP_PLUGIN_ROOT="${SHIP_PLUGIN_ROOT:-$(ship-plugin-root 2>/dev/null || echo "$HOME/.codex/ship")}"
SHIP_SKILL_NAME=handoff source "${SHIP_PLUGIN_ROOT}/scripts/preflight.sh"
```
Commit the related changes, push the branch, create or update the PR, then keep looping until GitHub checks are fully green and the PR is merge-ready.
Do not stop when the PR is created. Do not stop while any GitHub check is pending. If any GitHub check fails, fix the problem, push again, and wait again. If the PR is not merge-ready, sync with base or resolve conflicts inside the same fix loop.
Escalate to the user only for judgment decisions or after retry limits are exhausted.
Done means every condition in Completion is satisfied: the PR exists, checks are green with no relevant pending contexts, the PR is merge-ready with no unresolved conflicts or required branch update, and no actionable review or bot feedback remains.
(Full termination + escalation criteria in "Completion" at the bottom.)
Run this loop:
- Inspect `.github/workflows` and current PR checks so you know what this repo treats as CI/CD.
- Ignore `cancelled` checks unless they block the repo's normal CI/CD path.

Never:

- Force-push without `--force-with-lease`.
- Treat `pending` checks as "good enough".
- Stop while `mergeStateStatus` is still blocked.
- Use `git add -A` when unrelated local changes are present.

Use TodoWrite to track your own progress through the handoff phases.
Create todos at the start based on what the repo actually needs.
Not every repo has a CHANGELOG, CI, or docs to update — only include
items for work that will actually happen.
Principle: one todo per phase the user would wait on. Fix rounds are dynamic — add them only when a check fails.
Example (repo with CHANGELOG and CI):
TodoWrite([
{ content: "Pre-flight (resolve branch and scope)", status: "in_progress", activeForm: "Resolving branch and scope" },
{ content: "Run local verification", status: "pending", activeForm: "Running local verification" },
{ content: "Update CHANGELOG and docs", status: "pending", activeForm: "Updating CHANGELOG and docs" },
{ content: "Push and create PR", status: "pending", activeForm: "Pushing and creating PR" },
{ content: "Wait for GitHub checks", status: "pending", activeForm: "Watching PR checks" }
])
Adaptations (not exhaustive; use judgment):

- When a check fails, add `"Fix round N/3 — <issue summary>"` with status `in_progress`.

Resolve only the context needed to ship the PR:

- Inspect the scope with `git status --short`, `git diff <base>...HEAD --stat`, `git diff --cached --stat`, and `git diff --stat`.
- Do not use `git add -A` unless every dirty file belongs to this handoff.
- If a `task_dir` is already provided, use it. Otherwise do not guess one here; resolve it only if a later phase needs to write artifacts.

Output a short start summary with the branch, base branch, and scope being shipped.
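One way to honor the `git add -A` rule is to compare the dirty paths against the files that belong to this handoff. The following is a minimal sketch, assuming a hypothetical helper `unrelated_dirty_paths` (not part of ship) that reads `git status --short` output on stdin and receives the handoff's own paths as arguments. It handles plain paths only; renames and quoted paths are out of scope.

```bash
# Hypothetical helper (not part of ship): print dirty paths that are NOT in
# the list of handoff-owned paths passed as arguments.
# Input: `git status --short` lines on stdin, e.g. " M src/app.js".
unrelated_dirty_paths() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    # Columns 1-2 hold the status code, column 3 is a space; the path follows.
    path=${line#???}
    owned=no
    for p in "$@"; do
      [ "$p" = "$path" ] && owned=yes
    done
    [ "$owned" = yes ] || printf '%s\n' "$path"
  done
}

# Usage sketch (file names are illustrative):
#   extra=$(git status --short | unrelated_dirty_paths src/auth.js README.md)
#   if [ -z "$extra" ]; then git add -A; else git add src/auth.js README.md; fi
```

If the helper prints anything, stage files explicitly by path instead of using `git add -A`.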
Before the first push in handoff, run the most relevant local verification available for this repo.
Output a short summary of what was run and whether it passed.
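Which command counts as "the most relevant local verification" depends on the repo. As a rough sketch, a helper could guess from marker files. The function name `detect_verify_cmd` and the file-to-command mapping below are illustrative assumptions, not ship's actual detection logic.

```bash
# Hypothetical helper: guess a verification command from marker files in a
# repo directory. The mapping is an illustrative assumption.
detect_verify_cmd() {
  dir=${1:-.}
  if [ -f "$dir/package.json" ]; then
    echo "npm test"
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/go.mod" ]; then
    echo "go test ./..."
  elif [ -f "$dir/Makefile" ]; then
    echo "make test"
  else
    # No known marker file: the caller decides what to run, if anything.
    echo "NO_VERIFY_CMD"
  fi
}
```

A repo's own README or CI workflow remains the better source for the real command; this guess is only a starting point.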
[ -f CHANGELOG.md ] || echo 'NO_CHANGELOG'
If CHANGELOG.md exists:
- Review the commits being shipped with `git log <base>..HEAD --oneline`.
- Add the entry, then commit it with `git add CHANGELOG.md && git commit -m "docs: update CHANGELOG"`.

After changelog handling, check whether the shipped changes also changed any user-facing or repo-facing documentation truths before the first push.
Use references/documentation.md for the documentation decision tree and
ownership rules.
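As an illustration of the changelog step above, the one-line log can be turned into draft bullets. This is a sketch under assumptions: `changelog_bullets` is a hypothetical helper, and the bullet format is a guess at what a repo's CHANGELOG might use.

```bash
# Hypothetical helper: turn `git log --oneline` lines ("<sha> <subject>")
# into Markdown bullets, dropping the abbreviated hash.
changelog_bullets() {
  while read -r sha subject; do
    [ -n "$subject" ] || continue
    printf '%s\n' "- $subject"
  done
}

# Usage sketch:
#   git log <base>..HEAD --oneline | changelog_bullets
```

The output is a draft to edit by hand, since commit subjects rarely match the wording a changelog reader needs.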
When opening or updating the PR, keep the title and body concise.
Include only:
Do not invent a long template if the change is simple.
Push and create:
- Push the branch: `git push -u origin HEAD`.
- If `task_dir` exists, write or update `<task_dir>/handoff.md` with: PR URL, branch, base, verification commands/results, docs outcome, current check summary, current `mergeStateStatus`, and fix-round count. This file is the handoff evidence consumed by the stop gate.

Output: [Handoff] PR created: <url>
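The evidence file could be written with a small helper like the one below. This is a sketch only: `write_handoff_md` is a hypothetical name, and the Markdown layout is an assumption, since the text lists the required fields but not their format.

```bash
# Hypothetical helper: write <task_dir>/handoff.md with the fields listed
# above. The layout is an assumption; the stop gate may expect another shape.
# Args: task_dir pr_url branch base verification docs_outcome checks
#       merge_state fix_rounds
write_handoff_md() {
  task_dir=$1
  mkdir -p "$task_dir"
  cat > "$task_dir/handoff.md" <<EOF
# Handoff

- PR URL: $2
- Branch: $3
- Base: $4
- Verification: $5
- Docs outcome: $6
- Check summary: $7
- mergeStateStatus: $8
- Fix rounds: $9/3
EOF
}
```

Calling it again after each fix round overwrites the file, which keeps the check summary, merge state, and fix-round count current.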
Inspect .github/workflows, branch protection signals, and the current PR
checks once so you understand what this repo expects to run. A repo can
have required checks from GitHub Apps even when it has no local workflow
files, so never skip this phase based on .github/workflows alone.
Arm a Monitor, don't poll. gh pr checks --watch blocks locally and
polls GitHub itself every ~10s — you stay idle until checks terminate. This
replaces the older 30-second agent-side poll loop (which burned ~20
round-trips per 10-minute CI wait) with a single arm + single handle cycle.
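The round-trip figure follows from simple arithmetic: a 10-minute wait polled every 30 seconds. A quick check in shell:

```bash
wait_seconds=$((10 * 60))   # 10-minute CI wait
poll_interval=30            # older agent-side poll interval, in seconds
polls=$((wait_seconds / poll_interval))
echo "agent-side polls: $polls"   # prints "agent-side polls: 20"
```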
Before arming, check whether a Monitor for this PR is already running — on resume after escalation the prior watch may still be alive. If so, wait for its event; do not arm a duplicate.
Arm the watch with persistent: true so it survives across fix rounds:
Monitor(
command: 'gh pr checks --watch; echo "TERMINAL exit=$?"',
description: "PR <number> checks settling",
persistent: true,
timeout_ms: 3600000
)
When a TERMINAL exit=<code> event arrives, pull the authoritative state
once:
# Full snapshot for interpretation
gh pr view --json state,statusCheckRollup,reviews,reviewDecision,mergeable,mergeStateStatus,comments
# Machine-readable check summary
gh pr checks --json name,state,bucket,link,workflow
# Read failing check logs if any. Prefer the failed run URL/check URL from
# the snapshot; use gh run view only after identifying the run id.
gh run view <run-id> --log-failed
Also inspect unresolved review threads when review comments may be
actionable. gh pr view --json comments,reviews is not enough because it
does not reliably expose thread resolution state.
OWNER=$(gh repo view --json owner --jq '.owner.login')
REPO=$(gh repo view --json name --jq '.name')
PR_NUMBER=$(gh pr view --json number --jq '.number')
gh api graphql -f owner="$OWNER" -f repo="$REPO" -F number="$PR_NUMBER" -f query='
query($owner: String!, $repo: String!, $number: Int!) {
repository(owner: $owner, name: $repo) {
pullRequest(number: $number) {
reviewThreads(first: 100) {
nodes {
id
isResolved
comments(first: 20) {
nodes {
id
author { login }
body
path
line
outdated
url
}
}
}
}
}
}
}'
Interpret the snapshot:
- All checks `SUCCESS`, `NEUTRAL`, or intentionally ignored optional `SKIPPED`/`CANCELLED` → check gate green
- Any `FAILURE`, `ERROR`, `ACTION_REQUIRED`, or failed check `bucket` → Phase 6 fix loop
- `mergeStateStatus` is `DIRTY`, `BEHIND`, `BLOCKED`, `DRAFT`, or `UNKNOWN` after one re-query → Phase 6 fix loop or escalation
- `mergeable` is `CONFLICTING` → Phase 6 fix loop
- `CANCELLED` checks → informational only when they are optional and do not block normal CI/CD

Fallback. If no `TERMINAL` event fires within the 1h timeout, `TaskStop` the monitor and escalate as an external GitHub wait, not a code fix failure. Record which checks were still pending at escalation time so the user can investigate on GitHub.
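The interpretation rules above can be sketched as two small classifiers. Both function names are hypothetical, and the handling of a required check that was skipped or cancelled is an assumption, because the rules only describe the optional case.

```bash
# Hypothetical helpers (not part of ship) that encode the snapshot rules.

# Map one check state to a gate decision.
#   $1 = check state, $2 = "yes" if the check is optional for this repo
classify_check() {
  case "$1" in
    SUCCESS|NEUTRAL) echo green ;;
    FAILURE|ERROR|ACTION_REQUIRED) echo fix ;;
    SKIPPED|CANCELLED)
      # Assumption: a required check that was skipped or cancelled is
      # treated as blocking; the rules only cover the optional case.
      if [ "${2:-no}" = yes ]; then echo green; else echo blocking; fi ;;
    # Assumption: any other state means the check has not settled yet.
    *) echo pending ;;
  esac
}

# Map mergeStateStatus to the next step. UNKNOWN should be re-queried once
# before calling this.
classify_merge_state() {
  case "$1" in
    CLEAN|HAS_HOOKS) echo ready ;;
    UNSTABLE) echo ready-if-failures-nonblocking ;;
    *) echo fix-or-escalate ;;
  esac
}
```

For example, `classify_check CANCELLED yes` prints `green`, while `classify_merge_state BEHIND` prints `fix-or-escalate`.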
Re-entering the fix loop. When Phase 6 finishes pushing a fix, re-arm the Monitor (the previous one exited on the prior terminal event) and loop back to the event-wait above.
If CI failures, review comments, or merge conflicts exist, fix them. Max 3 rounds — after that, escalate.
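The retry limit can be pictured as a bounded loop. In this sketch `run_fix_rounds` is a hypothetical driver, and its argument stands in for whatever performs one fix round and reports whether the PR is now green and merge-ready.

```bash
MAX_FIX_ROUNDS=3

# Hypothetical driver: "$1" names a command that attempts one fix round and
# exits 0 once the PR is green and merge-ready. It receives the round number.
run_fix_rounds() {
  attempt_cmd=$1
  round=1
  while [ "$round" -le "$MAX_FIX_ROUNDS" ]; do
    echo "[Handoff] Fix round $round/$MAX_FIX_ROUNDS"
    if "$attempt_cmd" "$round"; then
      echo "resolved in round $round"
      return 0
    fi
    round=$((round + 1))
  done
  echo "ESCALATE: retry limit exhausted"
  return 1
}
```

A non-zero return is the signal to stop fixing and escalate to the user.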
In each fix round:
Re-read the current PR status on GitHub, including checks,
mergeStateStatus, and unresolved review threads.
If checks failed, inspect the failing check logs and fix the smallest real cause.
If review comments are actionable, fix mechanical or correctness issues.
If a comment requires product, security, or architecture judgment, escalate instead of guessing.
If mergeStateStatus reports conflicts, base drift, branch protection
blockage, or a repo policy requires an update from base, sync with base
and resolve it carefully.
Use this strategy:
- Fetch the base first: `git fetch origin <base-branch>`.
- Prefer `git rebase origin/<base-branch>` when it can preserve a clean linear history without disrupting collaborators. This is always appropriate before the branch is pushed.
- After rebasing an already-pushed branch, push with `git push --force-with-lease`, never plain `--force`.
- Otherwise use `git merge --no-ff origin/<base-branch>` (or the repo's equivalent update-branch operation) so the fix can be pushed without rewriting review history.

For an already-pushed PR branch, prove the safety gates before rebasing. Default to not safe if any command fails, returns ambiguous output, or shows collaboration:
BRANCH=$(git branch --show-current)
BASE=$(gh pr view "$BRANCH" --json baseRefName --jq '.baseRefName')
PR_AUTHOR=$(gh pr view "$BRANCH" --json author --jq '.author.login')
ME=$(gh api user --jq '.login')
export PR_AUTHOR ME
# 1. Agent-owned branch naming convention.
case "$BRANCH" in
ship/*|codex/*) echo "agent-owned-name" ;;
*) echo "NOT_SAFE: branch name is not agent-owned"; exit 1 ;;
esac
# 2. No human approvals, change requests, comments, or review threads.
gh pr view "$BRANCH" --json reviews,comments --jq '
[
.reviews[]?.author.login,
.comments[]?.author.login
]
| map(select(. != "github-actions[bot]" and . != "dependabot[bot]"))
| map(select(. != env.PR_AUTHOR and . != env.ME))
| length == 0
' | grep -qx true || { echo "NOT_SAFE: human review/comment signal"; exit 1; }
# gh --jq exits 0 even when the filter prints "false", so grep enforces the gate.
# 3. No unresolved review threads from humans.
OWNER=$(gh repo view --json owner --jq '.owner.login')
REPO=$(gh repo view --json name --jq '.name')
PR_NUMBER=$(gh pr view "$BRANCH" --json number --jq '.number')
gh api graphql -f owner="$OWNER" -f repo="$REPO" -F number="$PR_NUMBER" -f query='
query($owner: String!, $repo: String!, $number: Int!) {
repository(owner: $owner, name: $repo) {
pullRequest(number: $number) {
reviewThreads(first: 100) {
nodes {
isResolved
comments(first: 20) {
nodes { author { login } }
}
}
}
}
}
}' --jq '
[
.data.repository.pullRequest.reviewThreads.nodes[]?
| select(.isResolved == false)
| .comments.nodes[]?.author.login
]
| map(select(. != "github-actions[bot]" and . != "dependabot[bot]"))
| map(select(. != env.PR_AUTHOR and . != env.ME))
| length == 0
' | grep -qx true || { echo "NOT_SAFE: unresolved human review thread"; exit 1; }
# gh --jq exits 0 even when the filter prints "false", so grep enforces the gate.
# 4. No other commit authors on this PR branch.
git fetch origin "$BASE"
MY_EMAIL=$(git config user.email)
UNEXPECTED_AUTHORS=$(git log --format='%ae' "origin/$BASE..HEAD" | \
sort -u | grep -vxF "$MY_EMAIL" || true)
[ -z "$UNEXPECTED_AUTHORS" ] || {
echo "NOT_SAFE: unexpected commit authors: $UNEXPECTED_AUTHORS"
exit 1
}
# 5. Repo appears to prefer/require linear history.
OWNER=$(gh repo view --json owner --jq '.owner.login')
REPO=$(gh repo view --json name --jq '.name')
gh api "repos/$OWNER/$REPO" --jq '
(.allow_rebase_merge == true or .allow_squash_merge == true)
and (.allow_merge_commit == false)
' | grep -qx true || { echo "NOT_SAFE: repo does not clearly require linear history"; exit 1; }
# gh --jq exits 0 even when the filter prints "false", so grep enforces the gate.
Only when all gates are proven safe may the agent run:
git rebase "origin/$BASE"
<relevant local verification command>
git push --force-with-lease
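The sync strategy as a whole reduces to a small decision, sketched here with a hypothetical helper `choose_sync_strategy`. The two inputs are assumed to be computed elsewhere: whether the branch is already pushed, and whether every safety gate passed.

```bash
# Hypothetical helper: pick the base-sync operation.
#   $1 = "yes" if the branch is already pushed
#   $2 = "yes" if every safety gate passed (defaults to "no", i.e. not safe)
choose_sync_strategy() {
  pushed=$1
  gates_safe=${2:-no}
  if [ "$pushed" != yes ]; then
    # Unpushed branch: rebasing rewrites nothing anyone else has seen.
    echo rebase
  elif [ "$gates_safe" = yes ]; then
    # Pushed but provably agent-owned: rebase, then push with lease.
    echo rebase-force-with-lease
  else
    # Pushed and not provably safe: merge so review history is preserved.
    echo merge
  fi
}
```

Defaulting the second argument to `no` mirrors the "default to not safe" rule: a missing or ambiguous gate result falls through to the merge path.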
Do not resolve conflicts mechanically with --ours or --theirs unless
one side is clearly disposable.
Read both sides of the conflict and preserve the behavior this PR is trying to ship. If both sides contain valid changes, merge them.
If you cannot resolve the conflict confidently, escalate instead of guessing.
After any code change, run the relevant local verification again.
Commit the fix and push it.
Update <task_dir>/handoff.md if task_dir exists.
If the push fully addresses GitHub feedback, mark the addressed feedback as resolved:
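- Resolve review threads that the push fully addressed.
- Minimize obsolete bot or workflow comments with classifier `RESOLVED`.

Never resolve, hide, or minimize feedback that is only partially addressed or still needs user judgment.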
Go back to Phase 5.
Use GitHub GraphQL when needed:
# Resolve a PR review thread
gh api graphql -f query='
mutation($threadId: ID!) {
resolveReviewThread(input: {threadId: $threadId}) {
thread { id isResolved }
}
}' -F threadId="<thread-id>"
# Hide/minimize an obsolete bot or workflow comment as resolved
gh api graphql -f query='
mutation($subjectId: ID!) {
minimizeComment(input: {subjectId: $subjectId, classifier: RESOLVED}) {
minimizedComment { isMinimized }
}
}' -F subjectId="<comment-node-id>"
Output: [Handoff] Fix round <i>/3 — <what was fixed>. Tests pass. Re-checking CI...
Output the report card (read skills/shared/report-card.md for the standard format):
## [Handoff] Report Card
| Field | Value |
|-------|-------|
| Status | <DONE / BLOCKED> |
| Summary | PR #<N> — checks <green / pending / failed>, merge <ready / blocked> |
### Metrics
| Metric | Value |
|--------|-------|
| PR URL | <url> |
| Check status | <green / N passing, M failed> |
| Merge state | <mergeStateStatus> |
| Fix rounds | <N>/3 |
| Docs outcome | <updated / checked-no-update / debt-noted> |
### Artifacts
| File | Purpose |
|------|---------|
| PR on GitHub | Shipped code |
| .ship/tasks/<task_id>/handoff.md | PR URL, checks, merge state, verification, docs outcome |
| CHANGELOG.md | Updated changelog (if repo has one) |
Condensed to show the loop shape. The full log would include the same verify/commit/push pattern after every fix round.
[Handoff] Start — branch feat/auth, base main, 4 files + 2 doc edits
[Handoff] Verify → npm test, npm run lint: PASS
[Handoff] CHANGELOG entry added, README updated
[Handoff] Push, PR created: https://github.com/org/repo/pull/123
[Handoff] Wait → ci/test FAILURE
[Handoff] Fix round 1/3 — added nil guard, re-verify PASS, push
[Handoff] Wait → AI review: requested error-path coverage
[Handoff] Fix round 2/3 — added error-path test, re-verify PASS, push
resolved review thread, minimized obsolete bot comment
[Handoff] Wait → all checks green
[Handoff] Merge state → CLEAN
[Handoff] DONE — PR #123 green and merge-ready
Key invariants the example preserves:
Done when:
- `mergeStateStatus` is merge-ready (`CLEAN`, `HAS_HOOKS`, or `UNSTABLE` only when all failing checks are irrelevant/non-blocking)
- `mergeable` is not `CONFLICTING`, and there are no unresolved merge conflicts in the local worktree

Escalate when: