From digital-innovation-agents
Reverse-engineers existing codebases into V-Model artifacts (plan-context, ADRs, arc42, FEATURE inventory, backlog). Produces evidence-based documentation sourced from code and docs.
Slash command: /digital-innovation-agents:reverse-engineering
RE bootstraps the entire backlog in one run, so it is the exception
to the per-item branching rule. All RE artefacts land on
feature/reverse-engineer-<repo-name>. Per-item branches kick in
AFTER RE merges.
Branch check at start:
- main / master / dev: refuse; AskUserQuestion to create
  the RE branch and switch.
RE does not create per-item GitHub issues or per-item phase tags
during Phase 0-7. After RE, /dia-guide runs the one-shot pass
that creates issues and tags <item-id>/reverse-engineered.
State in .git/dia-active-skill. Full rules:
skills/project-conventions/references/team-workflow.md and
branch-protection.md.
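A minimal sketch of the branch check, assuming the current branch name and the repository name are already known. The helper names (`re_branch_name`, `check_branch`) and the `"other"` outcome for branches that are neither protected nor the RE branch are illustrative assumptions, not part of the skill.

```python
# Branches on which RE must refuse to write (from the branch check rule).
PROTECTED_BRANCHES = {"main", "master", "dev"}


def re_branch_name(repo_name: str) -> str:
    """Build the single RE branch name for a repository."""
    return f"feature/reverse-engineer-{repo_name}"


def check_branch(current: str, repo_name: str) -> str:
    """Classify the current branch.

    Returns "refuse" on a protected branch, "ok" on the RE branch,
    and "other" (hypothetical outcome) for anything else.
    """
    if current in PROTECTED_BRANCHES:
        return "refuse"
    if current == re_branch_name(repo_name):
        return "ok"
    return "other"
```

On `"refuse"`, the skill would ask the user before creating the RE branch and switching to it.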
Reverse engineering scans existing code to produce artifacts. Every artifact this skill creates lands in one of these categories:
The skill assigns each artifact to its category before writing.
Frontmatter feature: and epic: are mandatory for FIX and IMP.
Every artifact this skill creates also lands as a backlog row in
_devprocess/context/BACKLOG.md. Status, phase, last-change, and
claim live in the row, NOT in the artifact frontmatter.
Defaults for reverse-engineered artifacts. The BACKLOG
Status column uses the GitHub-aligned vocabulary
(Backlog | Ready | In Progress | In Review | Done). The ADR
frontmatter and the BA frontmatter carry their own status fields.
| Item | BACKLOG Status default | Frontmatter status | BACKLOG Phase |
|---|---|---|---|
| Feature observed in code (shipped) | Done | (none) | Released |
| Feature observed in code (partial) | In Progress | (none) | Building |
| ADR inferred | In Progress | Proposed (ADR / MADR) | Building |
| BA draft | Backlog | Draft (Reverse-Engineered) | (n/a) |
Reverse-engineered features marked Done go straight to phase
Released so the post-RE BA validation walk picks them up
correctly.
Sync chain (binding order):
/consistency-check mode A at the end of the skill phase
Wayfinder (templates under skills/architecture/templates/):
- src/ARCHITECTURE.map: one row per entry-point file.
- src/ directory with more than 3 source
  files or any cross-module API.
Rules layer at _devprocess/rules/ (hard cap 500 lines total):
- technical.md: stack (from manifests), build commands, test setup,
  conventions visible in 10+ files.
- design.md (if UI surface exists): tokens, component patterns.
- domain.md: glossary from class/module names, invariants in code.
You ingest an existing codebase and produce the V-Model artifacts that should have existed from day one, so the team gets a stable, shared project context. You walk the V backwards, from Coding up through Architecture, Requirements, and Business Analysis, and fill each level only with what can be proven from the code or from existing documentation.
The result is not a product. It is a foundation: a set of artifacts every team member can trust, ready to be validated and carried forward through the normal V-Model phases.
Writing style. See skills/project-conventions/SKILL.md#canonical-specs (Writing style). Applies to every artifact this skill produces.
Three-layer model and frontmatter spec. See skills/project-conventions/SKILL.md#canonical-specs (Three-layer model boundaries, Frontmatter spec). Status, phase, last-change, and claim live in the backlog row, not in artifact frontmatter.
Work outside a Feature is either FIX-{ee}-{ff}-{nn} (bug) at
_devprocess/requirements/fixes/ or IMP-{ee}-{ff}-{nn} (other) at
_devprocess/requirements/improvements/. Both require frontmatter
feature: and epic:. Frontmatter spec: see canonical specs link
above.
Any artifact MAY carry depends-on: [ID, ID, ...] in frontmatter.
The resulting graph is acyclic; targets must be existing IDs.
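The two constraints on depends-on (existing targets, no cycles) can be checked with a depth-first walk. This is a sketch under the assumption that the frontmatter has already been parsed into a dict mapping each artifact ID to its depends-on list; the function name and error strings are hypothetical.

```python
def validate_depends_on(graph: dict[str, list[str]]) -> list[str]:
    """Return a list of problems: unknown targets and dependency cycles."""
    errors: list[str] = []

    # Rule 1: every target must be an existing ID.
    for item, targets in graph.items():
        for target in targets:
            if target not in graph:
                errors.append(f"{item}: unknown target {target}")

    # Rule 2: the graph must be acyclic (three-colour depth-first search).
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node: str, path: list[str]) -> None:
        colour[node] = GREY
        for target in graph[node]:
            if target not in graph:
                continue  # already reported above
            if colour[target] == GREY:
                # A grey node is on the current path, so this edge closes a cycle.
                cycle = path[path.index(target):] + [target]
                errors.append("cycle: " + " -> ".join(cycle))
            elif colour[target] == WHITE:
                visit(target, path + [target])
        colour[node] = BLACK

    for node in graph:
        if colour[node] == WHITE:
            visit(node, [node])
    return errors
```

An empty result means the depends-on graph satisfies both constraints.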
Epic hypothesis statements and How-Might-We headings are full prose
paragraphs in the user's working language, not leftover template
placeholders (FOR, WHO, THE, IS A, THAT, UNLIKE,
OUR SOLUTION). The persona / problem / solution / differentiation
structure stays in the substance.
Backward walk, evidence only. Code tells you what exists. It does
not tell you whether it solves the right problem. You do not invent
personas, HMW questions, or value propositions from endpoint names or
directory layouts. If a claim is not backed by a concrete source
(path:line for code, doc:section for documentation), it becomes a
[NEEDS USER INPUT] placeholder instead of a guess.
Draft, not ground truth. Everything this skill produces is marked
as draft / observed / inferred / snapshot. The next skill (/business-analysis)
validates each claim with the user and promotes the status to
Validated or Accepted one section at a time.
Forward again from the validated state. After reverse engineering,
the user goes through /business-analysis → /requirements-engineering
→ /architecture (if refactoring) → /coding. The reverse-engineered
artifacts become the Phase 0 state for that forward walk.
- _devprocess/requirements/handoff/plan-context.md. Tech stack and
  codebase snapshot, ready for /coding.
- _devprocess/architecture/ADR-{XXX}-{slug}.md. One per observable
  architecture decision, Status: Inferred from codebase.
- _devprocess/architecture/arc42.md. Structural snapshot,
  Status: Reverse-engineered snapshot.
- _devprocess/requirements/epics/EPIC-{nn}-{slug}.md. One or more
  anticipated Epics grouping observed capabilities by theme
  (Status: Anticipated (not yet validated)). /business-analysis
  and /requirements-engineering later refine, split, merge, or
  rename.
- _devprocess/requirements/features/FEAT-{ee}-{ff}-{slug}.md. One
  per observable user-facing capability
  (Status: Observed (not validated)), nested under its Epic.
- _devprocess/analysis/BA-{PROJECT}.md. Project-BA draft only
  (singleton). Item-BAs are created by /business-analysis.
- _devprocess/context/BACKLOG.md (TODOs,
  FIXMEs, gaps, tech debt, undocumented dependencies).
- /business-analysis and /requirements-engineering.
- /requirements-engineering fills the rest after BA validation).
These rules are non-negotiable. Every artifact this skill writes must comply with them, and the Quality Gates at the end check that they were followed.
Source: line. Trivial restatements of an
already-cited fact do not need their own Source line. The BA draft
(Phase 4) is the exception: every non-placeholder sentence there
still carries Source:. Format:
- Source: src/api/auth/handlers.ts:42-58
- Source: README.md § "Getting Started"
- Source: package.json "dependencies.prisma"
[NEEDS USER INPUT. No evidence found in {searched sources}.
/business-analysis will fill this in.]
You do not write a "reasonable assumption" in its place.
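The three Source: shapes shown above (code path with line range, document section, manifest key) are regular enough to check mechanically. The sketch below is an assumption about how such a check could look; the regular expressions and the labels `code`, `doc`, and `manifest-key` are hypothetical and only cover the three example formats.

```python
import re
from typing import Optional

# path:line or path:start-end, e.g. src/api/auth/handlers.ts:42-58
CODE_REF = re.compile(r"^Source: (?P<path>\S+):(?P<start>\d+)(?:-(?P<end>\d+))?$")
# document plus quoted section, e.g. README.md § "Getting Started"
DOC_REF = re.compile(r'^Source: (?P<doc>\S+) § "(?P<section>[^"]+)"$')
# file plus quoted key, e.g. package.json "dependencies.prisma"
KEY_REF = re.compile(r'^Source: (?P<file>\S+) "(?P<key>[^"]+)"$')


def classify_source(line: str) -> Optional[str]:
    """Return the kind of Source: line, or None if it matches no known shape."""
    line = line.strip()
    if CODE_REF.match(line):
        return "code"
    if DOC_REF.match(line):
        return "doc"
    if KEY_REF.match(line):
        return "manifest-key"
    return None
```

A `None` result would flag a claim whose evidence line is malformed and should be reviewed or replaced by a placeholder.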
No persona from code structure. You never infer personas from route names, directory names, or endpoint signatures. Endpoints are technical facts, not user research. Personas come only from explicit statements in documentation (README, marketing copy, docs/, CHANGELOG). If docs mention no user types, the persona section is a placeholder.
No HMW question without an explicit problem statement. If the existing documentation nowhere states the problem the product solves, the HMW section is a placeholder.
Provenance marker on every file. Each file carries a
source: /reverse-engineering on {date} marker in its frontmatter
(BA/ADR additionally keep their own status field; the BACKLOG row
owns lifecycle status per the three-layer model).
One decision per ADR (with tight-coupling exception). Default to one decision per ADR. Tightly coupled choices that share the same Context and Consequences MAY be combined into one ADR; keep them split when Context or Consequences diverge.
You walk backwards through the V, one phase at a time. Each phase produces one or more artifacts before you move up to the next.
Probe the project for existing workflow residues before any scan. Greenfield projects skip this phase.
Signals to check:
- docs/ or _devprocess/
- FEATURE-NNNN (4-digit), EPIC-NNN/ADR-NNN
  (3-digit), status:/phase: in YAML, > **Status**: ... lines,
  _devprocess/context/fixes/, _devprocess/context/20_bugs.md,
  numeric-prefixed 10_backlog.md, any archive/ folder
If any signal hits, stop and ask the user via AskUserQuestion:
"Existing workflow artifacts under {paths}. Proceed how? (a) normalize to current DIA conventions first (Phase -1.5 runs the migration scripts), then reverse-engineer gaps; (b) keep untouched, produce new artifacts alongside (flagged as separate source); (c) replace with reverse-engineered versions (destructive)."
Recommend (a) for DIA v1 patterns, (b) for non-DIA workflows worth preserving, (c) only when the user confirms existing artefacts are obsolete.
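The text-level signals from the list above (legacy ID widths, status:/phase: keys, body status lines) can be probed with a few regular expressions. This is a rough sketch, not the detection the skill or detect_state.py actually performs; the signal names are hypothetical, and a hit only means the user should be asked how to proceed.

```python
import re

# Hypothetical signal names mapped to patterns taken from the signal list.
SIGNALS = {
    "feature-4-digit-id": re.compile(r"\bFEATURE-\d{4}\b"),
    "epic-or-adr-3-digit-id": re.compile(r"\b(?:EPIC|ADR)-\d{3}\b"),
    "yaml-status-or-phase": re.compile(r"^(?:status|phase):", re.MULTILINE),
    "body-status-line": re.compile(r"^> \*\*Status\*\*:", re.MULTILINE),
}


def detect_signals(text: str) -> list[str]:
    """Return the sorted names of all legacy-workflow signals found in text."""
    return sorted(name for name, pattern in SIGNALS.items() if pattern.search(text))
```

Path-based signals (an archive/ folder, 10_backlog.md, and so on) are not covered here; they would need a directory listing rather than file content.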
Runs only if Phase -1 chose option (a). Shares the canonical
migration mechanics with /dia-migration (which confirms phase by
phase; RE runs them as one consolidated pass).
Sequence (each script idempotent):
1. tools/migration/detect_state.py -- inventory v1/v2/mixed signals.
2. strip_frontmatter_status.py -- pull status: / phase: /
   last_updated: out of YAML.
3. strip_body_status.py -- pull > **Status**: ... lines.
4. migrate_naming.py -- rename ID schemas, rewrite cross-refs.
5. flatten_analysis.py -- collapse analysis/ to the four
   canonical prefixes (BA, EXPLORE, RESEARCH, AUDIT).
6. build_backlog.py -- regenerate BACKLOG.md.
7. migrate_skill_names.py -- rewrite legacy skill names in
   CLAUDE.md / README / inline scripts.
Numbering collisions. If two ADR series coexist, the series with the higher count of external references in code/commits/backlog wins; renumber the smaller series with a note in the renumbered ADR header.
Dedup. Two files describing the same topic: merge under the newer structure and add a "Previous variants" note. No silent deletes.
Ask the user which scope applies, same tiers as /business-analysis:
What is the scope of this reverse-engineering run?
A) Simple Test / single-feature onboarding
-> Scan the affected module, produce minimal artifacts
-> Timeframe: 30-60 min
B) Proof of Concept / small repo
-> Full tech-stack extraction, 3-8 ADRs, 5-15 features, BA draft
-> Timeframe: 1-3 h
C) Minimum Viable Product / full project onboarding
-> Full arc42 snapshot, 8+ ADRs, 15+ features, full BA draft,
complete backlog seed
-> Timeframe: 3-8 h
Then scan the codebase structure and list:
- package.json, pyproject.toml,
  Cargo.toml, go.mod, pom.xml, Gemfile)
- main.*, app.*, index.*, src/index.*)
- .github/workflows/*, .gitlab-ci.yml, etc.)
- README.md, docs/, CHANGELOG.md,
  CONTRIBUTING.md, ARCHITECTURE.md)
Report this as a Codebase Map before proceeding. This is the inventory you will draw sources from for the rest of the walk.
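One way to assemble the Codebase Map is to bucket repository paths by the glob patterns listed above. The sketch assumes the paths are already collected as repository-relative strings; the category labels (`manifests`, `entry_points`, `ci`, `docs`) and the function name are illustrative assumptions.

```python
from fnmatch import fnmatchcase

# Category labels are assumptions; the patterns follow the scan list above.
PATTERNS = {
    "manifests": ["package.json", "pyproject.toml", "Cargo.toml",
                  "go.mod", "pom.xml", "Gemfile"],
    "entry_points": ["main.*", "app.*", "index.*", "src/index.*"],
    "ci": [".github/workflows/*", ".gitlab-ci.yml"],
    "docs": ["README.md", "docs/*", "CHANGELOG.md",
             "CONTRIBUTING.md", "ARCHITECTURE.md"],
}


def build_codebase_map(paths: list[str]) -> dict[str, list[str]]:
    """Group repository-relative paths into the Codebase Map categories."""
    result: dict[str, list[str]] = {category: [] for category in PATTERNS}
    for path in paths:
        for category, globs in PATTERNS.items():
            if any(fnmatchcase(path, glob) for glob in globs):
                result[category].append(path)
                break  # first matching category wins
    return result
```

Paths that match no pattern are left out of the map; they are ordinary source files rather than inventory anchors.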
Extract the concrete tech stack from the manifests and entry points.
One Sources: line per Tech Stack block is sufficient (not per row):
## Tech Stack
- **Runtime:** Node.js >=20
- **Language:** TypeScript 5.4
- **Framework:** Next.js 14 App Router
- **Database:** PostgreSQL via Prisma
- **Auth:** NextAuth 5.x
- **Testing:** Vitest + Playwright
Sources: package.json, tsconfig.json, prisma/schema.prisma, vitest.config.ts, e2e/
Write the result into _devprocess/requirements/handoff/plan-context.md
using the same structure the /architecture skill produces, with the
header:
---
status: Snapshot from existing code
source: /reverse-engineering on {date}
---
The Codebase Layout, Conventions, and Existing Patterns sections
of plan-context.md are filled from the scan in Phase 0.
Walk through the codebase and identify decisions that are visible and consequential. For each, write one ADR in MADR format with:
- Status: Inferred from codebase in the frontmatter
- Context: what you see in the code that implies this decision was
  made (with source)
- Decision: the observable choice
- Alternatives considered: leave as [NEEDS USER INPUT, not visible in code] unless the alternatives are mentioned in a comment or doc
- Consequences: only the ones you can see (e.g. lock-in, operational
  implications that are visible in CI config)
- Source: footer with all files/lines that support the decision
When to write an ADR. Only when the decision is consequential AND non-obvious from framework defaults. Skip the rest.
Write ADRs to _devprocess/architecture/ADR-{XXX}-{slug}.md, numbered
in the order you discovered them.
Then produce _devprocess/architecture/arc42.md as a snapshot.
Fill only the sections you can back with sources:
Header of arc42:
---
status: Reverse-engineered snapshot
source: /reverse-engineering on {date}
---
Identify observable user-facing capabilities. A feature is anything the system lets a user (or an API consumer) do. Sources:
- describe('user can ...'), it('admin should ...'))
Step 3a: Anticipated Epics. Before writing FEATURE files, group
the observable capabilities into 1-N thematic clusters (e.g. by
domain, module, user group). For each cluster, write an Epic
placeholder at _devprocess/requirements/epics/EPIC-{nn}-{slug}.md
from EPIC-TEMPLATE.md with:
---
status: Anticipated (not yet validated)
source: /reverse-engineering on {date}
needs-validation: true
---
# EPIC-{nn}: {thematic name, e.g. "User and access management"}
> **Status**: Anticipated. Derived from observed capabilities,
> not from a validated business motivation. `/business-analysis`
> refines or replaces the Hypothesis Statement and outcomes.
## Anticipated Scope
{1-2 sentences: which observed capabilities this epic groups, and why}
## Evidence
- {module or directory, short description}
- {route or API surface}
- {test file that describes this capability cluster}
When no obvious clusters exist, create a single catch-all
EPIC-01-observed-capabilities.md. Split later.
Step 3b: FEATURE files. For each observable capability, write
_devprocess/requirements/features/FEAT-{ee}-{ff}-{slug}.md
using the existing FEATURE-TEMPLATE.md but with reduced scope.
{ee} is the 2-digit number of the anticipated Epic the feature
belongs to, {ff} is the local counter inside that Epic.
---
status: Observed (not validated)
source: /reverse-engineering on {date}
---
# FEAT-{ee}-{ff}: {short name}
## Feature Description
{What the code does, in 2-3 sentences.}
Source: {file paths and line ranges that implement this feature}
## Benefits Hypothesis
[NEEDS USER INPUT. /requirements-engineering will define this
after /business-analysis has validated the WHY.]
## User Stories
[NEEDS USER INPUT]
## Success Criteria
[NEEDS USER INPUT]
## Technical NFRs
{Any non-functional constraints visible in code: rate limits, timeout
settings, retry policies, auth requirements.}
Source: {config or middleware locations}
Keep FEATURE names short and capability-focused ("User login", "Project export", "Admin user management"). Do not lump multiple capabilities into one feature.
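A small sketch of how a FEATURE file path could be derived from the Epic number, the local counter, and the short name. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the two-digit width of the local counter are assumptions; the skill text only fixes the FEAT-{ee}-{ff}-{slug}.md pattern.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs to single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def feature_path(epic: int, feat: int, name: str) -> str:
    """Build the FEATURE file path for an Epic number and local counter."""
    return (
        "_devprocess/requirements/features/"
        f"FEAT-{epic:02d}-{feat:02d}-{slugify(name)}.md"
    )
```

For example, the third feature of Epic 1 named "User login" would map to FEAT-01-03-user-login.md under this rule.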
Step 3c: Observable Success Criteria. Write one SC per observable capability with three columns:
- [AWAITING BA] unless the code itself declares a
  deterministic target (timeout constants, rate limits, perf
  assertions); then the observed target goes in with Source:.
Example table:
| ID | Criterion (observable) | Target | Measurement |
| ----- | ----------------------- | ------------------ | -------------------------- |
| SC-01 | User can reopen a conversation | [AWAITING BA] | Pilot interview |
| SC-02 | Startup aborts if the sandbox is not ready after 30s | 30s (Source: src/main/index.ts:1088) | Integration test |
This satisfies invariant N-4 (every feature has at least one SC).
/business-analysis later fills [AWAITING BA] with validated
business targets.
This is the most constrained phase. Read:
- README.md for intro, use cases, motivation
- docs/ or documentation/ content
- package.json / pyproject.toml description, keywords, author
- CHANGELOG.md for historical goals and removed features
Build _devprocess/analysis/BA-{PROJECT}.md from the BA-TEMPLATE.md
but with every section following the evidence rule:
---
status: Draft (reverse-engineered, awaiting validation in /business-analysis)
created-by: /reverse-engineering
needs-validation: true
---
For each section of the BA template:
Every non-placeholder sentence carries a Source: line.
When you finish, count:
- filled-from-sources: how many sections are evidence-backed
- needs-user-input: how many sections are placeholders
Include both counts in the BA header so /business-analysis knows
how much work remains.
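The two counts can be derived from the draft itself. The sketch below assumes that BA sections are level-2 markdown headings and that a section counts as a placeholder as soon as it contains a [NEEDS USER INPUT marker; both are simplifying assumptions, and the function name is hypothetical.

```python
import re


def count_sections(markdown: str) -> dict[str, int]:
    """Count evidence-backed versus placeholder sections in a BA draft."""
    # Split on level-2 headings; the chunk before the first heading is dropped.
    sections = re.split(r"^## .*$", markdown, flags=re.MULTILINE)[1:]
    needs_input = sum(1 for body in sections if "[NEEDS USER INPUT" in body)
    return {
        "filled-from-sources": len(sections) - needs_input,
        "needs-user-input": needs_input,
    }
```

A section that mixes sourced sentences with a placeholder is counted as needs-user-input here, which errs on the side of more validation work.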
Scan for:
- TODO, FIXME, HACK, XXX comments in code
- .skip, xit, pytest.mark.skip)
- .env.example or README)
Append each finding as a row to _devprocess/context/BACKLOG.md
following the binding format in
skills/requirements-engineering/templates/BACKLOG-TEMPLATE.md.
Reverse-engineered findings go into the Standalone Items section
(no Epic yet, to be reassigned during BA/RE) with:
- Status = Backlog
- Prio = P2 (default, the team reprioritises during BA/RE)
- Source = REV
- Evidence = path:line or short description
- Typ = Chore (or Security for audit findings, Bug-Followup for
  failing or skipped tests)
- Notes carries needs verification: code-vs-doc for every REV
  finding. Phase 7 clears this marker (it sets the finding to Done
  if the target turns out to be already satisfied, or removes the
  marker once it has confirmed the gap is real).
- Title column = bare title only. The ID lives in column 1;
  flow.py builds the GitHub issue title as <id>: <title>, so a
  prefix in the Title cell duplicates the ID.
Verify before filing. Read the code AND the doc the finding
points at; drop the finding if the target is already satisfied
(timeout table already in arc42 §6, CI scan step already exists,
env var already in .env.example / README). Survivors still get
needs verification: code-vs-doc so Phase 7 re-checks them.
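As an illustration of the comment scan, the sketch below turns TODO/FIXME/HACK/XXX markers in one file into backlog row candidates with the defaults named above. The dict keys mirror the column names from the text; the regular expression and the function name are assumptions, and the binding row layout remains the one in BACKLOG-TEMPLATE.md.

```python
import re

# Marker keyword followed by optional colon/whitespace and the comment text.
MARKER = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b[:\s]*(.*)")


def scan_markers(path: str, source: str) -> list[dict[str, str]]:
    """Return one backlog row candidate per marker comment found in source."""
    rows = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            rows.append({
                # Bare title only; fall back to the marker if no text follows.
                "Title": match.group(2).strip() or match.group(1),
                "Status": "Backlog",
                "Prio": "P2",
                "Source": "REV",
                "Evidence": f"{path}:{lineno}",
                "Typ": "Chore",
                "Notes": "needs verification: code-vs-doc",
            })
    return rows
```

Each candidate still has to pass the verify-before-filing step before it is appended to the backlog.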
If this skill seeds the backlog file, copy the template headers (Dashboard, Legende, Standalone Items, Traceability) first and update dashboard counts after all rows are written.
Phase-Schema for the backlog. Phase is orthogonal to Status:
- Released - fully implemented; all SCs traceable in code.
  Partial implementation belongs in Building, not Released.
- Building - in progress or ready to start; scope clear.
- Planned - anticipated, needs refinement (each Candidates item
  carries needs refinement: {reason} in Notes).
Reverse-engineered items default to Phase = Building (code exists,
awaiting validation). Phase 7 promotes to Released or demotes to
Planned based on code evidence.
Before the Handoff Ritual runs, every FEATURE-spec and every ADR from Phases 2-3 gets an explicit verification against the codebase. This is the gate that lifts claims from "we wrote it down" to "we checked it compiles with reality."
Mechanism. For each FEATURE-spec and each ADR, decide the verification footer based on outcome:
Codebase verification {date}: Released, no drift.
## Codebase verification ({date})
**Phase:** {Released | Building | Planned | Candidates}
**Refinement need:** {none | reason if Candidates or Planned}
**Verification findings:**
- Source paths checked: {n/m exist}
- Success-criteria sample (features) or core decision (ADRs):
{n/m confirmed}
- Drift findings: {"Doc: X / Code: Y / Assessment: ..."}
**Backlog proposal:** {none | concrete FIX/IMP text}
Parallelisation. For large projects (20+ FEATUREs, 30+ ADRs), split verification into 3-6 concurrent agents with non-overlapping file slices. Each agent verifies its slice and writes the verification section directly. Consolidate Phase counts into the Backlog Dashboard at the end.
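Non-overlapping slices can be produced with a simple round-robin split. This is only one possible partitioning, chosen here for brevity; the function name is hypothetical, and grouping by directory may suit some codebases better.

```python
def split_slices(items: list[str], agents: int) -> list[list[str]]:
    """Distribute items round-robin into `agents` non-overlapping slices."""
    if agents < 1:
        raise ValueError("agents must be at least 1")
    return [items[i::agents] for i in range(agents)]
```

Every item lands in exactly one slice, so no two agents verify the same file.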
Backlog drift items. Every drift finding that cannot be fixed
with a one-line doc edit becomes a new Backlog entry. Common drift:
outdated paths/line numbers, SCs marked AWAITING RE, UI disabled
in code but active in doc, ADR describes X / code implements Y, BA
says "separate" / code shows full implementation.
Verify the Phase-5 findings too. For each Standalone row with
Source = REV (carrying needs verification: code-vs-doc):
- Status = Done, Phase = Released,
  remove the marker, add verified {date}: already present in <ref>.
- Status = Backlog.
- needs refinement: {reason},
  escalate via User Interaction Protocol.
Do not let a finding reach GitHub while it still carries
needs verification: code-vs-doc; the marker signals Phase 7 has
not run.
Phase 7 asks "does this feature match the code?". Phase 8 asks "is the
artefact graph as a whole consistent?". Run
/consistency-check mode A (syntactic, cheap). Output:
- Source = CONSISTENCY-CHECK).
Run /consistency-check --deep only at MVP scope with a valid BA
(checks Feature-ADR coherence and BA-Feature anchors).
Precondition: Phases 0-7 must be done. Running Phase 8 early gives false gaps.
RE allocates fresh ids; existing unmerged branches may collide. This phase enumerates them and reports renumber needs without modifying other branches.
Steps:
git for-each-ref --format='%(refname:short)' refs/heads/ \
| grep -Ev '^(main|master|dev|<re-branch>)$'
python3 tools/renumber-for-merge.py \
--target <re-branch> --source-ref "$B" --list-conflicts
_devprocess/context/HANDOFFS.md:
## reverse-engineering {YYYY-MM-DD} -- parallel-branch alignment
Branches with id collisions:
- feature/foo: epic 1, feat 2, fix 0, imp 0
To align: bash scripts/merge-to-dev.sh <branch> <re-branch>
If no parallel branches exist, print "No parallel branches found" and proceed.
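The branch enumeration above filters out the protected branches and the RE branch itself. The same filter, sketched in Python under the assumption that the local branch names have already been read from git; the function name is hypothetical.

```python
def parallel_branches(branches: list[str], re_branch: str) -> list[str]:
    """Return local branches that may collide with the RE branch's fresh ids."""
    # Same exact-match exclusion set as the grep -Ev pattern.
    excluded = {"main", "master", "dev", re_branch}
    return [branch for branch in branches if branch not in excluded]
```

An empty result corresponds to the "No parallel branches found" case.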
Standard 4-part pattern: artifact report, handoff context, phase-end commit, transition question.
Part 1: Artifact report -- counts per artifact type (plan-context, ADRs, arc42 sections filled, FEATURE specs, BA draft with filled/placeholder counts, new backlog entries) plus sources walked (files scanned / docs read).
Part 2: HANDOFFS.md entry -- scope (Simple/PoC/MVP), what was
reverse-engineered, evidence coverage, risks/gaps, recommended next
phase (always /business-analysis).
Part 3: Phase-end commit -- per
skills/project-conventions/references/team-workflow.md
("Phase-end commit (binding)"). Stages every produced artefact,
commits, sets the phase tag, opens a draft PR. RE uses the single
branch feature/reverse-engineer-<repo-name>. Canonical message:
chore(reverse): <repo-name> reverse-engineering complete
<one-line summary: N FEATUREs, M ADRs, BA draft, K backlog entries>
Refs: <repo-name>
After the commit:
python3 tools/github-integration/flow.py tag-phase --item <repo-name> --phase reverse.
Skip silently if working tree is clean.
Part 4: Transition question
"Technical context is captured. I also built an evidence-based BA draft, but it is not validated. {N} sections are marked
[NEEDS USER INPUT]. Next step: /business-analysis. Start now, or review the reverse-engineered artifacts first?"
On agreement or when running inside /dia-guide: start
/business-analysis (Validation Mode auto-detects the draft BA).
On rejection: pause; artifacts stay in _devprocess/.
Before the Handoff Ritual, verify:
- [NEEDS USER INPUT].
Fix any failed gate before running the Handoff Ritual.
Match depth to scope. Do not over-produce for a small target; do not under-produce for a full onboarding.
Follows /project-conventions. Detect the root before writing:
- docs/adr/ or docs/architecture/ exists -> docs/ root.
- _devprocess/ exists -> _devprocess/ (canonical).
- _devprocess/.
Ensure structure exists before writing:
mkdir -p {ROOT}/{analysis,requirements/{epics,features,handoff},architecture,adr,context,implementation/plans}
touch {ROOT}/context/HANDOFFS.md
adr/ is canonical for ADRs; consolidate architecture/ADR-*.md
into adr/ during Phase -1.
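A sketch of the root detection and the directory set from the mkdir -p line, assuming the rules are evaluated in the order listed and that _devprocess is the fallback when neither docs/adr/ nor docs/architecture/ exists. Both function names are hypothetical.

```python
def detect_root(existing_dirs: set[str]) -> str:
    """Pick the artifact root from the directories already in the repository."""
    # An existing docs/adr/ or docs/architecture/ selects the docs/ root.
    if {"docs/adr", "docs/architecture"} & existing_dirs:
        return "docs"
    # Otherwise use the canonical root (assumed fallback when nothing exists).
    return "_devprocess"


def required_dirs(root: str) -> list[str]:
    """Expand the same directory set as the mkdir -p brace expression."""
    relative = [
        "analysis",
        "requirements/epics",
        "requirements/features",
        "requirements/handoff",
        "architecture",
        "adr",
        "context",
        "implementation/plans",
    ]
    return [f"{root}/{path}" for path in relative]
```

Creating the returned directories with `os.makedirs(path, exist_ok=True)` would have the same effect as the shell command.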
Seed {ROOT}/context/BACKLOG.md from
skills/requirements-engineering/templates/BACKLOG-TEMPLATE.md
with the four Phase counters (Released / Building / Planned /
Candidates) in the Dashboard.
reverse engineering, existing project, legacy codebase, brownfield, onboard existing, import code, we already have code, existing app, legacy import, codebase snapshot, reverse engineer, extract artifacts, bestehendes Projekt, existierender Code, Legacy-Projekt, Code-Import
npx claudepluginhub pssah4/digital-innovation-agents --plugin digital-innovation-agents