Skill

neo4j-modeling-skill

Designs, reviews, and refactors Neo4j graph data models. Use when choosing labels vs relationships vs properties, migrating schemas to graph, detecting anti-patterns, or enforcing schema with constraints and indexes.

Neo4j

database

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/neo4j-skills:neo4j-modeling-skill

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

WebFetchBash

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- Designing graph model from scratch (domain → nodes, rels, props)

Supporting Files

README.md

SKILL.md

362 lines · ~3.8k tokens

Stats

LanguagePython

Stars80

Forks31

MaintenanceExcellent

Last CommitJun 9, 2026

Actions

View Source View Plugin View on GitHub View README

When to Use

Designing graph model from scratch (domain → nodes, rels, props)
Reviewing existing model for anti-patterns
Deciding node vs property vs relationship vs label
Migrating relational or document schema to graph
Designing intermediate nodes for n-ary or complex relationships
Detecting and mitigating supernode / high-fanout problems
Choosing and creating constraints + indexes for a model

When NOT to Use

Writing or optimizing Cypher → neo4j-cypher-skill
Spring Data Neo4j (@Node, @Relationship) → neo4j-spring-data-skill
GraphQL type definitions → neo4j-graphql-skill
Importing data (LOAD CSV, APOC import) → neo4j-import-skill

Inspect Before Designing

On existing database, run first — never propose changes without current state:

CALL db.schema.visualization() YIELD nodes, relationships RETURN nodes, relationships;
SHOW CONSTRAINTS YIELD name, type, labelsOrTypes, properties RETURN name, type, labelsOrTypes, properties;
SHOW INDEXES YIELD name, type, labelsOrTypes, state WHERE state = 'ONLINE' RETURN name, type, labelsOrTypes;

If APOC available:

CALL apoc.meta.schema() YIELD value RETURN value;

MCP tool map:

Operation	Tool
Inspect schema	`get-schema`
`SHOW CONSTRAINTS`, `SHOW INDEXES`	`read-cypher`
`CREATE CONSTRAINT ... IF NOT EXISTS`	`write-cypher` (show + confirm first)

Defaults — Apply to Every Model

Use-case first — list 5+ queries the model must answer before designing
Nodes = entities (nouns) with identity; rels = connections (verbs) with direction
Labels PascalCase; rel types SCREAMING_SNAKE_CASE; properties camelCase
Every node type used in MERGE has a uniqueness constraint on its key property
Add property type constraints (REQUIRE n.prop IS :: STRING) where the type is known — helps the query planner and catches bad writes early
No generic labels (:Entity, :Node, :Thing); no generic rel types (:RELATED_TO, :HAS)
Security labels (used for row-level access control) should start with a common prefix (e.g. Sec) so application code can reliably filter them out of the domain schema
Rel direction encodes semantic meaning — not arbitrary
Inspect schema before proposing any change on an existing database
All constraint/index DDL uses IF NOT EXISTS — safe to rerun
On Neo4j 2026.02+ (Enterprise/Aura): consider ALTER CURRENT GRAPH TYPE SET { … } or EXTEND GRAPH TYPE WITH { … } to declare the full model in one block instead of individual CREATE CONSTRAINT statements — see neo4j-cypher-skill/references/graph-type.md. PREVIEW — syntax may change before GA.

Key Patterns

Node vs Relationship vs Property — Decision Table

Question	Answer	Model as
Is it a thing with identity, queried as entry point?	Yes	Node
Is it a connection between two things with direction?	Yes	Relationship
Does the connection have its own properties or multiple targets?	Yes	Intermediate node
Is it a scalar always returned with its parent, never filtered alone?	Yes	Property on parent
Is it a category used for type-based filtering or path traversal?	Yes	Label (not a property)
Does the same attribute value repeat across many nodes (low cardinality)?	Yes	Label, not a property node
Is it a fact connecting >2 entities?	Yes	Intermediate node

Property vs Label — Decision Table

Use label when	Use property when
Values are few, fixed, used as traversal filters (`WHERE n:Active`)	Values are many, dynamic, or unique per node
You traverse by type (`MATCH (n:VIPCustomer)`)	You filter by value (`WHERE n.tier = 'vip'`)
Category drives index selection	Fine-grained value drives range scans
Example: `:Active`, `:Verified`, `:Premium`	Example: `status`, `score`, `email`

Rule: adding a label is cheap; scanning all :Label nodes is fast. Never model high-cardinality values as labels.

Intermediate Node Pattern

Use when a relationship needs its own properties, connects >2 entities, or is independently queryable.

Before (relationship with property — limited):

(Person)-[:ACTED_IN {role: "Neo"}]->(Movie)
// Cannot query roles independent of movies

After (intermediate node — queryable, extensible):

(Person)-[:PLAYED]->(Role {name: "Neo"})-[:IN]->(Movie)
// MATCH (r:Role) WHERE r.name STARTS WITH 'Neo' RETURN r

Employment overlap example:

// Find colleagues who overlapped at same company
MATCH (p1:Person)-[:WORKED_AT]->(e1:Employment)-[:AT]->(c:Company)<-[:AT]-(e2:Employment)<-[:WORKED_AT]-(p2:Person)
WHERE p1 <> p2
  AND e1.startDate <= e2.endDate AND e2.startDate <= e1.endDate
RETURN p1.name, p2.name, c.name

Promote relationship to intermediate node when:

Relationship has >2 properties
Relationship is the subject of another query
Multiple entities share the same connection context
You need to connect >2 entities in one fact

Relational → Graph Migration Table

Relational construct	Graph equivalent	Notes
Table row	Node	One label per table (add more as needed)
Column (scalar)	Node property
Primary key	Uniqueness constraint property	Use `tmdbId`, not `id` (too generic)
Foreign key	Relationship	Direction: from dependent → referenced
Many-to-many junction table	Intermediate node	Especially if junction has own columns
Junction table (no own columns)	Direct relationship	Simpler; upgrade to intermediate node later
NULL FK (optional relation)	Absent relationship	No node created; absence is the signal
Polymorphic FK (Rails-style)	Multiple labels or relationship types	Split into type-specific rels
Self-referential FK	Same-label relationship	`:Employee {managerId}` → `(e)-[:REPORTS_TO]->(m)`
Audit/history columns	Intermediate versioning node	See References for versioning pattern

Supernode Detection and Mitigation

Detect:

// Find top-10 highest-degree nodes
MATCH (n)
RETURN labels(n) AS labels, elementId(n) AS id, count{ (n)--() } AS degree
ORDER BY degree DESC LIMIT 10

Node with degree >> median for its label = supernode candidate. Any node with >100K relationships will degrade traversal queries that pass through it.

Causes:

Domain supernodes: airports, celebrities, popular hashtags — unavoidable
Modeling supernodes: gender, country, status modeled as nodes with millions of edges — avoidable

Mitigation strategies (in priority order):

Strategy	When to use	Implementation
Query direction	Directional asymmetry exists	Query from low-degree side; exploit direction
Relationship type split	Supernode serves multiple roles	`:FOLLOWS` + `:FAN` instead of single `:RELATED_TO`
Label segregation	Supernode conflates entity types	`:Celebrity` vs `:User` → query only relevant subtype
Bucket pattern	Time-series or high-volume event nodes	See below
Avoid modeling	Low-cardinality categoricals	Use label instead of node (`:Active` not `(:Status {name:"Active"})`)
Join hint	Query tuning last resort	`USING JOIN ON n` in Cypher

Bucket pattern (time-series / high-volume):

// Instead of: (:User)-[:VIEWED]->(:Page) (millions of rels per user)
// Bucket by hour:
(u:User)-[:VIEWED_IN]->(b:ViewBucket {userId: u.id, hour: '2025-04-28T14'})-[:VIEWED]->(p:Page)

// Query last hour's views without traversing full history:
MATCH (u:User {id: $uid})-[:VIEWED_IN]->(b:ViewBucket {hour: $hour})-[:VIEWED]->(p)
RETURN p.url

Naming Conventions

Element	Convention	Good	Bad
Node label	PascalCase, singular noun	`:Person`, `:BlogPost`	`:person`, `:blog_posts`, `:Entity`
Relationship type	SCREAMING_SNAKE_CASE, verb phrase	`:ACTED_IN`, `:WORKS_FOR`	`:actedin`, `:relatedTo`, `:HAS`
Property key	camelCase	`firstName`, `createdAt`	`FirstName`, `first_name`
Constraint name	snake_case descriptive	`person_id_unique`	`constraint1`
Index name	snake_case descriptive	`person_name_idx`	`index2`

Schema Enforcement — What to Create for Each Element

Run all DDL with IF NOT EXISTS. Apply before importing data.

// 1. Uniqueness constraint — every node type used in MERGE
CREATE CONSTRAINT person_id_unique IF NOT EXISTS
  FOR (p:Person) REQUIRE p.id IS UNIQUE;

// 2. Existence constraint (Enterprise) — mandatory properties
CREATE CONSTRAINT person_name_exists IF NOT EXISTS
  FOR (p:Person) REQUIRE p.name IS NOT NULL;

// 3. Property type constraint (Enterprise) — enforce data type
CREATE CONSTRAINT person_born_integer IF NOT EXISTS
  FOR (p:Person) REQUIRE p.born IS :: INTEGER;

// 4. Key constraint (Enterprise) — unique + exists in one
CREATE CONSTRAINT movie_tmdbid_key IF NOT EXISTS
  FOR (m:Movie) REQUIRE m.tmdbId IS NODE KEY;

// 5. Range index — equality and range filters on properties
CREATE INDEX person_name_idx IF NOT EXISTS
  FOR (p:Person) ON (p.name);

// 6. Fulltext index — CONTAINS, STARTS WITH, free text search
CREATE FULLTEXT INDEX person_fulltext IF NOT EXISTS
  FOR (n:Person) ON EACH [n.name, n.bio];

// 7. Vector index — embedding similarity search
CREATE VECTOR INDEX chunk_embedding_idx IF NOT EXISTS
  FOR (c:Chunk) ON (c.embedding)
  OPTIONS { indexConfig: { `vector.dimensions`: 1536, `vector.similarity_function`: 'cosine' } };

// 8. Relationship index — filter on rel properties
CREATE INDEX acted_in_year_idx IF NOT EXISTS
  FOR ()-[r:ACTED_IN]-() ON (r.year);

After creating indexes, poll until ONLINE:

SHOW INDEXES YIELD name, state WHERE state <> 'ONLINE' RETURN name, state;

Do NOT use an index until state = ONLINE.

Vector / Embedding Property Modeling

Store embeddings on dedicated :Chunk nodes, never on business nodes:

(:Document)-[:HAS_CHUNK]->(c:Chunk {text: "...", embedding: [...]})

Rules:

Chunk node: text (source text), embedding (float array), chunkIndex (int)
Parent document: metadata only (title, url, createdAt)
Vector index on c.embedding only
Chunk size 200–500 tokens with 20% overlap is production default [field]
Do NOT put embedding on :Document — makes the node too large and pollutes traversal

Anti-Patterns Table

Anti-pattern	Problem	Fix
Generic labels `:Entity`, `:Node`	No filtering benefit; all nodes scan	Use domain labels `:Person`, `:Product`
Generic rel types `:RELATED_TO`, `:HAS`	Can't filter by relationship type	Use semantic types `:PURCHASED`, `:AUTHORED`
Low-cardinality value as node	Supernode (`:Status {name:"active"}` → millions of edges)	Use label `:Active` instead
Property as label (`n.type = 'VIP'` + `:VIP` label both exist)	Inconsistency, duplication	Pick one; prefer label if used in traversal
Storing embeddings on business node	Node bloat, slow traversal	Dedicated `:Chunk` node
MERGE without uniqueness constraint	Duplicate nodes silently created	Add constraint before any MERGE
Missing relationship direction meaning	Arbitrary direction; confusing model	Direction = semantic flow of action
Junction table modeled as bare property	Loses history and extensibility	Intermediate node with its own properties
`id` as property name	`id(n)` is a deprecated Cypher function (use `elementId(n)`); bare `id` is fine as a property name in practice, but domain-qualified names (`personId`, `movieId`) are clearer and avoid any future ambiguity	Prefer `personId`, `movieId`, `tmdbId` where it aids readability
All dates as strings	No range queries; no temporal operators	Use Neo4j `date()` or `datetime()` type

Output Format — Schema Assessment

When reviewing an existing model:

## Schema Assessment

### Compliant
- [constraint / pattern that is correct]

### Issues Found
#### [Title] — Severity: ERROR / WARNING / INFO
- **Current**: what the model does
- **Problem**: why it is an issue
- **Fix**: specific Cypher DDL or model change

## Recommended Schema
### Node Labels
- :Label {key: TYPE, prop: TYPE, ...}  → constraints: [list]

### Relationships
- (:LabelA)-[:TYPE {prop: TYPE}]->(:LabelB)

### Constraints to Create
[CREATE CONSTRAINT ... statements]

### Indexes to Create
[CREATE INDEX ... statements]

Severity semantics:

Severity	Meaning	Action
`ERROR`	Model correctness failure (duplicates possible, data loss risk)	Stop; fix before proceeding
`WARNING`	Performance or extensibility risk	Report; ask user before proceeding
`INFO`	Style or convention deviation	Surface; continue

Provenance Labels

[official] — stated directly in Neo4j docs
[derived] — follows from documented behavior
[field] — community heuristic; treat as default but validate

Checklist

References

Load on demand:

references/modeling-patterns.md — time-series, versioning, multi-tenancy, linked list, access control patterns
Neo4j Data Modeling Guide
Neo4j Modeling Tips
GraphAcademy: Graph Data Modeling Fundamentals
Super Nodes — All About Super Nodes (David Allen)

neo4j-modeling-skill

Popularity

Invocation

Tool Access

Context Preview

Supporting Files

SKILL.md

neo4j-modeling-skill

Popularity

Invocation

Tool Access

Context Preview

Supporting Files

SKILL.md

When to Use

When NOT to Use

Inspect Before Designing

Defaults — Apply to Every Model

Key Patterns

Node vs Relationship vs Property — Decision Table

Property vs Label — Decision Table

Intermediate Node Pattern

Relational → Graph Migration Table

Supernode Detection and Mitigation

Naming Conventions

Schema Enforcement — What to Create for Each Element

Vector / Embedding Property Modeling

Anti-Patterns Table

Output Format — Schema Assessment

Provenance Labels

Checklist

References

Similar Skills

When to Use

When NOT to Use

Inspect Before Designing

Defaults — Apply to Every Model

Key Patterns

Node vs Relationship vs Property — Decision Table

Property vs Label — Decision Table

Intermediate Node Pattern

Relational → Graph Migration Table

Supernode Detection and Mitigation

Naming Conventions

Schema Enforcement — What to Create for Each Element

Vector / Embedding Property Modeling

Anti-Patterns Table

Output Format — Schema Assessment

Provenance Labels

Checklist

References

Similar Skills