Skill

Knowledge Base Manager

Design, build, and maintain comprehensive knowledge bases. Bridges document-based (RAG) and entity-based (graph) knowledge systems. Use when building knowledge-intensive applications, managing organizational knowledge, or creating intelligent information systems.

Install

npx claudepluginhub joshuarweaver/cascade-content-creation-misc-1 --plugin daffy0208-ai-dev-standards

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

README.md

manifest.yaml

SKILL.md

Stats

Stars: 25

Forks: 7

Last Commit: Dec 25, 2025

Knowledge Base Manager

Build and maintain high-quality knowledge bases for AI systems and human consumption.

Core Principle

Knowledge Base = Structured Information + Quality Curation + Accessibility

A knowledge base is not just a data dump: it is curated, validated, versioned information designed to answer questions and enable reasoning.

When to Use Knowledge Bases

Use Knowledge Bases When:

✅ Need to answer factual questions consistently
✅ Information changes frequently and needs version control
✅ Multiple sources need to be unified and reconciled
✅ Provenance and citation tracking is critical
✅ Building AI systems that need grounded, verifiable information
✅ Organizational knowledge needs to be preserved and searchable
✅ Complex domain with interconnected concepts

Don't Use Knowledge Bases When:

❌ Static documentation is sufficient (use docs + search)
❌ No one will maintain/update it (knowledge rot guaranteed)
❌ Simple FAQ covers all questions (<50 items)
❌ Information doesn't change (static site faster/cheaper)
❌ Team lacks resources for curation

Knowledge Base Types: Decision Framework

1. Document-Based Knowledge Base (RAG)

What it is: Collection of documents, chunked and embedded for semantic search

Best for:

Technical documentation
Support articles, FAQs
Policy documents
Research papers
Blog content
User manuals

Strengths:

Easy to add new documents
Preserves full context
Natural for text-heavy content

Weaknesses:

Hard to query relationships ("Who works where?")
Duplicate information across documents
Difficult to keep facts consistent

Use: rag-implementer skill + vector-database-mcp

2. Entity-Based Knowledge Base (Knowledge Graph)

What it is: Network of entities (people, places, things) connected by relationships

Best for:

Organizational charts
Product catalogs with relationships
Social networks
Recommendation systems
Fraud detection
Supply chain tracking

Strengths:

Excellent for "how are X and Y related?" queries
Consistent facts (one source of truth)
Powerful traversal ("friends of friends")

Weaknesses:

Upfront modeling required (ontology design)
Harder to add unstructured information
Learning curve for graph queries

Use: knowledge-graph-builder skill + graph-database-mcp

3. Hybrid Knowledge Base (RAG + Graph)

What it is: Documents for unstructured knowledge + Graph for structured entities/relationships

Best for:

Enterprise knowledge management
Research with citations and relationships
Medical systems (documents + patient/drug relationships)
Legal systems (cases + precedents + entities)
E-commerce (products + specs + relationships)

Strengths:

Combines semantic search over documents with relationship queries over entities
Flexible for different knowledge types
Rich querying capabilities

Weaknesses:

Most complex to build and maintain
Requires expertise in both RAG and graphs
Higher infrastructure costs

Use: Both rag-implementer + knowledge-graph-builder skills

Decision Tree: Which KB Type?

What kind of knowledge do you have?

├─ Mostly unstructured text (docs, articles, content)?
│  └─ Document-Based KB (RAG)
│     Use: rag-implementer skill
│
├─ Mostly structured entities with relationships?
│  └─ Entity-Based KB (Graph)
│     Use: knowledge-graph-builder skill
│
└─ Mix of both?
   └─ Hybrid KB (RAG + Graph)
      Use: Both skills + This skill for integration
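
The decision tree above can be sketched as a small function. This is only an illustration: the `KnowledgeProfile` shape, the `chooseKBType` name, and the 80%/20% cut-offs are assumptions made for the example, not values defined by this skill.

```typescript
// Hypothetical summary of a knowledge inventory; field names are illustrative.
interface KnowledgeProfile {
  unstructuredShare: number // fraction of knowledge that is free text, 0..1
  needsRelationshipQueries: boolean // e.g. "how are X and Y related?"
}

type KBType = 'document' | 'entity' | 'hybrid'

// Assumed cut-offs: mostly text -> document, mostly entities -> entity, else hybrid.
function chooseKBType(profile: KnowledgeProfile): KBType {
  const { unstructuredShare, needsRelationshipQueries } = profile
  if (unstructuredShare >= 0.8 && !needsRelationshipQueries) return 'document'
  if (unstructuredShare <= 0.2) return 'entity'
  return 'hybrid'
}
```

For example, a support-article corpus with no relationship queries maps to `'document'`, while the same corpus with heavy "who owns what" questions falls through to `'hybrid'`.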

6-Phase Knowledge Base Implementation

Phase 1: Knowledge Audit & Architecture

Goal: Understand what knowledge exists and how to structure it

Actions:

1. Inventory existing knowledge sources
   - Internal: databases, documents, wikis, Slack, emails
   - External: public data, APIs, third-party sources
   - Tribal: SME interviews, recorded conversations
2. Classify knowledge types
   - Factual: Verifiable facts ("Product X costs $50")
   - Procedural: How-to knowledge ("How to deploy")
   - Conceptual: Definitions and explanations
   - Relationship: Connections between entities
3. Choose KB architecture
   - Document-based? Entity-based? Hybrid?
   - Decision: Use framework above
4. Define knowledge schema
   - For documents: metadata fields (source, date, author, category)
   - For entities: ontology (entity types, relationship types, properties)

Validation:

All knowledge sources inventoried and prioritized
KB architecture chosen and justified
Schema defined and validated with users
Success metrics established

Phase 2: Knowledge Curation & Ingestion

Goal: Transform raw information into high-quality knowledge

Actions:

1. Extract knowledge from sources
   - Automated: scraping, API ingestion, file parsing
   - Manual: expert input, annotation, validation
2. Clean and normalize
   - Remove duplicates
   - Standardize formats
   - Fix inconsistencies
   - Enrich with metadata
3. Structure knowledge
   - For documents: chunk intelligently (semantic boundaries)
   - For entities: extract entities, relationships, properties
4. Add provenance
   - Source URL or reference
   - Last updated timestamp
   - Author/contributor
   - Confidence score (if applicable)
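
As a rough sketch of the "structure" and "add provenance" steps, the function below chunks a document on paragraph boundaries and attaches the same provenance record to every chunk. Treating blank lines as the semantic boundary, the 500-character default, and the names `Provenance`, `Chunk`, and `chunkByParagraph` are all assumptions for illustration; real pipelines often use headings or sentence boundaries instead.

```typescript
interface Provenance {
  source: string // source URL or reference
  updated_at: string // ISO 8601 timestamp
  author: string
  confidence?: number // 0..1, only if applicable
}

interface Chunk {
  content: string
  provenance: Provenance
}

// Split on blank lines, then pack whole paragraphs into chunks of at most
// maxChars characters. A single paragraph longer than maxChars is kept whole.
function chunkByParagraph(text: string, provenance: Provenance, maxChars = 500): Chunk[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0)

  const chunks: Chunk[] = []
  let current = ''
  for (const paragraph of paragraphs) {
    const candidate = current === '' ? paragraph : `${current}\n\n${paragraph}`
    if (candidate.length > maxChars && current !== '') {
      // Adding this paragraph would overflow: close the chunk and start a new one.
      chunks.push({ content: current, provenance })
      current = paragraph
    } else {
      current = candidate
    }
  }
  if (current !== '') chunks.push({ content: current, provenance })
  return chunks
}
```

Packing whole paragraphs keeps each chunk readable on its own, at the cost of uneven chunk sizes.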

Curation Best Practices:

Single Source of Truth: One canonical answer per question
Deduplication: Merge similar knowledge entries
Conflict Resolution: When sources disagree, establish priority rules
Metadata Richness: More metadata = better filtering and search
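
One way to express "priority rules" for conflicting sources is a ranked source table with recency as the tie-breaker. The sketch below assumes hypothetical source names (`official-docs`, `internal-wiki`, `chat-export`) and a hypothetical `Fact` shape; the actual ranking has to come from your own governance rules.

```typescript
interface Fact {
  key: string // what the fact is about, e.g. "product-x.price"
  value: string
  source: string
  updated_at: string // ISO 8601 timestamp
}

// Hypothetical priority table: a lower number wins. Unknown sources rank last.
const SOURCE_PRIORITY = new Map<string, number>([
  ['official-docs', 0],
  ['internal-wiki', 1],
  ['chat-export', 2],
])

function rank(source: string): number {
  return SOURCE_PRIORITY.get(source) ?? Number.MAX_SAFE_INTEGER
}

// Keep one canonical fact per key: prefer the higher-priority source,
// and break ties between equally ranked sources by the most recent update.
function resolveConflicts(facts: Fact[]): Map<string, Fact> {
  const canonical = new Map<string, Fact>()
  for (const fact of facts) {
    const existing = canonical.get(fact.key)
    const wins =
      existing === undefined ||
      rank(fact.source) < rank(existing.source) ||
      (rank(fact.source) === rank(existing.source) &&
        Date.parse(fact.updated_at) > Date.parse(existing.updated_at))
    if (wins) canonical.set(fact.key, fact)
  }
  return canonical
}
```

Note that under this rule a newer entry from a low-priority source does not override an older entry from an authoritative one; whether that is desirable depends on the domain.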

Validation:

Knowledge extracted and structured
Quality metrics above threshold (accuracy >95%)
Provenance tracked for all entries
Sample queries return relevant results

Phase 3: Storage & Retrieval Setup

Goal: Implement technical infrastructure for knowledge access

Architecture Patterns:

For Document-Based KB:

// Vector database for semantic search
interface DocumentKB {
  store: 'Pinecone' | 'Weaviate' | 'pgvector'
  chunks: {
    content: string
    embedding: number[]
    metadata: {
      source: string
      title: string
      updated_at: string
      category: string
    }
  }[]
}

For Entity-Based KB:

// Graph database for relationship queries
interface EntityKB {
  store: 'Neo4j' | 'ArangoDB'
  nodes: {
    id: string
    type: 'Person' | 'Organization' | 'Product' | 'Concept'
    properties: Record<string, any>
  }[]
  relationships: {
    from: string
    to: string
    type: string
    properties: Record<string, any>
  }[]
}

For Hybrid KB:

// Both vector DB + graph DB
interface HybridKB {
  vectorDB: DocumentKB
  graphDB: EntityKB
  linker: {
    // Links documents to entities mentioned in them
    linkDocumentToEntities(docId: string): string[]
    // Links entities to documents that mention them
    linkEntityToDocuments(entityId: string): string[]
  }
}
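
To make the linking layer concrete, here is a minimal in-memory sketch of the two `linker` methods. It assumes a naive mention detector (case-insensitive substring match on the entity name) and hypothetical `DocRecord`/`EntityRecord` shapes; a production system would more likely use entity extraction and persist the links in the graph or vector store.

```typescript
// Minimal stand-ins for stored documents and graph nodes (illustrative shapes).
interface DocRecord { id: string; content: string }
interface EntityRecord { id: string; name: string }

class InMemoryLinker {
  private docToEntities = new Map<string, Set<string>>()
  private entityToDocs = new Map<string, Set<string>>()

  // Naive mention detection: case-insensitive substring match on the entity name.
  index(docs: DocRecord[], entities: EntityRecord[]): void {
    for (const doc of docs) {
      const text = doc.content.toLowerCase()
      for (const entity of entities) {
        if (!text.includes(entity.name.toLowerCase())) continue
        this.add(this.docToEntities, doc.id, entity.id)
        this.add(this.entityToDocs, entity.id, doc.id)
      }
    }
  }

  // Entities mentioned in a document (empty if the document is unknown).
  linkDocumentToEntities(docId: string): string[] {
    return Array.from(this.docToEntities.get(docId) ?? [])
  }

  // Documents that mention an entity (empty if the entity is unknown).
  linkEntityToDocuments(entityId: string): string[] {
    return Array.from(this.entityToDocs.get(entityId) ?? [])
  }

  private add(index: Map<string, Set<string>>, key: string, value: string): void {
    const values = index.get(key) ?? new Set<string>()
    values.add(value)
    index.set(key, values)
  }
}
```

Keeping both directions indexed lets a hybrid query start from either side: retrieve documents by similarity and expand to their entities, or traverse the graph and pull the supporting documents.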

Actions:

1. Choose database(s)
   - Document: Pinecone, Weaviate, pgvector
   - Entity: Neo4j, ArangoDB
   - Hybrid: Both + linking layer
2. Implement search/query layer
   - Vector similarity search (for documents)
   - Graph traversal (for entities)
   - Hybrid queries (combining both)
3. Add caching and optimization
   - Cache frequent queries
   - Optimize for common access patterns
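
For intuition about the vector similarity step, the sketch below ranks chunks by cosine similarity with a brute-force scan. The `EmbeddedChunk` shape and the `topK` helper are assumptions for the example; a real vector database would use an approximate index rather than scoring every chunk.

```typescript
interface EmbeddedChunk {
  content: string
  embedding: number[]
}

// Dot product; callers check lengths first, so `?? 0` only satisfies strict index checks.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0)
}

// Cosine similarity in [-1, 1]; defined as 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ')
  const denominator = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b))
  return denominator === 0 ? 0 : dot(a, b) / denominator
}

// Brute-force nearest neighbours: score every chunk, keep the k best.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk)
}
```

A linear scan like this is fine for a few thousand chunks and useful as a correctness baseline when evaluating an approximate index.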

Validation:

Database deployed and accessible
Search/query functionality working
Performance meets requirements (<100ms for most queries)

Phase 4: Quality Control & Validation

Goal: Ensure knowledge base accuracy and reliability

Quality Metrics:

Accuracy: % of correct answers to test questions
Coverage: % of user questions answerable
Freshness: Average age of knowledge
Consistency: % of entries free of conflicts/contradictions
Source Quality: % from authoritative sources

Validation Strategies:

1. Test Question Sets

Create 100+ test questions with known correct answers:

interface TestQuestion {
  question: string
  expected_answer: string
  category: string
  difficulty: 'easy' | 'medium' | 'hard'
}
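
A test set like this can be scored with a small harness. The sketch below assumes exact-match scoring after whitespace and case normalization, and an `answer` callback standing in for whatever query function the KB exposes; both are simplifications, since free-text answers usually need fuzzier or human-assisted grading.

```typescript
interface TestQuestion {
  question: string
  expected_answer: string
  category: string
  difficulty: 'easy' | 'medium' | 'hard'
}

interface EvaluationReport {
  accuracy: number // percentage, 0-100
  failures: TestQuestion[]
}

// Normalize case and whitespace so trivial formatting differences do not count as errors.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, ' ')
}

// `answer` is a stand-in for the KB's query function (an assumption of this sketch).
function evaluate(questions: TestQuestion[], answer: (question: string) => string): EvaluationReport {
  const failures = questions.filter(
    (q) => normalize(answer(q.question)) !== normalize(q.expected_answer)
  )
  const accuracy =
    questions.length === 0 ? 0 : ((questions.length - failures.length) / questions.length) * 100
  return { accuracy, failures }
}
```

Returning the failed questions alongside the score makes it easy to group misses by `category` or `difficulty` when deciding what to fix first.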

2. Human Review

Sample random knowledge entries
Subject matter expert validation
User feedback loops

3. Automated Checks

Duplicate Detection: Find near-identical entries
Conflict Detection: Find contradictory facts
Staleness Detection: Flag outdated information
Citation Validation: Verify sources still exist
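
Two of these checks are simple enough to sketch directly. The example below flags stale entries by age and finds near-duplicates using Jaccard similarity over word sets. The `KBEntry` shape, the 0.8 similarity threshold, and the pairwise comparison are assumptions for illustration; large knowledge bases would typically use embeddings or MinHash instead of comparing every pair.

```typescript
interface KBEntry {
  id: string
  content: string
  updated_at: string // ISO 8601 timestamp
}

// Staleness: entries last updated more than maxAgeDays before `now`.
function findStale(entries: KBEntry[], maxAgeDays: number, now: Date): KBEntry[] {
  const cutoff = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000
  return entries.filter((entry) => Date.parse(entry.updated_at) < cutoff)
}

function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 0))
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, with two empty sets treated as identical.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1
  let shared = 0
  a.forEach((word) => {
    if (b.has(word)) shared++
  })
  return shared / (a.size + b.size - shared)
}

// Near-duplicates: every pair of entries whose word sets overlap above the threshold.
function findNearDuplicates(entries: KBEntry[], threshold = 0.8): [string, string][] {
  const indexed = entries.map((entry) => ({ id: entry.id, words: wordSet(entry.content) }))
  const pairs: [string, string][] = []
  indexed.forEach((first, i) => {
    indexed.slice(i + 1).forEach((second) => {
      if (jaccard(first.words, second.words) >= threshold) pairs.push([first.id, second.id])
    })
  })
  return pairs
}
```

Flagged pairs are candidates for review, not automatic merges: two entries can share most of their words and still state different facts.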

4. Continuous Monitoring

interface KBHealthMetrics {
  accuracy_score: number // 0-100
  coverage_score: number // % questions answered
  freshness_score: number // avg days since update
  consistency_score: number // % no conflicts
  user_satisfaction: number // feedback rating
}
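
A monitoring job could turn these metrics into alerts by comparing them with targets. The sketch below uses the targets stated in this document (accuracy >90%, coverage >80%, <5% conflicts, <30 days average age); the `THRESHOLDS` object and the `healthAlerts` name are illustrative, and `user_satisfaction` is left unchecked because no rating scale is defined here.

```typescript
interface KBHealthMetrics {
  accuracy_score: number // 0-100
  coverage_score: number // % questions answered
  freshness_score: number // avg days since update
  consistency_score: number // % no conflicts
  user_satisfaction: number // feedback rating
}

// Targets taken from this document; tune them per deployment.
const THRESHOLDS = {
  minAccuracy: 90,
  minCoverage: 80,
  maxAverageAgeDays: 30,
  minConsistency: 95, // i.e. fewer than 5% conflicting entries
}

// Returns one human-readable alert per metric that misses its target.
function healthAlerts(metrics: KBHealthMetrics): string[] {
  const alerts: string[] = []
  if (metrics.accuracy_score <= THRESHOLDS.minAccuracy) {
    alerts.push(`Accuracy ${metrics.accuracy_score}% is not above ${THRESHOLDS.minAccuracy}%`)
  }
  if (metrics.coverage_score <= THRESHOLDS.minCoverage) {
    alerts.push(`Coverage ${metrics.coverage_score}% is not above ${THRESHOLDS.minCoverage}%`)
  }
  if (metrics.freshness_score >= THRESHOLDS.maxAverageAgeDays) {
    alerts.push(`Average age ${metrics.freshness_score} days is not below ${THRESHOLDS.maxAverageAgeDays}`)
  }
  if (metrics.consistency_score <= THRESHOLDS.minConsistency) {
    alerts.push(`Consistency ${metrics.consistency_score}% is not above ${THRESHOLDS.minConsistency}%`)
  }
  return alerts
}
```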

Actions:

Run test question validation (target: >90% accuracy)
Conduct human review (sample 10% of entries)
Fix detected issues (duplicates, conflicts, staleness)
Establish monitoring dashboards

Validation:

Accuracy >90% on test questions
Coverage >80% of user questions
<5% conflicting information
Monitoring dashboard operational

Phase 5: Versioning & Evolution

Goal: Track knowledge changes over time and enable rollback

Why Versioning Matters:

Knowledge changes (facts update, policies change)
Need audit trail (who changed what when)
Rollback capability (undo bad updates)
Historical queries ("What was policy on X in 2023?")

Versioning Strategies:

1. Snapshot Versioning

interface KnowledgeEntry {
  id: string
  content: string
  version: number
  created_at: string
  updated_at: string
  updated_by: string
  changelog: string
  previous_version?: string // ID of prior version
}
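
A minimal in-memory sketch of snapshot versioning follows. Each save creates a new immutable entry that points at its predecessor, and rollback is modeled as a new version that restores the earlier content, so the audit trail is never rewritten. The `SnapshotStore` class, the `key@vN` ID scheme, and passing timestamps in explicitly are assumptions of this example.

```typescript
interface KnowledgeEntry {
  id: string
  content: string
  version: number
  created_at: string
  updated_at: string
  updated_by: string
  changelog: string
  previous_version?: string // ID of prior version
}

class SnapshotStore {
  private entries = new Map<string, KnowledgeEntry>() // every version, keyed by entry ID
  private heads = new Map<string, string>() // logical key -> ID of the current version

  current(key: string): KnowledgeEntry | undefined {
    const id = this.heads.get(key)
    return id === undefined ? undefined : this.entries.get(id)
  }

  // Store a new snapshot; earlier snapshots are never modified.
  save(key: string, content: string, author: string, changelog: string, now: string): KnowledgeEntry {
    const previous = this.current(key)
    const version = previous === undefined ? 1 : previous.version + 1
    const entry: KnowledgeEntry = {
      id: `${key}@v${version}`, // assumed ID scheme
      content,
      version,
      created_at: previous === undefined ? now : previous.created_at,
      updated_at: now,
      updated_by: author,
      changelog,
    }
    if (previous !== undefined) entry.previous_version = previous.id
    this.entries.set(entry.id, entry)
    this.heads.set(key, entry.id)
    return entry
  }

  // Roll back by saving a new version that restores the prior content.
  rollback(key: string, author: string, now: string): KnowledgeEntry {
    const head = this.current(key)
    const target =
      head?.previous_version === undefined ? undefined : this.entries.get(head.previous_version)
    if (target === undefined) throw new Error(`No earlier version of ${key}`)
    return this.save(key, target.content, author, `Rollback to v${target.version}`, now)
  }
}
```

Because old snapshots stay in the store, a historical query only needs to walk the `previous_version` chain until it reaches the entry that was current at the date of interest.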

2. Event Sourcing

interface KnowledgeEvent {
  event_id: string
  entity_id: string
  event_type: 'created' | 'updated' | 'deleted'
  timestamp: string
  changes: {
    field: string
    old_value: any
    new_value: any
  }[]
  author: string
}
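
With an event log like this, a historical query is a replay: apply every event up to the date of interest, in order. The `stateAt` helper below is a sketch under that assumption; it uses `unknown` instead of `any` for values and rebuilds state from scratch on every call, which a real system would avoid with periodic snapshots.

```typescript
interface KnowledgeEvent {
  event_id: string
  entity_id: string
  event_type: 'created' | 'updated' | 'deleted'
  timestamp: string // ISO 8601
  changes: { field: string; old_value: unknown; new_value: unknown }[]
  author: string
}

// Rebuild one entity's fields as they were at `asOf` by replaying its events in time order.
// Returns undefined if the entity did not exist (not yet created, or deleted) at that time.
function stateAt(
  events: KnowledgeEvent[],
  entityId: string,
  asOf: string
): Record<string, unknown> | undefined {
  const cutoff = Date.parse(asOf)
  const relevant = events
    .filter((event) => event.entity_id === entityId && Date.parse(event.timestamp) <= cutoff)
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp))

  let state: Record<string, unknown> | undefined
  for (const event of relevant) {
    if (event.event_type === 'deleted') {
      state = undefined
      continue
    }
    state = state ?? {}
    for (const change of event.changes) state[change.field] = change.new_value
  }
  return state
}
```

This is what makes questions such as "What was policy on X in 2023?" answerable without storing a full copy of every version.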

3. Git-Style Versioning

Treat knowledge like code
Commit-based changes
Branch for experimental knowledge
Merge when validated

Actions:

Implement version tracking
Add changelog for all updates
Create rollback mechanism
Build version comparison tools

Validation:

All changes tracked with versions
Rollback tested and working
Historical queries supported
Audit trail complete

Phase 6: Maintenance & Governance

Goal: Keep knowledge base healthy long-term

Maintenance Tasks:

Daily:

Monitor for errors and failures
Review user feedback
Address urgent corrections

Weekly:

Review new content submissions
Update time-sensitive knowledge
Run automated quality checks

Monthly:

Audit knowledge freshness
Review and resolve conflicts
Analyze usage patterns
Update stale content

Quarterly:

Comprehensive quality audit
Schema/ontology review
Performance optimization
User satisfaction survey

Governance Framework:

1. Roles & Responsibilities

Knowledge Owners: Domain experts responsible for content
Curators: Review and approve changes
Contributors: Submit new knowledge
Consumers: Use knowledge and provide feedback

2. Change Process

Submit → Review → Approve → Publish → Monitor
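
The change process can be enforced as a small state machine so that, for example, nothing is published without approval. The status names below mirror the five steps; the extra `rejected` state and the `advance` helper are assumptions added for the example.

```typescript
type ChangeStatus = 'submitted' | 'in_review' | 'approved' | 'published' | 'monitored' | 'rejected'

// Allowed transitions. 'rejected' is an assumed extra state for failed reviews.
const NEXT_STATUS: Record<ChangeStatus, ChangeStatus[]> = {
  submitted: ['in_review'],
  in_review: ['approved', 'rejected'],
  approved: ['published'],
  published: ['monitored'],
  monitored: [],
  rejected: [],
}

// Move a change to its next status, refusing any step the process does not allow.
function advance(current: ChangeStatus, next: ChangeStatus): ChangeStatus {
  const allowed = NEXT_STATUS[current].some((status) => status === next)
  if (!allowed) throw new Error(`Cannot move a change from ${current} to ${next}`)
  return next
}
```

Encoding the transitions as data keeps the process auditable: changing the governance rules means editing one table rather than scattered conditionals.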

3. Quality Standards

Minimum source quality requirements
Citation requirements
Update frequency requirements
Conflict resolution process

Actions:

Establish maintenance schedule
Assign roles and responsibilities
Create governance documentation
Train team on processes

Validation:

Maintenance schedule in place
Governance documented and communicated
Team trained on processes
Quality trending upward

Knowledge Base Anti-Patterns

❌ Anti-Pattern 1: Data Dump Without Curation

Problem: Ingesting everything without quality filtering

Impact: Low signal-to-noise ratio, poor search results, user frustration

Solution: Curate before ingesting. Quality > Quantity

❌ Anti-Pattern 2: No Version Control

Problem: Knowledge changes but no history tracked

Impact: Can't audit changes, can't rollback errors, no accountability

Solution: Implement versioning from Phase 5

❌ Anti-Pattern 3: Stale Knowledge

Problem: Knowledge base outdated but no one knows

Impact: AI systems answer confidently from outdated facts, users get wrong answers

Solution: Freshness monitoring + scheduled updates

❌ Anti-Pattern 4: Duplicate Information

Problem: Same fact in multiple places, becomes inconsistent

Impact: Conflicting answers, confused users

Solution: Deduplication + single source of truth

❌ Anti-Pattern 5: No Provenance

Problem: Knowledge without source citations

Impact: Can't verify accuracy, can't trace errors

Solution: Always track source + timestamp + author

Integration with Other Skills

With rag-implementer

Use for document-based portion of hybrid KB
Follow RAG implementation phases
Integrate vector search with KB queries

With knowledge-graph-builder

Use for entity-based portion of hybrid KB
Follow graph design patterns
Integrate graph traversal with KB queries

With data-engineer

For ETL pipelines (extract, transform, load knowledge)
For data quality monitoring
For performance optimization

With quality-auditor

For automated quality checks
For testing and validation
For continuous monitoring

With technical-writer

For knowledge documentation
For user guides on KB usage
For governance documentation

Tools & Technologies

Document-Based KB Stack

Vector DB: Pinecone, Weaviate, pgvector
Embeddings: OpenAI, Cohere, custom
Search: Semantic + keyword hybrid
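
One common way to combine semantic and keyword results is reciprocal rank fusion (RRF), which merges ranked lists using only each item's position. This is a swapped-in illustration rather than something this skill prescribes; the function name is hypothetical and `k = 60` is the conventional default constant.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank), rank starting at 1.
// Each input ranking is a list of document IDs ordered best-first.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank))
    })
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id)
}
```

Because RRF ignores raw scores, it avoids having to calibrate cosine similarities against keyword relevance scores, which live on different scales.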

Entity-Based KB Stack

Graph DB: Neo4j, ArangoDB
Query: Cypher, AQL
Visualization: Neo4j Bloom, Gephi

Curation Tools

Deduplication: Custom algorithms, fuzzy matching
Conflict Detection: Rule-based, ML-based
Validation: Test question sets, human review

Monitoring

Metrics: Custom dashboard (Grafana)
Logging: Structured logging of queries/updates
Alerts: Freshness, accuracy, error rate alerts

Success Metrics

Knowledge Quality

Accuracy: >90% on test questions
Coverage: >80% of user questions answered
Freshness: <30 days average age
Consistency: <5% conflicting information

User Satisfaction

Relevance: >85% query results rated relevant
Usefulness: >80% users find KB valuable
Speed: <100ms median query time

Operational Health

Uptime: >99.9%
Update frequency: Weekly minimum
Team engagement: Regular contributions

Common Pitfalls & Solutions

Pitfall 1: "Build it and they will come"

Problem: No user validation, KB doesn't meet needs

Solution: Start with user research, validate continuously

Pitfall 2: Perfectionism

Problem: Waiting to launch until KB is "perfect"

Solution: Launch with 80% coverage, iterate based on usage

Pitfall 3: Over-engineering

Problem: Building complex hybrid system when simple docs would work

Solution: Start simple, add complexity only when needed

Pitfall 4: Maintenance neglect

Problem: Build once, never update

Solution: Establish maintenance schedule from day 1

Quick Start Checklist

Before you start:

Read this entire skill
Review rag-implementer if using document KB
Review knowledge-graph-builder if using entity KB
Have clear use case and success metrics

Phase 1 - Architecture (Week 1):

Inventory knowledge sources
Choose KB type (document/entity/hybrid)
Define schema/ontology
Set up infrastructure

Phase 2 - Initial Build (Weeks 2-3):

Ingest and curate initial knowledge
Implement search/query functionality
Create test question set
Validate with users

Phase 3 - Iterate (Ongoing):

Add more knowledge based on usage
Monitor quality metrics
Fix issues as discovered
Establish maintenance cadence

Related Resources

Skills: rag-implementer, knowledge-graph-builder, data-engineer, quality-auditor
MCPs: vector-database-mcp, graph-database-mcp, knowledge-base-mcp, semantic-search-mcp
Patterns: STANDARDS/architecture-patterns/rag-pattern.md, knowledge-base-pattern.md (coming soon)
Integrations: INTEGRATIONS/pinecone/, INTEGRATIONS/graph-databases/neo4j/
