GenAI Architecture: Architecture for Generative AI Systems
GenAI architecture defines how LLM-powered systems retrieve knowledge, orchestrate models, execute agent workflows, and ensure quality. This skill produces comprehensive architecture documentation covering RAG design, LLM orchestration with multi-model tiering, agent workflows, vector database selection, knowledge connector integration, and quality assurance for generative AI systems.
Guiding Principle
RAG is not a feature; it is an architecture. Connecting an LLM to a vector database is not RAG. RAG is a complete pipeline: query processing, retrieval, re-ranking, context assembly, generation, validation. Each stage involves architectural decisions that determine the quality of the system.
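The stages above can be sketched as a small composable pipeline. Everything here is illustrative: the stage functions, the `RAGContext` carrier, and the toy keyword retriever and length-based re-ranker stand in for real components.

```python
from dataclasses import dataclass, field

@dataclass
class RAGContext:
    """Carries state through every stage of the pipeline."""
    query: str
    processed_query: str = ""
    retrieved: list = field(default_factory=list)
    reranked: list = field(default_factory=list)
    prompt: str = ""
    answer: str = ""
    valid: bool = False

def process_query(ctx):          # stage 1: normalize / expand the query
    ctx.processed_query = ctx.query.strip().lower()
    return ctx

def retrieve(ctx, index):        # stage 2: candidate retrieval (toy keyword match)
    ctx.retrieved = [d for d in index if ctx.processed_query in d.lower()]
    return ctx

def rerank(ctx):                 # stage 3: reorder candidates by a relevance score
    ctx.reranked = sorted(ctx.retrieved, key=len)  # toy score: shorter = tighter match
    return ctx

def assemble(ctx):               # stage 4: build the generation prompt
    context_block = "\n".join(ctx.reranked[:3])
    ctx.prompt = f"Context:\n{context_block}\n\nQuestion: {ctx.query}"
    return ctx

def generate(ctx, llm):          # stage 5: call the model (injected for testability)
    ctx.answer = llm(ctx.prompt)
    return ctx

def validate(ctx):               # stage 6: flag answers with no overlap with the context
    ctx.valid = any(tok in ctx.prompt.lower() for tok in ctx.answer.lower().split())
    return ctx
```

Treating each stage as a separate component makes the architectural decision points explicit: each function can be swapped (e.g. a cross-encoder re-ranker) without touching the rest of the pipeline.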
GenAI Architecture Philosophy
Retrieval quality > generation quality. The most advanced LLM in the world will produce incorrect answers if the retrieved information is irrelevant or incomplete. Architectural investment must prioritize retrieval quality (chunking, embeddings, re-ranking, hybrid search) over LLM selection.
Multi-model tiering, not a monolithic model. Using the largest, most expensive model for every query is unsustainable. The architecture must route each query to the appropriate tier: lightweight models for simple tasks, frontier models for complex reasoning, and a cache for known patterns.
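A minimal sketch of such a router, assuming three tiers and a toy length-based complexity heuristic. Tier names, thresholds, the cue words, and the SHA-256 cache key are all assumptions, not part of the methodology.

```python
import hashlib

# Hypothetical tier-to-model mapping.
TIERS = {
    "tier3": "small-fast-model",   # simple lookups, classification
    "tier2": "mid-size-model",     # standard Q&A
    "tier1": "frontier-model",     # multi-step reasoning
}

_cache: dict[str, str] = {}        # known query patterns skip inference entirely

def complexity(query: str) -> int:
    """Toy heuristic: long queries and reasoning cues score higher."""
    score = len(query.split())
    if any(cue in query.lower() for cue in ("why", "compare", "plan", "step")):
        score += 20
    return score

def route(query: str) -> str:
    """Return the tier for a query, or 'cache' for a previously seen pattern."""
    key = hashlib.sha256(query.encode()).hexdigest()
    if key in _cache:
        return "cache"
    c = complexity(query)
    tier = "tier3" if c < 10 else "tier2" if c < 25 else "tier1"
    _cache[key] = TIERS[tier]
    return tier
```

A production router would classify intent with a small model rather than count words, but the architectural shape — score, route, cache — is the same.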
Guardrails are architecture, not an afterthought. Hallucination detection, safety filtering, PII masking, cost controls, and rate limiting are first-class architectural components, not optional checks bolted on "after the MVP".
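As a sketch, guardrails modeled as ordinary pipeline components rather than bolt-ons. The email regex, budget check, and blocklist below are placeholders for dedicated PII, cost, and safety services.

```python
import re

def mask_pii(text: str) -> str:
    """Input guardrail: mask email addresses before they reach the model."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)

def check_budget(cost_so_far: float, est_cost: float, budget: float) -> None:
    """Operational guardrail: fail closed when the inference budget is exhausted."""
    if cost_so_far + est_cost > budget:
        raise RuntimeError("inference budget exceeded")

def filter_output(answer: str, blocked_terms: set[str]) -> str:
    """Output guardrail: redact blocked content instead of returning it."""
    return "[BLOCKED]" if any(t in answer.lower() for t in blocked_terms) else answer
```

Because each guardrail is a component with its own interface, it can be tested, versioned, and monitored independently of the LLM it protects.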
Inputs
The user provides a system or project name as $ARGUMENTS. Parse $1 as the system/project name used throughout all output artifacts.
| Approach | Strength | Trade-off | Best fit |
|---|---|---|---|
| … | … | … | Teams with infra expertise, data residency requirements |
| Hybrid retrieval | Best of vector + keyword search | Two index systems, fusion complexity | Most production RAG systems |
| Structured connectors | Rich, accurate context from business systems | Integration complexity, security scope | Enterprise systems with CRM/ERP/ITSM |
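Hybrid retrieval needs a fusion step to merge the ranked lists coming from the vector index and the keyword index. Reciprocal Rank Fusion (RRF) is one common choice, sketched below; the document itself does not mandate a fusion algorithm.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge ranked lists from separate indexes.

    Each document scores sum(1 / (k + rank)) over the lists that contain it;
    k=60 is the constant proposed in the original RRF paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a vector-search ranking with a keyword (BM25) ranking:
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

RRF needs only ranks, not raw scores, which is exactly what makes it attractive when fusing two index systems whose scores are not comparable.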
Assumptions
Use case benefits from generative AI (not all AI problems need LLMs)
Knowledge sources are identified and accessible
LLM API access is available (cloud provider or self-hosted)
Budget for LLM inference costs is allocated and bounded
Team has or will build experience with LLM application patterns
Data privacy and security requirements are understood
Limits
Focuses on GenAI architecture, not general ML architecture (see metodologia-ai-software-architecture)
Does not design traditional ML pipelines (see metodologia-ai-pipeline-architecture)
Does not select general AI patterns (see metodologia-ai-design-patterns)
Does not define comprehensive testing strategy (see metodologia-ai-testing-strategy)
Infrastructure provisioning for GenAI requires metodologia-infrastructure-architecture
LLM fine-tuning methodology is out of scope (operational, not architectural)
Edge Cases
Enterprise with Strict Data Residency:
Cloud-managed vector DBs may not meet data residency requirements. Self-hosted vector DB (Qdrant, Milvus) with region-specific deployment. Self-hosted LLM (vLLM, TGI) if data cannot leave premises. All connectors must enforce data residency at query level.
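Query-level residency enforcement could look like the following toy filter applied to every connector result before it enters the context window; the tenant-to-region mapping and the hit shape are assumptions.

```python
# Hypothetical tenant -> allowed-regions policy.
ALLOWED = {"acme-eu": {"eu-west-1", "eu-central-1"}}

def residency_filter(tenant: str, hits: list[dict]) -> list[dict]:
    """Drop any retrieved item stored outside the tenant's allowed regions."""
    allowed = ALLOWED.get(tenant, set())
    return [h for h in hits if h["region"] in allowed]
```

The key design point is fail-closed behavior: an unknown tenant gets an empty allow-set, so nothing is returned rather than everything.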
Multi-Language Knowledge Base:
Embedding model must support all required languages. Multilingual models (Cohere multilingual, mE5) for cross-language retrieval. Consider language-specific collections for highest quality. Query language detection for routing to appropriate collection.
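A sketch of the query-language routing step. The stopword heuristic is a stand-in for a real language-identification model, and the collection names are hypothetical.

```python
# Toy language detector: counts stopword overlap per language.
STOPWORDS = {
    "en": {"the", "is", "what", "how", "and"},
    "es": {"el", "la", "que", "como", "es"},  # real detectors handle cross-language ambiguity
}
COLLECTIONS = {"en": "kb_en", "es": "kb_es"}

def detect_language(query: str, default: str = "en") -> str:
    tokens = set(query.lower().split())
    hits = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else default

def route_collection(query: str) -> str:
    """Route the query to the language-specific collection."""
    return COLLECTIONS[detect_language(query)]
```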
High-Volume, Low-Latency Requirements:
CAG for core knowledge + aggressive caching + Tier 3 models for common queries. Pre-computed responses for high-frequency patterns. Streaming for perceived latency reduction. Vector DB with in-memory index for sub-millisecond retrieval.
Rapidly Changing Knowledge:
Real-time indexing pipeline for new documents. Stale content detection and automatic removal. Version-aware retrieval (prefer recent documents). Web search integration for real-time information not yet indexed.
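One way to "prefer recent documents" is to blend similarity with an exponential recency decay at re-scoring time; the half-life and blend weight here are arbitrary assumptions.

```python
from datetime import datetime, timezone

def recency_score(indexed_at: datetime, now: datetime,
                  half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    age_days = (now - indexed_at).total_seconds() / 86400
    return 0.5 ** (age_days / half_life_days)

def rescore(hits, now, alpha=0.7):
    """Blend similarity with recency; hits are (doc_id, similarity, indexed_at)."""
    scored = [
        (doc_id, alpha * sim + (1 - alpha) * recency_score(ts, now))
        for doc_id, sim, ts in hits
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

With these weights, a slightly less similar document indexed yesterday outranks a slightly more similar one indexed five months ago, which is usually the right call for rapidly changing knowledge.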
Multi-Tenant GenAI System:
Namespace or collection isolation per tenant. Row-level security in structured connectors per tenant. Cost allocation and rate limiting per tenant. Tenant-specific fine-tuning or prompt customization.
Validation Gate
Before finalizing delivery, verify:
RAG pipeline designed with all stages (query processing, retrieval, assembly, generation, validation)
Retrieval strategy justified (vector-only, keyword, hybrid) with rationale
Chunking strategy selected based on content type and retrieval requirements
LLM orchestration includes multi-model tiering with routing logic
Agent workflow defined with tool governance (whitelist, validation, rate limits, audit)
Vector DB selected based on evaluation criteria (not technology preference)
Embedding model selected with consistency requirement documented
Knowledge connectors designed with security model (row-level security, auth, audit)
Quality framework includes hallucination reduction, evaluation metrics, and continuous improvement
Guardrails are architectural components (input, output, operational, content)
Cross-References
metodologia-ai-software-architecture: GenAI components fit within the AI system module view
metodologia-ai-conops: CONOPS defines interaction level and success metrics for GenAI system
metodologia-ai-pipeline-architecture: Embedding pipeline and indexing pipeline are pipeline components
metodologia-ai-design-patterns: Champion-Challenger and Canary apply to LLM model promotion
metodologia-ai-testing-strategy: GenAI-specific tests (hallucination, retrieval quality) extend the testing matrix