From chief-data-officer-advisor
Chief Data Officer advisory for startups: AI training data rights and consent provenance, data product strategy (warehouse vs lakehouse vs mesh, build-vs-buy), B2B customer-data-as-asset valuation and M&A readiness, data team org evolution. Use when deciding whether to train models on customer data, choosing data architecture, valuing data for fundraising or M&A, sequencing data hires, or when user mentions CDO, chief data officer, data strategy, data mesh, lakehouse, training data, data product, data monetization, or customer data asset. NOT a tactical data engineering skill — strategic decisions only.
npx claudepluginhub ciciliaeth/claude-skills --plugin chief-data-officer-advisor
Slash command: /chief-data-officer-advisor:chief-data-officer-advisor
Strategic data leadership for startup CDOs and founders without one. **Four decisions, no surveys:**
This skill does not cover tactical data engineering. For schema design, observability, query optimization, RAG, or ML platform implementation, see engineering/database-designer/, engineering/observability-designer/, engineering/data-quality-auditor/, engineering/sql-database-assistant/, engineering/rag-architect/, engineering/llm-cost-optimizer/.
CDO, chief data officer, AI training data, consent provenance, training rights, GDPR Article 6 lawful basis, GDPR Article 22, EU AI Act high-risk, ePrivacy, copyright fair use, hiQ v. LinkedIn, scraped data, synthetic data, data product, data mesh, lakehouse, medallion architecture, dbt, Snowflake, BigQuery, Databricks, Fivetran, Airbyte, reverse ETL, feature store, customer data as asset, data monetization, data productization, anonymization, k-anonymity, differential privacy, M&A data diligence, data org, analytics engineer, data engineer, data scientist, data product manager, centralize vs embed, hub and spoke
# Audit data sources for AI training eligibility
python scripts/ai_training_data_audit.py # uses embedded sample
python scripts/ai_training_data_audit.py path/to/sources.json
# Pick data architecture + build-vs-buy + sequencing
python scripts/data_product_strategy_picker.py # uses embedded Series A SaaS
python scripts/data_product_strategy_picker.py path/to/profile.json
# Value the customer data corpus + productization viability
python scripts/data_asset_valuator.py # uses embedded B2B sample
python scripts/data_asset_valuator.py path/to/corpus.json
The 2026 question every startup is facing: can we use customer data to train our model?
The answer is rarely binary. It depends on three independent dimensions:
| Dimension | Values |
|---|---|
| Origin | 1st-party-explicit-opt-in / 1st-party-TOS-only / partner-licensed / scraped / synthetic |
| Data class | Anonymous aggregate / behavioral / PII / 3rd-party content / regulated (PHI, PCI, kids) |
| Use case | In-product personalization / fine-tune our model / train foundation model / external sharing |
Each combination produces GO / MITIGATE / NO-GO. Run ai_training_data_audit.py on a JSON inventory of sources.
See references/ai_training_data_rights.md for the full matrix + GDPR Art. 6 lawful basis decision tree + EU AI Act high-risk triggers.
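The three dimensions above can be modeled as one record per data source. The following is a minimal sketch, not the schema that ai_training_data_audit.py actually expects: the field names (`name`, `origin`, `data_class`, `use_case`), the lowercase spellings, and the `validation_errors` helper are all assumptions made for illustration. It only checks that an inventory entry uses values from the table; it does not reproduce the GO / MITIGATE / NO-GO matrix, which lives in the reference file.

```python
from dataclasses import dataclass

# Allowed values copied from the three-dimension table above.
# Spelling and casing here are assumptions, not the script's real vocabulary.
ORIGINS = {
    "1st-party-explicit-opt-in", "1st-party-TOS-only",
    "partner-licensed", "scraped", "synthetic",
}
DATA_CLASSES = {
    "anonymous aggregate", "behavioral", "PII",
    "3rd-party content", "regulated",
}
USE_CASES = {
    "in-product personalization", "fine-tune our model",
    "train foundation model", "external sharing",
}


@dataclass(frozen=True)
class DataSource:
    """One inventory entry; field names are hypothetical."""
    name: str
    origin: str
    data_class: str
    use_case: str


def validation_errors(src: DataSource) -> list[str]:
    """Return a list of problems; an empty list means the entry is well-formed."""
    errors = []
    if src.origin not in ORIGINS:
        errors.append(f"{src.name}: unknown origin {src.origin!r}")
    if src.data_class not in DATA_CLASSES:
        errors.append(f"{src.name}: unknown data class {src.data_class!r}")
    if src.use_case not in USE_CASES:
        errors.append(f"{src.name}: unknown use case {src.use_case!r}")
    return errors
```

Checking entries like this before running the audit keeps typos in the inventory from being mistaken for a new category.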
Architecture choice (warehouse vs lakehouse vs mesh) is stage-driven, not preference-driven:
Build vs buy is decided per layer:
| Layer | Default | Exception or next step |
|---|---|---|
| Storage / warehouse | Never build | (You’re a data infra company) |
| ELT / ingest | Never build | Source isn’t supported by Fivetran/Airbyte |
| Modeling (dbt) | Always build | This is your IP |
| BI / dashboards | Buy at <100 consumers | Embedded analytics for customers |
| Feature store | Defer until 3+ prod models | Then build OR buy Tecton/Hopsworks |
| ML platform | Defer until 5+ prod models | Then buy SageMaker/Vertex/Databricks |
Run data_product_strategy_picker.py for a stage-specific recommendation. See references/data_product_strategy.md for kill criteria per architecture and the build-vs-buy decision tree.
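The thresholds in the build-vs-buy table can be read as a small lookup. The sketch below encodes only what the table states; the layer keys (`"storage"`, `"ingest"`, and so on), the function name, and the `"unspecified"` return value are assumptions for illustration and are not taken from data_product_strategy_picker.py.

```python
def layer_default(layer: str, prod_models: int = 0, bi_consumers: int = 0) -> str:
    """Return the table's default for one layer.

    Layer keys are hypothetical. Exceptions from the table's last column
    (e.g. an ingest source not supported by Fivetran/Airbyte) are not modeled.
    """
    if layer == "storage":
        return "buy"  # table: never build
    if layer == "ingest":
        return "buy"  # table: never build
    if layer == "modeling":
        return "build"  # table: always build, this is your IP
    if layer == "bi":
        # The table only covers the <100 consumers case.
        return "buy" if bi_consumers < 100 else "unspecified"
    if layer == "feature_store":
        return "defer" if prod_models < 3 else "build-or-buy"
    if layer == "ml_platform":
        return "defer" if prod_models < 5 else "buy"
    raise ValueError(f"unknown layer: {layer}")
```

For example, a team with four production models would get `"build-or-buy"` for the feature store and `"defer"` for the ML platform.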
The shift: at Series B+, customer data is no longer just operational — it’s an asset that can be:
But it can also be a liability:
Run data_asset_valuator.py with corpus characteristics to get strategic value score + productization paths + risk-adjusted value.
See references/customer_data_as_asset.md for the valuation framework, M&A diligence prep checklist, and contractual constraint audit pattern.
The wrong question: "Should we hire a data scientist?" The right question: "What’s the next decision we can’t make because we lack data, and what role unblocks that?"
Stage-to-role map (B2B SaaS baseline):
| Stage | First hire | Then | Then |
|---|---|---|---|
| Pre-seed / seed | Founder-as-analyst (SQL + spreadsheets) | — | — |
| Series A | Analyst | Analytics engineer (dbt) | — |
| Series B | Data engineer | Senior analyst (embedded in GTM) | Data PM (if 3+ teams need data) |
| Growth | Manager of analytics | ML engineer (if model is core) | Head of Data |
| Late-stage | Head of Data → CDO | Specialized: BI, MLE, DPO | Federated owners per domain (mesh) |
Centralize-vs-embed trigger: when 3+ functional areas (sales, marketing, product, ops, CS) need bespoke data weekly, the central team becomes the bottleneck. Move to hub-and-spoke (central platform + embedded analysts) before that becomes a hiring crisis.
See references/data_team_org_evolution.md.
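The centralize-vs-embed trigger is simple enough to state as a check. This is a rough sketch under stated assumptions: the function name and the input (the functional areas that asked for bespoke data in a given week) are hypothetical, and the threshold of 3 comes from the trigger described above.

```python
from typing import Iterable

HUB_AND_SPOKE_THRESHOLD = 3  # "3+ functional areas" from the trigger above


def should_move_to_hub_and_spoke(areas_needing_bespoke_data: Iterable[str]) -> bool:
    """True when enough distinct functional areas need bespoke data weekly."""
    distinct_areas = {area.strip().lower() for area in areas_needing_bespoke_data}
    return len(distinct_areas) >= HUB_AND_SPOKE_THRESHOLD
```

Tracking this count from the request queue each week gives an early signal before the central team becomes the bottleneck.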
Goal: Decide whether a specific data source can train a specific use case.
# 1. Build sources.json with one entry per data source
# 2. Run the audit
python scripts/ai_training_data_audit.py sources.json
# 3. For each MITIGATE: assign owner + remediation
# 4. For each NO-GO: document the kill reason for the legal log
# 5. Cross-check with cs-general-counsel-advisor on top-3 mitigation items
# 6. Log via /cs:decide
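Steps 3 and 4 of this workflow can be turned into a follow-up list. The sketch below assumes the audit results are available as a list of dicts with `source` and `verdict` keys; that shape is hypothetical, since the actual output format of ai_training_data_audit.py is not shown here.

```python
def follow_ups(results: list[dict]) -> list[str]:
    """Map each non-GO verdict to the follow-up named in the workflow steps."""
    actions = []
    for result in results:
        source = result["source"]
        verdict = result["verdict"]
        if verdict == "MITIGATE":
            actions.append(f"{source}: assign owner and remediation")
        elif verdict == "NO-GO":
            actions.append(f"{source}: document kill reason for the legal log")
        # GO verdicts need no follow-up in this sketch.
    return actions
```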
Goal: Pick warehouse / lakehouse / mesh and the build-vs-buy split for the next 12 months.
python scripts/data_product_strategy_picker.py profile.json
# Cross-check with cs-cto-advisor on engineering capacity
# Cross-check with cs-cfo-advisor on 3-year TCO
# Log via /cs:decide; consider /cs:freeze 90 if signing a multi-year SaaS contract
Goal: Value the data corpus and prepare for due diligence.
Run data_asset_valuator.py, then work through customer_data_as_asset.md.
Goal: Build the next 18 months of data hires aligned to business decisions.
**Bottom Line:** [one sentence — decision and rationale]
**The Decision:** [one of the 4 framings]
**The Evidence:** [numbers, not adjectives]
**How to Act:** [3 concrete next steps]
**Your Decision:** [the call only the founder can make]
- ../cto-advisor/: architecture capacity, scaling cliffs
- ../ciso-advisor/: data security, threat modeling for productized data
- ../general-counsel-advisor/: contractual constraints, DPA, training-data rights
- ../cfo-advisor/: build-vs-buy TCO, M&A valuation math
- ../chro-advisor/: data team hiring, leveling, comp
- ../../../engineering/database-designer/: tactical schema design
- ../../../engineering/rag-architect/: tactical AI/RAG implementation
- ../../../engineering/llm-cost-optimizer/: model cost management

Version: 1.0.0

Status: Production Ready

Disclaimer: Decisions touching training data rights, data productization, or M&A data diligence should involve qualified counsel. This skill surfaces decisions and tradeoffs; it does not replace legal review.