Skill

ai-data-retention

Manages AI model retention and machine unlearning for GDPR compliance. Covers training data deletion verification, model versioning, SISA/gradient-based techniques, and retraining triggers.

ai-ml

security

npx claudepluginhub mukul975/privacy-data-protection-skills --plugin privacy-skills-complete

Tool Access

This skill uses the workspace's default tool permissions.

Preview

GDPR Art. 5(1)(e) storage limitation requires that personal data be kept no longer than necessary for the processing purpose. For AI systems, this creates complex retention challenges: training data used to build a model may no longer be needed once training is complete, but the model itself encodes information about the training data. Machine unlearning — the process of removing the influence ...

Supporting Assets

assets/template.md
references/standards.md
references/workflows.md
scripts/process.py

SKILL.md

Stats

Parent Repo Stars: 37

Parent Repo Forks: 4

Last Commit: Mar 15, 2026

Used By: 2 plugins

Data Category	Description	Retention Consideration
Raw training data	Original personal data used for model training	Delete after training unless retraining justifies retention
Processed training data	Cleaned, augmented, feature-engineered data	Same as raw — delete when training purpose exhausted
Validation/test data	Data used for model evaluation	Retain for model audit and comparison; pseudonymise
Model weights/parameters	Trained model artefacts encoding training data information	Retain while model is deployed; delete on decommission
Inference logs	Inputs and outputs of model predictions	Retention based on purpose (audit, debugging, rights exercise)
Model metadata	Training configuration, hyperparameters, provenance	Retain for compliance documentation; low privacy risk
Embedding vectors	Dense representations derived from personal data	May contain personal data — apply retention policy

Phase	Retention Rule
Development	Training data retained during active development
Deployment	Training data deleted unless retraining is planned within defined period
Operation	Inference logs retained per purpose (30 days debug, 1 year audit)
Retraining	New training data collected; old data deleted post-training
Decommission	All model artefacts, training data, and logs deleted; retain only compliance documentation
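The Operation row above can be turned into a simple expiry check. The following is a minimal sketch, assuming the 30-day and 1-year periods from the table; the purpose keys `"debug"` and `"audit"` and the function name `log_expired` are illustrative, not part of the skill's scripts.

```python
from datetime import date, timedelta

# Retention periods from the Operation row; the keys are hypothetical labels.
LOG_RETENTION = {
    "debug": timedelta(days=30),
    "audit": timedelta(days=365),
}


def log_expired(created: date, purpose: str, today: date) -> bool:
    """Return True when an inference log has outlived its retention period."""
    return today - created > LOG_RETENTION[purpose]


# A debug log written on 1 January is due for deletion by mid-February,
# while the same log kept for audit purposes is not.
debug_due = log_expired(date(2026, 1, 1), "debug", date(2026, 2, 15))
audit_due = log_expired(date(2026, 1, 1), "audit", date(2026, 2, 15))
```

Keying the period on purpose rather than on log type keeps the check aligned with the storage limitation principle, where the purpose determines how long retention is justified.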

Property	Value
Guarantee	Complete — model has no knowledge of deleted data
Cost	Very high — full training cost for each deletion request
Feasibility	Impractical for large models or frequent deletion requests
When to use	Small models, infrequent requests, high-sensitivity data

Property	Value
Guarantee	Exact within the affected shard
Cost	1/k of full retraining (k = number of shards)
Feasibility	Requires SISA architecture from the start
Trade-off	Accuracy may drop as the number of shards grows, because each constituent model trains on less data
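The sharding idea behind the SISA properties above can be sketched as follows. This is a simplified illustration, assuming records are assigned to shards by hashing a record ID; it covers only the sharding and isolation part (slicing and prediction aggregation are omitted), and the names `shard_of`, `build_shards`, and `delete_records` are hypothetical.

```python
import hashlib
from collections import defaultdict


def shard_of(record_id: str, k: int) -> int:
    """Deterministically map a record ID to one of k shards."""
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % k


def build_shards(record_ids, k):
    """Group record IDs by shard; one constituent model is trained per shard."""
    shards = defaultdict(set)
    for rid in record_ids:
        shards[shard_of(rid, k)].add(rid)
    return shards


def delete_records(shards, to_delete, k):
    """Remove records and return the shards whose models need retraining."""
    dirty = set()
    for rid in to_delete:
        s = shard_of(rid, k)
        if rid in shards[s]:
            shards[s].discard(rid)
            dirty.add(s)
    return dirty


ids = [f"user-{i}" for i in range(100)]
shards = build_shards(ids, k=4)
dirty = delete_records(shards, ["user-7"], k=4)  # only one shard is affected
```

Because the assignment is deterministic, a deletion request can be routed to its shard without scanning the whole dataset, and only the models in `dirty` are retrained.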

Property	Value
Guarantee	Approximate — statistically similar to retrained model
Cost	Low — few gradient steps
Feasibility	Works for most differentiable models
Verification	Requires membership inference testing to verify
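The table above appears to describe gradient-based approximate unlearning. One common heuristic of that kind is to take a few gradient ascent steps on the data to be forgotten while descending on the retained data. The sketch below applies it to a toy one-parameter model `y = w * x`; the data, learning rate, and step count are assumptions chosen for illustration, and the comparison against a retrained model stands in for, but does not replace, a membership inference test.

```python
def fit(data):
    """Least-squares slope for the one-parameter model y = w * x."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


def loss(w, x, y):
    """Squared-error loss of the model on a single point."""
    return 0.5 * (w * x - y) ** 2


def mean_grad(w, data):
    """Mean gradient of the squared-error loss with respect to w."""
    return sum((w * x - y) * x for x, y in data) / len(data)


def unlearn(w, forget, retain, lr=0.01, steps=5):
    """Ascend on the forget set and descend on the retain set for a few steps."""
    for _ in range(steps):
        w = w + lr * mean_grad(w, forget) - lr * mean_grad(w, retain)
    return w


retain = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
forget = [(2.0, 10.0)]

w_before = fit(retain + forget)      # model trained on all data
w_after = unlearn(w_before, forget, retain)
w_retrained = fit(retain)            # reference: retraining without the point
```

In this toy setting the unlearned parameter moves toward the retrained one, but the ascent term is unbounded, so the step count and learning rate would need tuning against a retrained reference in any real use.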

Property	Value
Guarantee	Approximate — first-order approximation
Cost	Medium — requires Hessian computation
Feasibility	Best for smaller models or linear models
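The table above does not name its technique, but a Hessian-based approximate update suggests an influence-function or Newton-step style removal. As a hedged illustration under that assumption, the sketch below removes one point from a one-parameter ridge regression with a single Newton step. For this quadratic loss the step happens to match full retraining; for non-quadratic losses it would only be an approximation. All names and data are hypothetical.

```python
def fit_ridge(data, lam):
    """Closed-form ridge solution for the one-parameter model y = w * x."""
    return sum(x * y for x, y in data) / (sum(x * x for x, _ in data) + lam)


def newton_removal(w, remaining, removed, lam):
    """One Newton step that removes the influence of `removed` from w.

    `w` must be the optimum on remaining + [removed]; the Hessian is that of
    the regularised loss on the remaining data only.
    """
    xr, yr = removed
    hessian = sum(x * x for x, _ in remaining) + lam
    grad_removed = (w * xr - yr) * xr
    return w + grad_removed / hessian


remaining = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
removed = (2.0, 10.0)
lam = 1.0

w_full = fit_ridge(remaining + [removed], lam)
w_unlearned = newton_removal(w_full, remaining, removed, lam)
w_retrained = fit_ridge(remaining, lam)
```

In higher dimensions the scalar `hessian` becomes a matrix that has to be inverted or approximated, which is consistent with the medium cost and the preference for smaller or linear models noted in the table.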

Element	Documentation
Model version ID	Unique identifier (e.g., model-v2.3.1-20260314)
Training data snapshot	Hash of training dataset used for this version
Training date	When training was executed
Data deletions applied	Which data subject deletions are reflected in this version
Unlearning applied	Any approximate unlearning applied since last full retraining
Privacy properties	DP epsilon, MI test results for this version
Deployment dates	When deployed and when retired
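The documentation elements above map naturally onto a structured record. The following is one possible shape, assuming the training data snapshot is identified by a SHA-256 hash over sorted record IDs; the class name, field names, and the example deletion reference `DSR-0001` are illustrative and not taken from the skill's assets.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


def dataset_hash(record_ids) -> str:
    """Order-independent SHA-256 fingerprint of a training data snapshot."""
    h = hashlib.sha256()
    for rid in sorted(record_ids):
        h.update(rid.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


@dataclass
class ModelVersionRecord:
    version_id: str                      # e.g. "model-v2.3.1-20260314"
    training_data_hash: str
    training_date: date
    deletions_applied: List[str] = field(default_factory=list)
    unlearning_applied: List[str] = field(default_factory=list)
    dp_epsilon: Optional[float] = None   # differential privacy budget, if any
    mi_test_result: Optional[str] = None
    deployed_on: Optional[date] = None
    retired_on: Optional[date] = None


record = ModelVersionRecord(
    version_id="model-v2.3.1-20260314",
    training_data_hash=dataset_hash(["user-2", "user-1"]),
    training_date=date(2026, 3, 14),
    deletions_applied=["DSR-0001"],      # hypothetical deletion request ID
)
```

Hashing record IDs rather than the records themselves lets the version record show which snapshot was used without storing personal data in the compliance documentation.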

Trigger	Action
Accumulated deletion requests exceed threshold	Full retraining on updated dataset
Scheduled periodic retraining	Incorporate all pending deletions
Privacy audit reveals unacceptable leakage	Retrain with enhanced privacy measures
Model performance degradation	Retrain with current data (post-deletions)
Regulatory change	Assess if retraining needed for compliance
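The measurable triggers above can be evaluated automatically. The sketch below is a minimal example with hypothetical thresholds (100 pending deletions, 90 days, a membership inference attack AUC of 0.6, a 5-point accuracy drop); none of these values come from the source, and the regulatory-change trigger is left out because it calls for human assessment.

```python
from dataclasses import dataclass


@dataclass
class ModelState:
    pending_deletions: int    # deletion requests not yet reflected in the model
    days_since_retrain: int
    mi_attack_auc: float      # membership inference attack AUC from the audit
    accuracy_drop: float      # drop relative to accuracy at deployment


def retraining_reasons(state, *, max_deletions=100, max_days=90,
                       max_auc=0.6, max_drop=0.05):
    """Return the list of triggers that currently call for retraining."""
    reasons = []
    if state.pending_deletions > max_deletions:
        reasons.append("deletion threshold exceeded")
    if state.days_since_retrain >= max_days:
        reasons.append("scheduled retraining due")
    if state.mi_attack_auc > max_auc:
        reasons.append("privacy leakage above limit")
    if state.accuracy_drop > max_drop:
        reasons.append("performance degradation")
    return reasons


state = ModelState(pending_deletions=150, days_since_retrain=10,
                   mi_attack_auc=0.55, accuracy_drop=0.01)
reasons = retraining_reasons(state)
```

Returning the list of reasons rather than a single boolean makes it possible to record why a retraining run was started alongside the model version.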

AI Model Retention and Unlearning

Overview