From records-of-processing-skills
Generates GDPR Records of Processing Activities from IT inventories including Active Directory, cloud catalogs, API gateway logs, and database schemas. Automates field population, data flow discovery, and RoPA mapping.
npx claudepluginhub mukul975/privacy-data-protection-skills --plugin records-of-processing-skillsThis skill uses the workspace's default tool permissions.
Manual RoPA creation through interviews and questionnaires is time-consuming, subjective, and prone to omitting processing activities that stakeholders forget or are unaware of. Automated RoPA generation leverages existing IT system inventories, cloud service catalogs, API gateway logs, and database schemas to discover processing activities and pre-populate Art. 30(1) fields. This approach redu...
Creates isolated Git worktrees for feature branches with prioritized directory selection, gitignore safety checks, auto project setup for Node/Python/Rust/Go, and baseline verification.
Executes implementation plans in current session by dispatching fresh subagents per independent task, with two-stage reviews: spec compliance then code quality.
Dispatches parallel agents to independently tackle 2+ tasks like separate test failures or subsystems without shared state or dependencies.
Manual RoPA creation through interviews and questionnaires is time-consuming, subjective, and prone to omitting processing activities that stakeholders forget or are unaware of. Automated RoPA generation leverages existing IT system inventories, cloud service catalogs, API gateway logs, and database schemas to discover processing activities and pre-populate Art. 30(1) fields. This approach reduces the RoPA creation burden by 60-80% while improving coverage, as it captures processing activities that manual interviews routinely miss (shadow IT, automated data pipelines, third-party integrations).
What it reveals:
Art. 30 fields populated:
| Field | Data Extracted from AD |
|---|---|
| Art. 30(1)(a) — Controller identity | Organisation name, department hierarchy, DPO group membership |
| Art. 30(1)(d) — Recipients | Security group memberships reveal internal access patterns |
| Art. 30(1)(g) — Security measures | Conditional access policies, MFA enforcement status |
Discovery approach for Helix Biotech Solutions:
Azure AD Tenant: helix-biotech.onmicrosoft.com
├── Enterprise Applications (87 registered)
│ ├── SAP SuccessFactors → HR processing activities
│ ├── Veeva Vault CDMS → Clinical trial data management
│ ├── Salesforce → Customer relationship management
│ ├── Google Analytics → Website analytics
│ ├── ADP Workforce Now → Payroll processing
│ └── ... (82 more applications)
├── Security Groups (134 groups)
│ ├── SG-HR-Payroll (4 members) → Access to payroll data
│ ├── SG-Clinical-Data (12 members) → Access to clinical trial data
│ ├── SG-Finance-AP (6 members) → Access to accounts payable
│ └── ... (131 more groups)
└── Conditional Access Policies (23 policies)
├── Require MFA for all users
├── Block legacy authentication
├── Require compliant device for clinical systems
└── ... (20 more policies)
What it reveals:
AWS example for Helix Biotech Solutions:
| AWS Service | Region | Personal Data Indicator | Mapped Processing Activity |
|---|---|---|---|
| RDS (PostgreSQL) | eu-central-1 | Tagged: data-classification=confidential | Employee HR database |
| S3 Bucket: helix-clinical-data | eu-central-1 | Tagged: data-classification=restricted, contains-special-category=true | Clinical trial document storage |
| S3 Bucket: helix-marketing | eu-central-1 | Tagged: data-classification=internal | Marketing analytics data |
| Lambda: payroll-processor | eu-central-1 | Invoked by ADP integration | Payroll data transformation |
| CloudFront Distribution | Global (edge) | Logs contain IP addresses | Website content delivery |
| SES (Simple Email Service) | eu-west-1 | Sends to email addresses | Customer and employee email communications |
Azure example:
| Azure Service | Region | Data Classification | Mapped Processing Activity |
|---|---|---|---|
| Azure SQL | West Europe | Confidential | Finance and accounting database |
| Azure Blob Storage | West Europe | Restricted | Clinical trial imaging data |
| Azure AD | Global | Internal | Identity and access management |
| Azure Monitor | West Europe | Internal | System monitoring (may capture user activity) |
What it reveals:
API Gateway analysis for Helix Biotech Solutions:
| API Endpoint | Method | Personal Data Fields | Upstream System | Downstream System | Daily Volume |
|---|---|---|---|---|---|
| /api/v2/employees | GET/POST | employee_id, name, email, department | SAP SuccessFactors | Internal HR portal | 340 requests |
| /api/v2/patients/{id}/records | GET | patient_id, diagnosis, treatment | Veeva Vault CDMS | Clinical reporting dashboard | 1,200 requests |
| /api/v2/orders | POST | customer_name, email, billing_address | Salesforce | SAP ERP | 890 requests |
| /api/v1/analytics/events | POST | ip_address, user_agent, page_url | Website (JS tracker) | Google Analytics | 45,000 requests |
| /api/v2/vendor/payments | POST | vendor_name, bank_account, tax_id | SAP ERP | Banking API (Deutsche Bank) | 120 requests |
What it reveals:
Schema analysis approach:
Personal data column name patterns:
| Pattern | Data Category | Art. 9 Special Category |
|---|---|---|
*name*, *first_name*, *last_name*, *full_name* | Name | No |
*email*, *e_mail* | Email address | No |
*phone*, *mobile*, *telephone* | Phone number | No |
*address*, *street*, *city*, *postcode*, *zip* | Postal address | No |
*dob*, *date_of_birth*, *birth_date* | Date of birth | No |
*ssn*, *social_security*, *tax_id*, *national_id* | National identifier | No |
*salary*, *compensation*, *bank_account*, *iban* | Financial data | No |
*ip_address*, *ip_addr* | IP address | No |
*diagnosis*, *medical*, *health*, *condition* | Health data | Yes |
*genetic*, *dna*, *genome* | Genetic data | Yes |
*biometric*, *fingerprint*, *facial* | Biometric data | Yes |
*ethnicity*, *race*, *ethnic_origin* | Racial/ethnic origin | Yes |
*religion*, *religious*, *belief* | Religious beliefs | Yes |
*union*, *trade_union* | Trade union membership | Yes |
*political*, *party* | Political opinions | Yes |
*criminal*, *conviction*, *offence* | Criminal data (Art. 10) | Art. 10 |
Source: Azure AD tenant configuration + organisation management. Automation: Fully automated from AD tenant metadata.
Source: Application descriptions in AD enterprise application registry + API documentation. Automation: Partially automated. The system name and description provide a starting point, but specific purpose articulation requires human refinement to meet Art. 5(1)(b) specificity requirements.
Approach: Generate a draft purpose from the application description, flag it as "DRAFT — requires DPO review," and assign to the processing owner for refinement.
Source: Database schema analysis + API request/response schemas. Automation: Highly automated for data categories (column name matching). Data subject categories require inference from table context.
Source: AD security groups (internal recipients) + API gateway logs (external integrations) + cloud service catalog (processors). Automation: Highly automated. AD groups reveal internal access. API gateway reveals external data flows. Cloud catalog identifies processors.
Source: Cloud service region configuration + API gateway destination IPs + CDN configuration. Automation: Highly automated. If a cloud service is deployed in a non-EEA region, or API calls route to non-EEA endpoints, the system flags a potential international transfer.
Source: Database table statistics (oldest records), backup retention policies, lifecycle management rules. Automation: Partially automated. Can detect actual retention (how long data exists) but not intended retention (how long it should exist). Flags discrepancies between actual and policy retention.
Source: Cloud security configuration (encryption settings, access policies), AD conditional access policies, WAF rules, certificate configuration. Automation: Highly automated for technical measures. Organisational measures require manual documentation.
| Art. 30(1) Field | Automation Level | Human Review Required |
|---|---|---|
| (a) Controller identity | High (90%) | Verify legal entity name, DPO current |
| (b) Purposes | Low (30%) | Purpose must be articulated specifically |
| (c) Data subject categories | Medium (60%) | Verify inferred categories |
| (c) Personal data categories | High (85%) | Verify pattern-matched columns |
| (d) Recipients | High (80%) | Verify external recipients, add DPA references |
| (e) International transfers | High (85%) | Verify transfer mechanisms |
| (f) Retention periods | Low (25%) | Define policy retention, not just actual |
| (g) Security measures | Medium (70%) | Add organisational measures |
┌─────────────────────┐ ┌──────────────────┐ ┌────────────────────┐
│ Azure AD / Okta │ │ Cloud Catalog │ │ API Gateway Logs │
│ (Org structure, │ │ (AWS/Azure/GCP │ │ (Data flows, │
│ app registrations,│ │ service list, │ │ external calls, │
│ security groups) │ │ regions, tags) │ │ volumes) │
└────────┬────────────┘ └────────┬─────────┘ └─────────┬──────────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────────────────────────────────┐
│ RoPA Auto-Generation Engine │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ AD Parser │ │ Cloud Parser │ │ API Log Analyzer │ │
│ └──────┬───────┘ └──────┬───────┘ └──────────┬───────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Field Mapping and Population Engine │ │
│ │ • Pattern matching for personal data columns │ │
│ │ • Data flow graph construction │ │
│ │ • Transfer detection (non-EEA region identification) │ │
│ │ • Security control extraction │ │
│ └────────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Draft RoPA Generator │ │
│ │ • Creates draft entries with populated fields │ │
│ │ • Flags fields requiring human review │ │
│ │ • Assigns to processing owners for validation │ │
│ └────────────────────────┬───────────────────────────────────┘ │
└───────────────────────────┼───────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Database Schema Scanner │
│ (Connects to identified databases, extracts schemas, │
│ applies PII detection patterns) │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ RoPA Management Platform │
│ (OneTrust / TrustArc / Collibra / Custom) │
│ Draft entries queued for DPO and processing owner review │
└────────────────────────────────────────────────────────────────────┘
False positives in PII detection: Column names like "company_address" may be flagged as personal data when they are business data. Mitigation: human review of all auto-detected PII columns.
Missing context for purposes: Automated systems cannot infer the business purpose of processing from technical metadata alone. Mitigation: purposes are always flagged as "DRAFT" and require human articulation.
Shadow IT blind spots: Systems not registered in AD or cloud catalogs will not be discovered. Mitigation: combine automated discovery with periodic manual questionnaires to capture unregistered processing.
Encrypted or tokenised data: If data is encrypted at the column level or tokenised, schema analysis may not detect personal data. Mitigation: maintain a manual register of encryption/tokenisation mappings.
Consent and lawful basis: Cannot be determined from technical metadata. Always requires legal/DPO input.