Skill

automated-ropa-generation

Generates GDPR Records of Processing Activities from IT inventories including Active Directory, cloud catalogs, API gateway logs, and database schemas. Automates field population, data flow discovery, and RoPA mapping.

Azure

security

automation

Popularity

Parent stars

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/records-of-processing-skills:automated-ropa-generation

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Supporting Files

assets/template.mdreferences/standards.mdreferences/workflows.mdscripts/process.py

SKILL.md

264 lines · ~3.7k tokens

Stats

LanguagePython

Parent stars86

Parent forks17

MaintenanceGood

Last CommitMar 15, 2026

Actions

View Source View Plugin View on GitHub View README

Automated RoPA Generation

Overview

Manual RoPA creation through interviews and questionnaires is time-consuming, subjective, and prone to omitting processing activities that stakeholders forget or are unaware of. Automated RoPA generation leverages existing IT system inventories, cloud service catalogs, API gateway logs, and database schemas to discover processing activities and pre-populate Art. 30(1) fields. This approach reduces the RoPA creation burden by 60-80% while improving coverage, as it captures processing activities that manual interviews routinely miss (shadow IT, automated data pipelines, third-party integrations).

Data Sources for Automated RoPA Population

Source 1: Active Directory / Azure AD

What it reveals:

Organisational structure (departments, teams) for mapping processing owners
Group memberships revealing who has access to what systems (recipient categories)
Application registrations and service principals (processing system inventory)
Conditional access policies (security measures documentation)

Art. 30 fields populated:

Field	Data Extracted from AD
Art. 30(1)(a) — Controller identity	Organisation name, department hierarchy, DPO group membership
Art. 30(1)(d) — Recipients	Security group memberships reveal internal access patterns
Art. 30(1)(g) — Security measures	Conditional access policies, MFA enforcement status

Discovery approach for Helix Biotech Solutions:

Azure AD Tenant: helix-biotech.onmicrosoft.com
├── Enterprise Applications (87 registered)
│   ├── SAP SuccessFactors → HR processing activities
│   ├── Veeva Vault CDMS → Clinical trial data management
│   ├── Salesforce → Customer relationship management
│   ├── Google Analytics → Website analytics
│   ├── ADP Workforce Now → Payroll processing
│   └── ... (82 more applications)
├── Security Groups (134 groups)
│   ├── SG-HR-Payroll (4 members) → Access to payroll data
│   ├── SG-Clinical-Data (12 members) → Access to clinical trial data
│   ├── SG-Finance-AP (6 members) → Access to accounts payable
│   └── ... (131 more groups)
└── Conditional Access Policies (23 policies)
    ├── Require MFA for all users
    ├── Block legacy authentication
    ├── Require compliant device for clinical systems
    └── ... (20 more policies)

Source 2: Cloud Service Catalog

What it reveals:

All cloud services (IaaS, PaaS, SaaS) in use
Data residency (which regions/countries data is stored in)
Service provider identity (processors and sub-processors)
Data classification tags (if implemented)

AWS example for Helix Biotech Solutions:

AWS Service	Region	Personal Data Indicator	Mapped Processing Activity
RDS (PostgreSQL)	eu-central-1	Tagged: data-classification=confidential	Employee HR database
S3 Bucket: helix-clinical-data	eu-central-1	Tagged: data-classification=restricted, contains-special-category=true	Clinical trial document storage
S3 Bucket: helix-marketing	eu-central-1	Tagged: data-classification=internal	Marketing analytics data
Lambda: payroll-processor	eu-central-1	Invoked by ADP integration	Payroll data transformation
CloudFront Distribution	Global (edge)	Logs contain IP addresses	Website content delivery
SES (Simple Email Service)	eu-west-1	Sends to email addresses	Customer and employee email communications

Azure example:

Azure Service	Region	Data Classification	Mapped Processing Activity
Azure SQL	West Europe	Confidential	Finance and accounting database
Azure Blob Storage	West Europe	Restricted	Clinical trial imaging data
Azure AD	Global	Internal	Identity and access management
Azure Monitor	West Europe	Internal	System monitoring (may capture user activity)

Source 3: API Gateway Logs

What it reveals:

Which APIs handle personal data (based on request/response inspection)
Data flow patterns between systems (upstream and downstream)
Third-party API integrations (external recipients)
Volume and frequency of data processing (helps assess whether processing is "occasional" for Art. 30(5))

API Gateway analysis for Helix Biotech Solutions:

API Endpoint	Method	Personal Data Fields	Upstream System	Downstream System	Daily Volume
/api/v2/employees	GET/POST	employee_id, name, email, department	SAP SuccessFactors	Internal HR portal	340 requests
/api/v2/patients/{id}/records	GET	patient_id, diagnosis, treatment	Veeva Vault CDMS	Clinical reporting dashboard	1,200 requests
/api/v2/orders	POST	customer_name, email, billing_address	Salesforce	SAP ERP	890 requests
/api/v1/analytics/events	POST	ip_address, user_agent, page_url	Website (JS tracker)	Google Analytics	45,000 requests
/api/v2/vendor/payments	POST	vendor_name, bank_account, tax_id	SAP ERP	Banking API (Deutsche Bank)	120 requests

Source 4: Database Schemas

What it reveals:

Exact data categories stored (column names and types)
Table relationships revealing data flows
Personal data identification through column name pattern matching
Data volume and growth patterns

Schema analysis approach:

Connect to each database identified in the cloud catalog.
Extract table schemas (column names, data types, constraints).
Apply personal data identification patterns to column names.
Map identified personal data columns to Art. 30(1)(c) data categories.

Personal data column name patterns:

Pattern	Data Category	Art. 9 Special Category
`name`, `first_name`, `last_name`, `full_name`	Name	No
`email`, `e_mail`	Email address	No
`phone`, `mobile`, `telephone`	Phone number	No
`address`, `street`, `city`, `postcode`, `zip`	Postal address	No
`dob`, `date_of_birth`, `birth_date`	Date of birth	No
`ssn`, `social_security`, `tax_id`, `national_id`	National identifier	No
`salary`, `compensation`, `bank_account`, `iban`	Financial data	No
`ip_address`, `ip_addr`	IP address	No
`diagnosis`, `medical`, `health`, `condition`	Health data	Yes
`genetic`, `dna`, `genome`	Genetic data	Yes
`biometric`, `fingerprint`, `facial`	Biometric data	Yes
`ethnicity`, `race`, `ethnic_origin`	Racial/ethnic origin	Yes
`religion`, `religious`, `belief`	Religious beliefs	Yes
`union`, `trade_union`	Trade union membership	Yes
`political`, `party`	Political opinions	Yes
`criminal`, `conviction`, `offence`	Criminal data (Art. 10)	Art. 10

Automated Field Population Logic

Art. 30(1)(a) — Controller Identity

Source: Azure AD tenant configuration + organisation management. Automation: Fully automated from AD tenant metadata.

Art. 30(1)(b) — Purposes

Source: Application descriptions in AD enterprise application registry + API documentation. Automation: Partially automated. The system name and description provide a starting point, but specific purpose articulation requires human refinement to meet Art. 5(1)(b) specificity requirements.

Approach: Generate a draft purpose from the application description, flag it as "DRAFT — requires DPO review," and assign to the processing owner for refinement.

Art. 30(1)(c) — Data Subject and Data Categories

Source: Database schema analysis + API request/response schemas. Automation: Highly automated for data categories (column name matching). Data subject categories require inference from table context.

Art. 30(1)(d) — Recipients

Source: AD security groups (internal recipients) + API gateway logs (external integrations) + cloud service catalog (processors). Automation: Highly automated. AD groups reveal internal access. API gateway reveals external data flows. Cloud catalog identifies processors.

Art. 30(1)(e) — International Transfers

Source: Cloud service region configuration + API gateway destination IPs + CDN configuration. Automation: Highly automated. If a cloud service is deployed in a non-EEA region, or API calls route to non-EEA endpoints, the system flags a potential international transfer.

Art. 30(1)(f) — Retention Periods

Source: Database table statistics (oldest records), backup retention policies, lifecycle management rules. Automation: Partially automated. Can detect actual retention (how long data exists) but not intended retention (how long it should exist). Flags discrepancies between actual and policy retention.

Art. 30(1)(g) — Security Measures

Source: Cloud security configuration (encryption settings, access policies), AD conditional access policies, WAF rules, certificate configuration. Automation: Highly automated for technical measures. Organisational measures require manual documentation.

Automation Coverage Summary

Art. 30(1) Field	Automation Level	Human Review Required
(a) Controller identity	High (90%)	Verify legal entity name, DPO current
(b) Purposes	Low (30%)	Purpose must be articulated specifically
(c) Data subject categories	Medium (60%)	Verify inferred categories
(c) Personal data categories	High (85%)	Verify pattern-matched columns
(d) Recipients	High (80%)	Verify external recipients, add DPA references
(e) International transfers	High (85%)	Verify transfer mechanisms
(f) Retention periods	Low (25%)	Define policy retention, not just actual
(g) Security measures	Medium (70%)	Add organisational measures

Implementation Architecture

┌─────────────────────┐  ┌──────────────────┐  ┌────────────────────┐
│   Azure AD / Okta   │  │  Cloud Catalog   │  │  API Gateway Logs  │
│  (Org structure,    │  │  (AWS/Azure/GCP   │  │  (Data flows,      │
│   app registrations,│  │   service list,   │  │   external calls,  │
│   security groups)  │  │   regions, tags)  │  │   volumes)         │
└────────┬────────────┘  └────────┬─────────┘  └─────────┬──────────┘
         │                        │                       │
         ▼                        ▼                       ▼
┌────────────────────────────────────────────────────────────────────┐
│                    RoPA Auto-Generation Engine                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐    │
│  │ AD Parser    │  │ Cloud Parser │  │ API Log Analyzer     │    │
│  └──────┬───────┘  └──────┬───────┘  └──────────┬───────────┘    │
│         │                  │                      │               │
│         ▼                  ▼                      ▼               │
│  ┌────────────────────────────────────────────────────────────┐   │
│  │              Field Mapping and Population Engine            │   │
│  │  • Pattern matching for personal data columns              │   │
│  │  • Data flow graph construction                            │   │
│  │  • Transfer detection (non-EEA region identification)      │   │
│  │  • Security control extraction                             │   │
│  └────────────────────────┬───────────────────────────────────┘   │
│                           │                                       │
│                           ▼                                       │
│  ┌────────────────────────────────────────────────────────────┐   │
│  │              Draft RoPA Generator                          │   │
│  │  • Creates draft entries with populated fields             │   │
│  │  • Flags fields requiring human review                     │   │
│  │  • Assigns to processing owners for validation             │   │
│  └────────────────────────┬───────────────────────────────────┘   │
└───────────────────────────┼───────────────────────────────────────┘
                            │
                            ▼
┌────────────────────────────────────────────────────────────────────┐
│                Database Schema Scanner                              │
│  (Connects to identified databases, extracts schemas,              │
│   applies PII detection patterns)                                  │
└────────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌────────────────────────────────────────────────────────────────────┐
│               RoPA Management Platform                             │
│  (OneTrust / TrustArc / Collibra / Custom)                        │
│  Draft entries queued for DPO and processing owner review          │
└────────────────────────────────────────────────────────────────────┘

Limitations and Risk Mitigation

False positives in PII detection: Column names like "company_address" may be flagged as personal data when they are business data. Mitigation: human review of all auto-detected PII columns.
Missing context for purposes: Automated systems cannot infer the business purpose of processing from technical metadata alone. Mitigation: purposes are always flagged as "DRAFT" and require human articulation.
Shadow IT blind spots: Systems not registered in AD or cloud catalogs will not be discovered. Mitigation: combine automated discovery with periodic manual questionnaires to capture unregistered processing.
Encrypted or tokenised data: If data is encrypted at the column level or tokenised, schema analysis may not detect personal data. Mitigation: maintain a manual register of encryption/tokenisation mappings.
Consent and lawful basis: Cannot be determined from technical metadata. Always requires legal/DPO input.

automated-ropa-generation

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

automated-ropa-generation

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

Automated RoPA Generation

Overview

Data Sources for Automated RoPA Population

Source 1: Active Directory / Azure AD

Source 2: Cloud Service Catalog

Source 3: API Gateway Logs

Source 4: Database Schemas

Automated Field Population Logic

Art. 30(1)(a) — Controller Identity

Art. 30(1)(b) — Purposes

Art. 30(1)(c) — Data Subject and Data Categories

Art. 30(1)(d) — Recipients

Art. 30(1)(e) — International Transfers

Art. 30(1)(f) — Retention Periods

Art. 30(1)(g) — Security Measures

Automation Coverage Summary

Implementation Architecture

Limitations and Risk Mitigation

Reused across plugins

Similar Skills

Automated RoPA Generation

Overview

Data Sources for Automated RoPA Population

Source 1: Active Directory / Azure AD

Source 2: Cloud Service Catalog

Source 3: API Gateway Logs

Source 4: Database Schemas

Automated Field Population Logic

Art. 30(1)(a) — Controller Identity

Art. 30(1)(b) — Purposes

Art. 30(1)(c) — Data Subject and Data Categories

Art. 30(1)(d) — Recipients

Art. 30(1)(e) — International Transfers

Art. 30(1)(f) — Retention Periods

Art. 30(1)(g) — Security Measures

Automation Coverage Summary

Implementation Architecture

Limitations and Risk Mitigation

Reused across plugins

Similar Skills