AI Infrastructure Security

Protect AI/ML infrastructure from attacks targeting model storage, APIs, and compute resources.

Quick Reference

Skill:       infrastructure-security
Agent:       06-api-security-tester
OWASP:       LLM03 (Supply Chain), LLM10 (Unbounded Consumption)
NIST:        Govern, Manage
Use Case:    Secure AI deployment infrastructure

Infrastructure Attack Surface

                    [External Threats]
                          ↓
[API Gateway] → [Load Balancer] → [Inference Servers]
      ↓              ↓                    ↓
[Rate Limit]   [DDoS Protection]   [Model Storage]
      ↓              ↓                    ↓
[Auth/AuthZ]   [TLS Termination]   [Secrets Manager]

Security Layers

1. API Security

Authentication:
  methods:
    - API keys (rotation: 90 days)
    - OAuth 2.0 / OIDC
    - mTLS for service-to-service
  requirements:
    - Strong key generation
    - Secure transmission
    - Revocation capability

Rate Limiting:
  per_user: 100 req/min
  per_ip: 1000 req/min
  burst: 50
  cost_based: true  # Token-aware limiting

Input Validation:
  max_length: 4096 tokens
  content_type: application/json
  schema_validation: strict
  encoding: UTF-8 normalized

# API Security Configuration
class APISecurityConfig:
    def __init__(self):
        self.auth_config = {
            'type': 'oauth2',
            'token_expiry': 3600,
            'refresh_enabled': True,
        }

        self.rate_limits = {
            'default': {'requests': 100, 'window': 60},
            'premium': {'requests': 1000, 'window': 60},
            'burst_multiplier': 2,
        }

        self.input_validation = {
            'max_tokens': 4096,
            'blocked_patterns': self._load_blocked_patterns(),
            'sanitization': True,
        }

2. Model Protection

Storage Security:
  encryption: AES-256-GCM
  access_control: RBAC
  audit_logging: enabled
  backup: encrypted, offsite

Theft Prevention:
  query_limits: 10000/day per user
  output_perturbation: enabled
  watermarking: model and output
  access_logging: all queries

class ModelProtection:
    def __init__(self, model):
        self.model = model
        self.watermark = self._generate_watermark()

    def protected_inference(self, input_data, user_id):
        # Log the query
        self.log_query(user_id, input_data)

        # Check query limits
        if self.exceeds_limit(user_id):
            raise RateLimitError("Query limit exceeded")

        # Run inference
        output = self.model(input_data)

        # Add output perturbation (anti-extraction)
        output = self.add_perturbation(output)

        # Apply watermark
        output = self.apply_watermark(output)

        return output

3. Network Security

Network Configuration:
  internal_only: true
  vpc_isolation: enabled
  firewall_rules:
    - allow: internal_services
    - deny: all_external (except API gateway)

TLS Configuration:
  version: "1.3"
  cipher_suites: [TLS_AES_256_GCM_SHA384]
  certificate_rotation: 90 days
  mtls: service_to_service

4. Compute Security

Container Security:
  base_image: distroless
  user: non-root
  filesystem: read-only
  capabilities: minimal
  seccomp: enabled

Resource Limits:
  cpu: 4 cores max
  memory: 16GB max
  gpu_memory: 24GB max
  disk: ephemeral only

Isolation:
  runtime: gvisor
  network: namespace isolated
  secrets: mounted, not in env

Security Checklist

API Layer:
  - [ ] Strong authentication (OAuth2/mTLS)
  - [ ] Rate limiting implemented
  - [ ] Input validation enabled
  - [ ] Error messages sanitized
  - [ ] Logging comprehensive

Storage Layer:
  - [ ] Encryption at rest
  - [ ] Access controls configured
  - [ ] Audit logging enabled
  - [ ] Backup encryption

Network Layer:
  - [ ] TLS 1.3 enforced
  - [ ] Internal VPC only
  - [ ] Firewall rules configured
  - [ ] DDoS protection enabled

Compute Layer:
  - [ ] Non-root containers
  - [ ] Resource limits set
  - [ ] Secrets in vault
  - [ ] Immutable infrastructure

Vulnerability Testing

class InfrastructureSecurityTester:
    def test_api_security(self, endpoint):
        results = []

        # Test authentication bypass
        results.append(self.test_auth_bypass(endpoint))

        # Test rate limiting
        results.append(self.test_rate_limits(endpoint))

        # Test input validation
        results.append(self.test_input_validation(endpoint))

        # Test error handling
        results.append(self.test_error_disclosure(endpoint))

        return results

    def test_auth_bypass(self, endpoint):
        payloads = [
            {'Authorization': ''},
            {'Authorization': 'Bearer invalid'},
            {'Authorization': 'Bearer ' + 'a' * 1000},
        ]
        for payload in payloads:
            response = requests.get(endpoint, headers=payload)
            if response.status_code != 401:
                return Finding("auth_bypass", "CRITICAL")
        return None

Severity Classification

CRITICAL:
  - Authentication bypass
  - Model theft possible
  - Data exposure

HIGH:
  - Rate limiting bypassable
  - Weak encryption
  - Insufficient logging

MEDIUM:
  - Missing input validation
  - Verbose error messages
  - Outdated dependencies

LOW:
  - Non-optimal configurations
  - Minor policy gaps

Troubleshooting

Issue: API rate limiting not effective
Solution: Implement token-based limits, add IP reputation

Issue: Model extraction detected
Solution: Lower query limits, add output perturbation

Issue: High latency from security layers
Solution: Optimize validation, use caching, async logging

Integration Points

Component	Purpose
Agent 06	Security testing
Agent 08	CI/CD security gates
/test api	Security scanning
SIEM	Security monitoring

Protect AI infrastructure with defense-in-depth security.

infrastructure-security

AI Infrastructure Security

Quick Reference

Infrastructure Attack Surface

Security Layers

1. API Security

2. Model Protection

3. Network Security

4. Compute Security

Security Checklist

Vulnerability Testing

Severity Classification

Troubleshooting

Integration Points

Similar Skills