From harness-claude
Designs structured, tamper-evident audit logs for security events like authentication, authorization, and data access. Useful for new apps, adding logs to existing systems, SOC2/HIPAA/PCI-DSS compliance, or incident investigation.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
This skill uses the workspace's default tool permissions.
Log the who, what, when, where, and outcome of every security-relevant event in a structured, tamper-evident format that enables both real-time detection and forensic reconstruction.
Attackers who gain access to a system routinely attempt to cover their tracks by deleting or modifying logs. Log tampering is not theoretical -- it is standard attacker tradecraft:
Logging is not optional overhead -- it is the immune system of the application. A system without audit logging is a system where breaches are invisible.
Define the security-relevant events that must be logged. At minimum, capture these event categories:
Use a structured event format for every audit entry. Every audit event must include these fields:
- timestamp: ISO 8601 with timezone, always UTC (e.g., 2024-01-15T14:30:00.123Z)
- event_type: Enumerated, dot-separated category (e.g., auth.login.success, authz.permission.denied, data.export.initiated)
- actor: Who performed the action -- user ID, service account name, or "system" for automated processes
- action: The verb -- create, read, update, delete, login, logout, export, deny
- resource: What was acted upon -- resource type and identifier (e.g., user:u-12345, order:o-67890, config:feature-flags)
- outcome: success or failure, with reason for failures (e.g., insufficient_permissions, invalid_credentials, rate_limited)
- source_ip: The IP address of the request origin
- session_id: Session or request correlation ID for linking related events
- metadata: Additional context -- user agent, request path, changed fields for update operations, previous and new values for configuration changes
Log at the application layer, not just infrastructure. Infrastructure logs (web server access logs, firewall logs, load balancer logs) capture network-level events but miss application-level semantics. The infrastructure sees GET /api/customers?limit=99999 200 OK. The application knows "User 123 exported all 50,000 customer records to CSV." Both are needed; the application event is far more actionable for security investigation. Implement audit logging as a first-class application concern, not an afterthought bolted onto HTTP middleware.
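As a minimal sketch of the required fields described above, the helper below assembles one audit event as a plain dictionary. The function name `build_audit_event` and its validation rules are illustrative assumptions, not part of any library; only the field names come from the text.

```python
from datetime import datetime, timezone
from typing import Optional


def build_audit_event(
    event_type: str,
    actor: str,
    action: str,
    resource: str,
    outcome: str,
    source_ip: str,
    session_id: str,
    reason: Optional[str] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Assemble one audit event (hypothetical helper, not a library API)."""
    if outcome not in ("success", "failure"):
        raise ValueError("outcome must be 'success' or 'failure'")
    if outcome == "failure" and not reason:
        raise ValueError("a failure outcome must carry a reason")

    # ISO 8601 in UTC with millisecond precision and a trailing Z.
    now = datetime.now(timezone.utc)
    timestamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")

    event = {
        "timestamp": timestamp,
        "event_type": event_type,
        "actor": actor,
        "action": action,
        "resource": resource,
        "outcome": outcome,
        "source_ip": source_ip,
        "session_id": session_id,
        "metadata": metadata or {},
    }
    if reason:
        event["reason"] = reason
    return event
```

Calling this from the code path that performs the export or deletion, rather than from generic HTTP middleware, is what lets the event carry application-level meaning such as the resource identifier and the failure reason.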
Make logs tamper-evident. An attacker who compromises the application server will attempt to delete or modify logs to cover their tracks. Tamper evidence ensures modifications are detectable. With hash chaining, each entry includes entry_hash = SHA-256(entry_data + previous_entry_hash). The first entry uses a known sentinel value. Any modification, deletion, or insertion of entries breaks the hash chain from that point forward.
Never log secrets or PII unnecessarily. Audit logs must not contain passwords, API keys, session tokens, credit card numbers, social security numbers, or other sensitive data. Log the fact that an action occurred, not the sensitive content. Correct: "User u-12345 updated their password at 2024-01-15T14:30:00Z." Incorrect: "User u-12345 changed password from 'oldpass123' to 'newpass456'." For PII, log only the minimum needed for investigation -- user ID rather than full name and email, unless specific compliance requirements mandate otherwise. Implement log scrubbing as a safety net: regex-based filters that detect and redact patterns matching known secret formats before log entries are written.
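The scrubbing safety net mentioned above might look like the following sketch. The three patterns and the `scrub` function are illustrative assumptions; a real deployment would need patterns for its own secret formats, and patterns this simple will produce false positives (for example, a 13-digit epoch timestamp matches the card-number pattern).

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive or standard set).
_REDACTION_PATTERNS = [
    # key=value or key: value pairs whose key name suggests a secret
    (
        re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b(\s*[=:]\s*)(\S+)"),
        r"\1\2[REDACTED]",
    ),
    # runs of 13 to 16 digits, optionally separated by spaces or hyphens
    (re.compile(r"\b(?:\d[ -]?){12,15}\d\b"), "[REDACTED_PAN]"),
    # US social security number layout
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
]


def scrub(text: str) -> str:
    """Redact known secret patterns from a string before it is written to a log."""
    for pattern, replacement in _REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Scrubbing is a last line of defense. The primary control is still to never pass sensitive values into the audit event in the first place.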
Define retention and archival policies aligned with compliance requirements. Common minimums: SOC2 does not prescribe a fixed period, but auditors commonly expect at least 1 year of audit log retention. PCI-DSS requires 1 year with 3 months immediately accessible for analysis. HIPAA's 6-year documentation retention requirement is commonly applied to audit logs. GDPR does not specify a retention period but requires that retention be justified and proportionate. Archive older logs to cold storage (S3 Glacier, Azure Cool/Archive Blob) to manage costs, but ensure archived logs remain searchable and verifiable. Test restoration from cold storage regularly -- logs that cannot be retrieved when needed are equivalent to no logs.
Alert on high-severity events in real time. Logs that are only read during
post-incident investigation provide forensic value but no detection value. Configure
real-time alerts for: multiple failed login attempts from a single IP or against a single
account (brute force, credential stuffing), privilege escalation events, access to
sensitive data outside normal patterns (time of day, volume, geographic location),
administrative actions from unusual source IPs, any outcome: failure on critical
operations (payment processing, data export), and new device/location for privileged
accounts. Route alerts to the security team's incident response channel (PagerDuty,
Slack, email) with enough context to triage without reading raw logs.
The OWASP Logging Cheat Sheet event categories:
- Authentication (success, failure, lockout)
- Authorization (access granted, denied, permission changes)
- Session management (creation, destruction, timeout, concurrent session detection)
- Input validation failures (rejected input, WAF blocks, malformed requests)
- Application errors (unhandled exceptions, dependency failures, timeout events)
- High-value transactions (financial operations, data exports, bulk operations, configuration changes)
Use this as a starting checklist and add domain-specific events for your application.
Structured logging format example: A complete JSON audit event:
    {
      "timestamp": "2024-01-15T14:30:00.123Z",
      "event_type": "authz.permission.denied",
      "actor": { "user_id": "u-123", "session_id": "s-456", "service": "order-api" },
      "action": "delete",
      "resource": { "type": "order", "id": "o-789" },
      "outcome": "failure",
      "reason": "insufficient_permissions",
      "source_ip": "198.51.100.42",
      "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
      "request_id": "req-abc-def",
      "metadata": {
        "required_permission": "order:delete",
        "actual_permissions": ["order:read", "order:update"]
      }
    }
Use an enumerated event_type taxonomy defined in code so events can be queried
consistently across services. Avoid free-text event descriptions that vary between
developers.
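One way to define the taxonomy in code is an enumeration shared across services. This is a sketch: the class name `AuditEventType` and the helper `parse_event_type` are hypothetical, and the three members are just the example values used earlier in this document.

```python
from enum import Enum


class AuditEventType(str, Enum):
    """Closed set of event types; extend it in one place, in code review."""

    AUTH_LOGIN_SUCCESS = "auth.login.success"
    AUTHZ_PERMISSION_DENIED = "authz.permission.denied"
    DATA_EXPORT_INITIATED = "data.export.initiated"


def parse_event_type(raw: str) -> AuditEventType:
    """Reject any event type that is not in the shared taxonomy."""
    # Enum lookup by value raises ValueError for unknown strings.
    return AuditEventType(raw)
```

Because the enum rejects unknown values, a free-text description such as "User logged in" fails fast at the call site instead of reaching the log store as an unqueryable variant.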
Tamper evidence with hash chaining in detail: Each log entry includes:
entry_hash = SHA-256(canonical_json(entry_data) + previous_entry_hash). The first entry
uses a well-known sentinel (e.g., SHA-256("GENESIS")) as the "previous hash."
Verification: recompute the chain from any starting point. If any entry has been modified,
deleted, or inserted, all subsequent hashes fail verification. Store periodic "anchor
hashes" in an independent system (a separate database, a blockchain, a signed timestamping
service) so that the integrity of the chain can be verified even if the primary log store
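is compromised. AWS CloudTrail log file integrity validation uses a similar approach: it delivers signed digest files containing SHA-256 hashes of the log files, and each digest file references the previous one.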
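The chaining and verification steps described above can be sketched as follows. The function names and the in-memory list used as the log store are assumptions for illustration; the sentinel follows the SHA-256("GENESIS") example from the text.

```python
import hashlib
import json

# Well-known sentinel used as the "previous hash" of the first entry.
GENESIS_HASH = hashlib.sha256(b"GENESIS").hexdigest()


def canonical_json(entry: dict) -> str:
    """Serialize with sorted keys and fixed separators so hashes are reproducible."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":"))


def entry_hash(entry: dict, previous_hash: str) -> str:
    payload = canonical_json(entry) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain: list, entry: dict) -> None:
    """Append an entry together with its hash over the previous record's hash."""
    previous = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    chain.append({"entry": entry, "entry_hash": entry_hash(entry, previous)})


def first_invalid_index(chain: list) -> int:
    """Return the index of the first record that fails verification, or -1."""
    previous = GENESIS_HASH
    for index, record in enumerate(chain):
        if record["entry_hash"] != entry_hash(record["entry"], previous):
            return index
        previous = record["entry_hash"]
    return -1
```

Note that removing records from the end of the chain leaves the remaining records internally consistent, so this check alone does not detect truncation. That is one reason the externally stored anchor hashes matter.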
Log levels vs audit events -- they are different concerns: Application log levels (DEBUG, INFO, WARN, ERROR) serve operational troubleshooting. Audit events record security-relevant business events regardless of severity. A successful login is not an "error" or a "warning" -- it is a security event that must be recorded. Implement audit logging as a separate subsystem with its own transport, storage, and retention. Do not rely on application log level configuration to control audit event emission -- changing the log level to WARN to reduce noise must never suppress security audit events.
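A minimal sketch of that separation using Python's standard `logging` module, assuming a dedicated logger named `audit` (a hypothetical name) with its own handler. Because the audit logger has an explicit level and does not propagate, raising the application's root log level does not suppress audit events.

```python
import json
import logging


def make_audit_logger(stream) -> logging.Logger:
    """Create a dedicated audit logger that writes JSON lines to its own stream."""
    audit = logging.getLogger("audit")  # hypothetical logger name
    audit.setLevel(logging.INFO)  # explicit level, independent of the root logger
    audit.propagate = False  # keep audit events out of application log handlers
    audit.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    audit.addHandler(handler)
    return audit


def emit_audit(audit: logging.Logger, event: dict) -> None:
    """Write one audit event as a single JSON line."""
    audit.info(json.dumps(event, sort_keys=True))
```

In a real system the stream would be replaced by a transport to the independent log aggregator. A process-wide `logging.disable()` call would still silence this logger, so a fully independent audit writer that bypasses `logging` is a reasonable alternative.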
No logging of failed authentication attempts. Failed logins are the single most important signal for detecting brute force attacks, credential stuffing campaigns, and account takeover attempts. An application that only logs successful logins is blind to the 10,000 failed attempts that preceded the one successful compromise.
Logging everything at DEBUG level in production. Produces such volume that security events are buried in noise. Debug logging also risks exposing sensitive data -- request bodies containing passwords, internal state containing encryption keys, SQL queries containing user data. Use structured audit events for security concerns, not verbose debug output. Keep DEBUG logging disabled in production or restricted to specific components during active troubleshooting.
Logs stored only on the application server. If the server is compromised, the attacker deletes the local logs and the evidence is gone. Always forward logs to an independent aggregator in real time. The local log file on the application server should be treated as a buffer, not as the permanent record.
No correlation IDs across services. In a distributed system, a single user action may span multiple services. Without a request or correlation ID propagated through all service calls and included in every audit event, reconstructing an attack timeline requires manual timestamp correlation across dozens of log sources -- slow, error-prone, and often impossible when clocks are skewed. Propagate a correlation ID (via HTTP header, message metadata, or distributed tracing context) and include it in every audit event.
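Propagation can be sketched with the standard `contextvars` module. The header name `X-Correlation-ID` and the three helper functions are assumptions for illustration; frameworks and tracing libraries typically provide their own equivalents.

```python
import contextvars
import uuid

# Assumed header name; use whatever your services have standardized on.
CORRELATION_HEADER = "X-Correlation-ID"

_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def begin_request(headers: dict) -> str:
    """Adopt the caller's correlation ID, or mint one at the system edge."""
    cid = headers.get(CORRELATION_HEADER) or str(uuid.uuid4())
    _correlation_id.set(cid)
    return cid


def outbound_headers() -> dict:
    """Headers to attach to downstream service calls made during this request."""
    cid = _correlation_id.get()
    return {CORRELATION_HEADER: cid} if cid else {}


def with_correlation(event: dict) -> dict:
    """Return a copy of the audit event that carries the current correlation ID."""
    return {**event, "correlation_id": _correlation_id.get()}
```

The sketch does an exact-case dictionary lookup and trusts the inbound value; production code would normalize header case and validate the ID format before logging it.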
Audit logging as an afterthought, added post-launch. The most vulnerable period for an application is immediately after launch -- configuration is fresh, security hardening is incomplete, and the team is still learning the system's behavior. Adding audit logging weeks or months after launch means this critical period has no audit trail. Design and implement audit logging alongside the feature it monitors, shipping them together.
Logging PII in violation of privacy regulations. Audit logs that contain full names, email addresses, phone numbers, IP addresses (which can be personal data under GDPR), or health information create a secondary data protection liability. The audit log itself becomes a sensitive data store requiring access controls, encryption, and data subject access request handling. Log the minimum identifiers needed (user IDs, resource IDs) and join with the primary data store only when investigation requires it.
Alert fatigue from poorly tuned thresholds. Alerting on every failed login produces thousands of alerts per day, all ignored. Tune alert thresholds to reduce false positives: alert on 10+ failed logins from a single IP in 5 minutes, not on individual failures. Use progressive escalation: first alert goes to a Slack channel, repeated alerts page the on-call engineer. Review and adjust thresholds monthly based on alert-to-incident ratio.
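The threshold rule above (10 or more failed logins from a single IP in 5 minutes) can be sketched as a sliding window per source IP. The class name `FailedLoginMonitor` is hypothetical, and the sketch assumes timestamps arrive in non-decreasing order for each IP.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Sliding-window counter of failed logins per source IP (illustrative)."""

    def __init__(self, threshold: int = 10, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # source_ip -> recent failure times

    def record_failure(self, source_ip: str, timestamp: float) -> bool:
        """Record one failure; return True when the IP is at or over the threshold."""
        window = self._failures[source_ip]
        window.append(timestamp)
        # Drop failures that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold
```

This returns True on every failure once the threshold is reached, so the caller still needs deduplication and the progressive escalation described above before paging anyone.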