From cx-skills
OpenTelemetry Collector deployment, configuration, and troubleshooting for Coralogix users. Use when writing or debugging collector configs: the `coralogix` exporter (`domain:` vs `endpoint:`, `${env:CORALOGIX_PRIVATE_KEY}` bracket syntax, `coralogix/resource_catalog` variant), the universal processor chain, agent → cluster-collector → gateway topology, spanmetrics/tail_sampling/k8sattributes placement, component stability checks, data-safety/redaction routing including `url_sanitizer`, `sanitize_span_name`, `allow_all_keys`, and before/after validation for broad sanitizer over-sanitization, spanmetrics cardinality protection, and Coralogix-specific presets. Covers the `otel-integration` Helm chart (EKS/GKE/AKS/OpenShift, GKE Autopilot Warden, EKS Fargate), ECS EC2 daemonset, ECS Fargate sidecar, Linux/Windows/macOS standalone, Docker, and the universal installer. Not for OTTL authoring (use the `opentelemetry-ottl` skill) or OpAMP supervisor/Fleet Manager internals beyond the config-precedence callout.
`npx claudepluginhub coralogix/cx-skills --plugin coralogix`

Slash command: `/cx-skills:opentelemetry-collector`
The Coralogix-flavored OpenTelemetry Collector: the `coralogix` exporter, the
`otel-integration` Helm chart for Kubernetes, CDOT for ECS, and the universal installer
for standalone hosts. Load this skill when a user is deploying, configuring, or
debugging a collector that ships to Coralogix. Most failures come down to a handful of
Coralogix-specific defaults that vanilla OpenTelemetry docs don't cover.
| Use case | Reference |
|---|---|
| Configure the `coralogix` exporter (domain, private key, app/subsystem) | config-exporters.md · config-processors.md |
| Infrastructure Explorer / Resource Catalog | preset-kubernetes.md |
| Pick a deployment mode | setup-index.md |
| Kubernetes: `otel-integration` Helm chart (EKS/GKE/AKS/OpenShift/Autopilot/EKS Fargate) | setup-kubernetes.md |
| OpenTelemetry Operator / Target Allocator | setup-opentelemetry-operator.md |
| ECS EC2 (Linux daemonset) | setup-ecs-ec2.md |
| ECS Fargate (sidecar) | setup-ecs-fargate.md |
| Linux / macOS standalone | setup-linux-standalone.md |
| Windows standalone | setup-windows-standalone.md |
| Universal installer (all OS) | setup-installer.md |
| `spanmetrics`, `tail_sampling`, `k8sattributes` placement | config-connectors.md |
| Span Metrics DB labels differ between `calls_total` and `db_calls_total` | config-connectors.md: place DB label compatibility transforms under top-level `spanMetrics.transformStatements` |
| Cardinality, URL/span-name sanitization, and PII redaction routing | data-safety-cardinality.md |
| Collector component maturity, alpha/beta/stable/deprecated guidance | component-stability.md |
| Memory: `memory_limiter` firing, RSS vs Go heap | ops-memory-performance.md |
| Troubleshoot "no data", "no traces", "Resource Catalog empty" | ops-troubleshooting.md |
| OpAMP supervisor / Fleet Manager config overlap | preset-fleet-management.md |
For these recurring cases, include the exact corrective detail in the final answer. These are also the authoritative statements of Coralogix-specific defaults — do not contradict them elsewhere in the answer.
- `domain:` is a bare hostname, not a URL: `eu2.coralogix.com`, not `https://ingress.eu2.coralogix.com` and not a UI hostname.
- `${env:CORALOGIX_PRIVATE_KEY}`, not `$CORALOGIX_PRIVATE_KEY`; the unbracketed form silently fails in v0.76+. Minimum exporter block: `domain: "<region>.coralogix.com"` and `private_key: "${env:CORALOGIX_PRIVATE_KEY}"`.
- `coralogix/resource_catalog` exporter for Infrastructure Explorer with the `x-coralogix-ingress: metadata-as-otlp-logs/v1` header. The default `coralogix` exporter won't light up the entity views.
- `resourcedetection/resource_catalog` crash on daemonset: the error `can't get K8s Instance Metadata; node name is empty` means this processor is on a daemonset agent. It belongs on the `opentelemetry-cluster-collector` Deployment only. Fix: remove it from the daemonset config. Do not conflate it with the `coralogix/resource_catalog` exporter, which is a separate component.
- `alpha` is for limited non-critical use, `beta` is broader but can still break, `stable` is the production default, and `deprecated` means avoid new deployments and plan migration. A config that validates is not enough evidence that the component is safe for production.
- `memory_limiter` first, `batch` last.
- `k8sattributes` extraction: typically gateway; agents use `passthrough: true`.
- `spanmetrics` on agent (before sampling), `tail_sampling` on gateway. Run `transactions`/`groupbytrace`/`transactions` before `spanmetrics`; never on both agent and gateway simultaneously, because that causes double-counting: each tier sees all spans and emits separate metric series that accumulate. `tail_sampling` on a daemonset agent causes incomplete traces because each agent only sees spans from its own node, so a single trace is split across agents and the sampler decides on partial data. Fix: move `tail_sampling` to a central gateway tier and add a `loadbalancing` exporter on the agents that routes spans to the gateway by `trace_id`, so all spans for a trace reach the same gateway replica.
- Place DB label compatibility transforms under top-level `spanMetrics.transformStatements`. Do not put them only under `spanMetrics.dbMetrics.transformStatements`; that can populate `db_calls_total` while leaving normal `calls_total` with blank `db_namespace`.
- Do not override `service.pipelines` wholesale; use the `extraProcessors`/`extraReceivers` hooks. Wholesale overrides silently break resource/metadata (`cx.agent.type`) and chart upgrades.
- [...] `trace_id` on spans/logs instead, or normalize them before metrics are generated.
- `url.full`, `k8s.pod.name`, and `k8s.pod.ip` are dangerous Span Metrics dimensions. Prefer `http.route`, low-cardinality host/operation labels, and stable service/resource labels. If a customer insists on `url.full`, sanitize it before `spanmetrics` consumes the span and make the risk explicit.
- `aggregation_cardinality_limit` is a guardrail, not a fix. For Helm `spanMetrics.aggregationCardinalityLimit` / collector `aggregation_cardinality_limit`, use it to collapse overflow series, but still remove or normalize high-cardinality labels.
- [...] `spanmetrics`, `batch`, and `coralogix`. For broad URL-like span names or URL attributes, explicitly recommend `redactionprocessor` with the literal keys `url_sanitizer` and `sanitize_span_name`, and include `allow_all_keys: true` unless intentionally using an explicit `allowed_keys` whitelist; otherwise unspecified attributes are dropped. Warn that broad sanitizers can over-sanitize, and validate before/after examples. Use the `opentelemetry-ottl` skill for targeted transforms such as `SHA256`, `replace_pattern`, `replace_all_patterns`, `delete_key`, and nil-safe guards.
- `logsCollection.storeCheckpoints: false`; disable `coralogix-ebpf-profiler`, `hostMetrics`, `hostEntityEvents`, `resourceDetection` on agent; disable `resourceDetection` on cluster-collector. Use `gke-autopilot-values.yaml`.
- [...] `localhost`); daemonset needs `networkMode: host`. Remove `ecs` from `resourcedetection.detectors`, because it stamps the collector's own container ID.
- `healthCheck` + `dependsOn: [{containerName: otel-collector, condition: HEALTHY}]` on the app. Use the CDOT image.
- [...] `otel-installer/one-liner` with both `CORALOGIX_PRIVATE_KEY` and `CORALOGIX_DOMAIN`.

- `kubernetesResources` and `hostEntityEvents` are enabled by default in the chart; do not disable them. `kubernetesResources` must stay on the `opentelemetry-cluster-collector` only (enabling it on the agent crashes with `can't get K8s Instance Metadata; node name is empty`). Use a dedicated `coralogix/resource_catalog` exporter with `x-coralogix-ingress: metadata-as-otlp-logs/v1`.
- `resourcedetection/resource_catalog` belongs on the `opentelemetry-cluster-collector` Deployment only; remove it from daemonset agents.
- `k8sattributes`: Exactly one role should do full extraction; set `passthrough: true` on the others.
- [...] `domain:` and needs the full URL, e.g. `https://ingress.eu2.coralogix.com/opamp/v1`.
- `extensions: [opamp]` fails on old image pins: The K8s Windows sub-preset defaults to `coralogixrepo/opentelemetry-collector-contrib-windows:0.92.0`, which predates OpAMP on Windows, so enabling `extensions: [opamp]` there causes the collector to refuse to start. Fix: bump the image to ≥ v0.130. When bumping the image is not an option (e.g. locked in a production freeze), use the `-Supervisor` wrapper instead; this runs `opampsupervisor` as a separate Windows Service and works regardless of collector version.
- [...] `F` (full/final), so the standard P→F recombine never triggers. Use `firstEntryRegex` on the filelog recombine operator to detect new entries by timestamp pattern.
- `spanNameReplacePattern` escaping: There are two layers: single-quote or block scalar for YAML/OTTL backslashes, and write backreferences as `$$1`/`$$2` because the collector envprovider expands `$...`; verify with `helm template`.
- [...] `svc/coralogix-opentelemetry-targetallocator` on 8080; inspect `/jobs` and `/scrape_configs`; then check RBAC, selectors, and watched namespaces.

Work through these steps in order before touching any pipeline configuration:
Step 1 — Prove the collector is running and exporting
```bash
# Kubernetes: check exporter metrics
kubectl exec -n <namespace> <collector-pod> -- wget -qO- http://localhost:8888/metrics \
  | grep -E 'otelcol_exporter_(sent|send_failed|enqueue_failed|queue)'
# Success: otelcol_exporter_sent_* > 0 and climbing
# Failure indicator: otelcol_exporter_send_failed_* > 0; proceed to Step 2
```
Step 2 — Verify DNS and TLS reach the ingestion endpoint
```bash
# From inside the collector pod / host
nslookup ingress.<domain>          # e.g. ingress.coralogix.com
curl -v https://ingress.<domain>   # expect 400/401, NOT a TLS or connection error
```
If DNS fails → network/VPC/proxy issue, not a collector config issue.
If TLS fails → certificate bundle or proxy MITM — check NO_PROXY / HTTPS_PROXY env vars.
Step 3 — Confirm the private key is expanded correctly
```bash
# Kubernetes: inspect the live env
kubectl exec -n <namespace> <collector-pod> -- env | grep CORALOGIX
# The key must appear as a 36-char UUID-like string, not the literal "${env:...}" text
# Literal text → bracket syntax wrong, or Secret not mounted
```
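If the variable is missing from the pod environment, the usual cause is that the Secret is not wired into the container spec. A minimal sketch of that wiring follows, assuming the key is stored in a Kubernetes Secret. The Secret name `coralogix-keys` and the key `PRIVATE_KEY` are assumptions for illustration, so substitute whatever your deployment actually uses.

```yaml
# Container spec fragment for the collector pod (illustrative sketch).
env:
  - name: CORALOGIX_PRIVATE_KEY    # name the exporter reads via ${env:CORALOGIX_PRIVATE_KEY}
    valueFrom:
      secretKeyRef:
        name: coralogix-keys       # assumed Secret name
        key: PRIVATE_KEY           # assumed key inside the Secret
```

With this in place, the `env | grep CORALOGIX` check above should print the resolved key rather than nothing.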
Step 4 — Check the exporter domain: value
In the running config (/etc/otelcol-contrib/config.yaml or kubectl get cm), verify:
- `domain:` is a bare hostname such as `eu2.coralogix.com`: no `https://` prefix, no UI hostname (`app.coralogix.com` is wrong)
- `private_key:` resolved to the actual key (Step 3)

Step 5 — Enable debug logging for one minute
```yaml
service:
  telemetry:
    logs:
      level: debug
```
Look for `Exporting failed` or gRPC status lines. A `StatusUnauthenticated` confirms a key/region mismatch. A `context deadline exceeded` suggests egress/proxy or ingress-side latency; also check `coralogix.timeout` (default 5s; increase to 30s).
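For reference, an exporter block that reflects the checks in Steps 3 to 5 might look like the sketch below. The region `eu2` and the `application_name` / `subsystem_name` values are placeholders, not values taken from any real deployment.

```yaml
exporters:
  coralogix:
    domain: "eu2.coralogix.com"                   # bare hostname: no scheme, no ingress. prefix
    private_key: "${env:CORALOGIX_PRIVATE_KEY}"   # bracketed env syntax
    application_name: "my-app"                    # placeholder
    subsystem_name: "my-subsystem"                # placeholder
    timeout: 30s                                  # raised from the 5s default for slow egress paths
```

Revert the debug log level once the failure is identified; the exporter block itself can stay as is.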
Step 6 — Inspect pipeline wiring only after Steps 1–5 pass
If export is healthy but data is missing in the Coralogix UI: check receiver connectivity, processor filters (filter processor dropping everything), and that the pipeline is wired in service.pipelines. Full symptom → root-cause table: references/ops-troubleshooting.md.
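As a point of comparison when checking the wiring, here is a minimal sketch of a traces pipeline in which every component is both defined and referenced under `service.pipelines`, with `memory_limiter` first and `batch` last. The OTLP receiver, the limiter percentages, and the application/subsystem names are illustrative assumptions.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80          # illustrative
    spike_limit_percentage: 25    # illustrative
  batch: {}

exporters:
  coralogix:
    domain: "<region>.coralogix.com"
    private_key: "${env:CORALOGIX_PRIVATE_KEY}"
    application_name: "my-app"        # placeholder
    subsystem_name: "my-subsystem"    # placeholder

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]   # memory_limiter first, batch last
      exporters: [coralogix]
```

A component that is defined at the top level but missing from `service.pipelines` is never instantiated, which is a common cause of "healthy export, missing data".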
`otel-integration`

Use references/setup-kubernetes.md for install flow, per-platform variants, Target Allocator, and chart-specific failure modes. Use references/preset-kubernetes.md when the question is about Helm presets or Infrastructure Explorer.
Use references/setup-ecs-fargate.md. The fragile pieces are
sidecar health checks, dependsOn: HEALTHY, essential flags, and keeping the ecs
detector enabled only for sidecar mode.
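A sketch of the detector setting for sidecar mode, assuming the standard `resourcedetection` processor; the `env` detector, timeout, and `override` value are illustrative choices rather than values confirmed by the reference.

```yaml
processors:
  resourcedetection:
    # Sidecar mode: the collector shares the task with the app, so the
    # ecs detector describes the right task. On an ECS EC2 daemonset,
    # drop `ecs` from this list.
    detectors: [env, ecs]
    timeout: 2s        # illustrative
    override: false    # keep attributes already set by the SDK
```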
`memory_limiter` firing constantly

Use references/ops-memory-performance.md. Compare
Go heap metrics to RSS before changing pod limits or `memory_limiter` settings.
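If that comparison does point at the limiter itself, these are the settings involved. The sketch assumes a container memory limit of about 2 GiB; all numbers are illustrative.

```yaml
processors:
  memory_limiter:
    check_interval: 1s      # how often memory usage is checked
    limit_mib: 1600         # hard limit, kept below the assumed 2 GiB container limit
    spike_limit_mib: 400    # soft limit = limit_mib - spike_limit_mib = 1200 MiB
```

Leaving headroom between `limit_mib` and the container limit gives the limiter a chance to refuse data before the kernel OOM-kills the process.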
Use references/data-safety-cardinality.md. Keep the
answer layered: prevent bad labels at instrumentation, normalize/sanitize before
spanmetrics, and only then discuss collector/backend cardinality limits.
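The middle layer (sanitize before `spanmetrics`) can be sketched as the fragment below. It assumes an `otlp` receiver, a `coralogix` exporter, and a `memory_limiter` processor are defined elsewhere in the config. The exact nesting of `url_sanitizer` options and the attribute list are assumptions, so verify them against the redaction processor README for your collector version.

```yaml
processors:
  redaction:
    allow_all_keys: true            # without this, attributes not in allowed_keys are dropped
    url_sanitizer:
      enabled: true                 # assumed option name
      attributes: ["url.full"]      # assumed attribute list
      sanitize_span_name: true

connectors:
  spanmetrics:
    dimensions:
      - name: http.route            # low-cardinality alternative to url.full
    aggregation_cardinality_limit: 1000   # illustrative guardrail, not a fix

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, redaction, batch]   # redaction runs before the connector sees spans
      exporters: [spanmetrics, coralogix]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      processors: [memory_limiter, batch]
      exporters: [coralogix]
```

Because `spanmetrics` is listed as an exporter of the traces pipeline, it only ever sees spans that have already passed through `redaction`. Compare span names and attributes before and after enabling the sanitizer to catch over-sanitization.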
- OTTL authoring: use the `opentelemetry-ottl` skill.
- OpAMP supervisor / Fleet Manager internals: `preset-fleet-management.md` covers only the collector-config overlap (endpoint shape, values-vs-UI precedence, Windows image pitfall); deep supervisor/CDOT work is out of scope.
- [...] `opentelemetry-instrumentation` skill.

Upstream links: