mirror of https://github.com/XRPLF/rippled.git (synced 2026-06-02 16:26:48 +00:00)
fix(telemetry): make Loki ingestion and filelog parsing work end-to-end
Three interrelated fixes in otel-collector-config.yaml; without them the
Phase 8 log-trace correlation pipeline is silently broken.
1. `resource/logs` processor now upserts `job: xrpld` alongside
`service.name: xrpld`. Loki 3.x OTLP ingestion renames
`service.name` to the label `service_name`, so the runbook and
integration-test queries (`{job="xrpld"} |= "trace_id="`), which
select on `job`, returned no results. Upserting the `job` resource
attribute at the collector lets the canonical Loki label flow
through unchanged.
2. `filelog` regex now wraps the `partition:` prefix in an optional
non-capturing group. `Logs::format()` omits the `partition:` prefix
when the partition is empty (common for framework-level log lines);
the old regex required it, so those records did not match and were
silently dropped.
3. Timestamp parser now matches the real log format. `Logs::format()`
writes microsecond-precision timestamps like
`2026-Apr-15 10:30:45.123456 UTC`. The layout was
`%Y-%b-%d %H:%M:%S`, with no fractional seconds and no timezone,
so strptime failed and the timestamps were dropped. New layout is
`%Y-%b-%d %H:%M:%S.%f` with `location: UTC`.
Also adds a block-comment documenting the real log format so the
next person to touch this doesn't re-introduce the same gaps.
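
To make fix 2 concrete, the sketch below runs the old and new patterns from this commit against two sample lines. The sample log lines and the `parse` helper are hypothetical, written only to follow the documented `Logs::format()` shape. Python's `re` module stands in for the collector's regex engine here, so treat this as an illustration of the pattern logic rather than a test of the collector itself.

```python
import re

# Patterns copied verbatim from the filelog regex_parser (old and new).
OLD = re.compile(
    r'^(?P<timestamp>\S+\s+\S+)\s+\S+\s+(?P<partition>\S+):(?P<severity>\S+)\s+'
    r'(?:trace_id=(?P<trace_id>[a-f0-9]+)\s+span_id=(?P<span_id>[a-f0-9]+)\s+)?'
    r'(?P<message>.*)$'
)
NEW = re.compile(
    r'^(?P<timestamp>\S+\s+\S+)\s+\S+\s+(?:(?P<partition>\S+):)?(?P<severity>\S+)\s+'
    r'(?:trace_id=(?P<trace_id>[a-f0-9]+)\s+span_id=(?P<span_id>[a-f0-9]+)\s+)?'
    r'(?P<message>.*)$'
)

# Hypothetical sample lines; real xrpld output may differ in detail.
WITH_PARTITION = (
    "2026-Apr-15 10:30:45.123456 UTC LedgerMaster:NFO "
    "trace_id=0af7651916cd43dd8448eb211c80319c span_id=b7ad6b7169203331 "
    "Ledger accepted"
)
WITHOUT_PARTITION = "2026-Apr-15 10:30:45.123456 UTC NFO Server started"


def parse(pattern, line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = pattern.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name, pattern in (("old", OLD), ("new", NEW)):
        for line in (WITH_PARTITION, WITHOUT_PARTITION):
            print(name, parse(pattern, line))
```

With these samples, the old pattern rejects the line that has no partition, while the new pattern accepts both and leaves `partition` unset for the second line.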
@@ -38,11 +38,17 @@ receivers:
   filelog:
     include: [/var/log/rippled/*/debug.log]
     operators:
+      # Log format emitted by Logs::format() is:
+      #   YYYY-Mmm-DD HH:MM:SS.ffffff UTC <partition>:<severity> [trace_id=... span_id=...] <message>
+      # The `partition:` prefix is omitted when partition is empty, so the
+      # capture group is non-capturing optional. Fractional seconds up to 6
+      # digits are parsed via the `%f` strptime directive.
       - type: regex_parser
-        regex: '^(?P<timestamp>\S+\s+\S+)\s+\S+\s+(?P<partition>\S+):(?P<severity>\S+)\s+(?:trace_id=(?P<trace_id>[a-f0-9]+)\s+span_id=(?P<span_id>[a-f0-9]+)\s+)?(?P<message>.*)$'
+        regex: '^(?P<timestamp>\S+\s+\S+)\s+\S+\s+(?:(?P<partition>\S+):)?(?P<severity>\S+)\s+(?:trace_id=(?P<trace_id>[a-f0-9]+)\s+span_id=(?P<span_id>[a-f0-9]+)\s+)?(?P<message>.*)$'
         timestamp:
           parse_from: attributes.timestamp
-          layout: "%Y-%b-%d %H:%M:%S"
+          layout: "%Y-%b-%d %H:%M:%S.%f"
+          location: UTC
 
 processors:
   batch:
@@ -53,6 +59,15 @@ processors:
       - key: service.name
         value: xrpld
         action: upsert
+      # Loki 3.x OTLP ingestion converts `service.name` to the label
+      # `service_name`. The runbook and integration-test queries use the
+      # canonical Loki label `job` so operators can paste `{job="xrpld"}`
+      # without guessing the otel-to-loki naming convention. Upsert the
+      # `job` resource attribute here so it round-trips through OTLP
+      # into Loki as the `job` label.
+      - key: job
+        value: xrpld
+        action: upsert
 
 connectors:
   spanmetrics:
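
The layout change in fix 3 can be sanity-checked outside the collector. The sketch below uses Python's `datetime.strptime` as a stand-in for the collector's strptime implementation, which is an assumption: the two may differ in edge cases. The `parse_utc` helper and the sample string are hypothetical. The sample omits the trailing `UTC` token because the regex's `timestamp` group captures only the date and time fields.

```python
from datetime import datetime, timezone

OLD_LAYOUT = "%Y-%b-%d %H:%M:%S"      # layout before this commit
NEW_LAYOUT = "%Y-%b-%d %H:%M:%S.%f"   # layout after this commit

# Hypothetical captured value of the `timestamp` group.
SAMPLE = "2026-Apr-15 10:30:45.123456"


def parse_utc(text, layout):
    """Parse text with layout and tag it as UTC; return None on mismatch.

    Tagging with UTC mirrors the intent of `location: UTC` in the config.
    %b depends on the process locale; the default C locale accepts "Apr".
    """
    try:
        return datetime.strptime(text, layout).replace(tzinfo=timezone.utc)
    except ValueError:
        return None


if __name__ == "__main__":
    print("old layout:", parse_utc(SAMPLE, OLD_LAYOUT))
    print("new layout:", parse_utc(SAMPLE, NEW_LAYOUT))
```

Under these assumptions the old layout fails on the fractional seconds, while the new one yields a timezone-aware value with microsecond precision.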