fix: address CI rename checks (rippled -> xrpld) in phase-8 docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pratik Mankawde
2026-04-29 20:09:43 +01:00
parent 81b47afde7
commit 7ab6f4d34b
5 changed files with 35 additions and 35 deletions


@@ -391,14 +391,14 @@ The `StatsDMeterImpl` in `StatsDCollector.cpp:706` sends metrics with `|m` suffi
### Motivation
rippled's `beast::Journal` logs and OpenTelemetry traces are currently two disjoint observability signals. When investigating an issue, operators must manually correlate timestamps between log files and Jaeger/Tempo traces. Phase 8 bridges this gap by injecting trace context (`trace_id`, `span_id`) into every log line emitted within an active span, and ingesting those logs into Grafana Loki via the OTel Collector's filelog receiver.
xrpld's `beast::Journal` logs and OpenTelemetry traces are currently two disjoint observability signals. When investigating an issue, operators must manually correlate timestamps between log files and Jaeger/Tempo traces. Phase 8 bridges this gap by injecting trace context (`trace_id`, `span_id`) into every log line emitted within an active span, and ingesting those logs into Grafana Loki via the OTel Collector's filelog receiver.
#### Gains
1. **One-click trace-to-log navigation**: Click a trace in Tempo/Jaeger and immediately see the corresponding log lines in Loki, filtered by `trace_id`.
2. **Reverse lookup (log-to-trace)**: Loki derived fields make `trace_id` values clickable links back to Tempo.
3. **Unified observability**: All three pillars (traces, metrics, logs) flow through the same OTel Collector pipeline and are visible in a single Grafana instance.
4. **Zero new dependencies in rippled**: Uses existing OTel SDK headers (`GetSpan`, `GetContext`) already linked in Phase 1.
4. **Zero new dependencies in xrpld**: Uses existing OTel SDK headers (`GetSpan`, `GetContext`) already linked in Phase 1.
5. **Negligible overhead**: `GetSpan()` + `GetContext()` are thread-local reads (<10ns/call). At ~1000 JLOG calls/min, this adds <10us/min.
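The per-minute figure follows directly from the per-call bound, taking the stated ~1000 calls/min and 10 ns/call as given:

$$
1000\ \tfrac{\text{calls}}{\text{min}} \times 10\ \tfrac{\text{ns}}{\text{call}} = 10^{4}\ \tfrac{\text{ns}}{\text{min}} = 10\ \tfrac{\mu\text{s}}{\text{min}}
$$

Because 10 ns is an upper bound per call, 10 µs/min is likewise an upper bound at that log rate.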
#### Losses / Risks
@@ -416,13 +416,13 @@ The correlation value far outweighs the risks. The log format change is backward
Phase 8 has two independent sub-phases that can be developed in parallel:
- **Phase 8a (code change)**: Modify `Logs::format()` in `src/libxrpl/basics/Log.cpp` to append `trace_id=<hex32> span_id=<hex16>` when the current thread has an active OTel span. Guarded by `#ifdef XRPL_ENABLE_TELEMETRY`.
- **Phase 8b (infra only)**: Add Loki to the Docker Compose stack, configure the OTel Collector's `filelog` receiver to tail rippled's log file, parse out structured fields (timestamp, partition, severity, trace_id, span_id, message), and export to Loki via OTLP. Configure Grafana Tempo↔Loki bidirectional linking.
- **Phase 8b (infra only)**: Add Loki to the Docker Compose stack, configure the OTel Collector's `filelog` receiver to tail xrpld's log file, parse out structured fields (timestamp, partition, severity, trace_id, span_id, message), and export to Loki via OTLP. Configure Grafana Tempo↔Loki bidirectional linking.
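
As a rough model of the Phase 8a behavior, the Python sketch below adds `trace_id=<hex32> span_id=<hex16>` to a log line only when a span is active. It is not the C++ `Logs::format()` code: `SpanContext` and `format_line` are hypothetical names, and the `<timestamp> <partition>:<severity> <message>` layout is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical stand-in for an OTel span context (not the SDK type)."""
    trace_id: int  # 128-bit identifier
    span_id: int   # 64-bit identifier

    def is_valid(self) -> bool:
        # All-zero IDs mean there is no active span.
        return self.trace_id != 0 and self.span_id != 0


def format_line(timestamp: str, partition: str, severity: str,
                message: str, ctx: Optional[SpanContext] = None) -> str:
    """Build a log line, adding trace context only when a span is active."""
    parts = [timestamp, f"{partition}:{severity}"]
    if ctx is not None and ctx.is_valid():
        # Zero-padded lowercase hex: 32 chars for trace_id, 16 for span_id.
        parts.append(f"trace_id={ctx.trace_id:032x} span_id={ctx.span_id:016x}")
    parts.append(message)
    return " ".join(parts)


# Illustrative values only.
traced = format_line("2024-01-15T10:30:45.123Z", "LedgerMaster", "NFO",
                     "Validated ledger 42",
                     SpanContext(0xabc123def456789012345678abcdef01,
                                 0x0123456789abcdef))
untraced = format_line("2024-01-15T10:30:45.123Z", "LedgerMaster", "NFO",
                       "Validated ledger 42")
```

The untraced line carries no `trace_id=` or `span_id=` tokens at all, rather than empty placeholders, which keeps lines emitted outside a span identical to the pre-Phase-8 format.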
#### Trace ID Injection Flow
```mermaid
flowchart LR
subgraph rippled["rippled process"]
subgraph xrpld["xrpld process"]
JLOG["JLOG(j.info())"]
Format["Logs::format()"]
OTelCtx["OTel Context<br/>(thread-local)"]
@@ -436,7 +436,7 @@ flowchart LR
Format --> LogLine
style rippled fill:#1a237e,stroke:#0d1642,color:#fff
style xrpld fill:#1a237e,stroke:#0d1642,color:#fff
style output fill:#1b5e20,stroke:#0d3d14,color:#fff
style JLOG fill:#283593,stroke:#1a237e,color:#fff
style Format fill:#283593,stroke:#1a237e,color:#fff
@@ -456,7 +456,7 @@ flowchart LR
FR --> RP --> BP --> LE
end
LogFile["rippled<br/>debug.log"] --> FR
LogFile["xrpld<br/>debug.log"] --> FR
LE --> Loki["Grafana Loki<br/>:3100"]
Loki <-->|"derivedFields ↔<br/>tracesToLogs"| Tempo["Grafana Tempo"]
@@ -487,7 +487,7 @@ flowchart LR
- [ ] Log lines within active spans contain `trace_id=<hex> span_id=<hex>`
- [ ] Log lines outside spans have no trace context (no empty fields)
- [ ] Loki ingests rippled logs via OTel Collector filelog receiver
- [ ] Loki ingests xrpld logs via OTel Collector filelog receiver
- [ ] Grafana Tempo -> Loki one-click correlation works
- [ ] Grafana Loki -> Tempo reverse lookup works via derived field
- [ ] Integration test verifies trace_id presence in logs


@@ -495,7 +495,7 @@ xrpld_State_Accounting_Full_duration
> **Plan details**: [06-implementation-phases.md §6.8.1](./06-implementation-phases.md) — motivation, architecture, Mermaid diagrams
> **Task breakdown**: [Phase8_taskList.md](./Phase8_taskList.md) — per-task implementation details
Phase 8 injects OTel trace context into rippled's `Logs::format()` output, enabling log-trace correlation. When a log line is emitted within an active OTel span, the trace and span identifiers are automatically appended after the severity field:
Phase 8 injects OTel trace context into xrpld's `Logs::format()` output, enabling log-trace correlation. When a log line is emitted within an active OTel span, the trace and span identifiers are automatically appended after the severity field:
### Log Format
@@ -520,7 +520,7 @@ The trace context injection is implemented in `Logs::format()` (`src/libxrpl/bas
### Log Ingestion Pipeline
```
rippled debug.log -> OTel Collector filelog receiver -> regex_parser -> Loki exporter -> Grafana Loki
xrpld debug.log -> OTel Collector filelog receiver -> regex_parser -> Loki exporter -> Grafana Loki
```
The OTel Collector's `filelog` receiver tails `debug.log` files and uses a `regex_parser` operator to extract structured fields:
@@ -549,16 +549,16 @@ Grafana Loki (v2.9.0) serves as the log storage backend. It receives log entries
```logql
# Find all logs for a specific trace
{job="rippled"} |= "trace_id=abc123def456789012345678abcdef01"
{job="xrpld"} |= "trace_id=abc123def456789012345678abcdef01"
# Error logs with trace context
{job="rippled"} |= "ERR" |= "trace_id="
{job="xrpld"} |= "ERR" |= "trace_id="
# Logs from a specific partition with trace context
{job="rippled"} |= "LedgerMaster" | regexp `trace_id=(?P<trace_id>[a-f0-9]+)` | trace_id != ""
{job="xrpld"} |= "LedgerMaster" | regexp `trace_id=(?P<trace_id>[a-f0-9]+)` | trace_id != ""
# Count traced log lines over time
count_over_time({job="rippled"} |= "trace_id=" [5m])
count_over_time({job="xrpld"} |= "trace_id=" [5m])
```
---


@@ -1,6 +1,6 @@
# Phase 8: Log-Trace Correlation and Centralized Log Ingestion — Task List
> **Goal**: Inject trace context (trace_id, span_id) into rippled's Journal log output for log-trace correlation, and add OTel Collector filelog receiver to ingest logs into Grafana Loki for unified observability.
> **Goal**: Inject trace context (trace_id, span_id) into xrpld's Journal log output for log-trace correlation, and add OTel Collector filelog receiver to ingest logs into Grafana Loki for unified observability.
>
> **Scope**: Two independent sub-phases — 8a (code change: trace_id in logs) and 8b (infra only: filelog receiver to Loki). No changes to the `beast::Journal` public API.
>
@@ -89,7 +89,7 @@
## Task 8.3: Add Filelog Receiver to OTel Collector
**Objective**: Configure the OTel Collector to tail rippled's log file and export to Loki.
**Objective**: Configure the OTel Collector to tail xrpld's log file and export to Loki.
**What to do**:
@@ -124,7 +124,7 @@
insecure: true
```
- Mount rippled's log directory into the collector container via docker-compose volume
- Mount xrpld's log directory into the collector container via docker-compose volume
**Key modified files**:
@@ -172,7 +172,7 @@
**What to do**:
- Edit `docker/telemetry/integration-test.sh`:
- After sending RPC requests (which create spans), grep rippled's log output for `trace_id=`
- After sending RPC requests (which create spans), grep xrpld's log output for `trace_id=`
- Verify trace_id matches a trace visible in Tempo
- Optionally: query Loki via API to confirm log ingestion
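
The log-side check could look roughly like the following. This is a Python sketch of the idea only; the real test lives in a shell script, the function name `trace_ids_in_log` is hypothetical, and the sample log lines are made up. It does not query Tempo.

```python
import re
from typing import Set

# Trace context shape: 32 hex chars for trace_id, then 16 for span_id.
TRACE_CTX = re.compile(r"trace_id=([a-f0-9]{32}) span_id=([a-f0-9]{16})")


def trace_ids_in_log(log_text: str) -> Set[str]:
    """Return the distinct trace IDs found in the given log text."""
    return {m.group(1) for m in TRACE_CTX.finditer(log_text)}


# Made-up sample: one line emitted inside a span, one outside.
sample_log = "\n".join([
    "2024-01-15T10:30:45.123Z LedgerMaster:NFO "
    "trace_id=abc123def456789012345678abcdef01 span_id=0123456789abcdef "
    "Validated ledger 42",
    "2024-01-15T10:30:46.000Z Application:NFO Startup complete",
])
found = trace_ids_in_log(sample_log)
```

Each ID in `found` can then be looked up in the trace backend to confirm that the log line and the trace refer to the same request.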
@@ -225,7 +225,7 @@
- [ ] Log lines within active spans contain `trace_id=<hex> span_id=<hex>`
- [ ] Log lines outside spans have no trace context (no empty fields)
- [ ] Loki ingests rippled logs via OTel Collector filelog receiver
- [ ] Loki ingests xrpld logs via OTel Collector filelog receiver
- [ ] Grafana Tempo -> Loki one-click correlation works
- [ ] Grafana Loki -> Tempo reverse lookup works via derived field
- [ ] Integration test verifies trace_id presence in logs


@@ -469,14 +469,14 @@ Pre-configured datasources:
## Test 3: Log-Trace Correlation (Phase 8)
Phase 8 injects `trace_id` and `span_id` into rippled's log output when
Phase 8 injects `trace_id` and `span_id` into xrpld's log output when
a log line is emitted within an active OTel span. This test verifies the
end-to-end log-trace correlation pipeline.
### Step 1: Verify trace_id in log output
After running Test 1 or Test 2 (which generate RPC spans), check the
rippled debug.log for trace context:
xrpld debug.log for trace context:
```bash
grep 'trace_id=[a-f0-9]\{32\} span_id=[a-f0-9]\{16\}' /path/to/debug.log
@@ -506,13 +506,13 @@ Expected result: `1` (the trace exists in Jaeger).
### Step 3: Verify Loki log ingestion
The OTel Collector's filelog receiver tails rippled's debug.log and
The OTel Collector's filelog receiver tails xrpld's debug.log and
exports parsed entries to Loki. Verify Loki has received entries:
```bash
# Query Loki for any rippled logs
# Query Loki for any xrpld logs
curl -sG "http://localhost:3100/loki/api/v1/query" \
--data-urlencode 'query={job="rippled"}' \
--data-urlencode 'query={job="xrpld"}' \
--data-urlencode 'limit=5' | jq '.data.result | length'
```
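
For scripted checks, the same query and result count can be sketched in Python. The helper names `build_query_url` and `count_streams` are hypothetical, and the response body below is a hand-written sample in the shape Loki's query API returns, not captured output.

```python
import json
from urllib.parse import urlencode

LOKI_QUERY_URL = "http://localhost:3100/loki/api/v1/query"


def build_query_url(logql: str, limit: int = 5) -> str:
    """Encode the LogQL query and limit as query-string parameters."""
    return f"{LOKI_QUERY_URL}?{urlencode({'query': logql, 'limit': limit})}"


def count_streams(response_body: str) -> int:
    """Python equivalent of `jq '.data.result | length'`."""
    return len(json.loads(response_body)["data"]["result"])


url = build_query_url('{job="xrpld"}')

# Hand-written sample response (assumed shape, illustrative values).
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "streams",
        "result": [
            {"stream": {"job": "xrpld"},
             "values": [["1705314645123000000", "LedgerMaster:NFO ..."]]},
        ],
    },
})
stream_count = count_streams(sample_response)
```

A count of zero usually means either that nothing has been ingested yet or that the `job` label in the query does not match the label the collector attaches.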
@@ -529,7 +529,7 @@ Expected: > 0 results.
### Step 5: Verify Grafana Loki-to-Tempo correlation
1. In Grafana **Explore**, select **Loki** datasource
2. Query: `{job="rippled"} |= "trace_id="`
2. Query: `{job="xrpld"} |= "trace_id="`
3. In the log results, click the **TraceID** derived field link
4. Verify it navigates to the full trace in Tempo
@@ -588,7 +588,7 @@ Expected: > 0 results.
### No trace_id in log output (Phase 8)
1. Verify rippled was built with `telemetry=ON` (`-Dtelemetry=ON` in CMake)
1. Verify xrpld was built with `telemetry=ON` (`-Dtelemetry=ON` in CMake)
2. Verify `enabled=1` in the `[telemetry]` config section
3. Log lines only contain trace context when emitted inside an active span.
Background logs (startup, periodic tasks outside spans) will not have


@@ -487,7 +487,7 @@ Requires `trace_peer=1` in the `[telemetry]` config section.
## Log-Trace Correlation (Phase 8)
When rippled is built with `telemetry=ON`, log lines emitted within an active OpenTelemetry span automatically include `trace_id` and `span_id` fields:
When xrpld is built with `telemetry=ON`, log lines emitted within an active OpenTelemetry span automatically include `trace_id` and `span_id` fields:
```
2024-01-15T10:30:45.123Z LedgerMaster:NFO trace_id=abc123def456789012345678abcdef01 span_id=0123456789abcdef Validated ledger 42
@@ -506,27 +506,27 @@ Log files are ingested by the OTel Collector's `filelog` receiver, which tails `
```logql
# Find all logs for a specific trace
{job="rippled"} |= "trace_id=abc123def456789012345678abcdef01"
{job="xrpld"} |= "trace_id=abc123def456789012345678abcdef01"
# Error logs with trace context (log lines with ERR severity that have a trace_id)
{job="rippled"} |= "ERR" |= "trace_id="
{job="xrpld"} |= "ERR" |= "trace_id="
# All logs from a specific partition that were emitted during a span
{job="rippled"} |= "LedgerMaster" | regexp `trace_id=(?P<trace_id>[a-f0-9]+)` | trace_id != ""
{job="xrpld"} |= "LedgerMaster" | regexp `trace_id=(?P<trace_id>[a-f0-9]+)` | trace_id != ""
# Extract partition, severity, and trace_id from traced log lines (the time range comes from the query window, e.g. last hour)
{job="rippled"} |= "trace_id=" | regexp `(?P<partition>\S+):(?P<sev>\S+)\s+trace_id=(?P<tid>[a-f0-9]+)`
{job="xrpld"} |= "trace_id=" | regexp `(?P<partition>\S+):(?P<sev>\S+)\s+trace_id=(?P<tid>[a-f0-9]+)`
# Count of traced log lines per 5-minute window
count_over_time({job="rippled"} |= "trace_id=" [5m])
count_over_time({job="xrpld"} |= "trace_id=" [5m])
```
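
To check what the `regexp` stage extracts before running it in Grafana, the same pattern can be tried locally. This is a sketch in Python, whose `re` module accepts the same `(?P<name>...)` group syntax; Loki uses Go's RE2 engine, so edge-case behavior may differ.

```python
import re

# Same pattern as in the LogQL `regexp` stage above.
PATTERN = re.compile(
    r"(?P<partition>\S+):(?P<sev>\S+)\s+trace_id=(?P<tid>[a-f0-9]+)"
)

# Sample line in the documented log format.
line = ("2024-01-15T10:30:45.123Z LedgerMaster:NFO "
        "trace_id=abc123def456789012345678abcdef01 span_id=0123456789abcdef "
        "Validated ledger 42")

match = PATTERN.search(line)
labels = match.groupdict() if match else {}
```

The colons inside the timestamp do not produce a match because the pattern requires `trace_id=` to follow the `partition:sev` token after whitespace, so the first match starts at `LedgerMaster`.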
### Verifying Log Correlation
1. Start the observability stack and rippled with telemetry enabled.
1. Start the observability stack and xrpld with telemetry enabled.
2. Send an RPC request: `curl http://localhost:5005 -d '{"method":"server_info"}'`
3. Check the debug.log for `trace_id=` entries: `grep trace_id= /path/to/debug.log`
4. Open Grafana at http://localhost:3000 -> Explore -> Loki and search for `{job="rippled"} |= "trace_id="`.
4. Open Grafana at http://localhost:3000 -> Explore -> Loki and search for `{job="xrpld"} |= "trace_id="`.
5. Click the TraceID link to navigate to the corresponding trace in Tempo.
## Troubleshooting
@@ -554,14 +554,14 @@ count_over_time({job="rippled"} |= "trace_id=" [5m])
### No trace_id in log output
- Verify rippled was built with `telemetry=ON` (the `XRPL_ENABLE_TELEMETRY` preprocessor flag)
- Verify xrpld was built with `telemetry=ON` (the `XRPL_ENABLE_TELEMETRY` preprocessor flag)
- Verify `enabled=1` in the `[telemetry]` config section
- Log lines only contain `trace_id`/`span_id` when emitted inside an active span — background logs outside of RPC/consensus/transaction processing will not have trace context
- Check that the specific trace category is enabled (e.g., `trace_rpc=1`)
### No logs in Loki
- Verify the log file mount in docker-compose.yml points to the correct rippled log directory
- Verify the log file mount in docker-compose.yml points to the correct xrpld log directory
- Check OTel Collector logs for filelog receiver errors: `docker compose logs otel-collector`
- Verify Loki is running: `curl http://localhost:3100/ready`
- Check the filelog receiver glob pattern matches your log file paths