mirror of https://github.com/XRPLF/rippled.git synced 2026-03-16 01:32:23 +00:00

Pratik Mankawde 3e34708472 Fix CI: cspell EOJSON delimiter and telemetry workflow conan setup

- Rename EOJSON heredoc delimiter to EOF_JSON to avoid cspell unknown word
- Add conan installation step (pip3 install conan) to telemetry-validation workflow
- Use shared setup-conan action for proper Conan profile/remote configuration
- Align build commands with reusable-build-test-config.yml conventions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-12 22:12:57 +00:00

File	Last commit	Date
benchmark-results	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
benchmark.sh	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
collect_system_metrics.sh	Fix CI: cspell EOJSON delimiter and telemetry workflow conan setup	2026-03-12 22:12:57 +00:00
expected_metrics.json	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
expected_spans.json	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
generate-validator-keys.sh	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
README.md	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
requirements.txt	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
rpc_load_generator.py	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
run-full-validation.sh	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
test_accounts.json	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
tx_submitter.py	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
validate_telemetry.py	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00
xrpld-validator.cfg.template	Phase 10: Synthetic workload generation and telemetry validation tools	2026-03-12 22:12:57 +00:00

README.md

Telemetry Workload Tools

Synthetic workload generation and validation tools for rippled's OpenTelemetry stack. These tools verify that all spans, metrics, dashboards, and log-trace correlation work end-to-end under controlled load.

Quick Start

# Build rippled with telemetry enabled
conan install . --build=missing -o telemetry=True
cmake --preset default -Dtelemetry=ON
cmake --build --preset default

# Run full validation (starts everything, runs load, validates)
docker/telemetry/workload/run-full-validation.sh --xrpld .build/xrpld

# Cleanup when done
docker/telemetry/workload/run-full-validation.sh --cleanup

Architecture

run-full-validation.sh (orchestrator)
  |
  |-- docker-compose.workload.yaml
  |     |-- otel-collector (traces + StatsD)
  |     |-- jaeger (trace search)
  |     |-- tempo (trace storage)
  |     |-- prometheus (metrics)
  |     |-- loki (log aggregation)
  |     |-- grafana (dashboards)
  |
  |-- generate-validator-keys.sh
  |     -> validator-keys.json, validators.txt
  |
  |-- 5x xrpld nodes (local processes, full telemetry)
  |
  |-- rpc_load_generator.py (WebSocket RPC traffic)
  |-- tx_submitter.py (transaction diversity)
  |
  |-- validate_telemetry.py (pass/fail checks)
  |     -> validation-report.json
  |
  |-- benchmark.sh (baseline vs telemetry comparison)
        -> benchmark-report-*.md

Tools Reference

run-full-validation.sh

Orchestrates the complete validation pipeline: starts the telemetry stack, starts a multi-node xrpld cluster (five local nodes, per the architecture above), generates load, and validates the results.

# Full validation with defaults
./run-full-validation.sh --xrpld /path/to/xrpld

# Custom load parameters
./run-full-validation.sh --xrpld /path/to/xrpld \
    --rpc-rate 100 --rpc-duration 300 \
    --tx-tps 10 --tx-duration 300

# Include performance benchmarks
./run-full-validation.sh --xrpld /path/to/xrpld --with-benchmark

# Skip Loki checks (if Phase 8 is not deployed)
./run-full-validation.sh --xrpld /path/to/xrpld --skip-loki

rpc_load_generator.py

Generates RPC traffic that follows a realistic production distribution:

40% health checks (server_info, fee)
30% wallet queries (account_info, account_lines, account_objects)
15% explorer queries (ledger, ledger_data)
10% transaction lookups (tx, account_tx)
5% DEX queries (book_offers, amm_info)

# Basic usage
python3 rpc_load_generator.py --endpoints ws://localhost:6006 --rate 50 --duration 120

# Multiple endpoints (round-robin)
python3 rpc_load_generator.py \
    --endpoints ws://localhost:6006 ws://localhost:6007 \
    --rate 100 --duration 300

# Custom weights
python3 rpc_load_generator.py --endpoints ws://localhost:6006 \
    --weights '{"server_info": 80, "account_info": 20}'
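The distribution and round-robin behaviour described above can be illustrated with a minimal sketch. It is not the code in rpc_load_generator.py: the names `CATEGORY_WEIGHTS`, `CATEGORY_COMMANDS`, `pick_command`, and `plan_requests` are hypothetical, and the uniform choice of a command within each category is an assumption, since the README only gives per-category percentages.

```python
import itertools
import random
from collections import Counter

# Category weights (percent) copied from the distribution listed above.
CATEGORY_WEIGHTS = {
    "health": 40,
    "wallet": 30,
    "explorer": 15,
    "tx_lookup": 10,
    "dex": 5,
}

# Commands per category, as listed above.
CATEGORY_COMMANDS = {
    "health": ["server_info", "fee"],
    "wallet": ["account_info", "account_lines", "account_objects"],
    "explorer": ["ledger", "ledger_data"],
    "tx_lookup": ["tx", "account_tx"],
    "dex": ["book_offers", "amm_info"],
}


def pick_command(rng: random.Random) -> str:
    """Pick a category by weight, then a command within it.

    Assumption: commands inside a category are equally likely.
    """
    categories = list(CATEGORY_WEIGHTS)
    weights = [CATEGORY_WEIGHTS[c] for c in categories]
    category = rng.choices(categories, weights=weights, k=1)[0]
    return rng.choice(CATEGORY_COMMANDS[category])


def plan_requests(endpoints, count, seed=0):
    """Return (endpoint, command) pairs, cycling endpoints round-robin."""
    rng = random.Random(seed)
    endpoint_cycle = itertools.cycle(endpoints)
    return [(next(endpoint_cycle), pick_command(rng)) for _ in range(count)]


if __name__ == "__main__":
    plan = plan_requests(["ws://localhost:6006", "ws://localhost:6007"], 1000)
    # Print how often each command was selected in this sample plan.
    print(Counter(cmd for _, cmd in plan).most_common())
```

Seeding the generator keeps a planned workload reproducible between runs, which can make a failing validation easier to replay.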

tx_submitter.py

Submits diverse transaction types to exercise the full span and metric surface:

Payment (XRP transfers)
OfferCreate / OfferCancel (DEX activity)
TrustSet (trust line creation)
NFTokenMint / NFTokenCreateOffer (NFT activity)
EscrowCreate / EscrowFinish (escrow lifecycle)
AMMCreate / AMMDeposit (AMM pool operations)

# Basic usage
python3 tx_submitter.py --endpoint ws://localhost:6006 --tps 5 --duration 120

# Custom mix
python3 tx_submitter.py --endpoint ws://localhost:6006 \
    --weights '{"Payment": 60, "OfferCreate": 20, "TrustSet": 20}'
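To show what `--tps` and `--duration` imply for pacing, here is a minimal sketch of a fixed-rate submission loop. It is an illustration under assumptions, not the logic of tx_submitter.py: `send_offsets` and `run_paced` are hypothetical names, and `submit` is a placeholder callback standing in for whatever actually sends a transaction.

```python
import time


def send_offsets(tps: float, duration: float) -> list:
    """Offsets (seconds from start) at which each transaction is due."""
    if tps <= 0 or duration <= 0:
        raise ValueError("tps and duration must be positive")
    total = round(tps * duration)
    return [i / tps for i in range(total)]


def run_paced(tps, duration, submit, clock=time.monotonic, sleep=time.sleep):
    """Call submit(index) at an even rate and return the number of calls.

    Each deadline is computed from the start time rather than from the
    previous send, so a slow submit does not accumulate drift.
    """
    start = clock()
    sent = 0
    for offset in send_offsets(tps, duration):
        delay = start + offset - clock()
        if delay > 0:
            sleep(delay)
        submit(sent)
        sent += 1
    return sent


if __name__ == "__main__":
    # The basic usage above (--tps 5 --duration 120) implies this many sends.
    print(len(send_offsets(5, 120)))
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock in tests instead of waiting in real time.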

validate_telemetry.py

Validates automatically that all expected telemetry data exists:

Span validation: All 16+ span types with required attributes
Metric validation: SpanMetrics, StatsD, Phase 9 metrics
Log-trace correlation: trace_id/span_id in Loki logs
Dashboard validation: All 10 Grafana dashboards accessible

# Run all validations
python3 validate_telemetry.py --report /tmp/report.json

# Skip Loki checks
python3 validate_telemetry.py --skip-loki --report /tmp/report.json
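As an illustration of what a span check with required attributes might look like, the sketch below compares one expected span type against a list of collected spans and emits a result shaped like the report entries shown later in this README. The input shapes are assumptions: the `required_attributes` key, the span dictionaries, and the attribute name `example.command` are hypothetical and do not reflect the actual layout of expected_spans.json or the trace backend's API.

```python
def check_span(expected: dict, found_spans: list) -> dict:
    """Build one report-style check for a single expected span type.

    expected:    {"name": str, "required_attributes": [str, ...]}  (assumed shape)
    found_spans: [{"name": str, "attributes": {...}}, ...]          (assumed shape)
    """
    name = expected["name"]
    required = expected.get("required_attributes", [])
    matches = [s for s in found_spans if s.get("name") == name]
    check = {"name": f"span.{name}", "category": "span"}

    if not matches:
        check.update(
            passed=False,
            message=f"{name}: no spans found",
            details={"span_count": 0},
        )
        return check

    # An attribute counts as missing if any matching span lacks it.
    missing = sorted(
        {
            attr
            for span in matches
            for attr in required
            if attr not in span.get("attributes", {})
        }
    )
    check.update(
        passed=not missing,
        message=(
            f"{name}: missing attributes {missing}"
            if missing
            else f"{name}: {len(matches)} spans found"
        ),
        details={"span_count": len(matches), "missing_attributes": missing},
    )
    return check


if __name__ == "__main__":
    spans = [{"name": "rpc.request", "attributes": {"example.command": "fee"}}]
    expected = {"name": "rpc.request", "required_attributes": ["example.command"]}
    print(check_span(expected, spans))
```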

benchmark.sh

Compares baseline (no telemetry) vs telemetry-enabled performance:

./benchmark.sh --xrpld /path/to/xrpld --duration 300

Thresholds (configurable via environment variables):

Metric	Threshold	Env Variable
CPU overhead	< 3%	BENCH_CPU_OVERHEAD_PCT
Memory overhead	< 5 MB	BENCH_MEM_OVERHEAD_MB
RPC p99 latency	< 2 ms	BENCH_RPC_LATENCY_IMPACT_MS
Throughput impact	< 5%	BENCH_TPS_IMPACT_PCT
Consensus impact	< 1%	BENCH_CONSENSUS_IMPACT_PCT
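The threshold table can be read as "each variable overrides a default, and a measurement passes only if it is strictly below the limit". The sketch below expresses that reading in Python. benchmark.sh itself is a shell script, so this is not its implementation: `DEFAULT_THRESHOLDS`, `load_thresholds`, and `exceeded` are hypothetical names, and the measured values in the example are made up.

```python
import os

# Defaults copied from the table above; keys are the documented variable names.
DEFAULT_THRESHOLDS = {
    "BENCH_CPU_OVERHEAD_PCT": 3.0,
    "BENCH_MEM_OVERHEAD_MB": 5.0,
    "BENCH_RPC_LATENCY_IMPACT_MS": 2.0,
    "BENCH_TPS_IMPACT_PCT": 5.0,
    "BENCH_CONSENSUS_IMPACT_PCT": 1.0,
}


def load_thresholds(env=None) -> dict:
    """Return thresholds, letting environment variables override defaults."""
    env = os.environ if env is None else env
    return {
        name: float(env.get(name, default))
        for name, default in DEFAULT_THRESHOLDS.items()
    }


def exceeded(measured: dict, thresholds: dict) -> list:
    """Names whose measured value is not strictly below its threshold."""
    return sorted(
        name
        for name, value in measured.items()
        if name in thresholds and value >= thresholds[name]
    )


if __name__ == "__main__":
    thresholds = load_thresholds()
    # Made-up measurements, for illustration only.
    measured = {"BENCH_CPU_OVERHEAD_PCT": 2.1, "BENCH_MEM_OVERHEAD_MB": 6.4}
    print(exceeded(measured, thresholds))
```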

Reading Validation Reports

The validation report (validation-report.json) is structured as:

{
  "summary": {
    "total": 45,
    "passed": 42,
    "failed": 3,
    "all_passed": false
  },
  "checks": [
    {
      "name": "span.rpc.request",
      "category": "span",
      "passed": true,
      "message": "rpc.request: 15 traces found",
      "details": { "trace_count": 15 }
    }
  ]
}

Categories:

span: Span type existence and attribute validation
metric: Prometheus metric existence
log: Log-trace correlation checks
dashboard: Grafana dashboard accessibility
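A short script can turn the report structure above into a per-category overview. The following is a minimal sketch that relies only on the fields shown in the example report (`summary`, `checks`, `name`, `category`, `passed`, `message`); the function names `summarize` and `summary_is_consistent` are hypothetical and are not part of the tools in this directory.

```python
import json
from collections import defaultdict


def summarize(report: dict) -> dict:
    """Group checks by category, counting passes and listing failures."""
    by_category = defaultdict(lambda: {"passed": 0, "failed": 0, "failures": []})
    for check in report.get("checks", []):
        bucket = by_category[check["category"]]
        if check["passed"]:
            bucket["passed"] += 1
        else:
            bucket["failed"] += 1
            bucket["failures"].append(f'{check["name"]}: {check["message"]}')
    return dict(by_category)


def summary_is_consistent(report: dict) -> bool:
    """Check that the summary counts agree with each other."""
    s = report["summary"]
    return (
        s["total"] == s["passed"] + s["failed"]
        and s["all_passed"] == (s["failed"] == 0)
    )


if __name__ == "__main__":
    # Inline sample mirroring the example report above.
    sample = json.loads(
        """
        {
          "summary": {"total": 45, "passed": 42, "failed": 3, "all_passed": false},
          "checks": [
            {"name": "span.rpc.request", "category": "span", "passed": true,
             "message": "rpc.request: 15 traces found",
             "details": {"trace_count": 15}}
          ]
        }
        """
    )
    print(summary_is_consistent(sample))
    print(summarize(sample))
```

To apply it to a real run, load validation-report.json with `json.load` in place of the inline sample.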

CI Integration

The validation runs as a GitHub Actions workflow (.github/workflows/telemetry-validation.yml):

Triggered manually or on pushes to telemetry branches
Builds rippled, starts the full stack, runs load, validates
Uploads reports as artifacts
Posts summary to PR

Configuration Files

File	Purpose
`expected_spans.json`	Span inventory (names, attributes, hierarchies)
`expected_metrics.json`	Metric inventory (SpanMetrics, StatsD, Phase 9)
`test_accounts.json`	Test account roles (keys generated at runtime)
`xrpld-validator.cfg.template`	Node config template with placeholders
`requirements.txt`	Python dependencies