rippled/docker/telemetry/workload/expected_metrics.json
Pratik Mankawde 787b496484 Phase 10: Synthetic workload generation and telemetry validation tools
Add comprehensive workload harness for end-to-end validation of the
Phases 1-9 telemetry stack:

Task 10.1 — Multi-node test harness:
  - docker-compose.workload.yaml with full OTel stack (Collector, Jaeger,
    Tempo, Prometheus, Loki, Grafana)
  - generate-validator-keys.sh for automated key generation
  - xrpld-validator.cfg.template for node configuration

Task 10.2 — RPC load generator:
  - rpc_load_generator.py with WebSocket client, configurable rates,
    realistic command distribution (40% health, 30% wallet, 15% explorer,
    10% tx lookups, 5% DEX), W3C traceparent injection
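
The command mix and trace-context injection described for Task 10.2 can be sketched as below. This is a minimal illustration, not the code in rpc_load_generator.py: the category keys, `pick_category`, and `make_traceparent` are hypothetical names, and only the percentages and the W3C `traceparent` layout (version, 16-byte trace ID, 8-byte parent ID, flags) come from the commit message and the W3C Trace Context format.

```python
import random
import secrets

# Weights copied from the commit message; the keys are illustrative labels,
# not the actual command names used by rpc_load_generator.py.
COMMAND_WEIGHTS = {
    "health": 40,
    "wallet": 30,
    "explorer": 15,
    "tx_lookup": 10,
    "dex": 5,
}


def pick_category(rng: random.Random) -> str:
    """Draw one command category according to the configured weights."""
    categories = list(COMMAND_WEIGHTS)
    weights = list(COMMAND_WEIGHTS.values())
    return rng.choices(categories, weights=weights, k=1)[0]


def make_traceparent() -> str:
    """Build a W3C traceparent value: version-traceid-parentid-flags."""
    trace_id = secrets.token_hex(16)  # 32 lowercase hex characters
    span_id = secrets.token_hex(8)    # 16 lowercase hex characters
    return f"00-{trace_id}-{span_id}-01"  # flags 01 = sampled


if __name__ == "__main__":
    rng = random.Random(0)
    print(pick_category(rng), make_traceparent())
```

How the resulting value is attached to each WebSocket request is not stated in the commit message, so that step is left out here.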

Task 10.3 — Transaction submitter:
  - tx_submitter.py with 10 transaction types (Payment, OfferCreate,
    OfferCancel, TrustSet, NFTokenMint, NFTokenCreateOffer, EscrowCreate,
    EscrowFinish, AMMCreate, AMMDeposit), auto-funded test accounts

Task 10.4 — Telemetry validation suite:
  - validate_telemetry.py checking spans (Jaeger), metrics (Prometheus),
    log-trace correlation (Loki), dashboards (Grafana)
  - expected_spans.json (17 span types, 22 attributes, 3 hierarchies)
  - expected_metrics.json (SpanMetrics, StatsD, Phase 9, dashboards)

Task 10.5 — Performance benchmark suite:
  - benchmark.sh for baseline vs telemetry comparison
  - collect_system_metrics.sh for CPU/memory/latency sampling
  - Thresholds: <3% CPU, <5MB memory, <2ms RPC p99, <5% TPS, <1% consensus

Task 10.6 — CI integration:
  - telemetry-validation.yml GitHub Actions workflow
  - run-full-validation.sh orchestrator script
  - Manual trigger + telemetry branch auto-trigger

Task 10.7 — Documentation:
  - workload/README.md with quick start and tool reference
  - Updated telemetry-runbook.md with validation and benchmark sections
  - Updated 09-data-collection-reference.md with validation inventory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 10:59:16 +00:00


{
  "description": "Expected metric inventory for rippled telemetry validation. Sourced from 09-data-collection-reference.md.",
  "spanmetrics": {
    "description": "SpanMetrics-derived RED metrics from the OTel Collector spanmetrics connector.",
    "metrics": [
      "traces_span_metrics_calls_total",
      "traces_span_metrics_duration_milliseconds_bucket",
      "traces_span_metrics_duration_milliseconds_count",
      "traces_span_metrics_duration_milliseconds_sum"
    ],
    "required_labels": [
      "span_name",
      "status_code",
      "service_name",
      "span_kind"
    ],
    "dimension_labels": [
      "xrpl_rpc_command",
      "xrpl_rpc_status",
      "xrpl_consensus_mode",
      "xrpl_tx_local",
      "xrpl_peer_proposal_trusted",
      "xrpl_peer_validation_trusted"
    ]
  },
  "statsd_gauges": {
    "description": "beast::insight gauges emitted via StatsD UDP.",
    "metrics": [
      "rippled_LedgerMaster_Validated_Ledger_Age",
      "rippled_LedgerMaster_Published_Ledger_Age",
      "rippled_State_Accounting_Full_duration",
      "rippled_Peer_Finder_Active_Inbound_Peers",
      "rippled_Peer_Finder_Active_Outbound_Peers",
      "rippled_job_count"
    ]
  },
  "statsd_counters": {
    "description": "beast::insight counters emitted via StatsD UDP.",
    "metrics": ["rippled_rpc_requests", "rippled_ledger_fetches"]
  },
  "statsd_histograms": {
    "description": "beast::insight timers/histograms emitted via StatsD UDP.",
    "metrics": ["rippled_rpc_time", "rippled_rpc_size", "rippled_ios_latency"]
  },
  "overlay_traffic": {
    "description": "Overlay traffic metrics (subset — full list has 45+ categories).",
    "metrics": [
      "rippled_total_Bytes_In",
      "rippled_total_Bytes_Out",
      "rippled_total_Messages_In",
      "rippled_total_Messages_Out"
    ]
  },
  "phase9_nodestore": {
    "description": "Phase 9 NodeStore I/O metrics (via beast::insight extensions).",
    "metrics": [
      "rippled_nodestore_reads_total",
      "rippled_nodestore_writes",
      "rippled_nodestore_read_bytes",
      "rippled_nodestore_written_bytes"
    ]
  },
  "phase9_cache": {
    "description": "Phase 9 cache hit rate metrics (via OTel MetricsRegistry).",
    "metrics": ["rippled_cache_SLE_hit_rate", "rippled_cache_treenode_size"]
  },
  "phase9_txq": {
    "description": "Phase 9 transaction queue metrics (via OTel MetricsRegistry).",
    "metrics": ["rippled_txq_count", "rippled_txq_max_size"]
  },
  "phase9_rpc_method": {
    "description": "Phase 9 per-RPC-method metrics (via OTel Metrics SDK).",
    "metrics": [
      "rippled_rpc_method_started_total",
      "rippled_rpc_method_finished_total"
    ]
  },
  "phase9_objects": {
    "description": "Phase 9 counted object instances.",
    "metrics": ["rippled_object_count"]
  },
  "phase9_load": {
    "description": "Phase 9 fee escalation and load factor metrics.",
    "metrics": ["rippled_load_factor"]
  },
  "grafana_dashboards": {
    "description": "All 10 Grafana dashboards that must render data.",
    "uids": [
      "rippled-rpc-perf",
      "rippled-transactions",
      "rippled-consensus",
      "rippled-ledger-ops",
      "rippled-peer-net",
      "rippled-statsd-node-health",
      "rippled-statsd-network",
      "rippled-statsd-rpc",
      "rippled-statsd-overlay-detail",
      "rippled-statsd-ledger-sync"
    ]
  }
}
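
A consumer of this inventory might compare it against the metric names a backend actually reports. The sketch below is one way to do that and is not taken from validate_telemetry.py: `expected_metric_names` and `missing_metrics` are hypothetical helpers, and the inline `SAMPLE` is a trimmed copy of the file above so the block runs on its own.

```python
import json

# Trimmed copy of expected_metrics.json, inlined so the example is runnable.
SAMPLE = json.loads("""
{
  "description": "Expected metric inventory (trimmed sample).",
  "statsd_counters": {
    "description": "beast::insight counters emitted via StatsD UDP.",
    "metrics": ["rippled_rpc_requests", "rippled_ledger_fetches"]
  },
  "phase9_load": {
    "description": "Phase 9 fee escalation and load factor metrics.",
    "metrics": ["rippled_load_factor"]
  },
  "grafana_dashboards": {
    "description": "Dashboards (no 'metrics' key, so this section is skipped).",
    "uids": ["rippled-rpc-perf"]
  }
}
""")


def expected_metric_names(inventory: dict) -> dict[str, list[str]]:
    """Map each section that has a 'metrics' list to that list."""
    return {
        section: body["metrics"]
        for section, body in inventory.items()
        if isinstance(body, dict) and "metrics" in body
    }


def missing_metrics(inventory: dict, observed: set[str]) -> dict[str, list[str]]:
    """Return, per section, the expected metric names absent from `observed`."""
    missing: dict[str, list[str]] = {}
    for section, names in expected_metric_names(inventory).items():
        absent = [name for name in names if name not in observed]
        if absent:
            missing[section] = absent
    return missing


if __name__ == "__main__":
    observed = {"rippled_rpc_requests", "rippled_load_factor"}
    print(missing_metrics(SAMPLE, observed))
```

In a real run, `observed` could be filled from the Prometheus HTTP API, for example the list of metric names returned by `/api/v1/label/__name__/values`; the top-level `description` string and the `grafana_dashboards` section are ignored here because they carry no `metrics` list.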