rippled/docker/telemetry/workload/regression-thresholds.json
Pratik Mankawde d83cb0bdb3 fix(telemetry): refresh regression baseline + widen bucket-noise thresholds
With validation now passing 133/133, the only remaining job failure was the
regression gate flagging 4 timing "regressions". Two compounding causes:

1. Stale baseline: the committed baseline was captured on 2026-04-24 under the
   old, lighter workload, before the new txq-burst phase (60 TPS) existed. The
   heavier per-ledger work genuinely raises ledger.build / tx.apply /
   ledger.validate / acceptLedger timings, so every run regressed against it.
   Refreshed the baseline from the latest CI-measured timings (same workload).
2. Histogram quantization: SpanMetrics latency buckets are
   [1,5,10,25,...]ms, so a sub-millisecond quantile near a low-end boundary can
   jump a full bucket (1ms->5ms) between runs with no real change. The old
   absolute bounds (2-5ms) were narrower than one bucket width, so that jitter
   tripped the gate. Widened the default span bounds to 10-15ms (~2 low-end
   buckets) and pct to 50%, and the job_queue running bound to 20ms, to tolerate
   quantization noise while still catching genuine multi-bucket regressions. The
   consensus.* overrides (tight pct, large abs) are unchanged.
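
The quantization effect in point 2 can be reproduced with a small sketch. It assumes quantiles are estimated by linear interpolation inside the bucket that contains the target rank (the approach Prometheus's histogram_quantile takes); the pipeline's actual estimator is not shown here, and the helper names and sample values are hypothetical.

```python
import math

# Bucket upper bounds (ms) quoted in the commit message and _bucket_note.
BUCKETS_MS = [1, 5, 10, 25, 50, 100, 250, 500, 1000, 5000]


def bucket_counts(samples_ms):
    """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
    bounds = BUCKETS_MS + [math.inf]
    return [(b, sum(1 for s in samples_ms if s <= b)) for b in bounds]


def estimate_quantile(q, cumulative):
    """Estimate quantile q by linear interpolation within a bucket.

    Assumption: this mirrors a histogram_quantile-style estimator; it is
    not taken from the regression gate's own code.
    """
    total = cumulative[-1][1]
    if total == 0:
        return None
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in cumulative:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the overflow bucket: report the last finite bound.
                return prev_bound
            span = bound - prev_bound
            return prev_bound + span * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound


# Two hypothetical runs: every sample is either 0.9 ms or 1.2 ms.
# Only the share of samples just above the 1 ms boundary differs.
run_a = [0.9] * 96 + [1.2] * 4
run_b = [0.9] * 90 + [1.2] * 10

p95_a = estimate_quantile(0.95, bucket_counts(run_a))
p95_b = estimate_quantile(0.95, bucket_counts(run_b))
print(f"run A p95 = {p95_a:.2f} ms, run B p95 = {p95_b:.2f} ms")
```

Under these assumptions the estimated p95 moves from just under 1 ms to about 3 ms, although no sample in either run exceeds 1.2 ms. A shift of that size would exceed a 2ms absolute bound but not a 10ms one.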

The refreshed baseline also picks up real rpc.ws_message timings (previously null
under the phantom rpc.request key).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 19:58:07 +01:00


{
  "_description": "Per-metric regression thresholds. A metric regresses when current - baseline exceeds BOTH the percentage and absolute bounds (AND, not OR; this tolerates small-value noise). Defaults apply unless a per-metric override exists.",
  "_bucket_note": "SpanMetrics latency histograms use explicit buckets [1,5,10,25,50,100,250,500,1000,5000]ms. A quantile sitting near a low-end boundary can jump a full bucket (e.g. 1ms->5ms) between runs with no real change, so absolute span bounds are set to ~2 low-end bucket widths (10ms; 15ms for p99) to tolerate that quantization noise while still catching genuine multi-bucket regressions. The job_queue running bound is widened similarly, since per-ledger apply work scales with TxQ burst load.",
  "defaults": {
    "span": {
      "p50": { "max_pct_increase": 50.0, "max_abs_increase_ms": 10.0 },
      "p95": { "max_pct_increase": 50.0, "max_abs_increase_ms": 10.0 },
      "p99": { "max_pct_increase": 50.0, "max_abs_increase_ms": 15.0 }
    },
    "job_queue": {
      "p95": { "max_pct_increase": 50.0, "max_abs_increase_us": 20000.0 }
    }
  },
  "overrides": {
    "span.consensus.ledger_close": {
      "p50": { "max_pct_increase": 5.0, "max_abs_increase_ms": 200.0 },
      "p95": { "max_pct_increase": 5.0, "max_abs_increase_ms": 500.0 },
      "p99": { "max_pct_increase": 5.0, "max_abs_increase_ms": 1000.0 }
    },
    "span.consensus.accept": {
      "p50": { "max_pct_increase": 5.0, "max_abs_increase_ms": 200.0 },
      "p95": { "max_pct_increase": 5.0, "max_abs_increase_ms": 500.0 },
      "p99": { "max_pct_increase": 5.0, "max_abs_increase_ms": 1000.0 }
    }
  }
}
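
A minimal sketch of how a gate might apply these thresholds, following the rule in _description: a metric regresses only when the increase exceeds both the percentage bound and the absolute bound. The function names, the metric names other than the two consensus overrides, the "first dotted segment selects the defaults group" lookup, and the zero-baseline handling are all assumptions for illustration; the actual gate script is not shown here.

```python
# Abridged copy of the thresholds above, inlined so the sketch is self-contained.
THRESHOLDS = {
    "defaults": {
        "span": {
            "p50": {"max_pct_increase": 50.0, "max_abs_increase_ms": 10.0},
            "p95": {"max_pct_increase": 50.0, "max_abs_increase_ms": 10.0},
            "p99": {"max_pct_increase": 50.0, "max_abs_increase_ms": 15.0},
        },
        "job_queue": {
            "p95": {"max_pct_increase": 50.0, "max_abs_increase_us": 20000.0},
        },
    },
    "overrides": {
        "span.consensus.ledger_close": {
            "p50": {"max_pct_increase": 5.0, "max_abs_increase_ms": 200.0},
        },
    },
}


def bounds_for(metric, quantile, thresholds):
    """Prefer a per-metric override, else fall back to the group default.

    Assumption: the group is the first dotted segment of the metric name.
    """
    override = thresholds["overrides"].get(metric, {})
    if quantile in override:
        return override[quantile]
    group = metric.split(".", 1)[0]
    return thresholds["defaults"][group][quantile]


def is_regression(metric, quantile, baseline, current, thresholds=THRESHOLDS):
    """True only if BOTH the percentage and absolute bounds are exceeded.

    baseline and current must be in the unit of the bound's key suffix
    (ms for span metrics, us for job_queue metrics).
    """
    bounds = bounds_for(metric, quantile, thresholds)
    abs_limit = next(
        v for k, v in bounds.items() if k.startswith("max_abs_increase_")
    )
    delta = current - baseline
    if delta <= 0:
        return False
    # Assumption: a zero baseline counts as an unbounded percentage increase.
    pct = delta / baseline * 100.0 if baseline > 0 else float("inf")
    return pct > bounds["max_pct_increase"] and delta > abs_limit


# A one-bucket jump (1ms -> 5ms) is +400% but only +4ms: tolerated.
print(is_regression("span.tx.apply", "p95", 1.0, 5.0))
# A multi-bucket jump (20ms -> 45ms) is +125% and +25ms: flagged.
print(is_regression("span.tx.apply", "p95", 20.0, 45.0))
```

Because the two conditions are ANDed, the tight 5% bound on the consensus.* overrides only fires once the increase also exceeds the large absolute bound, and the wide 50% default does not flag a small metric that jumps a single bucket.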