mirror of https://github.com/XRPLF/rippled.git, synced 2026-06-06 18:26:51 +00:00
With validation now passing 133/133, the only remaining job failure was the regression gate flagging 4 timing "regressions". Two compounding causes:

1. Stale baseline: the committed baseline was captured (2026-04-24) under the old, lighter workload, before the new txq-burst phase (60 TPS) existed. The heavier per-ledger work genuinely raises ledger.build / tx.apply / ledger.validate / acceptLedger timings, so every run regressed against it. Refreshed the baseline from the latest CI-measured timings (same workload).

2. Histogram quantization: SpanMetrics latency buckets are [1,5,10,25,...]ms, so a sub-millisecond quantile near a low-end boundary can jump a full bucket (1ms->5ms) between runs with no real change. The old absolute bounds (2-5ms) were narrower than one bucket width, so that jitter tripped the gate. Widened the default span bounds to 10-15ms (~2 low-end buckets) and pct to 50%, and the job_queue running bound to 20ms, to tolerate quantization noise while still catching genuine multi-bucket regressions.

The consensus.* overrides (tight pct, large abs) are unchanged. The refreshed baseline also picks up real rpc.ws_message timings (previously null under the phantom rpc.request key).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
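The quantization effect described in cause 2 can be illustrated with a small sketch. It assumes, purely for illustration, that a quantile is reported as the upper bound of the histogram bucket containing the target rank; the commit does not say how the gate actually derives quantiles, and `bucket_quantile` and the sample counts are hypothetical.

```python
import math
from typing import Sequence

# Bucket upper bounds in ms, as listed in the thresholds file's _bucket_note.
BOUNDS_MS = [1, 5, 10, 25, 50, 100, 250, 500, 1000, 5000]


def bucket_quantile(counts: Sequence[int], q: float,
                    bounds: Sequence[float] = BOUNDS_MS) -> float:
    """Return the upper bound of the bucket that holds the q-th quantile.

    counts[i] is the (non-cumulative) number of samples in bucket i; one
    extra trailing entry holds samples above the last bound.
    Assumption: the quantile is reported at bucket granularity.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    rank = max(1, math.ceil(q * total - 1e-9))  # 1-based rank of the quantile
    seen = 0
    for count, upper in zip(counts, list(bounds) + [math.inf]):
        seen += count
        if seen >= rank:
            return float(upper)
    return math.inf


# Two hypothetical runs of 1000 spans each; only 10 samples differ.
run_a = [995, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0]
run_b = [985, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(bucket_quantile(run_a, 0.99))  # 1.0
print(bucket_quantile(run_b, 0.99))  # 5.0
```

Moving 1% of the samples across the 1ms boundary shifts the reported p99 by 4ms under this assumption, which is more than the old 2ms bound and close to the old 5ms bound, and well inside the new 10-15ms bounds.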
27 lines
1.6 KiB
JSON
{
  "_description": "Per-metric regression thresholds. A metric regresses when current - baseline exceeds BOTH the percentage and absolute bounds (AND, not OR; this tolerates small-value noise). Defaults apply unless a per-metric override exists.",
  "_bucket_note": "SpanMetrics latency histograms use explicit buckets [1,5,10,25,50,100,250,500,1000,5000]ms. A quantile sitting near a low-end boundary can jump a full bucket (e.g. 1ms->5ms) between runs with no real change, so absolute span bounds are set to ~2 low-end bucket widths (10ms) to tolerate that quantization noise while still catching genuine multi-bucket regressions. The job_queue running bound is widened similarly, since per-ledger apply work scales with TxQ burst load.",
  "defaults": {
    "span": {
      "p50": { "max_pct_increase": 50.0, "max_abs_increase_ms": 10.0 },
      "p95": { "max_pct_increase": 50.0, "max_abs_increase_ms": 10.0 },
      "p99": { "max_pct_increase": 50.0, "max_abs_increase_ms": 15.0 }
    },
    "job_queue": {
      "p95": { "max_pct_increase": 50.0, "max_abs_increase_us": 20000.0 }
    }
  },
  "overrides": {
    "span.consensus.ledger_close": {
      "p50": { "max_pct_increase": 5.0, "max_abs_increase_ms": 200.0 },
      "p95": { "max_pct_increase": 5.0, "max_abs_increase_ms": 500.0 },
      "p99": { "max_pct_increase": 5.0, "max_abs_increase_ms": 1000.0 }
    },
    "span.consensus.accept": {
      "p50": { "max_pct_increase": 5.0, "max_abs_increase_ms": 200.0 },
      "p95": { "max_pct_increase": 5.0, "max_abs_increase_ms": 500.0 },
      "p99": { "max_pct_increase": 5.0, "max_abs_increase_ms": 1000.0 }
    }
  }
}
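The AND rule and the default/override lookup in the file above could be applied roughly as follows. This is a minimal sketch, not the gate's actual code: `resolve_bounds`, `is_regression`, the `span.tx.apply` key, and the `job_queue.` key prefix are hypothetical, and the percentage is assumed to be taken relative to the baseline value.

```python
import math

# Trimmed copy of the thresholds file (values taken from it unchanged).
THRESHOLDS = {
    "defaults": {
        "span": {
            "p50": {"max_pct_increase": 50.0, "max_abs_increase_ms": 10.0},
            "p95": {"max_pct_increase": 50.0, "max_abs_increase_ms": 10.0},
            "p99": {"max_pct_increase": 50.0, "max_abs_increase_ms": 15.0},
        },
        "job_queue": {
            "p95": {"max_pct_increase": 50.0, "max_abs_increase_us": 20000.0},
        },
    },
    "overrides": {
        "span.consensus.ledger_close": {
            "p50": {"max_pct_increase": 5.0, "max_abs_increase_ms": 200.0},
        },
    },
}


def resolve_bounds(metric, quantile, thresholds=THRESHOLDS):
    """Pick the per-metric override if one exists, else the default."""
    override = thresholds["overrides"].get(metric, {})
    if quantile in override:
        return override[quantile]
    # Assumption: job-queue keys start with "job_queue."; the rest are spans.
    kind = "job_queue" if metric.startswith("job_queue.") else "span"
    return thresholds["defaults"][kind][quantile]


def is_regression(baseline, current, bounds):
    """True only if the increase exceeds BOTH the pct and the abs bound."""
    delta = current - baseline
    if delta <= 0:
        return False
    # The absolute bound is in ms for spans and us for job_queue metrics;
    # baseline and current must be in the same unit as the bound.
    abs_limit = next(v for k, v in bounds.items()
                     if k.startswith("max_abs_increase_"))
    pct = math.inf if baseline <= 0 else 100.0 * delta / baseline
    return pct > bounds["max_pct_increase"] and delta > abs_limit


tx_apply = resolve_bounds("span.tx.apply", "p50")  # falls back to defaults
print(is_regression(1.0, 5.0, tx_apply))    # False: +400% but only +4ms
print(is_regression(5.0, 25.0, tx_apply))   # True: +400% and +20ms

close = resolve_bounds("span.consensus.ledger_close", "p50")
print(is_regression(3000.0, 3100.0, close))  # False: +3.3% and +100ms
print(is_regression(3000.0, 3300.0, close))  # True: +10% and +300ms
```

Requiring both bounds means a one-bucket jump on a small span (1ms to 5ms) passes despite its large percentage, while the consensus overrides still flag a modest percentage shift once it is large in absolute terms.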