Threshold Algorithm Specification
- Status: Stable
- Version: 1.0
- Module: flameiq.core.thresholds, flameiq.core.comparator
This document specifies the threshold-based regression detection algorithm used by FlameIQ’s comparison engine. This is the default and primary regression detection method.
Overview
The threshold algorithm detects regressions by comparing the signed percentage change between a baseline and current metric value against a configured tolerance threshold.
It is:
- Deterministic: same input → same result, always
- Efficient: O(n) in the number of metrics
- Configurable: per-metric thresholds in flameiq.yaml
- Direction-aware: different regression directions for different metric types
Step 1 — Compute change percent
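For each metric present in both the baseline and current snapshot, the signed percentage change is computed as:
change_percent = (current − baseline) / baseline × 100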
Floating-point policy: The result is rounded to 4 decimal places using Python’s
built-in round() so that threshold comparisons are stable.
Zero baseline guard: If baseline == 0, the metric is skipped with
a warning. A percentage change relative to zero is undefined, and any
substitute value would be meaningless.
Examples:
| Baseline | Current | change_percent | Interpretation |
|---|---|---|---|
| 100.0 | 110.0 | +10.0000 | 10% increase |
| 100.0 | 90.0 | −10.0000 | 10% decrease |
| 100.0 | 100.0 | 0.0000 | No change |
| 3.0 | 4.0 | +33.3333 | 33.33% increase |
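Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the FlameIQ implementation: the function name `compute_change_percent` is hypothetical, and using the standard `warnings` module for the zero-baseline guard is an assumption, since this document does not say how the warning is emitted.

```python
import warnings
from typing import Optional


def compute_change_percent(baseline: float, current: float) -> Optional[float]:
    """Signed percentage change from baseline to current, rounded to 4 places.

    Returns None when the baseline is zero, after emitting a warning,
    so the caller can skip the metric.
    """
    if baseline == 0:
        # Assumption: the warning mechanism is the stdlib warnings module.
        warnings.warn("baseline is 0; metric skipped", stacklevel=2)
        return None
    return round((current - baseline) / baseline * 100.0, 4)
```

Called with the rows of the examples table, the sketch returns `10.0`, `-10.0`, `0.0`, and `33.3333`.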
Step 2 — Evaluate against threshold
The threshold evaluation depends on the metric type:
Higher-is-worse metrics
For metrics where an increase is a regression (latency, memory, CPU), a regression is flagged when change_percent > threshold.
Example: latency.p95 with a 10% threshold:
- change_percent = +12% → REGRESSION (12 > 10)
- change_percent = +10% → PASS (10 is not > 10; a change exactly at the threshold passes)
- change_percent = −5% → PASS (improvement)
Metrics in this group: latency.mean, latency.p50, latency.p95,
latency.p99, memory_mb, cpu_percent.
Lower-is-worse metrics
For metrics where a decrease is a regression (throughput), a regression is flagged when change_percent < −threshold.
Example: throughput with a 5% threshold:
- change_percent = −8% → REGRESSION (−8 < −5)
- change_percent = −5% → PASS (−5 is not < −5)
- change_percent = +20% → PASS (improvement)
Metrics in this group: throughput.
Unknown / custom metrics
For metrics not in either known group, the absolute deviation is compared against the threshold: a regression is flagged when |change_percent| > threshold.
This catches regressions in both directions for custom metrics.
Metrics in this group: All custom.* keys, and any metric not
explicitly listed in the above groups.
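The three cases in Step 2 could be combined into a single predicate along the following lines. The set and function names are hypothetical, and the sketch assumes the threshold is passed as a non-negative magnitude (for example `5.0` for a 5% throughput threshold). The strict comparison for custom metrics is inferred from the two known groups.

```python
HIGHER_IS_WORSE = frozenset({
    "latency.mean", "latency.p50", "latency.p95", "latency.p99",
    "memory_mb", "cpu_percent",
})
LOWER_IS_WORSE = frozenset({"throughput"})


def is_regression(metric: str, change_percent: float, threshold: float) -> bool:
    """Return True when change_percent crosses the threshold for this metric."""
    if metric in HIGHER_IS_WORSE:
        # An increase is bad; exactly at the threshold still passes.
        return change_percent > threshold
    if metric in LOWER_IS_WORSE:
        # A decrease is bad; compare against the negated threshold.
        return change_percent < -threshold
    # Unknown and custom.* metrics: a deviation in either direction counts.
    return abs(change_percent) > threshold
```

For example, `is_regression("latency.p95", 12.0, 10.0)` is `True`, while `is_regression("throughput", -5.0, 5.0)` is `False` because the change sits exactly at the threshold.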
Step 3 — Warning detection
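A metric in the WARNING state has not crossed its threshold but is approaching it. The warning margin is configurable (default: 5 percentage points). For a higher-is-worse metric, a warning is raised when change_percent ≥ threshold − margin and the metric is not a regression.
Example with threshold = 10% and margin = 5 percentage points:
- change_percent = +9% → WARNING (9 ≥ 10 − 5 = 5, not yet a regression)
- change_percent = +4% → PASS (4 < 5)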
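A rough sketch of how the warning band might sit below the regression check is shown here. The function name `classify` and the `worsening` parameter are hypothetical. The sketch assumes the change has already been expressed in the regression direction, and applying the same margin to lower-is-worse and custom metrics is an assumption, since the example above only covers the higher-is-worse case.

```python
def classify(worsening: float, threshold: float, margin: float = 5.0) -> str:
    """Classify a change that is already expressed in the regression direction.

    Assumption of this sketch: `worsening` is change_percent for
    higher-is-worse metrics, -change_percent for lower-is-worse metrics,
    and abs(change_percent) for custom metrics.
    """
    if worsening > threshold:
        return "REGRESSION"
    if worsening >= threshold - margin:
        # Inside the margin below the threshold, but not a regression.
        return "WARNING"
    return "PASS"
```

With a threshold of 10 and the default margin, `classify(9.0, 10.0)` gives `"WARNING"` and `classify(4.0, 10.0)` gives `"PASS"`, matching the example.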
Threshold format
Thresholds in flameiq.yaml are percent strings:
"10%" → float 10.0
"-5%" → float -5.0
"2.5%" → float 2.5
The regex used for parsing:
^(-?\d+(?:\.\d+)?)%$
Numeric values (int or float) are also accepted directly
(useful when building threshold configs programmatically).
Default threshold
If no threshold is configured for a metric, the default is applied:
DEFAULT_THRESHOLD_PERCENT = 10.0
This means any metric with no explicit configuration tolerates a change of up to 10% in its regression direction (in either direction for unknown and custom metrics) before triggering a regression.
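The fallback can be pictured as a dictionary lookup with a default. The helper `threshold_for` and the shape of the `configured` mapping are hypothetical. Only the constant name and its value are taken from this document.

```python
from typing import Mapping

DEFAULT_THRESHOLD_PERCENT = 10.0


def threshold_for(metric: str, configured: Mapping[str, float]) -> float:
    """Return the configured threshold for a metric, or the default."""
    return configured.get(metric, DEFAULT_THRESHOLD_PERCENT)
```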
Overall status
The comparison result status is determined as:
- REGRESSION if at least one metric is flagged as a regression
- PASS otherwise
Note
The current implementation does not set status to WARNING at the
result level — individual MetricDiff.is_warning flags are set,
but the result status is either PASS or REGRESSION. This matches
the CI exit code semantics (warnings do not fail the build).
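The result-level rule described in the note could be sketched like this. `MetricDiffSketch` is a hypothetical stand-in for the real MetricDiff class: the `is_warning` flag is mentioned in this document, while the `name` and `is_regression` fields are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MetricDiffSketch:
    # Hypothetical stand-in; only is_warning is named in this document.
    name: str
    is_regression: bool = False
    is_warning: bool = False


def overall_status(diffs: Iterable[MetricDiffSketch]) -> str:
    """REGRESSION if any metric regressed, otherwise PASS.

    Warnings are deliberately ignored at the result level.
    """
    return "REGRESSION" if any(d.is_regression for d in diffs) else "PASS"
```

A result made up only of warning-level diffs therefore still reports `"PASS"`, which is consistent with warnings not failing the build.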