Baseline Strategies
A baseline strategy determines which historical snapshot is used as the reference point when comparing a new run. FlameIQ v1.0 provides three strategies, each suited to different workflows.
Overview
| Strategy | How it selects the baseline | Best for |
|---|---|---|
| last_successful | Most recently stored snapshot | Simple projects, low-noise CI |
| rolling_median | Median of the last N snapshots | Shared CI runners, noisy benchmarks |
| tagged | Snapshot with a specific release tag | Release-to-release comparisons |
last_successful
Uses the single most recent snapshot saved via flameiq baseline set.
This is the default and simplest strategy. On every merge to main,
the baseline advances to the latest measurements.
baseline:
  strategy: last_successful
Workflow:
main branch:
  commit A → flameiq baseline set → baseline = A
  commit B → flameiq baseline set → baseline = B
  ...
PR branch:
  current commit → flameiq compare → compared against B (latest)
Characteristics:
- Deterministic: same baseline file → same result
- Sensitive to one-off performance spikes on main
- Requires flameiq baseline set to be run on every main commit
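The sensitivity to one-off spikes can be illustrated with a minimal Python sketch. The snapshot layout (a dict with `commit` and `metrics` keys) and the helper name `select_last_successful` are assumptions made for illustration; they are not FlameIQ's storage format or API.

```python
# Hypothetical snapshot layout, oldest first; not FlameIQ's actual format.
history = [
    {"commit": "A", "metrics": {"latency.p95": 100.0}},
    {"commit": "B", "metrics": {"latency.p95": 180.0}},  # one-off spike on main
]


def select_last_successful(history):
    """Return the most recently stored snapshot (illustrative helper)."""
    if not history:
        raise ValueError("no baseline snapshots stored")
    return history[-1]


baseline = select_last_successful(history)
current = {"latency.p95": 130.0}  # a PR that is 30% slower than commit A

# Relative change measured against the spiked baseline.
change = current["latency.p95"] / baseline["metrics"]["latency.p95"] - 1.0
print(f"baseline={baseline['commit']} change={change:+.1%}")
```

Because commit B happened to record a spike, the slower PR appears to be an improvement. This is the kind of false pass that rolling_median is meant to reduce.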
rolling_median
Computes a synthetic baseline from the median values across the last N snapshots. This filters out one-off measurement noise that can cause false regressions or false passes.
baseline:
  strategy: rolling_median
  rolling_window: 5
How the synthetic baseline is computed:
For each metric key (e.g. latency.p95), FlameIQ collects the values
from the last N stored snapshots and uses their median as the baseline value for that key.
The synthetic snapshot uses the most recent snapshot’s metadata
(commit, branch, tags) and is marked with
tags["flameiq_synthetic"] = "rolling_median".
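The computation described above can be sketched in Python as follows. This is not FlameIQ's implementation: the snapshot layout, the function name `rolling_median_baseline`, and the handling of metric keys missing from some snapshots (they are skipped) are all assumptions.

```python
from statistics import median


def rolling_median_baseline(history, window=5):
    """Build a synthetic baseline from the last `window` snapshots.

    `history` is ordered oldest to newest. The dict layout is hypothetical.
    """
    if not history:
        raise ValueError("no snapshots stored")
    recent = history[-window:]  # shorter than `window` while history is young
    latest = recent[-1]
    keys = set().union(*(s["metrics"] for s in recent))
    metrics = {}
    for key in keys:
        # Assumption: snapshots that lack a key are skipped for that key.
        values = [s["metrics"][key] for s in recent if key in s["metrics"]]
        metrics[key] = median(values)
    return {
        # Metadata is copied from the most recent snapshot.
        "commit": latest["commit"],
        "branch": latest["branch"],
        "tags": {**latest["tags"], "flameiq_synthetic": "rolling_median"},
        "metrics": metrics,
    }


runs = [("a1", 100.0), ("b2", 102.0), ("c3", 180.0), ("d4", 101.0), ("e5", 99.0)]
history = [
    {"commit": c, "branch": "main", "tags": {}, "metrics": {"latency.p95": v}}
    for c, v in runs
]
synthetic = rolling_median_baseline(history, window=5)
print(synthetic["commit"], synthetic["metrics"]["latency.p95"])
```

In this sample the 180.0 outlier from commit c3 does not affect the result: the synthetic baseline for latency.p95 is 101.0, the middle of the five sorted values.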
Choosing a window size:
| Window | Guidance |
|---|---|
| 3 | Minimal smoothing. Responsive to real changes. |
| 5 (default) | Balanced. Recommended for most projects. |
| 10 | Heavy smoothing. Good for very noisy CI. |
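The trade-off between smoothing and responsiveness can be made concrete with a small Python experiment. It assumes a metric that sits at 100.0 and then genuinely shifts to 120.0, and counts how many runs at the new level are needed before the median over the window equals the new value. The function name and the sample values are illustrative only.

```python
from statistics import median


def runs_until_adopted(window, old=100.0, new=120.0):
    """Count runs at the new level before the rolling median equals it."""
    series = [old] * window  # a full window at the old level
    for runs in range(1, window + 1):
        series.append(new)
        if median(series[-window:]) == new:
            return runs
    return window


for window in (3, 5, 10):
    print(window, runs_until_adopted(window))
```

Under these assumptions a window of 3 adopts the new level after 2 runs, a window of 5 after 3 runs, and a window of 10 after 6 runs. Larger windows therefore take longer to reflect a real change in the baseline.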
Characteristics:
- Robust to single outlier runs once the window holds at least three snapshots
- Requires at least N prior runs before becoming fully effective
- The first run after flameiq init uses only 1 snapshot, regardless of the configured rolling_window
tagged
Uses a snapshot explicitly tagged with a label such as "v1.0.0".
All subsequent comparisons are made against that fixed point, regardless
of how many other baselines have been set in between.
Tag a release:
# After your v1.0.0 release:
flameiq baseline set \
  --metrics release_v1.0.0.json \
  --tag v1.0.0
Compare PRs against v1.0.0:
baseline:
  strategy: tagged
  # FlameIQ will search history for any snapshot tagged "v1.0.0"
Characteristics:
- Pinned reference: comparisons are always against the same baseline
- Ideal for main development comparing against a release tag
- Requires --tag to have been used when setting the baseline
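The lookup for a pinned baseline can be sketched in Python as below. How FlameIQ records the value passed to --tag is not shown here, so the `tags["release"]` key is a hypothetical layout, as is the helper name `select_tagged`. Returning the newest match when several snapshots carry the same tag is also an assumption.

```python
def select_tagged(history, tag):
    """Return the newest snapshot carrying `tag` (illustrative helper)."""
    for snapshot in reversed(history):  # history is ordered oldest to newest
        # Assumption: the release tag is stored under tags["release"].
        if snapshot["tags"].get("release") == tag:
            return snapshot
    raise LookupError(f"no snapshot tagged {tag!r}")


history = [
    {"commit": "r1", "tags": {"release": "v1.0.0"}, "metrics": {"latency.p95": 100.0}},
    {"commit": "m1", "tags": {}, "metrics": {"latency.p95": 104.0}},
    {"commit": "m2", "tags": {}, "metrics": {"latency.p95": 109.0}},
]

baseline = select_tagged(history, "v1.0.0")
print(baseline["commit"], baseline["metrics"]["latency.p95"])
```

Later untagged baselines (m1, m2) do not move the reference point, so the comparison stays pinned to r1.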
Switching strategies
You can switch strategies at any time by editing flameiq.yaml. The
strategy only affects how flameiq compare selects the baseline — all
stored history is retained.
# Change strategy by editing flameiq.yaml, then re-run comparison:
flameiq compare --metrics current.json --fail-on-regression
The new strategy takes effect immediately.
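The following Python sketch illustrates why switching is non-destructive: the strategy only changes how a reference value is picked from the same stored history. The config dict mirrors the baseline section of flameiq.yaml; the function name and snapshot layout are assumptions. The tagged strategy is left out because this sketch does not assume how the tag is specified in the config.

```python
from statistics import median


def baseline_value(config, history, key):
    """Pick the reference value for `key` according to the configured strategy."""
    section = config["baseline"]
    strategy = section["strategy"]
    if strategy == "last_successful":
        return history[-1]["metrics"][key]
    if strategy == "rolling_median":
        window = section.get("rolling_window", 5)
        return median(s["metrics"][key] for s in history[-window:])
    raise ValueError(f"strategy not covered by this sketch: {strategy}")


# Hypothetical stored history, oldest first.
history = [
    {"metrics": {"latency.p95": 100.0}},
    {"metrics": {"latency.p95": 102.0}},
    {"metrics": {"latency.p95": 180.0}},
]
before = [dict(s["metrics"]) for s in history]

last = baseline_value({"baseline": {"strategy": "last_successful"}}, history, "latency.p95")
rolling = baseline_value(
    {"baseline": {"strategy": "rolling_median", "rolling_window": 5}},
    history,
    "latency.p95",
)
print(last, rolling)
```

Both calls read the same three snapshots and leave them unchanged. Only the selected reference differs: 180.0 for last_successful and 102.0 for rolling_median.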