Understanding the Heatseeker Metrics

Spot Winning Variants and Segments with Confidence

Written by Razii Abraham
Updated over 2 months ago

When you run experiments, clicks and impressions only tell part of the story. They don’t show true buyer intent or how reliable your results are.

Heatseeker goes deeper. We measure real buyer behavior—and tell you what’s working, how strongly it’s working, and how much you can trust it.

Our four core metrics are:

  • Score — how much buyers are engaging

  • Precision — how consistent the data is

  • Uplift — how much better or worse something performs

  • Confidence — how sure you can be that the difference is real

Let's walk through each one—with examples.

1. Buyer Engagement Score (“Score”)

Score captures how much meaningful engagement your test variants are driving.
It rolls up actions like leads, form opens, clicks, and social interactions into a single number, normalized per 1,000 impressions.

Not all actions are equal—actions that show stronger buying intent are weighted more heavily.

Example:
A Score of 21 would mean 21 clicks per 1,000 impressions if clicks were the only action. But usually it’s a mix: form opens, leads, clicks—all weighted by intent.

💡 A higher Score means stronger buyer interest.
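As a mental model, the rollup above can be sketched as a weighted sum of actions per 1,000 impressions. The action names and weights below are illustrative assumptions; the article doesn't publish Heatseeker's actual weighting.

```python
# Illustrative sketch of an intent-weighted Score, normalized per 1,000
# impressions. These weights are assumptions, not Heatseeker's actual values.
INTENT_WEIGHTS = {"lead": 5.0, "form_open": 3.0, "click": 1.0, "social": 0.5}

def engagement_score(actions: dict, impressions: int) -> float:
    """Return intent-weighted actions per 1,000 impressions."""
    weighted = sum(INTENT_WEIGHTS[name] * count for name, count in actions.items())
    return round(weighted / impressions * 1000, 1)

# Clicks-only case from the article: 21 clicks per 1,000 impressions -> Score 21.
print(engagement_score({"click": 21}, impressions=1000))   # 21.0

# More typical mix: leads and form opens count for more than raw clicks.
print(engagement_score({"lead": 2, "form_open": 3, "click": 15, "social": 8},
                       impressions=2000))                  # 19.0
```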

2. Uplift

Uplift shows how much better or worse a variant is compared to the baseline.

  • Positive Uplift = outperforming

  • Negative Uplift = underperforming

By default, the baseline is the average Score across all variants. You can set a different baseline for your experiment using the dropdown at the top right of your results screen.

Example:
A +63% Uplift means the variant is performing 63% better than the baseline.

💡 Uplift tells you the size of the difference between a variant's Score and the baseline.
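The arithmetic is straightforward. A minimal sketch using the default baseline (the average Score across variants, per the article); the scores themselves are made up:

```python
def uplift(score: float, baseline: float) -> float:
    """Percent difference between a variant's Score and the baseline."""
    return (score - baseline) / baseline * 100

# Default baseline: the average Score across all variants.
scores = [21.0, 9.0, 12.0]          # hypothetical variant Scores
baseline = sum(scores) / len(scores)  # 14.0
print(f"{uplift(scores[0], baseline):+.0f}%")   # +50%
```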

3. Precision

Precision tells you how consistent your results are.

  • High Precision = The data is stable across impressions.

  • Low Precision = The data is noisy and could change with more volume.

| Precision Level | What it means |
| --- | --- |
| Very High | Use this result confidently for decision-making. |
| High | Use this result confidently for decision-making. |
| Moderate | Usable but could be improved. Consider additional validation for higher certainty. |
| Low | Too much variability. Results are not actionable; collect more data or refine the test. |

Example:
A Score of 21 with Moderate Precision means engagement looks good, but more impressions would make the result more reliable.

💡 Precision tells you how stable your Score and Uplift numbers are.
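The article doesn't specify how Precision is computed. One plausible proxy is the standard error of the engagement rate relative to the rate itself, bucketed into levels; the thresholds below are illustrative assumptions, not Heatseeker's.

```python
import math

def precision_level(weighted_actions: float, impressions: int) -> str:
    """Bucket the stability of a per-impression rate by its relative
    standard error. A stand-in for Heatseeker's (unpublished) Precision
    calculation; thresholds are illustrative."""
    p = weighted_actions / impressions
    if p <= 0:
        return "Low"
    rel_se = math.sqrt(p * (1 - p) / impressions) / p
    if rel_se < 0.05:
        return "Very High"
    if rel_se < 0.10:
        return "High"
    if rel_se < 0.20:
        return "Moderate"
    return "Low"

# Same engagement rate, ten times the impressions: the level improves.
print(precision_level(42, 2000))     # Moderate
print(precision_level(420, 20000))   # Very High
```

This matches the example above: a good-looking Score can still carry Moderate Precision simply because the impression count is modest.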

4. Confidence

Confidence tells you whether the difference between a variant and the baseline is real—or just random chance.

  • High Confidence = Very likely real

  • Low Confidence = Could just be noise

| Confidence Level | Meaning |
| --- | --- |
| Very High (95%+) | Almost certain the uplift is real. |
| High (85–95%) | Likely real. |
| Medium (75–85%) | Directional. Use with caution. |
| Low (50–75%) | Could easily be random. |

Example:
A Confidence of 94% means there’s a very strong chance the uplift you’re seeing is real.

💡 Confidence tells you whether the variant really beats the baseline. It doesn't say by how much; that's what Uplift (combined with Precision) is for.
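The article doesn't name the statistical test behind Confidence. A pooled two-proportion z-test is a common stand-in and gives the right intuition, so here is a sketch under that assumption (all counts hypothetical):

```python
import math

def confidence_pct(var_hits: int, var_n: int, base_hits: int, base_n: int) -> float:
    """One-sided probability (%) that the variant's engagement rate truly
    exceeds the baseline's, via a pooled two-proportion z-test. An
    illustrative stand-in, not Heatseeker's published method."""
    p1, p2 = var_hits / var_n, base_hits / base_n
    pooled = (var_hits + base_hits) / (var_n + base_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / var_n + 1 / base_n))
    z = (p1 - p2) / se
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))   # normal CDF

# Variant at 25 engagements per 1,000 vs baseline at 15 per 1,000,
# with 2,000 impressions each.
print(round(confidence_pct(50, 2000, 30, 2000), 1))
```

The same rate gap on a smaller sample would yield a lower percentage, which is why a large Uplift alone is not proof of a winner.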

Precision vs Confidence

| | Precision | Confidence |
| --- | --- | --- |
| What it tells you | How consistent the results are across impressions. | How likely it is that one variant truly outperforms another. |
| If low... | Results might swing if you collect more data. | The difference might not be real. |
| In plain English | "Is my data stable?" | "Is my winner really a winner?" |

Simple rule of thumb:

  • Precision = Trust the quality of the data.

  • Confidence = Trust the reality of the winner.

Real Example: Reading a Result

Let’s put it all together with this real result:

Message Variant: "Same-day or next-day delivery guaranteed."

| Metric | Result | Interpretation |
| --- | --- | --- |
| Score | 21 | Solid engagement. |
| Precision | Moderate | Some noise; extend test if critical. |
| Uplift | +63% | Variant strongly outperforms baseline. |
| Confidence | High (94%) | Very likely a real winner. |

Bottom Line:
This message is outperforming and resonating with buyers. It’s a strong result—but because Precision is only Moderate, you might choose to gather a bit more data before scaling fully.

Metrics Applied to Segments, Too

Heatseeker applies Score, Precision, Uplift, and Confidence not just to variants—but to audience segments too:

  • By region

  • By age group

  • By job title

  • By custom combinations

No matter where you drill down, you get the same consistent, comparable insights.
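Because the metrics are rate-based, the same Score formula applies to any slice of impressions. A stdlib-only sketch of a per-segment rollup (the rows, segment names, and counts are hypothetical):

```python
from collections import defaultdict

# Hypothetical event rows: (segment, weighted_actions, impressions)
rows = [
    ("EMEA", 30, 1500), ("EMEA", 12, 500),
    ("APAC", 18, 2000), ("APAC", 10, 1000),
]

totals = defaultdict(lambda: [0.0, 0])
for segment, actions, impressions in rows:
    totals[segment][0] += actions
    totals[segment][1] += impressions

# Same Score definition, applied per segment:
# weighted actions per 1,000 impressions.
segment_scores = {seg: round(a / n * 1000, 1) for seg, (a, n) in totals.items()}
print(segment_scores)   # {'EMEA': 21.0, 'APAC': 9.3}
```

Because every segment is scored on the same per-1,000 scale, segments of very different sizes stay directly comparable.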

Questions?

Send us a note in Intercom—we’re happy to help interpret your results or plan your next experiment.
