In SEO, you tracked your average position and your share of voice. Now that AI engines answer directly instead of returning ten links, a new metric emerges: your share of answers, meaning how often you, rather than a competitor, get cited in their replies. Here's how to measure it without fooling yourself.
A metric for a world without ten blue links
Share of voice measured your presence in search results; share of answers measures your presence in generative replies. The principle is the same — what fraction of attention do you capture? — but the surface has changed: it's no longer a page of links, it's a prose answer citing two or three sources.
How it's measured, concretely
The method is hands-on but robust: define a set of real buyer questions, put each one to the engines several times, and count who gets cited. Share of answers is the fraction of those runs in which you are cited.
- List the prompts: the real questions your customers ask ("best fragrance-free SPF 50 cream"…).
- Query each engine: Claude, ChatGPT, Gemini, Perplexity, Google AI.
- Repeat the runs: several executions per question, because the answer varies.
- Detect citations: does your brand/product appear, and competitors'?
- Compute the share: the number of runs that cite you, divided by the total number of runs.
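The detection and counting steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the answers have already been collected as plain text (no engine API is called here), a citation is detected with a simple case-insensitive whole-word match, and the share is read as "runs that cite the brand, over total runs". The function names and the brands `Acme` and `Solara` are hypothetical.

```python
import re
from collections import Counter


def detect_citations(answer: str, brands: list[str]) -> set[str]:
    """Return the brands mentioned in one answer (case-insensitive, whole word)."""
    found = set()
    for brand in brands:
        if re.search(rf"\b{re.escape(brand)}\b", answer, flags=re.IGNORECASE):
            found.add(brand)
    return found


def share_of_answers(answers: list[str], brands: list[str]) -> dict[str, float]:
    """Fraction of answers that cite each brand at least once."""
    if not answers:
        raise ValueError("no answers collected")
    counts = Counter()
    for answer in answers:
        # A brand counts once per answer, however often it is repeated.
        counts.update(detect_citations(answer, brands))
    return {brand: counts[brand] / len(answers) for brand in brands}


# Hypothetical answers from 5 runs of one prompt on one engine.
answers = [
    "For sensitive skin, Acme Sun SPF 50 and Solara Mineral are good picks.",
    "Solara Mineral is a popular fragrance-free option.",
    "Try Acme Sun SPF 50; it has no added fragrance.",
    "Many dermatologists suggest a mineral sunscreen without perfume.",
    "Acme Sun SPF 50 is frequently recommended.",
]
shares = share_of_answers(answers, ["Acme", "Solara"])
print(shares)  # {'Acme': 0.6, 'Solara': 0.4}
```

Plain string matching will miss misspellings and product names that omit the brand, so real detection likely needs a list of aliases per brand.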
Why it's harder than a Google ranking
- The answer isn't deterministic: the same question can yield different citations from one run to the next.
- No ranking: there is no fixed position to read off; you must interpret a prose answer.
- Engines differ: a product cited by one may be ignored by another.
- Phrasing matters: rewording the question sometimes changes the answer.
Variability
This is the main difficulty. Because generation is probabilistic, a single run proves nothing. You need multiple runs per question to get a stable frequency, not a random snapshot.
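To get a feel for how many runs "stable" requires, one option is to put an uncertainty range around the observed citation rate. The sketch below uses the Wilson score interval, a standard interval for a proportion. It is an illustration chosen here, not a method the article prescribes, and it assumes that runs are independent of one another.

```python
import math


def wilson_interval(cited: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a citation rate (z=1.96 is roughly 95% confidence)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= cited <= runs:
        raise ValueError("cited must be between 0 and runs")
    p = cited / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Same observed rate (80%), very different certainty.
for cited, runs in [(4, 5), (40, 50)]:
    low, high = wilson_interval(cited, runs)
    print(f"{cited}/{runs}: {low:.2f} to {high:.2f}")
```

Cited in 4 runs out of 5, the plausible range runs from roughly 0.38 to 0.96; at 40 out of 50 it narrows to roughly 0.67 to 0.89. The headline figure is 80% in both cases, but only the second says much.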
The per-engine breakdown
Aggregating all engines into a single number hides what matters. You can be heavily cited by Perplexity and absent from Gemini. A useful share of answers reads engine by engine — that's where you see where to act.
{
  "prompt": "best fragrance-free SPF 50 sunscreen",
  "runs": 5,
  "engines": {
    "claude": { "cited": 4, "share": 0.80 },
    "chatgpt": { "cited": 2, "share": 0.40 },
    "perplexity": { "cited": 5, "share": 1.00 },
    "gemini": { "cited": 1, "share": 0.20 },
    "google_ai": { "cited": 3, "share": 0.60 }
  }
}
On this question, you dominate Perplexity but are nearly invisible on Gemini. Without this breakdown, an average figure ("60%") would hide exactly the engine where you have work to do.
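The arithmetic behind that example is simple enough to check in a few lines of Python. The sketch below recomputes each engine's share from the citation counts above, then the unweighted average across engines; the variable names are illustrative.

```python
from statistics import mean

runs = 5
# Citation counts per engine, taken from the example above.
cited = {"claude": 4, "chatgpt": 2, "perplexity": 5, "gemini": 1, "google_ai": 3}

# Per-engine share: runs that cite you, over runs executed on that engine.
shares = {engine: count / runs for engine, count in cited.items()}

overall = mean(shares.values())        # the single figure that hides the spread
weakest = min(shares, key=shares.get)  # the engine where work is needed
strongest = max(shares, key=shares.get)

print(f"overall: {overall:.2f}")
for engine, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{engine:<11}{share:.2f}")
print(f"weakest: {weakest}, strongest: {strongest}")
```

Sorting from lowest to highest share puts the engine that needs attention at the top of the report.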
The trap: measuring on too little data
A share of answers computed on 3 questions and 1 run isn't a measurement, it's noise. For an honest readout, you need a floor: a minimum number of prompts, runs, and engines below which you show a plain "insufficient data" instead of a percentage. No number beats a wrong number.
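Such a floor is easy to enforce in the reporting code. The sketch below is one possible shape for it. The three thresholds are hypothetical placeholders, since the article gives no specific numbers, and the `Sample` class and `report` function are illustrative names.

```python
from dataclasses import dataclass

# Hypothetical floors: placeholders to tune against your own data.
MIN_PROMPTS = 20
MIN_RUNS_PER_PROMPT = 5
MIN_ENGINES = 3


@dataclass
class Sample:
    prompts: int
    runs_per_prompt: int
    engines: int
    cited_runs: int  # runs in which you were cited

    @property
    def total_runs(self) -> int:
        return self.prompts * self.runs_per_prompt * self.engines


def report(sample: Sample) -> str:
    """Return a percentage only when every floor is met."""
    enough = (
        sample.prompts >= MIN_PROMPTS
        and sample.runs_per_prompt >= MIN_RUNS_PER_PROMPT
        and sample.engines >= MIN_ENGINES
    )
    if not enough:
        return "insufficient data"
    return f"{sample.cited_runs / sample.total_runs:.0%}"


thin = Sample(prompts=3, runs_per_prompt=1, engines=5, cited_runs=9)
solid = Sample(prompts=25, runs_per_prompt=5, engines=5, cited_runs=250)
print(report(thin))   # insufficient data
print(report(solid))  # 40%
```

The thin sample would otherwise display a confident-looking 60% (9 of 15 runs), which is exactly the number the floor is meant to suppress.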
What to do with the result
- Spot your absences on the questions that truly matter.
- Fix structure and attributes where you're missing.
- Re-measure after a few weeks to see movement.
- Track per engine over time, rather than a frozen global figure.
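Tracking per engine over time comes down to comparing two snapshots. In the sketch below, the `before` shares are those of the sunscreen example earlier in the article, while the `after` shares are invented purely for illustration.

```python
# Shares from the earlier example.
before = {"claude": 0.80, "chatgpt": 0.40, "perplexity": 1.00, "gemini": 0.20, "google_ai": 0.60}
# Hypothetical re-measurement a few weeks later (invented values).
after = {"claude": 0.80, "chatgpt": 0.40, "perplexity": 1.00, "gemini": 0.60, "google_ai": 0.40}

# Change in share per engine, rounded to hide floating-point residue.
deltas = {engine: round(after[engine] - before[engine], 2) for engine in before}

for engine, delta in deltas.items():
    if delta != 0:
        print(f"{engine}: {before[engine]:.2f} -> {after[engine]:.2f} ({delta:+.2f})")
```

With only 5 runs per engine, a shift of 0.20 is a single run, so a movement of that size may be noise rather than progress. Larger run counts are needed before reading a trend into it.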
What isn't measured honestly can't be steered. And a bad number is worse than no number.