Run Book · Measurement

Overview

What this part delivers, and why

Most content teams measure only output — how much they published. This run book sets up something better: a scorecard that measures the whole system — the foundations that make content work (leading indicators we can move this quarter) and the payoff they produce (lagging indicators the business cares about). We choose the right few metrics, capture a baseline, wire up the tooling, and hand over a live dashboard and a cadence that survives after we leave.

The five steps at a glance

1 · Choose the metric set — pick the right few across five categories; don't measure everything.
2 · Baseline the current state — capture starting numbers so improvement is provable.
3 · Instrument the tooling — analytics, automated quality scoring, AI Share-of-Voice sampling, and the "Can You Tell?" test.
4 · Build the dashboard & cadence — what's reviewed weekly / monthly / quarterly, and who sees it.
5 · Tie to ROI & the business case — connect content to pipeline, not just clicks.

The five metric categories

Efficiency & throughput (leading) — time-to-publish, production velocity, reuse rate, revision cycles, cost per asset.
Quality & trust (leading) — automated quality score, accessibility, factual accuracy, brand-voice conformance, the "Can You Tell?" pass rate.
Audience & business outcomes (lagging) — engagement, conversion, pipeline influenced, organic + AI-referred traffic, content ROI.
AI visibility (lagging) — AI Share of Voice (entity & citation), AI referral traffic, citations per engine, presence on cited sources.
Capability & maturity (leading) — the Part 1 maturity score re-run quarterly, % of content AI-ready, governance compliance, content-debt trend.

Measure the system, not just the output. How much you published is the easiest number to game and the least worth knowing.

the principle the whole scorecard is built on

The live scorecard · illustrative snapshot

A worked example of the dashboard this run book stands up — leading indicators (teal) feeding lagging outcomes (purple). Numbers are illustrative.

Time-to-publish (days): ▼ 34% vs baseline 9.4
Quality score (/100): ▲ target 80+ cleared
AI Share of Voice: ▲ +7pt vs named competitor
Pipeline influenced (£M): ▲ +£0.9M this quarter

Leading → lagging

Leading indicators are the foundations you can move this quarter; lagging indicators are the business payoff that follows. The scorecard tracks both so you can act before the lagging number is already late.

● Leading · move now

Efficiency & throughput: time-to-publish, velocity, reuse
Quality & trust: quality score, "Can You Tell?", accuracy
Capability & maturity: maturity level, % AI-ready, content debt

● Lagging · the payoff

Audience & conversion: engagement, conversion rate
AI visibility: Share of Voice, citations, AI referrals
Content ROI & pipeline: pipeline influenced, value ÷ cost

Metric tree: activities → outputs → outcomes

Every number on the scorecard traces back to an activity the team controls. Read top-down: what we do produces what we ship, which moves what the business cares about.

Activities (what the team does)

Engineer & structure content
Run quality scoring
Sample AI Share of Voice
Re-run "Can You Tell?"

Outputs (what gets shipped)

Published assets at quality bar
AI-ready structured content
Reuse across locales
Scored quality & SoV readings

Outcomes (what the business gets)

Pipeline influenced
AI Share of Voice up vs competitors
Lower cost per asset
Maturity +1 level / year

Choose the metric set

Week 1

Timebox · ~1.5 days · Lead: Consultant · Format: 2-hr selection workshop

Objective

Pick the right handful of metrics across the five categories — enough to measure the system, few enough that the client will actually maintain them. Tie each chosen metric to a decision someone makes, so nothing is collected for its own sake.

Workshop agenda (2 hrs)

Recap the five categories and why leading + lagging together (15m)
For each category, shortlist 2–4 candidate metrics against client goals (50m)
Pressure-test each: can we source it? does it drive a decision? (25m)
Cut the list — agree the minimal viable scorecard (20m)
Assign an owner and a target to each survivor (10m)

Inputs → outputs

Inputs

Client goals & the Part 1 findings
The five-category metric menu
What the board already asks about

Outputs

Agreed metric set (the minimal scorecard)
Owner + target per metric

◆ From the field

The single most useful thing you do in this part is kill vanity metrics. "Pageviews" and "assets published" feel like progress and steer nothing. Ask of every candidate: who changes what they do when this number moves? If nobody, strike it — a five-metric scorecard people read beats a thirty-metric one nobody opens.
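The pressure test can be sketched as a simple filter: a candidate survives only if it can be sourced and someone acts on it. This is a minimal illustration, not a prescribed tool; the `Candidate` fields and the example rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    source: Optional[str]    # where the number comes from; None if it cannot be sourced
    decision: Optional[str]  # who changes what when the number moves; None if nobody


def pressure_test(candidates):
    """Split candidates into survivors and struck metrics."""
    keep, strike = [], []
    for c in candidates:
        # A metric stays only if it has both a source and a decision attached.
        (keep if c.source and c.decision else strike).append(c)
    return keep, strike


# Hypothetical shortlist for illustration only.
candidates = [
    Candidate("Time-to-publish", "CMS / workflow tool", "Content ops rebalances the review queue"),
    Candidate("Pageviews", "Analytics", None),
    Candidate("Assets published", "CMS", None),
]
keep, strike = pressure_test(candidates)
print("keep:", [c.name for c in keep])
print("strike:", [c.name for c in strike])
```

Writing the decision down as text, rather than ticking a box, makes it harder for a vanity metric to slip through on a vague "it's useful".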

Baseline the current state

Week 1–2

Timebox · ~1.5 days · Lead: Analyst · Method: snapshot + documented method

Objective

Capture a starting number for every chosen metric — and write down exactly how it was measured — so that any improvement later is provable rather than asserted. No baseline, no proof.

Activities

Pull a current value for each metric from its agreed source
Record the measurement method beside each — so the next reading is comparable
Take the first AI Share-of-Voice reading as a baseline (sampled, not single-shot)
Run a first "Can You Tell?" panel to set the quality starting point
Note any metric you can't baseline yet — that gap is a Step 3 task

Inputs → outputs

Inputs

Agreed metric set (Step 1)
Analytics & CMS access

Outputs

Baseline snapshot (value + method per metric)
List of metrics not yet measurable
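One way to hold the two outputs above is a single list of records, where a missing value marks a gap. The sketch below assumes hypothetical field names; the 9.4-day figure is the illustrative baseline from the snapshot, and the method strings are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BaselineEntry:
    metric: str
    value: Optional[float]  # None means the metric could not be measured yet
    unit: str
    method: str             # how it was measured, so the next reading is comparable


def split_baseline(entries):
    """Return (measured entries, names of metrics still to instrument)."""
    measured = [e for e in entries if e.value is not None]
    gaps = [e.metric for e in entries if e.value is None]
    return measured, gaps


# Illustrative entries only.
entries = [
    BaselineEntry("Time-to-publish", 9.4, "days", "Idea approved to live, from CMS workflow timestamps"),
    BaselineEntry("AI Share of Voice", None, "%", "Sampling protocol not yet set up"),
]
measured, gaps = split_baseline(entries)
print("baselined:", [(e.metric, e.value, e.unit) for e in measured])
print("Step 3 tasks:", gaps)
```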

Instrument the tooling

Week 2–3

Timebox · ~2.5 days · Lead: Analyst + Martech · Method: wire each source once

Objective

Make every chosen metric collectible on a repeatable schedule without manual heroics — analytics tagged, automated quality scoring running, the AI Share-of-Voice sampling set up, and the "Can You Tell?" test ready to re-run.

Activities

Analytics — confirm tagging captures engagement, conversion and AI-referred traffic; segment AI referrers
Automated quality scoring — wire clarity / consistency / tone / compliance checks into the pipeline so quality is scored, not eyeballed
AI Share-of-Voice sampling — set up the prompt set and the sampling run (30+ samples per prompt per engine across ChatGPT, Google AI, Perplexity and Copilot), and capture entity vs citation results
"Can You Tell?" test — stand up the swipe panel, the snippet pool, and the scoring sheet so it re-runs each cycle
Document the run schedule and owner for each instrument

Inputs → outputs

Inputs

Metric set + baseline gaps
Analytics, CMS & SoV tooling access

Outputs

Instrumented, repeatable data sources
SoV sampling protocol live
"Can You Tell?" panel ready to re-run

▲ Watch out

Never treat a single AI Share-of-Voice reading as signal. AI answers vary run to run — the research puts month-to-month swing around 40–60% — so one query tells you nothing. Sample each prompt 30+ times across the engines and average it before anyone draws a conclusion, or you'll report noise as a trend.
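To see why one reading is noise, average the repeated samples and look at the spread. The sketch below uses simulated 0/1 readings (1 = the brand was cited in that answer); the `true_rate`, the seed and the use of a standard error are assumptions for illustration, not part of the protocol.

```python
import random
import statistics


def summarize(samples):
    """Mean and standard error of repeated 0/1 readings."""
    mean = statistics.fmean(samples)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, stderr


rng = random.Random(0)  # fixed seed so the simulation is repeatable
true_rate = 0.2         # hypothetical underlying citation rate

# Simulate 30 samples of the same prompt on one engine.
samples = [1 if rng.random() < true_rate else 0 for _ in range(30)]
mean, stderr = summarize(samples)

print(f"first single reading: {samples[0]}")
print(f"mean of {len(samples)} samples: {mean:.2f} (standard error {stderr:.2f})")
```

A single reading can only ever be 0 or 1, so on its own it says nothing about a rate. The averaged figure, reported with its spread, is the one to put on the scorecard.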

◆ From the field

Whatever can't be collected on a schedule won't survive past the engagement. If a metric needs someone to hand-pull a spreadsheet every Friday, it dies the first busy week. Automate it or cut it — a slightly cruder number that arrives every cycle beats a perfect one that stops.

Build the dashboard & cadence

Week 3

Timebox · ~1.5 days · Lead: Consultant + Analyst

Objective

Turn the instrumented metrics into one live scorecard, and define the review rhythm — what's looked at how often, and who sees it. The cadence is what makes the dashboard a habit rather than a one-off chart.

The cadence to set up

Weekly — efficiency & quality leading indicators, for the content team
Monthly — quality trend, business outcomes and AI visibility, for the content lead + sponsor
Quarterly — re-score maturity (the Part 1 instrument), review the full scorecard and targets, for the exec sponsor
Confirm who owns each review and where the dashboard lives
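The cadence above is small enough to keep as a config that the dashboard and the calendar both read from. A minimal sketch follows; the dictionary layout and the owner names are hypothetical, and only the rhythm itself comes from the list above.

```python
# Review rhythm as data: what is reviewed, and for whom.
CADENCE = {
    "weekly": {
        "audience": ["content team"],
        "reviews": ["efficiency", "quality"],
    },
    "monthly": {
        "audience": ["content lead", "sponsor"],
        "reviews": ["quality trend", "business outcomes", "AI visibility"],
    },
    "quarterly": {
        "audience": ["exec sponsor"],
        "reviews": ["maturity re-score", "full scorecard", "targets"],
    },
}


def unowned(owners):
    """Return the cadences that do not yet have a named owner."""
    return [cadence for cadence in CADENCE if not owners.get(cadence)]


# Hypothetical owner assignments; quarterly has been left unassigned.
owners = {"weekly": "Content ops", "monthly": "Content lead"}
print("reviews still needing an owner:", unowned(owners))
```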

Inputs → outputs

Inputs

Instrumented data sources (Step 3)
Baseline values (Step 2)

Outputs

Live scorecard / dashboard
Reporting cadence + named owners

Metric explorer

A worked example of the trend view in the live scorecard. Each metric is plotted over eight cycles against its target. Illustrative numbers.

Quality score

target 80 · cleared

"Can You Tell?" rate

≈50% = indistinguishable

% content AI-ready

target 75% by year-end

Progress to target: current value vs the agreed goal

Time-to-publish: now 6.2d · target 5.0d · 62% there
Content reuse rate: now 41% · target 60% · 68% there
AI Share of Voice: now 18% · target 28% · 64% there
Maturity level: now 3.1 · target 4.0 · 78% there
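For metrics where higher is better, the "% there" figure is consistent with current ÷ target. The sketch below reproduces that arithmetic for the three such metrics; it is an assumption about how the illustrative figures were derived. A lower-is-better metric such as time-to-publish needs a convention the source does not state, so it is left out.

```python
import math


def progress_to_target(current, target):
    """Share of the target reached, in whole percent (higher-is-better metrics only)."""
    return math.floor(100 * current / target + 0.5)  # round half up


# (current, target) pairs taken from the illustrative scorecard.
readings = {
    "Content reuse rate": (41, 60),
    "AI Share of Voice": (18, 28),
    "Maturity level": (3.1, 4.0),
}

progress = {name: progress_to_target(now, target) for name, (now, target) in readings.items()}
for name, pct in progress.items():
    print(f"{name}: {pct}% there")
```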

Tie to ROI & the business case

Week 3–4

Timebox · ~1.5 days · Lead: Consultant · Format: ROI model + readout

Objective

Connect the lagging outcomes to pipeline and revenue, not just clicks — so the scorecard answers the question the board actually asks. This is where measurement becomes a business case the sponsor can defend.

Activities

Map the chain from content → engagement → conversion → pipeline influenced → revenue
Attach efficiency gains — reuse and localisation savings cut cost per asset (structured reuse alone can cut translation cost ~30–50%)
Express ROI as value ÷ cost, with honest assumptions stated
Set the north star: move up one maturity level a year while AI Share of Voice climbs against named competitors
Package it into the readout for the sponsor
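The value ÷ cost step can be sketched with the assumptions held as named inputs, so they are stated rather than buried. Everything numeric below is hypothetical, and `win_rate` is an assumed way of converting influenced pipeline into value; the client's own attribution model may differ. The 30–50% range is the translation saving quoted above.

```python
from dataclasses import dataclass


@dataclass
class RoiAssumptions:
    pipeline_influenced: float  # from the CRM, in the client's currency
    win_rate: float             # assumed share of influenced pipeline that closes
    content_cost: float         # total content spend for the same period


def content_roi(a):
    """ROI expressed as value divided by cost."""
    value = a.pipeline_influenced * a.win_rate
    return value / a.content_cost


def translation_cost_range(current_cost, low_saving=0.30, high_saving=0.50):
    """Translation cost after structured reuse, as a (best case, worst case) pair."""
    return current_cost * (1 - high_saving), current_cost * (1 - low_saving)


# Hypothetical inputs for illustration only.
a = RoiAssumptions(pipeline_influenced=900_000, win_rate=0.25, content_cost=150_000)
roi = content_roi(a)
best, worst = translation_cost_range(40_000)

print(f"ROI: {roi:.1f}x on stated assumptions")
print(f"translation cost after reuse: {best:,.0f} to {worst:,.0f}")
```

Reporting the saving as a range, rather than a single midpoint, keeps the readout honest about how uncertain the efficiency gain is.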

Inputs → outputs

Inputs

Live scorecard + baseline
Pipeline / revenue data from the client

Outputs

Content-to-pipeline ROI model
Business-case readout

▲ Watch out

The "Can You Tell?" pass bar is ~50% — content the panel can't reliably tell apart from human writing has cleared the bar. But treat it as one signal, not proof of quality on its own. A piece can read as human and still be wrong, off-brand or useless; pair it with the factual-accuracy and brand-voice scores before you call anything good.

Roles & effort

RACI & effort summary

Who does what across the part. R Responsible · A Accountable · C Consulted · I Informed.

Activity	Sponsor	Content lead	Analytics / Martech	Lead consultant	Analyst
Choose the metric set	C	C	C	R	C
Baseline current state	I	C	C	A	R
Instrument tooling	I	I	R	A	R
Dashboard & cadence	C	C	C	R	R
Tie to ROI & business case	A	C	I	R	C

Week	Focus	Consultant days
Week 1	Choose metric set, start baselining	~2.5
Week 2	Finish baseline, instrument tooling	~3
Week 3	Dashboard & cadence, start ROI model	~2.5
Week 4	Business case, readout, handoff	~1.5

Templates & worksheets

The artifacts you use and leave behind

Four core templates are spelled out below; the full set produced in this part is indexed at the end.

Template 1 · Metric-definitions sheet

One row per metric — so it stays measurable

Metric	Formula / definition	Source	Cadence	Owner	Target
Time-to-publish	Days from idea approved to live	CMS / workflow tool	Weekly	Content ops	↓
Content reuse rate	Reused components ÷ total components × 100	CMS	Monthly	Content lead	↑
Automated quality score	Clarity + consistency + tone + compliance, scored	Quality tool	Weekly	Analyst	80+
"Can You Tell?" rate	% panel guesses correct (≈50% = pass)	Swipe panel	Quarterly	Content lead	≈50%
Pipeline influenced	Revenue of deals content touched	CRM / analytics	Monthly	Sponsor	↑
AI Share of Voice	Brand citations ÷ total category citations × 100	SoV tool	Monthly	Analyst	↑ vs comp.
Maturity level	Part 1 scorecard average, 1–5	Quarterly re-score	Quarterly	Consultant	+1 / yr

Keep the formula and source explicit — it's the only way a later reading is comparable to the baseline. One owner per row, always.
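The sheet's two rules (explicit formula and source, one owner per row) can be checked mechanically. The sketch below is one hypothetical way to do it; the `MetricDefinition` class, the `problems` helper and the rule that "+" or "," signals shared ownership are all assumptions for illustration. It also shows the reuse-rate formula from the table.

```python
from dataclasses import dataclass


@dataclass
class MetricDefinition:
    metric: str
    formula: str
    source: str
    cadence: str
    owner: str
    target: str


def problems(row):
    """List what is missing or ambiguous in one row of the sheet."""
    issues = [f for f in ("formula", "source", "cadence", "target") if not getattr(row, f).strip()]
    # Treat "A + B" or "A, B" as shared ownership, which the sheet forbids.
    owners = [o for o in row.owner.replace("+", ",").split(",") if o.strip()]
    if len(owners) != 1:
        issues.append("owner (exactly one)")
    return issues


def reuse_rate(reused_components, total_components):
    """Content reuse rate: reused components / total components x 100."""
    return reused_components / total_components * 100


good = MetricDefinition("Content reuse rate", "Reused components ÷ total components × 100",
                        "CMS", "Monthly", "Content lead", "↑")
bad = MetricDefinition("Pipeline influenced", "Revenue of deals content touched",
                       "", "Monthly", "Content lead + Sponsor", "↑")

print(problems(good))
print(problems(bad))
print(reuse_rate(1, 4))
```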

Template 2 · AI Share-of-Voice sampling protocol

Turning a noisy signal into a defensible number

Formula — your brand's citations ÷ total citations in your category × 100.
Prompts — a fixed set of buyer-intent prompts for your category; version them so the set stays constant run to run.
Engines — sample across ChatGPT, Google AI, Perplexity and Copilot (4–5 engines); report per engine and blended.
Sample size — 30+ samples per prompt per engine, then average — because AI answers swing 40–60% month to month, one reading is noise.
Entity SoV — is yours the brand the AI names as the answer?
Citation SoV — are you cited as a source it links to? Track both, benchmarked against named competitors.
What to do with gaps — topics/entities where AI doesn't know you feed straight back into the Model & Structure stages as the next content to engineer. That's how the loop closes.
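The formula and the per-engine and blended reporting can be sketched as below. The tallies are hypothetical totals over all sampled prompts, and blending by pooling the counts is an assumption; averaging the per-engine percentages is an equally defensible choice, so state whichever you use in the metric-definitions sheet.

```python
def share_of_voice(brand_citations, total_citations):
    """Brand citations / total category citations x 100."""
    if total_citations == 0:
        return 0.0  # nothing cited in the category, so no share to report
    return brand_citations / total_citations * 100


# engine -> (brand citations, total category citations); hypothetical tallies.
counts = {
    "ChatGPT": (25, 100),
    "Google AI": (10, 80),
    "Perplexity": (30, 120),
    "Copilot": (15, 100),
}

per_engine = {engine: share_of_voice(b, t) for engine, (b, t) in counts.items()}

# Blended figure: pool the counts across engines (an assumed convention).
blended = share_of_voice(
    sum(b for b, _ in counts.values()),
    sum(t for _, t in counts.values()),
)

for engine, sov in per_engine.items():
    print(f"{engine}: {sov:.1f}%")
print(f"blended: {blended:.1f}%")
```

Run the same calculation twice, once on entity mentions and once on linked citations, to get the two figures the protocol asks for.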

Template 3 · "Can You Tell?" test protocol

The blind human-vs-machine quality bar

Panel — internal reviewers or, better, a target-audience panel; the closer to the real reader, the more honest the result.
Sample — interleave engineered/AI-assisted snippets with genuinely human-written ones; one snippet at a time, swipe "AI" or "Human".
Pass bar — ≈50% correct-guess rate, i.e. indistinguishable from human. If the panel guesses correctly well above that, they can tell and the content still reads as machine-made.
What to do with results — where the panel reliably spots the machine, that's a precise signal that feeds back into the Structure and Model stages.
One signal only — passing means it reads human, not that it's accurate, on-brand or useful. Always pair with the factual-accuracy and brand-voice scores.
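Scoring a panel run can be sketched as a correct-guess rate plus a rough check of whether it sits reliably above 50%. The confidence interval is an addition for illustration, not part of the protocol above: it uses a normal approximation, which is only reasonable with a decent number of guesses. The panel counts are hypothetical.

```python
import math


def can_you_tell(correct, total, z=1.96):
    """Return (rate, (low, high), passes) for one panel run.

    passes is True when the panel is not reliably better than a coin flip,
    i.e. the lower end of the approximate 95% interval is at or below 0.5.
    """
    rate = correct / total
    half_width = z * math.sqrt(rate * (1 - rate) / total)
    low, high = rate - half_width, rate + half_width
    return rate, (low, high), low <= 0.5


# Hypothetical panel results: correct guesses out of total swipes.
for correct, total in [(104, 200), (140, 200)]:
    rate, (low, high), passes = can_you_tell(correct, total)
    verdict = "reads human" if passes else "panel can tell"
    print(f"{correct}/{total}: rate {rate:.0%}, interval {low:.2f} to {high:.2f}, {verdict}")
```

A pass here still only says the content reads as human; it goes on the scorecard next to the factual-accuracy and brand-voice scores, never alone.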

Template 4 · Scorecard / dashboard spec

What the live scorecard must show

The five categories grouped — leading (teal) above lagging (purple)
Each metric: current value, baseline, target, and trend arrow
AI Share of Voice broken into entity vs citation, vs named competitors
"Can You Tell?" rate shown alongside accuracy & brand-voice, never alone
Maturity level (1–5) with the quarterly re-score date
A view per audience — team (weekly), lead+sponsor (monthly), exec (quarterly)
Owner and last-refreshed date visible on every panel
The north-star line: maturity ↑ one level / year while SoV climbs

Full template index for this part

Metric-definitions sheet — metric, formula, source, cadence, owner, target (above)

Metric-selection menu — the five-category candidate list to cut from

Baseline snapshot — starting value + method per metric

AI SoV sampling protocol — prompts, engines, sample size, entity vs citation (above)

"Can You Tell?" test protocol — panel, sample, pass bar, what to do (above)

Scorecard / dashboard spec — layout, audiences, refresh (above)

Reporting cadence calendar — weekly / monthly / quarterly + owners

Content-to-pipeline ROI model — content → pipeline → revenue, value ÷ cost

Business-case readout — north star, ROI, honest assumptions

Done criteria

Entry & exit gates

The quality bar that says this part is genuinely ready to start, and genuinely finished.

Before you start (entry)

Analytics & martech owner and content lead engaged
Access granted to analytics, CMS and any SoV tooling
Part 1 findings and client goals available to anchor metric choice

Before you finish (exit)

Minimal metric set agreed, each with owner + target
Baseline captured with method documented per metric
Tooling instrumented — analytics, quality scoring, SoV sampling, "Can You Tell?"
Live scorecard + weekly/monthly/quarterly cadence handed over
Content-to-pipeline ROI model and business case delivered

algomarketing

Run Book · Measurement & Success (v0.1) · part of the content-engineering delivery set.