algomarketing

Practice Playbook · Outline v0.1

The Content Engineering Playbook

A field guide for implementing content engineering inside mid-market and enterprise marketing teams — and laying a modern AI layer over the frameworks that already work.

Foundation

Structured & semantic content → becomes machine-consumable → which makes it AI-citable & agent-ready

The thesis running through every section: the AI layer is the payoff of content engineering done well, not a separate discipline. AI didn't change the requirements — it added a forcing function and a reward.

Overview · How it fits together

The playbook at a glance

Seven parts. A diagnostic up front, the engineering pipeline as the spine, then the operating model, governance, and roll-out wrapped around it. The AI layer is woven through every part — never a bolt-on chapter.

PART 0

Foundations

What content engineering is (and isn't). The strategy / engineering / operations triad. The case for it now.

PART 1

Assess

Maturity diagnostic, content audit (ROT), content-debt & AI-readiness gap analysis.

PART 2 · THE SPINE

The Content Engineering Pipeline

Six stages — Plan → Model → Structure → Govern → Deliver → Optimize — each with its classic framework and the AI layer on top. This is the heart of the playbook.

PART 3

Operating Model

Roles (incl. the modular content architect), team topology, RACI, the Content Services Organization.

PART 4

Governance & Risk

The layered control stack, human-oversight modes, brand safety & provenance for AI content.

PART 5

Implementation Roadmap

The 60 / 90–120-day engagement model: Clarity → Build → Scale & Prove.

PART 6

Tooling & Reference

The 5-layer reference stack, framework appendix, glossary, reusable templates.

Part 0 · Foundations

What content engineering actually is

Content engineering is the practice of designing the structure behind content — the models, metadata, taxonomy and schema — so a single piece of content can be reused, assembled, and understood across any channel, by people and, increasingly, by machines. It treats content as structured data, not as one-off pages or documents.
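As a minimal sketch of what "content as structured data" can look like, the Python below models an article as typed fields and renders the same record as schema.org markup. The `Article` and `Author` classes, the `profile_url` field, and the example values are hypothetical illustrations, not a model this playbook prescribes.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Author:
    """An author modelled as an entity with its own fields, not a bare string."""
    name: str
    profile_url: str  # hypothetical field, for illustration only


@dataclass
class Article:
    """Hypothetical content type: labelled fields instead of one freeform page."""
    title: str
    author: Author
    body: str
    tags: list[str] = field(default_factory=list)

    def to_json_ld(self) -> dict:
        """Render the record using the schema.org Article vocabulary."""
        return {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": self.title,
            "author": {
                "@type": "Person",
                "name": self.author.name,
                "url": self.author.profile_url,
            },
            "keywords": self.tags,
            "articleBody": self.body,
        }


article = Article(
    title="What content engineering actually is",
    author=Author(name="Example Author", profile_url="https://example.com/authors/example"),
    body="Content engineering treats content as structured data.",
    tags=["content engineering", "structured content"],
)
print(json.dumps(article.to_json_ld(), indent=2))
```

Because the fields are separate, the same record could feed a web page, an API response, or a retrieval index without being rewritten for each.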

It isn't writing, and it isn't "using AI to make content faster" — those sit on top of it. Without the structure underneath, AI just produces more unstructured content, faster. The three disciplines below are distinct but inseparable:

Decides what & why

Content Strategy

Audience, journeys, the editorial plan — what content earns its place, and why.

“CEO of content”

Builds the structure

Content Engineering

Content models, metadata, taxonomy, schema — content as reusable, machine-readable data. This is where we focus, and what makes content AI-ready.

“CTO of content”

Runs the system

Content Operations

The people, workflow, tooling and governance that turn the plan into published content, repeatedly.

the engine room

Why now

AI is the forcing function. RAG pipelines, AI search, and agents all need structured, semantic content to work — the exact thing content engineering produces. Teams whose content is already well-engineered are the ones seeing AI pay off; everyone else is amplifying their mess, faster. That's the case for doing this now, not in two years.

Two paths, one outcome

Throughout the playbook we colour-code the two forces that have to meet: the people who decide and create, and the process that makes content machine-readable. Neither alone gets you AI-ready content — they converge on it.

Path A (teal) + Path B (purple) → the AI layer is the reward when both are done well

How to use this playbook: it works as a map for leaders deciding whether to invest, and as a delivery guide for the team implementing. The pipeline (Part 2) is the heart of it.

Open the Foundations run book → Orientation & alignment · steps, agendas, RACI & templates

Part 2 · The pipeline

The pipeline spine, with the AI layer on top

Read each column top to bottom: what the AI layer adds, the classic framework underneath it, and the AI-readiness requirement that content engineering must produce for the AI layer to actually work.

AI layer — what agents & models add

Classic framework — the proven foundation

AI-readiness requirement

STAGE 1

Plan

AI layer: Agent ideation, content-gap audits at scale, brief generation, demand forecasting. Prompts become governed assets.

Classic: Rockley plan stage · Brain Traffic Quad · CMI strategy

Make AI-ready: Model & taxonomy must exist before agents can plan against them

STAGE 2

Model

AI layer: Knowledge graphs / GraphRAG grounding (~3× accuracy). Model as a dimensional knowledge graph.

Classic: Saunders' 7 disciplines · Gibbon modelling · DITA typing

Make AI-ready: Typed fields, relationships, entities-not-strings, SKOS/RDF

STAGE 3

Structure

AI layer: Draft & variant generation, autonomous tagging/metadata. Write for machines = better for humans.

Classic: DITA structured authoring · single-source · reuse

Make AI-ready: Modular components, semantic HTML, no copy-paste sprawl

STAGE 4

Govern

AI layer: Brand / compliance checking agents, AI QA, machine-readable brand rules. Risk: off-brand "AI slop."

Classic: Welchman governance · policy + standards + rights

Make AI-ready: Risk-tiered human oversight, provenance, least-privilege agents

STAGE 5

Deliver

AI layer: Agents are now a delivery target. MCP is the contract: agents read the schema, retrieve, act.

Classic: MACH / composable · headless CMS · API-first

Make AI-ready: Content addressable as data via API/MCP + governed access

STAGE 6

Optimize

AI layer: AI Share of Voice, citation tracking across engines, "proof" as the new hard part. Plus the blind human-vs-machine test: if readers can't tell, the engineered content passes.

Classic: Aprimo metrics · content audits · web analytics

Make AI-ready: Multi-engine sampling, freshness, off-site brand mentions

↻ Measurement & insight feeds back into planning — the loop, not a line

Signature QA mechanism · Stage 6

The "Can You Tell?" test

The hardest, most honest quality bar for engineered & AI-assisted content: put it head-to-head with human-written content and see if anyone can spot the difference. We run it as a Tinder-style swipe tool — reviewers (or a target-audience panel) see one snippet at a time and swipe "AI" or "Human."

If they can't reliably tell them apart, the content has cleared the bar. If they can, it tells us exactly where the engineered content still reads as machine-made — a precise, repeatable signal that feeds straight back into the Structure and Model stages.

✓ Pass condition: reviewers guess the source correctly about 50% of the time, no better than chance, so the content is indistinguishable from human-written.
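One way to make "about 50%" testable is an exact two-sided binomial test against pure guessing. The playbook does not specify a statistical method, so the test choice, the `alpha` threshold of 0.05, and the function names below are assumptions made for this sketch.

```python
from math import comb


def two_sided_binomial_p(correct: int, total: int) -> float:
    """Exact two-sided binomial p-value against pure guessing (p = 0.5)."""
    observed = comb(total, correct)
    # Sum the probability of every outcome that is at most as likely as the observed one.
    tail = sum(comb(total, k) for k in range(total + 1) if comb(total, k) <= observed)
    return tail / 2 ** total


def can_you_tell(correct: int, total: int, alpha: float = 0.05) -> dict:
    """Summarise a swipe session; 'passes' means guessing was not distinguishable from chance."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p_value = two_sided_binomial_p(correct, total)
    return {
        "guess_rate": correct / total,
        "p_value": p_value,
        "passes": p_value >= alpha,  # alpha is an assumed threshold
    }


print(can_you_tell(correct=104, total=200))  # close to chance
print(can_you_tell(correct=140, total=200))  # reviewers can clearly tell
```

The test is two-sided on purpose: a panel that is wrong far more often than chance is still telling the two apart. A small panel will pass almost anything, so a pass is only meaningful with enough swipes behind it.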

Marketing snippet · which is it?

"We rebuilt the checkout flow over a weekend, and the support tickets just… stopped."

Open the Pipeline run book → The six build stages · the largest run book

Part 1 · Assess

Assess: where does this team actually start?

This is the first thing we run with a client, and its only job is to answer one question: given how this team works today, what should they fix first — and what are they not yet ready for? We score the content operation, translate it into language the boardroom recognises, and the gap between the two becomes the finding that sets the plan.

STEP 1 · SCORE

Rate the operation 1–5

Score the team against five criteria — content vision, model & taxonomy, governance, measurement, and reuse — using Content Science's maturity levels. This is the objective working diagnostic.

STEP 2 · TRANSLATE

Map to the AI posture

Translate that score into Gartner's Curious / Competent / Confident stages — the AI-readiness language executives and budget-holders already use.

STEP 3 · ACT

Set the entry point

The level dictates where to start in the pipeline and which roadmap phase to enter. It stops teams skipping ahead to agents before the foundations exist.

Step 1 · the score · Content Science

Content Ops Maturity 1–5

Level 5 · Thriving: mature ops, scales AI successfully

Level 4 · Sustaining: measured, governed

Level 3 · Scaling: standardising, not yet measured

Level 2 · Piloting: pockets of good practice

Level 1 · Chaotic: random acts of content

Step 2 · the translation · Gartner

AI Curious → Competent → Confident

Curious: experimenting

Competent: where most stall

Confident: scaling with ROI

Why both: AI confidence is capped by ops maturity. A team can't be "AI Confident" while its content is Chaotic — so the exec posture is only real if the operational score backs it up.

The space between AI ambition and operational reality is the "competency trap": more than 50% of CMOs stall at Competent, and only about ⅓ see the AI returns they expected. Closing that gap is the engagement.

Step 3 · from diagnosis to plan

If you score…	You're realistically…	Start here in the pipeline	Roadmap entry
Level 1–2 · Chaotic / Piloting	AI Curious	Stages 1–3: Plan → Model → Structure. Run the audit, build the content model & taxonomy, fix the structure. Do not deploy agents yet: they would amplify the chaos.	Phase 1 · Clarity
Level 3 · Scaling	Curious → Competent	Stage 4: Govern. Standardise, add governance with teeth, then pilot the AI layer on one controlled workflow with human-in-the-loop.	Phase 2 · Build
Level 4–5 · Sustaining / Thriving	Competent → Confident	Stages 5–6: Deliver → Optimize. Expose content via API/MCP, scale agents, stand up AI Share-of-Voice measurement and the "Can You Tell?" test.	Phase 3 · Scale
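The table above can be read as a simple lookup from maturity level to entry point. The sketch below encodes it that way. The `EntryPoint` type, its field names, and the `agent_scope` wording are hypothetical labels chosen for illustration; only the level bands and their destinations come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryPoint:
    ai_posture: str
    pipeline_start: str
    roadmap_phase: str
    agent_scope: str  # hypothetical summary of how far agents should be deployed


def entry_point(maturity_level: int) -> EntryPoint:
    """Map a 1-5 content ops maturity level to where the engagement starts."""
    if maturity_level not in range(1, 6):
        raise ValueError("maturity level must be an integer from 1 to 5")
    if maturity_level <= 2:
        # Chaotic / Piloting: fix foundations first, no agents yet.
        return EntryPoint("Curious", "Stages 1-3: Plan, Model, Structure",
                          "Phase 1 · Clarity", "none")
    if maturity_level == 3:
        # Scaling: add governance, then one controlled pilot.
        return EntryPoint("Curious to Competent", "Stage 4: Govern",
                          "Phase 2 · Build", "one piloted workflow, human-in-the-loop")
    # Sustaining / Thriving: deliver to agents and measure.
    return EntryPoint("Competent to Confident", "Stages 5-6: Deliver, Optimize",
                      "Phase 3 · Scale", "scaled")


for level in range(1, 6):
    print(level, entry_point(level))
```

Keeping the mapping in one function makes the "no skipping ahead" rule explicit: the only way to reach the agent-scaling branch is to score 4 or 5.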

The evidence behind sequencing it this way: 86% of enterprises use AI but only 29% scale it well — and the ones that do are at maturity levels 4–5. Maturity is the force multiplier, so we build it in order rather than starting with the shiny layer.

Open the full Assess run book → Delivery steps · agendas · RACI & effort · templates — the first of seven part run books

Part 4 · Governance & risk

Governance & risk: a layered stack

AI hasn't removed the need to govern content — it's raised the stakes. Once a model can write in your brand's voice at scale, a mistake (off-brand, inaccurate, or non-compliant) spreads just as fast as a good piece does. Done well, governance isn't a document nobody reads — it's what lets a team move quickly without flinching. We organise it as four layers, from the rules you have to follow down to the guardrails the industry is still figuring out.

Binding floor

EU AI Act, Article 50 — the layer you can't opt out of. From August 2026 it requires AI-generated content to be marked so machines can detect it, and clearly labelled where it could mislead (such as deepfakes). The helpful part for content teams: content that a named person reviews and stands behind is treated differently — so keeping a human accountable isn't red tape, it's the route to staying compliant.

Voluntary standards

ISO/IEC 42001 & the NIST AI risk framework — not mandatory, but fast becoming the proof point buyers ask for. When a prospect's procurement team asks "how do you govern your AI?", having adopted these lets you answer in a sentence instead of a scramble — and increasingly it's what gets you through a security review at all.

Industry disclosure

The IAB disclosure framework & C2PA provenance — how you stay honest with your audience. The smart move isn't stamping "AI" on everything; it's disclosing when AI has materially changed what someone is seeing, and attaching tamper-proof provenance data so that claim can actually be trusted.

Agentic guardrails

OWASP's Top 10 for agentic apps — the newest, least settled layer: the security risks that appear once AI agents can act on their own, like an agent being tricked by a malicious web page, or one bad output rippling through an automated pipeline. Nobody has decades of practice here yet, so it comes down to sensible limits — least-privilege access, sign-off on anything high-impact, and audit trails.

What we actually build into the engagement

These are the concrete deliverables that turn the four layers above into something a marketing team runs day to day — and what we point clients to when they ask "so what do you actually do here?"

Machine-readable brand & editorial rules — your guidelines structured so an AI follows them by default, not by luck, producing on-brand content the first time.
A content risk-tiering policy — a clear, agreed line for what can be automated versus what needs a human sign-off before it ships.
Human-in-the-loop approval flows — review and sign-off steps wired into the CMS itself, matched to each risk tier rather than bolted on.
Disclosure & provenance setup — C2PA and metadata plumbing so you can show, on demand, exactly what was AI-assisted and who approved it.
A governed prompt library — the instructions behind your content kept as version-controlled, reusable assets, not scattered through people's chat histories.
Agent audit logs & least-privilege access — the safety net and access controls you need in place before letting anything run autonomously.

How much human oversight? Match it to the risk

Not every piece needs the same scrutiny. We help teams place each content type into one of four oversight modes, so effort goes where the risk actually is.

Agent-assisted: Human drives, AI suggests

Human-in-the-loop: AI drafts, human approves

Human-on-the-loop: AI acts, human monitors

Human-out-of-the-loop: Fully autonomous, low-risk only
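A risk-tiering policy of this kind can be kept as data rather than prose. In the sketch below, the four oversight modes come from the list above; the tier names, the example content types, and the tier-to-mode assignments are hypothetical and would be agreed with each client.

```python
from enum import Enum


class OversightMode(Enum):
    AGENT_ASSISTED = "Human drives, AI suggests"
    HUMAN_IN_THE_LOOP = "AI drafts, human approves"
    HUMAN_ON_THE_LOOP = "AI acts, human monitors"
    HUMAN_OUT_OF_THE_LOOP = "Fully autonomous, low-risk only"


# Hypothetical risk tiers, ordered from most to least scrutiny.
RISK_TIER_TO_MODE = {
    "high": OversightMode.AGENT_ASSISTED,
    "medium": OversightMode.HUMAN_IN_THE_LOOP,
    "low": OversightMode.HUMAN_ON_THE_LOOP,
    "minimal": OversightMode.HUMAN_OUT_OF_THE_LOOP,
}

# Hypothetical content types and tiers, for illustration only.
CONTENT_TYPE_TIERS = {
    "pricing_page": "high",
    "blog_post": "medium",
    "product_description": "low",
    "image_alt_text": "minimal",
}


def oversight_for(content_type: str) -> OversightMode:
    """Look up the oversight mode; unknown content types get the strictest tier."""
    tier = CONTENT_TYPE_TIERS.get(content_type, "high")
    return RISK_TIER_TO_MODE[tier]


def needs_human_signoff(content_type: str) -> bool:
    """True when a person must drive or approve before the content ships."""
    return oversight_for(content_type) in {
        OversightMode.AGENT_ASSISTED,
        OversightMode.HUMAN_IN_THE_LOOP,
    }


print(oversight_for("blog_post").name, needs_human_signoff("blog_post"))
print(oversight_for("image_alt_text").name, needs_human_signoff("image_alt_text"))
```

Defaulting unknown content types to the strictest tier is a deliberate choice in this sketch: new content types start under the most oversight until someone classifies them.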

And a bit of honesty we build into every client conversation: even with all this structure and grounding, AI still gets things wrong — recent testing found retrieval-grounded assistants hallucinated in 17–34% of cases (Stanford, 2025). Good content engineering lowers the error rate sharply; it never makes human oversight optional.

Open the Governance run book → Layered governance · obligations, risk tiers, templates

Measurement & success · the scorecard

How we measure success

Most content teams only measure output — how much they published. We measure the whole system: the foundations that make content work (leading indicators we can move this quarter, in teal) and the payoff they produce (lagging indicators the business cares about, in purple). The two AI-era additions — AI Share of Voice and the "Can You Tell?" pass rate — only mean something when they sit on top of solid fundamentals, so we never report them alone.

Efficiency & throughput

Leading

Time-to-publish — idea to live
Production velocity — assets per sprint
Content reuse rate — reused vs net-new
Revision cycles per asset
Cost per asset & localisation savings

What good looks like: reuse rising, time-to-publish falling. Structured-content reuse alone can cut translation cost 30–50%.

Quality & trust

Leading

Automated quality score — clarity, consistency, tone, compliance
"Can You Tell?" pass rate — engineered vs human
Accessibility pass rate
Factual accuracy / hallucination rate — owned AI assistants
Brand-voice conformance

Target: quality score 80+, and a "Can You Tell?" rate near 50% — i.e. indistinguishable from human.

Audience & business outcomes

Lagging

Engagement — depth, dwell, return visits
Conversion & assisted conversions
Pipeline influenced & revenue
Organic + AI-referred traffic
Content ROI — value ÷ cost

The proof: tie content to pipeline, not just clicks. This is what the board actually asks about.

AI visibility

Lagging

AI Share of Voice — entity & citation (detailed below)
AI referral traffic & sessions
Citations per engine — ChatGPT, Google AI, Perplexity, Copilot
Presence on the sources AI cites — Wikipedia, Reddit, G2…

Method: sample 30+ times per prompt. Under ~20% SoV means AI barely knows you exist.

Capability & maturity — are we getting better at the system itself?

Leading

Maturity level 1–5 — the Part 1 score, re-run quarterly
% of content modelled / structured / AI-ready
Governance compliance — disclosure, approvals, provenance coverage
Content-debt trend — ROT % falling over time

The headline AI metric · deep dive

AI Share of Voice, explained

Search marketing had rank tracking. The AI era's equivalent is AI Share of Voice (SoV) — how often your brand gets cited or recommended inside AI answers, measured as a share of your whole category. It's how we prove the engineering work is actually paying off where buyers now look.

your brand's citations ÷ total citations in your category × 100

Because AI answers change from one run to the next, asking once tells you nothing. We sample each prompt 30+ times across the engines that matter — ChatGPT, Google AI, Perplexity, Copilot — and average it, so the number is signal rather than noise.
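The formula and the sampling rule can be sketched together as follows. The input shape (one dict of brand-to-citation counts per run), the function names, the brand names, and the choice to average per-run percentages rather than pool all citations are assumptions for illustration.

```python
from statistics import mean


def share_of_voice(brand_citations: int, total_citations: int) -> float:
    """Brand citations as a percentage of all category citations in one answer."""
    if total_citations == 0:
        return 0.0  # an answer that cites nobody contributes 0% for every brand
    return brand_citations / total_citations * 100


def sampled_sov(runs: list[dict[str, int]], brand: str, min_samples: int = 30) -> float:
    """Average SoV over repeated runs of the same prompt on one engine."""
    if len(runs) < min_samples:
        raise ValueError(f"need at least {min_samples} runs, got {len(runs)}")
    return mean(share_of_voice(run.get(brand, 0), sum(run.values())) for run in runs)


# Hypothetical citation counts per run: {brand: citations in that answer}.
example_runs = [
    {"acme": 2, "rival": 2},
    {"acme": 1, "rival": 3},
    {"rival": 4},
] * 10  # repeated to reach 30 samples for the sketch

print(sampled_sov(example_runs, "acme"))
```

In practice the same function would be called once per prompt and per engine, and the results compared against named competitors.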

What we measure

Entity SoV — is yours the brand the AI names as the answer? And Citation SoV — are you cited as a source it links to? Both, benchmarked against named competitors.

What we do with it

Gaps point straight back to the Model and Structure stages — the topics and entities where AI doesn't yet know you become the next content to engineer. That's how the loop closes.

North star: move up one maturity level a year while AI Share of Voice climbs against named competitors. Cadence: efficiency & quality reviewed weekly/monthly · business & AI visibility monthly/quarterly · maturity re-scored each quarter.

Open the Measurement run book → Scorecard setup · SoV & "Can You Tell?" protocols

Part 5 · Roadmap

Implementation roadmap

This is the engagement timeline — the order we do the work in — and it's the companion to Part 1, not a repeat of it. Part 1 is the diagnostic instrument; Phase 1 below is simply where we run it on the client, alongside the alignment work. Adapted from Robert Rose's operations-to-orchestration blueprint.

Phase 1 · Clarity

~60 days

Run the diagnosis & align

Run the Part 1 assessment & content audit: this is where the Part 1 diagnostic actually happens, not a second one.
Content-debt & AI-readiness gap: ROT analysis; what's missing for AI to work.
Editorial strategy before intake
Executive alignment & business case

Phase 2 · Build the system

~90–120 days

Engineer the foundation

Content model, taxonomy & metadata: incl. the knowledge graph / semantic layer (Stage 2), the highest-leverage AI-readiness work.
Governance with teeth + oversight modes: risk-tiering, in-CMS approval flows, governed prompt library.
Tooling / stack decisions: mapped to the 5-layer reference stack.
Pilot the AI layer on one workflow: human-in-the-loop, low-risk content first.

Phase 3 · Scale & prove

Ongoing

Orchestrate & measure

Roll out across teams & channels
Stand up AI Share-of-Voice measurement: the Stage 6 metric explained earlier; track citations in AI answers.
Agentic / MCP delivery where ready: content addressable as data for agents.
Prove ROI + run the "Can You Tell?" test: quality bar is engineered content vs human-written.

Open the Roadmap run book → Program management across the three phases

Parts 3 & 6 · Operating model & stack

Operating model & reference stack

Roles — the team that runs it

Modular Content Architect ★ · Content Engineer · Content Strategist · Taxonomist · Information Architect · Content Designer · Content Ops Manager · Knowledge-Graph Engineer

★ The modular content architect (ontologies, metadata, brand-as-graph, MCP-ready content) is the specialist role the market now needs, and our defensible consulting wedge. The team is structured around a Content Services Organization (strategy + engineering + operations), typically hub-and-spoke.

The 5-layer reference stack

Each layer rests on the one beneath it. The foundation — structure — is where content engineering lives; everything above only works if that base is solid. Content flows up the stack: modelled, then stored, orchestrated, made intelligent, and finally delivered to people and AI agents.

Delivery: DXP / composable front end · APIs · MCP

Intelligence: Knowledge graph · Vector DB · AI generation + RAG

Orchestration: CMP · Content ops platform

Repository: Headless CMS · CCMS · DAM · PIM

Structure (the foundation, where content engineering lives): Content model · Taxonomy · Metadata · Schema

Open the Operating model run book → Team + tech stack · roles, RACI, stack scorecard

Part 6 · Reference

Glossary — the jargon, in plain English

Content engineering carries a lot of acronyms. Here's what the ones used in this playbook actually mean.

Content model — the blueprint of your content types and how they relate (e.g. an article has a title, author, body, tags).

Structured content — content broken into labelled fields and components rather than one freeform blob, so it can be reused and read by machines.

Metadata — data about content (author, topic, date, audience) that lets systems find, filter and assemble it.

Taxonomy — your controlled set of categories and tags — the shared vocabulary for classifying content.

Schema / schema.org — a standard vocabulary that labels content so search engines and AI understand what it means.

Knowledge graph — a map of how your concepts, products and content connect — gives AI the relationships, not just the words.

Semantic HTML — using real heading, list and table tags so each section is a clean, extractable unit for AI and accessibility.

DITA — an XML standard for writing modular, reusable, structured content (common in technical documentation).

Headless CMS — a content system that stores content as data and delivers it via API to any front end, not just one website.

CCMS — Component Content Management System — a CMS built to manage and reuse small structured components.

DAM — Digital Asset Management — the library for images, video and brand assets.

PIM — Product Information Management — the source of truth for product data across channels.

CMP — Content Marketing Platform — tooling for planning, workflow and orchestration of content.

DXP — Digital Experience Platform — the layer that assembles and delivers front-end experiences.

MACH — Microservices, API-first, Cloud-native, Headless — the modern composable architecture pattern.

RAG — Retrieval-Augmented Generation — when an AI fetches your real content to answer, instead of guessing from memory.

GraphRAG — RAG that retrieves from a knowledge graph, improving accuracy by using relationships.

MCP — Model Context Protocol — an emerging standard that lets AI agents securely read and act on your content and systems.

GEO / AEO — Generative / Answer Engine Optimization — getting your content cited and recommended inside AI answers.

AI Share of Voice — how often your brand is cited in AI answers as a share of your category — the AI era's rank tracking.

ROT — Redundant, Outdated, Trivial — the content an audit flags for removal.

Content debt — the future rework piling up from shortcuts and unstructured content; now the main blocker to AI-readiness.

C2PA — a standard for tamper-proof "content provenance" metadata — what was AI-made, and who approved it.

SKOS / RDF — web standards for expressing taxonomies and relationships in a machine-readable way.

Outline v0.1 · built from the Content Engineering landscape & AI-overlay research. Teal = established frameworks · purple = the AI layer. Next step: expand each part into full playbook sections with templates and worksheets.