algomarketing
Run Book · Part 2 of 7
Part 2 · The Pipeline · Delivery Run Book

The Pipeline

The build itself — six stages that turn a client's content from freeform pages into a structured, semantic, AI-ready system. Plan, Model, Structure, Govern, Deliver, Optimize. This is the heaviest engineering work of the engagement, and the run book a delivery team builds from: what each stage is for, who does it, the effort it takes, what goes in and comes out, and the artifacts that outlast you.

Duration: ~3–6 months (Phase 2 Build into Phase 3 Scale)
Effort: Substantial (the heaviest part of the seven)
Client team: Engineer + team (architect, taxonomist, devs, content)
Output: AI-ready system (a working, structured content platform)
Overview

What this part delivers, and why

Assess told us where the client starts. The Pipeline is where we actually build. Six stages take content from freeform pages to a structured, semantic system that people and machines can use — and the order matters: you cannot skip ahead to agents and delivery before the model underneath them exists. Every stage carries its proven framework and the AI layer that now sits on top of it; the run book below is how a delivery team executes each one without reinventing the approach.

The six stages at a glance
  • 1 · Plan — agent-assisted ideation, content-gap audits and governed briefs. Requires the model to exist first.
  • 2 · Model — content modelling, taxonomy, metadata, schema, the knowledge graph. The highest-leverage stage — most of the value lives here.
  • 3 · Structure / Author — modular, componentised content and semantic HTML; AI drafting and auto-tagging on top.
  • 4 · Govern — what governance touches the build. Kept light here; the detail lives in the Governance run book (Part 4).
  • 5 · Deliver — headless / composable / MACH, API-first, and agents as a delivery target via MCP.
  • 6 · Optimize — the measurement loop. Kept light here; the detail lives in the Measurement run book.
Everything downstream is paid for upstream. Spend the budget on the model, and Plan, Structure and Deliver almost build themselves; starve it, and you pay for the shortcut at every stage that follows.
the line that frames the whole pipeline · worth repeating at kickoff
~3× · factual accuracy from knowledge-graph grounding vs a baseline LLM (data.world benchmark)
Up to 50% · translation / localisation cost cut by structured single-sourcing (reuse across channels)
6 · stages turn freeform pages into an AI-ready content system (Plan → Optimize)
The pipeline, mapped

The pipeline as a system

One build, mapped three ways. The flagship map shows the six stages as connected nodes with the gates that sit between them; for each stage it records what it takes in, what it ships, who owns it, and the bar it has to clear. The stage-by-stage walkthrough follows, with a People-vs-Process swimlane further below.

1 · Plan

Month 1 · early Build
Timebox · ~2–3 weeks
Lead: Content strategist + Consultant
Format: planning workshop + async
Objective

Decide what content earns its place and why — the editorial plan, the gaps, the briefs — and set up the AI-assisted planning loop. Critically, this stage depends on Stage 2: agents can't usefully audit gaps or plan against a taxonomy that doesn't exist yet, so in practice Plan and the first pass of Model run hand-in-hand.

Planning workshop agenda (half day)
  • Recap the Assess findings & the chosen entry point (15m)
  • Audience, journeys and the editorial priorities for the period (45m)
  • Content-gap audit — run the agent across the estate, review at scale (45m)
  • Brief generation — turn priorities into structured briefs (40m)
  • Agree the prompt library standard — RACE: Role, Action, Context, Expectations (15m)
Inputs → outputs
Inputs
  • Assess findings & entry point
  • Existing editorial strategy (if any)
  • Draft taxonomy from Stage 2's first pass
Outputs
  • Editorial plan for the period
  • Prioritised content-gap list
  • Structured content briefs
  • Governed prompt library (v1)
◆ From the field

The first thing a client wants to do with AI is generate content. The first thing you should make them do is govern the prompts. A prompt that lives in someone's chat history is a one-off; the same prompt version-controlled and shared is an asset the whole team compounds on. Make that switch in week one or you'll be untangling it in month four.

Governed prompt — RACE skeleton (copy & version it)
# RACE prompt · content-gap audit · v1 · owner: strategist
ROLE:        You are a content strategist auditing a structured estate.
ACTION:      Compare the editorial plan against the live taxonomy and
             list every topic/audience/funnel-stage with no covering asset.
CONTEXT:     Taxonomy = {topic_tree}. Editorial priorities = {priorities}.
             Content types and their metadata = {model_spec}.
EXPECTATIONS: Return a prioritised table: gap, why it matters,
             suggested content type, owner. Cite the taxonomy term.
             Flag any gap that needs a new content type — do not invent one.
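A minimal sketch of what "copy & version it" can look like in practice: the skeleton is kept as a template with named inputs and is refused at run time if any input is missing. The names `RACE_TEMPLATE`, `required_fields` and `render_prompt` are hypothetical, and the template here is an abbreviated in-memory copy; a real library would load it from a version-controlled file.

```python
from string import Formatter

# Hypothetical in-memory copy of the governed template (abbreviated).
# In practice this text would be read from a version-controlled file.
RACE_TEMPLATE = (
    "ROLE: You are a content strategist auditing a structured estate.\n"
    "ACTION: Compare the editorial plan against the live taxonomy and "
    "list every topic/audience/funnel-stage with no covering asset.\n"
    "CONTEXT: Taxonomy = {topic_tree}. Editorial priorities = {priorities}. "
    "Content types and their metadata = {model_spec}.\n"
    "EXPECTATIONS: Return a prioritised table: gap, why it matters, "
    "suggested content type, owner. Cite the taxonomy term."
)


def required_fields(template: str) -> set[str]:
    """Return the placeholder names the template expects."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render_prompt(template: str, **values: str) -> str:
    """Fill the template, refusing to run with missing or empty inputs."""
    missing = sorted(f for f in required_fields(template) if not values.get(f))
    if missing:
        raise ValueError(f"missing prompt inputs: {', '.join(missing)}")
    return template.format(**values)
```

Failing loudly on a missing input is the point: a prompt that silently runs with an empty taxonomy is the chat-history one-off in a different form.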
2 · Model

Months 1–3 · the core of Build
Timebox · ~6–10 weeks
Lead: Modular content architect
With: taxonomist, content engineer
Format: modelling workshops + iteration
Objective

This is the highest-leverage stage in the entire playbook. Define the content model, taxonomy, metadata and schema — and, where it earns its place, a knowledge graph / semantic layer. This is where content stops being pages and becomes structured, machine-readable data. Get this right and every stage after it gets easier; get it wrong and you'll feel the drag for years. We work through Cleve Gibbon's three passes: Conceptual (types + high-level relationships) → Design (attributes, refined relationships) → Implementation (CMS-level detail).

Conceptual modelling workshop (half day)
  • Inventory the content the strategy actually needs (not what exists today) (30m)
  • Draft the content types and their relationships — entities, not strings (75m)
  • Pressure-test: does each proposed type earn its keep, or is it a variant of another? (45m)
  • Sketch the taxonomy spine & the metadata that every type carries (30m)
Design & implementation activities
  • Content model spec — every type, its fields, types-of-field, required/optional, relationships
  • Taxonomy & metadata schema — the controlled vocabularies, applied consistently across the estate
  • Schema / semantic markup — schema.org on key types so engines and AI understand meaning
  • Knowledge graph / semantic layer — where the client's domain is relationship-rich, model it as a graph (SKOS / RDF / OWL); turn "strings into things"
  • Map the model into the chosen repository (headless CMS / CCMS) at implementation level
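The activities above can be sketched in a few lines, assuming a content type is declared as data, entries are checked against it, and schema.org markup is generated from the structured fields rather than hand-written. `FieldSpec`, `ContentType`, `FAQ` and `faq_to_jsonld` are hypothetical names, and treating `entity` as optional is an assumption; the `FAQPage` / `Question` / `Answer` vocabulary is schema.org's own.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    kind: str              # e.g. "text", "rich", "ref"
    required: bool = True


@dataclass(frozen=True)
class ContentType:
    name: str
    fields: tuple[FieldSpec, ...]
    relationships: tuple[str, ...] = ()

    def validate(self, entry: dict) -> list[str]:
        """Return a list of problems; an empty list means the entry conforms."""
        known = {f.name for f in self.fields}
        problems = [f"missing required field: {f.name}"
                    for f in self.fields if f.required and not entry.get(f.name)]
        problems += [f"unknown field: {k}" for k in entry if k not in known]
        return problems


# Hypothetical FAQ type, loosely following the FAQ / Q&A row of Template 1.
FAQ = ContentType(
    name="FAQ",
    fields=(
        FieldSpec("question", "text"),
        FieldSpec("answer", "rich"),
        FieldSpec("entity", "ref", required=False),  # optionality is assumed
    ),
    relationships=("product", "topic", "article"),
)


def faq_to_jsonld(entry: dict) -> str:
    """Serialise one validated FAQ entry as schema.org FAQPage JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [{
            "@type": "Question",
            "name": entry["question"],
            "acceptedAnswer": {"@type": "Answer", "text": entry["answer"]},
        }],
    }
    return json.dumps(doc, indent=2)
```

Because the markup is derived from the model, a change to the FAQ type flows into the markup automatically instead of drifting page by page.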
Inputs → outputs
Inputs
  • Editorial plan & content types needed
  • Audit findings (ROT, current structure)
  • Repository / CMS decision
Outputs
  • Content model spec (the key artifact)
  • Taxonomy & metadata schema
  • Schema/semantic markup plan
  • Knowledge graph / semantic layer (where warranted)
◆ From the field

The single most common failure here isn't under-modelling, it's over-modelling. A team gets excited and ends up with forty content types nobody can hold in their head, half of which differ by a single field. Our rule: if two types share more than ~80% of their fields, they're one type with a variant flag. One client landed on fourteen content types — twelve earn their keep, and we keep threatening to kill the other two and never do, because someone always finds an edge case. Fourteen you can govern. Forty you cannot.

▲ Watch out

The opposite trap is just as real: under-modelling, where everything is a "page" or an "article" with a freeform body. That's the blob problem in a new outfit — it ships fast and quietly defeats the entire point of the engagement, because AI retrieval can't get a clean chunk out of a freeform field. If the client pushes to "just ship something simple and structure it later," that later never comes. Structure it now.

Why this stage pays off

Knowledge-graph grounding has been benchmarked at roughly 3× the factual accuracy of a baseline LLM (the data.world benchmark moved accuracy from 16% to 54%). It costs more to build than plain retrieval, so reserve the graph for the relationship-rich parts of the domain — but where it fits, this is the work that makes the AI layer actually trustworthy. This is also the clearest professional-services wedge: the modular content architect who owns ontologies and metadata is the specialist the market now needs.
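For reference, the "roughly 3×" is simply the ratio of the two accuracy figures quoted above, worked through here so the claim can be restated either as a multiple or as a percentage-point gain:

```latex
\[
\frac{\text{accuracy with knowledge-graph grounding}}{\text{baseline accuracy}}
  = \frac{54\%}{16\%} \approx 3.4,
\qquad
54\% - 16\% = 38 \text{ percentage points}
\]
```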

“The more structure you have, the less hallucination you will have.”
Rahel Bailie · Content Integrity Model · the case for spending most of the budget here
3 · Structure / Author

Months 2–4 · overlaps Model
Timebox · ~4–6 weeks
Lead: Content engineer + content team
Format: authoring patterns + migration
Objective

Turn the model into how content is actually authored: modular, componentised content, real semantic HTML, single-sourcing and reuse — so one piece of content can be assembled across many channels. Then layer AI on top for first-draft generation, variant generation and autonomous tagging. The principle that keeps this honest: write for machines and you get better content for humans — explicit, unambiguous, well-chunked content serves both.

Activities
  • Define content type components & the authoring patterns for each (the spec below)
  • Establish semantic HTML standards — each <h2> is an extractable answer unit, real lists and tables, no decorative markup
  • Set up single-sourcing & reuse (conref-style) so there's one source of truth, not copy-paste
  • Migrate / re-author a representative slice into the new structure (prove the pattern before scaling)
  • Wire in the AI layer — draft & variant generation, autonomous metadata enrichment and auto-tagging — with human review
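To make the "each <h2> is an extractable answer unit" standard concrete, here is a rough sketch that splits a page into one chunk per <h2> using only Python's standard-library HTML parser. `AnswerUnitParser`, `answer_units` and the `BLOCK_TAGS` set are hypothetical; a production chunker would also need to handle nested sections and inline markup more carefully.

```python
from html.parser import HTMLParser

# Closing one of these tags ends a block, so a space is added to keep words apart.
BLOCK_TAGS = {"p", "li", "td", "th", "tr", "h3", "h4", "div"}


class AnswerUnitParser(HTMLParser):
    """Collect one chunk per <h2>: the heading plus the text that follows it."""

    def __init__(self) -> None:
        super().__init__()
        self.units: list[dict[str, str]] = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.units.append({"heading": "", "text": ""})

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False
        elif tag in BLOCK_TAGS and self.units:
            self.units[-1]["text"] += " "

    def handle_data(self, data):
        if not self.units:
            return  # ignore anything before the first <h2>
        key = "heading" if self._in_h2 else "text"
        self.units[-1][key] += data


def answer_units(html: str) -> list[dict[str, str]]:
    """Return the page as a list of {"heading", "text"} chunks."""
    parser = AnswerUnitParser()
    parser.feed(html)
    parser.close()
    # Normalise whitespace so each unit is a clean, retrievable chunk.
    return [{k: " ".join(v.split()) for k, v in u.items()} for u in parser.units]
```

If a page yields one giant unit, or units whose headings are not questions or clear topics, that is a signal the authoring standard is not being followed.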
Inputs → outputs
Inputs
  • Content model spec & taxonomy
  • Structured briefs from Stage 1
  • Existing content to migrate
Outputs
  • Modular component spec / content type definitions
  • Semantic authoring standards
  • A migrated, structured content slice
  • AI drafting + auto-tagging workflow
◆ From the field

Authors don't resist structure because they're difficult — they resist it because the first structured-authoring interface they're handed feels like filling in a tax return. Spend real effort on the authoring experience: sensible field order, helper text, sane defaults. A model that's technically perfect and miserable to write into gets quietly worked around, and then you're back to blobs.

4 · Govern

Threaded through Build
Timebox · light here · see Part 4
Lead: Consultant + client owner
Objective

Governance is its own run book — see the Governance run book · Part 4 for the full layered control stack, oversight modes, disclosure and provenance. Here we only note what governance touches the build, so the pipeline doesn't ship something that governance later has to unpick.

What governance touches in the pipeline
  • The governed prompt library started in Stage 1 — version-controlled, reusable, not scattered
  • Machine-readable brand & editorial rules — so AI conforms by default, set up alongside the model in Stage 2
  • Risk-tiering & oversight modes wired into the CMS in Stage 3 — Agent-assisted → Human-in-the-loop → Human-on-the-loop → Human-out-of-the-loop
  • Provenance & disclosure plumbing (C2PA, metadata) on the delivery layer in Stage 5
  • Least-privilege agent access & audit logs before anything in Stage 5 runs autonomously
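One way to picture "oversight modes wired into the CMS" is a lookup from content type to mode, checked at publish time. The sketch below is illustrative only: the mapping, the default and the rule that only the two more autonomous modes skip sign-off are assumptions, and the real tiers come out of the Governance run book.

```python
from enum import Enum


class Oversight(Enum):
    # Ordered from least to most agent autonomy, as listed above.
    AGENT_ASSISTED = 1
    HUMAN_IN_THE_LOOP = 2
    HUMAN_ON_THE_LOOP = 3
    HUMAN_OUT_OF_THE_LOOP = 4


# Hypothetical mapping of content types to modes.
MODE_BY_CONTENT_TYPE = {
    "Product": Oversight.HUMAN_IN_THE_LOOP,
    "Article": Oversight.HUMAN_IN_THE_LOOP,
    "FAQ": Oversight.HUMAN_ON_THE_LOOP,
}
DEFAULT_MODE = Oversight.AGENT_ASSISTED  # unmapped types get the strictest mode


def oversight_for(content_type: str) -> Oversight:
    """Look up the oversight mode for a content type."""
    return MODE_BY_CONTENT_TYPE.get(content_type, DEFAULT_MODE)


def can_publish(content_type: str, human_approved: bool) -> bool:
    """AI output ships without sign-off only in the two more autonomous modes."""
    mode = oversight_for(content_type)
    if mode in (Oversight.HUMAN_ON_THE_LOOP, Oversight.HUMAN_OUT_OF_THE_LOOP):
        return True
    return human_approved
```

Defaulting unmapped types to the strictest mode means a newly added content type cannot slip into autonomous publishing by omission.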
Inputs → outputs
Inputs
  • Governance run book (Part 4) outputs
  • The model, components & delivery layer
Outputs
  • Governance hooks built into the system
  • Oversight modes mapped to content types
5 · Deliver

Months 4–6 · Build into Scale
Timebox · ~4–6 weeks
Lead: Content engineer + devs
Format: architecture + integration
Objective

Make the content addressable as data — delivered to any channel through APIs, on a headless / composable / MACH architecture — and stand up the newest delivery target: agents. The contract for agents is MCP (Model Context Protocol): content lives behind a server, the agent reads the schema, retrieves and acts through a standard interface. Reuse compounds here — structured single-sourcing alone can cut translation/localisation cost by 30–50% across channels.

Activities
  • Confirm the headless / composable architecture & the API surface for each channel
  • Expose content as structured data via API — the model from Stage 2 becomes the contract
  • Stand up an MCP server so external agents get governed, scoped, structured access
  • Bake governance into the access layer — least-privilege agent identities, gateways, server vetting (MCP is a new attack surface)
  • Verify omnichannel assembly — one source, many channels, no re-keying
Inputs → outputs
Inputs
  • Structured content & component library
  • Repository / headless CMS
  • Governance access rules (Stage 4)
Outputs
  • Delivery / API & channel spec
  • Live API-first delivery
  • MCP server for agent access
▲ Watch out

The seductive mistake of 2026: building the agent and the MCP server before the model exists. A client will ask to "just put an MCP server in front of the current CMS." Don't. If the content underneath is unstructured, all you've done is give an agent a fast, governed pipe to your mess — it will retrieve blobs and hallucinate confidently. The MCP server is only as good as the model behind it. Stage 2 first, every time.

MCP tool surface over the model (illustrative — copy to adapt)
// content lives behind a server; the agent reads the schema and acts
tool search_content(query, type?, taxonomy?) // scoped retrieval over the model
tool get_entity(id)                       // returns structured fields, not a blob
tool list_types()                        // the content model = the contract

guardrails:
  identity: least-privilege, scoped per agent
  access:   published-status only
  audit:    every call logged · server vetted
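The guardrails in the surface above can be prototyped before any MCP tooling is chosen. This is a plain-Python sketch, not an MCP server and not any SDK's API: `AgentSession`, the in-memory `CONTENT` store and its sample records are hypothetical, and the `taxonomy` filter is left out for brevity. It shows the three tools enforcing scope, published-status and audit logging in one place.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical in-memory store standing in for the headless CMS.
CONTENT = {
    "faq-1": {"type": "FAQ", "status": "published",
              "question": "What is X?", "answer": "X is a product."},
    "faq-2": {"type": "FAQ", "status": "draft",
              "question": "What is X2?", "answer": "Not announced."},
    "art-1": {"type": "Article", "status": "published",
              "title": "Intro to X", "summary": "An overview."},
}
META_KEYS = ("type", "status")  # not searched as content


@dataclass
class AgentSession:
    agent_id: str
    allowed_types: frozenset            # least-privilege scope per agent
    audit_log: list = field(default_factory=list)

    def _visible(self, entity: dict) -> bool:
        return (entity["status"] == "published"
                and entity["type"] in self.allowed_types)

    def list_types(self) -> list:
        self.audit_log.append((self.agent_id, "list_types"))
        return sorted(self.allowed_types)

    def get_entity(self, entity_id: str) -> Optional[dict]:
        self.audit_log.append((self.agent_id, f"get_entity:{entity_id}"))
        entity = CONTENT.get(entity_id)
        return entity if entity and self._visible(entity) else None

    def search_content(self, query: str,
                       content_type: Optional[str] = None) -> list:
        self.audit_log.append((self.agent_id, f"search_content:{query}"))
        q = query.lower()
        return [
            eid for eid, e in CONTENT.items()
            if self._visible(e)
            and (content_type is None or e["type"] == content_type)
            and any(q in str(v).lower()
                    for k, v in e.items() if k not in META_KEYS)
        ]
```

Note that drafts and out-of-scope types return nothing rather than an error, so an agent cannot probe for what exists outside its scope.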
6 · Optimize

Phase 3 · Scale, ongoing
Timebox · light here · see Measurement run book
Lead: Consultant + client
Objective

Close the loop: measure whether the engineered system is actually performing, and feed the gaps back into Model and Structure. Optimize is its own run book — see the Measurement run book for the full scorecard, sampling method and cadence. Here we only flag the two signature AI-era outputs and point to where they're detailed.

The signature outputs (detailed in the Measurement run book)
  • AI Share of Voice — your brand's citations ÷ total category citations, sampled 30+ times per prompt across the engines that matter, because AI citations swing month to month. Where AI doesn't yet know you becomes the next thing to model.
  • The "Can You Tell?" test — the honest quality bar: put engineered content head-to-head with human-written and see if a panel can spot the difference. Near a 50% guess rate, it's cleared the bar; if they can tell, the giveaways feed straight back to Structure and Model.
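Both outputs reduce to simple arithmetic, sketched below under stated assumptions. `share_of_voice` assumes each sample is the list of brands cited in one AI answer (the 30+ samples per prompt mentioned above); `guess_rate_p_value` is one possible way to judge "near a 50% guess rate", using an exact two-sided binomial test against a coin flip. Both function names, the sample format and the choice of test are assumptions rather than the Measurement run book's method.

```python
from math import comb


def share_of_voice(samples: list[list[str]], brand: str) -> float:
    """Brand citations divided by all category citations across sampled answers.

    Each sample is the list of brands cited in one AI answer to the prompt.
    """
    total = sum(len(cited) for cited in samples)
    ours = sum(cited.count(brand) for cited in samples)
    return ours / total if total else 0.0


def guess_rate_p_value(correct: int, trials: int) -> float:
    """Two-sided exact binomial p-value against a 50% coin-flip guess rate.

    A small value means the panel can reliably tell the content apart;
    a large value is consistent with guessing.
    """
    k = max(correct, trials - correct)   # the 50% null is symmetric
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)
```

For example, a panel that picks correctly in 5 of 10 trials gives a p-value of 1.0 (indistinguishable from guessing), while 10 of 10 gives about 0.002, which says the giveaways are real and worth feeding back.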
Inputs → outputs
Inputs
  • Live, delivered content system
  • Analytics + AI citation sampling
Outputs
  • Measurement loop feeding back to Stages 2–3
  • (Full scorecard → Measurement run book)
Roles & effort

RACI & effort summary

Who does what across the six stages. R Responsible · A Accountable · C Consulted · I Informed. The model stage is where the architect leads and most of the value concentrates.

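Stage | Sponsor | Content lead | Modular architect | Devs / Martech | Consultant
1 · Plan | I | R | C | I | A
2 · Model | I | C | R | C | A
3 · Structure / Author | I | C | A | R | C
4 · Govern | A | C | C | C | R
5 · Deliver | I | I | C | R | A
6 · Optimize | C | R | C | I | A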

Month | Focus | Stages in flight
Month 1 | Plan & conceptual modelling: editorial plan, gaps, briefs, first content types | 1 · 2
Month 2 | Model design: full content model spec, taxonomy & metadata schema; begin structuring | 2 · 3
Month 3 | Model implementation & knowledge graph; component spec & semantic authoring standards | 2 · 3 · 4
Month 4 | Migrate a structured slice; wire AI drafting/tagging; begin delivery architecture | 3 · 4 · 5
Month 5 | API-first delivery live; stand up MCP server with governed access | 5 · 4
Month 6 | Omnichannel verification; hand into Scale & the measurement loop | 5 · 6
The same six stages, split by People vs Process

A second read on the pipeline: the People lane is the human craft each stage demands; the Process lane is the system / artifact it produces.

Stage | People | Process
1 Plan | Strategist runs the workshop | Editorial plan + governed prompts
2 Model | Architect leads modelling | Model spec + taxonomy + graph
3 Structure | Engineer + authors | Components + semantic HTML
4 Govern | Consultant + owner | Hooks + oversight modes
5 Deliver | Devs + martech | API-first + MCP server
6 Optimize | Consultant + client | Feedback loop → Stages 2–3
Templates & worksheets

The artifacts you use and leave behind

Three core templates are spelled out below — the content model spec, the taxonomy & metadata schema, and the delivery / API & channel spec. The full set produced in this part is indexed at the end.

Template 1 · Content model spec (Stage 2)

One row per content type — the blueprint

Content type | Core fields (type) | Relationships | Metadata / taxonomy | Reuse
Article | title (text), summary (text), body (rich/modular), author (ref), hero (ref:image) | → author, → topic, → related articles | topic, audience, funnel stage, date | body chunks single-sourced
Product | name (text), spec (structured), price (number), description (modular) | → category, → related products, → docs | category, use case, region | spec reused across channels
FAQ / Q&A | question (text), answer (rich), entity (ref) | → product, → topic, → article | topic, intent, last-reviewed | answer = an AI answer unit
Author / person | name (text), bio (text), credentials (list) | → articles, → topics of expertise | expertise area | referenced, never copied

Rule of thumb: if two proposed types share >80% of their fields, collapse them into one type with a variant flag. Capture required vs optional per field. Aim for the smallest set of types that covers the strategy — the team has to hold them in their head.
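The rule of thumb is easy to automate during modelling reviews. The sketch below measures overlap against the larger of the two types; that denominator is an assumption, since the rule does not say which type the 80% is relative to, and `shared_field_ratio` and `should_collapse` are hypothetical names.

```python
def shared_field_ratio(a: set[str], b: set[str]) -> float:
    """Fraction of the larger type's fields that also appear in the other type."""
    if not a or not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))


def should_collapse(a: set[str], b: set[str], threshold: float = 0.8) -> bool:
    """True when two proposed types overlap enough to be one type with a variant flag."""
    return shared_field_ratio(a, b) > threshold


# Field names taken from the Article and Product rows above;
# "News item" is a hypothetical extra type that adds one field to Article.
article = {"title", "summary", "body", "author", "hero"}
news_item = article | {"dateline"}
product = {"name", "spec", "price", "description"}
```

Here `should_collapse(article, news_item)` is true (5 of 6 fields shared, about 83%), so the news item becomes an Article variant, while Article and Product share nothing and stay separate.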

Template 2 · Taxonomy & metadata schema (Stage 2)

Controlled vocabularies — entities, not strings

  • Topic taxonomy — the controlled subject tree (SKOS-style: broader/narrower/related), one term per concept, with synonyms mapped.
  • Audience / persona — the fixed list of who content is for; no free-text variants.
  • Funnel / journey stage — awareness → consideration → decision → retention (or the client's equivalent).
  • Content format — the rendered shape (guide, FAQ, comparison, case study…), distinct from content type.
  • Lifecycle metadata — owner, created, last-reviewed, review cadence, status, provenance (human/AI-assisted + approver).
  • Entity references — products, people, locations as linked entities (RDF/OWL where a graph is warranted), so AI gets relationships not just keywords.

Every content type in Template 1 must declare which of these it carries, and which are required. Apply consistently across the whole estate — inconsistent tagging is the gap that defeats retrieval.
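Consistent tagging is easier to enforce when the vocabularies are data that the CMS can check against. A minimal sketch, assuming a SKOS-style concept with a preferred label, synonyms and a broader term; the concept ids, labels, hierarchy and audience list below are invented for illustration, and `Concept`, `resolve_topic` and `tagging_problems` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One controlled term, SKOS-style: a preferred label plus relations."""
    pref_label: str
    alt_labels: set = field(default_factory=set)   # synonyms, lower-case
    broader: Optional[str] = None                  # id of the parent concept


# Hypothetical vocabularies; real ones come from the taxonomy workshops.
TOPICS = {
    "content-modelling": Concept("Content modelling",
                                 {"content modeling", "content model"}),
    "taxonomy": Concept("Taxonomy", {"controlled vocabulary"},
                        broader="content-modelling"),
}
AUDIENCES = {"practitioner", "executive"}   # fixed list, no free-text variants


def resolve_topic(label: str) -> Optional[str]:
    """Map a preferred label or synonym to its concept id; None if uncontrolled."""
    wanted = label.strip().lower()
    for concept_id, concept in TOPICS.items():
        if wanted == concept.pref_label.lower() or wanted in concept.alt_labels:
            return concept_id
    return None


def tagging_problems(metadata: dict) -> list:
    """List the ways an entry's metadata breaks the controlled vocabularies."""
    problems = []
    if resolve_topic(metadata.get("topic", "")) is None:
        problems.append("topic is missing or not in the controlled vocabulary")
    if metadata.get("audience") not in AUDIENCES:
        problems.append("audience is missing or not in the fixed list")
    return problems
```

Resolving synonyms to a single concept id is what turns "strings into things": two authors who type different labels still land on the same term.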

Template 3 · Delivery / API & channel spec (Stage 5)

How content reaches every channel — and every agent

Channel / target | Delivery method | What it pulls | Governance
Website / DXP | Headless API (REST/GraphQL) | Assembled components by type + taxonomy | Published-status only
Mobile / app | Same API, different assembly | Same source, channel-shaped variant | Published-status only
AI agents | MCP server | Schema + scoped retrieval over the model | Least-privilege, scoped identity, audit log
Localisation | API + translation memory | Single-source content → reuse 30–50% savings | Locale approver per market

One source of truth, many delivery shapes. The MCP row is the 2026 addition: agents are a delivery target like any channel — but with their own access-control and audit requirements.

Full template index for this part
Editorial plan — what to make this period, and why
Content-gap list — prioritised gaps from the agent audit
Content brief — the structured brief per piece
Governed prompt library — RACE-structured, version-controlled prompts
Content model spec — types, fields, relationships (above)
Taxonomy & metadata schema — controlled vocabularies (above)
Schema / markup plan — schema.org on key types
Knowledge graph map — entities & relationships (where warranted)
Modular component spec — content type definitions for authoring
Semantic authoring standards — semantic HTML & reuse rules
Delivery / API & channel spec — channels + MCP (above)
Governance hooks checklist — what the build wires in (→ Part 4)
Done criteria

Entry & exit gates

The quality bar that says this part is genuinely ready to start, and genuinely finished. The exit gate is deliberately strict — a half-built model is worse than none.

The path to done — a vertical recap

A third view of the pipeline: stages stacked as a journey, with the leverage marker on the stage that earns the budget.

1 · Plan
Month 1 · ~2–3 weeks · strategist + consultant
Agent-assisted ideation, content-gap audits and governed briefs. Runs hand-in-hand with Model's first pass.
Gate → editorial plan + prompt library
2 · Model
Months 1–3 · ~6–10 weeks · modular architect
Content model, taxonomy, metadata, schema and the knowledge graph. Pages become structured, machine-readable data.
★ Highest leverage: spend the budget here
Gate → model spec complete
3 · Structure / Author
Months 2–4 · ~4–6 weeks · content engineer
Modular, componentised content and semantic HTML; AI drafting and auto-tagging on top, with human review.
Gate → slice migrated, pattern proven
4 · Govern
Threaded through Build · consultant + owner
What governance touches the build — oversight modes, provenance, the prompt library. Full detail in Part 4.
Gate → hooks wired in
5 · Deliver
Months 4–6 · ~4–6 weeks · engineer + devs
Headless / composable / MACH, API-first delivery, and agents as a delivery target via an MCP server.
Gate → content addressable as data
6 · Optimize
Phase 3 · Scale · ongoing · consultant + client
The measurement loop — AI Share of Voice and the "Can You Tell?" test feed gaps back into Model and Structure.
Gate → loop handed to Measurement
Before you start (entry)
  • Assess complete — entry point & roadmap phase agreed
  • Phase 2 scope & budget signed off; team named (architect, taxonomist, devs)
  • Repository / CMS direction chosen (or scoped to decide in Stage 2)
  • Editorial strategy clear enough to model against
Before you finish (exit)
  • Content model spec + taxonomy & metadata schema complete and implemented
  • Schema/semantic markup in place; knowledge graph where warranted
  • A representative slice authored modularly & migrated — pattern proven
  • Content addressable as data via API; MCP server with governed access where in scope
  • Governance hooks wired in (oversight modes, provenance, prompt library)
  • Measurement loop handed to the Optimize / Measurement run book
Run Book · Part 2 · The Pipeline (v0.1) · the core build of the engagement: Plan → Model → Structure → Govern → Deliver → Optimize. Governance detail in Part 4; measurement detail in the Measurement run book.