White Paper

Modular Knowledge Architecture:
A Passage-Selection Doctrine for AI Retrieval

How to engineer web content whose passages are selected, trusted, and reused during AI answer synthesis. A strategic and operational analysis for advanced practitioners.

Kurt Fischman / Growth Marshal / March 2026
Section 01

Executive Analysis

The core innovation of MKA is a reframe. It does not ask how to write pages that AI systems will cite. It asks a harder, more consequential question: how do you engineer passages whose upstream determinants make citation a likely downstream artifact?

That distinction matters because it relocates the optimization target. Most AI search optimization methodologies focus on surface formatting: FAQ blocks, numbered lists, schema markup, keyword placement. MKA treats those as second-order effects. The first-order concern is whether a passage contains enough informational density, local coherence, and evidentiary support to survive the retrieval pipeline on merit.

Core Thesis

Citation is not the goal. It is the receipt. Selection-worthiness, trust, and synthesis fitness are the upstream forces that produce it. Engineer for those, and citation follows.

This paper provides a strategic, operational, and tactical analysis of MKA for practitioners who already understand the landscape. It is not an introduction to AI search. It is a deep reading of the doctrine's architecture, its operational logic, and the places where it breaks new ground relative to prevailing practice.

Section 02

The Paradigm Shift MKA Codifies

Traditional SEO optimized pages as atomic units. A page ranked or it didn't. You earned a position in a list of ten blue links, and the user arrived at your front door.

AI retrieval systems operate differently. They decompose pages into passages, score those passages independently, and reassemble answers from the strongest fragments they find across many sources. Your page is no longer a destination. It is a quarry. And the retrieval system is mining it for parts.

MKA codifies this shift into a working doctrine. Its central claim is that the webpage is a modular knowledge system, not a linear essay. Each important section is a potential retrieval object that must function independently: locally meaningful, locally supported, and locally complete enough to survive being torn from context and inserted into a synthesized answer.

What this means operationally

When a retrieval system encounters your content, it is not reading your page the way a human does. It is not absorbing your narrative arc or following your argumentative flow. It is scanning for passages that answer a query well enough to be worth selecting over competing passages from other sources. If your Section 4 only makes sense because of setup in Section 2, it will lose to a competitor's Section 4 that stands on its own.

This is not a hypothetical concern. It is how modern retrieval-augmented generation works. And it demands a fundamental rethinking of how content is structured, scoped, and supported at the section level.
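The passage-level competition described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not a description of any particular retrieval system: passages are split at blank lines, the scoring function is a toy token-overlap measure standing in for whatever embedding similarity or learned ranker a real pipeline uses, and the page contents, `site-a`, `site-b`, and "Acme CRM" are all made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str   # hypothetical page identifier
    heading: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for real text processing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_into_passages(source: str, page: str) -> list[Passage]:
    """Split a plain-text page at blank lines; the first line of each block is its heading."""
    passages = []
    for block in re.split(r"\n\s*\n", page.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue  # a heading with no body is not a usable passage
        passages.append(Passage(source, lines[0], " ".join(lines[1:])))
    return passages


def score(query: str, passage: Passage) -> float:
    """Toy relevance: the fraction of query tokens found in the passage body alone."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(passage.text)) / len(query_tokens)


def select_passages(query: str, pages: dict[str, str], k: int = 2) -> list[Passage]:
    """Pool passages from every page and keep the k best, regardless of source."""
    candidates = [p for src, page in pages.items() for p in split_into_passages(src, page)]
    return sorted(candidates, key=lambda p: score(query, p), reverse=True)[:k]


pages = {
    "site-a": "Pricing\nIt costs less than the alternative described above.\n\n"
              "Setup\nSee the previous section for details.",
    "site-b": "CRM pricing for startups\n"
              "Acme CRM pricing for startups begins with a free tier for up to five users.",
}
best = select_passages("CRM pricing for startups", pages, k=1)[0]
print(best.source, "->", best.text)
```

In this toy run the `site-a` pricing passage loses because its body names nothing on its own: once it is separated from its page, "It" and "the alternative described above" carry no signal. That is the Section 4 versus Section 2 dependency problem in miniature.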

Strategic Insight

The unit of competition has shifted from the page to the passage. Content strategy that optimizes at the page level while neglecting passage-level independence is fighting the last war.

Section 03

The 5-Layer Hierarchy: Strategic Anatomy

MKA organizes its framework into five layers that operate as a strict hierarchy. This is not a buffet. It is a cascade. Each layer gates the value of the one below it. A passage that fails at Layer 1 cannot be rescued by excellence at Layer 3.

Layer 1: Selection-Worthiness
The passage contains information that is directly useful, specific, distinctive, and evidence-backed. Without this, nothing downstream matters.
Layer 2: Chunk Independence
The passage survives isolation. It names its subject, delivers its payload, and carries enough context to remain meaningful when extracted from the page.
Layer 3: Extraction Resilience
The passage is technically accessible. Clean HTML, semantic markup, no critical content hidden behind JavaScript execution or accordion gates.
Layer 4: Synthesis Friendliness
The passage is safe to reuse. Explicit claims, bounded scope, low referential ambiguity, evidence near the assertion. A model can quote it without guessing.
Layer 5: Human Persuasion
The passage reads well, converts readers, and maintains trust. This layer is non-negotiable but comes last in the engineering sequence.

Why the ordering matters

Most content teams instinctively start at Layer 5 and work backwards. They write for humans first, then bolt on "AI optimization" as a formatting pass. MKA inverts this. It begins with the question of whether the passage deserves to be selected at all, then engineers it for independence, extractability, and synthesis safety before polishing the human-facing layer.

This is not a demotion of human readability. It is a recognition that in an AI-mediated discovery landscape, the passage that never gets retrieved never gets read by a human either. The hierarchy ensures that retrieval fitness is baked in structurally, not sprinkled on cosmetically.

The most important insight in this hierarchy lives at the junction between Layer 1 and Layer 2. A beautifully formatted, perfectly extractable passage that contains generic information is what the doctrine calls retrieval wallpaper. Structure cannot compensate for thin substance. This is MKA's sharpest departure from conventional AI optimization guidance, which tends to overindex on formatting mechanics.
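One way to picture the cascade is to cap each layer's score at the weakest score above it. The sketch below is an interpretation of "each layer gates the value of the one below it", not a formula the doctrine states: the running-minimum rule, the layer key names, and the 1-to-5 scale borrowed from the evaluation rubric are all assumptions made for illustration.

```python
from itertools import accumulate

# Layer order follows the hierarchy, top gate first. Key names are hypothetical.
LAYERS = (
    "selection_worthiness",
    "chunk_independence",
    "extraction_resilience",
    "synthesis_friendliness",
    "human_persuasion",
)


def gated_scores(scores: dict[str, int]) -> dict[str, int]:
    """Cap every layer at the lowest score found at or above it (a running minimum)."""
    ordered = [scores[layer] for layer in LAYERS]
    capped = accumulate(ordered, min)
    return dict(zip(LAYERS, capped))


# "Retrieval wallpaper": flawless structure wrapped around generic substance.
wallpaper = {
    "selection_worthiness": 1,
    "chunk_independence": 5,
    "extraction_resilience": 5,
    "synthesis_friendliness": 5,
    "human_persuasion": 5,
}
print(gated_scores(wallpaper))
```

Under this reading, the wallpaper passage is capped at 1 on every layer: strength lower in the hierarchy cannot lift a passage past a failure at the top.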

Section 04

The Content Unit Model

MKA defines an ideal anatomy for a high-utility content section. This is not a rigid template. It is a design scaffold: six elements that, when present, maximize a passage's chance of being selected, trusted, and reused.

Concept
What is being discussed? Named explicitly, not pronominally.
Answer
The direct claim or resolution. Positioned early, not buried after scene-setting.
Mechanism
How or why. The explanatory layer that transforms assertion into understanding.
Scope
Where it applies, where it does not. Boundaries that make the claim safe to reuse without distortion.
Distinction
How this differs from alternatives or adjacent concepts. The competitive signal for retrieval ranking.
Support
Evidence, example, comparison, data, source cue, or artifact. Locally placed, not deferred to an appendix.

The doctrine extends this into a canonical claim package that adds exception handling, comparison, and attribution to the core anatomy. The full package (claim, scope, evidence, mechanism, exception, comparison, attribution) represents the maximum useful density for a single passage. Not every section needs all seven. But the more you include without bloating the section, the more defensible the passage becomes against competing content.

The strategic function of this model

This unit model solves a problem that plagues most content operations: the absence of a shared definition of "good enough." When a writer or editor asks whether a section is done, the content unit model provides six testable criteria. Does it name the concept? Does it deliver the answer early? Does it explain the mechanism? Does it scope the claim? Does it distinguish from alternatives? Does it provide local support?

If the answer to any of those is no, the section has a specific, nameable gap. That transforms content quality from a subjective judgment call into an auditable standard.
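The six questions lend themselves to a simple audit record. The sketch below assumes a human editor answers each question; the code only reports which elements are missing. The class and field names are hypothetical, chosen to mirror the six elements of the content unit model.

```python
from dataclasses import dataclass, fields


@dataclass
class ContentUnitAudit:
    """An editor's yes/no answers to the six content unit questions."""
    names_concept: bool
    answers_early: bool
    explains_mechanism: bool
    scopes_claim: bool
    distinguishes_alternatives: bool
    provides_local_support: bool


def gaps(audit: ContentUnitAudit) -> list[str]:
    """Return the names of the elements the section is missing, in model order."""
    return [f.name for f in fields(audit) if not getattr(audit, f.name)]


draft = ContentUnitAudit(
    names_concept=True,
    answers_early=True,
    explains_mechanism=False,
    scopes_claim=True,
    distinguishes_alternatives=True,
    provides_local_support=False,
)
print(gaps(draft))
```

An empty list means the section meets all six criteria. Anything else is the editor's to-do list for that section.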

Section 05

The 12 Doctrine Principles: Operational Commentary

The doctrine's twelve principles encode its philosophy into operational rules. Several of these are straightforward. The six examined below (Principles 1, 2, 4, 5, 6, and 11) deserve closer scrutiny because they represent genuine departures from standard practice.

Principle 1: Optimize for passage selection, not page admiration

This principle forces a perspective shift. Most content teams evaluate pages holistically: does it flow? Is the narrative compelling? Does the CTA land? MKA says that the correct evaluation unit is the individual passage. A page can be beautifully composed at the macro level and still produce weak retrieval candidates at the section level. The optimization question changes from "Is this a good page?" to "Would a retrieval system have a reason to prefer this passage over ten alternatives?"

Principle 2: Information gain beats formatting neatness

This is the principle most likely to create friction in execution. Content teams and SEO practitioners have spent years internalizing the importance of formatting: proper heading hierarchy, bullet lists, FAQ markup. MKA does not dismiss these. It subordinates them. A section with a crisp definition, a meaningful example, and a fair comparison is stronger than a perfectly formatted paragraph that says nothing distinctive. Formatting is a carrier wave. Information gain is the signal.

Principle 4: Boundaries increase trust

This principle is underappreciated in practice. Most content errs toward expansive claims because they feel more authoritative. MKA argues the opposite: a passage that explicitly states where a concept does not apply, when a method breaks down, or which audiences it serves poorly becomes more trustworthy and more reusable in synthesis. Scoped truth is more useful than unscoped assertion, both to a model selecting passages and to a human evaluating advice.

Principle 5: Evidence should live near the claim

This principle directly challenges the common web content pattern of making claims in the body copy and deferring all evidence to a "Sources" footer or an "About" page. MKA argues that trust in a retrieval context is local. When a passage is excerpted, it carries only what is within its own boundaries. A claim that relies on site-wide reputation signals for credibility becomes an unsupported assertion the moment it leaves the page. Placing evidence, method notes, source cues, or named examples within the section itself makes the passage self-credentialing.

Principle 6: Query-family coverage over single-keyword targeting

Retrieval systems routinely rewrite and expand queries. A user who types "best CRM for startups" may trigger retrieval across definitional, comparison, implementation, and limitation query types. MKA argues that a page should be designed to answer a family of adjacent queries, not just one head term. This has direct architectural implications: it means building section-level coverage for definitional, comparative, operational, and edge-case angles on the same core topic.

Principle 11: Distinctiveness drives selection

This is perhaps the hardest principle to execute consistently. Most web content on any given topic converges toward the same set of talking points, often because it is produced by referencing the same source material. MKA's distinctiveness protocol asks: what makes this passage more reusable than the next ten competing passages? The doctrine identifies the levers: sharper definition, explicit mechanism, better scoping, stronger examples, fairer comparison, more honest treatment of limitations. If a section could appear on a thousand generic blogs without detection, it has failed this test.

Section 06

Query-Family Architecture

One of MKA's most operationally significant contributions is the query-family mapping protocol. Rather than designing a page around a single target query, the doctrine requires practitioners to map content against a family of plausible retrieval prompts that a user might issue in relation to the core topic.

The minimum viable query family

For any primary topic, the doctrine defines eight query variants to cover: the primary query itself, a definitional variant, a comparison variant, an implementation variant, an examples variant, an objection or legitimacy variant, a limitation or edge-case variant, and an audience-fit variant. Each variant represents a distinct user intent that a retrieval system might serve, and each requires at least one section on the page that directly answers it.
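A coverage check for the eight variants can be sketched as a lookup from variant to the sections that answer it. The variant labels below paraphrase the list above; the function name, the mapping format, and the sample headings are assumptions made for illustration.

```python
# The eight variants of the minimum viable query family, in the order listed above.
QUERY_VARIANTS = (
    "primary",
    "definitional",
    "comparison",
    "implementation",
    "examples",
    "objection",
    "limitation",
    "audience_fit",
)


def coverage_gaps(section_map: dict[str, list[str]]) -> list[str]:
    """Return the variants that no section on the page directly answers.

    section_map maps a variant label to the headings of the sections covering it.
    """
    unknown = set(section_map) - set(QUERY_VARIANTS)
    if unknown:
        raise ValueError(f"unknown query variants: {sorted(unknown)}")
    return [variant for variant in QUERY_VARIANTS if not section_map.get(variant)]


# Hypothetical page map produced during an audit.
page_map = {
    "primary": ["Best CRM for startups"],
    "definitional": ["What a startup CRM is"],
    "comparison": ["Startup CRM vs. spreadsheet tracking"],
    "implementation": [],
    "examples": ["Three rollout examples"],
}
print(coverage_gaps(page_map))
```

Each variant returned is a retrieval pathway the page currently does not participate in.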

Operational Insight

Query-family mapping transforms page architecture from "what do we want to say?" into "what will retrieval systems need to answer?" This inversion is the difference between publishing content and engineering retrieval surfaces.

Semantic entry points as competitive advantage

The strategic payoff of query-family coverage scales with the number of entry points. A page with a single semantic entry point participates in one retrieval pathway. A page with eight entry points, each backed by a self-contained section, participates in eight. In a retrieval landscape where the AI system selects the best available passage across the entire web, each additional high-quality entry point gives the page another chance to be selected.

This is also where MKA intersects most directly with content strategy at the portfolio level. When you combine query-family mapping with the content unit model, you get a systematic method for auditing existing pages, identifying coverage gaps, and prioritizing section-level improvements that have direct retrieval impact.

Section 07

Anti-Patterns and Failure Modes

The doctrine's anti-pattern catalog is arguably as valuable as its positive prescriptions. These failure modes are not hypothetical. They are the dominant pathologies of current AI search optimization practice.

Retrieval Cosplay
Formatting pages to look AI-optimized while the underlying information remains generic. The SEO equivalent of a bodybuilder who only trains mirror muscles.
Ritualized Numbers
Injecting statistics into sections that do not benefit from quantification. Numbers should discriminate, not decorate.
Performative Nuance
Adding caveats to appear balanced without increasing truth or clarity. Hedge words as credibility theater.
Pronoun Fog
Writing sections whose referents collapse when excerpted. "It" and "this" become meaningless in isolation.
Decorative Structure
Using visual styling to simulate tables, lists, or definitions without machine-readable markup underneath.
Chrome Pollution
Allowing CTAs, navigation noise, and template furniture to crowd the knowledge surface that extractors need to reach.
Unsupported Assertion
Making important claims without local evidence where support is expected. Authority aura is not evidence.
Generic Competitor Sludge
Publishing content that says the same thing as every other page with slightly different wording. The retrieval landscape's background radiation.
Overpacked Chunks
Stuffing too many distinct ideas into one section so the semantic payload becomes muddy and hard to rank.
Robot Prose
Over-optimizing toward formulaic machine bait at the expense of human readability. The content reads like it was written for a parser, not a person.

The most insidious of these is Retrieval Cosplay because it is the easiest to mistake for real optimization. A page can have perfect heading hierarchy, clean semantic markup, FAQ schema, and definition-first openings while containing absolutely nothing distinctive. The formatting looks like it should work. But the retrieval pipeline grades on substance, not ceremony.
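Some anti-patterns can be screened mechanically before a human review. Pronoun Fog is the easiest candidate. The sketch below is a deliberately crude heuristic, assuming that a section whose first word is a pronoun or bare demonstrative probably leans on earlier context. The word list and function name are assumptions, and a flag is a prompt for an editor to look, not a verdict.

```python
import re

# Openers that usually point back at something outside the section (assumed list).
AMBIGUOUS_OPENERS = {"it", "this", "that", "these", "those", "they", "such"}


def opens_with_pronoun_fog(section_text: str) -> bool:
    """True when the first word of the section is an ambiguous referent."""
    first_word = re.search(r"[A-Za-z']+", section_text)
    return bool(first_word) and first_word.group(0).lower() in AMBIGUOUS_OPENERS


foggy = "It reduces churn because onboarding happens in the first week."
clear = "Guided onboarding reduces churn because setup happens in the first week."
print(opens_with_pronoun_fog(foggy), opens_with_pronoun_fog(clear))
```

The check only inspects the opening word, so it will miss fog later in a section and will flag openers whose referent is in fact named immediately afterwards. It is a triage aid for the mental-excerpt test, not a replacement for it.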

Section 08

Guardrails Against Doctrinal Drift

One of MKA's most mature design decisions is the inclusion of explicit guardrails. Frameworks have a natural tendency to decay into rituals. Practitioners learn the rules, then apply them mechanically, then cargo-cult the mechanics while losing the intent. The doctrine anticipates this decay pattern and builds countermeasures directly into its structure.

No formatting without informational benefit

Tables, FAQs, glossaries, and structured blocks should only exist when they increase usefulness. The guardrail prevents the accumulation of empty structure: sections that look like they're doing something useful but carry no informational payload.

No fake precision

Manufacturing numerical specificity where none exists damages credibility with both humans and models. One honest range is worth more than a fabricated exact figure. This guardrail directly counters the "ritualized numbers" anti-pattern.

No mandatory caveat theater

Caveats should sharpen truth, not decorate it. The doctrine draws a clear line between scoping a claim honestly (which increases trust and synthesis safety) and padding a claim with hedge language (which decreases clarity and signals low confidence).

No optimization that harms readability

This guardrail preserves the dual-competence requirement. The doctrine never allows retrieval optimization to override human usefulness. Robot prose that scores well on structural metrics but alienates human readers violates MKA as fundamentally as thin content does.

Watch For This

The strongest signal that MKA is being applied ritualistically rather than strategically: every section on a page has the same structural shape. The doctrine is principled and tactical. It is not a cookie cutter. Sections should be shaped by their content, not by a template.

Section 09

Implementation Framework for Practitioners

Translating MKA from doctrine into practice requires working at three levels simultaneously: page architecture, section engineering, and extraction hygiene. Each level has distinct concerns and distinct execution patterns.

Page-level architecture

Start with the core entity or concept the page addresses. Map the relevant query families. Design major sections as retrieval objects, each covering a distinct angle from the query-family map. Ensure the page provides multiple semantic entry points. Audit for chrome pollution and CTA clutter that might degrade extraction quality.

Section-level engineering

For each important section, apply the content unit model as an audit tool. Name the concept explicitly. Deliver the answer early. Explain the mechanism. Scope the claim. Distinguish from alternatives. Provide local support. Test the section by mentally excerpting it: does it still make sense, carry its own credibility, and deliver utility in isolation? If not, identify which element is missing and add it.

Extraction hygiene

Ensure core answers exist in parsable HTML. Use semantic markup that reflects actual content structure: real tables for tabular data, real lists for enumerations, logical heading hierarchy. Minimize content hidden behind JavaScript execution. Reduce boilerplate and template furniture around knowledge sections. Confirm that important definitions, comparisons, and evidence are in the page's accessible text, not locked inside images or fragile rendering patterns.
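Two of these hygiene checks can be sketched with Python's standard-library `html.parser`: whether key phrases exist in the static HTML text rather than only inside scripts, and whether the heading hierarchy skips levels. The function name, the report format, and the sample markup are assumptions for illustration. The sketch inspects raw HTML only and does not execute JavaScript, which is the point: it approximates what a non-rendering extractor would see.

```python
from html.parser import HTMLParser


class HygieneParser(HTMLParser):
    """Collects heading levels and visible text, ignoring script and style bodies."""

    def __init__(self) -> None:
        super().__init__()
        self.heading_levels: list[int] = []
        self.text_parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def audit_html(html: str, must_contain: list[str]) -> dict:
    """Report heading-level skips and required phrases absent from the static text."""
    parser = HygieneParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts).lower()
    levels = parser.heading_levels
    skips = [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]
    missing = [phrase for phrase in must_contain if phrase.lower() not in text]
    return {"heading_skips": skips, "missing_phrases": missing}


sample = (
    "<h1>CRM guide</h1><h3>Pricing</h3>"
    "<p>Plans start with a free tier.</p>"
    "<script>render('Enterprise pricing table')</script>"
)
report = audit_html(sample, ["free tier", "Enterprise pricing table"])
print(report)
```

In the sample, the jump from `h1` to `h3` is reported as a skip, and the phrase that exists only inside the script is reported as missing from the accessible text.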

The revision workflow

For existing content, the most effective approach is a section-by-section audit using the evaluation rubric from Section 10 of this paper. Score each important section across the five rubric axes. Any section that scores below 3 on Selection (Selection-Worthiness) or Independence (Chunk Independence) should be flagged for rewriting first. This prioritization ensures that effort goes where the retrieval impact is highest.

Section 10

Evaluation and Scoring

The doctrine provides a five-axis evaluation rubric that can be applied to any section. For each axis, a score from 1 (weak) to 5 (elite) assesses the section's performance.

A. Selection
Does the section contain information more useful than generic alternatives? Is there original framing, evidence, or operational detail? Would a retriever prefer this over competitors?
B. Independence
Can the section stand alone without prior context? Is the entity named locally? Are all references unambiguous when the section is isolated?
C. Extraction
Does the core answer exist in clean HTML? Is the section free from chrome noise? Does markup reflect real structure?
D. Synthesis
Can a model reuse the passage without guessing scope? Are claim and mechanism explicit? Are boundaries and exceptions present where needed?
E. Readability
Is the section readable and persuasive to a human? Does it avoid robotic repetition? Would a human still trust it?

The doctrine's decision rule: any section that scores below 3 on axis A (Selection, the rubric's measure of Selection-Worthiness) or axis B (Independence, its measure of Chunk Independence) should usually be rewritten. These two axes are the load-bearing gates. A passage that fails either one is unlikely to be retrieved regardless of its performance on the other three.
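The decision rule is simple enough to encode directly. The sketch below assumes scores are recorded per section as integers from 1 to 5 on the five axes. The axis keys, function names, and sample sections are hypothetical, and the ordering of the rewrite queue (lowest gate score first) is an assumption rather than something the doctrine prescribes.

```python
AXES = ("selection", "independence", "extraction", "synthesis", "readability")
GATE_AXES = ("selection", "independence")  # the two load-bearing gates


def failed_gates(scores: dict[str, int], threshold: int = 3) -> list[str]:
    """Return the gate axes on which a section scores below the threshold."""
    for axis in AXES:
        if not 1 <= scores[axis] <= 5:
            raise ValueError(f"{axis} score must be between 1 and 5")
    return [axis for axis in GATE_AXES if scores[axis] < threshold]


def rewrite_queue(audit: dict[str, dict[str, int]]) -> list[str]:
    """List the sections that fail a gate, weakest gate score first."""
    flagged = [name for name, scores in audit.items() if failed_gates(scores)]
    return sorted(flagged, key=lambda name: min(audit[name][a] for a in GATE_AXES))


# Hypothetical audit of three sections on one page.
audit = {
    "Pricing": dict(zip(AXES, (2, 4, 5, 4, 4))),
    "Setup": dict(zip(AXES, (4, 1, 3, 3, 4))),
    "FAQ": dict(zip(AXES, (4, 4, 4, 4, 4))),
}
print(rewrite_queue(audit))
```

Because the doctrine says such sections should "usually" be rewritten, the queue is best treated as a prioritized list for editorial review, not an automatic verdict.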

Practical Application

Apply this rubric to your top-performing pages first. The gap between where your content actually scores and where you assumed it scored will likely be the most useful discovery in your first MKA audit.

Section 11

Closing Argument

MKA is not a formatting guide. It is not a checklist. It is an engineering doctrine for a new competitive environment where the unit of competition has shifted from the page to the passage, where retrieval systems select fragments rather than rank URLs, and where the content that wins is the content that deserves to win at the section level.

The doctrine's central move is to subordinate surface optimization to substance optimization. It does not dismiss formatting, markup, or structure. It insists that those things serve a purpose only when they carry genuine informational value. A perfectly structured empty section is still empty. A richly informative section with clean structure is a retrieval weapon.

For the advanced practitioner, MKA provides three things that most optimization frameworks do not. First, a hierarchy that prevents you from optimizing the wrong layer first. Second, a content unit model that defines "done" at the section level. Third, an explicit anti-pattern catalog and guardrail system that protect against the doctrine's own potential for ritual decay.

Make every important section worth selecting, easy to extract, safe to trust, and simple to reuse.

That motto is the doctrine in sixteen words. Everything else is engineering discipline in service of those four outcomes. The companies and practitioners who internalize this framework and apply it with rigor will build content assets that compound in value as AI retrieval systems become the dominant discovery mechanism for how buyers find, evaluate, and choose solutions.