Your Generic Schema is Useless: New Research on What it Really Takes to Get Cited by AI

Feb 22

Schema markup for AI citation is the practice of implementing JSON-LD structured data to increase the probability that AI platforms cite your page in generated answers. A 2026 empirical study of 730 AI citations across ChatGPT and Gemini found that generic schema (Article, Organization, BreadcrumbList) provides zero measurable citation advantage. Only attribute-rich schema (Product and Review types with populated pricing, ratings, and specifications) showed a significant effect, cited at 61.7% versus 41.6% for generic implementations. This report is for founders, CMOs, and marketing leaders who need to know where schema investment actually pays off for AI visibility.


✍️ Published: February 22, 2026 · 🧑‍💻 Last Updated: February 22, 2026 · By: Kurt Fischman, Founder @ Growth Marshal

Author’s note: We decided to ask a simple question: does schema markup actually get you cited by AI, or has the entire industry been repeating an untested assumption in an echo chamber? (And btw, there’s no bigger offender than us!) So we studied 730 real AI citations, analyzed over 1,000 pages, and found the answer is more uncomfortable than anyone expected.

Quick Facts

Primary Entity: Schema markup for AI citation
Category: Generative engine optimization (GEO) / Structured data strategy
Audience: Founders, CMOs, small business owners, marketing leaders
Time to Implement: 2 to 8 hours per page (attribute-rich); 15 minutes per page (generic CMS default)
Difficulty Level: Moderate (attribute-rich requires manual customization); Low (generic)
Key Alternatives: Content quality optimization, organic rank improvement, entity-graph architecture

Essential Insights

Schema markup for AI citation produces no measurable effect when implemented as generic CMS-default types, according to our cross-platform empirical study of 730 AI citations.
Attribute-rich schema with populated pricing, ratings, and specifications fields outperforms generic schema by 20 percentage points in AI citation rates.
Each one-position drop in Google organic rank reduces AI citation odds by approximately 24%, making retrieval rank the dominant predictor of which pages AI platforms cite.
Schema markup for AI citation delivers its largest advantage for lower-authority domains (DR 60 or below), where structured factual data partially compensates for weak authority signals.
The practitioner consensus that schema improves AI visibility originated through an LLM feedback loop in which AI platforms reproduced untested SEO recommendations from their training data.
Position-1 pages in Google's organic results receive AI citations in 43% of queries, declining to 5% at position 7, establishing a steep and consistent position gradient.
Schema markup for AI citation shows a null result for entity richness score (OR = 1.001, p = .833), indicating that scoring complexity alone does not influence AI citation decisions.
Fewer than 4% of schema-present pages in our study implemented sophisticated entity-linking techniques such as Wikidata sameAs identifiers.
AI citation behavior is not simply a restatement of Google's top-10 results: 63.5% of AI-cited pages did not appear in the organic top-10 for the query that surfaced them.
Schema markup for AI citation is most productively understood as an uncertainty reduction mechanism: attribute-rich schema gives AI systems verifiable facts that help overcome a confidence threshold for citation.

What is Schema Markup for AI Citation?

Schema markup for AI citation is a structured data strategy that embeds machine-readable JSON-LD metadata into web pages to help AI retrieval systems parse, classify, and cite content in generated answers. Schema built for AI citation differs from traditional schema optimization because the target system is not a search engine results page but rather a large language model generating a prose response with source attribution.

The practitioner consensus has treated schema markup as essential infrastructure for AI visibility, drawing an analogy to its established role in Google rich results eligibility. Agency frameworks, SEO publications, and AI visibility tools all score pages partly on schema implementation. The logic sounds airtight: schema reduces machine parsing uncertainty, therefore AI systems should prefer schema-bearing pages. The problem is that nobody bothered to check whether this was actually true until we (Growth Marshal) decided to examine 730 AI citations and found the answer is: mostly no.

However, the relationship between schema and AI citation is more nuanced than a binary yes-or-no. The type and informational density of schema implementation matters enormously. Generic schema types produced by default CMS plugins provide no detectable advantage. Attribute-rich schema with concrete factual payloads tells a different story entirely.

For example, a B2B SaaS company running standard Article schema on its blog pages would see zero measurable lift in AI citation rates. That same company adding detailed Product schema with populated pricing tiers, feature specifications, and aggregate ratings to its product pages would be operating in the implementation category that showed a statistically significant 20-percentage-point advantage over generic schema in our dataset.

How Does Schema Influence AI Citation Systems?

Schema influences AI citation systems through the retrieval-augmented generation (RAG) pipeline, the technical architecture through which platforms like ChatGPT and Gemini produce web-grounded answers. RAG operates in stages: a search backend retrieves candidate pages, the AI system extracts relevant information, resolves entities, and generates a cited response. Schema is theoretically relevant at the extraction and entity-resolution stages, where machine-readable field labels reduce the inferential burden on the AI system.

The critical architectural reality, documented by our study, is that the retrieval stage is mediated by a search backend whose ranking behavior operates independently of the AI platform itself. ChatGPT's web retrieval and Gemini's search grounding both rely on underlying search infrastructure that applies its own relevance and authority judgments before AI-level processing begins. The study found Google organic rank position predicted AI citation with an odds ratio of 0.762 per position (p < .001), meaning each rank drop reduces citation odds by approximately 24%.
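To make the odds-ratio arithmetic concrete, here is a minimal Python sketch. It takes the per-position odds ratio of 0.762 and the 43% position-1 citation rate from the article and assumes, purely for illustration, that the same odds ratio applies at every step down the rankings. The function names are hypothetical, and the extrapolated values illustrate how odds ratios compound; they are not the study's observed per-position rates.

```python
def prob_to_odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_to_prob(odds: float) -> float:
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)


def extrapolate_citation_prob(p_start: float, odds_ratio: float, positions_down: int) -> float:
    """Apply a constant per-position odds ratio `positions_down` times."""
    odds = prob_to_odds(p_start) * (odds_ratio ** positions_down)
    return odds_to_prob(odds)


OR_PER_POSITION = 0.762  # odds ratio per rank position, as reported in the article
P_POSITION_1 = 0.43      # position-1 citation rate, used here only as a starting point

# A one-position drop multiplies the odds by 0.762, i.e. cuts them by (1 - OR).
odds_reduction_pct = (1.0 - OR_PER_POSITION) * 100

if __name__ == "__main__":
    print(f"Odds reduction per position: {odds_reduction_pct:.1f}%")
    for position in range(1, 8):
        p = extrapolate_citation_prob(P_POSITION_1, OR_PER_POSITION, position - 1)
        print(f"position {position}: {p:.1%}")
```

An odds ratio acts on odds rather than on probability, so a 24% reduction in odds is not the same as a 24-point drop in citation rate. If the odds ratio comes from a model with other covariates, the extrapolated values also need not match raw per-position rates.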

The notable exception is the 63.5% of AI-cited pages in the study that did not appear in Google's top-10 results at all. This "breakthrough population" demonstrates that AI platforms exercise independent judgment beyond search ranking. Understanding what drives citation for these pages represents the central open question in generative engine optimization research.

Here is how this works in practice: When a user asks ChatGPT "best CRM for small businesses," the system first queries a search backend that returns ranked candidate pages. A page at position 1 has a 43% probability of being cited. A page at position 7 has roughly a 5% chance. The AI system then evaluates the retrieved pages for extractable answers. A Product schema block that explicitly labels pricing, features, and ratings gives the system structured facts it can verify, helping it clear the confidence threshold required for citation. An Article schema block that merely declares "this is an article" provides nothing a basic HTML parser would not already infer.

Attribute-Rich Schema vs. Generic Schema: What the Data Shows

Attribute-rich schema outperforms generic schema by a statistically significant margin in AI citation rates, according to our cross-platform study. Pages implementing Product or Review schema with populated concrete attributes (pricing, aggregateRating, specifications, availability) were cited at 61.7%, compared to 41.6% for pages with generic schema types like Article, Organization, or BreadcrumbList (p = .012).

The comparison structure reveals a counterintuitive pattern. Pages with no schema at all were cited at 59.8%, occupying an intermediate position. Attribute-rich schema slightly exceeds the no-schema baseline (61.7% vs. 59.8%, not statistically significant). Generic schema actually underperforms no schema at all (41.6% vs. 59.8%). Generic schema, in other words, carries a modest citation penalty relative to having no schema whatsoever.
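The three categories compared above could be operationalized roughly as follows. This is an illustrative Python sketch, not the study's actual coding procedure: the type set, the list of "concrete" fields, and the function name `classify_schema` are assumptions chosen to mirror the article's description. Under this sketch, a Product block with no populated attributes counts as generic.

```python
from typing import Any

# Assumed groupings, based on the types and fields the article names.
ATTRIBUTE_TYPES = {"Product", "Review"}
CONCRETE_FIELDS = {"offers", "aggregateRating", "additionalProperty", "reviewRating"}


def _is_populated(value: Any) -> bool:
    """Treat None, empty strings, and empty containers as unpopulated."""
    return value not in (None, "", [], {})


def classify_schema(blocks: list) -> str:
    """Return 'none', 'generic', or 'attribute-rich' for a page's JSON-LD blocks."""
    if not blocks:
        return "none"
    for block in blocks:
        types = block.get("@type", [])
        if isinstance(types, str):
            types = [types]  # @type may be a single string or a list
        has_type = any(t in ATTRIBUTE_TYPES for t in types)
        has_facts = any(_is_populated(block.get(field)) for field in CONCRETE_FIELDS)
        if has_type and has_facts:
            return "attribute-rich"
    return "generic"
```

For instance, a page whose only block is `{"@type": "Article", "headline": "..."}` would be labeled generic, while a Product block with a populated `offers` object would be labeled attribute-rich.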

Table 1. Schema Type vs. AI Citation Rate (Fischman, 2026)

Schema Implementation	AI Citation Rate	Statistical Significance	Best For
Attribute-Rich (Product, Review with pricing, ratings, specs)	61.7% ▲▲▲	p = .012 vs. generic	Lower-authority domains (DR < 60) needing citation tiebreakers
No Schema (no JSON-LD implemented)	59.8% ▬	Baseline	Pages where content quality and rank carry the signal
Generic (Article, Organization, BreadcrumbList)	41.6% ▼▼▼	Worst performer	Traditional rich results only; no AI citation benefit

Legend: ▲▲▲ Significant advantage · ▬ Baseline · ▼▼▼ Underperforms baseline

The attribute-rich advantage was most pronounced among lower-authority domains with an Ahrefs Domain Rating of 60 or below. Among these pages, Product and Review schema with concrete attributes was associated with a citation rate of 54.2%, compared to 31.8% for generic schema. Among high-DR pages (DR > 75), the schema-type difference narrowed considerably. Authority signals dominate citation decisions for established domains. Structured data provides relatively more leverage where traditional authority signals are weakest.

Case in point: A regional insurance brokerage (DR 38) implementing detailed Product schema with coverage types, premium ranges, and customer ratings on its policy pages operates in precisely the category where schema delivers its largest advantage. A Fortune 500 insurer (DR 85) running the same schema would see negligible incremental benefit because its authority signals already carry the citation decision.

Schema Markup Examples That Drive AI Citation

Schema markup examples that drive AI citation share one common trait: they provide extractable, verifiable factual content that helps the AI system clear its confidence threshold for citation. Our study identified concrete attribute fields as the differentiating factor, not schema presence itself.

Attribute-Rich Product Schema (Effective)

A product page implementing JSON-LD with populated name, description, offers (including price and priceCurrency), aggregateRating (ratingValue and reviewCount), brand, and specific product attributes like material specifications gives an AI retrieval system structured facts it can extract without natural language inference. Each populated field represents a discrete, verifiable claim the system can reference with confidence.
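A minimal Python sketch of such a payload follows. Every value is a hypothetical placeholder (product name, brand, price, rating counts), and the helper name `to_script_tag` is an assumption. The property names themselves (`offers`, `aggregateRating`, `brand`, `material`) are standard schema.org Product vocabulary.

```python
import json

# All values below are hypothetical placeholders, not real product data.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Standing Desk",
    "description": "Height-adjustable desk with a bamboo top.",
    "brand": {"@type": "Brand", "name": "ExampleCo"},
    "material": "Bamboo",
    "offers": {
        "@type": "Offer",
        "price": "499.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "128",
    },
}


def to_script_tag(data: dict) -> str:
    """Serialize a JSON-LD dict into the <script> element embedded in the page HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script element early;
    # "<\/" is still valid JSON and decodes back to "</".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(to_script_tag(product))
```

Building the block from a dictionary rather than hand-writing JSON makes it easier to populate the fields from the same source of truth as the visible page content, so the markup and the prose do not drift apart.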

Generic Article Schema (Ineffective)

A blog post running default CMS-generated Article schema with only @type, headline, datePublished, and author provides metadata that a basic HTML parser would already infer from the title tag and byline. The schema adds no informational content the AI system could not extract from standard HTML structure. Heuristic benchmark: Based on our dataset, approximately 80% of schema-present pages in the study ran generic CMS-default implementations with no page-specific customization.

Before/After: Converting Generic to Attribute-Rich

For example, a SaaS pricing page running generic WebPage schema (before) could be converted to Product schema with populated offers array including three pricing tiers, each with price, priceCurrency, description, and eligibility criteria, plus aggregateRating from verified customers and feature-level specifications (after). The before state provides zero extractable facts beyond what the page title conveys. The after state provides structured data for the exact query patterns ("[product] pricing," "[product] reviews") that drive commercial AI citations.
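The before/after conversion described above might look like the following Python sketch. The page, tier names, prices, features, and rating figures are all invented placeholders, and `to_product_schema` is a hypothetical helper. Eligibility criteria are carried in each offer's `description` here for simplicity, which is an assumption rather than a recommendation from the study.

```python
# Hypothetical "before" block: generic WebPage markup with no extractable facts.
before = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "ExampleApp Pricing",
    "url": "https://example.com/pricing",
}

# Hypothetical tiers: (tier name, monthly price, description including eligibility).
TIERS = [
    ("Starter", "19.00", "Up to 3 users; for teams under 10 employees."),
    ("Growth", "49.00", "Up to 25 users; includes API access."),
    ("Scale", "99.00", "Unlimited users; annual contract required."),
]

# Hypothetical feature-level specifications: (name, value).
FEATURES = [("Integrations", "40"), ("Uptime SLA", "99.9%")]


def to_product_schema(page, product_name, tiers, features,
                      rating_value, review_count, currency="USD"):
    """Build an attribute-rich Product block from a generic page block plus facts."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
        "url": page["url"],
        "offers": [
            {"@type": "Offer", "name": name, "price": price,
             "priceCurrency": currency, "description": description}
            for name, price, description in tiers
        ],
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
        "additionalProperty": [
            {"@type": "PropertyValue", "name": name, "value": value}
            for name, value in features
        ],
    }


after = to_product_schema(before, "ExampleApp", TIERS, FEATURES, "4.7", "212")
```

The `before` block is left untouched, so the two versions can be compared side by side or served to different page templates during a test.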

Conversely, attribute-rich schema is not a substitute for content quality. Our study left the majority of citation variance unexplained, with content quality the most plausible candidate. Schema that labels facts still requires those facts to exist in well-structured prose.

Limitations of Using Schema for AI Citation

Using schema for AI citation faces five significant limitations that practitioners should understand before allocating optimization resources. Our study documented each of these constraints empirically.

1. Generic schema provides no measurable advantage. The corrected GEE model found schema presence produced an odds ratio of 0.678 (p = .296), consistent with a true null effect. Entity richness score showed OR = 1.001 (p = .833). Schema-to-query alignment showed OR = 1.068 (p = .626). The practitioner consensus that schema improves AI visibility is not supported by citation data for the implementations that dominate the current web.

2. Rank position dominates the citation equation. Google organic rank position predicted AI citation with OR = 0.762 per position (p < .001). Position-1 pages were cited at 43%, declining to 5% at position 7. Moving from position 5 to position 2 delivers more expected AI citation lift than any schema intervention the study could identify.

3. Sophisticated entity-graph schema remains untestable. Wikidata sameAs links, genuine @id cross-referencing, and nested entity structures appeared on fewer than 4% of schema-present pages. The implementation approach that mechanistic reasoning most strongly supports is so rarely deployed that empirical evaluation is currently impossible.

4. Content quality is likely the dominant unmeasured variable. Schema characteristics and domain authority together explain a modest fraction of citation variance. Answer-first heading structure, entity clarity in running text, factual density, and modular extractability are all candidate predictors that were not measured.

5. Findings are platform- and time-specific. The study examined ChatGPT and Gemini during a single collection window. Cross-platform URL overlap was approximately 4%, consistent with Lee's (2026) finding that AI platforms draw from meaningfully different retrieval pools. Perplexity, Copilot, and Google AI Overviews may exhibit different schema sensitivity patterns.

However, the position gradient finding is likely more durable than specific effect sizes because it reflects a structural feature of retrieval-augmented generation that applies regardless of which search backend an AI platform uses.

Who Should Invest in Schema for AI Citation?

Schema is most valuable for lower-authority domains (DR 60 or below) that sell products or services with concrete, quantifiable attributes. Our data identifies a specific profile where schema investment yields measurable returns, and a much larger profile where it does not.

Table 2. Schema Investment Decision Matrix

Business Profile	Schema Strategy	Expected AI Citation Impact	Priority
Lower-authority domain (DR < 60) with product or service pages	Implement attribute-rich Product/Review schema with pricing, ratings, and specs	Significant: 22+ point citation advantage over generic schema	HIGH
Lower-authority domain (DR < 60) with content/blog pages only	Focus on content quality and organic rank improvement; skip generic schema	Negligible from schema alone	LOW for schema; HIGH for content
High-authority domain (DR > 75)	Attribute-rich schema for rich results; minimal AI citation lift expected	Marginal: authority signals already carry citation decisions	MEDIUM for rich results, not AI citation
Any domain with technical resources for entity-graph architecture	Implement Wikidata sameAs, genuine @id cross-referencing, nested entities	Unknown but theoretically promising; virtually uncontested territory	EXPERIMENTAL: early-adopter bet

DR = Ahrefs Domain Rating (0–100 scale). Citation data from Fischman (2026), n = 1,006 pages across 75 commercial queries.

For example, a mid-market SaaS company with DR 45 and a product catalog would prioritize populating Product schema with pricing tiers, feature specifications, integration counts, and aggregateRating data. A professional services firm with the same authority profile but no productizable offerings would redirect that same effort toward answer-first content architecture and organic rank improvement.

Exceptions include early-adopter firms with development resources to build genuine entity-graph schema. Fewer than 4% of pages in our dataset implemented anything resembling deliberate entity-linking. Firms that deploy Wikidata-linked sameAs identifiers, genuine @id cross-referencing across schema blocks, and nested entity structures are operating in essentially uncontested territory. The empirical evidence for this approach does not yet exist because virtually no one has built it.
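For readers unfamiliar with the entity-graph techniques named above, here is a hedged Python sketch of what such markup might look like. It uses a JSON-LD `@graph` in which an Organization node carries a stable `@id` and a `sameAs` link, and a Product node refers back to it by `@id`. The URLs, names, and Wikidata identifier are placeholders, and `dangling_references` is a hypothetical helper that only checks top-level nodes.

```python
import json

ORG_ID = "https://example.com/#organization"            # placeholder identifier
PRODUCT_ID = "https://example.com/products/widget#product"  # placeholder identifier

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": ORG_ID,
            "name": "ExampleCo",
            # Placeholder only: substitute the entity's real Wikidata URL.
            "sameAs": ["https://www.wikidata.org/wiki/Q00000000"],
        },
        {
            "@type": "Product",
            "@id": PRODUCT_ID,
            "name": "Example Widget",
            "brand": {"@id": ORG_ID},  # cross-reference instead of repeating the entity
            "offers": {
                "@type": "Offer",
                "price": "29.00",
                "priceCurrency": "USD",
                "seller": {"@id": ORG_ID},
            },
        },
    ],
}


def dangling_references(doc: dict) -> set:
    """Return @id values that are referenced but not defined by a top-level @graph node."""
    defined = {node["@id"] for node in doc["@graph"] if "@id" in node}
    referenced = set()

    def walk(value):
        if isinstance(value, dict):
            if set(value) == {"@id"}:  # a dict holding only "@id" is a pure reference
                referenced.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(doc["@graph"])
    return referenced - defined


if __name__ == "__main__":
    print(json.dumps(graph, indent=2))
    print("Dangling references:", dangling_references(graph))
```

A check like `dangling_references` guards against `@id` values that point at nothing, which would leave the cross-references decorative rather than resolvable.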

The LLM Feedback Loop: Why the Schema Myth Persists

The schema-helps consensus persists through a self-referential feedback loop between AI platforms and the practitioner communities that query them. Ask ChatGPT how to improve AI visibility and it will recommend schema markup. Ask Gemini the same question and it will recommend structured data. Ask Perplexity, and it will cite SEO publications that were themselves informed by AI-generated summaries of SEO best practices.

The feedback loop operates through a specific mechanism: large language models trained on corpora including SEO publications, marketing agency content, and practitioner forums reproduce the accumulated consensus of that training data regardless of its empirical basis. Practitioners ask AI platforms for optimization advice. AI platforms reproduce the consensus. Practitioners implement the advice. Nobody measures whether it works. The consensus reinforces itself through implementation without outcome measurement.

Breaking this loop does not require a complex methodology: query design, citation collection, control set construction, and regression analysis are enough. What has been missing is not capability but the willingness to design studies that might falsify the recommendations being made to clients. Heuristic benchmark: Based on industry observation, an estimated 90% or more of published GEO recommendations have not been validated against observed AI citation behavior (assumption: few practitioner-researchers have published falsification-oriented studies as of Q1 2026).
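As a rough illustration of the descriptive step that would precede any regression, the Python sketch below tabulates citation rates by schema category and computes an unadjusted odds ratio. The records are toy values invented for this example and are not the study's data. The function names are hypothetical, and this is far simpler than the GEE model with controls that the study reports.

```python
from collections import defaultdict

# Toy observations invented for illustration; these are NOT the study's data.
# Each record is (schema_category, was_cited).
observations = [
    ("attribute-rich", True), ("attribute-rich", True), ("attribute-rich", False),
    ("generic", True), ("generic", False), ("generic", False),
    ("none", True), ("none", False),
]


def citation_rates(records):
    """Return {category: cited / total} from (category, was_cited) records."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for category, was_cited in records:
        total[category] += 1
        cited[category] += int(was_cited)
    return {category: cited[category] / total[category] for category in total}


def odds_ratio(records, group_a, group_b):
    """Unadjusted odds ratio of citation for group_a relative to group_b."""
    def odds(group):
        hits = sum(1 for category, y in records if category == group and y)
        misses = sum(1 for category, y in records if category == group and not y)
        return hits / misses  # raises ZeroDivisionError if the group has no uncited pages
    return odds(group_a) / odds(group_b)


if __name__ == "__main__":
    print(citation_rates(observations))
    print(odds_ratio(observations, "attribute-rich", "generic"))
```

An unadjusted comparison like this cannot separate schema effects from rank position or domain authority, which is why a study of this kind would follow it with a regression that controls for those variables.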

Case in point: The study itself began as an internal challenge to Growth Marshal's own assumptions. Our MKA (Modular Knowledge Asset) framework assigns significant weight to schema implementation. When we began to suspect that the evidentiary chain supporting this emphasis possibly traced back to AI platforms endorsing schema because their training data contained that endorsement, we designed a study that could falsify our own methodology. The study returned results that changed our thinking. That is how the process is supposed to work.

Concept Map: How Schema Markup Relates to AI Citation (Fischman, 2026)

[Figure: flow diagram, summarized as text]

Main flow: User Query → (triggers) → RAG Pipeline (ChatGPT, Gemini) → (retrieves via) → Search Backend (Google organic index) → (ranks by) → three factors that feed AI Citation Probability:

Generic Schema (Article, Organization, BreadcrumbList): null effect, OR = 0.678, p = .296
Organic Rank Position: dominant predictor, OR = 0.762 per position (p < .001); citation rate 43% at Pos 1, 20% at Pos 3, 10% at Pos 5, 5% at Pos 7
Attribute-Rich Schema (Product, Review + pricing, ratings, specs): reduces extraction uncertainty; 61.7% vs 41.6%, p = .012; strongest for DR < 60

The Feedback Loop: LLM Training Data (contains SEO consensus) → AI Recommends Schema (reproduces consensus) → Practitioners Implement (without testing) → Consensus Reinforced (loop repeats ↻)

⚠ Unmeasured Variable: Content quality (answer-first structure, entity clarity, factual density) likely explains the majority of remaining citation variance.

Source: Fischman, K. (2026). Does Schema Markup Predict AI Citation? Growth Marshal. n = 1,006 pages, 75 queries, 730 citations across ChatGPT and Gemini.

Key relationships: User queries trigger the RAG pipeline, which retrieves candidates via a search backend. Organic rank position is the dominant predictor of AI citation (OR = 0.762). Generic schema produces no citation effect. Attribute-rich schema provides a modest citation advantage for lower-authority domains by reducing extraction uncertainty. The LLM feedback loop reinforces the untested schema consensus, driving continued implementation of generic schema that the data does not support.

graph LR
    UserQuery -->|triggers| RAG_Pipeline
    RAG_Pipeline -->|retrieves via| SearchBackend
    SearchBackend -->|ranks by| OrganicRankPosition
    OrganicRankPosition -->|OR=0.762 per position| AICitationProbability
    SchemaMarkup -->|generic types| NullEffect[No Citation Effect]
    SchemaMarkup -->|attribute-rich types| ModestAdvantage[Citation Advantage for Low-DR]
    DomainAuthority -->|DR controls| AICitationProbability
    ContentQuality -->|unmeasured but dominant| AICitationProbability
    AttributeRichSchema -->|reduces| ExtractionUncertainty
    ExtractionUncertainty -->|lowers threshold for| AICitationProbability
    LLMFeedbackLoop -->|reinforces untested| SchemaConsensus
    SchemaConsensus -->|drives implementation of| GenericSchema

Final Takeaways

Generic schema markup (Article, Organization, BreadcrumbList) provides zero measurable AI citation advantage. Stop treating CMS-default schema as an AI visibility strategy.
Attribute-rich schema (Product and Review with populated pricing, ratings, and specifications) outperforms generic schema by 20 percentage points (61.7% vs. 41.6%, p = .012) and delivers its largest advantage for domains with DR 60 or below.
Google organic rank position is the dominant predictor of AI citation, with each one-position drop reducing citation odds by approximately 24%. Moving from position 5 to position 2 delivers more expected citation lift than any schema intervention.
Sophisticated entity-graph schema (Wikidata sameAs, @id cross-referencing, nested entities) represents uncontested territory. Fewer than 4% of schema-present pages in the study implemented it, which makes empirical evaluation impossible for now but leaves a substantial competitive opportunity.
Validate GEO recommendations against observed citation behavior, not against advice from the AI systems you are trying to optimize for. The LLM feedback loop produces confident recommendations without empirical basis.

FAQs

Q1: Does schema markup help pages get cited by AI platforms like ChatGPT and Gemini?

Schema markup for AI citation produces no statistically significant effect when implemented as generic types (Article, Organization, BreadcrumbList), according to our research. The corrected GEE model found schema presence had an odds ratio of 0.678 (p = .296), consistent with a true null effect. Only attribute-rich implementations (Product and Review schema with populated pricing, ratings, and specifications) showed a significant citation advantage.

Q2: What is the difference between attribute-rich schema and generic schema for AI citation?

Attribute-rich schema is a structured data implementation that provides extractable factual content through populated concrete fields such as pricing, aggregateRating, and product specifications. Generic schema provides machine-readable metadata (Article type, datePublished, author) without substantive informational content beyond what standard HTML already conveys. Attribute-rich schema was cited at 61.7% versus 41.6% for generic schema in our study (p = .012).

Q3: How does Google organic rank position affect AI citation probability?

Each one-position drop in Google organic rank reduces AI citation odds by approximately 24% (OR = 0.762, p < .001). Position-1 pages were cited in 43% of queries in which they appeared, declining to 27% at position 2, 20% at position 3, 10% at position 5, and 5% at position 7. Rank position was the dominant predictor of AI citation in our study, outperforming all schema variables.

Q4: What are the limitations of using schema markup to improve AI visibility?

Schema markup for AI citation faces five documented limitations: generic implementations provide no measurable advantage; rank position dominates the citation equation; sophisticated entity-graph schema remains too rare to evaluate (fewer than 4% of pages); content quality is the likely dominant unmeasured variable; and findings are platform- and time-specific to ChatGPT and Gemini during a single 2025 collection window.

Q5: How does schema markup for AI citation differ from schema markup for Google rich results?

Schema markup for AI citation targets the probability that an AI platform cites a page in a generated prose response with source attribution. Schema markup for Google rich results targets SERP feature eligibility such as star ratings, FAQ dropdowns, and price displays. The mechanisms differ: AI citation requires extractable factual content that reduces retrieval uncertainty, while rich results require type-specific markup that matches Google's structured data documentation.

Q6: Who should prioritize attribute-rich schema implementation for AI citation?

Lower-authority domains (Ahrefs DR 60 or below) with products or services that have concrete, quantifiable attributes benefit most from attribute-rich schema implementation. Our research found a 22-percentage-point citation gap between attribute-rich and generic schema among pages with DR 60 or below (54.2% vs. 31.8%). High-authority domains (DR > 75) see minimal incremental benefit because authority signals already carry citation decisions.

Q7: What is the LLM feedback loop in AI search optimization?

The LLM feedback loop is a self-referential dynamic in which AI platforms reproduce optimization recommendations from their training data, practitioners implement those recommendations without testing, and the consensus reinforces itself without empirical validation. We documented this loop as the primary mechanism through which the schema-helps hypothesis achieved practitioner consensus despite lacking empirical support for generic implementations.

Evidence and Methodology: All quantitative claims in this article are cited from Fischman, K. (2026), "Does Schema Markup Predict AI Citation? A Cross-Platform Empirical Study of Structured Data and Generative Engine Optimization," Growth Marshal (growthmarshal.io). Preprint, February 2026. Not yet peer-reviewed. Heuristic benchmarks are explicitly labeled with assumptions throughout. Source: Fischman, 2026.

All statistics verified as of February 2026. This article is reviewed quarterly. Strategies and pricing may have changed.
