VEEZOW


Structured data for LLMs

JSON-LD Organization, Article, Product, and FAQPage schema give LLMs machine-readable facts. This playbook covers the specific fields that matter most for each engine and the patterns to avoid.

Structured data is machine-readable markup embedded in your web pages that tells crawlers — including LLM crawlers — exactly what your content is, who you are, and what the key facts about your business are. Google uses it for rich results; LLMs use it for entity recognition and citation accuracy.

The most impactful structured data type for AI citation optimization is the Organization schema on your homepage. This is the entity identity document that LLMs resolve against when they encounter your brand name in a query.

Organization schema — the baseline

Every domain should have a JSON-LD Organization block on the homepage. The minimum viable implementation:

name, url, logo, description, foundingDate, founder (a linked Person entity), sameAs (an array of authoritative profile URLs), contactPoint, and address.

The sameAs array is the citation multiplier: it links your Organization entity to LinkedIn, Crunchbase, Wikidata, Twitter/X, and any other authoritative sources that know about you. For developer-tool brands, add your GitHub organization URL to sameAs as well.
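A minimal sketch of the Organization block described above, built as a Python dict and serialized into a JSON-LD script tag. Every value (company name, URLs, founder, dates, address) is a placeholder for a hypothetical company, and the helper name to_script_tag is an assumption made for this example, not part of any library.

```python
import json

# Placeholder data for a hypothetical company; replace every value with your own.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "description": "Example Co builds developer tools for API testing.",
    "foundingDate": "2021-03-01",
    "founder": {
        "@type": "Person",
        "name": "Jane Doe",
        "sameAs": "https://www.linkedin.com/in/jane-doe-example",
    },
    # Authoritative profiles that describe the same entity (all hypothetical URLs).
    "sameAs": [
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
        "https://www.wikidata.org/wiki/Q00000000",
        "https://x.com/exampleco",
        "https://github.com/example-co",
    ],
    "contactPoint": {
        "@type": "ContactPoint",
        "contactType": "customer support",
        "email": "support@example.com",
    },
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Berlin",
        "addressCountry": "DE",
    },
}


def to_script_tag(data: dict) -> str:
    """Serialize a JSON-LD dict into a script tag for the page head."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


organization_tag = to_script_tag(organization)
```

Paste the resulting tag into the homepage template, then confirm that each sameAs URL resolves to a profile that really is yours.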

Missing sameAs reduces citation accuracy across Gemini, Claude, and Perplexity. Gemini in particular uses entity graph completeness as a trust signal — a brand with three sameAs references is more confidently cited than one with none.

Article schema for content pages

For blog posts, case studies, and long-form content, implement Article or BlogPosting schema. Key fields: headline, author (linked Person), datePublished, dateModified, description, image. The datePublished field matters for freshness signals — pages without explicit publication dates may be treated as stale.
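The key fields listed above can be sketched as a small builder. The function name build_article, the sample post, and all URLs are assumptions for illustration. Dates are emitted in ISO 8601 form, and the builder rejects a dateModified earlier than datePublished so the two fields stay consistent.

```python
import json
from datetime import date


def build_article(headline, author_name, author_url,
                  published, modified, description, image_url):
    """Return a BlogPosting JSON-LD dict with explicit publication dates."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        # Linked Person entity rather than a bare name string.
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "description": description,
        "image": image_url,
    }


# Hypothetical post; every value is a placeholder.
article = build_article(
    headline="How we cut API test time in half",
    author_name="Jane Doe",
    author_url="https://www.example.com/team/jane-doe",
    published=date(2024, 1, 15),
    modified=date(2024, 2, 1),
    description="A case study on parallelizing an API test suite.",
    image_url="https://www.example.com/img/api-tests.png",
)
article_json = json.dumps(article, indent=2)
```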

Gemini cites long-form content with Article schema at significantly higher rates than equivalent content without it. The schema tells the model that this is a substantive, dateable piece of content — not a product page.

FAQPage schema for high-intent queries

FAQPage schema marks up question-and-answer content in a machine-readable format. LLMs use this directly when generating answers to specific questions. If you have a legitimate FAQ section on your site, implementing FAQPage schema on those pages is one of the most direct citation optimization moves available — see the FAQPage schema playbook for question selection, answer length guidance, and JSON-LD implementation.

The questions should reflect real search intent — not softballs about your company. Write answers that would be useful to a person researching your category who doesn't already know your brand.
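As an illustration, FAQPage markup can be generated from a list of question-and-answer pairs. This is a sketch under assumptions: build_faq_page is a hypothetical helper, and the sample questions are invented to show category-level questions rather than brand softballs.

```python
import json


def build_faq_page(qa_pairs):
    """Return a FAQPage JSON-LD dict from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


# Hypothetical questions; the answers must match the text visible on the page.
faq = build_faq_page([
    (
        "What is contract testing for APIs?",
        "Contract testing checks that a service and its consumers agree "
        "on request and response formats before deployment.",
    ),
    (
        "How is contract testing different from integration testing?",
        "Integration tests run services together, while contract tests "
        "verify each side against a shared specification in isolation.",
    ),
])
faq_json = json.dumps(faq, indent=2)
```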

Product schema for e-commerce and SaaS

Product schema on pricing and product pages gives LLMs pricing information, feature descriptions, and availability signals. Include: name, description, offers (with price, priceCurrency, and availability), aggregateRating, and brand.
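A minimal Product sketch covering those fields. In schema.org terms the currency lives in the offer's priceCurrency property and the aggregate rating in aggregateRating. The product, price, and rating figures below are invented placeholders.

```python
import json

# Placeholder product; every value here is hypothetical.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget Pro",
    "description": "A rack-mountable widget for small server rooms.",
    "brand": {"@type": "Brand", "name": "Example Co"},
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",  # ISO 4217 currency code
        "availability": "https://schema.org/InStock",
        "url": "https://www.example.com/pricing",
    },
    # Only publish ratings that are also shown on the page itself.
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "128",
    },
}
product_json = json.dumps(product, indent=2)
```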

For SaaS, SoftwareApplication schema is more specific than Product, so use it whenever the thing you sell is software. It includes applicationCategory, operatingSystem, and featureList fields that Gemini and Claude specifically recognize.
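For comparison, a SoftwareApplication sketch using the three fields named above. The application, its category, its features, and its price are all assumptions chosen for illustration.

```python
import json

# Placeholder SaaS product; every value here is hypothetical.
software = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "Example API Tester",
    "description": "Hosted contract testing for REST and GraphQL APIs.",
    "applicationCategory": "DeveloperApplication",
    "operatingSystem": "Windows, macOS, Linux",
    "featureList": [
        "Contract testing for REST and GraphQL",
        "CI pipeline integration",
        "Team dashboards",
    ],
    "offers": {
        "@type": "Offer",
        "price": "29.00",
        "priceCurrency": "USD",
    },
}
software_json = json.dumps(software, indent=2)
```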

Common mistakes

Avoid duplicating schema across every page — Organization schema belongs on the homepage and about page, not on every blog post. Avoid schema that contradicts your visible page content (mismatches reduce trust). Avoid using schema for information that changes frequently without updating it — stale schema is worse than no schema.
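One way to catch mismatches between schema and visible content is a small check that runs against rendered HTML. The sketch below uses only the Python standard library; the class PageScanner, the function name_mismatches, and the sample page are hypothetical, and the check covers only the name property as an example of the idea.

```python
import json
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects JSON-LD blocks and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self._current = None  # "jsonld", "ignored", or None for normal text
        self._buffer = []
        self.jsonld = []
        self.visible = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            is_jsonld = dict(attrs).get("type") == "application/ld+json"
            self._current = "jsonld" if is_jsonld else "ignored"
            self._buffer = []
        elif tag == "style":
            self._current = "ignored"

    def handle_data(self, data):
        if self._current == "jsonld":
            self._buffer.append(data)
        elif self._current is None:
            self.visible.append(data)

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._current == "jsonld":
                # Raises json.JSONDecodeError if the block is malformed.
                self.jsonld.append(json.loads("".join(self._buffer)))
            self._current = None


def name_mismatches(html):
    """Return schema names that never appear in the page's visible text."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    text = " ".join(" ".join(scanner.visible).split()).lower()
    return [
        block["name"]
        for block in scanner.jsonld
        if isinstance(block, dict)
        and "name" in block
        and block["name"].lower() not in text
    ]


# Hypothetical page whose heading misspells the name used in the schema.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><h1>Welcome to Exampel Inc</h1></body></html>
"""
mismatches = name_mismatches(sample_html)
```

Here mismatches contains "Example Co", because that name never appears in the page text. The same pattern could be extended to prices or dates, which are the fields most likely to go stale.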

What this means for citation strategy

Structured data is the most controllable citation signal. Unlike Wikipedia or Wikidata, you control it directly. Implement Organization schema first, then Article schema on your highest-value content, then FAQPage on genuine question-and-answer content. Measure changes in your discoverability score in Veezow 6-8 weeks after implementation.

Measure your current position

Veezow scans your domain for the signals covered in this playbook — robots.txt access, structured data, Common Crawl presence, bot permissions, and off-site mentions — and scores them in one report.
