AI engines decide whether to cite your page based on a stack of attribution signals — author identity, source provenance, claim verifiability, and entity recognition. This page itemizes the signals and how to surface them.
AI engines preferentially cite content authored by recognizable Person entities over anonymous content. To surface author identity:
- **Author byline visible on every content page**, with a link to a real /author/[name]/ or /about/[name]/ page.
- **Person schema** on the author page with name, url, sameAs (LinkedIn, professional registry, Wikipedia where applicable), jobTitle, worksFor.
- **Article schema's author property** resolves to the same Person entity (matched by url/sameAs).
- **External entity recognition**: the author should be findable in Wikipedia, LinkedIn, professional registries, podcast transcripts, or industry-publication bylines. AI engines verify against these external sources.
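As a minimal sketch of the Person and Article markup described above, the Python below builds both JSON-LD nodes and checks that the Article's author resolves to the same entity by url or sameAs. The author name, URLs, job title, and the `same_entity` helper are hypothetical placeholders for illustration, not values from any real site.

```python
import json

# Hypothetical author details; replace with the real author's data.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "url": "https://example.com/author/jane-doe/",
    "jobTitle": "Senior Analyst",
    "worksFor": {"@type": "Organization", "name": "Example Advisory Inc."},
    "sameAs": ["https://www.linkedin.com/in/jane-doe-example/"],
}

# The Article's author repeats the same url and sameAs so both nodes
# can be matched to one Person entity.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Hypothetical headline",
    "author": {
        "@type": "Person",
        "name": person["name"],
        "url": person["url"],
        "sameAs": person["sameAs"],
    },
}


def same_entity(a, b):
    """Return True when two nodes share a url or at least one sameAs link.

    Assumes sameAs, when present, is a list of URL strings.
    """
    if a.get("url") and a.get("url") == b.get("url"):
        return True
    return bool(set(a.get("sameAs", [])) & set(b.get("sameAs", [])))


print(json.dumps(person, indent=2))
print(same_entity(article["author"], person))
```

The printed JSON would typically be embedded in a `<script type="application/ld+json">` element on the author page.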
The publishing organization's identity is a primary trust input. Surface it via:
- **Organization (or LocalBusiness / ProfessionalService) schema** at site root with full NAP (name, address, phone), sameAs, founder Person ref, and, where defensible, aggregateRating.
- **Knowledge Panel claim** if available. This is one of the strongest entity-recognition signals.
- **Wikidata entry** if applicable. This is particularly powerful for AI engines that train on open-source structured data.
- **About page depth**: a thin About page reduces publisher-identity confidence; a substantive About page with team, history, named credentials, and verifiable claims raises it.
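The publisher markup above could be sketched as follows. Every value here (business name, address, phone number, URLs) is a hypothetical placeholder, and `to_script_tag` is an illustrative helper rather than part of any library. aggregateRating is left out on purpose, since the list recommends it only where defensible.

```python
import json

# Hypothetical publisher details; every value is a placeholder.
organization = {
    "@context": "https://schema.org",
    "@type": "ProfessionalService",
    "name": "Example Advisory Inc.",
    "url": "https://example.com/",
    "telephone": "+1-416-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example Street",
        "addressLocality": "Toronto",
        "addressRegion": "ON",
        "addressCountry": "CA",
    },
    "sameAs": ["https://www.linkedin.com/company/example-advisory/"],
    # Founder points at the author page used for the Person entity.
    "founder": {
        "@type": "Person",
        "name": "Jane Doe",
        "url": "https://example.com/author/jane-doe/",
    },
}


def to_script_tag(node):
    """Wrap a JSON-LD node in the script element used to embed it in HTML."""
    body = json.dumps(node, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_script_tag(organization))
```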
AI engines preferentially cite content where claims are verifiable:
- **Source citations on quantitative claims**: link to the underlying data source (Statistics Canada, CRA, Health Canada, provincial regulator, named research). 'According to Statistics Canada' beats 'studies show.'
- **Date stamps on time-sensitive claims**: datePublished + dateModified on Article schema, and visible in body text where a claim's accuracy depends on its date.
- **Disambiguation of similar entities**: if you reference an organization or person with a common name, link to a canonical source (official site, Wikipedia, Wikidata) so the AI engine can disambiguate.
- **First-party vs. secondary disclosure**: be clear when you're reporting first-party data ('our 2026 survey of 312 Canadian SMBs found...') vs. citing third-party data ('Statistics Canada's 2025 LFS shows...'). AI engines preferentially cite first-party sources for the data they originate.
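One way to express the date stamps and a source link in Article markup is sketched below, with a small consistency check on the two dates. The headline, dates, and cited dataset are hypothetical, and using schema.org's `citation` property to point at the underlying source is an assumption about how you might model it, not a documented requirement of any AI engine.

```python
from datetime import date

# Hypothetical article metadata; dates use ISO 8601 (YYYY-MM-DD).
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Hypothetical headline",
    "datePublished": "2026-01-15",
    "dateModified": "2026-03-02",
    # Assumption: the underlying data source is modeled with `citation`.
    "citation": [
        {
            "@type": "CreativeWork",
            "name": "Example underlying dataset",
            "url": "https://example.org/underlying-dataset",
        }
    ],
}


def dates_are_consistent(node):
    """Return True when dateModified is on or after datePublished."""
    published = date.fromisoformat(node["datePublished"])
    modified = date.fromisoformat(node["dateModified"])
    return published <= modified


print(dates_are_consistent(article))
```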
AI engines extract 40-90 word passages; pages structured for passage extractability are cited at higher rates. See /ai-overview-optimization/heading-structure-for-aio/ for the structural pattern. Density indicators that increase citation probability:
- Numbered facts / specific dollar amounts / date ranges in the first 90 words.
- Named entities (people, places, organizations, products) per passage.
- Direct quotes from named experts.
- Comparative or contrastive structure ('X differs from Y in that ...').
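A rough way to audit a draft against the 40-90 word window is sketched below. The `passage_report` helper is hypothetical: it assumes passages are separated by blank lines, counts whitespace-separated words, and uses the presence of any digit as a crude stand-in for "contains a specific number". It does not detect named entities or quotes.

```python
import re


def passage_report(text, low=40, high=90):
    """Report word count, range fit, and digit presence for each passage.

    Passages are assumed to be separated by one or more blank lines.
    """
    passages = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    report = []
    for passage in passages:
        words = len(passage.split())
        report.append(
            {
                "words": words,
                "in_range": low <= words <= high,
                "has_number": bool(re.search(r"\d", passage)),
            }
        )
    return report


# Hypothetical usage on a two-passage draft.
draft = "First passage of the page.\n\nSecond passage, published in 2026."
for row in passage_report(draft):
    print(row)
```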
Person schema is not strictly required: content without it can still be cited, but at materially lower rates. For commercial-intent pages, Person schema on a credentialled author is one of the highest-leverage tactics available.
sameAs links from your Organization or Person schema to canonical external identifiers (LinkedIn, Wikipedia, Wikidata, professional registries) are the primary mechanism by which AI engines confirm entity identity. Pages with no sameAs are systematically under-cited.
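To find entity nodes that lack sameAs, a small audit like the following could be run over a page's parsed JSON-LD. The `missing_same_as` helper and the sample nodes are hypothetical, and the set of entity types it checks is an assumption based on the types named on this page.

```python
def missing_same_as(nodes):
    """Return the name of every entity node that has no sameAs links."""
    entity_types = {"Person", "Organization", "LocalBusiness", "ProfessionalService"}
    flagged = []
    for node in nodes:
        # @type and sameAs may each be a single string or a list.
        types = node.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if not entity_types.intersection(types):
            continue
        same_as = node.get("sameAs") or []
        if isinstance(same_as, str):
            same_as = [same_as]
        if not same_as:
            flagged.append(node.get("name", "<unnamed>"))
    return flagged


# Hypothetical nodes parsed from a page's JSON-LD.
nodes = [
    {"@type": "Person", "name": "Jane Doe", "sameAs": "https://www.linkedin.com/in/jane-doe-example/"},
    {"@type": "Organization", "name": "Example Advisory Inc."},
    {"@type": "Article", "headline": "Hypothetical headline"},
]
print(missing_same_as(nodes))
```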
Cite the underlying source for quantitative claims, time-sensitive claims, and any claim that a skeptical reader could reasonably want to verify. Don't cite obvious general-knowledge facts. Over-citation (every sentence with a footnote) can reduce readability without improving citation eligibility.