03 · Capability

Technical & semantic foundations.

Technical and semantic foundations are the machine-legibility layer AI engines evaluate before they decide whether to cite a page. Clean schema, a valid llms.txt, permissive crawler access for AI user-agents, fast Core Web Vitals, and verifiable authorship. Get the plumbing right and your content is eligible. Get it wrong and the model never reads you, no matter how well you write.

TL;DR

AI engines judge your site in milliseconds. Schema, llms.txt, crawler access, vitals, and authorship decide whether you're even eligible to be cited. We make your site legible to the machines that decide which sources an AI answer trusts.

The Problem

AI can't cite what it can't parse.

The engines retrieve at speed. They need entities, passages, attribution, and load budgets that resolve cleanly on first pass. If your schema is missing, your llms.txt is absent, your AI user-agents are blocked, your vitals are red, or your authorship is ambiguous, the model skips you and cites a competitor with the plumbing sorted. Most of the sites losing citation share are losing it here, not on the writing.

67%
of product and category pages ship with broken, missing, or misapplied schema the major engines can't use.
Schema audit · CiteSurge panel 2026
3 of 5
AI user-agents are blocked outright by a typical enterprise robots.txt. To those engines, the site is uncrawlable.
Robots.txt audit · CiteSurge 2026
<200ms
INP budget AI crawlers respect before downgrading a page's retrieval priority. Most enterprise CMS pages miss it.
Core Web Vitals benchmark · 2026
12%
of cited articles across the seven engines carry a verified author entity with a sameAs graph. The rest are anonymous.
Citation attribution study · 2025
The Outcome

Be a source AI trusts on the first pass.

What Moves

The signals we move.

Before and after, across a typical 90-day engagement. Directional, representative. Your numbers will differ.

Schema validation pass rate: 41% before, 98% after (+57 points)
AI user-agents allowed: 2/5 before, 5/5 after (+3)
Pages with llms.txt coverage: 0% before, 100% after (+100 points)
INP on answer-shaped pages: 312 ms before, 148 ms after (-164 ms)
Author entities with sameAs graph: 0% before, 100% after (+100 points)
The Methodology

We don't publish the playbook. The advantage is in the execution, not the explanation. If we taught it, it would stop working. Not just for you, for everyone we work with. What we'll show you is the result.

A category-leading enterprise SaaS brand grew its citation share across seven engines from 3% to 41% in 90 days. Without publishing more content.
Anonymized client outcome · CiteSurge panel
Frequently Asked

Questions buyers ask us.

Technical and semantic foundations are the layer of structured data and performance budgets AI engines use to decide whether a page is worth retrieving: schema markup, llms.txt, robots.txt permissions for AI user-agents, Core Web Vitals, and verifiable author and publisher entities. When all five resolve cleanly, your page becomes eligible for citation. When any one fails, the engines skip you.

llms.txt is a proposed convention, not yet a formal standard, that tells language models where your authoritative content lives. It is a markdown file at the root of your domain (/llms.txt) that links to clean markdown versions of your canonical pages and gives retrievers a fast path to your best answers. If your category is AI-answered, yes, you need it.
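As a rough illustration of what such a file can look like, the sketch below renders a minimal llms.txt in the shape the public proposal describes: a title, a one-line summary, and sections of markdown links. The site name, URLs, and the helper `build_llms_txt` are hypothetical, and this is not a description of any particular production setup.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short description shown after the link


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render a minimal llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Exactly one trailing newline at the end of the file.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and URLs, for illustration only.
example = build_llms_txt(
    "Example Co",
    "Example Co makes scheduling software for clinics.",
    {
        "Docs": [
            Link("Product overview", "https://example.com/overview.md", "what the product does"),
            Link("Pricing", "https://example.com/pricing.md"),
        ]
    },
)
print(example)
```

The output would be saved as /llms.txt at the domain root. Which pages to list, and in what order, is an editorial decision the code does not make for you.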

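At minimum: GPTBot (OpenAI), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot, Google-Extended, and Bingbot. These tokens do not all do the same job: some govern training crawls, others govern search and retrieval (OpenAI, for example, documents OAI-SearchBot separately from GPTBot), and Google-Extended is a robots.txt control token rather than a separate crawler. Blocking one can cut your pages out of that vendor's AI features, so check each vendor's crawler documentation for what a token controls. Allow them all unless you have a specific legal or contractual reason to block one.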
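A quick way to see where a robots.txt stands is to test each token against a URL. The sketch below uses Python's standard `urllib.robotparser`. The sample robots.txt, the URL, and the helper `audit_robots` are made up for illustration, and the standard-library matcher is only an approximation of how each vendor's crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the list above; adjust to the engines you care about.
AI_USER_AGENTS = [
    "GPTBot",
    "ClaudeBot",
    "anthropic-ai",
    "PerplexityBot",
    "Google-Extended",
    "Bingbot",
]


def audit_robots(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> dict[str, bool]:
    """Return, per user-agent token, whether robots_txt allows fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network call
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks one AI crawler and allows everything else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

report = audit_robots(SAMPLE, "https://example.com/pricing")
for agent, allowed in report.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To audit a live site, fetch its /robots.txt and pass the text in. Treat the result as a first-pass signal, because CDN or firewall rules can block a crawler that robots.txt allows.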

The retrievers enforce load budgets. Slow pages, layout shift, and sluggish interaction get deprioritised in the retrieval queue, and a page that isn't retrieved is never cited. INP under 200 ms, LCP under 2.5 s, and CLS under 0.1 are the practical thresholds to stay within.
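Those three numbers can be turned into a simple pass/fail check. The sketch below assumes you already have field measurements for a page. The metric keys, the helper `vitals_failures`, and the LCP and CLS sample values are hypothetical, and a value exactly at the limit is treated as passing, in line with Google's published "good" ranges.

```python
# Upper limits quoted above: INP in milliseconds, LCP in seconds, CLS unitless.
THRESHOLDS = {"inp_ms": 200, "lcp_s": 2.5, "cls": 0.1}


def vitals_failures(measured: dict[str, float]) -> list[str]:
    """Return the names of metrics whose measured value exceeds its limit."""
    return [name for name, limit in THRESHOLDS.items() if measured[name] > limit]


# INP values echo the before/after figures on this page; LCP and CLS are invented.
before = {"inp_ms": 312, "lcp_s": 2.1, "cls": 0.05}
after = {"inp_ms": 148, "lcp_s": 2.1, "cls": 0.05}

print(vitals_failures(before))
print(vitals_failures(after))
```

A page passes only when the returned list is empty, so a single slow metric is enough to flag it.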

Organization and WebSite graphs site-wide. Article, Product, FAQPage, HowTo, BreadcrumbList, and SpeakableSpecification at the page level. Person entities for authors, each with a full sameAs graph linking to LinkedIn, Crunchbase, and other sources the engines triangulate against. Everything validates against schema.org and the major engine-specific extensions.
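To make the author graph concrete, here is a minimal sketch that builds Article JSON-LD with a Person author and sameAs links, then wraps it in the script tag a page would carry. Every name and URL is a placeholder, the helpers `article_jsonld` and `as_script_tag` are hypothetical, and a real page would carry more properties than shown.

```python
import json


def article_jsonld(headline, url, author_name, author_profiles, publisher_name) -> str:
    """Build schema.org Article markup with a Person author and sameAs links."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "author": {
            "@type": "Person",
            "name": author_name,
            "sameAs": list(author_profiles),  # profiles that identify the same person
        },
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD for embedding in HTML, escaping '</' so it cannot close the tag."""
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


# Placeholder values, for illustration only.
doc = article_jsonld(
    headline="How clinic scheduling works",
    url="https://example.com/guides/scheduling",
    author_name="Jane Doe",
    author_profiles=[
        "https://www.linkedin.com/in/jane-doe-example",
        "https://www.crunchbase.com/person/jane-doe-example",
    ],
    publisher_name="Example Co",
)
print(as_script_tag(doc))
```

Markup built this way should still be run through a schema validator before it ships, since well-formed JSON is not the same as valid schema.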

A rebuild or replatform is almost never needed. Most foundation work is surgical: a JSON-LD injection layer, robots.txt edits, a new llms.txt, performance fixes on the pages that matter, and an authorship graph. Your CMS stays. Your URLs stay. Your content stays. The plumbing changes underneath.

Find out where you stand.

A free citation audit shows exactly which queries surface you, which surface a competitor, and what's costing you share.

90-minute turnaround. No pitch.