GEO Glossary

Mastering Generative Engine Optimization vocabulary is the first step to understanding how AI assistants process, evaluate, and cite web content. This glossary brings together the essential technical terms that connect traditional SEO with the new reality of conversational search.

Each term includes not only its technical definition but also its practical relevance to your GEO strategy: how it affects the probability that ChatGPT, Claude, Gemini, or Perplexity cites your site as an authoritative source.

Terms marked with a link have a dedicated page with implementation guides, research data, and practical examples.

GEO (Generative Engine Optimization)

Practice of optimizing web content to be recognized, indexed, and cited by generative search engines like ChatGPT, Gemini, Claude, and Perplexity. Unlike traditional SEO, which targets rankings in search results, GEO aims for your content to be the source selected when an AI assistant generates a direct response.

Read full guide

RAG (Retrieval-Augmented Generation)

Architecture that combines language models with real-time external information retrieval. When a user asks a question, the system first searches for relevant sources and then generates a response based on that data. It's the central mechanism ChatGPT, Claude, and Perplexity use to cite web content.
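
The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: the corpus, the URLs, the stopword list, and the term-overlap scoring are all assumptions made for the example, and production assistants use far more sophisticated retrieval (typically embeddings and a search index).

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for crawled web pages.
DOCUMENTS = {
    "https://example.com/geo": "GEO optimizes web content so AI assistants cite it as a source.",
    "https://example.com/seo": "SEO optimizes pages to rank in traditional search results.",
    "https://example.com/rag": "RAG retrieves external sources before a language model generates a response.",
}

# Tiny illustrative stopword list so common words do not count as matches.
STOPWORDS = {"a", "the", "how", "does", "to", "in", "it", "as", "so"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphanumeric terms that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def retrieve(question: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Step 1: rank documents by shared terms with the question (toy scoring)."""
    query_terms = Counter(tokenize(question))
    scores = {}
    for url, text in documents.items():
        doc_terms = Counter(tokenize(text))
        scores[url] = sum(min(n, doc_terms[term]) for term, n in query_terms.items())
    ranked = sorted(scores, key=lambda url: scores[url], reverse=True)
    return [url for url in ranked[:top_k] if scores[url] > 0]


def build_prompt(question: str, documents: dict[str, str], sources: list[str]) -> str:
    """Step 2: hand the retrieved passages to the model as numbered, citable context."""
    context = "\n".join(
        f"[{i}] {url}: {documents[url]}" for i, url in enumerate(sources, start=1)
    )
    return f"Answer using only the numbered sources.\n{context}\nQuestion: {question}"


question = "How does RAG generate a response?"
sources = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, DOCUMENTS, sources)
print(prompt)
```

The point for GEO is the first step: content that is never retrieved cannot be cited, whatever its quality.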

Read full guide

E-E-A-T

Experience, Expertise, Authoritativeness, Trustworthiness: the four criteria Google and AI assistants use to evaluate content quality and reliability. Experience (first-hand experience) was added in December 2022 and is particularly relevant for GEO because AI assistants prioritize sources demonstrating verifiable practical knowledge.

Read full guide

AI Citation

Reference that an AI assistant makes to an external source when generating a response. Unlike a link in search results, an AI citation implies the model actively selected your content as an authoritative source. Perplexity shows visible numbered citations; ChatGPT and Claude integrate them contextually.

Read full guide

LLM (Large Language Model)

Large-scale language model trained on massive amounts of text to understand and generate language. Examples: GPT-4o (OpenAI), Claude (Anthropic), Gemini (Google). These models are the technological foundation of conversational AI assistants. Their ability to cite external sources depends on the RAG architecture that complements them.

Read full guide

Schema.org / JSON-LD

Structured data standard that communicates semantic meaning of content to AI bots. JSON-LD (JavaScript Object Notation for Linked Data) is Google's recommended format. Correctly implementing Schema.org (Article, FAQPage, LocalBusiness, etc.) significantly increases the probability of your content being understood and cited by RAG systems.
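
As an illustration, the following sketch builds a minimal Article object in JSON-LD and wraps it in the script tag that pages normally use to embed it. The function names and the sample values are hypothetical, and real markup would usually carry more properties (publisher, image, dateModified) than shown here.

```python
import json


def article_jsonld(headline: str, author_name: str, date_published: str, url: str) -> str:
    """Serialize a minimal Schema.org Article as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date
        "mainEntityOfPage": url,
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD for embedding in HTML; escape '</' so the tag cannot be closed early."""
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


# Sample values are placeholders, not data from this glossary.
markup = as_script_tag(
    article_jsonld(
        headline="GEO Glossary",
        author_name="Jane Doe",
        date_published="2025-01-15",
        url="https://example.com/glossary",
    )
)
print(markup)
```

Generating the markup from the same data that renders the page helps keep the structured data and the visible content from drifting apart.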

Read full guide

Semantic Ratio

Metric measuring the ratio of semantic HTML elements (article, section, header, nav, aside) to generic elements (div, span) on a web page. A semantic_ratio above 0.85 correlates with a significantly higher probability of being cited by AI assistants, according to our audits.

Hreflang

HTML attribute, set on link elements in the page head (or supplied through HTTP headers or an XML sitemap), that tells search engines and AI bots how versions of a page in different languages or regions relate to one another. Correctly implementing hreflang helps AI assistants cite the correct version of your content based on the querying user's language.
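
A minimal sketch of generating the alternate links for a page, assuming a hypothetical mapping of language codes to URLs. Each language version should emit the same full set of links, including one that points to itself, plus an x-default fallback.

```python
from html import escape


def hreflang_links(versions: dict[str, str], default_lang: str) -> list[str]:
    """Build <link rel="alternate"> tags for every language version of a page.

    `versions` maps a language or language-region code to the URL of that version.
    `default_lang` selects which version is used as the x-default fallback.
    """
    links = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}">'
        for lang, url in versions.items()
    ]
    links.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(versions[default_lang])}">'
    )
    return links


# Hypothetical URLs for illustration.
VERSIONS = {
    "en": "https://example.com/en/glossary",
    "es": "https://example.com/es/glosario",
}
links = hreflang_links(VERSIONS, default_lang="en")
print("\n".join(links))
```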

Canonical URL

URL designated as the main version of a page to prevent duplicate content issues. AI bots like GPTBot and ClaudeBot respect canonical signals to determine which version of a page to index in their knowledge base.

Core Web Vitals

Google's web performance metrics: LCP (Largest Contentful Paint, loading speed of the main content), INP (Interaction to Next Paint, interaction responsiveness), and CLS (Cumulative Layout Shift, visual stability). Although these are traditional SEO signals, Gemini considers them directly because it has access to Google's index.

Crawlability

Ability of search and AI bots to access and explore all pages of your website. AI bots (GPTBot, ClaudeBot, PerplexityBot) have different crawling behaviors than Googlebot. Correctly configuring robots.txt for each bot is a fundamental requirement of any GEO strategy.
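
One way to check a per-bot configuration before deploying it is Python's standard-library robots.txt parser. The rules and the URL below are an invented example, not a recommended policy; the sketch only shows how the same file can give different answers to different user agents.

```python
from urllib.robotparser import RobotFileParser

# Example rules only: each AI bot gets its own group, with a catch-all at the end.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def access_report(robots_txt: str, url: str, bots: list[str]) -> dict[str, bool]:
    """Return, for each user agent, whether the rules allow it to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


report = access_report(
    ROBOTS_TXT,
    "https://example.com/blog/geo-glossary",
    ["GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"],
)
for bot, allowed in report.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Here Googlebot has no group of its own, so it falls back to the catch-all rules, while PerplexityBot is blocked from the whole site.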

Hub & Spoke

Content architecture where a central topic (Hub or pillar page) connects to multiple subtopics (Spokes or cluster pages) through internal links. This structure helps RAG systems understand the hierarchical relationship between topics and strengthens the site's topical authority for AI assistants.

Indexation

Process by which search engines and AI bots add your pages to their database to include them in results or generated responses. AI bots maintain knowledge bases separate from Google: GPTBot indexes for ChatGPT, ClaudeBot for Claude, etc.

Entity Density

Number of named entities (people, organizations, products, places, technical concepts) relative to the amount of text in a section. The optimal range for RAG systems is 0.10-0.20 entities per sentence. A density that is too low indicates generic content; one that is too high hinders readability.

Chunking

Process by which RAG systems divide your content into fragments (chunks), typically of 200-400 tokens each, and process them individually. A correct semantic structure with clear headings helps each chunk retain its complete meaning so that it can be cited independently.
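
A rough sketch of heading-aware chunking shows why clear headings matter. It rests on two simplifying assumptions: headings are marked with a leading "#" (as in Markdown), and whitespace-separated words stand in for tokens. Real systems use a model-specific tokenizer and their own splitting rules.

```python
def chunk_by_headings(text: str, max_tokens: int = 400) -> list[dict]:
    """Split text into chunks that never cross a heading boundary.

    Paragraphs under one heading are packed together until adding the next
    one would exceed `max_tokens`. A single paragraph longer than the limit
    is kept whole rather than split mid-sentence.
    """
    # Pass 1: group non-empty lines under the heading that precedes them.
    sections, heading, body = [], "", []
    for line in text.splitlines():
        if line.startswith("#"):
            if body:
                sections.append((heading, body))
            heading, body = line.lstrip("#").strip(), []
        elif line.strip():
            body.append(line.strip())
    if body:
        sections.append((heading, body))

    # Pass 2: pack each section's paragraphs into size-limited chunks.
    chunks = []
    for heading, paragraphs in sections:
        current, count = [], 0
        for paragraph in paragraphs:
            size = len(paragraph.split())  # word count as a token approximation
            if current and count + size > max_tokens:
                chunks.append({"heading": heading, "text": " ".join(current), "tokens": count})
                current, count = [], 0
            current.append(paragraph)
            count += size
        if current:
            chunks.append({"heading": heading, "text": " ".join(current), "tokens": count})
    return chunks


SAMPLE = "# GEO\nGEO targets AI citations.\nIt builds on SEO.\n# RAG\nRAG retrieves sources first."
chunks = chunk_by_headings(SAMPLE, max_tokens=5)
for chunk in chunks:
    print(chunk)
```

Because every chunk carries its heading, a fragment retrieved on its own still says what topic it belongs to.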


Frequently asked questions

What is the difference between GEO and SEO?

SEO optimizes for positions in Google search results. GEO optimizes to be cited as a source by AI assistants (ChatGPT, Claude, Gemini, Perplexity). GEO builds on SEO foundations but adds specific optimizations for RAG systems and AI bots.

What is semantic_ratio and how is it calculated?

The semantic_ratio is the proportion of semantic HTML elements (article, section, header) versus generic elements (div, span) on a page. It is calculated by dividing the number of semantic nodes by the total container nodes. A ratio above 0.85 is optimal.
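
The calculation can be sketched with Python's built-in HTML parser. This assumes "total container nodes" means semantic plus generic elements, and it counts only the tags named in this glossary; an actual audit tool may use a longer tag list or a different denominator.

```python
from html.parser import HTMLParser

# Tag sets taken from the glossary definition; treat them as an assumption.
SEMANTIC_TAGS = {"article", "section", "header", "nav", "aside"}
GENERIC_TAGS = {"div", "span"}


class ContainerCounter(HTMLParser):
    """Count opening tags of semantic and generic container elements."""

    def __init__(self) -> None:
        super().__init__()
        self.semantic = 0
        self.generic = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in SEMANTIC_TAGS:
            self.semantic += 1
        elif tag in GENERIC_TAGS:
            self.generic += 1


def semantic_ratio(html: str) -> float:
    """Return semantic / (semantic + generic), or 0.0 when no containers exist."""
    counter = ContainerCounter()
    counter.feed(html)
    counter.close()
    total = counter.semantic + counter.generic
    return counter.semantic / total if total else 0.0


PAGE = "<article><header></header><section><div><span></span></div></section></article>"
print(round(semantic_ratio(PAGE), 2))  # 3 semantic, 2 generic -> 0.6
```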

Why is Schema.org important for AI assistants?

Schema.org allows AI bots to understand the meaning of your content, not just its text. Implementing JSON-LD with types like Article, FAQPage, or LocalBusiness significantly increases the probability that RAG systems understand and correctly cite your content.

What is the llms.txt file?

It is a proposed discovery file for AI bots, similar in spirit to robots.txt for search engines. Placed at the root of your site and written in Markdown, llms.txt provides a structured content map so language models can navigate your site efficiently.
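
A minimal generator for such a file might look like the sketch below. The layout (a title, a short summary in a blockquote, then sections of annotated links) follows the public llms.txt proposal as we understand it; the function name, site name, and URLs are hypothetical.

```python
def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt document as Markdown.

    `sections` maps a section title to a list of (link title, URL, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section_title, links in sections.items():
        lines.append(f"## {section_title}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration.
llms_txt = build_llms_txt(
    site_name="Example Site",
    summary="Guides and reference material on Generative Engine Optimization.",
    sections={
        "Guides": [
            ("GEO Glossary", "https://example.com/glossary", "Definitions of GEO terms"),
            ("RAG explained", "https://example.com/rag", "How retrieval feeds AI answers"),
        ],
    },
)
print(llms_txt)
```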