GEO Glossary

Mastering the vocabulary of Generative Engine Optimization (GEO) is the first step to understanding how AI assistants process, evaluate, and cite web content. This glossary brings together the essential technical terms that connect traditional SEO with the new reality of conversational search.

Each term includes not only its technical definition but also its practical relevance to your GEO strategy: how it affects the probability that ChatGPT, Claude, Gemini, or Perplexity cites your site as an authoritative source.

Terms marked with a link have a dedicated page with implementation guides, research data, and practical examples.

Semantic Ratio

Metric measuring the share of semantic HTML elements (article, section, header, nav, aside) among all container elements on a web page, where generic elements (div, span) make up the remainder. According to our audits, a semantic_ratio above 0.85 correlates with a significantly higher probability of being cited by AI assistants.

Hreflang

HTML attribute, set on link elements with rel="alternate", that indicates to search engines and AI bots the relationship between versions of a page in different languages or regions. Implementing hreflang correctly helps AI assistants cite the right version of your content for the querying user's language.
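
As an illustration, hreflang annotations are usually written as link elements in the page head, with every language version listing all versions including itself. The sketch below generates those elements; the function name hreflang_links and the example.com URLs are hypothetical, and the x-default entry is included on the assumption that you want a fallback for unmatched languages.

```python
from html import escape


def hreflang_links(versions: dict[str, str], default_lang: str) -> list[str]:
    """Build <link rel="alternate"> elements for every language version.

    `versions` maps a language or language-region code to its URL.
    The same full set of links should appear on every version of the page.
    """
    links = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}">'
        for lang, url in versions.items()
    ]
    # x-default marks the fallback version for users whose language is not listed.
    fallback = escape(versions[default_lang])
    links.append(f'<link rel="alternate" hreflang="x-default" href="{fallback}">')
    return links


# Hypothetical site with English and Spanish versions.
links = hreflang_links(
    {"en": "https://example.com/en/", "es": "https://example.com/es/"},
    default_lang="en",
)
print("\n".join(links))
```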

Canonical URL

URL designated as the main version of a page to prevent duplicate content issues. AI bots like GPTBot and ClaudeBot respect canonical signals to determine which version of a page to index in their knowledge base.
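
To check which canonical URL a page actually declares, you can read the link element with rel="canonical" from its HTML. The following is a minimal sketch using only the Python standard library; the class and function names are hypothetical, and it simply returns the first canonical link it finds.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page that is reachable under a tracking URL.
page = '<html><head><link rel="canonical" href="https://example.com/glossary"></head></html>'
print(find_canonical(page))
```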

Core Web Vitals

Google's web performance metrics: LCP (Largest Contentful Paint, the loading speed of the main content), INP (Interaction to Next Paint, interaction responsiveness), and CLS (Cumulative Layout Shift, visual stability). Although these are traditional SEO signals, Gemini considers them directly because it has access to Google's index.

Crawlability

Ability of search and AI bots to access and explore all pages of your website. AI bots (GPTBot, ClaudeBot, PerplexityBot) have different crawling behaviors than Googlebot. Correctly configuring robots.txt for each bot is a fundamental requirement of any GEO strategy.
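
One way to verify per-bot rules before deploying them is to parse a draft robots.txt with Python's standard urllib.robotparser module. The rules and URLs below are illustrative assumptions, not a recommended policy; the bot names are the ones mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Illustrative draft: separate groups for two AI bots plus a catch-all group.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(bot: str, url: str) -> bool:
    """Return True if the draft rules let `bot` fetch `url`."""
    return parser.can_fetch(bot, url)


for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    for path in ("/blog/", "/private/report", "/admin/"):
        print(bot, path, allowed(bot, "https://example.com" + path))
```

PerplexityBot has no group of its own in this draft, so it falls back to the catch-all rules.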

Hub & Spoke

Content architecture where a central topic (Hub or pillar page) connects to multiple subtopics (Spokes or cluster pages) through internal links. This structure helps RAG systems understand the hierarchical relationship between topics and strengthens the site's topical authority for AI assistants.

Indexation

Process by which search engines and AI bots add your pages to their database to include them in results or generated responses. AI bots maintain knowledge bases separate from Google: GPTBot indexes for ChatGPT, ClaudeBot for Claude, etc.

Entity Density

Ratio of named entities (people, organizations, products, places, technical concepts) to total text in a section. The optimal range for RAG systems is 0.10-0.20 entities per sentence. A value that is too low indicates generic content; one that is too high hinders readability.

Chunking

Process by which RAG systems divide your content into fragments (chunks) of 200-400 tokens to process them individually. Correct semantic structure with clear headings ensures each chunk retains complete meaning and can be cited independently.
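
The idea can be sketched as follows: split a section at paragraph boundaries, pack paragraphs into chunks under a token budget, and repeat the heading in each chunk so it keeps its context. This is a simplified illustration, not how any particular RAG system works. It assumes whitespace-separated words as a rough stand-in for tokens, and chunk_section is a hypothetical name.

```python
def chunk_section(heading: str, body: str, max_tokens: int = 400) -> list[str]:
    """Pack a section's paragraphs into chunks of at most `max_tokens` words.

    Words are only an approximation of tokens; real tokenizers count differently.
    A single paragraph longer than the budget is kept whole in its own chunk.
    """
    chunks, current, count = [], [], 0
    for paragraph in body.split("\n\n"):
        if not paragraph.strip():
            continue  # skip empty paragraphs
        words = len(paragraph.split())
        # Close the current chunk if adding this paragraph would exceed the budget.
        if current and count + words > max_tokens:
            chunks.append(heading + "\n" + "\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        chunks.append(heading + "\n" + "\n\n".join(current))
    return chunks


# Three paragraphs of 150 words each under one heading.
body = "\n\n".join(" ".join(["word"] * 150) for _ in range(3))
chunks = chunk_section("Chunking", body, max_tokens=400)
print(len(chunks), [len(c.split()) for c in chunks])
```

Because the heading is repeated at the top of every chunk, each fragment still states its topic when read on its own.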

Frequently asked questions

What is the difference between GEO and SEO?

SEO optimizes for positions in Google search results. GEO optimizes to be cited as a source by AI assistants (ChatGPT, Claude, Gemini, Perplexity). GEO builds on SEO foundations but adds specific optimizations for RAG systems and AI bots.

What is semantic_ratio and how is it calculated?

The semantic_ratio is the proportion of semantic HTML elements (article, section, header) versus generic elements (div, span) on a page. It is calculated by dividing the number of semantic nodes by the total container nodes. A ratio above 0.85 is optimal.
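
The calculation described above can be sketched with Python's standard html.parser. This is not the audit tool itself: the two tag sets follow the examples given in this glossary and are an assumption about which elements count as containers.

```python
from html.parser import HTMLParser

# Assumption: tag sets taken from the glossary's examples; a real audit
# may count a different set of container elements.
SEMANTIC_TAGS = {"article", "section", "header", "nav", "aside"}
GENERIC_TAGS = {"div", "span"}


class ContainerCounter(HTMLParser):
    """Counts semantic and generic container start tags."""

    def __init__(self):
        super().__init__()
        self.semantic = 0
        self.generic = 0

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_TAGS:
            self.semantic += 1
        elif tag in GENERIC_TAGS:
            self.generic += 1


def semantic_ratio(html: str) -> float:
    """Semantic nodes divided by all container nodes (0.0 if there are none)."""
    counter = ContainerCounter()
    counter.feed(html)
    total = counter.semantic + counter.generic
    return counter.semantic / total if total else 0.0


sample = (
    "<article><header><h1>Title</h1></header>"
    "<section><div><p>Text</p></div></section></article>"
)
print(semantic_ratio(sample))  # 3 semantic nodes out of 4 containers -> 0.75
```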

Why is Schema.org important for AI assistants?

Schema.org allows AI bots to understand the meaning of your content, not just its text. Implementing JSON-LD with types like Article, FAQPage, or LocalBusiness significantly increases the probability that RAG systems understand and correctly cite your content.
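
For example, an FAQPage block can be generated from question and answer pairs and embedded in the page as JSON-LD. The sketch below uses the Schema.org types FAQPage, Question, and Answer; the function name faq_jsonld and the sample text are hypothetical.

```python
import json


def faq_jsonld(faqs: list[tuple[str, str]]) -> str:
    """Return a JSON-LD script element describing an FAQPage."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


snippet = faq_jsonld(
    [("What is GEO?", "Optimization to be cited as a source by AI assistants.")]
)
print(snippet)
```

If answers come from user input, escape any "</script>" sequence before embedding the result in a page.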

What is the llms.txt file?

It is a discovery file for AI bots, similar to robots.txt for search engines. llms.txt provides a structured content map so language models can navigate your site efficiently.
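
A minimal sketch of generating such a file is shown below. It assumes the Markdown layout described in the public llms.txt proposal (a title heading, a short blockquote summary, then sections of annotated links); the function name, site name, and URLs are hypothetical, and you should check the layout against the current proposal before relying on it.

```python
def build_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Assemble llms.txt content from a title, a summary, and link sections.

    `sections` maps a section name to a list of (label, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for name, links in sections.items():
        lines.append(f"## {name}")
        lines.append("")
        for label, url, note in links:
            lines.append(f"- [{label}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content map for a small site.
content = build_llms_txt(
    "Example Site",
    "Guides and reference material about Generative Engine Optimization.",
    {
        "Docs": [
            ("GEO Glossary", "https://example.com/glossary", "Definitions of GEO terms"),
        ],
    },
)
print(content)
```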