Why this article exists

Two things changed search in the last seven years. The first was BERT, which Google rolled into Search in October 2019: the moment Google moved from matching keywords to understanding sentences. The second was AI Overviews (AIO), the generative answer block that now sits above the blue links for a growing share of queries. Marketers I talk to know AIO exists. Almost none of them know BERT is what made AIO possible, or how the two relate to the everyday choices they make on a content brief.

That's a problem, because the rules have shifted. Old SEO rewarded keyword density and exact-match phrases. Modern SEO rewards being the page Google's models understand best on a topic — and being structured enough that a generative answer can quote you. Those are different jobs.

This piece walks the line from BERT through MUM and on to AI Overviews, then turns the theory into a working playbook. It assumes you've read the E-E-A-T piece and the Schema piece — the moves below stack on top of those.

BERT — what it is, in plain English

BERT is short for Bidirectional Encoder Representations from Transformers. Three words do the heavy lifting: bidirectional, encoder, and transformer.

A transformer is a kind of neural network introduced in a 2017 paper called "Attention Is All You Need." Its key trick is the attention mechanism — for every word in a sentence, the model can look at every other word and weigh how much each one matters for understanding the current word. Pre-transformer models read sentences left-to-right or right-to-left like a human reader. Transformers read all of it at once, and decide on the fly which other words are relevant.
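To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention weights for one word over a short phrase. The two-dimensional vectors are hand-picked for illustration, and the raw vectors stand in for both queries and keys; a real transformer learns separate projections and works in hundreds of dimensions.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical toy vectors, not taken from any real model.
tokens = ["cost", "per", "lead"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# How much does "lead" attend to each word in the phrase (itself included)?
weights = attention_weights(vectors[2], vectors)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.2f}")
```

The weights always sum to 1, so each one reads as the share of "attention" the current word gives to another word in the sentence.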

BERT was Google's 2018 application of that idea to language understanding, with one important twist: it's bidirectional. When BERT encodes the word "lead" in the phrase "cost per lead in B2B SaaS 2026," it doesn't just see the words to the left ("cost per"). It sees everything in both directions before deciding what "lead" means. That's how it knows we're talking about a sales lead, not the metal, not a leash, not the verb.

And "encoder" means the model produces a numerical representation — a vector — of every word and every passage. Two passages that mean similar things end up near each other in vector space, even if they share no exact words. "Cost per lead" and "CPL by channel" land close together. "Lead" the metal and "lead" the sales prospect land far apart.
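"Near each other in vector space" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely to show the calculation; real BERT embeddings have hundreds of dimensions and come from the model, not from hand-typed numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
embeddings = {
    "cost per lead": [0.9, 0.1, 0.2],
    "CPL by channel": [0.8, 0.2, 0.3],
    "lead (the metal)": [0.1, 0.9, 0.1],
}

close = cosine_similarity(embeddings["cost per lead"], embeddings["CPL by channel"])
far = cosine_similarity(embeddings["cost per lead"], embeddings["lead (the metal)"])

print(f"similar meaning:   {close:.2f}")
print(f"different meaning: {far:.2f}")
```

With these toy numbers the two sales phrases score much higher than the sales phrase against the metal, even though the first pair shares no exact words.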

That single capability — encoding meaning instead of matching strings — is what made everything that came after possible.

Pre-BERT, "best banking app for under 18s" and "best banking app for kids" were largely treated as two queries with two ranking sets. Post-BERT, Google is far more likely to read them as a single intent and serve closely overlapping rankings. Your page either wins that intent or it doesn't.

From BERT to MUM to AI Overviews

BERT was the start, not the end. Google's language models have compounded since:

The line is straight: BERT lets Google understand the page. MUM lets it reason across modalities and languages. AI Overviews let it write you an answer using both, plus citations.

AI Overviews — what they actually are

An AI Overview is the boxed, generative answer Google now shows above (or near the top of) the SERP for a growing share of queries. It's not a featured snippet — those are quoted directly from a single page. An AIO is composed text, generated by a Gemini-family model, drawing on multiple sources that are then cited as supporting links beside the answer.

From the marketer's perspective, an AIO is two things at once:

Both are happening at the same time. The strategy is to be the page that gets cited inside the AIO and ranks underneath it for the cases where the user wants more. Doing one without the other is half a strategy.

What triggers an AI Overview

AIOs don't appear on every query. Google's behaviour, as observed across hundreds of queries:

What gets cited inside an AIO answer

The cited sources beside an AIO aren't always the top blue-link results. Patterns I've watched in the wild:

If your page is the clearest, most credibly authored, most comprehensively structured answer to the specific sub-question the AIO is answering, you get cited. The order matters: clear, credible, comprehensive. In that order.

The marketer's playbook

Five moves that compound. None of them are exotic. The compounding is the point.

1 · Write to entities, not keywords

Pre-BERT, the unit of optimisation was the keyword phrase. Post-BERT, it's the entity: a thing in the world that Google understands, with relationships to other things. "Cost per lead" is an entity. So is "B2B SaaS." So is "LinkedIn Campaign Manager." A page about CPL in B2B SaaS should name the entities it's connected to (CAC, MQL, SQL, channel mix, attribution windows) without keyword-stuffing them. Google's models are fluent enough to read this as "this page understands the topic," not "this page is keyword-spamming."

2 · Structure for sampling

Treat every H2 as a possible AIO citation point. The model is going to look for a passage that cleanly answers a sub-question. The easier you make it to lift a clean 80–120 words, the more often you'll be lifted. Concretely:
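A rough way to audit the 80–120 word target is to count the words in the opening passage under each H2. The sketch below assumes you have already extracted each heading and its first passage into a dictionary; the function name, the headings, and the placeholder text are all hypothetical.

```python
def opening_passage_report(sections, low=80, high=120):
    """Flag whether each section's opening passage falls in the target word range."""
    report = {}
    for heading, passage in sections.items():
        count = len(passage.split())  # whitespace-separated word count
        if count < low:
            status = "too short"
        elif count > high:
            status = "too long"
        else:
            status = "ok"
        report[heading] = (count, status)
    return report


# Hypothetical draft: heading -> opening passage (placeholder words, not real copy).
draft = {
    "What is cost per lead?": " ".join(["word"] * 95),
    "How do attribution windows work?": " ".join(["word"] * 240),
}

for heading, (count, status) in opening_passage_report(draft).items():
    print(f"{heading}: {count} words ({status})")
```

Word count is only a proxy. A passage can hit the range and still fail to answer the sub-question cleanly, so treat the report as a prompt for an edit, not a pass mark.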

3 · Schema as the bridge

Structured data gives the model an explicit, machine-readable answer to "what is this page about?", saving it from inferring. Use Article schema with proper author, image, and dateModified for explainer pieces, FAQPage schema only for genuine FAQs, and HowTo schema for procedural content. The Schema piece goes through the patterns. Match the schema type to the actual content; mismatched schema is at best ignored and at worst earns a structured data manual action, not a reward.
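As an illustration of the Article pattern, the sketch below builds a JSON-LD block with the author, image, and dateModified properties mentioned above. The helper name and every value (headline, author, URLs, date) are placeholders; the property names are standard schema.org vocabulary.

```python
import json


def article_jsonld(headline, author_name, author_url, image_url, date_modified):
    """Build a minimal schema.org Article object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "image": [image_url],
        "dateModified": date_modified,  # ISO 8601 date
    }


# All values below are hypothetical placeholders.
article_markup = article_jsonld(
    headline="From BERT to AI Overviews",
    author_name="Jane Example",
    author_url="https://example.com/about",
    image_url="https://example.com/images/cover.png",
    date_modified="2026-01-15",
)

# JSON-LD is embedded in the page inside a script tag of this type.
print('<script type="application/ld+json">')
print(json.dumps(article_markup, indent=2))
print("</script>")
```

Generating the markup from the same fields that drive the visible byline and date is one way to keep the schema and the page from drifting apart.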

4 · Visible authorship + sameAs everywhere

An AIO that cites a page with a clear named author and a verifiable identity is a much safer answer for Google than one citing an anonymous post. Make the byline a real link to a real /about page. Use Person schema with sameAs pointing to LinkedIn (and any other authoritative profile). The E-E-A-T piece covers this on the trust dimension; here, it's also a citation-eligibility signal.
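A Person object with sameAs might look like the sketch below. The name and URLs are invented, and the https check is just one illustrative guard against half-pasted profile links, not a requirement of the schema.org vocabulary.

```python
import json
from urllib.parse import urlparse


def person_jsonld(name, about_url, same_as):
    """Build a schema.org Person dict, rejecting sameAs entries that are not absolute https URLs."""
    for url in same_as:
        parts = urlparse(url)
        if parts.scheme != "https" or not parts.netloc:
            raise ValueError(f"sameAs entry is not an absolute https URL: {url}")
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": about_url,          # the page the byline links to
        "sameAs": list(same_as),   # other profiles for the same person
    }


# Hypothetical author and profile URLs.
person_markup = person_jsonld(
    name="Jane Example",
    about_url="https://example.com/about",
    same_as=["https://www.linkedin.com/in/jane-example"],
)

print(json.dumps(person_markup, indent=2))
```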

5 · Cover the topic, not just the slice

AIOs favour sources that look like they understand a subject end-to-end. Three deep articles with internal cross-links beat ten shallow articles. If you've already written a piece on CPL, the second piece should be on attribution windows and link to the first; the third should be on channel mix and link to both. You're building a topic cluster, but the goal isn't internal-linking SEO. It's giving the model evidence that this site is the place to draw from.
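The "each new piece links back to the earlier ones" rule is easy to check mechanically. The sketch below assumes you can export, per page, the set of internal URLs it links to; the paths and the function name are hypothetical.

```python
def missing_cluster_links(publish_order, links):
    """Return (page, earlier_page) pairs where a later piece does not link back."""
    missing = []
    for position, page in enumerate(publish_order):
        for earlier in publish_order[:position]:
            if earlier not in links.get(page, set()):
                missing.append((page, earlier))
    return missing


# Hypothetical cluster, in publication order.
publish_order = ["/cpl", "/attribution-windows", "/channel-mix"]

# Page -> internal pages it links to. The third piece forgot one link.
links = {
    "/cpl": set(),
    "/attribution-windows": {"/cpl"},
    "/channel-mix": {"/cpl"},
}

gaps = missing_cluster_links(publish_order, links)
for page, earlier in gaps:
    print(f"{page} should link to {earlier}")
```

Here the only gap reported is the channel mix piece missing its link to the attribution windows piece.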

Mistakes I see most often

Practical checklist

Three categories. Pick the ones that aren't already done on your highest-traffic page, ship those first.

Most of this list is invisible to the user and unmistakable to a model. That asymmetry is the opportunity. Spend a week on a single piece doing all of the above (including the schema, the byline, and the topic cluster) and watch what happens to your impressions on the queries that show AIO. In my experience the compounding starts sooner than you'd expect.

AI Overviews didn't replace SEO. They raised the bar for what counts as the best page on a topic, and they made the bar machine-readable. The marketers winning in 2026 are the ones writing for both readers and models, in that order, and refusing to choose.