The Markdown Myth: What Google Actually Said About llms.txt and AI SEO

2026.06.19 // Minneapolis Made // 19 min read

An SEO consultant on LinkedIn told tens of thousands of people last week that Google had just released a new Markdown standard for AI optimization and GEO. The post traveled the way these things travel: confident tone, technical-sounding vocabulary, a vague gesture at an unnamed Google announcement, a list of services the consultant could sell to help you implement it.

There is no standard. Google released nothing. The closest thing to a Google statement on the topic that month was a Search Central documentation update telling site owners to stop building these files because Google ignores them.

This article is a complete walkthrough of what Google actually said, what llms.txt actually is, who has actually adopted it, and how to spot the pattern of marketing-services professionals manufacturing terminology to sell you optimization work for problems that do not exist.

Key Takeaways

  • Google has not released any Markdown standard, AI optimization standard, or GEO standard. Its official guidance, updated June 15, 2026, states that you do not need to create new files of any kind, and that Google Search ignores llms.txt.
  • John Mueller and Martin Splitt dedicated an entire episode of Search Off the Record to dismantling the premise: HTML is the standard, Markdown gives no SEO benefit, and Markdown strips out the link, navigation, and heading structure that LLMs need to understand context.
  • llms.txt was proposed in September 2024 by Jeremy Howard at Answer.AI, not by Google. As of June 2026, no major LLM vendor (OpenAI, Anthropic, Google, Perplexity) has formally adopted it.
  • Common Crawl, the corpus that feeds at least 64 percent of all LLMs released between 2019 and late 2023, ingests raw HTML. Not Markdown. Not llms.txt. HTML.
  • If your vendor is selling you Markdown conversion or an llms.txt optimization package, ask them to name a single LLM provider that ingests it. They cannot, because none do.

What Did Google Actually Say?

On June 15, 2026, Google updated its Search Central documentation with a page called Optimizing your website for generative AI features on Google Search. The opening guidance is a single sentence that ends the AI optimization upsell on its own:

Per Google Search Central: “You don’t need to create new machine readable files, AI text files, markup, or Markdown to appear in Google Search (including its generative AI capabilities), as Google Search itself doesn’t use them.” That is the official position from the team that owns Search.

On llms.txt specifically, the same documentation page says: “It’s completely fine if you decide to create and maintain LLMS.txt files (or other similar files) for other services or systems that use these files. Doing so won’t harm (nor help) your visibility or rankings in Google Search, as Google Search ignores them.”

This was covered the same day by Search Engine Land. Read both. There is no ambiguity to interpret.

Original Data

The relevant section of Google’s documentation is roughly 600 words long. There are zero references to a Markdown standard, zero references to GEO, zero references to AI optimization as a discipline you should hire a vendor to perform on your behalf. The page exists specifically to disabuse site owners of the idea that they need to do anything new.

What Did Mueller and Splitt Actually Say?

In episode 111 of Search Off the Record, the official Google Search Relations podcast, John Mueller and Martin Splitt sat down with one question: Should I use Markdown for my site?

That was not a listener question they answered in passing. It was the framing question for the entire episode. They had clearly been watching the same AI SEO discourse the rest of us have, and they used the episode to address it head-on. The full episode is at search-off-the-record.libsyn.com, and the verbatim quotes below are corroborated by Search Engine Roundtable and Search Engine Journal.

Mueller, on HTML being the SEO baseline:

“The generic SEO angle of how do I find a website that sells me a photograph is almost going to be completely bound to HTML pages and normal web pages.”

Mueller, on the “HTML is too hard for LLMs to parse” claim that AI SEO consultants keep repeating:

“The web with HTML and everything has been around for really long time, longer than Markdown. And all of the crawlers out there have practiced with HTML. And converting HTML into text is trivial. There are lots of libraries out there that can do that for you. So if you think about what an average web crawler might look for or might need to find on a page to be able to understand it, then probably that’s just HTML.”

Splitt, on why Markdown actively makes your content worse for LLMs to understand:

“HTML with all the links and navigation and the headers… that kind of gets stripped out in the Markdown files… important to understand the structure.”

Read that last quote twice. The pitch is that Markdown helps LLMs understand your content. Two members of Google's own Search Relations team say the opposite is true. Markdown removes the structural signal LLMs use to figure out what your page is about.
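To make both quotes concrete, here is a minimal sketch using only Python's standard-library html.parser. It is not how Google or any LLM vendor processes pages. The PageOutline class and the sample page are our own illustration. It shows that pulling plain text out of HTML takes a few dozen lines, and that the same pass yields the headings and links Splitt is talking about.

```python
from html.parser import HTMLParser


class PageOutline(HTMLParser):
    """Collect visible text, headings, and links from an HTML document."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.text_parts = []   # visible text fragments, in document order
        self.headings = []     # (level, text) pairs
        self.links = []        # (href, anchor text) pairs
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self._heading = None   # [level, fragments] while inside a heading
        self._link = None      # [href, fragments] while inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in self.HEADING_TAGS:
            self._heading = [int(tag[1]), []]
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._link = [href, []]

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.HEADING_TAGS and self._heading:
            level, parts = self._heading
            self.headings.append((level, " ".join(parts)))
            self._heading = None
        elif tag == "a" and self._link:
            href, parts = self._link
            self.links.append((href, " ".join(parts)))
            self._link = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        self.text_parts.append(text)
        if self._heading:
            self._heading[1].append(text)
        if self._link:
            self._link[1].append(text)


# A made-up page, used only for illustration.
SAMPLE_HTML = """
<html><head><title>Prints</title><style>p {color: red}</style></head>
<body>
  <nav><a href="/shop">Shop</a> <a href="/about">About</a></nav>
  <h1>Landscape prints</h1>
  <p>Framed photographs, shipped flat.</p>
  <h2>Sizes</h2>
  <p>See the <a href="/sizes">size guide</a>.</p>
</body></html>
"""

outline = PageOutline()
outline.feed(SAMPLE_HTML)
plain_text = " ".join(outline.text_parts)

print(plain_text)
print(outline.headings)  # [(1, 'Landscape prints'), (2, 'Sizes')]
print(outline.links)     # [('/shop', 'Shop'), ('/about', 'About'), ('/sizes', 'size guide')]
```

The plain text is what a Markdown export roughly preserves. The heading levels and the link targets with their anchor text come for free from the HTML, and they are exactly the part that tends to get flattened or dropped in a conversion.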

Unique Insight

The AI SEO pitch survives because most buyers do not read primary sources. They read the LinkedIn summary of a Twitter recap of a newsletter mention of a podcast nobody listened to. Each layer between the buyer and Google’s actual position is an opportunity for someone selling services to substitute confident terminology for a Google quote. The simplest defense is to go to the source. The episode is 41 minutes. The Search Central documentation is one page.

What Is llms.txt and Who Built It?

llms.txt was proposed in September 2024 by Jeremy Howard at Answer.AI. The format is a simple Markdown file you place at your domain root, listing the parts of your site you think an LLM should consume, with brief summaries. It is a clever idea. It is also, as of June 2026, completely unimplemented by any major AI vendor.
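For readers who have never seen one, the sketch below renders a file in the layout the llmstxt.org proposal describes as we understand it: a title line, a one-line summary in a blockquote, then sections of annotated links. The render_llms_txt helper, the site name, and the example.com URLs are hypothetical. Check the proposal itself for the authoritative details.

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt-style Markdown file.

    `sections` maps a section heading to a list of
    (link title, url, short note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
llms_txt = render_llms_txt(
    "Example Co",
    "Example Co sells framed landscape prints.",
    {
        "Docs": [
            ("Pricing", "https://example.com/pricing.md", "Current price list"),
            ("Shipping", "https://example.com/shipping.md", "Regions and timelines"),
        ],
    },
)
print(llms_txt)
```

The whole deliverable is about a dozen lines of string formatting. Keep that in mind when you see it as a line item on an invoice.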

That last part is the part the AI SEO pitch leaves out.

Google does not honor llms.txt. Search Central’s documentation confirms that in writing. Gary Illyes confirmed the same thing in person at Search Central Live in July 2025. Mueller has compared the format to “the discredited keywords meta tag,” the 1990s SEO artifact that vendors used to sell as optimization work until search engines stopped reading it entirely.

OpenAI has not published any commitment to honor llms.txt at crawl time. Anthropic has not. Perplexity has not. Meta has not. Mistral has not. The format exists at llmstxt.org as a community proposal. There is no IETF RFC, no W3C working group, no production statement from any vendor whose adoption would actually matter.

Per PPC Land’s coverage of llms.txt adoption: the major AI platforms have ignored the proposed standard. Kai Spriestersbach’s Medium analysis calls it “a dud.” The format is not dead because it is bad. It is dead because the four or five companies that would have to support it for it to matter have not.

This matters because every vendor pitching llms.txt optimization is selling you a file that no system reads. The work has the surface texture of optimization. The file gets created. The deliverable gets emailed. The invoice gets paid. The file then sits on your server doing nothing for anyone, because nothing crawls for it.
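You do not have to take our word, or the vendor's, on whether anything asks for the file. Your own server logs will tell you. The following is a rough sketch that assumes your access log uses the common "combined" format. The sample lines, the IP addresses, and the ExampleBot user agent are invented. Adjust the pattern if your server logs differently.

```python
import re
from collections import Counter

# Assumes the "combined" access log format used by many web servers.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_requests(log_lines, path="/llms.txt"):
    """Count requests for `path`, grouped by user agent string."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        # Ignore any query string when comparing the requested path.
        if match and match.group("path").split("?")[0] == path:
            hits[match.group("agent")] += 1
    return hits


# Invented log lines, for illustration only.
sample_log = [
    '203.0.113.7 - - [19/Jun/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.8 - - [19/Jun/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 8841 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [19/Jun/2026:10:01:00 +0000] "GET /llms.txt?v=2 HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
]

hits = count_requests(sample_log)
print(hits)  # Counter({'ExampleBot/1.0': 2})
```

In practice you would pass an open log file instead of the sample list. A request for the file is not evidence that any model uses it, but a log with no requests at all settles the question quickly.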

Personal Experience

We have run audits on Minneapolis Made client sites where the previous vendor had installed llms.txt files, AI sitemap files, and a custom robots.txt block specifically targeting LLM crawlers. The invoice for that work was over four thousand dollars. The traffic impact was zero, because the files were addressing a problem that did not exist. The vendor was not stupid. The vendor was operating in a market where invented work is easier to sell than real work, and where buyers who do not read primary sources have no defense.

What Do LLMs Actually Train On?

The single largest source of training data for nearly every major large language model is Common Crawl, an open web archive that ingests a few billion pages a month. According to a 2024 review of LLM pre-training corpora, at least 64 percent of LLMs released between 2019 and October 2023 trained on at least one filtered version of Common Crawl, including the C4 corpus that Google built for its T5 models and the Pile-CC corpus that powers many open-source models.

Common Crawl stores raw HTML. WARC files. Not Markdown. Not llms.txt. Not a custom AI optimization sitemap. Raw HTML, exactly as the page was served to the crawler. The downstream filtering and cleaning steps the LLM trainers perform are applied to that HTML.
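To show what "raw HTML in a WARC file" means, here is a simplified sketch that builds one WARC-style response record in memory and splits it back apart. The record is hand-made and omits headers a real record carries, such as the record ID and date. Real archives are compressed and hold many records, which is why production pipelines typically lean on a dedicated library such as warcio rather than code like this.

```python
def parse_warc_response(record: bytes):
    """Split one uncompressed WARC response record into its parts."""
    # WARC headers end at the first blank line.
    warc_head, _, rest = record.partition(b"\r\n\r\n")
    warc_lines = warc_head.decode("utf-8").split("\r\n")
    version = warc_lines[0]
    warc_headers = dict(line.split(": ", 1) for line in warc_lines[1:])

    # Content-Length covers the captured HTTP response, headers and body.
    length = int(warc_headers["Content-Length"])
    http_message = rest[:length]
    http_head, _, body = http_message.partition(b"\r\n\r\n")
    return version, warc_headers, http_head.decode("iso-8859-1"), body


# A hand-made, simplified record for illustration only.
html = b"<html><body><h1>Hello</h1></body></html>"
http_message = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n" + html
)
record = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: https://example.com/\r\n"
    b"Content-Type: application/http; msgtype=response\r\n"
    b"Content-Length: " + str(len(http_message)).encode() + b"\r\n"
    b"\r\n" + http_message + b"\r\n\r\n"
)

version, warc_headers, http_head, body = parse_warc_response(record)
print(warc_headers["WARC-Target-URI"])  # https://example.com/
print(body.decode("utf-8"))             # <html><body><h1>Hello</h1></body></html>
```

The payload that comes out the other end is the page exactly as the server sent it. There is no slot in the record for a Markdown variant or an llms.txt summary.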

This is the strongest empirical rebuttal to the entire Markdown-for-AI-SEO premise. The corpus the AI is training on is HTML. The crawler the AI vendors use does not request a Markdown variant. Your llms.txt file is not in the pipeline. The optimization work the AI SEO vendor is billing you for is happening at a layer no model can see.

Per the Mozilla Foundation’s 2024 report on Common Crawl, the cost of producing this corpus is roughly the price of a sandwich. That is the level of effort the model providers are willing to spend on alternate format support. They are not building Markdown ingestion pipelines. They are pointing their training runs at the HTML pile that already exists.

Why Does the AI SEO Pitch Keep Spreading?

This is the section where the article stops being technical and starts being about the marketing-services industry. The reason the pitch keeps spreading is that it works, on the buyer side, in the absence of pushback from the source.

SparkToro’s 2025 study on AI brand recommendations recruited 600 volunteers to run twelve prompts through ChatGPT, Claude, and Google AI a combined 2,961 times. The headline finding was that the same prompt run twice returned the same list of recommended brands less than one percent of the time. That is the measurement floor of the entire GEO optimization category. The thing being optimized is statistically random across runs. There is nothing stable to measure, which means there is nothing stable to charge a retainer against.
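If you want to see what that kind of instability looks like in numbers, the sketch below computes how often two runs of the same prompt return the same set of brands. This is our own simplified metric, not SparkToro's methodology, and the four runs are invented placeholders. It treats two lists as "the same" when they contain the same brands in any order, which is an assumption on our part.

```python
from itertools import combinations


def identical_list_rate(runs):
    """Fraction of run pairs that recommended exactly the same brands.

    Order is ignored: two runs match if their brand sets are equal.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if set(a) == set(b))
    return same / len(pairs)


# Invented results from running one prompt four times.
runs = [
    ["Brand A", "Brand B", "Brand C"],
    ["Brand A", "Brand C", "Brand D"],
    ["Brand C", "Brand B", "Brand A"],
    ["Brand B", "Brand D", "Brand E"],
]

rate = identical_list_rate(runs)
print(f"{rate:.1%} of run pairs returned the same brands")
```

Ask any vendor reporting "AI visibility" gains how many times each prompt was run and how much the answers varied between runs. A single screenshot of a single run is not a measurement.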

Per Lily Ray’s analysis at Amsive: around thirty sites that publicly bragged about GEO wins in 2024 and 2025 went on to suffer significant organic traffic losses in the months afterward. PPC Land’s coverage of Ray’s broader critique frames the pattern bluntly: the SEO industry is getting AI search dangerously wrong, in ways that hurt the sites paying for the work.

Tom Pick at Webbiquity put the diagnosis in the headline of his column: “Beware the Generative Engine Optimization Snake Oil.” That framing is not a hot take from a random commenter. It is the consensus reaction from the working SEO practitioners with the most public reputational skin in the game.

The reason the pitch still works against this much pushback comes down to information asymmetry. The buyer who hears “Google released a Markdown standard” on LinkedIn has no efficient way to verify the claim. The buyer does not read Search Engine Roundtable. The buyer does not subscribe to Search Off the Record. The buyer has a business to run. The buyer trusts that the person posting confidently about Google announcements has read the Google announcement.

Most of the time, the person posting has not.

The One-Question Test

If a vendor pitches you Markdown conversion, an llms.txt optimization package, an AI sitemap, or any deliverable framed as making your content easier for LLMs to understand, ask them one question:

The One Question

“Name one major LLM provider, whether OpenAI, Anthropic, Google, Perplexity, Meta, or Mistral, that has publicly committed to crawl, parse, or honor the file you are billing me to produce.”

If they cannot name one, the deliverable is decoration. The file will sit on your server. Nothing will read it. The invoice will be paid for work that produces no measurable outcome, because the system the work is aimed at does not exist.

If they name a provider, ask for the public commitment. A blog post. A docs page. A conference talk. A tweet from a verified employee. Something more durable than a Slack rumor. Then go look up the source. The whole point of the test is to make the vendor produce a primary source the buyer can verify in five minutes.

In June 2026 the correct answer to that question is that no provider has made the commitment, which is why no real optimization can be billed against the file.

Already paying for AI optimization work?

If you are on a retainer that includes llms.txt files, Markdown conversion, GEO optimization, or any deliverable framed around AI search, send us the invoice. We will tell you, on the phone, in fifteen minutes, which line items are doing real work and which are decoration. No upsell, no sales pitch, just a read.

Get a Free Vendor-Invoice Audit

What Actually Moves the Needle in 2026?

This is the part the snake-oil pitch leaves out, because real optimization is harder to sell than invented optimization.

If your goal is to be discovered by Google’s organic results, Google’s AI Overviews, ChatGPT’s web search citations, Perplexity’s source list, and the rest of the AI search surface, the work that actually matters is the same work that has mattered since 2010. Just more of it, and more carefully.

  • Server-rendered HTML with real semantic structure. Headings in order. Paragraphs that answer questions in their first sentence. Tables when the data is tabular. Lists when the data is a list. The HTML standard already encodes meaning. Use it.
  • Page speed and Core Web Vitals. Slow pages get crawled less often, ranked lower, and dropped from AI citation pools faster. Our 2026 speed benchmark of Twin Cities web design agencies put real numbers behind this for the local market.
  • Topical depth on a domain you own. One firm, one site, deep coverage of a topic the firm is qualified to write about. Not a microsite network. Not a vendor-owned subdomain. Not a templated city page on a domain you do not control.
  • Schema.org structured data. Real structured data, validated, generated server-side from real fields, not bolted on as an afterthought. Article, FAQPage, BreadcrumbList, LocalBusiness where the entity warrants it.
  • Original, first-hand expertise demonstrated in the content. Google’s Search Quality Rater Guidelines call this Experience, the first E in E-E-A-T. AI search systems are converging on the same signal. Content written by someone who has actually done the work outranks content written by someone who has not.
  • Backlinks from real publications. The link graph still matters. AI citation systems use it as a trust input the same way Google does.
  • Brand entity consistency. Name, address, phone, attribution, biography, all consistent across the open web. The Knowledge Graph reads this. AI search systems read it too.

None of that is new. None of it requires Markdown. None of it requires llms.txt. None of it is sold as a discrete “AI optimization” package because it is the same internet marketing work it has always been. The pitch needs new vocabulary to justify a new retainer. The work does not.
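As one concrete instance of the list above, here is a minimal sketch of structured data generated server-side from real fields. The article_json_ld helper and the example.com URL are hypothetical, and a real Article object usually carries more properties than the four shown. Treat it as a starting point and run the output through a structured data validator before shipping it.

```python
import json


def article_json_ld(headline, author_name, date_published, url):
    """Build a JSON-LD <script> tag describing an article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Organization", "name": author_name},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a field value can never close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


tag = article_json_ld(
    headline="The Markdown Myth: What Google Actually Said About llms.txt and AI SEO",
    author_name="Minneapolis Made",
    date_published="2026-06-19",
    url="https://example.com/markdown-myth",  # placeholder URL
)
print(tag)
```

The point of generating the block from the same fields that render the page is that the markup cannot drift out of sync with the visible content, which is what "bolted on as an afterthought" tends to produce.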

The Pattern Underneath All of This

The Markdown myth is not a technical story. It is a marketing-services-industry story.

The marketing services industry has a structural problem that any honest practitioner will admit. The work that produces measurable results is mostly boring, mostly slow, and mostly the same work that worked five years ago. There is no new acronym to attach to it. There is no urgency to manufacture. There is no fear-of-missing-out to weaponize against a small business owner who is trying to decide where to put a marketing budget.

So a percentage of the industry, the percentage that depends on retainers and cannot survive on referrals from satisfied past clients, generates the urgency itself. They invent terminology. They invent standards. They invent emergencies. They tell business owners that Google released a Markdown standard, that AI search will obsolete their website within a year, that without an llms.txt file they will be invisible to ChatGPT, that GEO is a discipline they need to retain a vendor to perform.

None of it is true. All of it sells.

Unique Insight

The single most useful question a business owner can ask a marketing vendor in 2026 is: “Can you point me to a primary source from the company whose product you are saying I need to optimize for?” Not a recap. Not a summary. The primary source. If the vendor cannot produce one in five minutes, the optimization work they are selling does not have a system on the receiving end. The file gets created, the work gets billed, and nothing changes, because nothing was reading the file in the first place.

The Bottom Line

Google did not release a Markdown standard. Google released a documentation update that says Google ignores Markdown, ignores llms.txt, ignores AI optimization files, and that none of them affect rankings in either direction. Mueller and Splitt dedicated a podcast episode to explaining the same point in plain language. The transcript is public. The episode is free.

No major LLM vendor has formally adopted llms.txt. Common Crawl ingests HTML. The corpus that trains the models you are supposedly optimizing for is HTML. The crawler that builds it does not request a Markdown variant. Your llms.txt file is not in the pipeline.

If you are paying a vendor for AI optimization, ask them the one question. If they cannot name a provider that ingests what they are charging you to produce, the deliverable is decoration, and the retainer is being paid against a system that does not exist.

The honest version of internet marketing in 2026 looks almost identical to the honest version of internet marketing in 2020. Server-rendered HTML, real semantic structure, real expertise written by real people, fast pages, validated schema, a clean brand entity, and links from publications that earned them. That work is still the work. It is not glamorous. It does not require new acronyms. It pays off slowly, in the same way it always has.

If a vendor is telling you otherwise, they are telling you a story. The story is profitable for the vendor. It is not profitable for you.

Want a second opinion before signing?

If you have a proposal in front of you that talks about AI optimization, GEO, llms.txt, Markdown conversion, or AI sitemap work, send it over. We will read it line by line and tell you which parts are real and which parts are decoration. No retainer pitch on the other end.

Send Us the Proposal

Frequently Asked Questions

Did Google release a Markdown standard for AI optimization?

No. Google updated its Search Central documentation on June 15, 2026 to clarify that site owners do not need to create new files, markup, or Markdown to appear in Google Search, including its generative AI features, because Google Search does not use them.

Does Google honor llms.txt?

No. Google Search Central’s documentation states explicitly that Google Search ignores llms.txt files. They will not harm or help your rankings. Gary Illyes of Google confirmed the same position publicly at Search Central Live in July 2025.

Who created llms.txt?

Jeremy Howard of Answer.AI proposed llms.txt in September 2024. It is a community proposal published at llmstxt.org. It was not created by Google, OpenAI, Anthropic, or any other major LLM vendor, and as of June 2026 no major LLM vendor has committed to honoring it.

Do I need to convert my website to Markdown for AI search?

No. John Mueller and Martin Splitt addressed this directly on the Search Off the Record podcast. HTML has been the standard format on the web for longer than Markdown has existed, every web crawler is built around it, and converting HTML to plain text is a trivial operation for any crawler. Splitt added that Markdown actually strips out the link structure, navigation, and heading relationships that LLMs use to understand context.

What does GEO mean and is it real?

GEO stands for Generative Engine Optimization, the proposed practice of optimizing content for citation by generative AI search systems. The category has been heavily criticized by working SEO practitioners. SparkToro’s 2025 study found that the same AI prompt returned the same list of recommended brands less than one percent of the time across thousands of test runs, which means there is nothing stable to measure or optimize against. Lily Ray of Amsive documented multiple sites that bragged about GEO wins and then suffered significant organic traffic losses afterward.

What should I actually do to be cited by ChatGPT, Perplexity, and AI Overviews?

The same things that produce strong organic SEO. Server-rendered HTML, semantic structure, original expertise written by qualified people, fast page load, validated schema, a clean and consistent brand entity across the open web, and earned links from real publications. AI search systems lean on the same trust signals Google uses, because they are largely trained on the same open web, served as HTML, that Google crawls and indexes.

How do I know if my SEO vendor is selling me snake oil?

Ask them to name a single LLM provider, OpenAI, Anthropic, Google, Perplexity, Meta, or Mistral, that has publicly committed to crawl, parse, or honor the files they are billing you to produce. Ask for the public source. A docs page, a blog post, a verified employee statement. If they cannot produce a primary source in five minutes, the deliverable they are selling is not connected to a system that will read it.


Written and curated by

Minneapolis Made
