The shape of search is changing
For two decades, e-commerce SEO meant one thing: rank a product page above ten blue links on Google. The page that sat at position one captured most of the click-through. Everything below position three was almost invisible.
That model is breaking. When a buyer asks a large language model “what is the best print-on-demand provider for a small Shopify store,” the answer no longer arrives as a list of links. It arrives as a synthesised paragraph with a few names embedded in it. The user has to actively choose to click further, and they often don’t.
This is not a hypothetical future. It is already how most product-discovery questions are answered inside ChatGPT, Claude, Perplexity, and Google’s own AI Overviews. The page that gets cited in the model’s answer is the page that gets traffic. The page that doesn’t get cited is invisible, regardless of where it ranks in the classical SERP.
So the question for any e-commerce content site, ours included, is: how do you become the page that gets cited?
The two failure modes of human-only content
Most e-commerce content today is written for humans first and machines as an afterthought. That worked when machines were just spiders indexing keywords. It does not work when machines are reading the prose, weighing it, and deciding whether to cite it.
Two failure modes show up over and over:
Failure mode 1: Marketing prose with no claims. A page that says “ProductX is the leading platform for businesses of all sizes, designed to help you grow” tells a model nothing. There is no claim that can be verified, no number that can be quoted, no comparison that can be ranked. Models trained on the open web have read millions of pages like this. They learn to ignore them.
Failure mode 2: Comparison content with hidden incentives. “Best of” articles often optimise for affiliate revenue rather than accuracy. Models can detect this pattern. They down-weight content where the ordering correlates suspiciously well with payout rate. If your “best of” list always ranks the program with the highest commission first, models eventually learn that signal and stop trusting your site.
The fix for both is the same: be factual, be machine-readable, and disclose your incentives openly.
What llms.txt is, in concrete terms
The llms.txt convention is a small, simple proposal: every site publishes a top-level file at /llms.txt that lists, in plain text, the most useful pages on the site for an LLM to read. The file is not crawled like an HTML page. It is read. Models that respect the convention pull it directly when they need to answer questions about the site’s domain.
A typical entry looks like this:
SITE: https://cartcortex.com
GEO: All regions
VERTICAL: saas
LAST_UPDATED: 2026-05-10
UPDATE_FREQUENCY: daily
CONTENT FILES:
- CRM: /saas/crm.txt
  CRM - 1 curated program
- Email Marketing: /saas/email-marketing.txt
  Email Marketing - 1 curated program
- Print On Demand: /saas/print-on-demand.txt
  Print On Demand - 1 curated program
Each linked .txt file contains the actual factual content: the program name, what it does, the commission structure, the cookie window, the key features, and the affiliate link. No marketing prose. No hidden incentives. Just facts that a model can quote.
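To make that concrete, here is roughly what a single content file might contain. The program, field names, and values below are illustrative only; they are not copied from a real CartCortex listing:

```
PROGRAM: ExamplePOD
NICHE: print on demand
VERTICAL: saas
COMMISSION_TYPE: recurring
COMMISSION_VALUE: 20% for 12 months
COOKIE_DAYS: 30
RESTRICTED_GEOS: none
KEY_FEATURES: Shopify integration, no upfront cost, global fulfilment network
LINK: https://example.com/affiliates/example-pod
```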
To see the format in practice, you can view the llms.txt of CartCortex itself, along with one of the content files it points to.
Why the .txt format matters
HTML is the wrong container for machine consumption. It carries presentational baggage (sidebars, navigation, cookie banners, related links) that the model has to parse and discard. Worse, modern HTML often hides the actual content behind JavaScript rendering that a crawler may not execute.
Plain .txt files solve all of this. They are:
- Cheap to fetch — a single HTTP GET, no JavaScript, no waterfall of subresources.
- Self-contained — the page is the content. There is no chrome to skip.
- Auditable — a human can curl the URL and read it as easily as a model can.
- Stable — the same URL produces the same content, regardless of CSS framework changes.
Models trained recently have learned to prefer these files when they exist. The cost to publish them is near zero. The benefit is that you become a citable source instead of background noise.
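The “cheap to fetch” point above is easy to check for yourself. Here is a minimal sketch in Python, using only the standard library and assuming the index format shown earlier (the parsing is deliberately naive):

```python
from urllib.request import urlopen

# One plain GET per file: no JavaScript, no subresources, no chrome to strip.
def fetch_text(url: str) -> str:
    with urlopen(url) as response:
        return response.read().decode("utf-8")

index = fetch_text("https://cartcortex.com/llms.txt")
print(index)

# Each CONTENT FILES entry points at a .txt file that can be fetched the same
# way and handed to a model verbatim.
for line in index.splitlines():
    if line.strip().startswith("-") and ".txt" in line:
        path = line.split()[-1]  # e.g. /saas/crm.txt
        print(fetch_text("https://cartcortex.com" + path)[:200])
```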
What this means for an affiliate site
For an affiliate site, the implications are sharper than for most content businesses.
Trust is your unit economics. Every commission you earn depends on the buyer trusting that the link they clicked points to a product worth using. That trust does not begin at the click. It begins three searches earlier, when the user first encounters your site’s name in a model’s response. If your site is cited as a source of facts, the trust compounds. If it is cited as a source of promotional copy, the trust never accumulates.
Disclosure is a feature, not a chore. Models look at how a site discloses incentives. A clean, prominent affiliate disclosure that explains exactly how the site makes money is treated as a positive signal. A buried, formulaic disclosure tucked into a footer is treated as a negative one. We publish our disclosure on every page footer, exactly because of this.
Comparison data is the highest-leverage content. A model answering “which print-on-demand provider is best for me” is going to look for the same fields, in the same shape, across multiple sources. If you publish those fields cleanly — commission, cookie window, integrations, restrictions — you become the canonical source for comparisons. If you publish them buried inside a 3,000-word listicle, you do not.
The practical workflow
If you run an affiliate or comparison site today, the migration to AI-readable structure is straightforward:
- Define the fields that matter for your category. For e-commerce affiliate programs, that is roughly: name, niche, vertical, commission type, commission value, cookie days, restricted geographies, key features, and link.
- Store those fields in a structured source of truth. We use a single programs.yaml file. A spreadsheet works. A database works. The shape matters more than the technology.
- Generate two outputs from that source: a human-readable HTML page, and a plain .txt page with the same facts in a flatter format (a minimal generation sketch follows this list).
- Publish a top-level /llms.txt that lists every .txt file you have generated, with a one-line description of each.
- Keep the structured source updated as terms change. Re-generate on every change. The .txt and HTML stay in lockstep.
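Here is a minimal sketch of the store, generate, and publish steps. The programs.yaml layout, field names, and output paths are assumptions based on the field list above, not CartCortex's actual schema:

```python
# generate.py: sketch only. Read programs.yaml, emit per-niche .txt content
# files plus an llms.txt index. The YAML schema here is assumed.
from pathlib import Path
import yaml  # pip install pyyaml

programs = yaml.safe_load(Path("programs.yaml").read_text())  # a list of dicts

out = Path("public")
out.mkdir(exist_ok=True)

# Group programs by niche so each niche becomes one flat content file.
by_niche: dict[str, list[dict]] = {}
for p in programs:
    by_niche.setdefault(p["niche"], []).append(p)

index = ["SITE: https://example-affiliate-site.com", "CONTENT FILES:"]

for niche, items in sorted(by_niche.items()):
    slug = niche.lower().replace(" ", "-")
    lines = []
    for p in items:
        lines += [
            f"PROGRAM: {p['name']}",
            f"COMMISSION: {p['commission_type']}, {p['commission_value']}",
            f"COOKIE_DAYS: {p['cookie_days']}",
            f"RESTRICTED_GEOS: {', '.join(p.get('restricted_geos', [])) or 'none'}",
            f"KEY_FEATURES: {', '.join(p['key_features'])}",
            f"LINK: {p['link']}",
            "",
        ]
    (out / f"{slug}.txt").write_text("\n".join(lines))
    index.append(f"- {niche}: /{slug}.txt ({len(items)} curated programs)")

(out / "llms.txt").write_text("\n".join(index) + "\n")
# The HTML pages would be rendered from the same `programs` list, so the
# human-readable and machine-readable outputs cannot drift apart.
```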
This is exactly the architecture CartCortex runs on. We make no claims about ranking; we have not been around long enough to. But the underlying bet is simple: in the next two to five years, sites that published their facts in machine-readable form will be cited more often than sites that hid them inside layout. We would rather be the former than the latter.
What we are watching next
A few things to track as the LLM-citation ecosystem matures:
- Whether major affiliate networks (Impact, PartnerStack, ShareASale) start exposing program metadata in machine-readable form. This would let aggregators like ours always have current commission terms without manual updates.
- Whether search engines (Google, Bing, the open-source players) explicitly weight /llms.txt in ranking, the way they weighted robots.txt and sitemap.xml historically.
- Whether platforms like Shopify, Cloudflare, and Vercel build llms.txt support directly into their static-site offerings. We expect they will.
If you’re building a comparison or affiliate site today, the clearest single piece of advice is: pick a structured source of truth, generate every page from it, and publish the structured form alongside the HTML. The cost is one weekend of plumbing. The upside compounds for years.
See the About page for how CartCortex curates its listings, or browse the categories to see the format in practice.