What is Programmatic SEO? The Complete Guide to Building 20,000 Pages at Scale

Ira Bodnar, Founder @ Ryze AI

March 2026 · ~12 min read

TL;DR

Build a template. Connect it to data. Generate thousands of pages — each targeting a different long-tail keyword.

Zapier did it with 70,000+ integration pages (16M monthly visitors). Flyhomes did it with cost-of-living guides and hit 10,737% traffic growth in 3 months. The barrier dropped with AI. This guide shows you how.

1. What is Programmatic SEO?

Programmatic SEO is building a system that generates large numbers of pages from templates and data — instead of writing each one by hand. You define the pattern once, connect it to a structured data source, and let the system produce pages at scale.

The core formula:

Head term + modifiers = scalable keyword matrix

Concrete patterns that work:

  • "Best CRM for [industry]" — real estate, nonprofits, startups, agencies, restaurants — 200+ pages from one pattern
  • "[Service] in [city]" — plumber in Austin vs plumber in Denver have completely different local intent
  • "[Tool A] vs [Tool B]" — Zapier alone ranks for 8,000+ comparison queries this way
  • "[Number] [content type] for [niche]" — "50 blog ideas for travel bloggers" × 100 niches = 100 pages

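Each page targets a long-tail keyword with low competition. At 500 searches/month per keyword, 2,000 pages add up to 1M potential monthly search impressions. That figure is a ceiling that assumes every page gets indexed and ranks, but it's the math behind the approach.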
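The "head term + modifiers" formula can be sketched in a few lines of Python. This is a minimal illustration rather than a production generator: the function name `build_keyword_matrix` and the sample modifier lists are assumptions made up for this example.

```python
from itertools import product


def build_keyword_matrix(pattern: str, modifiers: dict[str, list[str]]) -> list[str]:
    """Expand a pattern like '{service} in {city}' into one keyword per modifier combination."""
    names = list(modifiers)
    keywords = []
    for combo in product(*(modifiers[name] for name in names)):
        keywords.append(pattern.format(**dict(zip(names, combo))))
    return keywords


# Hypothetical sample data: two modifier slots, 2 x 3 = 6 keywords.
keywords = build_keyword_matrix(
    "{service} in {city}",
    {"service": ["plumber", "electrician"], "city": ["Austin", "Denver", "Boise"]},
)
print(len(keywords))  # 6
print(keywords[0])    # plumber in Austin
```

Each extra modifier slot multiplies the page count, which is why a single pattern can reach thousands of variations.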

2. How is it Different from Traditional SEO?

| | Traditional SEO | Programmatic SEO |
|---|---|---|
| Page creation | Manual, one at a time | Automated, hundreds or thousands at once |
| Target keywords | Head terms, medium competition | Long-tail, low individual competition |
| Time per page | Hours to days | Seconds to minutes |
| Effort allocation | Ongoing per-page effort | Front-loaded into system design |
| Typical scale | 50–500 pages | 5,000–500,000 pages |
| Risk profile | Gradual, predictable | Explosive growth or rapid deindex |
Traditional vs programmatic SEO across scale, keyword strategy, effort, and risk. HubSpot and Ahrefs built authority through editorial depth — Zapier, Airbnb, and Zillow scaled through data-driven page systems. Most high-traffic sites run both in parallel.

Neither replaces the other. Editorial content builds topical authority; programmatic content captures the long tail. Most successful sites run both — Zapier has a full editorial blog alongside its 70,000 integration pages.

3. Real Examples — With the Actual Numbers

A lot of the framework in this section is borrowed from @jakezward — specifically this thread on how he built 13,000+ pages in 3 hours using JSON schemas and Gemini Flash.

  • 70,000+: Zapier integration pages, 16.2M monthly visitors
  • 466%: Jake Ward traffic growth, 13,000 pages in 3 hours
  • 10,737%: Flyhomes traffic growth, 10K to 425K pages in 3 months
  • 1,969%: KrispCall YoY growth, with area code pages driving 82% of its US traffic
Zapier's 70,000+ programmatic pages — one URL per app-to-app integration. Each page targets a unique query like "Connect Slack to Trello" or "Sync Google Sheets to Salesforce."
Zapier: "Connect [App A] to [App B]"

Data source: Their own integration database, covering every supported connection

Result: 70,000+ pages, 16.2M monthly organic visitors, 1.3M+ keywords ranking

Why it worked: Each page has a real setup guide, use cases, and trigger/action documentation specific to that integration pair

Flyhomes: "Cost of living in [City]"

Data source: Real estate data, salary benchmarks, local cost indices

Result: 10K → 425K pages in 3 months. 1.1M monthly visits. 55.5% of all site traffic from these pages

Why it worked: Targeted relocation intent. People Googling cost of living are actively considering moving, which maps directly to Flyhomes' product

KrispCall: "[US area code] phone numbers"

Data source: Area code data, including location, cities covered, and carrier info

Result: Area code pages drive 82% of KrispCall's US traffic. 1,969% YoY growth

Why it worked: Businesses buying phone numbers search by area code, which means highly specific intent and thin competition

Jake Ward / Byword: "[Number] [content type] for [niche] bloggers"

Data source: 13,000+ niche combinations generated via Gemini Flash + JSON schemas

Result: 971 → 5,500 weekly clicks in 60 days (+466%). Built in under 3 hours of generation time

Why it worked: Each niche got truly different content because niche context (audience, pain points, monetization) was injected into every prompt

@jakezward's breakdown of how 13,000+ pages were generated in under 3 hours using strict JSON schemas — traffic went from 971 to 5,500 weekly clicks in 60 days. Much of the framework in this guide is borrowed from this thread.

4. The Step-by-Step Process

01. Find your keyword pattern

Use Google Autocomplete to surface modifier variations for your head term. Validate with Ahrefs or Semrush — look for KD under 30 and at least 50 monthly searches per variation. At 1,000 variations × 50 searches, you have 50K/month in reach before writing a single word.
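The validation thresholds (KD under 30, at least 50 monthly searches) translate directly into a filter. The sketch below assumes keyword data has already been exported into rows with hypothetical `keyword`, `kd`, and `volume` fields; Ahrefs and Semrush exports use their own column names, so treat these as placeholders.

```python
def filter_keywords(rows, max_kd=30, min_volume=50):
    """Keep variations with keyword difficulty under max_kd and at least min_volume monthly searches."""
    return [r for r in rows if r["kd"] < max_kd and r["volume"] >= min_volume]


def total_reach(rows):
    """Sum of monthly search volume across the kept variations."""
    return sum(r["volume"] for r in rows)


# Hypothetical export rows, invented for illustration.
rows = [
    {"keyword": "best crm for nonprofits", "kd": 12, "volume": 90},
    {"keyword": "best crm for startups", "kd": 41, "volume": 700},  # too competitive
    {"keyword": "best crm for florists", "kd": 5, "volume": 20},    # too little demand
    {"keyword": "best crm for agencies", "kd": 28, "volume": 150},
]
kept = filter_keywords(rows)
print([r["keyword"] for r in kept])
print(total_reach(kept))  # 240
```

Running the filter before building anything tells you whether the pattern has enough viable variations to justify a template.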

02. Build your data source

This is what makes each page genuinely different. Without it, you're just swapping keywords. Sources that work:

  • Own database: Zillow property listings, Airbnb destinations. The data IS the product
  • Public datasets: Census data, area code data (KrispCall's approach), government databases, industry benchmarks
  • APIs: Real-time pricing, review scores, availability. Data that changes and stays fresh
  • AI-structured: Jake Ward's approach of generating unique content per niche using JSON schemas (Section 5)

03. Design your template

Manually write 3-5 example pages first. These set the quality bar. If you can't write a good page manually for a specific variation, the automated version won't be good either. Your template needs 500+ words of unique content, structured headings with the keyword, internal links, schema markup, and a clear CTA.

04. Generate and validate

Every page must pass automated schema validation before publishing. Check: minimum word count, all required fields populated, no hallucinated data, no duplicate content across pages. Run 10% of each batch through manual review.
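Most of these checks are easy to automate. Below is a minimal sketch, assuming each page is a dict with hypothetical `slug`, `title`, `intro`, and `body` fields. It covers required fields, minimum word count, exact duplicates, and the 10% manual review sample. Detecting hallucinated data depends on your source of truth, so it is not shown here.

```python
import hashlib
import random

REQUIRED_FIELDS = ("title", "intro", "body")  # hypothetical field names
MIN_WORDS = 500


def validate_page(page: dict) -> list[str]:
    """Return a list of problems; an empty list means the page passed."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if not str(page.get(f, "")).strip()]
    word_count = len(" ".join(str(page.get(f, "")) for f in REQUIRED_FIELDS).split())
    if word_count < MIN_WORDS:
        problems.append(f"too short: {word_count} words")
    return problems


def find_exact_duplicates(pages: list[dict]) -> list[tuple[str, str]]:
    """Pair up slugs whose body text is identical after whitespace and case normalisation."""
    seen: dict[str, str] = {}
    duplicates = []
    for page in pages:
        normalised = " ".join(page.get("body", "").lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates.append((seen[digest], page["slug"]))
        else:
            seen[digest] = page["slug"]
    return duplicates


def manual_review_sample(pages: list[dict], fraction: float = 0.10, seed: int = 0) -> list[dict]:
    """Pick roughly `fraction` of the batch (at least one page) for human review."""
    if not pages:
        return []
    size = max(1, round(len(pages) * fraction))
    return random.Random(seed).sample(pages, size)
```

Hashing only catches identical bodies. Near-duplicates need a similarity measure on top of this.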

05. Publish in batches, never all at once

50-100 pages in week 1. Monitor Google Search Console indexing rate for 2 weeks. If pages are indexing and showing impressions, publish 500 more. Scale from there. Launching 10,000 pages overnight is a red flag that can draw closer scrutiny and slow down indexing.
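The batching rule can be written down as a small decision function. This is a sketch under stated assumptions: the batch sizes come from the step above, while the 50% indexing-rate threshold (`min_index_rate`) is an illustrative guess, not a figure from Google or from any of the case studies.

```python
def next_batch_size(published: int, indexed: int, with_impressions: int,
                    min_index_rate: float = 0.5) -> int:
    """Decide how many pages to publish next, after a two-week observation window."""
    if published == 0:
        return 100  # first batch: 50-100 pages
    index_rate = indexed / published
    if index_rate < min_index_rate or with_impressions == 0:
        return 0    # hold: fix quality or internal linking before adding more
    return 500      # healthy signals: publish the next 500


print(next_batch_size(published=0, indexed=0, with_impressions=0))       # 100
print(next_batch_size(published=100, indexed=20, with_impressions=3))    # 0
print(next_batch_size(published=100, indexed=80, with_impressions=40))   # 500
```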

5. The pSEO 2.0 Framework: JSON Schemas + AI

Jake Ward generated 13,000+ pages in under 3 hours using Gemini Flash and strict JSON schemas. The key principle: never ask AI to write freeform content — ask it to fill a schema.

Freeform output is inconsistent at scale. One page might have 8 checklist items, the next might have 40. With a schema, every page is structurally identical — different substance, same shape. Validation can be automated.

Example schema

json
{
  "meta": { "content_type": "checklist", "niche": "travel" },
  "seo": {
    "title": "SEO Checklist for Travel Bloggers (2026)",
    "keywords": ["travel blog SEO", "travel blogger checklist"]
  },
  "content": {
    "intro": "string (2-3 sentences)",
    "sections": [{
      "heading": "string",
      "items": [{
        "title": "string",
        "description": "string (1-2 sentences)",
        "difficulty": "beginner | intermediate | advanced",
        "priority": "high | medium | standard"
      }]
    }],
    "pro_tips": ["string (exactly 5 tips)"]
  }
}
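Because every page shares this shape, a generated page can be checked mechanically before it is published. The following is a dependency-free sketch of such a check for the checklist schema above. The function name `validate_checklist` is an assumption for this example, and a dedicated JSON Schema validator would be a reasonable alternative.

```python
DIFFICULTIES = {"beginner", "intermediate", "advanced"}
PRIORITIES = {"high", "medium", "standard"}


def validate_checklist(page: dict) -> list[str]:
    """Return structural errors for one generated checklist page (empty list = valid)."""
    errors = []
    content = page.get("content", {})
    if not isinstance(content.get("intro"), str) or not content["intro"].strip():
        errors.append("content.intro must be a non-empty string")
    sections = content.get("sections")
    if not isinstance(sections, list) or not sections:
        errors.append("content.sections must be a non-empty list")
        sections = []
    for s_idx, section in enumerate(sections):
        if not section.get("heading"):
            errors.append(f"sections[{s_idx}].heading is missing")
        for i_idx, item in enumerate(section.get("items", [])):
            where = f"sections[{s_idx}].items[{i_idx}]"
            if not item.get("title") or not item.get("description"):
                errors.append(f"{where} needs title and description")
            if item.get("difficulty") not in DIFFICULTIES:
                errors.append(f"{where}.difficulty is invalid")
            if item.get("priority") not in PRIORITIES:
                errors.append(f"{where}.priority is invalid")
    if len(content.get("pro_tips", [])) != 5:
        errors.append("content.pro_tips must contain exactly 5 tips")
    return errors
```

A page that returns any errors can be regenerated automatically instead of being patched by hand.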

The niche context layer — where 60% of the work goes

A travel blogger's SEO checklist should cover seasonal traffic swings, Google hotel pack competition, and affiliate link disclosure. A health blogger's checklist should cover E-E-A-T requirements, YMYL compliance, and medical review processes. Same schema. Completely different content. This only happens if you inject real niche context into the prompt.

json
{
  "slug": "travel",
  "context": {
    "audience": "Travel bloggers, digital nomads, family vacation planners",
    "pain_points": [
      "Seasonal traffic swings — summers spike, winters crash",
      "Google hotel pack dominates destination searches above organic",
      "High-DA competitors (TripAdvisor, Lonely Planet) own head terms"
    ],
    "monetization": "Affiliate (Booking.com, Viator, gear), display ads, sponsored trips",
    "content_that_works": "Specific itineraries with costs, hidden gem guides, comparison posts"
  }
}

This context is injected into every prompt for the travel niche. The output isn't "generic SEO advice with travel swapped in" — it's advice that only makes sense for travel bloggers.
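One way to picture the injection step is a function that merges the niche context and the target schema into a single prompt. This is a sketch only: the prompt wording and the `build_prompt` name are assumptions for illustration, not the prompt used in the thread.

```python
import json


def build_prompt(niche: dict, content_type: str, schema: dict) -> str:
    """Combine niche context and the target schema into one generation prompt."""
    ctx = niche["context"]
    pain_points = "\n".join(f"- {p}" for p in ctx["pain_points"])
    return (
        f"Write a {content_type} for the '{niche['slug']}' niche.\n"
        f"Audience: {ctx['audience']}\n"
        f"Pain points:\n{pain_points}\n"
        f"Monetization: {ctx['monetization']}\n"
        f"Content that works: {ctx['content_that_works']}\n\n"
        "Return only JSON matching this schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )


travel = {
    "slug": "travel",
    "context": {
        "audience": "Travel bloggers, digital nomads, family vacation planners",
        "pain_points": ["Seasonal traffic swings", "Google hotel pack dominates destination searches"],
        "monetization": "Affiliate, display ads, sponsored trips",
        "content_that_works": "Specific itineraries with costs",
    },
}
prompt = build_prompt(travel, "checklist", {"content": {"intro": "string"}})
print(prompt)
```

Keeping the context in its own file per niche means the same function serves every niche and content type combination.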

Content and presentation are completely separate

Content = JSON files: Generated by AI, validated against schema, versioned. Can be regenerated without touching the frontend.

Presentation = React components: Purpose-built renderers per content type. Checklist pages get checkboxes, comparison pages get filterable tables, idea lists get category filters.

Jake Ward's team built 20+ specialized React components for different content types. This is what separates pages that look like real tools from pages that look like keyword filler.

6. Tools and Tech Stacks

You can also build programmatic SEO pipelines entirely in n8n, connecting your data source, AI generation step, and CMS publishing step in a single no-code workflow. n8n can be self-hosted for free, making it a strong alternative to Zapier or Make for teams that want full control over the automation.

No-code (~$150-300/month)

Best for teams without developers. Up to ~5,000 pages.

| Tool | Role | Cost |
|---|---|---|
| Airtable | Database: stores all structured data per page | $20+/mo |
| Webflow CMS | Template and page rendering | $29+/mo |
| Whalesync | Syncs Airtable → Webflow automatically | $49+/mo |
| Make.com | Automation: triggers AI generation when new rows are added | $20+/mo |

WordPress (~$200 one-time)

Best for teams already on WordPress.

| Tool | Role | Cost |
|---|---|---|
| Google Sheets | Database: one row per page | Free |
| WP All Import Pro | Bulk-creates WordPress posts from spreadsheet | ~$200 one-time |
| ACF (Advanced Custom Fields) | Maps spreadsheet columns to page fields | Free/paid |

Developer stack (10,000+ pages)

| Tool | Why |
|---|---|
| Next.js + ISR | Incremental static regeneration: rebuild individual pages without full rebuilds |
| Gemini Flash | Native JSON output, fast, cheapest cost-per-page for bulk generation |
| PostgreSQL | Handles millions of rows; better than Airtable at true scale |
| Python generation scripts | Iterate through niche × content type matrix, call AI, validate, save JSON |
| IndexNow | Notify Bing/Yandex instantly when pages publish, which speeds up indexing |
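For the IndexNow step, a submission is a JSON POST listing the URLs that changed. The sketch below only builds the requests and does not send them. The host and key are placeholders, the key file is assumed to sit at the site root as `<key>.txt`, and the 10,000-URL chunk size reflects my reading of the protocol documentation, so verify it before relying on it.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"
MAX_URLS_PER_REQUEST = 10_000  # assumption: check the current protocol limit


def build_indexnow_requests(host: str, key: str, urls: list[str]) -> list[Request]:
    """Build (but do not send) one POST request per chunk of URLs."""
    requests = []
    for start in range(0, len(urls), MAX_URLS_PER_REQUEST):
        payload = {
            "host": host,
            "key": key,
            "keyLocation": f"https://{host}/{key}.txt",
            "urlList": urls[start:start + MAX_URLS_PER_REQUEST],
        }
        requests.append(Request(
            INDEXNOW_ENDPOINT,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json; charset=utf-8"},
            method="POST",
        ))
    return requests
```

Each request can then be sent with `urllib.request.urlopen(request)` once a real key file is in place on the site.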

7. Common Mistakes

~60% of programmatic SEO projects fail. Here's why.

01. Thin content: only the keyword changes

A travel site created 50,000 "hotels in [city]" pages where only the city name differed. Google deindexed 98% within 3 months. If you strip the keyword from the page and it reads identically to every other page in the set, you don't have a programmatic SEO strategy; you have a spam site.

Fix: 30%+ content differentiation per page. Minimum 500 words of genuinely unique content per variation.
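"Differentiation" needs a definition before it can be enforced. One possible definition, used here purely as an assumption, is 1 minus the Jaccard similarity of three-word shingles between two pages. The sketch below shows how a keyword-swapped pair scores far below the 30% bar.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word n-grams."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(1, len(words) - size + 1))}


def differentiation(page_a: str, page_b: str) -> float:
    """1.0 means no shared shingles, 0.0 means identical text."""
    a, b = shingles(page_a), shingles(page_b)
    return 1.0 - len(a & b) / len(a | b)


def is_differentiated(page_a: str, page_b: str, threshold: float = 0.30) -> bool:
    """True when the pair clears the differentiation threshold."""
    return differentiation(page_a, page_b) >= threshold


# Keyword-swap example: 100 shared words, only the city differs.
base = " ".join(f"w{i}" for i in range(100))
print(is_differentiated(base + " austin", base + " denver"))  # False
```

Comparing every pair is quadratic, so at tens of thousands of pages you would sample pairs within each template rather than compare all of them.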

02. Publishing 10,000 pages overnight

A sudden spike from 200 to 10,000 pages is an obvious signal. It can draw manual review, many of the pages may never get indexed, and the few that do may get lower crawl priority.

Fix: 50-100 pages → wait 2 weeks → 500 pages → wait → scale. Always batch.

03. No internal linking

Orphaned pages (no links in, no links out) are invisible to search engines. If Googlebot can't discover a page by crawling, it won't index it no matter how good the content is.

Fix: Hub-and-spoke architecture. One category hub page links to all child pages. Each child page links back to the hub and to 3-5 related variations. XML sitemap segmented by content type.
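The hub-and-spoke rule is simple enough to generate rather than maintain by hand. In this sketch "related" pages are just the next few siblings in list order, which is a placeholder assumption. A real build would pick related pages by topic or data similarity.

```python
def build_link_graph(hub: str, children: list[str], related_count: int = 3) -> dict[str, list[str]]:
    """Map each URL to its outgoing internal links: hub -> all children, child -> hub + neighbours."""
    links = {hub: list(children)}
    n = len(children)
    for i, child in enumerate(children):
        # Wrap around the list so the last pages still get related links.
        related = [children[(i + step) % n] for step in range(1, min(related_count, n - 1) + 1)]
        links[child] = [hub] + related
    return links


graph = build_link_graph("/crm/", ["/crm/a", "/crm/b", "/crm/c", "/crm/d", "/crm/e"])
print(graph["/crm/a"])  # ['/crm/', '/crm/b', '/crm/c', '/crm/d']
```

Because the hub links to every child, no page in the set is orphaned.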

04. No feedback loop

Most teams launch, celebrate the page count, and then ignore which pages actually work. Three months later they have 5,000 pages and no idea why only 200 of them rank.

Fix: Weekly GSC check: indexing rate, impressions, CTR by page segment. Double down on the niches and content types that move. Prune pages that never indexed.
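The weekly check is easier to keep up when the rollup is scripted. The sketch below assumes a Search Console performance export already loaded into rows with hypothetical `page`, `impressions`, and `clicks` fields, and treats the first URL path segment as the page segment.

```python
from collections import defaultdict
from urllib.parse import urlparse


def segment_of(url: str) -> str:
    """Use the first path segment as the page segment, e.g. /checklists/travel -> 'checklists'."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[0] if parts else "(root)"


def summarise(rows: list[dict]) -> dict[str, dict]:
    """Aggregate page count, impressions, clicks, and CTR per segment."""
    totals = defaultdict(lambda: {"pages": 0, "impressions": 0, "clicks": 0})
    for row in rows:
        seg = totals[segment_of(row["page"])]
        seg["pages"] += 1
        seg["impressions"] += row["impressions"]
        seg["clicks"] += row["clicks"]
    for seg in totals.values():
        seg["ctr"] = seg["clicks"] / seg["impressions"] if seg["impressions"] else 0.0
    return dict(totals)


# Hypothetical rows, invented for illustration.
rows = [
    {"page": "https://example.com/checklists/travel", "impressions": 1000, "clicks": 50},
    {"page": "https://example.com/checklists/health", "impressions": 0, "clicks": 0},
    {"page": "https://example.com/ideas/travel", "impressions": 200, "clicks": 2},
]
for name, stats in summarise(rows).items():
    print(name, stats)
```

Segments with many pages but almost no impressions are the first candidates for pruning or a template rewrite.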

8. What Google Actually Thinks

Google's spam policies on scaled content abuse target pages generated for the primary purpose of manipulating search rankings rather than helping users. Automation isn't the issue. Lack of value is.

Penalized

  • Same content with keyword swapped (doorway pages)
  • No user value: exists only to rank
  • Data that's fabricated, stale, or meaningless

Fine

  • Large-scale pages where each page serves distinct intent
  • AI content structured and validated for quality
  • Templated pages with real, unique data per variation

The test

“Would this page be useful if search engines didn't exist? Would someone bookmark it and come back?”

Zapier's integration pages: yes — they're genuinely useful setup guides. Flyhomes' cost-of-living guides: yes — real data people need for relocation decisions. That's the bar.

FAQ

Does programmatic SEO still work in 2026?

Yes — when each page has unique data, strong niche context, and passes quality controls. The bar is higher than 2022. Keyword swapping doesn't rank. But structured, genuinely useful pages with real data continue to index and drive traffic. The case studies above are all from 2025-2026.

Will Google penalize AI-generated pages?

Not for being AI-generated. Google penalizes thin, duplicative content regardless of how it was created. Jake Ward's 13,000 AI-generated pages grew 466% in 60 days. The difference is schema-driven structure and genuine niche context — not the origin of the content.

How many pages should I start with?

50-100. Monitor Google Search Console for 2 weeks — watch indexing rate and impressions. If pages are getting indexed and showing search impressions, publish another 500. Never launch everything at once.

What's the minimum budget?

Airtable + Webflow + Whalesync: ~$150-300/month. WordPress + WP All Import: ~$200 one-time. AI generation (Gemini Flash): fractions of a cent per page — 10,000 pages costs ~$5-15 in API fees.

How long until results?

3-6 months is typical for pages to rank on a new or low-authority domain. Jake Ward's 466% growth happened in 60 days — but he had an existing domain with authority. Flyhomes' 10,737% growth took 3 months on an established real estate site.

What's the difference between pSEO and content spinning?

Content spinning swaps synonyms in the same text. Programmatic SEO generates genuinely different pages because the underlying data is different. KrispCall's area code pages aren't spun — every page has different location data, coverage info, and carrier details for that specific area code.

Final Thoughts

The companies that fail at programmatic SEO all make the same mistake: they treat it as a content shortcut. They focus on page count, skip the taxonomy work, and launch thousands of thin pages at once.

The ones that succeed — Zapier, Flyhomes, KrispCall — all built systems where the data makes each page genuinely different. Their page count is a side effect of having useful data, not the goal itself.

The page count is never the point. The system that gets better with every batch — that's the point.
