# crawldiff

**git log for any website.**

Track what changed on any website. Git-style diffs with optional AI summaries. Powered by Cloudflare's `/crawl` endpoint.
```bash
pip install crawldiff

# Snapshot a site
crawldiff crawl https://stripe.com/pricing

# Come back later. See what changed.
crawldiff diff https://stripe.com/pricing --since 7d
```

A CLI tool for tracking website changes over time. It crawls pages via Cloudflare's `/crawl` endpoint, stores Markdown snapshots locally in SQLite, and produces unified diffs between crawls. Optionally summarizes changes with AI.
No SaaS subscriptions. No proprietary dashboards. Just `crawldiff diff`.

You need a free Cloudflare account. That's it.
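Conceptually, the core loop is tiny: store a Markdown snapshot per crawl, then run a unified diff between the two most recent versions. A sketch of that idea in plain Python (the schema and function names are illustrative, not crawldiff's actual internals):

```python
import difflib
import sqlite3

# Illustrative schema; crawldiff's real storage layout may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE snapshots (url TEXT, taken_at TEXT, markdown TEXT)")

def save_snapshot(url: str, taken_at: str, markdown: str) -> None:
    db.execute("INSERT INTO snapshots VALUES (?, ?, ?)", (url, taken_at, markdown))

def diff_snapshots(url: str) -> str:
    """Unified diff between the two most recent snapshots of a URL."""
    rows = db.execute(
        "SELECT taken_at, markdown FROM snapshots WHERE url = ? "
        "ORDER BY taken_at DESC LIMIT 2", (url,)
    ).fetchall()
    (new_at, new), (old_at, old) = rows
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True), new.splitlines(keepends=True),
        fromfile=f"{url} @ {old_at}", tofile=f"{url} @ {new_at}",
    ))

save_snapshot("https://example.com/pricing", "2026-03-06", "Pro plan: $10/mo\n")
save_snapshot("https://example.com/pricing", "2026-03-13", "Pro plan: $12/mo\n")
print(diff_snapshots("https://example.com/pricing"))
```

Everything else (crawling, incremental fetching, AI summaries) is layered on top of this store-and-diff core.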
# Install

```bash
pip install crawldiff

# Set your Cloudflare credentials (free tier: 5 jobs/day, 100 pages/job)
export CLOUDFLARE_ACCOUNT_ID="your-account-id"
export CLOUDFLARE_API_TOKEN="your-api-token"

# Or save to config (env vars take precedence over the config file)
crawldiff config set cloudflare.account_id your-id
crawldiff config set cloudflare.api_token your-token
```

```bash
# Take a snapshot
crawldiff crawl https://competitor.com

# Later, see what changed
crawldiff diff https://competitor.com --since 7d

# Output as JSON (pipe to jq, Slack, wherever)
crawldiff diff https://competitor.com --since 7d --format json

# Save a markdown report
crawldiff diff https://competitor.com --since 30d --output report.md
```

```bash
# Check every hour, get notified when something changes
crawldiff watch https://stripe.com/pricing --every 1h

# Check every 6 hours, skip AI summary
crawldiff watch https://competitor.com --every 6h --no-summary
```

```
$ crawldiff history https://stripe.com/pricing

Crawl History · https://stripe.com/pricing
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Job ID         ┃ Date                ┃ Pages ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ cf-job-abc-123 │ 2026-03-13 09:00:00 │    12 │
│ cf-job-def-456 │ 2026-03-06 09:00:00 │    11 │
│ cf-job-ghi-789 │ 2026-02-27 09:00:00 │    11 │
└────────────────┴─────────────────────┴───────┘
```
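`crawldiff watch` is essentially a polling loop around crawl-and-diff. A minimal sketch of that loop, with stand-in `fetch` and `notify` callables (crawldiff's real scheduling and notification logic may differ):

```python
import time

def watch(fetch, notify, every_s: float, max_checks: int) -> int:
    """Poll a page, compare against the previous snapshot, notify on change.

    `fetch` returns the page's current content; `notify` receives (old, new).
    Both are stand-ins for crawldiff's crawl and notification steps.
    """
    changes = 0
    previous = fetch()
    for _ in range(max_checks):
        time.sleep(every_s)
        current = fetch()
        if current != previous:
            notify(previous, current)
            changes += 1
            previous = current
    return changes

# Simulate a page that changes on the second poll.
versions = iter(["$10/mo", "$10/mo", "$12/mo"])
seen = []
n = watch(fetch=lambda: next(versions),
          notify=lambda old, new: seen.append((old, new)),
          every_s=0.01, max_checks=2)
print(n, seen)
```

In the real CLI, `--every 1h` would set the sleep interval and the loop would run until interrupted rather than for a fixed number of checks.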
```bash
# Deeper crawl
crawldiff crawl https://docs.react.dev --depth 3 --max-pages 100

# Static sites (faster, no browser rendering)
crawldiff crawl https://blog.example.com --no-render

# Ignore whitespace noise in diffs
crawldiff diff https://example.com --since 7d --ignore-whitespace
```

crawldiff can optionally summarize diffs using an LLM. Three providers are supported:
```bash
# Cloudflare Workers AI (free, uses your existing CF account)
crawldiff config set ai.provider cloudflare

# Anthropic Claude
pip install crawldiff[ai]
crawldiff config set ai.provider anthropic
export ANTHROPIC_API_KEY="sk-..."

# OpenAI
pip install crawldiff[ai]
crawldiff config set ai.provider openai
export OPENAI_API_KEY="sk-..."
```

Don't want AI? Just use `--no-summary`. Diffs work fine without it.
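Whichever provider you pick, the summarization step amounts to sending the unified diff to a model with a short instruction. A hypothetical sketch of the prompt-building side (the function name, wording, and truncation limit are illustrative, not crawldiff's actual code):

```python
def build_summary_prompt(url: str, unified_diff: str, max_chars: int = 8000) -> str:
    """Build an LLM prompt asking for a change summary of a diff.

    Illustrative only: crawldiff's real prompt and truncation logic may differ.
    """
    if len(unified_diff) > max_chars:
        # Truncate very large diffs so the prompt fits a model's context window.
        unified_diff = unified_diff[:max_chars] + "\n... (diff truncated)"
    return (
        f"Summarize the meaningful changes to {url} in 2-3 bullet points.\n"
        "Ignore whitespace-only and boilerplate changes.\n\n"
        f"```diff\n{unified_diff}\n```"
    )

prompt = build_summary_prompt(
    "https://example.com/pricing",
    "-Pro plan: $10/mo\n+Pro plan: $12/mo\n",
)
print(prompt)
```

The resulting string is what would be posted to Workers AI, Anthropic, or OpenAI, depending on the configured provider.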
```
1. crawldiff crawl <url>
   ├── Cloudflare /crawl API (headless browser, respects robots.txt)
   └── Store Markdown snapshots in local SQLite (~/.crawldiff/)

2. crawldiff diff <url> --since 7d
   ├── Cloudflare /crawl with modifiedSince (only fetches changed pages)
   ├── Diff against stored snapshot (unified diff via difflib)
   ├── AI summary (optional)
   └── Syntax-highlighted diffs in the terminal (via rich)
```

Cloudflare's `modifiedSince` parameter means repeat diffs only fetch changed pages, not the entire site.
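The client-side effect of `modifiedSince` can be sketched as a simple filter: remember when you last crawled a site, and only process pages reported as modified after that. The page records and field names below are hypothetical, not the real `/crawl` response schema:

```python
from datetime import datetime, timezone

# Hypothetical: timestamp of the last stored crawl per site.
last_crawl = {"https://example.com": datetime(2026, 3, 6, tzinfo=timezone.utc)}

# Hypothetical crawl results with per-page modification times.
pages = [
    {"url": "https://example.com/pricing",
     "modified": datetime(2026, 3, 10, tzinfo=timezone.utc)},
    {"url": "https://example.com/about",
     "modified": datetime(2026, 2, 1, tzinfo=timezone.utc)},
]

def changed_since(pages, since):
    """Keep only pages modified after the last snapshot."""
    return [p["url"] for p in pages if p["modified"] > since]

print(changed_since(pages, last_crawl["https://example.com"]))
```

Only `/pricing` survives the filter here; with `modifiedSince` the server does this filtering for you, so unchanged pages never cross the wire at all.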
| | crawldiff | Visualping | changedetection.io | Firecrawl |
|---|---|---|---|---|
| Open source | Yes | No | Yes | Yes |
| CLI-native | Yes | No | API | API |
| AI summaries | Built-in | No | Via plugins | Extraction |
| Incremental crawling | Yes (`modifiedSince`) | No | No | No |
| Local-first storage | SQLite | Cloud | Self-host or cloud | Cloud |
| JSON/pipe output | Yes | No | Yes | Yes |
| Free tier | 5 jobs/day, 100 pages | Limited | Yes (self-host) | 500 credits |
```
crawldiff crawl <url>           Snapshot a website
crawldiff diff <url>            Show what changed (the main command)
crawldiff watch <url>           Monitor continuously
crawldiff history <url>         View past snapshots
crawldiff config set|get|show   Manage settings
```
Pull requests and bug reports are welcome. See CONTRIBUTING.md to get started.
MIT