OptiviewAuditBot
We are a responsible web crawler that runs site audits you initiate in Optiview. We respect robots.txt directives and follow best practices for web crawling.
Our Identity
User-Agent: OptiviewAuditBot/1.0 (+https://optiview.ai/bot; [email protected])
Identifying Header: X-Optiview-Bot: audit
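If you want to flag our requests in your own logs or middleware, either signal works. Here is a minimal sketch in TypeScript, assuming the standard Fetch API Request type; the function name is illustrative, not part of our product:

// Minimal sketch: recognizing OptiviewAuditBot requests by User-Agent or the X-Optiview-Bot header.
function isOptiviewAuditBot(request: Request): boolean {
  const userAgent = request.headers.get("user-agent") ?? "";
  const botHeader = request.headers.get("x-optiview-bot") ?? "";
  return userAgent.includes("OptiviewAuditBot") || botHeader === "audit";
}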
What We Respect
- robots.txt - We check for User-agent: OptiviewAuditBot rules, falling back to the wildcard (*) group
- Allow/Disallow - We respect path-based allow/disallow rules (longest match wins; see the sketch after this list)
- Crawl-delay - We honor crawl-delay directives (default: 1.5s between requests to the same domain)
- Retry-After header - We parse and respect the Retry-After header for rate-limited responses (429)
- Exponential backoff - We implement exponential backoff for transient errors (429, 521) with up to 3 retries
- Meta tags - We respect noindex, nofollow, and noai meta tags
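To illustrate the longest-match rule: the most specific matching path prefix decides whether a URL may be crawled, and an Allow rule wins a length tie. The TypeScript sketch below is illustrative only (simplified prefix matching, no wildcard or $ support) and is not our actual implementation:

// Illustrative longest-match resolution over parsed Allow/Disallow rules.
// Assumes empty-value rules (which impose no restriction) were dropped during parsing.
type RobotsRule = { type: "allow" | "disallow"; path: string };

function isPathAllowed(rules: RobotsRule[], path: string): boolean {
  let best: RobotsRule | null = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.path)) continue; // prefix match only; wildcards omitted for brevity
    const longer = best === null || rule.path.length > best.path.length;
    const allowWinsTie =
      best !== null && rule.path.length === best.path.length && rule.type === "allow";
    if (longer || allowWinsTie) best = rule;
  }
  // No matching rule, or the most specific match is an Allow: the path may be crawled.
  return best === null || best.type === "allow";
}

// Example: Disallow: /private/ blocks the tree, but a longer Allow re-opens part of it.
isPathAllowed(
  [
    { type: "disallow", path: "/private/" },
    { type: "allow", path: "/private/docs/" },
  ],
  "/private/docs/page.html",
); // -> true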
How to Allow/Block Us
Allow the bot:
User-agent: OptiviewAuditBot
Allow: /
Block the bot:
User-agent: OptiviewAuditBot
Disallow: /
Block specific paths:
User-agent: OptiviewAuditBot
Disallow: /admin/
Disallow: /private/
Set crawl delay:
User-agent: OptiviewAuditBot
Crawl-delay: 5
About Our Crawling
We only crawl sites when you explicitly request an audit through Optiview. Our crawler is designed to be polite, efficient, and respectful of your server resources:
- Targeted crawling - We fetch pages to analyze AEO/GEO optimization (typically 40-50 pages per audit)
- Structured data extraction - We extract JSON-LD, meta tags, headings, and content signals for analysis
- Dual-mode rendering - We fetch both static HTML and JavaScript-rendered content to detect visibility gaps
- Strict robots.txt compliance - We respect all robots.txt directives with longest-match path rules
- Configurable delays - Default 1.5s delay between requests, configurable via the Crawl-delay directive
- Rate limit handling - We automatically back off when receiving 429 responses and respect Retry-After headers (see the sketch after this list)
- Retry logic - Up to 3 retries with exponential backoff for transient errors (429, 521)
- Clear identification - We identify ourselves with User-Agent and custom header on every request
- KV caching - We cache robots.txt rules for 24 hours to reduce redundant requests
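The rate-limit and retry behavior described above can be summarized in code. This is a simplified TypeScript sketch, not our production crawler; the constants mirror the defaults listed here, and it only handles Retry-After values given in seconds:

// Illustrative fetch with up to 3 retries, exponential backoff, and Retry-After support.
async function fetchWithBackoff(url: string, maxRetries = 3): Promise<Response> {
  let backoffMs = 1500; // starting delay, doubled after each retry

  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, {
      headers: { "User-Agent": "OptiviewAuditBot/1.0 (+https://optiview.ai/bot; [email protected])" },
    });

    const transient = response.status === 429 || response.status === 521;
    if (!transient || attempt >= maxRetries) return response;

    // Prefer the server's Retry-After value (in seconds) over our own backoff schedule.
    const retryAfterSeconds = Number(response.headers.get("retry-after"));
    const waitMs =
      Number.isFinite(retryAfterSeconds) && retryAfterSeconds > 0
        ? retryAfterSeconds * 1000
        : backoffMs;

    await new Promise((resolve) => setTimeout(resolve, waitMs));
    backoffMs *= 2;
  }
}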
Crawling Limits & Scope
To minimize server impact while providing comprehensive analysis:
- Page limit - Maximum 50 pages per audit (prioritizes homepage and top-level navigation)
- Sitemap-first - We prioritize sitemap-listed URLs, with a 3s timeout for sitemap discovery
- Depth limits - We typically crawl homepage + 1-2 levels deep
- Language filtering - We focus on English (US) content, filtering non-English paths by default
- Per-request timeout - 25 seconds per batch request to prevent hung connections (see the sketch after this list)
- Total audit timeout - 2 minutes maximum runtime with automatic completion
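As a rough illustration of how these timeouts keep connections from hanging, a fetch can be bounded with an AbortController. The TypeScript sketch below is illustrative and the URLs are placeholders, not real audit targets:

// Illustrative timeout-bounded fetch: aborts the request once the deadline passes.
async function fetchWithTimeout(url: string, timeoutMs: number): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

// Usage: a 3s cap for sitemap discovery, a 25s cap for a page batch request.
// const sitemap = await fetchWithTimeout("https://example.com/sitemap.xml", 3_000);
// const pages = await fetchWithTimeout("https://example.com/", 25_000);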
Contact
Questions or concerns? Contact us at [email protected]
Machine-Readable Info
For automated systems: /.well-known/optiview-bot.json
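For example, an automated client could retrieve it roughly like this (TypeScript sketch; we assume the file is served from optiview.ai, and the response schema is whatever the live file publishes):

// Illustrative fetch of the machine-readable bot descriptor.
const response = await fetch("https://optiview.ai/.well-known/optiview-bot.json");
const botInfo: unknown = await response.json(); // fields depend on the published file
console.log(botInfo);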