A robots.txt generator helps you build the Robots Exclusion Protocol file that controls which pages search engine crawlers can access. Use it to block AI training bots, protect private paths, set crawl delays, and declare your sitemap location — all without writing the format by hand.
Preset Templates
User-Agent Rules
Global Directives
Specify canonical hostname (used by some crawlers)
robots.txt Preview (real-time)
Common AI Bot User-Agents
How to Use the Robots.txt Generator
The robots.txt generator lets you visually build a properly formatted robots exclusion file without memorizing syntax. Add user-agent rules, configure paths, and see the output in real time.
Step 1: Choose a Preset or Start Fresh
Start with a preset template to save time. Allow All creates an open wildcard rule. Block All prevents all crawlers from indexing any path. Block AI Bots adds specific Disallow: / rules for GPTBot, Claude-Web, CCBot, Google-Extended, and other AI training crawlers. Standard blocks common private directories. WordPress Default blocks admin, wp-json, and login paths.
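For example, the Block AI Bots preset produces output along these lines (the exact bot list may vary as the presets are updated):

```
User-agent: GPTBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```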
Step 2: Add User-Agent Rules
Each rule targets a specific bot by its user-agent string. Use * to match all crawlers. For specific bots, enter their exact name (e.g., Googlebot, GPTBot). Use the presets dropdown to quickly select common bots. For each rule, add Disallow paths (paths crawlers cannot access) and Allow paths (exceptions within disallowed directories).
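As a sketch, a rule that blocks a private directory for all crawlers while carving out one exception (the paths here are illustrative) might look like:

```
User-agent: *
Disallow: /private/
Allow: /private/press-kit/
```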
Step 3: Configure Crawl Delay
The optional Crawl-delay directive tells bots how many seconds to wait between requests. Google ignores this in favor of Search Console settings, but Bing and smaller crawlers do respect it. Values between 1 and 10 are common for servers under moderate load. Leave blank for no directive.
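For instance, asking a compliant bot to wait five seconds between requests looks like this (the user-agent and value are examples):

```
User-agent: Bingbot
Crawl-delay: 5
```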
Step 4: Add Sitemap and Host
Always add your sitemap URL in the Global Directives section so all search engines can find it reliably. Format: https://example.com/sitemap.xml. The Host directive is optional and primarily used by Yandex to specify your canonical domain.
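With both directives filled in, the Global Directives section might emit the following (example.com is a placeholder for your domain):

```
Sitemap: https://example.com/sitemap.xml
Host: example.com
```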
Step 5: Review Validation Warnings
The generator flags issues like conflicting Allow/Disallow rules for the same path, missing sitemaps, and empty user-agent blocks. Address warnings before deploying to avoid unexpected crawl behavior.
Step 6: Download and Deploy
Click Download to get the robots.txt file, then upload it to your website's root directory so it's accessible at https://yourdomain.com/robots.txt. Alternatively, copy the content directly into your CMS or web server configuration.
FAQ
Is this robots.txt generator free?
Yes, completely free with no usage limits. Generate as many robots.txt files as you need without signing up or paying anything. All processing happens in your browser.
Is my website data safe?
Absolutely. All processing is done locally in your browser. Your URLs, paths, and configuration are never sent to any server or stored anywhere.
How do I block AI bots like GPTBot and Claude?
Use the 'Block AI Bots' preset to add specific user-agent rules for GPTBot (OpenAI), Claude-Web (Anthropic), CCBot (Common Crawl), Google-Extended, and other AI training crawlers. Each bot gets a separate user-agent block with a Disallow: / rule.
Does robots.txt actually block AI training bots?
Compliant AI companies like OpenAI, Google, and Anthropic respect the robots.txt disallow rules for their crawlers. However, some scrapers and non-compliant bots may ignore it. Robots.txt is a protocol, not a technical barrier — it works for honest bots.
What is the Crawl-delay directive?
Crawl-delay tells bots how many seconds to wait between requests. Google ignores this and uses Google Search Console's crawl rate settings instead. Bing and some other crawlers do respect it. Common values range from 1 to 10 seconds for servers under heavy load.
Should I add a sitemap URL to robots.txt?
Yes, it is best practice to include your sitemap URL so search engines can find it reliably. Add it as 'Sitemap: https://example.com/sitemap.xml'. You can add multiple sitemap URLs if you have separate sitemaps for different content types.
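A file with separate sitemaps per content type (the filenames here are illustrative) simply lists each on its own line:

```
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-images.xml
```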
What happens if I have conflicting Allow and Disallow rules?
The generator warns you if a path appears in both Allow and Disallow for the same user-agent. Most crawlers follow the most specific matching rule; Google applies a longest-match-wins algorithm, where the rule with the longest matching path takes precedence. It is safest to avoid overlapping rules.
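You can sanity-check how overlapping rules resolve with Python's built-in robots.txt parser. One caveat: urllib.robotparser applies rules in file order (first match wins), unlike Google's longest-match algorithm, so listing the Allow exception before the broader Disallow keeps both interpretations in agreement. The paths below are illustrative:

```python
from urllib import robotparser

# Illustrative rules: the Allow exception is listed before the broader
# Disallow, so first-match (urllib) and longest-match (Google) parsers agree.
rules = """\
User-agent: *
Allow: /private/press-kit/
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://example.com/private/press-kit/logo.png"))  # True
print(rp.can_fetch("*", "https://example.com/private/notes.html"))          # False
print(rp.can_fetch("*", "https://example.com/public/page.html"))            # True
```

This is a quick way to confirm, before deploying, that an exception path is actually reachable and a blocked path is actually blocked.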