Robots.txt Builder
Build precise robots.txt directives that control how search engines and AI bots crawl your site
Optimize Your SEO: Generate a Professional Robots.txt
A robots.txt file is one of the first things search engine crawlers request when they visit a site. It acts as a gatekeeper, guiding compliant bots away from private or resource-heavy directories and toward your most important content.
How to Use
- Set Rules: Choose a bot, an action, and a path.
- Add to List: Click “Add Rule” to stack instructions.
- Platform Presets: Use “SEO Health Check” for quick setups.
- Export: Download the `.txt` or copy the code.
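The steps above amount to collecting (bot, action, path) rows and serializing them into groups. A minimal Python sketch of that idea follows; `Rule` and `build_robots_txt` are hypothetical names chosen for illustration and are not the builder's actual code.

```python
from typing import Iterable, NamedTuple, Sequence


class Rule(NamedTuple):
    """One builder row (hypothetical structure): bot, action, path."""
    bot: str      # e.g. "*", "Googlebot", "GPTBot"
    action: str   # "Allow" or "Disallow"
    path: str     # e.g. "/admin/"


def build_robots_txt(rules: Iterable[Rule], sitemaps: Sequence[str] = ()) -> str:
    """Group rules by bot, emit one User-agent group per bot, sitemaps last."""
    groups = {}  # bot name -> list of directive lines, in insertion order
    for rule in rules:
        if rule.action not in ("Allow", "Disallow"):
            raise ValueError(f"unknown action: {rule.action!r}")
        groups.setdefault(rule.bot, []).append(f"{rule.action}: {rule.path}")

    blocks = []
    for bot, lines in groups.items():
        blocks.append("\n".join([f"User-agent: {bot}", *lines]))
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(
        [Rule("*", "Disallow", "/admin/"), Rule("GPTBot", "Disallow", "/")],
        ["https://example.com/sitemap.xml"],
    ))
```

Rules for the same bot are merged into a single group, which avoids repeating a `User-agent` line for every row.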
Pro Tips
- Don’t hide CSS/JS: Google needs these to render your site.
- Sitemaps: Add your Sitemap URL at the very end.
- Wildcards: Use `*` to apply a rule to every bot.
- AI Bots: Use the AI preset to protect your content.
Deployment Checklist
- Root Access: Upload strictly to `domain.com/robots.txt`.
- Naming: Filename must be lowercase `robots.txt`.
- Visibility: Ensure the file is public (HTTP 200).
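The checklist can be partly automated. The sketch below uses only the Python standard library; the function names `robots_location_problems` and `fetch_status` are hypothetical, and `fetch_status` performs a real network request, so it is shown for illustration rather than as a definitive monitoring tool.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import urlopen


def robots_location_problems(url: str) -> list:
    """Return a list of problems with a robots.txt URL; empty means it looks fine."""
    parts = urlsplit(url)
    problems = []
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("URL must be absolute, e.g. https://domain.com/robots.txt")
    if parts.path != "/robots.txt":
        if parts.path.lower() == "/robots.txt":
            problems.append("filename must be lowercase robots.txt")
        else:
            problems.append("file must sit at the root as /robots.txt")
    return problems


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url (urlopen follows redirects)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises for 4xx/5xx responses; report the code instead.
        return err.code
```

For example, `robots_location_problems("https://domain.com/blog/robots.txt")` reports that the file is not at the root, and a deployed file should make `fetch_status` return 200.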
Bot Guide: Googlebot (Google), Bingbot (Bing), GPTBot (OpenAI). Use `*` to address every crawler that has no group of its own.
SEO Health Check: Platform Presets
Click to add standard security, SEO, or AI-blocking rules:
Frequently Asked Questions
Where does the robots.txt file go?
At the root of your domain, served at https://yourdomain.com/robots.txt. Crawlers only read it from that exact path, so a robots.txt in any subfolder is ignored. The filename must be lowercase, and the file must return HTTP 200.
Does robots.txt actually block AI crawlers?
Only the ones that choose to obey it. The operators of major AI user agents such as GPTBot, ClaudeBot, and Google-Extended document that they respect robots.txt, so a Disallow rule keeps compliant bots out. It is a published request rather than a hard firewall, so pair it with server-level controls for anything you must truly protect.
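To see how a compliant crawler would read such rules, here is a small sketch using Python's standard `urllib.robotparser`. The `example.com` URLs are placeholders, and the parser only models a well-behaved bot; it does not enforce anything on a real crawler.

```python
from urllib.robotparser import RobotFileParser

# Sample file: block three AI user agents, leave everything else open.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant AI crawler is refused; a search crawler is not.
ai_allowed = parser.can_fetch("GPTBot", "https://example.com/articles/1")
search_allowed = parser.can_fetch("Googlebot", "https://example.com/articles/1")
print("GPTBot allowed:", ai_allowed)
print("Googlebot allowed:", search_allowed)
```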
What is the difference between Disallow and noindex?
Disallow tells a crawler not to fetch a path, but it does not guarantee the page stays out of an index, because the URL can still be listed from external links. To keep a page out of results, allow the crawl and use a noindex meta tag or an X-Robots-Tag header so the engine can read the directive.
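As a rough illustration of the header approach, the following minimal WSGI sketch adds `X-Robots-Tag: noindex` to selected responses. The `/internal-search/` prefix and the `app` name are assumptions made for the example; a real site would usually set this header in its web server or framework configuration.

```python
NOINDEX_PREFIXES = ("/internal-search/",)  # hypothetical paths to keep out of results


def app(environ, start_response):
    """Minimal WSGI app that serves a page and marks some paths as noindex."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path.startswith(NOINDEX_PREFIXES):
        # The crawler must be allowed to fetch the page to see this header,
        # so these paths should not also be disallowed in robots.txt.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<!doctype html><title>Example page</title>"]
```

The app can be served locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` to inspect the response headers.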
Should I block my CSS and JavaScript?
No. Google renders pages the way a browser does, so blocking CSS or JS stops it from seeing your real layout and content, which can lower how well the page is understood. Leave rendering assets crawlable; reserve Disallow for admin, checkout, and other private directories.
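A quick way to catch accidentally blocked assets is to test a few CSS and JS URLs against the file. The sketch below uses the standard `urllib.robotparser`; `blocked_assets` is a hypothetical helper, and because that parser does plain prefix matching without Google's `*` and `$` path patterns, the result is a rough check rather than an exact replica of Googlebot's behavior.

```python
from urllib.robotparser import RobotFileParser


def blocked_assets(robots_txt: str, asset_urls, user_agent: str = "Googlebot"):
    """Return the asset URLs that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in asset_urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /admin/\nDisallow: /static/js/\n"
    assets = [
        "https://example.com/static/css/site.css",
        "https://example.com/static/js/app.js",
    ]
    print(blocked_assets(sample, assets))
```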
Why add a Sitemap line?
The Sitemap directive points crawlers straight to your XML sitemap, so they find new or updated URLs faster than by following links alone. It must be an absolute URL, including the scheme and host, and it conventionally sits at the end of the file. A single robots.txt can list more than one sitemap.
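As a small illustration, Python's standard `urllib.robotparser` (version 3.8 or later for `site_maps()`) can read the Sitemap lines back out of a file. The URLs here are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_WITH_SITEMAPS = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(ROBOTS_WITH_SITEMAPS.splitlines())

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = sitemap_parser.site_maps()
print(sitemaps)
```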
What does User-agent: * mean?
The asterisk is a wildcard, so the rules beneath it apply to every crawler that has no block of its own. A crawler obeys the most specific matching group, so a named GPTBot block overrides the wildcard for GPTBot. Order the file from general rules down to specific ones for clarity.
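One consequence worth checking is that a named group replaces the wildcard group for that bot instead of adding to it. The sketch below demonstrates this with the standard `urllib.robotparser`; the paths and `example.com` host are placeholders.

```python
from urllib.robotparser import RobotFileParser

GROUPED_ROBOTS = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /drafts/
"""

group_parser = RobotFileParser()
group_parser.parse(GROUPED_ROBOTS.splitlines())

# GPTBot follows only its own group, so the wildcard's /admin/ rule
# does not apply to it unless it is repeated inside the GPTBot group.
gptbot_admin = group_parser.can_fetch("GPTBot", "https://example.com/admin/")
gptbot_drafts = group_parser.can_fetch("GPTBot", "https://example.com/drafts/a")
# Crawlers without a named group fall back to the wildcard group.
other_admin = group_parser.can_fetch("Googlebot", "https://example.com/admin/")
other_drafts = group_parser.can_fetch("Googlebot", "https://example.com/drafts/a")
```

If a named bot should also stay out of `/admin/`, that Disallow line needs to appear in its own group as well.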
Is a missing robots.txt a problem?
A missing file is treated as full-allow, so crawlers assume every path is open. That is fine for a fully public site, but most sites benefit from disallowing admin paths and declaring a sitemap. An empty file behaves the same way as a missing one.
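The empty-file case can be confirmed with a short sketch using the standard `urllib.robotparser`; the bot name and URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

empty_parser = RobotFileParser()
empty_parser.parse([])  # an empty robots.txt: no groups, no rules

# With no rules present, every path is treated as crawlable.
empty_allows_admin = empty_parser.can_fetch("AnyBot", "https://example.com/admin/")
print(empty_allows_admin)
```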