Robots.txt Tester
Paste a robots.txt and any URL to test whether it's allowed or blocked — for any user-agent. Handles wildcards, end-of-URL markers, and most-specific-wins rule precedence just like Google.
How to use this tool
3 quick steps
Get your robots.txt
Visit https://yoursite.com/robots.txt in a browser and copy the contents. Or paste the rules you are about to deploy.
Pick a URL to test
The full URL you want to know whether crawlers can fetch. Include the protocol (https://).
Read the verdict
We apply the User-agent + Allow/Disallow rules in priority order and tell you exactly which rule matched (or that no rule applied — meaning crawlable by default).
Paste the full robots.txt text. We parse User-agent groups, Allow/Disallow rules, and Sitemap directives.
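As a rough illustration of the parsing step, the sketch below groups Allow/Disallow rules by user-agent and collects Sitemap lines. The function name parse_robots and its return shape are assumptions made for this example, not this tool's actual code, and the sketch ignores other directives.

```python
def parse_robots(text):
    """Parse robots.txt text into ({agent: [(kind, pattern), ...]}, [sitemap_urls])."""
    groups = {}            # lowercase user-agent token -> list of (kind, pattern)
    sitemaps = []
    current_agents = []    # agents named by the group currently being read
    reading_rules = False  # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:  # a User-agent line after rules starts a new group
                current_agents, reading_rules = [], False
            agent = value.lower()
            current_agents.append(agent)
            groups.setdefault(agent, [])
        elif field in ("allow", "disallow"):
            reading_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
        elif field == "sitemap":
            sitemaps.append(value)
    return groups, sitemaps
```

Consecutive User-agent lines share one rule list, which is why the sketch only starts a new group after it has already seen an Allow or Disallow line.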
Robots.txt Testing Guide
Catch crawl bugs before they kill your rankings
A broken robots.txt is one of the fastest ways to disappear from Google. One misplaced 'Disallow: /' line blocks crawling of every URL on the site. Test changes before pushing to production. Every senior SEO has a story about an intern shipping a robots.txt change at 5pm on a Friday.
Most-specific rule wins
When multiple rules match, Google picks the one with the longest pattern. If an Allow and a Disallow tie on length, the less restrictive Allow wins. 'Allow: /api/public/' beats 'Disallow: /api/' because it's more specific. This tool uses Google's exact precedence rules.
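A minimal sketch of longest-pattern precedence, restricted to plain prefix patterns (no wildcards). The function name decide and the (kind, pattern) rule shape are assumptions for illustration. Breaking ties in favor of Allow reflects Google's documented "least restrictive rule" behavior.

```python
def decide(rules, path):
    """Return (verdict, matched_pattern) for a URL path.

    rules: list of (kind, pattern) where kind is 'allow' or 'disallow'.
    Only plain prefix patterns are handled in this sketch.
    """
    matches = [
        (len(pattern), kind == "allow", kind, pattern)
        for kind, pattern in rules
        if pattern and path.startswith(pattern)
    ]
    if not matches:
        return "allow", None  # no rule applies: crawlable by default
    # Longest pattern wins; on equal length, allow (True) sorts above disallow.
    _, _, kind, pattern = max(matches)
    return kind, pattern


rules = [("disallow", "/api/"), ("allow", "/api/public/")]
print(decide(rules, "/api/public/users"))
print(decide(rules, "/api/private"))
```

The first call is decided by the longer Allow pattern and the second by the Disallow, since only '/api/' matches '/api/private'.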
Wildcards (*)
An asterisk matches any sequence of characters, including none. 'Disallow: /*.pdf$' blocks every URL whose path ends in .pdf. Use carefully; overly broad wildcards can block far more than you intend.
End-of-URL markers ($)
The dollar sign anchors to the end of the URL. 'Disallow: /*.pdf$' blocks /report.pdf but NOT /report.pdfa or /report.pdf?v=2. Essential for file-type rules.
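One way to picture wildcard and end-marker matching is to translate a robots pattern into a regular expression. The sketch below does that with Python's re module. The helper names pattern_to_regex and matches are made up for this example, and it assumes '$' is only meaningful as the last character of a pattern.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces; each '*' becomes '.*' (any sequence, including none).
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    # r"\Z" matches only at the very end of the string.
    return re.compile(regex + (r"\Z" if anchored else ""))


def matches(pattern, url_path):
    # re.match anchors at the start, mirroring how rules anchor to the path start.
    return pattern_to_regex(pattern).match(url_path) is not None


for path in ["/report.pdf", "/report.pdfa", "/report.pdf?v=2"]:
    print(path, matches("/*.pdf$", path))
```

Only /report.pdf matches here. Dropping the '$' from the pattern would make all three paths match, because the rule would then act as a prefix match after the wildcard.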
User-agent matching
Google picks the most specific UA block that matches the crawler. 'User-agent: Googlebot' beats 'User-agent: *' for Googlebot requests. If no block matches, default is allow-all.
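The group selection described above can be sketched as follows. This is a simplified model: it assumes a named group applies when its token is a case-insensitive prefix of the crawler's product token, and the function name select_group is hypothetical.

```python
def select_group(groups, crawler_token):
    """Return the key of the user-agent group that applies, or None.

    groups: dict mapping a user-agent token (e.g. 'Googlebot', '*') to its rules.
    """
    token = crawler_token.lower()
    named = [ua for ua in groups if ua != "*" and token.startswith(ua.lower())]
    if named:
        return max(named, key=len)  # the most specific (longest) token wins
    # Fall back to the wildcard group; None means no group, so allow-all.
    return "*" if "*" in groups else None


groups = {"*": [], "Googlebot": [], "Googlebot-Image": []}
print(select_group(groups, "Googlebot-Image"))
print(select_group(groups, "Bingbot"))
```

A None result corresponds to the "no block matches" case, where the default is allow-all.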
Empty Disallow = allow-all
'Disallow:' with no value is an explicit allow-all (a legacy way to declare 'we have no restrictions for this UA'). 'Disallow: /' is the opposite — block everything.
Case sensitivity
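Paths in robots rules are case-sensitive. 'Disallow: /Admin' does NOT block /admin. Mirror your actual URL casing, and add a separate rule for each casing you need to cover. A wildcard such as '/*admin' still matches case-sensitively and will not catch /Admin.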
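A quick way to see case sensitivity in action, using plain prefix matching as a stand-in for rule matching. The helper name blocked is an assumption for this example.

```python
def blocked(disallow_patterns, path):
    """True if any Disallow pattern is a case-sensitive prefix of the path."""
    return any(path.startswith(pattern) for pattern in disallow_patterns)


print(blocked(["/Admin"], "/Admin/users"))            # same casing as the rule
print(blocked(["/Admin"], "/admin/users"))            # different casing
print(blocked(["/Admin", "/admin"], "/admin/users"))  # one rule per casing
```

Listing one rule per casing is the straightforward way to cover both variants.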
Pro Tips
Any edit to robots.txt should be tested against your top 10 URLs before going live. This tool works without a network round-trip — safe to use pre-commit.
'Disallow: /admin' blocks /admin and /admin/login, and also /administrator, because rules are prefix matches. 'Disallow: /admin/' blocks only paths under /admin/ but allows /admin itself. Know which you want.
Disallow hides URLs from crawlers but does NOT hide them from humans. Sensitive paths should require authentication — not robots.txt blocking.
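The trailing-slash tip above can be checked with a few lines of prefix matching. The sample paths, including /administrator, are illustrative and not taken from any real site.

```python
def blocked(disallow_pattern, path):
    """Plain prefix match, which is how a wildcard-free Disallow rule behaves."""
    return path.startswith(disallow_pattern)


paths = ["/admin", "/admin/login", "/administrator"]

without_slash = [p for p in paths if blocked("/admin", p)]
with_slash = [p for p in paths if blocked("/admin/", p)]

print(without_slash)  # every path that starts with /admin
print(with_slash)     # only paths under /admin/
```

Without the trailing slash the rule also catches unrelated paths that merely begin with the same characters, which is easy to overlook.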
Frequently Asked Questions
- Why did Google still index a URL I disallowed?
- Robots.txt blocks crawling, not indexing. If other sites link to a blocked URL, Google can still include it in the index (with just the URL and no snippet). To truly de-index, add a meta robots noindex tag and leave the URL crawlable, so Google can fetch the page and see the tag.
- What's the difference from Google's Search Console tester?
- Search Console tests against your LIVE robots.txt on Google's servers. This tool tests against any robots.txt you paste — useful for testing drafts, competitors, or historical versions you've exported.
- Does this handle Crawl-delay?
- No. Google ignores Crawl-delay entirely. Some other crawlers, such as Bingbot, honor it. The directive parses but has no effect on allow/block decisions.
- Why is my wildcard rule not matching?
- Most common bug: forgetting that patterns anchor to the start of the path. 'Disallow: /pdf' matches /pdf and /pdfs but NOT /my/pdf. Use '/*pdf' to match anywhere in the path.
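The start-of-path anchoring in the last answer can be reproduced with a small matcher. This sketch handles '*' only (no '$' support), and the name rule_matches is an assumption for illustration.

```python
import re


def rule_matches(pattern, path):
    """Match a robots pattern against a path, anchored at the start of the path."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None  # re.match only matches from position 0


for pattern in ["/pdf", "/*pdf"]:
    for path in ["/pdf", "/pdfs", "/my/pdf"]:
        print(pattern, path, rule_matches(pattern, path))
```

With '/pdf' the path /my/pdf is not matched, while '/*pdf' matches all three paths because the wildcard can absorb the leading 'my/'.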