#These shouldn't be indexed directly, and spiders hitting the
#kana tester generate further session files each time.
#DID have:
# /japanese/kanatest.php
# and /japanese/kanjitest_PROT.php
# and /ironman/kanatest.php
# and /maths/modulo.php
#But that only stopped google CRAWLING the pages, it didn't stop
#Google from INDEXING them. It appears Googlebot has to see the meta robots
#tags on individual pages, in order to know to exclude them from the index.
#However I don't want this to be general rules, as other robots might
#actually *work* re robots.txt and ignore meta robots...
#(Why yes, Twiceler(?) seems to be that way...)
#Ok, since taking those off, Google has since unindexed them, so I can
#put the bloody things back in so it'll stop SPIDERING them too...
#I can probably leave /ironman off for good.
User-agent: Googlebot
Disallow: /utils/feedback.php
Disallow: /utils/tileedit.php
Disallow: /utils/tilegen_form.php
Disallow: /maths/modulo.php
Disallow: /japanese/kanatest.php
Disallow: /japanese/kanjitest_PROT.php

#unfortunately things are complicated by the fact Googlebot appears to
#interpret robots.txt differently to the standard...
#See http://www.google.com/support/webmasters/bin/answer.py?answer=40360
#So I may have to explicitly give it "Allow:" tags, that other robots
#wouldn't recognise.
#Note, I added the feedback page to BOTH user agent records, before even
#creating the page, so if Googlebot spiders it or Google shows it, it's
#not because of me!!!
#Ok, I've restored most of the pages to its blocklist, as they've delisted
#them now, and I want them to stop spidering them again too. However now
#there's bookmarkier to do too.
User-agent: *
Disallow: /japanese/kanatest.php
Disallow: /japanese/kanjitest_PROT.php
Disallow: /ironman/kanatest.php
Disallow: /maths/modulo.php
Disallow: /utils/bookmarkier.php
Disallow: /utils/tileedit.php
Disallow: /utils/tilegen_form.php
Disallow: /utils/feedback.php
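
#For reference, a rough sketch of the per-page meta robots tag mentioned in
#the Googlebot notes above. This is only an assumption about what would go
#in each page's <head>, not a copy of the markup actually used in those pages:
#  <meta name="robots" content="noindex">
#or, to address only Google's crawler:
#  <meta name="googlebot" content="noindex">
#A crawler has to be able to fetch the page to see the tag, so a page that is
#Disallowed here can't also be de-indexed that way while the block is in place.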