# $Id: robots.txt,v 1.3 1998/11/11 11:11:11 kat Exp $
# robots.txt for http://research.legalaid.on.ca/
# see
# for an explanation.
#
# 11Nov98 Eric Lee
# 9May06 EL Add pickup/
# 11Nov98 EL Set to prevent all access until we decide if some access should
#            be allowed.

# Specific agents
#User-agent: MOMspider # The Multi-Owner Maintenance Spider

# All agents
User-agent: *
# All spiders should avoid
# These are the ones "at the same level as" the level of /web/httpd/docs
# (I don't really understand how a browser gets to these, since they are above
# the "root", but for cgi's it clearly does, so let's protect the rest.
# I'm copying this idea from the robots.txt documentation referred to above,
# which may be based on a different server organization.)
Disallow: /*/
Disallow: /answerbook/
Disallow: /format.dat
Disallow: /home/
Disallow: /lost+found/
Disallow: /netra users/
Disallow: /catdocs/
Disallow: /ftp/
Disallow: /htdocs/
Disallow: /mail/
Disallow: /nsr/
#
# These are the ones "below" the level of /web/httpd/docs
# Exclude everything you don't want included. This is everything which is not
# specifically intended to be publicly available.
# Individual files
Disallow: /*
Disallow: /*.html
Disallow: /*.gif
Disallow: /*.jpg
# Directories, i.e. projects with their own index.html
Disallow: /cat/ # Catalogue
Disallow: /cat1/
Disallow: /cat2/
Disallow: /cat3/
Disallow: /text/
Disallow: /tocs/ # Table of Contents pages
Disallow: /pickup/