Template:Web crawlers
From LinuxReviews
Hostile web crawlers (do not follow the Robots Exclusion Standard)
See HOWTO stop automated spam-bots using .htaccess for instructions on blocking them
Annoying web crawlers (follow robots.txt, but do not provide any public benefit)
Friendly web crawlers (used by search engines, or otherwise provide some public benefit)