mod_security 960015 blocks Google and other good bots

Vlad asked:

mod_security rule 960015 keeps catching Google and other good bots. I have the following in the vhost to prevent good bots from being caught:

SecRule REQUEST_HEADERS:User-Agent "" log,allow
SecRule HTTP_USER_AGENT "Mail.RU_Bot" log,allow

Same for Google and Yandex.

It works 99% of the time, but fails at other times for some really bizarre reason. Here are example logs for the bot:

Successful: - - [07/Mar/2014:10:17:13 +0400] "GET / HTTP/1.1" 200 189934 "-"
"Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/Fast/2.0; 

[Fri Mar 07 10:17:13 2014] [error] [client] ModSecurity: Access 
allowed (phase 2). Pattern match "Mail" at REQUEST_HEADERS:User-Agent. 
[file "/etc/apache2/sites-enabled/xxx"] [line "28"] [hostname "xxx"] 
[uri "/"] [unique_id "UxlkaQp-d4EAABU9BSIAAAAV"]

And the next minute it fails: - - [08/Mar/2014:02:14:19 +0400] "GET / HTTP/1.1" 403 389 "-" "
Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +

[Sat Mar 08 02:14:19 2014] [error] [client] ModSecurity: Access 
denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. 
[file "/usr/share/modsecurity-crs/activated_rules/
modsecurity_crs_21_protocol_anomalies.conf"] [line "47"] [id "960015"] 
[rev "2.2.5"] [msg "Request Missing an Accept Header"] [severity "CRITICAL"] 
[tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "xxx"] [uri "/"] 
[unique_id "UxpEuwp-d4EAAEMnBFQAAAAE"]

I know the proper way is to do reverse lookups, but they slow down the website, and I want to have at least some security. As it stands I can't use 960015 because it blocks Google and others. At the same time it is a very useful rule that has caught hundreds of bad bots.

If someone knows how to set it up with reverse lookup that will actually work and allow Google and other good bots to index, you are welcome to post here. However, I am also looking for a quick and dirty solution to make it work right now, since some security is better than no security.

My answer:

First a disclaimer: I’m the author of Bad Behavior, a similar product, and some of the ModSecurity core rules were derived from Bad Behavior.

RFC 2616 states that the Accept header SHOULD be present in all requests. Note that this isn’t an absolute requirement, so a user-agent is still conditionally compliant (as defined in the RFC) if it doesn’t send this header.

The rationale for denying requests without an Accept header is that all regular web browsers do send the header, while many bots do not. In practice, though, after seeing millions of requests, some “good” bots don’t send the Accept header either. So this rule is not perfect and does generate false positives.

Bad Behavior does not block these unless the request is a POST request. This cuts down on spam and reduces false positives to approximately zero, but still passes other bots. In my experience, many of those get caught by other rules anyway.

In your situation I would just disable this rule. It isn’t buying you quite as much as you seem to think. If you want, you can modify it so that it only applies to POST requests.
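If you want to go the POST-only route, a minimal sketch of one way to do it in the vhost or CRS configuration might look like the following. This assumes ModSecurity 2.x Apache syntax; the rule ID 999015 is arbitrary and chosen for illustration, and you should pick an ID that doesn't collide with your rule sets.

```
# Disable the stock CRS rule that blocks any request missing an Accept header
SecRuleRemoveById 960015

# Illustrative replacement: deny only POST requests that lack an Accept header.
# The second rule mirrors the original check ("Operator EQ matched 0 at
# REQUEST_HEADERS" means the Accept header count was zero).
SecRule REQUEST_METHOD "^POST$" "chain,phase:2,t:none,deny,status:403,id:'999015',msg:'POST request missing an Accept header'"
SecRule &REQUEST_HEADERS:Accept "@eq 0"
```

The chain means the 403 only fires when both conditions hold, so GET requests from bots that omit the Accept header pass through while spammy POSTs without one are still denied.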

View the full question and answer on Server Fault.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.