Bypass Hetrixtools through Cloudflare's bot fight mode
Hi.
I've been receiving a lot of requests from bots today and the server is becoming slow, so I'm trying to enable Bot Fight Mode in Cloudflare, but the Hetrixtools bot is being blocked.
I tried adding a firewall rule to allow the bot, but CF keeps enforcing the blocking rule.
Any ideas?
Comments
check the firewall overview, look at the reason IP blocked
This is an issue with CloudFlare and the way their firewall works. We've had several such complaints before.
It looks like CloudFlare's Bot Fight Mode will simply ignore or trump any 'allow' firewall rules... which doesn't make sense.
I've done all I could on our part: I applied for our monitoring locations to be whitelisted in their Bot Fight Mode, but haven't heard anything back from CloudFlare in over a month.
I'd also be interested in seeing this.
Cheers.
Shoot CF in the face.
Figure out which URLs the bots are calling and block them with nginx. Or, if it's just 404s, have fail2ban go over the error log and ban them by IP.
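To get a feel for what fail2ban would match before writing a filter, here is a minimal Python sketch that counts 404 responses per client IP. It assumes the standard "combined" access log format; the function name, the threshold and the sample lines are made up for illustration, and this is not a replacement for fail2ban itself.

```python
import re
from collections import Counter

# Start of a combined-format access log line (assumed format):
# client IP, two ident fields, [timestamp], "request", status code.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')

def ips_over_404_limit(lines, limit):
    """Return {ip: count} for clients with at least `limit` 404 responses."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "404":
            counts[m.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= limit}

# Hypothetical sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.5 - - [10/Oct/2021:13:55:36 +0000] "GET /old/1 HTTP/1.1" 404 153 "-" "bot"',
    '203.0.113.5 - - [10/Oct/2021:13:55:37 +0000] "GET /old/2 HTTP/1.1" 404 153 "-" "bot"',
    '203.0.113.5 - - [10/Oct/2021:13:55:38 +0000] "GET / HTTP/1.1" 200 512 "-" "bot"',
    '198.51.100.7 - - [10/Oct/2021:13:55:39 +0000] "GET /x HTTP/1.1" 404 153 "-" "curl"',
]

offenders = ips_over_404_limit(sample, limit=2)
```

On a real server you would pass an open log file instead of `sample` and pick a limit that fits your normal traffic.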
If bots slow down your server, it's not always because there are too many bots crawling around, but because they hit some "expensive" resource, such as a long-running PHP script or something causing complex database queries. It could just as well be Googlebot following links to something you didn't expect to be found, and I guess you wouldn't want to block Google, right?
Instead of using a firewall, you might consider figuring out which "expensive" route/path they hit and:
A good starting point would be the server's access log.
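As a rough illustration of that first look at the access log, the following Python sketch ranks request paths by hit count, ignoring query strings. It assumes each log line contains a quoted request such as "GET /path HTTP/1.1"; the function name and sample lines are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Quoted request portion of an access log line (assumed format).
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')

def top_paths(lines, n=10):
    """Return the n most requested paths as (path, hits) pairs."""
    counts = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            # Drop the query string so ?page=2 and ?page=3 count together.
            counts[urlsplit(m.group(1)).path] += 1
    return counts.most_common(n)

# Hypothetical sample lines.
sample = [
    '203.0.113.5 - - [10/Oct/2021:13:55:36 +0000] "GET /archive/2015/?page=2 HTTP/1.1" 200 900',
    '203.0.113.5 - - [10/Oct/2021:13:55:37 +0000] "GET /archive/2015/?page=3 HTTP/1.1" 200 900',
    '198.51.100.7 - - [10/Oct/2021:13:55:38 +0000] "GET / HTTP/1.1" 200 512',
]

busiest = top_paths(sample, n=1)
```

Hit count is only a proxy for cost. If your log format also records the request duration, summing that per path would point at the expensive routes more directly.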
I have a page that displays a map of available bus stops: https://yoursunny.com/p/rideon-today/
The URL takes a date as a query argument, because bus service differs from date to date.
Google crawled thousands of pages, because each date results in a different URI.
I initially added noindex on pages with a specified date other than "today". Google stopped indexing the excess pages, but crawling continued.
Then I added rel=nofollow to the links. Google stopped crawling unnecessary pages.
PS. I usually avoid databases, but this page has SQLite and it's the most complicated SQL query I wrote in a decade:
https://bitbucket.org/yoursunny/yoursunny-website/src/0616b70e484bbb736185a7debb2e3addf8153fe5/www/p/rideon-today/gtfs-db.inc.php#lines-5
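For readers who want to try the noindex plus rel=nofollow approach described above, here is a minimal Python sketch of the idea. It is not the code behind the page linked above; the `date` query parameter name and both helper functions are assumptions made for illustration.

```python
from datetime import date
from html import escape

def robots_meta(requested, today):
    """Return a noindex meta tag for dated pages other than today, else ''."""
    if requested is None or requested == today:
        return ""
    return '<meta name="robots" content="noindex">'

def date_link(target, today):
    """Build a link to another date, marked nofollow unless it is today."""
    # "date" is a hypothetical query parameter name.
    href = "?date=" + target.isoformat()
    rel = "" if target == today else ' rel="nofollow"'
    return f'<a href="{escape(href)}"{rel}>{target.isoformat()}</a>'

today = date(2021, 10, 10)
tomorrow = date(2021, 10, 11)

meta_for_tomorrow = robots_meta(tomorrow, today)
link_to_tomorrow = date_link(tomorrow, today)
```

The meta tag keeps dated pages out of the index, while the nofollow links are what stopped the crawler from walking from one date to the next.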
check the firewall overview, look at the reason IP blocked
Well... the reason is Bot Fight Mode, and the actions taken include Block, JS Challenge and more. I tried to bypass all of them without success. @Andrei is right.
I understand, but there is not much room for optimization with WordPress + plugins + multipurpose themes. These are not the good bots and they don't respect robots.txt. Caching is enabled, but they are scraping years-old archives very fast, content that is not cached by LSCache yet. Wordfence makes everything slow, and ModSecurity throws so many false positives that the backend becomes unusable.
In the meantime I've disabled Bot Fight Mode, and the attack has stopped. I will check some rate limiting options for the old content.
Thank you!