How can I block crawlers sent by services such as adplexity.com / whatrunswhere.com?

September 30, 2017
PHP Nginx Security Firewall Ubuntu 16.04

Hi all

I would like to block crawling tools such as adplexity.com and whatrunswhere.com. These crawlers seem to be pretty smart and keep indexing media placements (ads). As a service, they let people view the indexed ads and the web pages those ads link to. The result is that people steal ideas, concepts and even complete landing page code.

There are several services that claim to 'cloak' pages and redirect these bots to dummy pages. They charge a very high fee (200 dollars and up per month) and refuse to say anything about their techniques or how they work. Not a single thing.

I was wondering if anyone has an idea of how I could identify those spy tools and redirect them to a dummy web page, while letting regular users through to the correct page. Are there any applications, techniques or other ways to accomplish this on my droplets? Any tips would be much appreciated!

Thanks,
Lex

2 Answers

If you are using Apache, you can use your .htaccess file to do so after identifying the crawler.

This may help: http://sitebeginner.com/apache/blockcrawlerbots/

For NGINX this may help: https://www.knthost.com/nginx/blocking-bots-nginx
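Both links boil down to the same idea: match the request's User-Agent (and optionally its IP) against a blocklist, then send matches somewhere else. Here is a rough sketch of that logic in Python, just to show the decision. The patterns, the network range and the page names are all made-up placeholders, not real signatures of any particular crawler; you would fill them in from your own access logs.

```python
import ipaddress
import re

# Hypothetical User-Agent fragments; replace with ones seen in your own logs.
BLOCKED_UA = re.compile(r"(examplebot|spycrawler|python-requests)", re.IGNORECASE)

# Hypothetical blocked network (203.0.113.0/24 is a reserved documentation range).
BLOCKED_NETS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(user_agent, remote_ip):
    """Return True if the User-Agent or the client IP is on the blocklist."""
    if BLOCKED_UA.search(user_agent or ""):
        return True
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in BLOCKED_NETS)


def choose_page(user_agent, remote_ip):
    """Pick the dummy page for blocked clients, the real page otherwise."""
    return "dummy.html" if is_blocked(user_agent, remote_ip) else "landing.html"
```

For example, `choose_page("ExampleBot/1.0", "198.51.100.7")` picks the dummy page because of the User-Agent, while a normal browser string from an unlisted IP gets the landing page. In practice you would express the same check in your .htaccess or Nginx config so the web server handles it before PHP is involved.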

Thanks for the answer, Jason. I had already looked into this technique. I believe these crawlers change IP and User-Agent very often, which makes it a cat-and-mouse game to catch them each and every time. I'm still looking for a 'catch most' solution that limits the amount of work I have to put in.
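One possible direction for a 'catch most' approach is to stop relying on a single IP or User-Agent match and instead add up several weak signals per request. The sketch below is only an illustration of that idea: the signals, the weights, the threshold and the "data center" ranges are all assumptions, and nothing here is known behaviour of the services named in the question. It will also misjudge some real visitors and some bots, so treat it as a starting point to tune against your own logs.

```python
import ipaddress

# Hypothetical hosting-provider ranges (reserved documentation networks as placeholders).
DATACENTER_NETS = [
    ipaddress.ip_network(n) for n in ("198.51.100.0/24", "203.0.113.0/24")
]


def suspicion_score(headers, remote_ip):
    """Add up weak hints that a request does not come from a normal browser."""
    h = {k.lower(): v for k, v in headers.items()}
    score = 0
    if not h.get("accept-language"):   # browsers normally send this header
        score += 1
    if not h.get("accept"):            # browsers normally send this header
        score += 1
    if "mozilla" not in h.get("user-agent", "").lower():
        score += 1                     # missing or non-browser User-Agent
    addr = ipaddress.ip_address(remote_ip)
    if any(addr in net for net in DATACENTER_NETS):
        score += 2                     # assumed weight for hosting-provider IPs
    return score


def should_serve_dummy(headers, remote_ip, threshold=3):
    """Serve the dummy page once enough signals add up (threshold is a guess)."""
    return suspicion_score(headers, remote_ip) >= threshold
```

With these made-up weights, a request carrying only `User-Agent: curl/7.58.0` already reaches the threshold, while a browser-like request from a listed range scores 2 and still gets the real page. Whether that trade-off is right depends on how much legitimate traffic you are willing to risk sending to the dummy page.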
