I host a normal website on an Apache server. For the last 2 days the website had 0 visitors but more than 1GB of traffic per day. When I monitor the traffic, most of it comes through the Tor network. I have no idea why this is happening. I blocked the Tor bulk exit list just to see what would happen, and the traffic went down a lot, but a few days later it came back. All of it is aimed at the homepage of the website. The website had a leak at the former hoster, which is why it was rebuilt, and there are no leaks now. What can I do, or what should I do?
I can’t say why it’s getting traffic. But is 1GB of traffic per day a lot for you? That’s only about 0.1 Mbit/s.
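To make the 0.1 Mbit/s figure concrete, here is a rough check of the arithmetic. It is only a sketch, and it assumes 1 GB means 10**9 bytes and that the traffic is spread evenly over the day; the function name is made up for illustration.

```python
def average_mbit_per_s(bytes_per_day: float) -> float:
    """Average rate in Mbit/s for a number of bytes transferred in one day."""
    seconds_per_day = 24 * 60 * 60
    bits_per_day = bytes_per_day * 8
    return bits_per_day / seconds_per_day / 1_000_000


# Assumption: 1 GB = 10**9 bytes, evenly spread over 24 hours.
rate = average_mbit_per_s(1_000_000_000)
print(f"{rate:.3f} Mbit/s")  # prints 0.093 Mbit/s
```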
For a website with 0 visitors and hits only from the Tor network, 1GB is a lot. Today a new IP was added to the list and immediately I got traffic from that IP.
OK, where do you get those stats (0 visitors, 1GB)?
Does this come from analysis of the server logs?
Unless you are a brand new site, you should be getting hammered by bots, scrapers and hacking attempts against vulnerable software like WordPress and other CMS types. In this hammering I include legit traffic like Google, Bing, Yahoo, Yandex, Apple, etc.
Agreed, 1GB is not a lot unless it arrives within 1 or 2 seconds.
I also don’t get this:
Today a new IP was added to the list and immediately I got traffic from that IP.
I presume this means the blocklist, so how does adding an IP to a blocklist get you traffic from it?
The website had a leak at the former hoster, which is why it was rebuilt, and there are no leaks now. What can I do, or what should I do?
Rebuilt on a new hoster, I presume. With what software (I mean the framework, not the hoster)? Maybe the framework has some vulnerability and it is being exploited.
Lastly, are the stats you state only for Tor traffic, and am I reading this post wrong?
Most sites get tons of traffic, and I estimate 50% or more of it is scraping (junk).
I mean it in the sense of “is it a problem?”. The title says DDoS, but is the server actually overloaded?
Stats: server logs and the Independent Analytics plug-in on the website.
The website has little to no legit traffic. Bing and Google are allowed; all other bots are more or less blocked (more or less: via robots.txt, which allows a few other bots besides Google and Bing, while Yahoo, Yandex and Apple are blocked). The website was abused for form spam, then it was rebuilt and moved to another hoster and hosting plan.
New IP: I mean a new IP on the Tor exit list at https://check.torproject.org/torbulkexitlist
Framework: WordPress with no known exploits, everything up to date, server up to date, Virusdie firewall, server firewall, the works.
The stats are 99% Tor traffic, which is why I started this thread. Most of the traffic is to the homepage: just a homepage visit and then nothing. No scrapers as far as I can see.
How can I stop this? Is the only solution blocking Tor (which I don’t want to do, because I love Tor)? Is there any tooling that can connect to specific Tor exit nodes and do this? I am looking for a solution. The browser is also different every time, so I can’t block a specific browser.
Please help me find a solution.
You are correct, it is not a DDoS; the server does not get overloaded. But I can’t explain the traffic. This customer (like many of my customers) has 5GB of bandwidth a month, so his traffic allowance is used up in 5 days without a single visitor.
OUCH I see. Figured it was something malicious.
I understand not wanting to block Tor. I would not want to do it either.
Not sure how your setup is. Can you just block Tor for that customer with .htaccess, using Deny from n.n.n.n lines with that blocklist as data?
A regular-expression replace could do it: turn the start of each line into Deny from followed by the original line.
Is there a common referer you can block?
robots.txt is not much respected any more.
Edit later:
I overcomplicated my reply a bit with that regular expression statement for the blocklist.
In a Windows command prompt:
FOR /F %i in (torbulkexitlist) do @echo deny from %i
In Linux:
awk '{print "Deny from", $0}' <torbulkexitlist
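For anyone who wants the same conversion with a bit of validation, here is a minimal Python sketch. It assumes the blocklist has one IP address per line, as the Tor bulk exit list does. The function name is hypothetical. It can emit either the Deny from lines shown above (Apache 2.2 style, which on Apache 2.4 needs mod_access_compat) or the Apache 2.4 Require not ip form; which one fits depends on the server, so treat this as a starting point rather than a drop-in config.

```python
import ipaddress


def blocklist_to_htaccess(lines, apache24=True):
    """Turn one-IP-per-line text into Apache access rules, skipping junk lines."""
    rules = []
    for line in lines:
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue  # skip blank lines and comments
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a valid IP address, ignore it
        rules.append(f"Require not ip {ip}" if apache24 else f"Deny from {ip}")
    if apache24:
        # Apache 2.4: negated Require lines must sit inside a RequireAll block.
        return ["<RequireAll>", "Require all granted", *rules, "</RequireAll>"]
    return rules


# Sample input: two addresses seen in this thread plus some junk lines.
sample = ["185.220.101.69", "185.100.87.250", "", "not-an-ip"]
print("\n".join(blocklist_to_htaccess(sample)))
```

To use it on the real list, read the downloaded file line by line and write the result into the customer's .htaccess.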
Thank you very much I will try this.
OK, so in this sense the server is overloaded.
I’d suggest looking into generic anti-DoS solutions, although I’m not an expert in this field. Nginx has a rate-limiting feature with per-IP buckets.
If you know that most of this spam traffic is for the home page, you can set a harder limit just for the home page, and a more lax limit for the rest of the site.
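To illustrate the idea of per-IP buckets with a stricter limit for the home page, here is a small token-bucket sketch in Python. It is not how Nginx implements its limiter, just the general technique. All names and the numbers (2 home page hits in a burst, then one per minute; 10 hits in a burst and one per second elsewhere) are made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float  # requests this IP may still make right now
    last: float    # timestamp (seconds) of the last refill


class PerIpRateLimiter:
    """Token bucket per client IP: refills at rate_per_s, capped at burst."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.burst = burst
        self.buckets = {}

    def allow(self, ip: str, now: float) -> bool:
        bucket = self.buckets.get(ip)
        if bucket is None:
            bucket = self.buckets[ip] = Bucket(tokens=self.burst, last=now)
        # Refill according to the time passed since the last request.
        elapsed = now - bucket.last
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
        bucket.last = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return True
        return False


# Hypothetical limits: strict for "/", lax for the rest of the site.
home_limiter = PerIpRateLimiter(rate_per_s=1 / 60, burst=2)
site_limiter = PerIpRateLimiter(rate_per_s=1.0, burst=10)


def allow_request(ip: str, path: str, now: float) -> bool:
    limiter = home_limiter if path == "/" else site_limiter
    return limiter.allow(ip, now)
```

A caveat for this particular case: Tor traffic arrives from many exit nodes, so a per-IP limit only helps as far as the same exits keep coming back.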
Not running Nginx, but I can set up a server with Nginx and try. Thank you for the suggestion.
Maybe it is also an option to make the page that gets accessed much smaller, so less bandwidth is consumed when the bots or the script fetch the website…
Already did that: the website is only accessible in 3 countries worldwide (the country of origin and the country of Bing and Google) and still the bandwidth is consumed. I also added lazy loading, and the images are optimized and in WebP. There was an autoplaying movie (1.4 MB), but that is disabled too. The website is 549.5 KB in size according to an external check.
I can understand you saying no visitors to mean no legit visitors and I can see around 5GB in 5 days in your case of spam bombing.
But only Tor traffic for that 5GB seems weird.
No regular scrapers or traffic outside of Tor seems weird also.
Surely within that 5 day period some scrapers (legit or not) would have come around. This seems almost impossible to believe.
I would be curious to see the Apache logs.
What kind of traffic to the front page? GET, PUT, HEAD?
Is the website actually rendered via Tor in a headless browser (so JavaScript, images and such are also delivered), or is it just fetching the plain index?
If I take the total visitors according to the Independent Analytics plug-in for the last 9 days, it is none. Then we have some bots, but very few. No external scripts are loaded (so no Google Analytics either).
If I open DirectAdmin (or Webalizer) and look at the bandwidth for the last 5 days, it says this:
2024 12 10 1.02 GB
2024 12 11 1.06 GB
2024 12 12 1.29 GB
2024 12 13 932.5 MB
2024 12 14 645.7 MB (yeah it is going down!)
I looked at my Apache log of yesterday and I see a lot of this:
GET / HTTP/2.0
POST / HTTP/2.0
Both from the same IP address: 185.220.101.69
The GET response is large; the POST response is not.
Normal traffic shows something like this: GET /masterclass/ HTTP/1.1 (where masterclass is a page of the website).
Here is another example of another ip address:
185.100.87.250 - - [13/Dec/2024:02:07:28 +0100] "GET / HTTP/2.0" 301 331 "domain" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.86 YaBrowser/21.3.1.186 Yowser/2.5 Safari/537.36"
185.100.87.250 - - [13/Dec/2024:04:11:32 +0100] "GET / HTTP/2.0" 301 331 "domain" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0"
185.100.87.250 - - [13/Dec/2024:04:11:33 +0100] "GET / HTTP/2.0" 200 307338 "domain" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0"
185.100.87.250 - - [13/Dec/2024:04:11:34 +0100] "POST / HTTP/2.0" 403 947 "domain" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0"
(Domain redacted for privacy reasons and I can only post 2 links)
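To see which addresses and request types actually eat the bandwidth, log lines like these can be tallied with a few lines of Python. This is a sketch that assumes the standard Apache combined log format with straight double quotes. The regular expression and the function name are mine, and the user agents in the sample are shortened.

```python
import re
from collections import defaultdict

# Assumed layout: ip ident user [time] "METHOD path PROTO" status size ...
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def bytes_by_ip_and_method(lines):
    """Sum the response sizes per (client IP, HTTP method)."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access log entries
        size = 0 if match["size"] == "-" else int(match["size"])
        totals[(match["ip"], match["method"])] += size
    return dict(totals)


# Sample lines taken from this thread, with shortened user agents.
sample = [
    '185.100.87.250 - - [13/Dec/2024:04:11:32 +0100] "GET / HTTP/2.0" 301 331 "domain" "Mozilla/5.0"',
    '185.100.87.250 - - [13/Dec/2024:04:11:33 +0100] "GET / HTTP/2.0" 200 307338 "domain" "Mozilla/5.0"',
    '185.100.87.250 - - [13/Dec/2024:04:11:34 +0100] "POST / HTTP/2.0" 403 947 "domain" "Mozilla/5.0"',
]
for key, total in sorted(bytes_by_ip_and_method(sample).items()):
    print(key, total)
```

Run over a full day of logs, this shows at a glance whether the GET of the homepage really accounts for nearly all of the bytes.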
2 different browsers?
I agree, so much traffic and only from Tor is weird, and that is why I posted this thread here.
That’s a big front page 300K. (307338/1024) WOW! That Firefox 34 must be headless. It gets no images or js or anything else.
I would install the blocklist for Tor in the meantime, until it dies down. You could tell it is working because the requests would all get a 403. Maybe that was the point of all this: the site, from what you said, had some sort of spam problem, and then someone decided to take it offline by burning through its 5GB monthly quota quickly.
I would make the front page a lot smaller and spin off the content into different pages with links on the front page. No one is going to read 300K of anything. If the point of that 300K page is just data, then I would zip it and put a link on the front page. What’s in that 300K, if you can say?
Edited later:
Let me expand a bit more about splitting that front page.
You say they hit only the front page. At about 300K per GET, roughly 3,500 hits burn 1GB, so around 17,500 hits use up the 5GB and take the site offline. If the page only returned 10K, they would need 30 times as many hits to burn through the quota.
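As a rough check on these numbers, the hit counts can be worked out directly. This sketch assumes the 5GB quota is counted as 5 × 1024³ bytes and that every hit returns exactly the 307338 bytes seen in the log; real traffic also includes headers and the small 301 and 403 responses, so treat the result as an estimate. The function name is hypothetical.

```python
def hits_to_exhaust(quota_bytes: int, bytes_per_hit: int) -> int:
    """Number of hits needed to use up the quota (rounded up)."""
    return -(-quota_bytes // bytes_per_hit)  # ceiling division


GIB = 1024 ** 3
quota = 5 * GIB  # assumption: the 5GB monthly quota in binary units

current_page = hits_to_exhaust(quota, 307338)    # 300K front page
small_page = hits_to_exhaust(quota, 10 * 1024)   # hypothetical 10K front page

print(current_page)  # prints 17469
print(small_page)    # prints 524288
print(round(small_page / current_page, 1))
```

So it is roughly 3,500 hits per GB, or about 17,000 for the whole quota, and a 10K page would push that to over half a million hits.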
You also said Google and Bing are allowed in. They also get 300k from a GET and they will probably ignore most of it. They will not index a 300k page except the title and meta data and probably a bit of text.
Where do you get 300K from? The front page is only 549.5 KB. The website had a spam problem, but that was at another hoster and the website was different. For now I am blocking Tor visitors until this issue goes away, but it has already been going on for a few months. The next step would be Cloudflare with a prove-you-are-human check, but I don’t want that either.
307338 is the number of bytes returned. 1K = 1024 bytes, so 307338/1024 ≈ 300K.
I made a mistake above: What kind of traffic to the front page? GET, PUT, HEAD?
I meant GET, POST, HEAD
Any news on this? I am really curious.