I started running a relay a couple of weeks ago (non-exit) and periodically check on it to see how it is going. A couple of times recently I’ve noticed it has the “overloaded” tag on its status page.
I followed the steps outlined in relay bridge overloaded to debug what is going on, and I assume (possibly wrongly) that at least the first overloaded flag was related to the DDoS on the 9th of July. Checking the metrics port output, I see that it was related to the dropped ntor onionskins metric.
The relay was again reported as overloaded today, and again only the dropped ntor onionskins seem to be the issue.
My question is simply: how much effort should I put into responding to this sort of issue, and is there anything I can do about dropped ntor onionskins?
What effect does being overloaded have on the usefulness of the relay to the network? (I get that there are multiple potential causes, which might change the answer.)
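For anyone wanting to repeat the metrics port check, here is a minimal Python sketch that computes the dropped fraction from a saved copy of the MetricsPort output. The metric name tor_relay_load_onionskins_total and its type/action labels are written from memory of the overload guide, so treat them as an assumption and compare against your own output; the sample numbers are made up.

```python
import re

# Assumed metric name; verify it against your own MetricsPort output.
METRIC_PREFIX = "tor_relay_load_onionskins_total{"

def onionskin_drop_fraction(metrics_text, handshake="ntor"):
    """Return dropped / (processed + dropped) for one handshake type.

    Returns None when no matching counters are found.
    """
    counts = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line.startswith(METRIC_PREFIX):
            continue
        # Split 'name{labels} value' at the closing brace.
        labels, _, value = line.rpartition("}")
        kind = re.search(r'type="([^"]+)"', labels)
        action = re.search(r'action="([^"]+)"', labels)
        if kind and action and kind.group(1) == handshake:
            counts[action.group(1)] = float(value)
    total = counts.get("processed", 0.0) + counts.get("dropped", 0.0)
    if total == 0:
        return None
    return counts.get("dropped", 0.0) / total

# Made-up sample in the assumed format, for illustration only.
sample = """\
tor_relay_load_onionskins_total{type="tap",action="processed"} 10
tor_relay_load_onionskins_total{type="ntor",action="processed"} 9900
tor_relay_load_onionskins_total{type="ntor",action="dropped"} 100
"""

if __name__ == "__main__":
    print(onionskin_drop_fraction(sample))
```

The counters are cumulative since the tor process started, so a fraction computed this way will not necessarily match the figure tor reports for its own assessment window.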
My choice is to wait until the DDoS stops.
Most likely, the overload will go away.
If not, then the problem can be analyzed further.
Cheers for the reply.
I’m not too concerned about the relay being in an overloaded state, since it is only ever like that for a day at most.
By the way, I see now that overloads are coming in bursts:
Jul 20 13:52:12.000 [warn] Your computer is too slow to handle this many circuit creation requests! Please consider using the MaxAdvertisedBandwidth config option or choosing a more restricted exit policy. [580255 similar message(s) suppressed in last 60 seconds]
Jul 20 15:41:15.000 [notice] General overload -> Ntor dropped (580417) fraction 36.2841% is above threshold of 0.5000%
Which means about 580k circuit creation requests in 60 seconds. So that 36% of drops almost exclusively belonged to the attacker, rather than affecting ~36% of users.
If I’m not misunderstanding something of course.
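The arithmetic behind that reading can be sketched in Python using only the two log lines quoted above. Two assumptions go into it: that each suppressed warning corresponds to one dropped circuit creation request, and that the reported fraction is dropped divided by all onionskins seen in the assessment period. Neither is confirmed here.

```python
import re

WARN = ("Jul 20 13:52:12.000 [warn] Your computer is too slow to handle this "
        "many circuit creation requests! [580255 similar message(s) "
        "suppressed in last 60 seconds]")
NOTICE = ("Jul 20 15:41:15.000 [notice] General overload -> Ntor dropped "
          "(580417) fraction 36.2841% is above threshold of 0.5000%")

def suppressed(line):
    """Return (count, seconds) from a 'similar message(s) suppressed' suffix."""
    m = re.search(r"\[(\d+) similar message\(s\) suppressed in last (\d+) seconds\]", line)
    return (int(m.group(1)), int(m.group(2))) if m else None

def ntor_overload(line):
    """Return (dropped, fraction) from a 'General overload' notice."""
    m = re.search(r"Ntor dropped \((\d+)\) fraction ([\d.]+)%", line)
    return (int(m.group(1)), float(m.group(2)) / 100) if m else None

burst_count, burst_seconds = suppressed(WARN)
dropped, fraction = ntor_overload(NOTICE)

# Assumption: one suppressed warning == one dropped request.
burst_rate = burst_count / burst_seconds
# Assumption: fraction == dropped / total onionskins in the period.
implied_total = dropped / fraction
# Share of the period's drops that fell inside the single 60-second burst.
burst_share = burst_count / dropped

if __name__ == "__main__":
    print(f"burst rate: {burst_rate:.0f} per second")
    print(f"implied total onionskins: {implied_total:.0f}")
    print(f"share of drops inside the burst: {burst_share:.4f}")
```

Under those assumptions the burst works out to roughly 9,670 dropped requests per second, and nearly all of the drops counted in the notice fall inside that one minute, which fits the idea that the drops came from a short flood rather than from steady user traffic.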
Yeah, I think the fast relays are the ones suffering the most because they have to handle a lot of these operations. My relay starts dropping at around 9000 onionskins per second.
This is different from the connection-flood DDoS because onionskin processing is apparently multithreaded, so it can hurt your relay if it does not have enough CPU power. Apart from limiting the advertised bandwidth I don’t see any other mitigation. A side effect is also that the tor process is using almost 5G of memory …
It is possible to set a limit with
NumCPUs 1
Tor actually spawns 2 more threads, resulting in 3 cores used, but it is better than nothing: I’m guaranteed to have 1 core free (my CPU is 4-core).
Interesting, but I have spare CPU in my case. I think limiting to 1 would result in even more dropped onionskins.
The Tor network is experiencing many DDoS attacks these days in the form of connection floods. One solution is to limit the number of connections, limit the memory usage, or do what @Vort said with the number of CPU cores.
I have the same issue on my middle relay. I’m using a memory limit, but unfortunately it’s not enough.
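For reference, the knobs mentioned in this thread could look like this in torrc. This is only a sketch: the option names are real torrc options, but every value is a placeholder picked for illustration and would need tuning for the relay in question.

```
# Placeholder values, not recommendations.

# Advertise less bandwidth so fewer circuits are routed through the relay.
MaxAdvertisedBandwidth 20 MBytes

# Cap the number of CPU worker threads used for onionskin processing.
NumCPUs 1

# Cap the memory tor may use for its queues before it starts freeing them.
MaxMemInQueues 2 GB
```

As noted above, lowering NumCPUs trades dropped onionskins for a free core, so it may make the overloaded flag more likely rather than less.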