[tor-relays] Worse throughput with 0.4.8.x, on a slow CPU

Hello,

After upgrading from Tor 0.4.7.13 to 0.4.8.9, I get much worse bandwidth
numbers.

The CPU is an Atom C2338 with two cores at 1.75 GHz. Multiple Tor instances are
running to take advantage of both cores.

On the older version it gets about 80+80 Mbit/s total in+out. On the new one
the average is at most 45+45 Mbit/s. There are frequent periods where the
bandwidth drops to 5-10 Mbit/s for 3-5 seconds, while all Tor processes
continue to use 100% of both cores, then gradually climbs back up.

Does anyone notice anything similar?

Here's how it looks for a few days on 0.4.8 then a roll-back:

------------------------+-------------+-------------+---------------
    today 461.94 GiB | 470.84 GiB | 932.78 GiB | 153.98 Mbit/s ########################################
12/12/23 462.81 GiB | 475.67 GiB | 938.48 GiB | 91.12 Mbit/s #######################
12/11/23 434.72 GiB | 443.08 GiB | 877.80 GiB | 85.23 Mbit/s ######################
12/10/23 446.13 GiB | 459.56 GiB | 905.70 GiB | 87.93 Mbit/s #######################
12/09/23 464.79 GiB | 473.17 GiB | 937.96 GiB | 91.07 Mbit/s #######################
12/08/23 454.67 GiB | 463.48 GiB | 918.16 GiB | 89.14 Mbit/s #######################
12/07/23 463.82 GiB | 472.84 GiB | 936.65 GiB | 90.94 Mbit/s #######################
12/06/23 670.01 GiB | 680.55 GiB | 1.32 TiB | 131.13 Mbit/s ##################################
12/05/23 808.50 GiB | 817.32 GiB | 1.59 TiB | 157.85 Mbit/s #########################################
12/04/23 854.09 GiB | 866.43 GiB | 1.68 TiB | 167.05 Mbit/s ###########################################
12/03/23 782.20 GiB | 799.00 GiB | 1.54 TiB | 153.52 Mbit/s ########################################
12/02/23 805.18 GiB | 817.91 GiB | 1.59 TiB | 157.59 Mbit/s #########################################
12/01/23 733.80 GiB | 745.35 GiB | 1.44 TiB | 143.61 Mbit/s #####################################
11/30/23 742.48 GiB | 756.56 GiB | 1.46 TiB | 145.54 Mbit/s #####################################
11/29/23 657.90 GiB | 673.66 GiB | 1.30 TiB | 129.28 Mbit/s #################################
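
To put a rough number on the drop, the daily average rates in the table can be compared with a short script. This is only a sketch: it assumes the rows for 12/07 through 12/12 are the days on 0.4.8 and the rows for 11/29 through 12/05 are the days on 0.4.7, and it leaves out 12/06 and today as transition or partial days. That split is read off the table rather than stated explicitly.

```python
from statistics import mean

# Average-rate column (Mbit/s, in+out combined) copied from the table above.
# Assumption: 12/07-12/12 ran 0.4.8.x and 11/29-12/05 ran 0.4.7.x.
rate_048 = [91.12, 85.23, 87.93, 91.07, 89.14, 90.94]
rate_047 = [157.85, 167.05, 153.52, 157.59, 143.61, 145.54, 129.28]

avg_048 = mean(rate_048)
avg_047 = mean(rate_047)

# Relative loss of throughput on the newer version, in percent.
drop_pct = 100 * (1 - avg_048 / avg_047)

print(f"0.4.7 days: {avg_047:.1f} Mbit/s")
print(f"0.4.8 days: {avg_048:.1f} Mbit/s")
print(f"relative drop: {drop_pct:.0f}%")
```

Under those assumptions the averages come out at about 150.6 and 89.2 Mbit/s, a drop of roughly 41%.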

In general, are there any tweaks to reduce relay CPU usage on a slow processor?
I have seemingly already done most of what is possible: ethtool, iptables,
sysctl, etc.

How long before the Rust-based Tor will be ready for use on relays?

--
With respect,
Roman
_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays

Hello!

I'm not aware of any changes in that interval that should affect relays. Conflux and proof of work both arrived in that time period, but neither of these features affects relays.

If you can indeed generate this change in throughput reproducibly by switching versions, it would be useful to know more. Do you have access to a sampling profiler like linux-perf "perf top"? If you could compare both the bandwidth and the location of top CPU usage across both versions, that would be ideal.

It's possible that there's a bottleneck in system calls or cache footprint that is especially noticeable on a CPU like the Atom. That's the kind of question that may be answerable by comparing CPU counters between the two runs.
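
As a rough illustration of the kind of counter collection meant here, the snippet below only builds a `perf stat` command line for one Tor process. The PID, the 30-second window, and the choice of events are placeholder assumptions, and the helper name `perf_stat_command` is made up for this example.

```python
import shlex

def perf_stat_command(pid, seconds=30):
    """Build a `perf stat` invocation that counts events for one process.

    `perf stat -p PID -- sleep N` attaches to the given PID and stops
    counting when the `sleep` command exits.
    """
    # Event selection is an assumption; adjust to what the CPU exposes.
    events = "cycles,instructions,cache-misses,context-switches"
    return ["perf", "stat", "-e", events, "-p", str(pid),
            "--", "sleep", str(seconds)]

# 1234 is a placeholder; substitute the PID of one Tor instance.
cmd = perf_stat_command(1234)
print(shlex.join(cmd))
```

Running the printed command once under each Tor version, at a similar time of day, would give two sets of counters to compare side by side.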

We haven't done this type of profiling on C-tor recently, and we are still early in defining the specific performance characteristics that we expect from Arti. If you can help us identify the actual bottlenecks your relay is hitting, that may prove instructive for any future design choices we make in Arti!

That said, it may be better for the network to also encourage everyone to run their relays below 100% CPU load. Relays with lower load are going to be lower latency, and raw bandwidth numbers aren't necessarily what users will find the most beneficial.

Thanks for running a relay, and for any additional performance data you can gather from your system.

--beth

Hi Roman,

Yes, I am seeing something similar on 0.4.8.9 (and potentially earlier versions as well; I'm not 100% sure when it started). I upgraded to 0.4.8.10 today hoping it would go away, but I’m seeing it again. Watching in nyx (screenshot of bandwidth graph attached), I see the bandwidth briefly plummet and the tor process CPU spike, reliably every ~30 seconds. This is a guard relay running in Docker under Ubuntu Server on a Ryzen 3600 machine with 32 GB RAM. Note that when the relay restarted after the upgrade today, it didn’t do this for a while (maybe an hour or so? I wasn’t watching it the whole time), but then it started glitching every 30 seconds. Once it starts, it keeps doing this every ~30 seconds indefinitely. The relay has been running like this for weeks, maybe months.
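
For anyone who wants to check for the same pattern without eyeballing a graph, here is a minimal sketch that looks for regularly spaced dips in a per-second bandwidth series. The function names, the 50%-of-median threshold, and the sample data are all assumptions for illustration; the series below is synthetic, not a measurement from my relay.

```python
from statistics import median

def find_dips(samples, threshold=0.5):
    """Return the indices where a dip begins.

    A dip is a run of samples below threshold * median(samples);
    only the first index of each run is reported.
    """
    cutoff = threshold * median(samples)
    starts = []
    below = False
    for i, value in enumerate(samples):
        if value < cutoff and not below:
            starts.append(i)
        below = value < cutoff
    return starts

def dip_intervals(starts):
    """Spacing between consecutive dip starts, in samples."""
    return [b - a for a, b in zip(starts, starts[1:])]

# Synthetic series: steady 100 units with a 3-sample dip every 30 samples.
samples = [10.0 if t >= 30 and t % 30 < 3 else 100.0 for t in range(120)]

starts = find_dips(samples)
print(starts, dip_intervals(starts))
```

If the intervals come out nearly constant on real data, that points to something periodic inside the process and not to random network noise.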

best,

-jeff

On 12/13/23 06:15, Roman Mamedov wrote:

On the older version it gets about 80+80 Mbit total in+out. On the new one the
average is at most 45+45 Mbit. There are frequent periods where the bandwidth
drops to 5-10 Mbit for 3-5 seconds, while all Tor processes continue to use
100% of both CPUs, then gradually climbs back up.

Does anyone notice anything similar?