[tor-relays] Guidance on optimal Tor relay server configurations

Hi All,

Looking for guidance around running high performance Tor relays on Ubuntu.

Few questions:

  1. If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?

  2. If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?

  3. Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?

Thanks!

···

Sent with Proton Mail secure email.

Hi there “usetor”,

I am going to answer a few of your questions:

  1. “If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?”

With a hard limit of 2 relays per IPv4 address, the biggest bottleneck you will encounter is that most of Tor’s code-base is single-threaded, except for maybe onionskin decryption and compression of files.

I used to host a Tor exit node on a single IPv4 address, which was running inside an encrypted ArchLinux VM through QEMU/KVM on our colocated dedicated server.

Here is the config I used for libvirtd: https://pastebin.com/cxSicEnN

I limited the relay’s bandwidth using the following config:

BandwidthRate 75 MBits
BandwidthBurst 100 MBits

After starting up the relay for the first time and waiting 2 weeks for it to attract traffic, it was sustaining 75-90 MBit/s constantly, or around 30TB per month.
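As a quick sanity check, the claimed ~30 TB/month does follow from a sustained 75-90 Mbit/s (a small sketch; assumes a 30-day month):

```python
# Sanity check: does 75-90 Mbit/s sustained really come out to ~30 TB/month?
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month

def monthly_tb(mbit_per_s):
    """TB transferred in a 30-day month at a sustained rate in Mbit/s."""
    return mbit_per_s * 1e6 / 8 * SECONDS_PER_MONTH / 1e12

print(f"{monthly_tb(75):.1f} - {monthly_tb(90):.1f} TB per month")  # -> 24.3 - 29.2
```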

To get the maximum out of my machine, I used the following config options:

NumCPUs 4
HardwareAccel 1

The second option makes use of the CPU’s AES-NI instructions, which should be available in all Intel and AMD server CPUs made since 2011.

Even when doing 100MBit/s, hardware-accelerated AES kept the Tor process at only ~30% CPU, on an Intel Xeon E5-2620 running at just 2 GHz… without the bandwidth restrictions, I imagine it could have done 350MBit/s easily.

  2. “If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?”

Another user already calculated how much it would take to saturate 2GBit/s, so you can take it from there.
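A rough back-of-the-envelope for 10 and 20 Gbps, assuming the ~2 Gbit/s-from-8-cores estimate referenced here scales linearly, one tor instance per physical core, ~1 GB of queue memory per instance, and the 8-relays-per-IPv4 limit (all of these are my assumptions, not measurements):

```python
import math

GBIT_PER_8_CORES = 2.0  # the 2 GBit/s estimate referenced above

def sizing(target_gbit):
    """Linear scale-up from 8 cores ~= 2 Gbit/s; assumptions as stated above."""
    cores = math.ceil(target_gbit / GBIT_PER_8_CORES * 8)
    return {
        "cores": cores,               # one tor instance per core
        "ram_gb": cores,              # ~1 GB MaxMemInQueues headroom each
        "ipv4": math.ceil(cores / 8), # at most 8 relays per IPv4
    }

print(sizing(10))  # 10 Gbps
print(sizing(20))  # 20 Gbps
```

By this crude estimate, 10 Gbps wants on the order of 40 fast cores, ~40 GB of RAM and 5 IPv4 addresses, and 20 Gbps roughly double that; real throughput depends heavily on clock speed, as noted elsewhere in the thread.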

However, I disagree with the suggested memory limit of 512MB - 1024MB is okay in my opinion, but not less… you can set that with the following config option:

MaxMemInQueues 1024MB

  3. “Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?”

Look at my answer for question 2.

I also suggest using the seccomp syscall sandboxing option built into Tor:

Sandbox 1
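Pulled together, the options discussed in this reply might look like the following single-relay torrc sketch (the values are the ones quoted above, not recommendations; tune them to your own hardware and traffic budget):

```text
# torrc sketch - values as discussed above, adjust to your setup
BandwidthRate 75 MBits      # sustained cap, sized to the monthly traffic budget
BandwidthBurst 100 MBits    # short-term burst ceiling
NumCPUs 4                   # worker threads (onionskin decryption etc.)
HardwareAccel 1             # use AES-NI where the CPU supports it
MaxMemInQueues 1024 MB      # cap on queued cell memory
Sandbox 1                   # seccomp syscall sandbox (Linux only)
```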

Also, remember one very important thing: make sure that your relays are placed with a hoster, in a datacenter and in a country that is not already saturated with Tor nodes.

Lastly, thank you for running Tor nodes!

All the best,
-GH

···


Great - very helpful.

Your email said 8 physical cores but your blog link said 4 physical cores. Did you measure both?

I'll share my experience from the Fall of 2022, which also saw significant potential "DoS"-type traffic patterns; applying some common iptables rules shared here significantly reduced traffic/load.

I'm going to run some more experiments over the next few months on 10 GbE connections with different CPU core/RAM configurations to see what things look like now, and will share back.

CPU: Dual Xeon E5-2670v2 (40 cores @ 2.5 GHz)
CPU Usage: 75 load average over 15 min from htop
RAM Capacity: 64GB + 64GB Swap
RAM Usage: 55G + 14G Swap (previously maxed out 64G and needed swap added)
Tor Relays: 30, 2 per IPv4
IPv4 Addresses: 15
Time: 45 days, 9/15/2022 - 10/30/2022
Traffic: 2 PB total. Max In: 2.15 Gbps, Max Out: 2.15 Gbps
Per Day: ~44 TB (0.044 PB) = 2 PB / 45 days
Core to Tor relay ratio: 40/30 ≈ 1.3x.
RAM to Tor relay ratio: 64/30 ≈ 2.1x.

From the load average and the RAM usage - two potential conclusions:

1) CPU: 1 physical core per tor instance or 2 threads per tor instance?
2) RAM: 2GB RAM per tor instance?

Core conclusion might be similar?
RAM conclusion seems 4x more?

Linear extrapolation to 256 IPs means, at a minimum, 256 physical cores and 512GB RAM?
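The measured numbers and the extrapolation above can be cross-checked in a few lines of Python (a sketch; assumes the one-core / 2 GB-per-instance conclusions and one instance per IP for the minimum case):

```python
# Cross-check the 45-day numbers above, then extrapolate linearly to a /24.
total_bytes, days = 2e15, 45          # 2 PB over 45 days

per_day_tb = total_bytes / days / 1e12             # ~44 TB/day
avg_gbit = total_bytes * 8 / (days * 86400) / 1e9  # combined in+out average

relays, cores, ram_gb = 30, 40, 64
core_ratio = cores / relays   # ~1.33 cores per relay
ram_ratio = ram_gb / relays   # ~2.1 GB per relay

# Minimum for 256 IPs at one instance per IP, ~1 core / ~2 GB each:
ips = 256
min_cores, min_ram_gb = ips * 1, ips * 2

print(f"{per_day_tb:.1f} TB/day, {avg_gbit:.1f} Gbit/s avg, "
      f"{min_cores} cores, {min_ram_gb} GB RAM minimum")
```

The ~4.1 Gbit/s combined average is consistent with the 2.15 Gbps peaks in each direction.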

···


On Tuesday, February 4th, 2025 at 2:27 AM, bic via tor-relays <tor-relays@lists.torproject.org> wrote:

we wrote down some notes on our experience:
How to configure multiple Tor relays on the same interface with different IPs - Osservatorio Nessuno

On 2/4/25 9:41 AM, bic wrote:

> hello
>
> I have a configuration quite similar to yours and previously posted a
> similar question on the list. I'll try to summarize the responses I
> received:
>
> 1. The big bottleneck is clock speed per core, so it is quite hard to
> predict bandwidth consumption per core. In the range 1GHz-4GHz you can
> get from 6 to 40 MB/s
> 2. Run a separate tor instance for every physical core that you have
> 3. Allocate ~500MB of memory for every instance; this is quite
> empirical in my experience
> 4. Try to use a different IP for every instance; this is not mandatory,
> but if you run multiple relays on the same IP it is easier to block them
> in bulk
> 5. Make sure to configure the SrcIp of every relay to match its public IP
>
> My personal suggestion is to run experiments and share the results on the
> list/forum, along with some information on the hardware. But to put
> down some numbers:
>
> Imagine you have a good 3GHz CPU with good cache, AES support for
> crypto operations, and 8 physical cores:
>
> (n cores) * (measured bandwidth per core)
> 8 * 30 MB/s * 8 (bits/byte) = ~2 Gbit/s
>
> In the coming days I plan to publish a blog post on running this
> configuration; I hope that will be useful for you.
>
> basement-and-other-tales/
>
> On 2/3/25 5:00 PM, usetor.wtf via tor-relays wrote:

_______________________________________________
tor-relays mailing list -- tor-relays@lists.torproject.org
To unsubscribe send an email to tor-relays-leave@lists.torproject.org


Sorry, I have to correct myself, as I spread some misinformation in my previous email.

The hard limit of 2 relays per IPv4 was bumped up to 8.

There were also several typos, as I was at work when writing that e-mail, i.e. under time pressure.

I hope I could help you anyway.

Best Regards,
-GH

···

On Friday, February 7th, 2025 at 12:22 PM, George Hartley via tor-relays tor-relays@lists.torproject.org wrote:


Appreciate the details!

Some questions to better understand:

  1. Why did you limit relay bandwidth? How did you calculate the values to use for the limits?
    “BandwidthRate 75 MBits
    BandwidthBurst 100 MBits”

  2. CPU - how did you decide to only use 4 out of 6 cores?
    Why use 4 cores to 1 tor relay instead of 4 cores to 4 relays?
    “NumCPUs 4”
    “Xeon E5-2620”

  3. Max Memory - why did you set this parameter and how did you decide the value?
    I see older tickets / threads on this, ~6 years, but unsure what the latest is, i.e. https://archive.torproject.org/websites/lists.torproject.org/pipermail/tor-relays/2018-January/014014.html
    “MaxMemInQueues 1024MB”

  4. CPU Utilization - was the “~30%” the result of the bandwidth restriction, the memory restriction, or the 4-core restriction? Holding all else constant in your setup, do you know what would increase CPU utilization the most: removing the bandwidth restriction, the memory restriction, or something else?

  5. Sandbox 1 - does setting this value impact the performance, i.e. mitigation overheads, of the Tor relay?

···


On Saturday, February 8th, 2025 at 4:33 AM, George Hartley hartley_george@proton.me wrote:


Another question - what’s the optimal count of Tor relays per IP when using an IPv4 /24, i.e. roughly 256 IPs?
Looking for thoughts / guidance, as this can quickly become a costly endeavor with slow turnaround times on securing data center capacity.

Current hypothesis is around 2 Tor instances per IP across 256 IPs, i.e. 512 relays at 5 MiB/s each, needing 21 Gbps of port speed. See details below.

Option 1: Is it 8 Tor instances per IP, the current maximum? That would be 2048 total Tor instances across the 256 IPs in the /24 - 1/4 of the current ~8000 running relays (~8200 relays with measured bandwidth today)? Seems too many.
Example: At 256 IPs, 8 Tor instances per IP, and an average speed of 10 MiB/s per Tor relay, you need roughly 172 Gbps, which is much less common, especially among volunteer Tor relays.

Option 2: Is it 1 Tor instance per IP, the minimum? When Tor is blocked, it’s done by IP, so having 8 per IP is less efficient when 256 IPs are available to spread out the relays and minimize blockage - unless the full /24 gets blocked?
Example: At 256 IPs, 1 Tor instance per IP, and an average speed of 10 MiB/s per Tor relay, you need roughly 21 Gbps, which seems much more reasonable: 2 x 10 Gbps links on one node with ~256 cores, or split across 2 nodes each having 10 Gbps and 128 cores.

Option 3: Seems like the ideal would be however many instances can be utilized per the available bandwidth?

Here’s a rough sizing table (attached and inline) of Port Speed in Gbps needed depending on # of available IPs, # of Tor instances per IPv4 and Speed per Tor (MiB/s).
Legend: <= 10 Gbps is green, <= 20 Gbps is yellow, and > 20 Gbps is red.

During the Fall of 2021, I saw ~15 MiB/s per Tor Instance and now I see around ~5 MiB/s per Tor Instance (no changes on my servers other than OS and Tor updates).
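The sizing table's arithmetic can be reproduced in a few lines of Python (MiB/s converted to Gbps; the function name is mine):

```python
# Reproduce the sizing table: port speed in Gbps needed for
# (IPs) x (instances per IP) x (MiB/s per instance).
def port_gbps(ips, per_ip, mib_s):
    return ips * per_ip * mib_s * 1024 * 1024 * 8 / 1e9

print(round(port_gbps(256, 8, 10)))  # Option 1 -> 172
print(round(port_gbps(256, 1, 10)))  # Option 2 -> 21
print(round(port_gbps(256, 2, 5)))   # current hypothesis -> 21
```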

···


On Monday, February 3rd, 2025 at 8:00 AM, usetor.wtf usetor.wtf@protonmail.com wrote:



Another question - what's the most optimal count of Tor relays per IP when
using an IPv4 /24, i.e. roughly 256 IPs? Looking for thoughts / guidance as
this can quickly be a costly endeavor with slow turn around times on
securing data center capacity.

The number of IPs is unimportant.
What counts is CPU core count and network bandwidth: fast cores, and the fastest and best cooling! The higher the CPU clock speed, the more MiB/s of traffic per tor instance.
Slam 60 tor instances onto a 64-core CPU (or 120 instances on a 128-core) with a 2x10G or 2x25G card and let it run for a few weeks. Then you will see if you can create some more instances.
You also have to do DNS. PowerDNS + dnsdist is your friend at 2x10G or more.
Where do you do BGP - on the server or the router? A full-table BGP feed needs resources too. You can't fully utilize a /24 with 6x 64-core servers on a 100G router.
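For the many-instances-per-machine setups described in this thread, each tor instance needs its own DataDirectory and ports, and - as bic noted earlier - its outbound source address should match its public IP. A minimal two-instance torrc sketch (addresses are from the documentation range; paths and nicknames are placeholders):

```text
# /etc/tor/instances/relay1/torrc  (placeholder paths and addresses)
Nickname ExampleRelay1
DataDirectory /var/lib/tor-instances/relay1
Address 192.0.2.10
ORPort 192.0.2.10:443
OutboundBindAddress 192.0.2.10
# MyFamily <fingerprints of all your other relays>

# /etc/tor/instances/relay2/torrc
Nickname ExampleRelay2
DataDirectory /var/lib/tor-instances/relay2
Address 192.0.2.11
ORPort 192.0.2.11:443
OutboundBindAddress 192.0.2.11
# MyFamily <fingerprints of all your other relays>
```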

Current hypothesis is around 2 Tor Instances per 256 IPs for 512 relays at 5
MiB/s each needing 21 Gbps port speed. See details below.

Option 1: Is it 8 Tor instances per IP, the current maximum? 2048 total Tor
instances across 256 IPs in /24? 1/4 of the current ~8000 running relays
(~8200 relays bandwidth measured today)? Seems too many. Example: At 256
IPs, 8 Tor instances per IP, average speed of 10 MiB/s per Tor relay, need
roughly 172 Gbps, which is much less common, especially among volunteer Tor
relays.

Option 2: Is it 1 Tor instance per IP, the minimum amount per IP? When Tor
is blocked, it's done by IP, so have 8 per IP is less efficient when 256
are available to spread out the relays and minimize blockage, unless the
full /24 gets blocked? Example: At 256 IPs, 1 Tor instances per IP, average
speed of 10 MiB/s per Tor relay, need roughly 21 Gbps, which seems much
more reasonable using 2 x 10 Gbps links on one node with ~256 cores or
split across 2 nodes of each having 10 Gbps and 128 cores.

If you use a /24 for Tor exit traffic, it is completely blacklisted anyway. Stop doing the math ;-)


--
╰_╯ Ciao Marco!

Debian GNU/Linux

It's free software and it gives you freedom!

Not sure the image worked for everybody; attempting an inline table with the same information.

···


On Tuesday, February 18th, 2025 at 8:00 AM, usetor.wtf via tor-relays tor-relays@lists.torproject.org wrote:


How many cores - 8?
Where are you getting them from?

···

On Wed, 19 Feb 2025, 03:47 boldsuck via tor-relays, <tor-relays@lists.torproject.org> wrote:


11? Your luck. Keep us posted.

···

On Wed, 19 Feb 2025, 03:48 Gurpinder, <singaaa1983@gmail.com> wrote:

cores how many 8
where are you getting them from ?

On Wed, 19 Feb 2025, 03:47 boldsuck via tor-relays, <tor-relays@lists.torproject.org> wrote:

On Tuesday, 18 February 2025 17:00 usetor.wtf via tor-relays wrote:

Another question - what’s the most optimal count of Tor relays per IP when
using an IPv4 /24, i.e. roughly 256 IPs? Looking for thoughts / guidance as
this can quickly be a costly endeavor with slow turn around times on
securing data center capacity.

The number of IPs is unimportant.
CPU cores count and network bandwidth, fast cores, the fastest and best
cooling! The higher the CPU clock speed, the more MiB/s traffic per tor
instance.
Slam 60 tor instances onto a 64-core CPU (or 120 instances on 128 core) with
2x10 or 2x25G card and let it run for a few weeks. Then you will see if you
can create some more instances.
You also have to do DNS. PowerDNS + dnsdist is your friend with 2x10G or more.
Where do you do BGP on the server or router? Full table BGP need recources
too. You can’t fully utilize a /24 with 6x 64 core servers on a 100G Router.

Current hypothesis is around 2 Tor Instances per 256 IPs for 512 relays at 5
MiB/s each needing 21 Gbps port speed. See details below.

Option 1: Is it 8 Tor instances per IP, the current maximum? 2048 total Tor
instances across 256 IPs in /24? 1/4 of the current ~8000 running relays
(~8200 relays bandwidth measured today)? Seems too many. Example: At 256
IPs, 8 Tor instances per IP, average speed of 10 MiB/s per Tor relay, need
roughly 172 Gbps, which is much less common, especially among volunteer Tor
relays.

Option 2: Is it 1 Tor instance per IP, the minimum amount per IP? When Tor
is blocked, it’s done by IP, so have 8 per IP is less efficient when 256
are available to spread out the relays and minimize blockage, unless the
full /24 gets blocked? Example: At 256 IPs, 1 Tor instances per IP, average
speed of 10 MiB/s per Tor relay, need roughly 21 Gbps, which seems much
more reasonable using 2 x 10 Gbps links on one node with ~256 cores or
split across 2 nodes of each having 10 Gbps and 128 cores.

If you use a /24 for Tor exit traffic, it is completely blacklisted anyway. Stop
doing the math :wink:

Option 3: Seems like the ideal would be however many can be utilized per
available bandwidth?

Here’s a rough sizing table (attached and inline) of Port Speed in Gbps
needed depending on # of available IPs, # of Tor instances per IPv4 and
Speed per Tor (MiB/s). Legend: <= 10 Gbps is green, <= 20 Gbps is yellow,
and > 20 Gbps is red.

During the Fall of 2021, I saw ~15 MiB/s per Tor Instance and now I see
around ~5 MiB/s per Tor Instance (no changes on my servers other than OS
and Tor updates).

Current conclusion: I’m looking at the 256, 2, 512, 5, 2560, 21 row as where
I’ll likely start. 512 is a lot of Tor instances… [image.png]

~8200 relays bandwidth measured today:
https://consensus-health.torproject.org/graphs.html

Sent with Proton Mail secure email.



╰_╯ Ciao Marco!

Debian GNU/Linux

It’s free software and it gives you freedom!
_______________________________________________
tor-relays mailing list – tor-relays@lists.torproject.org
To unsubscribe send an email to tor-relays-leave@lists.torproject.org

1.)

To not exceed 30 TB per month, as we have to pay for every TB ourselves, since the dedicated server the VM runs on is colocated.

2.)

The VM simply had 4 cores, and really, you only need 2-4; beyond that you get diminishing returns, as Tor’s main loop is still single-threaded.

3.) 1024MB was based on the actual memory used when the relay was under maximum, artificial traffic pressure (around 850MB total allocated).

4.) It was because of bandwidth limits - as said before, Tor is mostly single-threaded.

5.) No, the Sandbox is a seccomp-based syscall sandbox that also checks syscall arguments; the performance loss is absolutely negligible.

Thanks,
-GH

···

On Saturday, February 8th, 2025 at 6:17 PM, usetor.wtf usetor.wtf@protonmail.com wrote:

Appreciate the details!

Some questions to better understand:

  1. Why did you limit relay bandwidth? How did you calculate the values to use for the limits?
    “BandwidthRate 75 MBits
    BandwidthBurst 100 MBits”

  2. CPU - how did you decide to only use 4 out of 6 cores?
    Why use 4 cores to 1 tor relay instead of 4 cores to 4 relays?
    “NumCPUs 4”
    “Xeon E5-2620”

  3. Max Memory - why did you set this parameter and how did you decide the value?
    I see older tickets / threads on this, ~6 years, but unsure what the latest is, i.e. https://archive.torproject.org/websites/lists.torproject.org/pipermail/tor-relays/2018-January/014014.html
    “MaxMemInQueues 1024MB”

  4. CPU Utilization - was the “~30%” the result of the bandwidth restriction, the memory restriction, or the 4-core restriction? Holding all else constant in your setup, do you know what would increase CPU utilization the most: removing the bandwidth restriction, the memory restriction, or something else?

  5. Sandbox 1 - does setting this value impact the performance, i.e. mitigation overheads, of the Tor relay?

Sent with Proton Mail secure email.

On Saturday, February 8th, 2025 at 4:33 AM, George Hartley hartley_george@proton.me wrote:

Sorry, I have to correct myself, as I spread some misinformation in my previous email.

The hard limit of 2 relays per IPv4 was bumped up to 8.

There were also several typos, as I was at work when writing that e-mail, i.e. under time pressure.

I hope I could help you anyway.

Best Regards,
-GH

On Friday, February 7th, 2025 at 12:22 PM, George Hartley via tor-relays tor-relays@lists.torproject.org wrote:

Hi there “usetor”,

I am going to answer a few of your questions:

  1. “If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?”

With 2 relays per IPv4 address as a hard limit, the biggest bottleneck you will encounter is that most of Tor’s code-base is single-threaded, except for onionskin decryption and file compression.

I used to host a Tor exit node on a single IPv4 address, which was running inside an encrypted ArchLinux VM through QEMU/KVM on our colocated dedicated server.

Here is the config I used for libvirtd: https://pastebin.com/cxSicEnN

I had the relay bandwidth limited using the following config:

BandwidthRate 75 MBits
BandwidthBurst 100 MBits

After starting up the relay for the first time, and waiting 2 weeks for the relay to get some traffic, it was using 75-90 Mbit/s constantly, or around 30 TB per month.

To get the maximum out of my machine, I used the following config options:

NumCPUs 4
HardwareAccel 1

The second option made use of my CPU’s AES-NI instructions, which should be available in all Intel and AMD server CPUs made since 2011.

Even when doing 100 Mbit/s, with hardware-accelerated AES the Tor process only used ~30% of a core, on an Intel Xeon E5-2620 running at only 2 GHz… without the bandwidth restrictions, I imagine it could easily have done 350 Mbit/s.

  2. “If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?”

Another user already calculated how much it would take to saturate 2 Gbit/s, so you can extrapolate from there.

However, I disagree with the suggested memory limit of 512 MB; 1024 MB is okay in my opinion, but not less… you can set that with the following config option:

MaxMemInQueues 1024MB

  3. “Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?”

Look at my answer for question 2.

I also suggest using the seccomp syscall sandbox built into Tor:

Sandbox 1

Also, remember one very important thing: Make sure that your relays are located in a host, datacenter and country that is not already saturated with Tor nodes.

Lastly, thank you for running Tor nodes!

All the best,
-GH


I have to disagree with this statement by “mail@nothingtohide.nl”:

    • only run middle relays on very high clocked CPUs (4-5 Ghz).

Using hardware AES acceleration, older CPUs are fine.

For example, you can get Xeon E3-1231 v3 server CPUs (LGA1150) for around $9.99 apiece; they run at a 3.4 GHz base clock and boost up to 3.8 GHz.

In a dual-socket system, that’s 16 logical cores at 3.8 GHz.

Our colocated server’s CPU only ran at 2.00 GHz (2.50 GHz boost), yet when I ran my exit node without any rate limit, it could still do 350 Mbit/s.

I assume the higher-clocked Xeon that I mentioned could easily do 500 Mbit/s per core and relay when using hardware AES acceleration.

https://2019.www.torproject.org/docs/tor-manual.html.en#HardwareAccel

So, 16 × 500 Mbit/s = 8 Gbit/s.

If you were to deploy this exact machine, I would set NumCPUs to 2, so that compression/decompression and onionskin decryption operations won’t choke the main thread.

Also, I would rate limit each relay to 85-95% of its achievable bandwidth, to keep CPU headroom for DNS and the aforementioned operations.

I never used a local DNS resolver; instead I relied on Cloudflare DNS through systemd-resolved with DNS over TLS and DNSSEC enabled. DNSSEC is very important, otherwise resolved will cache invalid DNS lookups.
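For reference, the systemd-resolved setup described here can be sketched as a drop-in like the following (the Cloudflare endpoints are assumed from the text; option names per resolved.conf):

```
# /etc/systemd/resolved.conf.d/dot.conf — sketch of the setup described above
[Resolve]
DNS=1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com
DNSOverTLS=yes
DNSSEC=yes
```

After writing the drop-in, restart the service with `systemctl restart systemd-resolved`.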

All the best,
-GH

···

On Tuesday, February 18th, 2025 at 11:23 PM, mail— via tor-relays tor-relays@lists.torproject.org wrote:

Hi,

Many people already replied, but here are my (late) two cents.

  1. If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?

“Optimal” depends on your preferences and goals. Some examples:

  • IP address efficiency: run 8 relays per IPv4 address.

  • Use the best ports: 256 relays (443) or 512 relays (443+80).

  • Lowest kernel/system congestion: 1 locked relay per core/SMT thread combination, ideally on high clocked CPUs.

  • Easiest to manage: as few relays as possible.

  • Memory efficiency: only run middle relays on very high clocked CPUs (4-5 Ghz).

  • Cost efficiency: run many relays on 1-2 generations old Epyc CPUs with a high core count (64 or more).

There are always constraints: the hardware/CPU/memory and bandwidth/routing capacity available to you are finite. Also, the Tor Project caps bandwidth contributions at 20% of exit and 10% of overall consensus weight respectively.

With 256 IP addresses on modern hardware, it will be very hard not to run into one of these limitations long before you can make it ‘optimal’. Hardware-wise, one modern/current-gen high performance server running only exit relays will easily push more than half of the total exit bandwidth of the Tor network.

My advice would be:

  1. Get the fastest/best hardware with current-ish generation CPU IPC capabilities that fits your budget. To keep congestion under control with less complexity, a single-socket system is easier to deal with than dual socket.

(tip for NIC: if your switch/router has 10 Gb/s or 25 Gb/s ports, get some of the older Mellanox cards. They are very stable (more so than their Intel counterparts in my experience) and extremely affordable nowadays because of all the organizations that throw away their digital sovereignty and privacy of their employees/users to move to the cloud).

  2. Start with 1 Tor relay per physical core (ignoring SMT). When the Tor relays have ramped up (this takes 2-3 months for guard relays) and there is still considerable headroom on the CPU (sadly Tor runs extremely poorly at scale, so this would be my expectation), then move to 1 Tor relay per thread (SMT included).

(tip: already run/‘train’ some Tor relays with a very limited bandwidth (2 MB/s or something) parallel to your primary ones and pin them all to 1-2 cores to let them ramp up in parallel to your primary ones. This makes it much less cumbersome to scale up your Tor contribution when you need/want/can do that in the future).
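If the relays run as systemd template instances, the pinning tip could be a drop-in like this (the unit name is hypothetical):

```
# /etc/systemd/system/tor@training1.service.d/pin.conf — hypothetical unit name
[Service]
# keep the low-bandwidth 'training' relay on cores 0-1
CPUAffinity=0 1
```

Combined with something like `BandwidthRate 2 MB` in that instance’s torrc, as the tip suggests.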

  3. Assume at least 1 GB of RAM per relay on modern CPUs + 32 GB additionally for OS, DNS, networking and to have some headroom for DoS attacks. This may sound high, especially considering the advice in the Tor documentation. But on modern CPUs (especially with a high clockspeed) guard relays can use a lot more than 512 MB of RAM, especially when they are getting attacked. Middle and exit relays require less RAM.

Don’t skimp out on system memory capacity. DDR4 RDIMMs with decent clockspeeds are so cheap nowadays. For reference: we ran our smaller Tor servers (16C@3.4Ghz) with 64 GB of RAM and had to upgrade it to 128 GB because during attacks RAM usage exceeded the amount available and killed processes.

  4. If you have the IP space available, use one IPv4 address per relay and use all the good ports such as 443. If IP addresses are more scarce, it’s also not bad to run 4 or 8 relays per IP address. Especially for middle and exit relays the port doesn’t matter (much). Guard relays should ideally always run on a generally used (and generally unblocked) port.
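A sketch of two instances sharing one address on the two “good” ports (the address is a documentation placeholder):

```
# torrc of instance A
Address 192.0.2.10
ORPort 192.0.2.10:443

# torrc of instance B — same IP, second port
Address 192.0.2.10
ORPort 192.0.2.10:80
```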
  2. If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?

That greatly depends on the CPU and your configuration. I can offer 3 references based on real world examples. They all run a mix of guard/middle/exit relays.

  1. Typical low core count (16+SMT) with higher clockspeed (3.4 Ghz) saturates a 10 Gb/s connection with ~18.5 physical cores + SMT.

  2. Typical higher core count (64+SMT) with lower clockspeed (2.25 Ghz) saturates a 10 Gb/s connection with ~31.5 physical cores + SMT.

  3. Typical energy efficient/low performance CPU with low core count (16) with very low clockspeed (2.0 Ghz) used often in networking appliances saturates a 10 Gb/s connection with ~75 physical cores (note: no SMT).

The number of IP addresses required also depends on multiple factors. But I’d say you would need between one and two times as many relays as the core+SMT counts mentioned above in order to saturate 10 Gb/s. That would be 37-74, 63-126 and 75-150 relays respectively. So, at 8 relays per address, between 5 and 19 IPv4 addresses would be required at minimum, depending on CPU performance level.
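The IPv4 arithmetic above, as a quick Python check (using the 8-relays-per-address limit mentioned elsewhere in the thread):

```python
import math

def min_ipv4_needed(relays: int, relays_per_ip: int = 8) -> int:
    """Minimum IPv4 addresses for `relays` instances, at up to
    `relays_per_ip` relays per address."""
    return math.ceil(relays / relays_per_ip)

print(min_ipv4_needed(37))   # 5  — fast-core system, lower bound
print(min_ipv4_needed(150))  # 19 — slow-core system, upper bound
```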

RAM-wise, the more relays you run, the more RAM overhead you will have. So in general it’s better to run fewer relays at a higher speed each than to run many at a low clock speed. But since Tor scales so badly you need more relays anyway, so optimizing this isn’t easy in practice.

  3. Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?

Double the amount compared to 10 Gb/s.

Good luck with your Tor adventure. And let us know your findings with achieving 10 Gb/s when you get there :-).

Cheers,

tornth


Somewhat of a surprise given the 2-4 GB RAM per core/thread/relay ratio discussed in this email thread: I ran out of 256 GB RAM with 66 Tor relays (roughly a 4 GB per relay ratio).

Is something misconfigured, is this expected as part of relay ramp-up, or is this just regular relay behavior?

Summary: 64 cores / 128 threads (EPYC 7702P) running 66 Tor relays (all middle/guard), Tor version 0.4.8.14, used 256GB RAM and went to swap, on a 10 Gbps unmetered connection.

Details:

  • ~30 relays are 30 days old; within 24 hours of adding ~30 new relays, all 256 GB of RAM was used up.
  • Average advertised bandwidth is ~3 MiB/s for 33 of the relays; the other 33 are unmeasured / have no advertised bandwidth listed yet.
  • Swap was set at the default 8 GB and maxed out. Changed to 256 GB temporarily; swap usage is slowly climbing and reached 20 GB within the first hour after the increase.
  • Ubuntu 24.04.2 LTS default server install.
  • Nothing else running.
  • Will upgrade this server’s RAM to 768 GB within the next few days.

A different server, with 88 cores/threads and ~88 relays, all less than 30 days old, is hovering around 240 GB RAM, with the same average advertised bandwidth of ~3 MiB/s for half the relays; I have already upgraded its RAM to 384 GB and plan to take it to 512 GB within the next week or two. Same Ubuntu and Tor versions and software configuration.

Will keep sharing back as more relays and servers ramp up traffic.


Hi, I don’t have much time but…

Try using zswap (compressed swap), and lower your MaxMemInQueues setting as I already recommended.
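A minimal sketch of turning zswap on at runtime (these are the standard zswap sysfs knobs; run as root, and add `zswap.enabled=1` to the kernel command line to persist it across reboots):

```
# enable compressed swap caching
echo 1 > /sys/module/zswap/parameters/enabled
# cap the compressed pool at 20% of RAM
echo 20 > /sys/module/zswap/parameters/max_pool_percent
```

And in each instance’s torrc, a lower queue limit, e.g. `MaxMemInQueues 1024MB` as suggested earlier in the thread.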

Regards,
-GH


Listed general information below. What other information is helpful?

I didn’t want to log, but it seems I’ll need something to troubleshoot issues; will work on MetricsPort, Prometheus and Grafana.
From a quick glance: sorting htop by memory and restarting the 7 relays at the top frees up ~4 GB per relay (3 GB of RAM and 1 GB of swap).

66 Tor relays (all middle/guard). All less than 40 days old.

···

General Setup:
EPYC 7702P.

256GB RAM

Software Versions:
Tor version 0.4.8.14

Ubuntu OS 24.04.2

Tor configuration:
SOCKSPort 0

ControlPort xxx.xxx.xxx.xxx:xxx
HashedControlPassword xxx

ORPort xxx.xxx.xxx.xxx:xxxxx
Address xxx.xxx.xxx.xxx
OutboundBindAddress xxx.xxx.xxx.xxx

Nickname …

ContactInfo …

MyFamily …
ExitPolicy reject *:*

On Tuesday, March 18th, 2025 at 3:44 AM, mail— via tor-relays tor-relays@lists.torproject.org wrote:

Hi,

To be honest I think something might be wrong. Maybe some memory leak or another issue because 66 relays with such low bandwidth shouldn’t even come close to a memory footprint of 256 GB. We use different operating systems and most likely also slightly different relay configurations, but our relays use significantly less memory. You shouldn’t need more than 128 GB of memory for ~10 Gb/s of Tor traffic, although 256 GB is recommended for some headroom for attacks and spikes and such.

Could you share your general setup, software versions and Tor configuration? Perhaps someone on this mailing list will be able to help you. Also, using node_exporter and MetricsPort (for example with a few Grafana dashboards) would probably yield valuable information about this excessive memory footprint. For example: if the memory footprint increases linearly over time, it might be a software memory leak that requires a fix.

Cheers,

tornth

Mar 18, 2025, 10:21 by tor-relays@lists.torproject.org:

Somewhat of a surprise based on the 2-4x RAM to core/threads/relay ratio in this email thread, ran out of 256GB RAM with 66 Tor relays (roughly ~4x ratio).

Something misconfigured or this expected as part of the relay ramping up behavior or just regular relay behavior?

Summary: 64 cores / 128 threads (EPYC 7702P) running 66 Tor relays (all middle/guard), Tor version 0.4.8.14, used 256GB RAM and went to swap, on a 10 Gbps unmetered connection.

Details:

  • ~30 relays are 30 days old and within 24 hours of adding ~30 new relays, used up 256GB RAM.

  • Average Advertised Bandwidth is ~3 MiB/s per 33 relays and the other 33 are unmeasured / not listed advertised bandwidth yet.

  • Swap was set at default 8GB and maxed out. Changed to 256G temporarily. Swap usage is slowly climbing and reached 20G within first hour of increasing size.

  • Ubuntu 24.04.2 LTS default server install.

  • Nothing else running.

  • Will upgrade RAM this server to 768GB within next few days

Have a different server, with 88 cores/threads and ~88 relays, all less than 30 days old, hovering around 240GB RAM, same average advertised bandwidth of ~3 MiB/s per half the relays, but have already upgraded RAM to 384GB and plan to take it to 512GB within the next week or two. Same Ubuntu and Tor versions and software configuration.

Will keep sharing back as more relays and servers ramp up traffic.

On Tuesday, February 18th, 2025 at 11:23 PM, mail--- via tor-relays <tor-relays@lists.torproject.org> wrote:

Hi,

Many people already replied, but here are my (late) two cents.

  1. If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?

“Optimal” depends on your preferences and goals. Some examples:

  • IP address efficiency: run 8 relays per IPv4 address.

  • Use the best ports: 256 relays (443) or 512 relays (443+80).

  • Lowest kernel/system congestion: 1 locked relay per core/SMT thread combination, ideally on high clocked CPUs.

  • Easiest to manage: as few relays as possible.

  • Memory efficiency: only run middle relays on very high clocked CPUs (4-5 GHz).

  • Cost efficiency: run many relays on 1-2 generations old EPYC CPUs with a high core count (64 or more).

There are always constraints. The hardware/CPU/memory and bandwidth/routing capability available to you are probably not infinite. Also, the Tor Project caps bandwidth contributions at 20% of exit consensus weight and 10% of overall consensus weight.

With 256 IP addresses on modern hardware, it will be very hard not to run into one of these limitations long before you can make it 'optimal'. Hardware-wise, one modern/current-gen high performance server running only exit relays will easily push enough Tor traffic to do more than half of the total exit bandwidth of the Tor network.

My advice would be:

  1. Get the fastest/best hardware with current-ish generation CPU IPC capabilities you can get within your budget. To keep complexity and congestion under control, one socket is easier to deal with than a dual socket system.

(tip for NIC: if your switch/router has 10 Gb/s or 25 Gb/s ports, get some of the older Mellanox cards. They are very stable (more so than their Intel counterparts in my experience) and extremely affordable nowadays because of all the organizations that throw away their digital sovereignty and privacy of their employees/users to move to the cloud).

  2. Start with 1 Tor relay per physical core (ignoring SMT). When the Tor relays have ramped up (this takes 2-3 months for guard relays) and there is still considerable headroom on the CPU (Tor runs extremely poorly at scale sadly, so this would be my expectation), then move to 1 Tor relay per thread (SMT included).

(tip: already run/‘train’ some Tor relays with a very limited bandwidth (2 MB/s or something) parallel to your primary ones and pin them all to 1-2 cores to let them ramp up in parallel to your primary ones. This makes it much less cumbersome to scale up your Tor contribution when you need/want/can do that in the future).
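If the instances run under systemd, the pinning mentioned in the tip can be done with a per-instance drop-in; a sketch, where the unit name `tor@training1` and the core numbers are assumptions for illustration:

```
# /etc/systemd/system/tor@training1.service.d/pin.conf
[Service]
# Keep this low-bandwidth 'training' relay on cores 0-1 only
CPUAffinity=0 1
```

After adding the drop-in, run `systemctl daemon-reload` and restart the instance to apply the affinity.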

  3. Assume at least 1 GB of RAM per relay on modern CPUs, plus 32 GB additionally for OS, DNS, networking and some headroom for DoS attacks. This may sound high, especially considering the advice in the Tor documentation. But on modern CPUs (especially with a high clock speed) guard relays can use a lot more than 512 MB of RAM, especially when they are getting attacked. Middle and exit relays require less RAM.

Don’t skimp out on system memory capacity. DDR4 RDIMMs with decent clockspeeds are so cheap nowadays. For reference: we ran our smaller Tor servers (16C@3.4Ghz) with 64 GB of RAM and had to upgrade it to 128 GB because during attacks RAM usage exceeded the amount available and killed processes.

  4. If you have the IP space available, use one IPv4 address per relay and use all the good ports such as 443. If IP addresses are more scarce, it's also not bad to run 4 or 8 relays per IP address. Especially for middle and exit relays the port doesn't matter (much). Guard relays should ideally always run on a generally used (and generally unblocked) port.

  2. If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?

That greatly depends on the CPU and your configuration. I can offer 3 references based on real world examples. They all run a mix of guard/middle/exit relays.

  1. Typical low core count (16+SMT) with higher clock speed (3.4 GHz): saturates a 10 Gb/s connection with ~18.5 physical cores + SMT.

  2. Typical higher core count (64+SMT) with lower clock speed (2.25 GHz): saturates a 10 Gb/s connection with ~31.5 physical cores + SMT.

  3. Typical energy efficient/low performance CPU with low core count (16) and very low clock speed (2.0 GHz), often used in networking appliances: saturates a 10 Gb/s connection with ~75 physical cores (note: no SMT).

The number of IP addresses required also depends on multiple factors. But I'd say that you would need between one and two relays per core/SMT thread of the counts mentioned above in order to saturate 10 Gb/s. This would be 37-74, 63-126 and 75-150 relays respectively. So, at 8 relays per address, between 5 and 19 IPv4 addresses would be required at minimum, depending on CPU performance level.
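The arithmetic above can be sketched as a small shell calculation, using the first reference (~18.5 physical cores + SMT, i.e. 37 threads) and the 8-relays-per-IPv4-address limit mentioned earlier:

```shell
# Estimate relays and IPv4 addresses needed to saturate 10 Gb/s
threads=37                              # core+SMT count from reference 1
min_relays=$threads                     # 1 relay per thread
max_relays=$((threads * 2))             # up to 2 relays per thread
min_ips=$(( (min_relays + 7) / 8 ))     # ceil(min_relays / 8)
max_ips=$(( (max_relays + 7) / 8 ))     # ceil(max_relays / 8)
echo "${min_relays}-${max_relays} relays, ${min_ips}-${max_ips} IPv4 addresses"
```

For reference 1 this prints 37-74 relays and 5-10 addresses, matching the lower bound of 5 IPv4 addresses given above.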

RAM-wise, the more relays you run, the more RAM overhead you will have. So in general it's better to run fewer relays at a higher speed each than many at a low clock speed. But since Tor scales so badly, you need more relays anyway, so optimizing this isn't easy in practice.

  3. Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?

Double the amount compared to 10 Gb/s.

Good luck with your Tor adventure. And let us know your findings with achieving 10 Gb/s when you get there :-).

Cheers,

tornth

Feb 3, 2025, 18:14 by tor-relays@lists.torproject.org:

Hi All,

Looking for guidance around running high performance Tor relays on Ubuntu.

Few questions:

  1. If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?

  2. If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?

  3. Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?

Thanks!

Sent with Proton Mail secure email.

Listed general information below. What other information is helpful?

Didn't want to log, but it seems I will need something to troubleshoot issues.
Will work on MetricsPort, Prometheus and Grafana. From a quick glance
through htop sorted by memory, restarting the 7 relays at the top frees up
~4 GB of RAM per relay (3 GB in RAM and 1 GB in swap).

66 Tor relays (all middle/guard). All less than 40 days old.

I have 2x10G:
80 instances (40 guards/40 bridges) using 130G RAM

Welcome to the Internet, I suspect DDoS. Do you have ip/nftables?

With routed IPs I have no conntrack / no table filter

On other 1G servers I have dynamic NFT rules.
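As a sketch of what such dynamic rules can look like, an nftables fragment limiting the rate of new ORPort connections (the port, rate and table name are assumptions to tune, not recommendations):

```
# Hypothetical /etc/nftables.conf fragment
table inet tor_limit {
    chain input {
        type filter hook input priority filter; policy accept;
        # Drop sources opening new ORPort connections faster than 100/s
        tcp dport 443 ct state new limit rate over 100/second counter drop
    }
}
```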

Search for information about YOUR 10G network card driver, e.g.:
pre-up /sbin/ethtool commands may be required.
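For example, enlarging the NIC ring buffers is a common first tweak on 10G cards; a sketch of an /etc/network/interfaces line, where the interface name and sizes are assumptions (check your card's limits with `ethtool -g <iface>` first):

```
# Raise RX/TX ring buffers before the interface comes up
pre-up /sbin/ethtool -G enp1s0f0 rx 4096 tx 4096
```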

General Setup:
EPYC 7702P.

256GB RAM

Software Versions:
Tor version 0.4.8.14
Ubuntu OS 24.04.2

Tor configuration:
SOCKSPort 0

SocksPolicy reject *
# I'm paranoid ;-) and I don't need ControlPort
ControlPort 0

ControlPort xxx.xxx.xxx.xxx:xxx
HashedControlPassword xxx

ORPort xxx.xxx.xxx.xxx:xxxxx
Address xxx.xxx.xxx.xxx
OutboundBindAddress xxx.xxx.xxx.xxx

RelayBandwidthRate 100 MBytes
RelayBandwidthBurst 200 MBytes

Nickname ...

ContactInfo ...

MyFamily...
ExitPolicy reject *:*

Just a hint: if you have several dozen relays, you can use one shared file, e.g.:

## Include MyFamily & ContactInfo
%include /etc/tor/torrc.all
## Include Exit Policy
%include /etc/tor/torrc.exit
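To keep the shared file's MyFamily line current without hand-editing, the fingerprints of all instances can be collected by a small script; this sketch assumes Debian/Ubuntu-style /var/lib/tor-instances/*/fingerprint paths:

```shell
# Rebuild the MyFamily line from every instance's fingerprint file;
# each file contains "<nickname> <fingerprint>" on one line.
fps=$(cat /var/lib/tor-instances/*/fingerprint | awk '{print $2}' | paste -sd, -)
printf 'MyFamily %s\n' "$fps" > /etc/tor/torrc.all
```

In practice you would also merge in ContactInfo before writing the file, and reload the instances afterwards.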

···

On Tuesday, 18 March 2025 18:31 Tor at 1AEO via tor-relays wrote:


--
╰_╯ Ciao Marco!

Debian GNU/Linux

It's free software and it gives you freedom!

Those %include statements in torrc would have saved me so much time updating MyFamily! I've been using a fragile sed command...
Will start using them going forward.

Any way to tell how much of the 130 GB RAM your 40 guards are using (without accounting for the 40 bridges)? It seems much lower than the averages I'm seeing below.

Also curious, why split 40 guards and 40 bridges instead of all guards?

While DDoS is possible, it seems odd that only one node out of four would be hit by DDoS when they're all less than 30 days old and there's no discernible increase in in/out traffic at the port.

The other 3 nodes (same software setup and versions as shared below, same early-stage relays first seen <=30 days ago as guard/middle, just different hardware in different data centers) seem to average around 3-4.5 GB RAM per relay:
280GB RAM at 88 relays = 3.2 GB per relay
130GB RAM at 32 relays = 4 GB per relay
141GB RAM at 32 relays = 4.4 GB per relay

Upgraded the node with only 256 GB and the OOM issues to 768 GB RAM, and overall it seems steady at 320 GB RAM for now for the 66 relays. Since nothing is crashing now, I will come back to troubleshoot later with MetricsPort, Prometheus and Grafana, after getting everything else set up and running.

···

On Saturday, March 22nd, 2025 at 11:52 AM, boldsuck via tor-relays <tor-relays@lists.torproject.org> wrote:


_______________________________________________
tor-relays mailing list -- tor-relays@lists.torproject.org
To unsubscribe send an email to tor-relays-leave@lists.torproject.org
