New kind of attack?

Something has been going on for a while now and I haven’t seen any mention of it. I’m wondering if anyone else is experiencing it.

The problem is huge, unreasonable spikes of outgoing packets which cause RAM to max out and eventually cause Tor to crash. The interesting part is that even when you shut down Tor and restart it a few minutes later, it starts right from where it left off, and in about a minute you’re back where you were. See below:

This one shows the RAM usage as well

The gaps are when I shut down Tor, and as you can see, the spike happens immediately after the restart even when I start Tor 5 to 7 minutes later. Shouldn’t there be a period during which Tor establishes new circuits when it restarts? Why does it pick up where it left off and continue sending data from previously established connections? Does this mean the attack is aimed directly at my specific relay and IP address?

I’m assuming all this traffic is going to one or more exit relays.

Sample log:

Excellent. Publishing server descriptor.
Jan 11 14:00:04.000 [notice] Bootstrapped 100% (done): Done
Jan 11 14:02:22.000 [notice] We're low on memory (cell queues total alloc: 4142541744 buffer total alloc: 304637952, tor compress total alloc: 43280 (zlib: 43264, zstd: 0, lzma: 0), rendezvous cache total alloc: 3465829). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Jan 11 14:02:28.000 [notice] Removed 448201792 bytes by killing 23077 circuits; 266323 circuits remain alive. Also killed 0 non-linked directory connections. Killed 1 edge connections
Jan 11 14:02:28.000 [warn] connection_edge_about_to_close(): Bug: (Harmless.) Edge connection (marked at src/core/or/circuitlist.c:2747) hasn't sent end yet? (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] tor_bug_occurred_(): Bug: src/core/or/connection_edge.c:1086: connection_edge_about_to_close: This line should not have been reached. (Future instances of this warning will be silenced.) (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug: Tor 0.4.8.10: Line unexpectedly reached at connection_edge_about_to_close at src/core/or/connection_edge.c:1086. Stack trace: (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(log_backtrace_impl+0x5b) [0x55f92817a82b] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0x18a) [0x55f928191d7a] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(connection_about_to_close_connection+0x6c) [0x55f92823711c] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(+0x6cb3e) [0x55f9280f9b3e] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(+0x6cee8) [0x55f9280f9ee8] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /lib64/libevent-2.1.so.7(+0x24958) [0x7f2b6ad1d958] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /lib64/libevent-2.1.so.7(event_base_loop+0x577) [0x7f2b6ad1f2a7] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x127) [0x55f9280fdb17] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x205) [0x55f928101b35] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(tor_main+0x4d) [0x55f928101f5d] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(main+0x1d) [0x55f9280f4cad] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /lib64/libc.so.6(+0x3feb0) [0x7f2b6a43feb0] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /lib64/libc.so.6(__libc_start_main+0x80) [0x7f2b6a43ff60] (on Tor 0.4.8.10 )
Jan 11 14:02:28.000 [warn] Bug:     /usr/bin/tor(_start+0x25) [0x55f9280f4d05] (on Tor 0.4.8.10 )
Jan 11 14:02:34.000 [notice] Performing bandwidth self-test...done.
Jan 11 14:02:58.000 [notice] We're low on memory (cell queues total alloc: 4073659392 buffer total alloc: 369604608, tor compress total alloc: 0 (zlib: 0, zstd: 0, lzma: 0), rendezvous cache total alloc: 3978040). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Jan 11 14:03:03.000 [notice] Removed 444758160 bytes by killing 23863 circuits; 275706 circuits remain alive. Also killed 0 non-linked directory connections. Killed 0 edge connections
Jan 11 14:03:47.000 [notice] We're low on memory (cell queues total alloc: 4118419008 buffer total alloc: 324532224, tor compress total alloc: 0 (zlib: 0, zstd: 0, lzma: 0), rendezvous cache total alloc: 4539033). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Jan 11 14:03:50.000 [notice] Removed 445017936 bytes by killing 22438 circuits; 287151 circuits remain alive. Also killed 0 non-linked directory connections. Killed 0 edge connections
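
For reference, those last lines point at the MaxMemInQueues torrc option, which sets the point at which Tor starts killing circuits with over-long queues. A minimal sketch of lowering it, if you want Tor to shed circuits before the box runs out of RAM (the 2 GB value is only an example, not a recommendation; by default Tor picks a value based on total memory):

# torrc (value is illustrative only)
MaxMemInQueues 2 GB

Lowering it makes Tor start killing queued circuits earlier, which may keep the process alive longer, but it doesn’t stop the attack itself.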

Thank you.


I noticed this exact behaviour for the first time last week on one of my nodes. The downtime was very minimal for me, though I can imagine stability is bad for users. I hadn’t really looked into it thus far, as it isn’t really an issue for now. I detected it through basic metrics, as you can imagine.

I’ll see later this week if I can see some useful info in the logs. Let me know if I can help.

@cozybeardev

Well, the downtime may be a lot longer than you imagine. I have set up remote monitoring for my relays and I get a message when the port is unresponsive. In my experience, even though Tor is running, it won’t accept new connections. In my case Tor is sometimes unresponsive for 10 minutes, then it accepts new connections, and of course within minutes it’s back to being unresponsive. This keeps going, sometimes for an hour or more.

In other words, you can see traffic but most of it is not new connections. It’s simply busy processing the existing connections initiated by the attack.
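
For anyone who wants to set up something similar, here is a minimal sketch of the kind of external check I mean, run from a different machine on a cron schedule. The address and ORPort are hypothetical placeholders, and the alerting is left to whatever you already use:

#!/bin/sh
# External reachability check: is the relay's ORPort still accepting TCP connections?
# 198.51.100.10 and 9001 are placeholders; adjust for your relay.
RELAY=198.51.100.10
ORPORT=9001
if ! timeout 10 nc -z "$RELAY" "$ORPORT"; then
    echo "$(date -u) ORPort $RELAY:$ORPORT unresponsive"   # hook your alerting in here
fi

Keep in mind it only tests whether the TCP handshake completes, nothing more.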


Linking the tor-relays mailing list thread to this topic too:

https://lists.torproject.org/pipermail/tor-relays/2024-January/021460.html


I had the same issue on one of my relays. I saw hundreds of short-lived connections too. I added these iptables rules and it seems to have helped, but I’m now getting much less throughput through my relay.

The throughput you were previously experiencing included a lot of garbage sent and received due to the attack, so it’s reasonable to see less usage once the attackers are blocked.

Once the authorities realize you’re doing less than you’re capable of, new users will choose your relays, new circuits will be built, and the throughput will eventually come back up. Only this time, you’ll be processing mostly legitimate connections.

I remember seeing the attack traffic coming in over connections with other relays.
In that case the attacker is anonymous and can’t be blocked externally.

Did anyone see this attack coming from specific attacker addresses, which could then be banned with a firewall?

The frequency of attacks is increasing for me. In the last two days, two different nodes were hit at around the same time, 6:00 AM UTC (one on each day).

It happened to one of my relays again, even with the iptables setup, so I guess that doesn’t prevent it.


I wonder if this has something to do with what I just experienced.

I restarted the machine which hosts a few of my instances so I could solve some VPN connectivity issues. After bootstrapping, you can see how CPU and RAM usage increase (as new circuits are created), and after roughly 2 hours there’s a big drop for no reason: no crashes, no warnings, nothing in the logs. The info in Nyx is fine. Now the usage is at 40-55%, which is unusual, as it used to be at 65-80%.

Despite the CPU usage drop, RAM is still being consumed, with no drop at all.

The iptables setup doesn’t completely prevent it but it greatly reduces the impact. We’re preventing them from creating multiple connections but they can still pack a punch with the one or two connections they’re allowed.

Remember, some of those packets are coming from other relays and we don’t want to completely ban a lot of other relays in the network.

Well, the iptables script puts a bunch of IP addresses in the block list based on their behavior, mainly their attempts to make concurrent connections. Sometimes your relay is the point of entry, and sometimes you’re just processing the packets that are coming from other relays that are under attack.
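
The general idea of limiting concurrent connections per source address can be sketched with iptables’ connlimit match; the ORPort and the limit of 2 below are placeholders, not the actual values from the script:

# Hypothetical sketch: drop new connections from any single IPv4 address
# that already has 2 connections open to the ORPort (9001 is a placeholder).
iptables -A INPUT -p tcp --dport 9001 --syn -m connlimit --connlimit-above 2 --connlimit-mask 32 -j DROP

That only caps how many parallel connections one address can hold open; it does nothing about what arrives over the connections that are still allowed.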

There’s only so much you can do with iptables. The fact that Tor is getting overloaded just by processing the packets tells me that the best way to block this attack would be at the application layer. In other words, Tor should recognize the bogus requests and simply not process them.

This is how memory leaks (or the related “heap fragmentation” problem) behave.
I suggest you restart the relay once again.

Issues are getting worse for me. Almost once per day a relay goes down. I’m considering writing a script to restart my relays preventively to reduce downtime. Is anyone else seeing the same thing?

For me the attack happens several times a month.

I already made such a thing for Windows:

using System;
using System.Configuration.Install;
using System.ServiceProcess;
using System.Threading;
using System.IO;
using System.ComponentModel;
using System.Reflection;
using System.Diagnostics;

namespace TorRestart
{
    [RunInstaller(true)]
    public class MyWindowsServiceInstaller : Installer
    {
        public MyWindowsServiceInstaller()
        {
            var processInstaller = new ServiceProcessInstaller();
            var serviceInstaller = new ServiceInstaller();

            processInstaller.Account = ServiceAccount.LocalSystem;

            serviceInstaller.DisplayName = "TorRestart";
            serviceInstaller.StartType = ServiceStartMode.Automatic;

            serviceInstaller.ServiceName = "TorRestart";
            this.Installers.Add(processInstaller);
            this.Installers.Add(serviceInstaller);
        }
    }



    class Program : ServiceBase
    {
        Thread thread;
        ManualResetEvent shutdownEvent;
        static string exeLocation;
        static string logLocation;

        private void Worker()
        {
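            // Poll the "tor" process's Private Bytes counter every 10 seconds and
            // restart the Tor service once memory use passes roughly 2.5 GB.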
            var pc = new PerformanceCounter("Process", "Private Bytes", "tor");

            for (;;)
            {
                try
                {
                    int torMB = (int)(pc.NextValue() / 1024.0 / 1024.0);
                    Log($"Tor RAM: {torMB} MB");

                    if (torMB > 2560)
                    {
                        Log($"Stopping Tor...");
                        ServiceController service = new ServiceController("tor");
                        service.Stop();
                        while (Process.GetProcessesByName("tor").Length != 0)
                            Thread.Sleep(500);
                        Log($"Starting Tor...");
                        service.Start();
                    }
                }
                catch
                {
                }

                if (shutdownEvent.WaitOne(10000))
                    break;
            }
        }

        protected override void OnStart(string[] args)
        {
            Log("Service is starting");
            base.OnStart(args);
            shutdownEvent = new ManualResetEvent(false);
            thread = new Thread(Worker);
            thread.Start();
        }

        protected override void OnStop()
        {
            Log("Service is stopping");
            base.OnStop();
            shutdownEvent.Set();
            thread.Join();
        }

        Program()
        {
            ServiceName = "TorRestart";
        }

        static void Log(string message)
        {
            File.AppendAllText(logLocation,
                $"[{DateTime.Now:dd.MM.yyyy HH:mm:ss}] {message}\r\n");
        }

        static void CurrentDomainUnhandledException(object sender,
            UnhandledExceptionEventArgs e)
        {
            Log($"{e.ExceptionObject}\r\n");
        }

        static void Main(string[] args)
        {
            exeLocation = Assembly.GetExecutingAssembly().Location;
            logLocation = Path.Combine(Path.GetDirectoryName(exeLocation), "log.txt");

            AppDomain.CurrentDomain.UnhandledException +=
                CurrentDomainUnhandledException;

            if (Environment.UserInteractive)
            {
                string parameter = string.Concat(args);
                switch (parameter)
                {
                    case "--install":
                        ManagedInstallerClass.InstallHelper(new string[] { exeLocation });
                        break;
                    case "--uninstall":
                        ManagedInstallerClass.InstallHelper(new string[] { "/u", exeLocation });
                        break;
                }
            }
            else
            {
                Run(new Program());
            }
        }
    }
}
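
For the Linux relays in this thread, a rough sketch of the same idea, assuming systemd and a unit named tor.service (the 2.5 GB threshold and 10-second interval just mirror the service above; adjust the unit name and numbers for your setup):

#!/bin/sh
# Hypothetical watchdog: restart tor when its resident memory exceeds ~2.5 GB.
# Assumes a systemd unit named tor.service; adjust for your distribution.
LIMIT_KB=$((2560 * 1024))
while true; do
    RSS_KB=$(ps -C tor -o rss= | awk '{s += $1} END {print s + 0}')
    if [ "$RSS_KB" -gt "$LIMIT_KB" ]; then
        logger -t tor-watchdog "tor RSS ${RSS_KB} kB over limit, restarting"
        systemctl restart tor.service
    fi
    sleep 10
done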

I find that restarting the service, or even the entire VM, doesn’t help. It goes right back into the bad state. I let it ride for a while and it recovered on its own after several OOM kills.


Have you used the compare.sh script to remove the relays that were blocked?

You can see it very well on some of my relay graphs lol.

https://metrics.torproject.org/rs.html#details/014326416058DCFD0965167026CBEF647409A000

https://metrics.torproject.org/rs.html#details/C7A51E46740C15DEC0535AF5560A1919CE6E5758

https://metrics.torproject.org/rs.html#details/2A134CF4E3CC5C7F77F331177791843794B96068

Is there any advice from the official Tor devs? Perhaps some config we can change? I’m seeing the persistence as well, even after a reboot. The impact on my relay health is serious at this point: I have to intervene daily, and I now have 3 relays with descriptor errors, which has never happened before. If there is any info I can collect for debugging, let me know.

Haha, this is too funny.

I think the issue has been occurring for a long while now on one of my bandwidth-restricted nodes, hence not resulting in any real issues.

The graph looks hilarious.

Relay Search (check the 6-month time period; written bytes have been steadily increasing since early this year).

Perhaps there is a solution in there → restricting bandwidth. I’m not really eager to do that, though.
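
If anyone does go that route, the usual torrc knobs are the relay bandwidth options; the numbers below are placeholders, not a recommendation:

# torrc - throttle only the relayed traffic (values are illustrative)
RelayBandwidthRate 5 MBytes
RelayBandwidthBurst 10 MBytes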
