September 2022

Vort

How much does the performance of the Rust version differ from the C version?

September 2022

QxXw4vK4PvW

Can it do stream isolation based on SOCKS5 user/password? That would let me help test it.

September 2022 ▶ QxXw4vK4PvW

nickm

Yes. Socks isolation is on by default.
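
For anyone who wants to test this: isolating on SOCKS credentials means that connections presenting different username/password pairs should not end up sharing a circuit. Below is a minimal sketch of the RFC 1929 username/password message a SOCKS5 client sends after the proxy selects that auth method. The function name `socks5_auth_request` is made up for illustration and is not part of Arti.

```rust
/// Build an RFC 1929 username/password request:
/// VER (0x01) | ULEN | UNAME | PLEN | PASSWD.
/// Returns None if either field is empty or longer than 255 bytes,
/// since the RFC allows lengths of 1 to 255 only.
/// (Hypothetical helper for illustration; not an Arti API.)
fn socks5_auth_request(user: &str, pass: &str) -> Option<Vec<u8>> {
    let (u, p) = (user.as_bytes(), pass.as_bytes());
    if u.is_empty() || p.is_empty() || u.len() > 255 || p.len() > 255 {
        return None;
    }
    let mut msg = Vec::with_capacity(3 + u.len() + p.len());
    msg.push(0x01); // version of the username/password sub-negotiation
    msg.push(u.len() as u8);
    msg.extend_from_slice(u);
    msg.push(p.len() as u8);
    msg.extend_from_slice(p);
    Some(msg)
}
```

Two test connections that send, say, `("alice", "1")` and `("bob", "1")` produce different messages, so comparing the exit addresses they observe is one way to check that they were isolated.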

September 2022 ▶ Vort

nickm

They should be roughly comparable for client uses: we’ve done preliminary testing, and gotten similar results.

(If you find cases where the Arti performance is way worse than the C tor performance, please let us know!)

There are some cases where we expect the Rust implementation to be more efficient than C for now: Arti is thoroughly multithreaded by default, whereas the C tor implementation only uses multithreading for limited calculations. There are other cases where we expect the C implementation to be more efficient: in C we have the improved RTT-based congestion control logic, which we have not yet built in Arti.

September 2022 ▶ nickm

Vort

Thank you for the answer.
I remember that TLS libraries have heavy CPU-specific optimizations in their C code,
and I was wondering whether it is possible to make such low-level optimizations in Rust.
However, I’m not entirely sure I remember everything correctly.

September 2022 ▶ Vort

nickm

@Vort said:

I remember that TLS libraries have heavy CPU-specific optimizations in their C code,
and I was wondering whether it is possible to make such low-level optimizations in Rust.

Well, by default, Arti will use your own operating system’s TLS implementation (SecureTransport, schannel, or OpenSSL), so it will get whatever optimizations that has for TLS. If you build with rustls instead, you’ll get the optimized implementations from ring. There are additional options you can set at compile time to use optimized crypto from other sources: see documentation for the arti crate for details.

All that said, though, if you’ve got a reasonable desktop or laptop environment, I’d expect CPU-bound cryptography won’t be a major performance issue for client usage. The CPU efficiency of your cryptography will only be noticeable on low-end mobile (where the CPU is pretty slow itself), or for relays or onion services (since they are processing a lot more traffic—also, Arti doesn’t support them yet).

September 2022

Vort

I doubt that the TLS implementation in my Windows 7 is usable for modern programs at all.

By the time relay support is implemented, it may be too late to change the TLS mechanisms.
But I hope the other options you mentioned will have performance comparable to the C version.

September 2022

shadykaty

Some thoughts / a bit of context about performance comparability between Rust and C, for those not familiar.

One nice thing about Rust’s compiler is that it is built on LLVM. For those not familiar, LLVM is a compiler infrastructure. It provides an intermediate form called Intermediate Representation or IR, and various optimizations which work on IR semantics. So if you can write a compiler which produces IR, you can use LLVM optimizations. This is a good thing for making new systems langs because you can use many of the same optimizations which allow C to be fast.

If you have some Rust code, and some perfectly equivalent C code, and compile them using rustc and clang, you will get the same or very close to the same machine code.

The subtle thing often missed in discussions of “X lang has C-like performance” is that equivalent code usually isn’t actually equivalent due to differences in the languages’ semantics. One of the ways Rust is different from C is that it injects runtime bounds checks for things which are only checked in C if you wrote a check yourself. So, there are potentially extra branches injected, which may be unreachable, which may impede performance. If your C isn’t bounds checking (and doesn’t need to) and your Rust is, your Rust is doing extra work and that work has a nonzero cost. But:

  1. these can be explicitly guaranteed to be unreachable, thus removing the extra code and giving you identical perf
  2. nearly all of the overhead from a check comes from failed branch predictions, which should never or almost never happen
  3. if the code in question is not executed very frequently, the average overhead over the process’s lifetime will be negligible
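
To make point 1 concrete, here is a small sketch (the function names are mine, and I have not benchmarked it) of three ways to sum the first `n` elements of a slice: indexing with the implicit per-element check, one up-front check via slicing, and an explicit unchecked access guarded by a single assertion.

```rust
/// Indexed loop: every `data[i]` carries an implicit bounds check.
/// The optimizer can often, but not always, prove it away.
fn sum_first_n_indexed(data: &[u32], n: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(data[i]); // panics if i >= data.len()
    }
    total
}

/// One check up front: `data[..n]` panics once if `n` is too large,
/// and the iterator over the sub-slice needs no per-element check.
fn sum_first_n_sliced(data: &[u32], n: usize) -> u32 {
    data[..n].iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Explicitly unchecked: the programmer guarantees the access is in range.
fn sum_first_n_unchecked(data: &[u32], n: usize) -> u32 {
    assert!(n <= data.len());
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n <= data.len(), guaranteed by the assert above.
        total = total.wrapping_add(unsafe { *data.get_unchecked(i) });
    }
    total
}
```

All three return the same value for valid input, e.g. 6 for `(&[1, 2, 3, 4], 3)`. They differ only in where the range check happens.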

Another tricky thing about the “C-like performance” we often demand from other langs is: writing performant C is actually pretty hard. You can’t just write some working C and be done; your C has to be written in a way that lets the compiler produce sane code. The relative lack of high-level abstractions in C is both a blessing and a curse. Idioms provided by a language are usually highly optimized. Lacking many high-level abstractions, C leaves many idioms up to the author, and if the author isn’t an expert, their code will be worse and less performant than if they’d used a high-level abstraction someone else wrote. So while Rust’s abstractions may be worse than expertly written C code, they may also be better than an average programmer’s C code. High-level abstractions can be a performance improvement; they aren’t necessarily a performance detriment.
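
As a small illustration (a sketch with names of my own choosing, not a benchmark): a dot product ported line by line from C next to the same thing written with iterators. The iterator version has no index to get wrong and no per-element bounds check for the compiler to remove.

```rust
/// Dot product with explicit indexing, the way one might port a C loop.
fn dot_indexed(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = 0.0;
    for i in 0..a.len() {
        acc += a[i] * b[i]; // b[i] is bounds-checked; panics if b is shorter
    }
    acc
}

/// The same computation with iterators: `zip` stops at the shorter slice,
/// so there is no index and no per-element bounds check.
fn dot_zipped(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Note the one semantic difference: with slices of unequal length, `dot_indexed` panics when `b` is shorter, while `dot_zipped` silently stops at the shorter one.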

tl;dr: As someone who has written Rust and C and has a light obsession with performant code, I would expect the rewrite to be a little better in some places, a little worse in others, and roughly equivalent on the whole. Spots with drastic performance losses can be optimized to as good as they previously were, with near certainty. While I personally don’t enjoy writing Rust and it wouldn’t be my first choice for a rewrite, I still think that on the whole this is an improvement to the Tor project.

September 2022 ▶ shadykaty

Vort

@shadykaty thanks for the clarification.
I agree with most of your explanations.
But there is one thing missing, which I implied when I mentioned TLS:
C lets you go even deeper and replace some code with parts written in assembly language.
Since not every instruction is available on every CPU, such code is usually written in several variants, wrapped with selection logic.
As far as I know, this is something compilers usually do not do on their own.
At the same time, such code may be very important, for example when it uses hardware crypto (while still letting less optimized code run on CPUs that lack such acceleration).
Because of this, I wonder whether it is possible to mix assembly code and Rust code effectively.
By the way, GCC is so good at this that it allows mixing the two even within a single function,
while MSVC (at least for x64 code) requires assembly and C functions to be in separate files.
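
For reference, the "several variants wrapped with selection logic" pattern can be written in Rust roughly like this. It is a minimal sketch using a bit-count routine rather than real crypto; the function names are illustrative, and the hardware path uses the `popcnt` intrinsic from `std::arch` together with the standard `is_x86_feature_detected!` macro.

```rust
/// Portable fallback written in plain Rust; runs on any CPU.
fn popcount_portable(words: &[u64]) -> u64 {
    words.iter().map(|w| w.count_ones() as u64).sum()
}

/// Variant compiled with the POPCNT instruction enabled.
/// It must only be called after checking that the CPU supports it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "popcnt")]
unsafe fn popcount_hw(words: &[u64]) -> u64 {
    use std::arch::x86_64::_popcnt64;
    let mut total = 0u64;
    for &w in words {
        total += _popcnt64(w as i64) as u64;
    }
    total
}

/// Selection logic: pick the accelerated variant at run time if available.
fn popcount(words: &[u64]) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("popcnt") {
            // SAFETY: we just verified that the CPU supports POPCNT.
            return unsafe { popcount_hw(words) };
        }
    }
    popcount_portable(words)
}
```

Rust also has inline assembly (`std::arch::asm!`), so assembly can sit inside a Rust function much like GCC's inline asm. The intrinsic is used here only to keep the sketch short.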

September 2022

shadykaty

100% possible. Support for FFI is crucial in any systems language and Rust is no exception. You just need to link the external file and specify the appropriate calling convention for its functions:
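
A minimal sketch of the idea (not the original poster's snippet): the declaration below binds to `abs` from the C library, which Rust's standard library already links on common platforms. A hand-written assembly routine assembled into an object file would be declared the same way; the wrapper name `c_abs` is made up for illustration.

```rust
// Declare a function that is implemented outside Rust. "C" names the
// calling convention the external code uses.
// (Toolchains older than Rust 1.82 write this block as plain `extern "C"`.)
unsafe extern "C" {
    fn abs(input: i32) -> i32;
}

/// Safe wrapper around the foreign function (illustrative name).
fn c_abs(x: i32) -> i32 {
    // SAFETY: `abs` takes and returns a plain integer and has no other
    // preconditions for the values used here.
    unsafe { abs(x) }
}
```

For code in a separate object file or static library, the usual extra step is telling the build how to link it, for example from a build script.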

September 2022

Iheartcake

Why not just use a newer C++ standard to address these old C safety problems? Or contribute to them?
I don’t get how the “not-so-modular” design has anything to do with C. Get rid of your globals then. :stuck_out_tongue:

I wouldn’t want to use any language that declares a mutable integer variable like this: let mut x = 5;
But I also despise any web languages.

September 2022

katofrobo

I think you guys made a great choice going with Rust. I know that C++ has some great features, but to be honest Rust just does stuff the right way out of the box. Rust was a good choice for a production language, especially considering the vulnerabilities that can be created in code. I think the ability to use multithreading and get it up and running fast was likely made possible by Rust, as I’ve used Rust and it does help enforce good practices with threading.

I am using Arti in proxy mode now. I just replaced C tor with it and it is working well. I feel a lot more confident that many of the simple out-of-bounds accesses, use-after-free bugs, and other problems associated with C can now be a thing of the past.