Here are some unsorted references on traffic shaping or mimicry of video streams. It’s a good idea to read a few of these and examine how relevant the ideas are to your system. A good place to start would be section 6 of the “Protozoa” paper, which similarly evaluates against a classifier.
Do you mean something like: can you “spoof” a VK CDN connection, but to a foreign server? Or do you mean renting a VK IP to do that? Or something else?
The first option works. There is no check that an SNI matches the “real IP” of a service, probably because it would add too much latency to traffic.
Thank you so much for the references — really helpful.
Three papers caught my eye in particular: “Voiceover” (generative modeling for traffic schedules), “Learning to Behave” (behavioral independence), and the 2025 extended abstract (to get a sense of the current SOTA). I’ll start with these.
The core idea is statistical rather than infrastructural. The hypothesis is that if IP checks are only triggered by suspicious traffic, making the traffic pattern statistically indistinguishable from VK CDN traffic could allow it to bypass that layer entirely: the system treats it as routine traffic and never escalates to IP verification.
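To make “statistically indistinguishable” concrete, here is a minimal sketch of one possible check: a two-sample Kolmogorov-Smirnov statistic over a single feature such as packet size. The function name, the choice of feature, and the sample values are all hypothetical and chosen for illustration; a real evaluation would more likely follow the classifier-based approach used in the papers, and nothing here reflects how a censor actually scores traffic.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest gap between the empirical CDFs of the two samples:
    0.0 means the empirical distributions coincide, 1.0 means they do not
    overlap at all.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every value that appears in either sample.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


if __name__ == "__main__":
    # Placeholder packet sizes in bytes, invented for this example only.
    cover_sizes = [1200, 1350, 1400, 1400, 1448, 1448]
    shaped_sizes = [1180, 1350, 1400, 1448, 1448, 1448]
    print(ks_statistic(cover_sizes, shaped_sizes))
```

A small statistic on one feature is necessary but not sufficient: a classifier can still separate flows using timing, burst structure, or joint features, which is why the referenced papers evaluate against trained classifiers rather than a single test.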
That said, I don’t have visibility into how Russian censorship actually pipelines its decisions, so whether major services are whitelisted at the IP level regardless of traffic pattern is an open question. The effectiveness depends heavily on that internal structure.
The most realistic use case I see is using mimicry to strengthen something like WebTunnel rather than as a standalone solution.
Essentially, different “SNI” algorithms block different sites, so different traffic can be blocked differently. Most blocks are SNI-only, such as Twitter, YouTube, and Discord (probably hard to block completely because it is hosted on Cloudflare). Those kinds of blocks can be bypassed using advanced circumvention/spoofing software. Some obfs4 bridges, once found, are blocked by IP/port, and Tor guard relays are blocked the same way automatically. Incoming connections are not blocked (even if your IP is blacklisted, you can still connect to a RU IP). In-country traffic is also subject to DPI inspection and blocking.
Some WebTunnel bridges were manually SNI-blocked; it’s not clear whether IP blocking was also attempted against them.
Recently, WhatsApp was IP-blocked, as were some Telegram sites. Facebook remains accessible with SNI spoofing. There is no check that the IP and domain match. Some (foreign) ASNs get throttled, although this is inconsistent and may be lifted at random.
After the recent Telegram block, there were reports that the DPI infrastructure is overloaded, which is allowing some WhatsApp traffic to slip through.
This is very helpful, thank you. The fact that IP/SNI consistency checks are not being performed is a crucial point. This means that SNI-level cover identities function without going through the actual service infrastructure.
The WebTunnel SNI blocking example is particularly interesting. If we assume these bridges are initially flagged by traffic pattern analysis and then manually SNI blocked, statistical shaping would prevent the initial flagging, and the SNI blocking would not occur. This is precisely the gap this approach is trying to fill.
The report of DPI overload after the Telegram block also suggests that the inspection infrastructure has real capacity limits, which supports a probabilistic inspection model rather than an exhaustive one.
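A toy version of such a probabilistic model, purely as a sketch: assume each flow is independently selected for deep inspection with some fixed probability. Both the independence assumption and the parameter names are mine, not something the overload reports establish, and real sampling could well be correlated per IP or per SNI.

```python
def escape_probability(inspect_fraction, num_flows):
    """Probability that none of `num_flows` flows is selected for inspection.

    Assumes (hypothetically) that each flow is sampled independently with
    probability `inspect_fraction`, so the chance that all of them are
    skipped is (1 - inspect_fraction) ** num_flows.
    """
    if not 0.0 <= inspect_fraction <= 1.0:
        raise ValueError("inspect_fraction must be between 0 and 1")
    if num_flows < 0:
        raise ValueError("num_flows must be non-negative")
    return (1.0 - inspect_fraction) ** num_flows


if __name__ == "__main__":
    # Illustrative values only; the real inspection rate is unknown.
    for flows in (1, 10, 100):
        print(flows, escape_probability(0.05, flows))
```

Under this model the chance of never being inspected decays geometrically with the number of flows, so even a low sampling rate eventually catches a long-lived bridge unless each inspected flow also passes as ordinary traffic.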
I see, so the WebTunnel blocking was entirely manual, not pattern-triggered. This changes my perspective somewhat.
The 30x DPI upgrade is a crucial data point. Once capacity constraints are no longer an issue, the approach of using infrastructure overload as a secondary effect will no longer work. Statistical indistinguishability at the pattern level will be the only reliable long-term strategy.
Perhaps mimicking a niche service would be a better long-term strategy.
Thank you for your suggestions and feedback. Based on your advice, I’ve created an issue on Tor GitLab to continue the discussion and more formally track progress.
The system framework (bridge, client, protocol framing) is publicly available at the repository linked in the issue. Note that the GitHub repository linked in my earlier post has been made private — I had shared it temporarily, but moved the public release to GitLab and closed the GitHub one to avoid confusion. Due to dual-use concerns, the trained models are kept private, but I’m happy to invite anyone who wants to review the full implementation as a project member.
I’d appreciate your participation in the GitLab discussion!
We should be wary of their statements. The DPI is already overloaded from filtering fake zapret packets. There was a statistic somewhere that most BitTorrent traffic contains traces of packet modification.
Most likely, the focus will be on internet isolation and administrative pressure.
That’s interesting. If DPI is already suffering from noise issues, a bridge acting like a proxy for actual services might be more practical than expected.
And indeed, a 30-fold increase by 2030 seems hardly achievable.
An indirect sign of problems with traffic filtering is that the task of detecting VPNs has been assigned to businesses (Ozon, VK, etc.) through their apps. For Tor Browser, this isn’t as relevant.
Even administrative liability for VPN use is being considered, apparently in case this fails to stop their use.