Thanks, that explanation has helped me understand. While there is currently no option for querying based on country/bandwidth/uptime, you are positing that such an interface might exist in the future, and when it does, you want to make it hard for a malicious or compromised BridgeDB server to associate user identifiers (email addresses, IP addresses) and assigned bridges or query preferences. The same query privacy could protect the current limited options of transport and IPv4/IPv6.
I find your phrasing “the bridge choice of users” and “the choices of bridges are revealed to BridgeDB” strange, because it is not the user that chooses the bridge. BridgeDB/rdsys chooses the bridge (according to its own logic, which includes compartmentalizing bridge pools according to access method) and assigns it to the user. Maybe that’s what you mean. It’s true that currently BridgeDB/rdsys could record what bridges were assigned to what query identities (email addresses, source IP addresses), which are additionally used to try to rate-limit queries.
One question that comes to mind is whether a more fine-grained query protocol is a useful thing to provide. Why would I, as a bridge user, not always ask for the maximum bandwidth and maximum uptime? Why would I care about the country the bridge is in, as long as it works, except for minimizing geographical distance, which is another way of saying I want to maximize performance? Is a bridge query protocol a problem that needs solving, or is it just a problem that admits of a novel cryptographic solution?
Another question, more important than load balancing IMO, is how the proposed query protocol interacts with anti-enumeration defenses. What stops an attacker from querying for
country:HR bandwidth:0-100k, then
country:HR bandwidth:100-200k, and so on, thereby discovering all the bridges in Croatia, and then repeating the process for all the other country codes? What if an honest user’s query is too specific, and the result contains 0 bridges? If there is still some kind of anti-enumeration defense, does that failed query “burn” one of their chances to ask for bridges?
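To make the enumeration concern concrete, here is a toy sketch of the attack. Everything in it is hypothetical: `ToyDistributor`, its `query` method, the per-query cap of 3, and the bandwidth figures are invented for illustration and are not how BridgeDB/rdsys or the proposed protocol actually work. It assumes a distributor that answers filtered queries with at most a fixed number of bridges and has no other rate limit. Under those assumptions, an attacker can split the bandwidth range whenever a query hits the cap and end up with every matching bridge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    fingerprint: str
    country: str
    bandwidth_kbps: int


class ToyDistributor:
    """Hypothetical distributor with filtered queries (NOT BridgeDB/rdsys)."""

    def __init__(self, bridges, per_query=3):
        self.bridges = sorted(bridges, key=lambda b: b.fingerprint)
        self.per_query = per_query  # assumed cap on bridges per answer

    def query(self, country, bw_min, bw_max):
        # Return at most per_query bridges with bw_min <= bandwidth < bw_max.
        matches = [
            b for b in self.bridges
            if b.country == country and bw_min <= b.bandwidth_kbps < bw_max
        ]
        return matches[: self.per_query]


def enumerate_country(dist, country, lo=0, hi=1024):
    """Attacker: halve the bandwidth range whenever an answer hits the cap."""
    found, queries = set(), 0
    pending = [(lo, hi)]
    while pending:
        a, b = pending.pop()
        result = dist.query(country, a, b)
        queries += 1
        found.update(result)
        # A capped answer may be hiding more bridges, so query both halves.
        if len(result) >= dist.per_query and b - a > 1:
            mid = (a + b) // 2
            pending.append((a, mid))
            pending.append((mid, b))
    return found, queries


# Made-up population: 40 bridges in HR, 10 in DE, all with distinct bandwidths.
bridges = [Bridge(f"HR{i:02d}", "HR", 25 * i + 3) for i in range(40)]
bridges += [Bridge(f"DE{i:02d}", "DE", 90 * i + 7) for i in range(10)]

dist = ToyDistributor(bridges)
found, queries = enumerate_country(dist, "HR")
print(f"recovered {len(found)} HR bridges in {queries} queries")
```

The point of the sketch is only that a per-answer cap does not by itself limit enumeration once queries can be made arbitrarily specific. A real design would need some budget tied to the querier, which is exactly where the question of whether an empty result "burns" an honest user's chance comes in.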
I’m not trying to be challenging or provocative. You’ve done a good thing by starting a public discussion—it’s an act of bravery and honesty. I’m asking direct questions to try to get to the substance quickly. I’m ready to believe that you have good answers.
I can reiterate the recommendation to read the Lox paper from this year’s PETS. You can read the anti-censorship team’s reading group discussion of the paper here.
Regarding peer review, it is not true that blind review prevents you from identifying yourself in a discussion of your work before or during submission. You’re not meant to have to act like a secret agent. Your responsibility is to anonymize your submission; the burden falls on the reviewers not to snoop around to try to find out who the authors are. You can reassure yourself with the norms expressed by ePrint:
… authors are allowed to announce their results in public when they are in an anonymous refereeing process … Authors are allowed to give talks on their papers and submit them to existing preprint servers, which will usually be announced widely. … Anonymous submission just means that papers are submitted without author’s names and too obvious references.
I bring this up because it’s a minor problem in censorship circumvention research that research groups misunderstand details of each other’s work, or make unjustified assumptions about the problem space, in simple ways that could be alleviated by more open discussion, and I think part of the cause is unjustified fears regarding peer review. You’ve done a good thing by posting some of your research questions here, and I think you will find it improves the quality of your work.