Why is this important? Because many people may need to transfer a file that they downloaded over Tor to an unknown computer, e.g. that of a friend or family member, which will usually run a non-privacy-respecting operating system. They therefore need to know whether that operating system, assuming it does not respect privacy in any way, can tell that the file was downloaded over Tor.
Is the file downloaded over Tor identical to the one downloaded over clearnet or with another browser? Can websites embed an identifier in each individual download of the file, making it possible for adversaries to detect that the file was downloaded over Tor? (Assume Safest mode was used, so no JS.)
Please provide trusted sources for your answers so people can validate that it's true.
I'm going to say that it is possible for a downloaded file to be unique as you describe, because anything is possible. The site knows the user is on Tor because the exit node IPs are public.
Now, if the file downloaded over Tor has the same SHA-256 checksum as the one downloaded over clearnet, then it is not unique.
Download the file twice over clearnet and twice over Tor. On Tor, use a different exit IP for each download. All 4 copies should have the same SHA-256 checksum.
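The comparison described above can be sketched in Python as follows. This is only a minimal illustration: the function names `sha256_of` and `all_identical` are made up for this example, and the file names in the usage note are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths):
    """Return True if every file in `paths` has the same SHA-256 digest."""
    return len({sha256_of(path) for path in paths}) == 1
```

For example, `all_identical(["clearnet_1.bin", "clearnet_2.bin", "tor_1.bin", "tor_2.bin"])` (hypothetical file names) would return `True` only if all four copies are byte-for-byte identical.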
A file downloaded over Tor is not different from the same file downloaded over clearnet, per se. However, since the IPs of Tor exit nodes are public, the distributor may "mark" the file so that it is possible to figure out it was downloaded over Tor (or from a specific IP address, or browser version, or any of the many other pieces of information that are usually transmitted by your browser). The ways the file may be marked depend on its format, but I think it is possible for most, if not all, of them.
However, the meaning of the mark would be known only to the file distributor (and those they tell): therefore, a malicious operating system should usually not be able to infer that the file was downloaded over Tor. I would consider it safe enough to redistribute the file and read it on untrusted devices, but not to execute it.
You may use the tip from BobbyB:
But keep in mind that:
the website may "mark" just the fact that the file was downloaded over Tor and not the IP address: therefore, downloading from different Tor IPs may result in two identical files, with identical "Tor marks".
the website may instead just always embed information about the download, irrespective of whether it was done over Tor or not. For example, many scientific publishers watermark downloaded PDFs with the date and time of download and the institution providing access, so every downloaded copy has a different checksum but still does not allow the downloader to be identified from the file alone.
My answer is based on information that is more or less "common knowledge", so I'm not going to research specific sources to support it. They would be some webserver documentation + some file format specification + some ExoneraTor documentation.
I agree with @Eldalie. A modified version of a file can be served when using Tor, since the IP addresses of exit nodes are known publicly.
It is also possible to serve a completely different web page under the same URL when Tor is used vs. not used.
It is possible to serve specific content based on requesting IP address.
I started my post with "I'm going to say that it is possible" but in reality I knew it was.
The question which came to mind was "but for what purpose". It would be interesting for @deonna6 to give us a why so we can speculate.
If we push this far enough, it would be possible for the file to contain "sleeper" code (malicious??) which only gets executed from a Tor-downloaded file. We have to assume this is about some rogue state condition… and what is a rogue state? To each, the other is rogue.
A change to my tip: for the clearnet download, wait some time, like an hour or so, in case time is a factor.
It is smart of you to think of the checksum solution! I also had a thought like this before posting, but there's the problem that downloading over clearnet isn't feasible (e.g. for privacy reasons; you're downloading the file over Tor in the first place so that e.g. your ISP doesn't learn you're interested in it), and downloading from two different Tor IPs doesn't rule out a mark of "Tor" without a "specific IP", as mentioned by @Eldalie:
This is great to know! Yet I request that you provide a trusted source for this statement, so that I and other people reading this can easily validate it.
I was asking if this is possible without JS, as I said:
Now, considering it is, it will be necessary to have a way of sanitizing. There is e.g. Dangerzone, which takes a possibly malicious file and gives you a file that you know is safe. But keep in mind that here we do not care about the file being malicious, only about marks, because the file is only to be transferred to another unknown computer (we will not read/execute it on our own computer) and the only goal is to ensure that the unknown computer isn't able to know the file was downloaded over Tor. The owner of the unknown computer is told that our file may be malicious and they will deal with that.
Also, note that the "file" can be a .zip (or another archive)!
So, three questions:
Is marking a file as downloaded over Tor possible without JavaScript? (because you didn't quite answer this specifically)
Is there a way of sanitizing a file from marks such as "the file was downloaded over Tor," in a way like Dangerzone or so? For example, reading the file bit by bit, finding unimportant data or marks, and creating a new file without them.
Do you have any other ideas for ultimately ending up with a file that certainly has no marks showing it was downloaded over Tor?
Is marking a file as downloaded over Tor possible without Javascript?
Yes. The site examines the IP. It is Tor. Mark the file the user wants to download.
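To make the "no JavaScript needed" point concrete, here is a minimal sketch of what such server-side logic could look like. Everything in it is an assumption for illustration: the exit list uses reserved documentation addresses, the function names are invented, and the "mark" is the case-flip idea discussed in this thread, not a technique known to be used by any particular site.

```python
import ipaddress

# Hypothetical stand-in for the public Tor exit list (documentation-only addresses).
TOR_EXIT_IPS = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("2001:db8::1"),
}


def is_tor_exit(client_ip, exit_ips=TOR_EXIT_IPS):
    """Return True if the requesting IP appears in the exit list."""
    return ipaddress.ip_address(client_ip) in exit_ips


def file_for_client(client_ip, original):
    """Return the original bytes, or a subtly marked copy for Tor clients."""
    if not is_tor_exit(client_ip):
        return original
    # Illustrative mark: change the case of one letter in the first match.
    return original.replace(b"copyright", b"Copyright", 1)
```

The decision is made entirely on the server from the connection's source address, so the browser's security level has no influence on it.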
Is there a way of sanitizing a file from marks such as "the file was downloaded over Tor," in a way like Dangerzone or so?
I only heard of Dangerzone just now. Interesting. The mark(s) could be very simple and devious. I will use the word copyright as an example, since most files/programs would contain it. The mark is changing copyright to Copyright or vice versa. It is a 1-bit change. Same idea for a zip. 0100 0011 (C) to 0110 0011 (c).
Would Dangerzone catch this? A checksum would, but then you cannot get the reference copy via clearnet.
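The one-bit example can be checked with a few lines of Python. This is just an illustrative sketch; `flip_bit` and the sample string are made up for the demonstration.

```python
import hashlib


def flip_bit(data, byte_index, bit):
    """Return a copy of `data` with a single bit toggled."""
    out = bytearray(data)
    out[byte_index] ^= 1 << bit
    return bytes(out)


original = b"Copyright 2024 Example"
# Bit 5 (value 0x20) is the only difference between ASCII 'C' and 'c'.
marked = flip_bit(original, 0, 5)

# Number of bits that differ between the first bytes of the two versions.
changed_bits = bin(original[0] ^ marked[0]).count("1")
digests_differ = (
    hashlib.sha256(original).hexdigest() != hashlib.sha256(marked).hexdigest()
)
```

Here `marked` is `b"copyright 2024 Example"`, `changed_bits` is 1, and `digests_differ` is `True`: a single flipped bit is invisible to a casual reader but still changes the checksum.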
Do you have any other ideas to ultimately have a file that certainly has no marks it was downloaded over Tor?
I do not as explained in 2.
OK, even in this restrictive environment someone must know someone in a "free" environment to get a clearnet copy and compare checksums. But wait, the word Stasi (DDR) comes to mind, where 1 in 4 people were informants, and that someone in a "free" environment could be a sleeper or plant. (Bill Haydon in Tinker Tailor Soldier Spy)
Sorry I can't answer this question. Nothing is 100% on the internet.
Then use a search engine and search for how server-side programming languages (like PHP, Python, JavaScript (Node.js), …) work. The main purpose of server-side languages is to generate dynamic content that depends on some variable data. Such data can be the requesting IP address, which is known "to the language" during the request.
Btw. the search terms you enter into a search engine are also such variable data. Do you trust me that the search engine outputs different results when you enter different search terms into it? Or do you need a trusted source for this? lol
Yes, it has nothing to do with JavaScript on the client side.
No, since nobody knows what the mark looks like.
As already mentioned, you have to download the file through clearnet to know whether it is served modified for the Tor network.
You can never be sure the file isn't modified, since the mark could be set only during specific hours or minutes.
If you want to be 100% sure, your only option is to download the file via clearnet.
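If a clearnet copy and a Tor copy are both available and their checksums differ, a byte-level comparison shows where they differ. The following is a rough sketch; `differing_offsets` is a name invented for this example, and it loads both inputs fully into memory, which is fine for small files only.

```python
def differing_offsets(a, b, limit=20):
    """Return up to `limit` byte offsets at which two byte strings differ.

    If the lengths differ, every offset past the end of the shorter
    input is also reported as a difference.
    """
    offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    shorter, longer = sorted((len(a), len(b)))
    offsets.extend(range(shorter, longer))
    return offsets[:limit]
```

An empty result only means these two copies are identical; as noted above, it says nothing about what the server might hand out at another time.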
I read a bit more about Dangerzone. It seems to be geared towards documents and images. I was coming at this more from the angle of a potentially malicious executable. Not sure how Dangerzone would or could handle this. In their docs they identify things which they cannot handle. I assume the adversaries would concentrate on these methods. I would.
It doesn't neutralize visual information, like printer tracking dots, or steganography, which could contain identifying information.
Why not use your own Onion Courier Network to send files, base64+ encoded to each other? Only your very own Tor Hidden Services are used, which you run locally.
mat2 - the metadata anonymisation toolkit 2, comes with Tails:
"mat2 is a metadata removal tool, supporting a wide range of commonly used file formats, written in python3: at its core, it's a library, used by an eponymous command-line interface, as well as several file manager extensions." - Julien (jvoisin) Voisin