Why is this important? Because many people may need to transfer a file that they downloaded over Tor to an unfamiliar computer, e.g. one belonging to a friend or a family member, which will usually run a non-privacy-respecting operating system. They therefore need to know whether that operating system, assuming it does not respect privacy at all, can tell that the file was downloaded over Tor.
Is the file downloaded over Tor identical to the one downloaded over clearnet or with another browser? Can websites embed an identifier in each individual download of the file, making it possible for adversaries to detect that the file was downloaded over Tor? (Assuming Safest mode was used, i.e. no JS.)
Please provide trusted sources for your answers so people can validate it’s true.
I’m going to say that it is possible for a downloaded file to be unique as you describe because anything is possible. The site knows the user is on Tor because the exit node IPs are public.
Now if the file downloaded on Tor has the same SHA-256 checksum as the one downloaded on clearnet, then it is not unique.
Download the file twice over clearnet and twice over Tor. On Tor, use a different exit IP for each download. All 4 copies should have the same SHA-256 checksum.
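To make the comparison concrete, here is a minimal Python sketch of the check described above. It only uses the standard library; the helper names (sha256_of, all_identical) are made up for illustration, and you pass the paths of your own four downloads on the command line.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths):
    """Return (True/False, {path: digest}) for the given files."""
    digests = {str(p): sha256_of(p) for p in paths}
    return len(set(digests.values())) == 1, digests


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python compare.py clear1.bin clear2.bin tor1.bin tor2.bin
    same, digests = all_identical(sys.argv[1:])
    for name, value in digests.items():
        print(value, name)
    print("all copies identical" if same else "copies differ")
```

If all digests match, the copies are byte-for-byte identical. If they differ, that only tells you something varies between downloads, not what it means.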
A file downloaded over Tor is not, per se, different from the same file downloaded over clearnet. However, since the IPs of Tor exit nodes are public, the distributor may “mark” the file so that it is possible to figure out it was downloaded over Tor (or from a specific IP address, or browser version, or any of the many other pieces of information that your browser usually transmits). The ways the file may be marked depend on its format, but I think it is possible for most, if not all, of them.
However, the meaning of the mark would be known only to the file distributor (and those they tell): therefore, a malicious operating system should usually not be able to infer that the file was downloaded over Tor. I would consider it safe enough to redistribute the file and read it on untrusted devices, but not to execute it.
You may use the tip from BobbyB:
But keep in mind that:
the website may “mark” just the fact that the file was downloaded over Tor and not the IP address: therefore, downloading from different Tor IPs may result in two identical files, with identical “Tor mark”.
the website may instead just always embed information about the download, irrespective of whether it was done over Tor or not. For example, many scientific publishers watermark downloaded PDFs with the date and time of download and the institution providing access, so every downloaded copy has a different checksum but still does not allow the downloader to be identified from the file alone.
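If two copies do have different checksums, a rough way to see where they differ is to compare them byte by byte. The sketch below is only an illustration (differing_ranges is a made-up helper name). It assumes both copies fit in memory, and it will not tell you what a difference means: in compressed formats a small watermark can change many bytes.

```python
def differing_ranges(data_a, data_b):
    """Return (start, end) byte ranges, end exclusive, where the inputs differ."""
    ranges = []
    start = None
    common = min(len(data_a), len(data_b))
    for i in range(common):
        if data_a[i] != data_b[i]:
            if start is None:
                start = i  # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(data_a) != len(data_b):
        # Everything past the shorter input counts as a difference.
        ranges.append((common, max(len(data_a), len(data_b))))
    return ranges
```

You would read each copy with open(path, "rb").read() and pass the two byte strings in. An empty list means the copies are identical.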
My answer is based on information that is more or less “common knowledge”, so I’m not going to research specific sources to support it. They would be some web server documentation + some file format specification + some ExoneraTor documentation.
I agree with @Eldalie. A modified version of a file can be served when using Tor, since the IP addresses of exit nodes are known publicly.
It is also possible to serve a completely different web page under the same URL when Tor is used vs. not used.
It is possible to serve specific content based on requesting IP address.
I started my post with “I’m going to say that it is possible” but in reality I knew it was.
The question which came to mind was “but for what purpose”. It would be interesting for @deonna6 to give us a why so we can speculate.
If we push this enough it would be possible for the file to contain “sleeper” code (malicious??) which only gets executed from a Tor downloaded file. We have to assume this is about some rogue state condition… and what is a rogue state? To each the other is rogue.
A change to my tip: between the clearnet downloads, wait some time, like an hour or so, in case time is a factor.