Hi
glad I discovered this forum. You guys seem to be experts at this type of stuff. I don’t understand browser fingerprinting. I mean I get the idea but I don’t understand it at a conceptual level.
My question is, if I have a laptop that has windows 10 on it, and I go to some website, and then I change the operating system to windows 7 or ubuntu, and then I go to that same website again (with a different ip obviously), can the website owner tell this is the same device?
In theory, no. Tor Browser tries its best to make every user look the same. Websites can’t determine the exact 1:1 fingerprint of the device you’re using, so, simplifying, changing the OS moves you from “some Windows 10 Tor with JS enabled user” to “some Linux Tor with JS enabled user”.
You can check out sites like https://coveryourtracks.eff.org/ to see what your fingerprint looks like.
And here are open fingerprinting tickets.
What we have are “buckets” of Tor Browser users. We don’t know for sure how many users are in each “bucket”. We assume advanced scripts are not fooled by randomized values (the real value is still protected). We also assume advanced scripts and back-end servers are able to fuzz not-so-stable metrics (such as changing inner window dimensions or adblocking) to still link traffic, i.e. some metrics are not very stable from a fingerprinting perspective but they can help in the short term. So we expect everyone to have a static fingerprint when it comes to advanced scripts. And we assume the worst.
Some things you cannot hide, such as your OS, version, your OS architecture bits, available fonts (either you have a font or you don’t), that you are using Tor Browser, and so on. And some things we cannot lie about because they are needed as is: e.g. if you need Arabic, then we request pages in Arabic or it defeats the purpose; e.g. if your inner window’s viewport is a certain width and height, we report that, otherwise layout and other bits and bobs break. We still have defenses for some of these, such as tightening secondary languages, snapping inner window viewports to set sizes, only shipping one of each locale (such as en-US and not also en-GB, en-CA), and limiting fonts and bundling some to make everyone (per OS) as similar as possible. We can’t lie about the timezone, but we can enforce the same one for everyone - but then we have to actually use that one timezone.
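To make the viewport snapping idea a bit more concrete, here is a minimal Python sketch. The 200 px by 100 px steps and the 1400x900 cap are assumptions picked for illustration, not necessarily the values Tor Browser uses, and `snap_viewport` is a hypothetical helper name.

```python
from typing import Tuple


def snap_viewport(width: int, height: int,
                  step_w: int = 200, step_h: int = 100,
                  max_w: int = 1400, max_h: int = 900) -> Tuple[int, int]:
    """Round a real viewport down to the nearest allowed size.

    Steps and caps are illustrative assumptions, not Tor Browser's real values.
    """
    snapped_w = min(max_w, max(step_w, (width // step_w) * step_w))
    snapped_h = min(max_h, max(step_h, (height // step_h) * step_h))
    return snapped_w, snapped_h


# Two slightly different real windows end up reporting the same size.
print(snap_viewport(1366, 768))  # (1200, 700)
print(snap_viewport(1280, 720))  # (1200, 700)
```

The point is only that many real window sizes collapse into a few reported ones, which shrinks the number of possible values for that metric.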
And some things we can lie about - such as your hardwareConcurrency, or audio latency. We also lie about canvas - by totally randomizing it (and you can allow it per site for that session if you really need it: this does not compromise your fingerprint as it is only on that site, only for that Tor Browser session, it is only one metric albeit one that can yield good entropy, and for sites to link you all those sites need the exact same canvas tests - which is often not the case).
Anyway, long story short, all this means that not all Tor Browser users look the same - the idea that they do is a fallacy. Tor Browser users instead should fit into as small a number of buckets overall as possible. The goal would be to get down to as little as e.g. 4 OSes x 36 supported languages x a dozen window startup sizes x 1 timezone x 1 version etc …
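As a quick worked version of that multiplication, here is a small Python sketch using only the example figures from the post above (4, 36, a dozen, 1, 1). The metric names in the dictionary are just labels chosen for this illustration.

```python
from math import prod

# Possible values per metric, taken from the example figures in the post.
values_per_metric = {
    "os": 4,
    "language": 36,
    "window_startup_size": 12,
    "timezone": 1,
    "version": 1,
}

total_buckets = prod(values_per_metric.values())
print(total_buckets)  # 4 * 36 * 12 * 1 * 1 = 1728
```

Because the counts multiply, every extra metric that leaks even a handful of values multiplies the number of buckets, which is why reducing a metric to a single value matters so much.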
And we do that by eliminating or reducing the possible results or buckets per metric: a metric being something like your OS (windows, linux, mac or android), or your screen width, etc. We don’t mind if we split the protection between existing buckets, such as per OS, because you can’t hide that. Some we know work from testing and maths, and some we know because it’s hardcoded. And some we know need hardening.
The question is “how many overall buckets of users are there?” - and the answer is no one knows, because to get that we would need a large real-world study of one test per profile/browser. This would give us a general idea of how many buckets there are and the spread of users. The spread matters, because not all buckets are equal - someone using English on Windows 11 is going to be more prevalent than someone using Tibetan on Linux, for example.
But what we do know is that we have made it extremely hard for a Tor Browser user to stick out. With advanced scripts, the bigger the crowd, the better the protection. Imagine all those buckets, from the large numbers of English-language users on Windows 11 with 1000x1000 res down the long thin tail to the less populated buckets such as Tibetan on Linux with 900x600 res - filled up with, say, 6 million Tor Browser users. Now imagine if we had 60 million users. Suddenly those small groups of users would be better protected, as they have more users in their same bucket (i.e. the same fingerprint).
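A rough Python sketch of that effect is below. The per-million shares are made-up numbers purely for illustration (nobody knows the real spread, as said above), and `bucket_sizes` is a hypothetical helper.

```python
from typing import Dict

# Hypothetical shares, in users per million. These are NOT measured data.
SHARES_PER_MILLION = {
    "English / Windows 11 / 1000x1000": 200_000,
    "Tibetan / Linux / 900x600": 1,
}


def bucket_sizes(total_users: int,
                 shares_per_million: Dict[str, int]) -> Dict[str, int]:
    """Number of users sharing each fingerprint, using integer maths."""
    return {name: total_users * share // 1_000_000
            for name, share in shares_per_million.items()}


for total in (6_000_000, 60_000_000):
    print(total, bucket_sizes(total, SHARES_PER_MILLION))
```

Under these assumed shares the rare bucket goes from 6 users to 60 users when the crowd grows tenfold, so the people in the thin tail gain the most from a bigger user base.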
And of course, with Tor Browser, you are using the tor protocol, so your IP address (even as a fuzzy loose data point, such as using a VPN from company x, or from ISP y) is irrelevant. You are anonymous (until you tell someone)
Hope that answers your question
What if I am not using Tor browser? What if I am using firefox with vpn running in the background. Then I wipe windows 10 and install windows 7. And then I go to the same website in firefox, but change vpn location. Would the website know this is the same device?
What if I am not using Tor browser?
That depends on the script and what fingerprinting defenses you have. If you assume the worst (i.e. an advanced script that covers enough of everything), then you’re done for - see below about naive scripts. Also in your example, for starters, detecting Win10 vs Win7 is trivial due to font differences per OS release and even character fallback. You may have better success hiding some metrics (i.e. making them look the same) but this is not how it works with naive scripts. Advanced scripts require a crowd (so lots of users with the same fingerprint), and on Firefox, no matter what you do, you will NOT be in a crowd.
Read this: 3.3 Overrides [To RFP or Not] · arkenfox/user.js Wiki · GitHub
The best any browser can confidently do, excluding Tor Browser, is fool naive scripts.
A naive script is one that doesn’t detect randomized values as random, and thus the fingerprint it computes changes per visit, per session, or per execution. So an advanced script will still get you. Your VPN is also a data point. Only tor can provide anonymity, as only you know the destination and the origin of requests - everyone else is blind.
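Here is a toy Python simulation of the difference, not real fingerprinting code. It assumes the browser randomizes canvas on every read, and it uses "read everything twice and drop whatever changed" as one simple stand-in for how an advanced script might spot randomization. The metric names and helper functions are all hypothetical.

```python
import hashlib
import random

# Made-up stable metrics for the simulated browser.
STABLE_METRICS = {"os": "Windows", "timezone": "UTC", "language": "en-US"}


def read_metrics(rng: random.Random) -> dict:
    """One read of the browser's metrics; canvas is randomized on every read."""
    metrics = dict(STABLE_METRICS)
    metrics["canvas"] = rng.getrandbits(64)
    return metrics


def digest(metrics: dict) -> str:
    canonical = "|".join(f"{key}={metrics[key]}" for key in sorted(metrics))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def naive_fingerprint(rng: random.Random) -> str:
    """Hashes whatever it reads, randomized values included."""
    return digest(read_metrics(rng))


def advanced_fingerprint(rng: random.Random) -> str:
    """Reads twice and keeps only the metrics that did not change."""
    first, second = read_metrics(rng), read_metrics(rng)
    stable = {key: value for key, value in first.items() if second[key] == value}
    return digest(stable)


rng = random.Random(0)
print(naive_fingerprint(rng) == naive_fingerprint(rng))        # False
print(advanced_fingerprint(rng) == advanced_fingerprint(rng))  # True
```

The naive script sees a "new" browser on each visit, while the advanced one ignores the randomized metric and links the visits through everything else that stayed the same.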
thank you, this is a very good explanation. Now what if I am not using Tor. What happens in this situation:
Windows 10, firefox, vpn running in background. I visit some website.
I format the hard drive, install windows 7 or ubuntu. turn on vpn, connect to a different location, and visit the same website in firefox.
Can the website tell this is the same device? or they can just tell this is a Dell latitude 5540 laptop (just making up a model number, not sure if that exists) but there are thousands of Dell latitude 5540 sold so it could be any of those laptops?
but I don’t understand how they can know it’s the same device. If I am using a dell latitude 5540 laptop, wouldn’t it have the same fingerprints as any other dell latitude 5540? or every laptop that dell sells is in some way unique?
not every device has the same settings. Not every dell latitude 5540 with the same OS will look the same. Here are some examples
- different OS language install
- affects system fonts and font fallback
- affects the language, locale, formatting
- timezone - not everyone is in the same timezone
- screen/window measurements
- users can modify their task bar height and position
- they can change the system scaling
- graphics cards, gpu
- different drivers and variances in rendering: such as canvas/webgl
- system settings such as custom formatting, or enabling/disabling cleartype, or changing text sizes
- system preferences for prefers light/dark/high contrast
- system themes (can affect css colors and fonts)
- IP ranges used
And I’m only touching the surface. Once again, if you do nothing, you are unique
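To show why the same laptop model doesn't mean the same fingerprint, here is a minimal Python sketch. The attribute names and values are invented for illustration, and hashing a sorted JSON dump is just one simple way to combine metrics, not how any particular fingerprinting script does it.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Combine all attributes into one short, stable identifier."""
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Two laptops of the same (hypothetical) model, set up by different owners.
laptop_a = {
    "os_language": "en-US",
    "timezone": "America/New_York",
    "system_scaling": 1.0,
    "taskbar_height": 40,
    "prefers_dark": False,
}
laptop_b = dict(laptop_a, timezone="Europe/Berlin", system_scaling=1.25)

print(fingerprint(laptop_a) == fingerprint(laptop_b))        # False
print(fingerprint(laptop_a) == fingerprint(dict(laptop_a)))  # True
```

Changing just one or two settings gives a different identifier, and an unchanged setup gives the same one on every visit, which is what makes it usable for tracking.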
What about if you used Brave with fingerprint protection set to strict, adblock set to aggressive, and a VPN? Would that hide enough for them not to tell? The biggest giveaway would be the OS type in the user agent string, and I don’t believe there is anything you can do to spoof it.
I’ve already laid out the basic rules - see the arkenfox wiki link - only TB can defeat advanced scripts (always assume the worst-case adversary).
Brave randomizes or restricts or spoofs (and thus protects) a number of metrics, but not enough to fool advanced scripts [1]. Brave strict mode covers some high entropy metrics such as canvas, webgl, secondary languages, and limits others such as user agent, font enumeration (not to be confused with other font metrics). The user agent is low entropy. Brave does not claim, nor aim, to defeat advanced scripts
[1] Without real-world controlled tests, it is hard to determine the exact entropy, but the more protections Brave adds, the harder it becomes for advanced scripts to determine you are unique (or unique enough: you need a crowd). My gut feeling is that subpixels and other sizing differences, e.g. from the system, other font metrics, timezone names (there are over 400 possible values here) and a dozen or more other metrics would still create a high entropy fingerprint. And a VPN (or ISP) is still a fuzzy data point. Only tor will truly make your IP useless in a crowd (and Brave tor users are not a large enough crowd). And only TB covers enough metrics (for now: one day Brave may get there as a natural progression of Peter’s efforts - but this is not their goal - I occasionally chat with Peter Snyder).
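For a feel of what "entropy" means in numbers, here is a small Python sketch. It computes only an upper bound, assuming every value of a metric is equally likely, which is not true in practice (some timezones are far more common than others). The 400 comes from the timezone remark above and the 6 million from the Tor Browser user figure used earlier in the thread.

```python
from math import log2


def max_entropy_bits(possible_values: int) -> float:
    """Upper bound on a metric's entropy, assuming a uniform distribution."""
    return log2(possible_values)


def bits_to_single_out(crowd_size: int) -> float:
    """Bits of information needed to pick one browser out of a crowd."""
    return log2(crowd_size)


print(round(max_entropy_bits(400), 2))          # 8.64
print(round(bits_to_single_out(6_000_000), 2))
```

If metrics were independent their bits would add up, so a handful of metrics at several bits each can get close to the number needed to single out one browser in a crowd of that size.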
I am done answering (or rather repeating the same answer), I’ve laid out, in as simple terms as I can, the general gist of it all
I would like to add that websites can only fingerprint data that’s being exposed by the browser in one way or another. The resolution of your webcam will never be part of a fingerprint (unless you grant some website access to your webcam, of course). Using the Safest security setting in Tor Browser disables a lot of the data sources that fingerprinters can use. Among other things, it disables JavaScript, which already rules out most fingerprinting on the web, although fingerprinting without JavaScript is also possible, see https://noscriptfingerprint.com/ for an example. This fingerprinting is less effective, because they have less data to work with, and the less data they have, the more overlap there will be between different users.
The most sophisticated fingerprinting software I have found is CreepJS. It’s fun to play around with.
Thanks for your response.
How good do you think the anonymity of Tor Browser for Android is in comparison to the desktop bundle? The inability to hide screen size must add extra useful identifying information? It’s a shame the Safest level is required for full protection, since that means no JavaScript, which effectively excludes you from the usability of most websites.
Does a browser know which GPU is in the system too?