Please Revert Patch “Neutering” Firefox’s use of Apple APIs for Text Recognition in Images

abc · February 6, 2024, 8:32am

I read this announcement. Please revert the few patches. It is not a bug.

Bug tor-browser#42057: Disable Platform text-recognition functionality

I believe the analysis behind removing this feature, which Mozilla included, is wrong.

By 2017 it was already known that Apple was going to start doing on-device analysis.

There was never any question of Apple pulling the stunts that marketing-oriented companies were and are pulling: sending your images and text to themselves for scrutiny, analysis, and retention, and then trying to profit from it.

2021-06-07 TheVerge.com

Apple says the feature is enabled using “deep neural networks” and “on-device intelligence,” with the latter being the company’s preferred phrasing for machine learning. (It stresses Apple’s privacy-heavy approach to AI, which focuses on processing data on-device rather than sending it to the cloud.)

2021-07-20 MacWorld.com: Live Text in macOS Monterey destroys these paid text extraction apps

In the upcoming release of macOS 12 Monterey (as well as in iOS 15 and iPadOS 15), Safari automatically recognizes text in images on a web page and in the Photos app when you’re viewing an image. You can select and copy that text. The feature requires Apple’s neural engine, available in M1 Apple silicon Macs and mobiles with an A12 Bionic chip or later, which appeared starting in some iPhones in 2018 and some iPads in 2019. You can test this out using the public beta. It does an excellent job.

2021-09-29 CNBC.com

While object recognition has been around for a while, Apple says that its implementation differs because it happens on the device, instead of on a cloud server.

2021-09-21 9to5mac.com quoting Apple:

iOS 15 uses secure on-device intelligence

Apple API Docs: Recognizing Text in Images

Add text-recognition features to your app using the Vision framework.

Apple API Docs: Detecting Objects in Still Images

You can initialize a VNImageRequestHandler from image data in the following formats:

Image data compressed or held in memory, as you might receive over a network connection. For example, photos downloaded from a website or the cloud fall into this category.

Emphasis mine.

Bug tor-browser#42057: Disable Platform text-recognition functionality

Mozilla’s text recognition API is currently macOS only and calls out to these platform APIs: Recognizing Text in Images | Apple Developer Documentation

In the future this could/should be replaced with a local in-process OCR system like tesseract ( https://github.com/tesseract-ocr/tesseract ). For now let’s neuter the global check to hard return false always and prevent all the dependent code paths from being taken.
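For readers unfamiliar with the pattern, “neutering the global check” can be pictured with a minimal Python sketch. All names here are hypothetical and this is not the actual Firefox or Tor Browser code (which is not written in Python); it only illustrates how one hard-coded false keeps every dependent code path from running.

```python
# Illustrative sketch only. Every function name below is hypothetical
# and does not correspond to real Firefox or Tor Browser code.

def platform_offers_text_recognition():
    """Stand-in for a platform capability probe (assumed to report True on macOS)."""
    return True


def is_text_recognition_supported():
    """The global check, neutered: it returns False unconditionally,
    so the platform probe above is never consulted."""
    return False


def build_image_context_menu():
    """A dependent code path: the OCR menu entry is added only if the
    global check passes, which it no longer can."""
    items = ["Copy Image", "Save Image As"]
    if is_text_recognition_supported():
        items.append("Copy Text From Image")
    return items


if __name__ == "__main__":
    print(build_image_context_menu())
```

Because every caller goes through the single check, no platform API is reached, regardless of what the operating system offers.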

I am not against open source, but a bird in the hand is worth two in the bush. This feature is here now; tesseract would be another large software project to ingest, and only after auditing it.

Under “Activity”

… we’re going to disable this (macOS-only) feature in Tor Browser because it calls out to macOS platform APIs with unknown local disk-leak defense or remote telemetry to Apple. Do you have a preference for Mullvad Browser?

As pointed out above, Apple performs this work on-device and in memory.

I read quite a few pages of Apple’s docs and see nothing even hinting that they would write anything to disk rather than point to the string in memory. It would be a waste of CPU, time, and battery power.

Where is the evidence that anything is being written to disk, let alone sent to Apple.com?

Please revert this in this alpha. Thank you.

morgan · February 6, 2024, 6:35pm

Even if we were able to audit the macOS implementation and verify it does nothing like this now, there’s no promise that Apple will continue to respect users’ privacy in this regard in the future. It has already shown its willingness to pitch somehow privacy-preserving image surveillance in the past and has demonstrated that users should not trust Apple with their data.

Any Firefox feature which hands user data in a gift box to Apple (or Microsoft, Google, etc) is a non-starter and will never be enabled in our browsers.

Now if someone were to implement or integrate an open-source, reviewable and verifiably local in-browser OCR system (i.e. using tesseract or similar) we would be happy to review the patches and offer this feature to our users.

abc · February 8, 2024, 1:54am

The WIRED article you linked to is from early August 2021. Apple was under pressure from the likes of the FBI, but public pressure caused them to back off.

Any Firefox feature which hands user data in a gift box to Apple (or Microsoft, Google, etc) is a non-starter and will never be enabled in our browsers.

Who’s arguing for that? And where is the evidence again (Apple sending scanned text to themselves)?

I don’t have the hardware/OS to try it out, but I’m sure lots of people do, and many of them run Little Snitch and/or Wireshark. If Apple were sending anything home, it would be found out and reported on almost immediately: a coup for those who wish to revive the OS wars of the ’80s and ’90s, or to make a name for themselves.

Referring to your link from 2021, I do have access to a modern iOS device and see that Apple’s “Sensitive Content Warning” is, as they describe, off by default.

Sensitive Content Warning uses on-device machine learning to analyze photos and videos. Because they're analyzed on your device, Apple doesn't receive an indication that nudity was detected and does not get access to the photos or videos as a result.

iOS 17 Settings > Privacy & Security

iOS 17 Settings > Privacy & Security > Sensitive Content Warning

Back to OCR

So instead of letting Mozilla’s hard work be used within Tor Browser (and Mozilla, last I heard, does have an interest in security and privacy), you are forcing users to save text-filled images to disk so they can extract the text with an uncrippled app. How does that make users safer if you are concerned about potentially sensitive data being saved to disk?

morgan · February 8, 2024, 8:29am

We seem to be starting from incompatible first principles. Namely, we are starting from the position that privacy (and security, but that’s a separate axis) should be protected by design, not by policy. You are never going to convince me or most anyone else here that we should just trust for-profit software companies to pinky swear to do the right thing with regard to protecting their users’ interests.

If users don’t care about protecting their privacy, then they can use another browser. We’re not going to compromise our values here and put the other users at risk who need their privacy because Apple says “yeah you can totally trust us, lol.”

FranklyFlawless · February 8, 2024, 12:33pm

Fork and maintain your own version of Tor Browser instead.