Pornography itself is a slippery slope. You crave more and more explicit content and eventually get to CP.
Thank you for explaining things. I appreciate it.
I don’t think I was referring to false negatives. I was referring to false positives. A false negative would be something illegal that was not flagged.
So if a false positive is so incredibly rare - probably rarer than all false arrests combined - why not set the threshold to one or two and then involve the authorities?

I'm concerned about false positives, but if they are truly almost impossible, I say anyone having even one illegal image should be arrested.
> The point is that the false positive won't generally look anything like the target CSAM image.

Doubtful. I have experimented with many different perceptual hashing algorithms in my work, and there would always be false matches, like a picture of a cow on a green field producing the same hash as a picture of a tractor on a green field. Turn up the threshold to prevent this, and suddenly minimal cropping prevents matching of some otherwise identical pictures. It is hard to find a good balance between false positives and not missing what you want to catch, and it will never be perfect.
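For anyone who wants to see how fragile that balance is, here is a minimal sketch of one classic perceptual hash, a difference hash (dHash), in Python. The Pillow dependency, the file names, and the Hamming-distance threshold of 10 are illustrative assumptions, not any particular vendor's algorithm:

```python
# Minimal difference-hash (dHash) sketch: shrink the image, then record
# whether each pixel is brighter than its right-hand neighbour. Visually
# similar images (cow vs. tractor on the same green field) can land within
# the match threshold; cropping can push near-identical images outside it.
from PIL import Image  # assumes Pillow is installed

def dhash(path, hash_size=8):
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size))
    px = img.load()
    bits = []
    for y in range(hash_size):
        for x in range(hash_size):
            bits.append(px[x, y] > px[x + 1, y])
    return sum(1 << i for i, b in enumerate(bits) if b)

def hamming(a, b):
    return bin(a ^ b).count("1")

# Match if the two 64-bit hashes differ in at most ~10 bits. That threshold
# is the knob discussed above: lower it and minimal cropping breaks matches;
# raise it and the cow/tractor false matches come back.
# print(hamming(dhash("cow.jpg"), dhash("tractor.jpg")) <= 10)
```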
> If macOS scans your files they will be no different than Google; at that point you might as well get an Android.

Why would I install a mobile OS on my Mac? And why do you claim Google scans people's private files on their Android devices?
> What are you guys arguing about? I think that really no one who talks about false positives and all that kind of 💩 gets it. It's not about the implementation or the technology - the question is who has control over this technology (it certainly won't be the smartphone user).

I think there is nothing wrong with discussing the technical aspects of such a system. However, I would agree that such a system should be rejected even if its implementation was proven to be flawless.
Something I do not see discussed much is the involvement of NCMEC, the organization that maintains the CSAM hash database, and I find that whole situation rather shady. NCMEC is a government-funded private NGO, a quasi-agency with agency powers but no oversight or transparency requirements. They effectively have the power to declare numbers illegal, and US companies are required by law to report user content matching such hash numbers to this private(?) organization.

There is no independent auditing of the NCMEC database; only they know what is in there. Insiders have claimed the database also includes individually harmless pictures that were confiscated along with CSAM. Whether true or not, I could see some sense in that, as possessing the exact same sunset picture that was also found on some creep's hard drive could indeed be a bit suspicious - but it would also make a detection algorithm that aims to find visually similar images all the more worrisome. The Swiss federal police have complained that 90% of the warnings they get from NCMEC turn out to be irrelevant.

Last but not least, it was NCMEC who urged Apple to ignore criticism as the "screeching voices of the minority". This is an organization with a serious holier-than-thou attitude: they know what is right, and everyone else had better shut up and do as they say. Fighting CSAM is a worthy cause, but I have little trust in NCMEC or methods involving them.
> They ask everyone to take two coins from their pocket and flip them. If they flip tails twice, it's off to the hoosegow. Each flip of a fair coin has a 50% false positive rate, but the chance of two tails in a row is only 25% (0.5^2), so you'll have a 25% false positive rate. Make everyone flip 30 coins and you have a false positive rate of 0.5^30, or about 1 in a billion.
>
> When the 30th hash fails, Apple can unlock the hashes and derivative images. A human can then look, not at the actual images themselves, but at some unexplained derivative of them.

...and you have neatly demonstrated the exact false logic that led to a massive miscarriage of justice in the Sally Clark case, and which is more generally known as "The Prosecutor's Fallacy".
Specifically, you can only say "the chance of two tails in a row is only 25% (0.5^2)" when you are talking about uncorrelated events. Look at any high school probability question and they'll usually make a point of saying that it is a "fair coin" - part of what that means is that each toss is completely independent of what went before. Your specific example is kinda valid, because real-life coins are a reasonably close approximation to theoretical "fair coins" - but you have to be very, very careful applying that to a different situation where you don't know that the events are totally random and uncorrelated.
The basis of the Sally Clark error - which convicted an innocent woman and possibly contributed to her early death - was the false logic that "the probability of one sudden infant death in a family is p so the probability of two sudden infant deaths is p x p which is so infinitesimal that it must be murder" - treating the deaths as two totally random events and ignoring the possibility that there might be some genetic or environmental cause that made the second death a near certainty.
(It's a very common misconception - I once had to call out a question on a major UK maths exam where the same basic probability question got recycled year after year in a different context - it was OK while it was tossing two fair coins, or throwing a fair die, or throwing a dart (maybe) - but then one year it was the probability of both seeds planted in the same pot germinating, which is obviously dependent on shared environmental factors, yet the expected answer was still p × p, as if the two germinations were independent.)
More generally, though - if you're selecting people from a large population, then after the fact that low, low false positive chance becomes misleading. It's the difference between checking a DNA fingerprint (which also has a false positive rate) against 10 actual suspects in a case - where a match would be pretty conclusive - vs. checking it against a database of 300 million DNA fingerprints (which brings a significant chance of at least one false match).
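A quick back-of-envelope illustration of that difference. The 1-in-a-million false match rate per comparison is a number I've invented for the example, not a real figure for DNA profiling:

```python
# Chance of at least one false match, for a small suspect pool vs. a
# population-scale database, assuming a hypothetical 1-in-a-million
# false positive rate per comparison.
p = 1e-6
for n, label in [(10, "10 actual suspects"), (300_000_000, "300M-entry database")]:
    print(f"{label}: {1 - (1 - p) ** n:.5f}")
# 10 actual suspects: 0.00001   (a match is strong evidence)
# 300M-entry database: 1.00000  (at least one false match is near certain)
```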
Ok, I'm going to try to explain this a bit by example. This isn't a rigorous mathematical treatment, but hopefully gives you an idea of what I mean. Please understand this is an example involving coins and not image hashes […]
The math for the Apple scheme gets far more complicated because there are more images, more things to match, and it's not just a binary coin flip. I don't think I can make any estimates of the underlying probabilities from what's published.
According to the interwebs there are about 1.46 billion active iPhone users. So a 1 in a billion chance of a false match per user means you would expect about 1.5 false matches across the whole user base - the probability of finding at least one is roughly 77%.
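For what it's worth, the arithmetic behind that claim (the user count is the rough figure quoted above):

```python
# Expected false matches across ~1.46 billion users, if each user
# independently has a 1-in-a-billion chance of a false match.
users = 1.46e9
p = 1e-9
print(users * p)             # ~1.46 expected false matches
print(1 - (1 - p) ** users)  # ~0.77 chance of at least one
```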
> ...if you read this and walk away thinking every image has a 50% chance of a match, you're understanding it wrong.

Yet somehow your entire critique rests on that assumption...
...and the contents of a person's picture collection are clearly not uncorrelated - they're likely to have lots of photos of the same or similar subjects, if not multiple copies of the same image. If someone has one image that is generating a false match, then it is very likely that they will have a second photo that matches.
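A toy simulation of that point; the per-image false match rate, the library size, and the "every photo has a near-duplicate" model are all invented for illustration:

```python
import random

# An account is flagged at a threshold of 2 matching images.
# "independent": 1,000 distinct photos, each falsely matching with p = 1e-4.
# "with duplicates": the same library, but every photo also has a
# near-duplicate (burst shot, edited copy) that matches whenever the
# original does.
p, photos, trials = 1e-4, 1000, 200_000

def flagged(duplicates):
    hits = sum(random.random() < p for _ in range(photos))
    if duplicates:
        hits *= 2  # each matching photo brings its twin along
    return hits >= 2

for dup in (False, True):
    rate = sum(flagged(dup) for _ in range(trials)) / trials
    print("with duplicates:" if dup else "independent:", rate)
# independent: ~0.005    (needs two separately unlucky photos)
# with duplicates: ~0.095 (one unlucky photo already crosses the threshold)
```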
...another claim that just doesn't add up, because I only see two possibilities:
(a) The derivative images resemble the original images, so Apple would be able to meaningfully check that the match had some basis in reality - in which case the "hash" isn't really a hash - it would have to contain a low-res version of the image - and the claims about Apple not being able to access your images "at all" would all be false (and the hashes of known CSAM would themselves be CSAM if they could be converted into something recognisable).
(b) The derivative images are a visual representation of the hash and bear no resemblance to the original image - and comparing them proves nothing about the relationship between the two actual images. That seems plausible, because it is awfully like this widely used technique for human-friendly comparisons of long numbers:
[attached image: an OpenSSH-style visual key fingerprint ("randomart")]
...which is a quick way of visually checking that the "fingerprint" of the key on the host you are logging into is what you expected. However, all that amounts to is a more convenient way of checking that two long (256 bit or more) "fingerprints" are equal - humans are better at comparing images than long strings of numbers and letters. It does not guard against the possibility that the host has, somehow, falsely generated the expected fingerprint. In this example, the "fingerprint" is a cryptographic hash specifically designed to make that very unlikely, and which will change in response to the tiniest change in the actual key. In the Apple example we're talking about an image recognition hash specifically designed to produce the same hash for non-identical images. If the algorithm has, for some reason, generated a false match, the hashes will be equal and the derived images will be the same.
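In case it helps, here is a toy sketch of that kind of scheme: render a 64-bit hash as a small block picture. The rendering is invented for illustration, but it shows the point - two equal hashes always produce identical pictures, no matter what the source images actually contained:

```python
import hashlib

# Render a 64-bit hash as an 8x8 block picture. Identical hashes yield
# identical pictures, regardless of what the original images looked like,
# so comparing the pictures only re-checks that the hashes are equal.
def derivative_image(hash64: int) -> str:
    rows = []
    for y in range(8):
        rows.append("".join("#" if (hash64 >> (y * 8 + x)) & 1 else "."
                            for x in range(8)))
    return "\n".join(rows)

h = int.from_bytes(hashlib.sha256(b"example").digest()[:8], "big")
print(derivative_image(h))
```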
What that technique (and the '30 hits' threshold) would help guard against is random bit flips due to memory faults, power glitches and cosmic rays - which is absolutely an issue if you're dealing with data on that scale (it's why people are complaining that the Mac Pro doesn't have ECC RAM) - but is completely unrelated to the chance of a 'false positive' caused by the algorithm matching some irrelevant feature of two images.
The real worry is that, if there are rebuttals to these concerns (and only the developers can say for sure), they should have been addressed at some length in the white paper, not glossed over.
> Doubtful. I have experimented with many different perceptual hashing algorithms in my work, and there would always be false matches like a picture of a cow on a green field producing the same hash as a picture of a tractor on a green field. […]

The training here matters, not just thresholds. If the network is trained to find contextually similar images, or not trained against that, it may cluster "things in fields" together. If the network is trained to find specific images and trained away from finding similar but distinct images, as Apple does here, it will better discriminate between them, generating hashes a greater distance from the target hash. If non-target images of the types people are concerned about (generally just any image involving nudity) are in the training set of distraction images, that will train the network to differentiate on features other than simple nudity.
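To sketch what "trained away from similar but distinct images" can look like in practice - this is a generic triplet-loss setup in PyTorch, with made-up network shapes and random tensors standing in for real data, not Apple's actual NeuralHash training pipeline:

```python
import torch
import torch.nn as nn

# Tiny embedding network whose sign pattern acts as the binary hash.
class HashNet(nn.Module):
    def __init__(self, bits=96):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, bits),
        )

    def forward(self, x):
        return self.net(x)

    def hash(self, x):
        return (self.net(x) > 0).int()  # sign of each dimension = hash bit

net = HashNet()
loss_fn = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# anchor: a target image; positive: the same image cropped/re-encoded;
# negative: a contextually similar but distinct "distraction" image.
anchor = torch.randn(8, 3, 64, 64)  # random stand-ins for real batches
positive = anchor + 0.05 * torch.randn_like(anchor)
negative = torch.randn(8, 3, 64, 64)

for step in range(100):
    opt.zero_grad()
    # Pulls anchor/positive embeddings together and pushes the negative
    # away, i.e. trains the network to discriminate rather than cluster.
    loss = loss_fn(net(anchor), net(positive), net(negative))
    loss.backward()
    opt.step()
```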
Just dropping an opinion droplet in this vast sea of comments:
Whether these ideas end up implemented or not, I totally believe they come from a good place… heck, maybe this is all a diversion.

HOWEVER, companies have proven themselves time and time again to be quite unreliable at even the simplest things, including tasks that were solved decades ago (submarines imploding in 2023, anyone?).

Not for a single second do I trust that this would be a flawless implementation, in either privacy or correct flagging… not when simple things like "Hey Siri, set a timer for 15 mins" often fail, and macOS's System Settings, which had been working for 30+ years, suddenly starts having bugs.
I can totally see cases where taking a picture of a cloudy sky sends the FBI your way.
Should it be completely abandoned? I don't know; it's debatable, and I don't think so… but only if it will work properly!

I don't know what's up; it boils my blood how bad things have gotten over time. It's hard to trust anything regarding sensitive matters.
> Isn't this detecting feature already on their servers for iCloud? When this controversy came up, I understood that they wanted to move the detection process to the customers' devices, so pictures on iCloud are already being monitored.
>
> Edit: A Telegraph post talking about it: "Apple scans photos to check for child sexual abuse images, an executive has said, as tech companies come under pressure to do more to tackle the crime." (www.telegraph.co.uk)

Yes. Most cloud-based photo storage systems have clear terms of service to not host illegal material.
> Make everyone flip 30 coins and you have a false positive rate of 0.5^30, or about 1 in a billion. If it's just a lawful citizen, they have a one in a billion chance of being flagged. If the city is Dallas, then the chance that someone in the population is falsely flagged is about one in a billion false positives for a population of about a million.

Let's assume everyone's honest, and that the probability any single test is wrong is 0.5^30, and we want to determine the chance we'll get one or more false positives in a city of 1M. Then I believe this is correct:
Probability any single test is wrong (false positive, since we've assume everyone's honest) = 0.5^30
Probability any single test is correct = 1 – 0.5^30
Probability that every one of the 10^6 tests is correct = (1 – 0.5^30)^(10^6)
Probability that the above is not the case, i.e., that one or more tests is wrong = 1 – (1 – 0.5^30)^(10^6) ≈ 0.001
Thus the probability that one or more honest citizens get a false positive is actually ≈ 0.1%. With a population of 100M, the probability increases to ≈ 9%.
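Quick check of those numbers, for anyone who wants to run it:

```python
# Probability of at least one false positive among n honest users, if each
# independently trips the 30-match threshold with probability 0.5^30.
p = 0.5 ** 30
for n in (10**6, 10**8):
    print(f"n = {n:>11,}: {1 - (1 - p) ** n:.4f}")
# n =   1,000,000: 0.0009  (≈ 0.1%)
# n = 100,000,000: 0.0889  (≈ 9%)
```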
Apple's analysis suggested one in a trillion accounts would trigger a false positive. So with a billion iCloud accounts, that means a 1 in 1,000 chance of ever needing a manual check. If that estimate is off by 5 or 6 orders of magnitude, it means one person double-checking the derivative data of 30 false positives a day. That doesn't sound like too much.
> You think if Apple scanned iOS they won't scan macOS?

Sure, Apple might scan macOS. I just don't see how Android is a realistic alternative.

As for Google, you must be kidding. You do know their business is to collect your data to sell to advertisers, right? They are worth $1.6T for that. So is Facebook.
> It only would run if you uploaded your photos to iCloud, so presumably you'd have to agree to the iCloud Ts & Cs...

Someone posted info about the company behind the CSAM database: it has no transparency requirements and is a semi-government organization. They would just need to add the hashes of "problematic material" (e.g., a pic depicting the US president as a clown) to the database and send the updated version to Apple.
> ...definitely likely to want to be used by less desirable governments.

I never understood this argument. What's to stop bad governments from wanting to do this anyway? If they want to insist that Apple scan everyone's photos, they can still demand that, and Apple will likely deny the request. If Apple implements this hash scanning, they can demand Apple expand it, and Apple will likely deny the request.
> Yes. Most cloud based photo storage systems have clear terms of service to not host illegal material.

Apple wanted to move this to being on-device. So regardless of whether you agreed to the iCloud Ts & Cs, they were going to scan every photo you have or take.

Super creepy, and definitely something less desirable governments would want to use.