Digging into the AI ratings in Apple Photos
June 15, 2020 9:33 AM
MeFite simonw has been building tools to liberate photo metadata from Apple Photos. In addition to machine learning labels (it automatically tags photos with categories as detailed as lemur, pelican and seal) Simon also found an intriguing collection of quality scores, with names like ZPLEASANTCAMERATILTSCORE and ZSHARPLYFOCUSEDSUBJECTSCORE. Come see the most aesthetically pleasing photographs of pelicans according to Apple's fancy machine learning algorithms! [via mefi projects]
The exact technology behind Apple Photos' machine learning for photo recognition is vague, but TechCrunch learned that Apple quietly acquired computer vision startup Regaind in September 2017, and included this write-up (and guesses) on the technology at the time:
Regaind has been working on a computer vision API to analyze the content of photos. Apple added intelligent search to the Photos app on your iPhone a couple of years ago. For instance, you can search for “sunset” or “dog” to get photos of sunsets and your dog.
In order to do this, Apple analyzes your photo library when you’re sleeping. When you plug your iPhone to a charger and you’re not using your iPhone, your device is doing some computing to figure out what’s inside your photos.
Regaind goes one step further and can tell you the technical and aesthetic values of your photos. For instance, if you shoot a bunch of photos in burst mode, Regaind could automatically find the best shot and use it as the main shot in your photo library. Regaind could also hide duplicates.
That's all fine and dandy, but as Simon notes, Apple Photos hides a lot of that interesting machine learning metadata from the end user. That's where Rhet Turnbull's osxphotos came in, allowing Simon to initially craft dogsheep photos with less than 100 lines of code.
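For anyone who wants to poke at scores like these once they are sitting in SQLite, here is a minimal sketch of ranking photos by one score column. The table name `photo_scores`, its layout, and the sample rows are all made up for illustration; only the two score column names come from the post, and this is not the actual schema used by Apple Photos, osxphotos, or Simon's tooling.

```python
import sqlite3

# Score columns named in the post; everything else here is hypothetical.
ALLOWED_SCORES = {"ZPLEASANTCAMERATILTSCORE", "ZSHARPLYFOCUSEDSUBJECTSCORE"}


def build_demo_db():
    """Create an in-memory database with a made-up table of labeled photos."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE photo_scores ("
        "uuid TEXT PRIMARY KEY, label TEXT, "
        "ZPLEASANTCAMERATILTSCORE REAL, ZSHARPLYFOCUSEDSUBJECTSCORE REAL)"
    )
    conn.executemany(
        "INSERT INTO photo_scores VALUES (?, ?, ?, ?)",
        [
            ("a1", "pelican", 0.91, 0.40),
            ("b2", "pelican", 0.35, 0.88),
            ("c3", "seal", 0.99, 0.99),
            ("d4", "pelican", 0.60, 0.10),
        ],
    )
    return conn


def top_photos(conn, label, score_column, limit=3):
    """Return (uuid, score) pairs for one label, highest score first."""
    # Column names cannot be bound as SQL parameters, so check against a whitelist.
    if score_column not in ALLOWED_SCORES:
        raise ValueError(f"unknown score column: {score_column}")
    query = (
        f"SELECT uuid, {score_column} FROM photo_scores "
        f"WHERE label = ? ORDER BY {score_column} DESC LIMIT ?"
    )
    return conn.execute(query, (label, limit)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for uuid, score in top_photos(db, "pelican", "ZSHARPLYFOCUSEDSUBJECTSCORE"):
        print(uuid, score)
```

Swapping the score column changes which pelican comes out on top, which is roughly the kind of browsing the published data allows.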
What a fun use of datasette to publish the data! Also probably worth shouting out Apache Tika, which is another tool for pulling out photo metadata.
posted by Going To Maine at 9:54 AM on June 15, 2020 [2 favorites]
I'm old enough to remember when this was work the NSA did in-house.
I know this is a joke, but isn’t Apple the one company on earth trying to do this kind of thing on your own device privately, rather than on its servers in order to accumulate data about you?
posted by snofoam at 10:38 AM on June 15, 2020 [7 favorites]
This is very cool. Now if they just made it read/write instead of read-only, you could have some real fun.
posted by Winnie the Proust at 11:06 AM on June 15, 2020
I'm old enough to remember when this was work the NSA did in-house.
I know this is a joke, but isn’t Apple the one company on earth trying to do this kind of thing on your own device privately, rather than on its servers in order to accumulate data about you?
There’s also an assumption here that labeling image content is malevolent. It’s not something I particularly like, but the idea that people don’t appreciate being able to search images based on content seems silly.
posted by Going To Maine at 11:24 AM on June 15, 2020 [3 favorites]
Being able to search for automatically labeled photos of "protests", "riots", "police" in the cloud might possibly be useful to the current malevolent regime, especially when combined with face recognition.
posted by monotreme at 1:15 PM on June 15, 2020
Being able to search for automatically labeled photos of "protests", "riots", "police" in the cloud might possibly be useful to the current malevolent regime, especially when combined with face recognition.
Then it's a good thing everything this post talks about happens on the device and not in the cloud.
posted by sideshow at 1:20 PM on June 15, 2020 [7 favorites]
> Then it's a good thing everything this post talks about happens on the device and not in the cloud.
Not so sure about that:
End-to-end encryption provides the highest level of data security. Your data is protected with a key derived from information unique to your device, combined with your device passcode, which only you know. No one else can access or read this data.
These features and their data are transmitted and stored in iCloud using end-to-end encryption:
Absent from that list is Photos. So if you back up to iCloud, it's encrypted at rest, but the decryption keys are in their hands as well? Not sure how else icloud.com would work with Photos.
posted by pwnguin at 3:13 PM on June 15, 2020 [2 favorites]
I mean yeah, obviously the photos themselves have to be accessible*, otherwise stuff like iCloud.com wouldn't work.
I was talking about the subject of the post, which is about the photo analysis stuff, which stays on the device. Page 20 of this tech brief (PDF) lays it out.
Here is a key paragraph. emphasis mine:
Unlike other services that gather user photos and analyze them on their servers, learning everything about those photos in the process, Apple uses on-device intelligence. This means that the user’s photos never leave their device to be analyzed.
* "accessible" is doing a lot of heavy lifting here, the PDF I linked lays out what that means.
posted by sideshow at 4:18 PM on June 15, 2020 [4 favorites]
That's part of the reason why the A13 has a monster ML accelerator on the die.
posted by Your Childhood Pet Rock at 4:56 PM on June 15, 2020 [4 favorites]
I'm guessing this AI was trained by poorly paid workers who had to look at thousands of photos per week and tag them. Is my guess right?
(Though it does sound like a better job than looking at thousands of photos per week that have been flagged for being horrible.)
posted by clawsoon at 5:49 PM on June 15, 2020
No. While AI/ML is not my department, this kinda stuff is lightyears away from just clicking "this does/doesn't look like a dog" millions of times. It involves a lot of reading math papers with titles that sound like gibberish, like "Uncertainty quantification for nonconvex tensor completion: Confidence intervals, heteroscedasticity and optimality".
But, some of the Apple work is published in journals, or otherwise made public. Here you can read about how the on-device face detection stuff works.
posted by sideshow at 6:48 PM on June 15, 2020 [1 favorite]
I've attempted to read some of those papers and didn't understand a whole lot, but it did seem that a training set (the larger the better) was important for most of them. I know that there are some problems (e.g. grouping) that can be tackled with machine learning without needing a training set, but - for example - in the face detection article you linked to they used a training set. In an article a few weeks ago about training AI to spot image manipulation in research papers that was posted here on the blue, one of the problems they mentioned was not having a large enough training set for the AI to become any good at it.
If you had a large enough collection of images and comments attached to them I'm guessing you could turn the problem into an unsupervised learning problem where you're grouping words with images. As I understand it, that's part of how Google's image search works. It'd be interesting to know if Apple took a similar approach in this case. Do they have a corpus of images+comments they can do sentiment analysis on?
posted by clawsoon at 7:11 PM on June 15, 2020
I can virtually guarantee you that Apple has some sort of internal hot dog / not hot dog data set, plus probably some subcontracted ones, plus whatever state of the art public neural network best serves their needs, plus probably some public data sets. That’s just kind of the way it goes.
posted by Going To Maine at 7:18 PM on June 15, 2020
How would those datasets have been created? People labeling pictures thousands at a time, or fully automated labeling?
posted by clawsoon at 7:25 PM on June 15, 2020
Tangential half-baked theory: I've noticed how important the whites of my daughter's eyes are for me in teaching her. Since I know exactly what she's looking at, I can know exactly what word to teach her.
If I were smarter I'd turn this insight into a billion-dollar AI startup.
posted by clawsoon at 7:54 PM on June 15, 2020
Here is what I was thinking of for preparing AI training sets:
Companies that lack the required resources or expertise can take advantage of a growing outsourcing industry to do it for them. A Chinese firm called mbh, for instance, employs more than 300,000 people to label endless pictures of faces, street scenes or medical scans so that they can be processed by machines. Mechanical Turk, another subdivision of Amazon, connects firms with an army of casual human workers who are paid a piece rate to perform repetitive tasks.
Cognilytica reckons that the third-party “data preparation” market was worth more than $1.5bn in 2019 and could grow to $3.5bn by 2024. The data-labelling business is similar, with firms spending at least $1.7bn in 2019, a number that could reach $4.1bn by 2024. Mastery of a topic is not necessary, says Ron Schmelzer, also of Cognilytica. In medical diagnostics, for instance, amateur data-labellers can be trained to become almost as good as doctors at recognising things like fractures and tumours. But some amount of what AI researchers call “domain expertise” is vital.
posted by clawsoon at 8:11 PM on June 15, 2020
> While AI/ML is not my department
Well, it is mine =)
There are open data sets available for this sort of thing. And closed ones for sale. The Apple Camera team is likely supplementing those with some amount of human graded data, as well as spot checking the data they acquired, because Imagenet data is hardly clean. A number of AI firms have gotten improvements out of normal algorithms by just cleaning up the training datasets, because at the end of the day, that dataset is 14 million images graded by humans via Amazon MTurk. Mistakes will be made, and they will be (back) propagated.
posted by pwnguin at 8:33 PM on June 15, 2020 [3 favorites]
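To make the "cleaning up the training datasets" point a bit more concrete, here is a rough sketch of one common approach: collect several human labels per image, keep the majority label when agreement is high enough, and set the rest aside for review. This is only an illustration under those assumptions, not how Apple, ImageNet, or any particular firm does it; the function name, the `min_agreement` threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def consolidate_labels(annotations, min_agreement=0.6):
    """Split crowd-labeled images into confidently labeled and flagged sets.

    annotations maps an image id to the list of labels given by different
    human graders. Returns (clean, flagged): clean maps image id to the
    majority label, flagged lists image ids that need another look.
    """
    clean = {}
    flagged = []
    for image_id, labels in annotations.items():
        if not labels:
            # Nobody graded this image, so it cannot be trusted as training data.
            flagged.append(image_id)
            continue
        label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            clean[image_id] = label
        else:
            flagged.append(image_id)
    return clean, flagged


if __name__ == "__main__":
    sample = {
        "img1": ["dog", "dog", "cat"],      # two of three graders agree
        "img2": ["dog", "cat"],             # split vote, gets flagged
        "img3": ["hot dog", "hot dog", "hot dog"],
    }
    clean, flagged = consolidate_labels(sample)
    print(clean)
    print(flagged)
```

Raising `min_agreement` trades dataset size for cleaner labels, which is the kind of knob that matters when mistakes in the labels get propagated into the model.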
This thread has been archived and is closed to new comments
I'm old enough to remember when this was work the NSA did in-house.
posted by gimonca at 9:44 AM on June 15, 2020 [3 favorites]