Turing completeness strikes again
December 16, 2021 9:11 AM   Subscribe

JBIG2 doesn't have scripting capabilities, but when combined with a vulnerability, it does have the ability to emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not just use that to build your own computer architecture and script that!? That's exactly what this exploit does. Using over 70,000 segment commands defining logical bit operations, they define a small computer architecture with features such as registers and a full 64-bit adder and comparator which they use to search memory and perform arithmetic operations. It's not as fast as Javascript, but it's fundamentally computationally equivalent. How does the NSO Group zero-click iMessage exploit (NSO Group previously) actually compromise an iOS device? Google's Project Zero security researchers dig in.
posted by figurant (47 comments total) 32 users marked this as a favorite
 
You’ve got to admire the determination and ingenuity involved in this exploit, also in producing the explanation.

With that said , fuck you NSO group, you’re a bunch of terrible people.
posted by WaterAndPixels at 9:20 AM on December 16, 2021 [11 favorites]


Is this more like the Omicron or Delta variant?
posted by mikeand1 at 9:27 AM on December 16, 2021


Truly majestic, straight out of a Gibson novel.
posted by Going To Maine at 9:29 AM on December 16, 2021 [8 favorites]


(From the blurb I mean, I haven’t delved into the blog post.)
posted by Going To Maine at 9:30 AM on December 16, 2021


"Any technology sufficiently advanced is indistinguishable from magic" - achievemnt unlocked
posted by briank at 9:30 AM on December 16, 2021 [1 favorite]


iPads for my sham friends, and one-time pads for my real friends
posted by gwint at 9:31 AM on December 16, 2021 [27 favorites]


Yeah, this was a fairly brilliant exploit. Daaaang.
posted by aramaic at 9:34 AM on December 16, 2021 [2 favorites]


In case anybody wanted a less-technical summary, I felt compelled to write one for my friends when reading through this.

It's a fascinating combination of things/realizations required to chain it all together.
1. GIFs get parsed outside the protective sandbox, to make them loop!
2. Whoops, instead of just copying the gif, it re-renders the gif to a new location ⚠️
3. The library that tries to detect if a file is a gif can be faked-out (because you can't just trust that a .gif extension means it's actually a gif, after all)
4. This means that there's 20+ other image formats which can be used (because it's going to the renderer, remember)
5. PDFs. They've been a security nightmare, but this time it's not for the usual reasons!
• Back in the 90's an image format was used for hyper-compressing black & white text from early scanners.
• One of its compression tricks was "find everything that looks like an e. Then instead of storing every e, you just store one e and a list of where they all go."
(this is how most compression works, for different types of "find thing, replace with list of thing", but this was particularly weird since it works on pixels & not text.)
• But what if you want a less-lossy version of this? Make every e feel distinct, remember which ones had some ink splatter or something. That's cool, store one base e, then save modifications. Y'know. XOR the pixels. But instead of doing it in one go, since that'd just save a parallel bit mask, do it as a set of steps. AND, OR, XOR, XNOR, the whole gang. ⚠️
6. In that code that collates all the es (or other characters), there's an integer overflow bug. 🚨
This is when a number can count too high & that isn't checked, and then later that number is used as an address to read/write memory past where the program is supposed to be able to reach.
7. Normally this would cause the program to crash, but tricky stuff is doable that makes it work. Skipping that part, since there's no de-technical-izing that.
8. Now. You've got some memory to work within as a staging/scaffolding area. But you've only got one shot at this, because it's a single-pass image parser. What do you do? Wait. Going back a bit. AND, OR, XOR, XNOR. That's logic. That's Turing-complete. This is a case where Turing-completeness is relevant. You can write a computer inside this PDF-compressor. 🚨

And that's where pt. 1 leaves off. They built a computer architecture inside a single-run through a decompressor, then scripted that. Pt. 2 (yet to be released) is "now you have your agent on the inside, how do you get the payload out?"
posted by CrystalDave at 9:41 AM on December 16, 2021 [56 favorites]


"find everything that looks like an e. Then instead of storing every e, you just store one e and a list of where they all go."

Is there a direct relationship to the Xerox photocopier nightmare here, or is it merely reminiscent?
posted by clawsoon at 10:37 AM on December 16, 2021 [6 favorites]


> • Back in the 90's an image format was used for hyper-compressing black & white text from early scanners.

Note that JBIG2 was also involved in the number-altering scanner issue that came to light in 2013.

edit: jinx
posted by Phssthpok at 10:40 AM on December 16, 2021 [7 favorites]


JBIG2 does have a sort of "failed '90s rapper" vibe to it.
posted by clawsoon at 10:49 AM on December 16, 2021 [5 favorites]


As I understand it, part of what makes this exploit possible is what is effectively a fix for the Xerox photocopier problem. (CrystalDave's step 5, 3rd bullet.) By avoiding replacing one glyph with a similar-looking one (e.g., a "c" with a smudge might get stored as a copy of an "e"), you store not only which glyph you're copying, but a set of steps to perform to recover the original glyph from the template. While this fixes the problem of photocopiers changing the photocopied document, those steps also provide a Turing-complete set of logical operations, which this exploit uses to build a virtual machine.
posted by biogeo at 11:00 AM on December 16, 2021 [6 favorites]


Though I don't know if that refinement to JBIG2 compression was literally a fix in response to the photocopier problem, or just an option for lossless compression that was already present but that the photocopiers failed to implement.
posted by biogeo at 11:14 AM on December 16, 2021


The article mentions that low enough thresholds on the JBIG2 lossiness were behind some of the numbers getting swapped for other numbers on scanning problem, so it's likely that, yes, this is directly related to the Xerox photocopier nightmare.
posted by straw at 11:27 AM on December 16, 2021 [2 favorites]


Note: Apple inform us that they have restricted the available ImageIO formats reachable from IMTranscoderAgent starting in iOS 14.8.1 (26 October 2021), and completely removed the GIF code path from IMTranscoderAgent starting in iOS 15.0 (20 September 2021), with GIF decoding taking place entirely within BlastDoor.


So, if you haven't already, update your iDevices. (Settings > General > Software Update)
posted by ChurchHatesTucker at 11:56 AM on December 16, 2021 [4 favorites]


I vaguely remember these guys being evil (I guess secretly developing security exploits is mostly evil anyway) but this looks like a fun project.
posted by grobstein at 11:59 AM on December 16, 2021 [1 favorite]


Cyberpunk sci-fi used to portray attacking secured systems as like a game of cat and mouse, where you break in but then the owners of the system like notice something is wrong and try to get you. That seems to have been wrong in like a million ways. (William Gibson has ruefully observed that his conception of hackers in the early 80's was "guys who type really fast.") I think the reality must be that the overwhelming majority of intrusions are not looked for and are never detected. Most vulnerabilities that are closed get closed because of worldwide patches, which are often issued because a vulnerability was published, not necessarily because it is known to have ever been exploited.

Y/N?
posted by grobstein at 12:05 PM on December 16, 2021 [3 favorites]


JBIG2 isn't inherently bad, just that the photocopier and exploitable phone implementations were bad. If only Apple's security smashing GIF code refused to render something that wasn't a GIF ...
posted by scruss at 12:07 PM on December 16, 2021 [1 favorite]


Going to Maine: Truly majestic, straight out of a Gibson novel.

I was thinking more like the rod logic in Neal Stephenson's book The Diamond Age. But yeah, pretty wild!
posted by wenestvedt at 12:16 PM on December 16, 2021 [4 favorites]


The path dependencies that made this a feasible exploit are baffling. We all want security, but we also want performance, and interoperability, and a sleek smoothing over of complexity, and we'd also like to use it as soon as possible and at reasonable cost. Somewhere within that maze of requirements someone is going to implement a Turing machine.
posted by dmh at 12:26 PM on December 16, 2021 [5 favorites]


The second-worst thing about working on this, after whatever remains of one's conscience keeping one awake at night contemplating the journalists and dissidents killed by your work, must be being unable to tell another living soul about the awesomely cool God-tier hacks you just came up with, and knowing that that is a secret you will carry with you to the grave; a burden yours to carry, alongside the deaths.
posted by acb at 12:37 PM on December 16, 2021 [15 favorites]


unable to tell another living soul

like a for-profit version of James Ellis, who proposed public-key encryption in 1970 while working at GCHQ, and died a month before the work was declassified in 1997.
posted by scruss at 1:16 PM on December 16, 2021 [4 favorites]


You can tell other people, like the FBI (or...) when they come knocking asking if you can jailbreak this bad guy's device. And you can charge them millions every time to do it. You can do the same favour for Saudi Princes who are looking for journalists to chop up scapegoat too, if you like.
posted by bonehead at 1:24 PM on December 16, 2021


I think the reality must be that the overwhelming majority of intrusions are not looked for and are never detected.

An interesting example is the Heartland Payment Systems breach. The company at the time processed payments for 175,000 businesses. They passed a security audit in May 2008. In April they were hacked (SQL injection, packet sniffers). In October 2008, Visa notified them of suspicious activity. They hired two different forensic companies who finally tracked down the installed malware in January 2009. By then, 100 million credit cards had been compromised. The CEO stated that, unlike his peers, he was admitting to the breach to publicize the problem.
posted by jabah at 1:25 PM on December 16, 2021 [6 favorites]


Outside of Apple, a lot of other companies (Microsoft, Mozilla, and so on) are now rewriting key components in Rust, whose ownership semantics (i.e., the compiler keeping track of what code has access to a piece of memory and refusing to compile any code where there is ambiguity), not to mention bounds-checking (which prevents array out-of-bounds access errors like the one used in this exploit) close a lot of security holes. I have not heard of Apple looking into Rust, which I suspect is because they have Swift as their next-generation language. I'm wondering whether there is a move in Apple to rewrite components that handle incoming data (such as file format parsers) in Swift, and/or to shore up Swift's memory ownership semantics (perhaps alongside the asynchronous capabilities), or else to adopt Rust for security-sensitive components.
posted by acb at 1:26 PM on December 16, 2021 [3 favorites]


One thing I don't quite understand is that they say both that the exploit can build a Turing-complete computation system and that the decoder only makes one pass. Doesn't that mean it can't have unbounded loops and thus can't be Turing complete? I don't know that the exploit necessarily needs Turing completeness, but since it is mentioned several times, I expected an explanation of how that was achieved.
posted by eruonna at 1:31 PM on December 16, 2021


One thing I don't quite understand is that they say both that the exploit can build a Turing-complete computation system and that the decoder only makes one pass. Doesn't that mean it can't have unbounded loops and thus can't be Turing complete?

I think it's: you get one pass, but what happens during that pass is determined by the type elements in your file, which turns out to mean the pass can be indefinitely long. "One pass" effectively means, you get one call of whatever you have packed into the file.
posted by grobstein at 1:58 PM on December 16, 2021 [1 favorite]


One thing I don't quite understand is that they say both that the exploit can build a Turing-complete computation system and that the decoder only makes one pass.

My weak understanding is that it wasn't a loop per se, but that when the code got to the end of the steps "within the loop," it overwrote the address that the Turing machine/parser used to determine where to go next (CrystalDave's #6). I think in Computer Science this would be "unrolling" a loop, but I haven't seen it described in those terms and at any rate I don't have the knowledge to understand a lot of this low-level stuff without metaphors and similes.

I've been wondering what conceptual influence that Minecraft CPU from years ago had on this.
posted by rhizome at 1:59 PM on December 16, 2021


One thing I don't quite understand is that they say both that the exploit can build a Turing-complete computation system and that the decoder only makes one pass.

Yeah, it's mind-boggling. The decoder gets fed 70,000 segment commands. This happens in a single pass. Each segment command changes bits in memory. After 70,000 commands, the bits in memory are laid out in such a way that it defines a tiny, evil computer architecture that can run instructions written in a tiny, evil machine language. The tiny, evil machine language gets you the loops and whatnot. Then all you need to do is bootstrap your big evil exploit to run on this tiny, evil computer. That's how I would do it my understanding, at least.
posted by dmh at 2:03 PM on December 16, 2021 [10 favorites]


But if you can write arbitrary bytes to memory and then execute them, you don't need to emulate a tiny, evil computer architecture, you can just run your evil code directly. They needed to do the computation to figure out where to modify memory to get their arbitrary code to execute, right?
posted by eruonna at 2:59 PM on December 16, 2021 [1 favorite]


So when are we getting Quake ported to JBIG2?
posted by clawsoon at 3:17 PM on December 16, 2021 [9 favorites]


The logical bit operations provided by the decoder give you the ability to write arbitrary bit values within the page buffer (rather than arbitrary bytes anywhere in memory). They use this ability to write a tiny emulator to the page buffer, and then bootstrap the evil off of that.
posted by dmh at 3:19 PM on December 16, 2021 [1 favorite]


I am too dumb to understand any of this. Should I not use iMessage?
posted by pelvicsorcery at 3:42 PM on December 16, 2021


Just install the iOS update on your phone. That's all anyone on this site needs to do.
posted by seanmpuckett at 3:54 PM on December 16, 2021 [3 favorites]


Should I not use iMessage?
The vulnerabilities in question have long-since been patched out, this is more "ok, now that it's safe to publicly talk about this..."
posted by CrystalDave at 3:55 PM on December 16, 2021 [2 favorites]


The logical bit operations provided by the decoder give you the ability to write arbitrary bit values within the page buffer (rather than arbitrary bytes anywhere in memory)

I wrote this above but it's wrong, since the page buffer is practically unbounded you can read/write arbitrary memory.

They needed to do the computation to figure out where to modify memory to get their arbitrary code to execute, right?

This is I think exactly right. I think 1) it's impossible to run executable code from the page buffer memory, and 2) since the address space layout is randomized you don't initially know where to poke/peek, and 3) you only have a single pass. So a tiny computer is used to find the addresses of memory locations that can be profitably modified. Something like that.
posted by dmh at 4:56 PM on December 16, 2021 [1 favorite]


So a tiny computer is used to find the addresses of memory locations that can be profitably modified. Something like that.

Think so, and, importantly, that job happens in the "one pass" of the JBIG2 decode. The pass doesn't terminate prematurely because you resize the buffer that's getting decoded. That gives you enough "tape" and time to be for your purposes a universal computer. Once you find the addresses you need, you inject your main payload there and let the decode terminate.
posted by grobstein at 5:34 PM on December 16, 2021


Most vulnerabilities that are closed get closed because of worldwide patches, which are often issued because a vulnerability was published, not necessarily because it is known to have ever been exploited.

That is true. The many of CVEs (the industry standard for tracking vulnerabilities) never have a public exploit published for them. And many, many small bug fixes are technically security issues but at such an unimportant level that they're unlikely to get noticed or used.

However, for important fixes, people will start reverse engineering patches to develop exploits, because plenty of targets can be counted on to not patch.

I think the reality must be that the overwhelming majority of intrusions are not looked for and are never detected.

An increasing percentage of software has at least basic security baked into its development at this point. Just using GitHub, for example, can get you an increasing amount of security checks with no effort. But automated, simple tests can't always find stuff like this.

Groups like the NSA will develop exploits that are unique to them but in some cases will burn the vulnerability if they detect another actor using it.
posted by Candleman at 6:26 PM on December 16, 2021 [3 favorites]


Was there ever any word on the minimum versions of (iOS, MacOS) that this affected? Or are we in "yeah, it's probably been there for 10+ years at this point" territory?
posted by vibratory manner of working at 1:20 AM on December 17, 2021


Was there ever any word on the minimum versions of (iOS, MacOS) that this affected? Or are we in "yeah, it's probably been there for 10+ years at this point" territory?

FWIW, my ancient iPad Air has been stuck forever at iOS 12.something.something, and, out of the blue, it received the same security patch earlier this year. Two of 'em, actually.
posted by Thorzdad at 3:59 AM on December 17, 2021 [1 favorite]


Pony request: Can the mods turn on images for this thread?
posted by sebastienbailard at 4:02 AM on December 17, 2021 [9 favorites]


so it is just not possible to have a truly secure system in von Neumann architecture, where data can be treated as instructions, is it?
posted by thelonius at 7:13 AM on December 17, 2021 [1 favorite]


FWIW, my ancient iPad Air has been stuck forever at iOS 12.something.something, and, out of the blue, it received the same security patch earlier this year. Two of 'em, actually.

They don't seem to be doing the same for the Mojave MacBook I preserved for the purpose of running 32-bit software. I've logged it out of iCloud and iMessage just in case and have moved general-purpose use to a new M1 MacBook, though am wary that there are almost certainly other vulnerabilities there.
posted by acb at 7:37 AM on December 17, 2021


JBIG2 previously on metafilter. Some technology just gives and gives.
posted by a snickering nuthatch at 8:33 PM on December 17, 2021


> so it is just not possible to have a truly secure system in von Neumann architecture, where data can be treated as instructions, is it?

fascinating. in the steps of the exploit described so far in this blog post, at no point is the attacker writing bytes somewhere that get executed by the cpu as machine code -- they're merely writing bytes into the data structures of the application, without modifying its code. the existing unmodified machine code of the decompressor algorithm lets them execute a turing machine purely within the data structures made available by that algorithm.

things already look a bit ominous: some of the data structures they can overwrite are objects with vtables, i.e. function pointers. and they have a mechanism that will let them attempt to write to arbitrary memory addresses, although if the operating system has locked the decompression process down it will presumably get abrubtly terminated it if tries to write to memory it isn't permitted to.

if the decoder process has been set up in a secure way, then none of the regions of memory it is using for data will have the execute permission, and none of the regions of memory it uses to hold machine code should have the write permission.

presumably part 2 will elaborate to explain how they bootstrap from this to escape the constraints of the decompressor process
posted by are-coral-made at 12:47 AM on December 18, 2021


Yeah, same WTF about data/code, r/rw segments. So far as part 1 goes it's not too terribly complex in hindsight. The thing that gets me is that they had the source (or more likely decompiled the code) to figure out not only the integer overflow, but also could craft that overflow to do the thing to open up the available r/w memory range. Then they figured out how to find the magic numbers at magic locations with which to do the math to find out where they are and where they need the decompression process to hit as a function call (and that that magic location (and that they can write to locations that can be executed).

The rest is "not that hard". I'm curious about the claim of Turing Machine and building a Computer (or at least the Algorithmic Logic Unit) is as true as say a CPU built in Minecraft. I think they're probably mixing up levels of abstraction in the explanations.

You're processing a stream of Segments in a loop to a section of memory that's like a big flat array (the bitmap) which has bounds and stuff it's just a sequence of Bit blit - Wikipedia.

That you can arrange to hit the integer overflow bug AND drop a big number into the data structure that limits the bit blit is using to constrain the blits and gain access to tons and tons of memory space is damn lucky. Then to find your location in memory and get your bearings and that you can do that is also neat.

So you can r/w at a whim with basic boolean operators and you need to do some math. That's not a problem or even hard and doesn't need a Turing machine at all. But similar. You have the memory spaces for all the registers you could need. Just think like doing binary math by hand on a sheet of graph paper where the teacher tells you to show all your work. You just need to un-parallelize the gates that do addition and those are in the stream of Segments that the decoder is decoding. It is sorta like fully un-rolled loops and no branching. The decompression process is processing bit blits frof the segments, the 'code' of this 'computer' isn't, it's just the Segments are doing calculations.

My guess..... they find their location and a certain object and calculate the address of a certain vtable (whatever) function pointer. Then the stream starts laying down machine code either there or some other location and the last part of the stream is just the 'payload' and maybe a final mess with object and then the stream of Segments ends normally and the decompression process is done and starts to clean up or call the next function and BOOOM you're now running the new payload created and set-up by all of those Segments that supposedly were only capable of doing some boolean operations on a bounded space (ha).

Then it's off to the races: NolaCon 2019 D 10 Waiter theres a compiler in my shellcode Josh Stone - YouTube
posted by zengargoyle at 5:54 AM on December 18, 2021 [1 favorite]


They don't seem to be doing the same for the Mojave MacBook I preserved for the purpose of running 32-bit software.

I got a patch (a set of three, actually) for my Mojave Mini at least a week ago (not sure how long it took me to notice it.)
posted by ChurchHatesTucker at 2:45 PM on December 20, 2021


« Older "threads the ends of her hair in like pouring a...   |   Scrabnite? Fortble? Newer »


This thread has been archived and is closed to new comments