[HN Gopher] Surprisingly Turing-Complete
       ___________________________________________________________________
        
       Surprisingly Turing-Complete
        
       Author : telekid
       Score  : 79 points
       Date   : 2020-04-11 01:32 UTC (21 hours ago)
        
 (HTM) web link (www.gwern.net)
 (TXT) w3m dump (www.gwern.net)
        
       | Complexicate wrote:
       | I love this...
       | 
       | "...mov, which copies data between the CPU & RAM, can be used to
       | implement a transport-triggered-architecture one instruction set
       | computer, allowing for playing Doom..."
       | 
       | Click on "Doom" link and read:
       | 
       | "The mov-only DOOM renders approximately one frame every 7 hours,
       | so playing this version requires somewhat increased patience."
        
       | greesil wrote:
       | Does finding weird and unexpected ways to do computation always
       | imply a security risk?
        
         | codeflo wrote:
          | I'm not sure why that's explicitly mentioned; I believe that,
          | for analyzing security, TC is a bit of an academic red herring.
          | (Warning: not an expert.) There are two kinds of security
          | implications to consider here, I think.
         | 
         | One, can certain input data cause a large usage of computing
         | resources (processing time or memory)? Non-TC mechanisms have
         | repeatedly failed this test: ZIP bombs and XML entity bombs are
         | notorious examples. On the other hand, you can easily put a
         | resource cap on an interpreter of a theoretically TC language
         | and be safe.
         | 
         | Two, can the untrusted code access resources that it shouldn't
         | (memory, files, sockets)? That's mostly a quality of
         | implementation issue, not one of Turing completeness.
         | JavaScript interpreters have certainly been vulnerable to
         | various exploits, but so have JPEG decoders. I don't think TC
         | is the issue here. (However, this is complicated a bit by side-
         | channel attacks a la Spectre. I'm not sure how TC factors into
         | those.)
         | 
         | In summary, I'm not convinced that Turing completeness is all
         | that relevant for security. Am I wrong here?
        
           | jdsully wrote:
           | Both can have exploits as you've shown, but you haven't
           | addressed the difficulty of securing one or the other. Turing
           | complete systems have a dramatically larger state space than
            | non-Turing-complete implementations, and so at a fundamental
            | level they are more difficult to secure.
        
             | codeflo wrote:
             | I simply don't think Turing completeness is the deciding
             | factor in determining that difficulty. Hopefully someone
             | can define this more clearly, but in my opinion, the issue
             | is not abstract "state space", but implementation
             | complexity and attack surface.
             | 
             | For example, it's a lot easier to write a secure Brainfuck
             | interpreter (a very simple Turing complete language that
             | can only access STDIN and STDOUT) than a program that
             | securely extracts a ZIP file to disk.
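              | 
              | For concreteness, here's roughly what I mean: a toy
              | Brainfuck interpreter (a hand-rolled sketch, not a
              | hardened implementation) that can only touch stdin and
              | stdout, and also applies the kind of resource cap
              | mentioned above via a step budget and a fixed-size tape:
              | 
              |   import sys
              | 
              |   def run_bf(prog, max_steps=10**6, tape_len=30000):
              |       # fixed-size tape and step budget act as the
              |       # resource cap; brackets assumed balanced
              |       tape = bytearray(tape_len)
              |       jumps, stack = {}, []
              |       for i, c in enumerate(prog):
              |           if c == '[':
              |               stack.append(i)
              |           elif c == ']':
              |               j = stack.pop()
              |               jumps[i], jumps[j] = j, i
              |       pc = ptr = steps = 0
              |       while pc < len(prog):
              |           steps += 1
              |           if steps > max_steps:
              |               raise RuntimeError("step budget exceeded")
              |           c = prog[pc]
              |           if c == '+':
              |               tape[ptr] = (tape[ptr] + 1) % 256
              |           elif c == '-':
              |               tape[ptr] = (tape[ptr] - 1) % 256
              |           elif c == '>':
              |               ptr = (ptr + 1) % tape_len
              |           elif c == '<':
              |               ptr = (ptr - 1) % tape_len
              |           elif c == '.':
              |               sys.stdout.write(chr(tape[ptr]))
              |           elif c == ',':
              |               ch = sys.stdin.read(1)
              |               tape[ptr] = ord(ch) % 256 if ch else 0
              |           elif c == '[' and tape[ptr] == 0:
              |               pc = jumps[pc]
              |           elif c == ']' and tape[ptr] != 0:
              |               pc = jumps[pc]
              |           pc += 1
              | 
              | The attack surface is a few dozen lines of array
              | indexing, which is the point: the language being TC
              | doesn't by itself make the interpreter any harder to
              | secure.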
        
         | pmiller2 wrote:
         | At least theoretically, it means certain inputs can cause
         | infinite loops, which, thanks to Turing completeness and the
         | undecidability of the halting problem, means there's no way to
         | detect in advance in all cases.
        
         | woodruffw wrote:
         | I do research in this space professionally.
         | 
         | The answer is not always, but sometimes: discovering unintended
         | states or transitions in the execution contract of a program is
         | a common building block for exploits. However, proving that the
         | execution contract of a program can be coerced into
         | representing computations in a TC language doesn't necessarily
         | prove that you can do anything interesting.
         | 
         | Complex formats like PDF are a good example of this: you can
         | probably contrive a PDF input such that the parser's state
         | represents a minimal (and TC) language while interpreting it
         | (e.g. a language with a few mathematical operators and a
         | "store" primitive), but that doesn't magically get you network
         | access or arbitrary memory read/writes. You need to show that
         | said language, when programmed in, can affect the overall
         | execution contract in a way that violates high-level
         | assumptions.
         | 
         | Some resources (FD: my company's) on the subject:
         | 
         | * https://blog.trailofbits.com/2019/11/01/two-new-tools-
         | that-t...
         | 
         | * https://blog.trailofbits.com/2018/10/26/the-good-the-bad-
         | and...
        
           | saagarjha wrote:
            | Just a heads up: I think the second post has a typo; the code
            | names it "new_item" but it is referred to as "item"
            | throughout.
           | (I'm also not sure I understand the safety added by
           | dynamic_cast.)
        
         | tupshin wrote:
         | It means you can't easily (or statically) reason about the
         | output given arbitrary (user) input. So yes, it makes it much
         | more likely that security bugs are introduced.
        
           | masklinn wrote:
           | Also it at least implies a guaranteed resource leak / DOS
           | capability, which is a problem though it may or may not be
           | considered a security issue.
        
             | saagarjha wrote:
             | Well, that depends; you can slap a rlimit on a process and
             | it will no longer DOS you, even if it's executing Turing
             | complete code. It does mean that _inside the Turing
              | machine_ you can't really apply protections, but from the
             | outside a Turing-complete language cannot magically escape
             | its sandbox unless you give (or accidentally include) it a
             | tool to do so.
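              | 
              | Something along these lines (illustrative only;
              | "untrusted.py" is just a placeholder, and the right
              | limits depend on the workload), using POSIX rlimits from
              | Python:
              | 
              |   import resource, subprocess
              | 
              |   def limits():
              |       # cap CPU seconds and address space for the child
              |       resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
              |       resource.setrlimit(resource.RLIMIT_AS,
              |                          (256 * 1024 * 1024,) * 2)
              | 
              |   # run the untrusted interpreter in a limited child
              |   subprocess.run(["python3", "untrusted.py"],
              |                  preexec_fn=limits, timeout=10)
              | 
              | The child can still spin inside its Turing-complete
              | program, but it gets killed rather than taking the host
              | down with it.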
        
               | zokier wrote:
                | Unless I'm completely off my mark, any resource-limited
                | system is not a true universal Turing machine in the
               | theoretical sense. So depending on your level of
               | pedantry, TC can mean guaranteed DoS.
        
               | mattkrause wrote:
               | But...that's your whole computer too then.
        
               | zokier wrote:
               | Indeed.
        
       | saagarjha wrote:
       | > "return-into-libc attacks": software libraries provide pre-
       | packaged functions, each of which is intended to do one useful
       | thing; a fully TC 'language' can be cobbled out of just calls to
       | these functions and nothing else, which enables evasion of
       | security mechanisms since the attacker is not running any
       | recognizable code of his own.
       | 
       | Note that ROP attacks in general tend to jump into the middle of
        | functions because they have partially cobbled-together call
       | states. ROP "chains" join together a couple of instructions
       | followed by a return into something useful, but with "return-
       | into-libc" it's usually to just jump straight midway into system
       | and spawn a shell.
       | 
       | > Pokemon Yellow: "Pokemon Yellow Total Control Hack" outlines an
       | exploit of a memory corruption attack which allows one to write
       | arbitrary Game Boy assembler programs by repeated in-game walking
       | and item purchasing. (There are similar feats which have been
       | developed by speedrun aficionados, but I tend to ignore most of
       | them as they are 'impure': for example, one can turn the SNES
       | Super Mario World into an arbitrary game like Snake or Pong but
       | you need the new programs loaded up into extra hardware, so in my
       | opinion, it's not really showing SMW to be unexpectedly TC and is
       | different from the other examples.
       | 
       | I fail to see the difference; as far as I understood it, the
        | Super Mario World examples were done by just playing the game?
       | (By the way, I hear that Ocarina of Time has something like this
       | now, too.)
       | 
        | > It turns out that given even a little control over input into
        | something which transforms input to output, one can typically
        | leverage that control into full-blown TC. This matters because,
        | if one is clever, it provides an escape hatch from system which
        | is small, predictable, controllable, and secure, to one which
        | could do anything.
       | 
       | You can still prove sandboxing guarantees about executing Turing-
       | complete programs.
        
       | joe_the_user wrote:
       | _" Peano arithmetic: addition & multiplication on natural numbers
       | is enough to be TC;"_
       | 
       | My head swims when the situation is described with this level of
       | vagueness. I mean, sure the task of proving a theorem using the
        | modern version of the Peano postulates is undecidable and so I'd
       | assume a map from theorems in the Peano system to proofs of
       | theorems would be Turing complete.
       | 
       | But a computation system based on calculating the values of
        | simple arithmetic expressions isn't Turing complete. An
        | expression involving just adding and multiplying constant
        | integer values will terminate.
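        | 
        | To make the termination point concrete, evaluating such an
        | expression is just structural recursion on a finite tree (a
        | toy evaluator, my own sketch):
        | 
        |   # expr is a nested tuple like ('+', 3, ('*', 4, 5))
        |   def evaluate(expr):
        |       if isinstance(expr, int):    # leaf: a constant
        |           return expr
        |       op, lhs, rhs = expr          # strictly smaller subtrees
        |       l, r = evaluate(lhs), evaluate(rhs)
        |       return l + r if op == '+' else l * r
        | 
        |   assert evaluate(('+', 3, ('*', 4, 5))) == 23
        | 
        | Every recursive call gets a strictly smaller expression, so
        | evaluation always terminates; there are no loops and no
        | unbounded search.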
        
       | pjscott wrote:
       | Stuck in an appendix is a fascinating mini-essay, "How many
       | computers are in your computer?"
       | 
       | https://www.gwern.net/Turing-complete#how-many-computers-are...
        
         | pjc50 wrote:
          | Indeed. Everything is a heterogeneous cluster now. Don't forget
         | the input devices and outputs; one of my more interesting jobs
         | was very tangentially being involved with
         | https://www.flatfrog.com/inglass . Every dot around that screen
         | has an ARM and a small DSP.
        
       | hirundo wrote:
       | > Turing-completeness (TC) is ... the property of a system being
       | able to ... compute any program of interest, including another
       | computer in some form.
       | 
       | So Turing completeness implies _recursive_ Turing completeness.
       | It is the theoretical threshold at which a device is capable of
        | reproduction, a sort of Schwarzschild radius for complex, heritable
       | behavior, aka life.
        
       | moreati wrote:
       | Python pickle files are a sequence of op-codes that run on the
       | pickle VM. By default the VM allows calls to arbitrary Python
       | functions. I'm still puzzling whether Python pickles _without
       | access to Python globals_ (e.g. using
       | https://docs.python.org/3/library/pickle.html#restricting-gl...)
       | are Turing complete. I don't _think_ so, because the pickle VM
       | has no branching or looping, but it does have a stack and my
       | understanding of automata theory is not great.
       | 
       | My research/tinkering so far is
       | https://github.com/moreati/pickle-fuzz
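        | 
        | For anyone curious, the restriction I mean looks roughly like
        | this (adapted from the linked docs; the allow-list is just an
        | example):
        | 
        |   import builtins, io, pickle
        | 
        |   SAFE = {'range', 'complex', 'set', 'frozenset', 'slice'}
        | 
        |   class RestrictedUnpickler(pickle.Unpickler):
        |       def find_class(self, module, name):
        |           # refuse everything except a small allow-list of
        |           # builtins, so no os.system, eval, etc.
        |           if module == 'builtins' and name in SAFE:
        |               return getattr(builtins, name)
        |           raise pickle.UnpicklingError(
        |               f'global {module}.{name} is forbidden')
        | 
        |   def restricted_loads(data):
        |       return RestrictedUnpickler(io.BytesIO(data)).load()
        | 
        | With find_class locked down, what's left is basically opcodes
        | that build data structures; there's still no branch or loop
        | opcode, which is why I suspect the result falls short of
        | Turing complete.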
        
       | waynecochran wrote:
        | Turing Complete is not a very high bar. Add a second stack to a
        | pushdown automaton and it's Turing Complete. Add two counters
        | to an NFA and it's Turing Complete. I don't think folks know
        | what this means.
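        | 
        | The two-stack construction is worth spelling out, since it's
        | the whole trick (a rough sketch, my own code): keep the tape
        | to the left of the head on one stack and the head plus
        | everything to its right on the other, and a head move is just
        | a pop on one side and a push on the other.
        | 
        |   class TwoStackTape:
        |       """An unbounded TM tape built from two stacks."""
        |       def __init__(self, blank=0):
        |           self.left, self.right = [], []
        |           self.blank = blank
        | 
        |       def read(self):              # cell under the head
        |           return (self.right[-1] if self.right
        |                   else self.blank)
        | 
        |       def write(self, sym):
        |           if not self.right:
        |               self.right.append(self.blank)
        |           self.right[-1] = sym
        | 
        |       def move_right(self):        # head one cell right
        |           self.left.append(self.right.pop()
        |                            if self.right else self.blank)
        | 
        |       def move_left(self):         # head one cell left
        |           self.right.append(self.left.pop()
        |                             if self.left else self.blank)
        | 
        | Bolt a finite-state controller onto this and you have a full
        | Turing machine, which is exactly why the bar is so low.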
        
       | zokier wrote:
       | > This matters because, if one is clever, it provides an escape
       | hatch from system which is small, predictable, controllable, and
       | secure, _to one which could do anything_. It's hard enough to
       | make a program do what it's supposed to do without giving anyone
       | in the world the ability to insert another program into your
       | program, which can then interfere with or take over its host
       | 
       | I take issue with the statement that TC means that something
       | could "do anything". It means it can theoretically _compute_
       | anything, but Turing completeness does not in itself give access
       | to any additional resources; being TC does not magically allow
        | something to talk to the internet or write to disk or spawn new
       | processes, or possibly not even allocate new memory. Computers do
       | so much more than just compute; TC might be the first step
       | towards being able to  "do anything" but it certainly is not the
       | final step.
       | 
        | From a security point of view, what TC can (but not
        | necessarily!) do is open the host to denial of service through
        | excessive resource consumption or a non-terminating program.
        | But as another comment noted, non-TC systems can also consume
        | impractically large amounts of resources, even if their
        | resource consumption is not theoretically unbounded (as it is,
        | afaik, with Turing machines).
        
         | zrm wrote:
         | > Turing completeness does not in itself give access to any
         | additional resources; being TC does not magically allow
          | something to talk to the internet or write to disk or spawn new
         | processes, or possibly not even allocate new memory. Computers
         | do so much more than just compute; TC might be the first step
         | towards being able to "do anything" but it certainly is not the
         | final step.
         | 
         | In many cases it _is_ the final step.
         | 
         | If you're trying to secure something that lacks any good reason
         | to access the internet, it shouldn't be able to. And yet so
         | many things that otherwise lack any good reason to access the
         | internet still have it.
         | 
         | This creates a problem when you have a program which is only
         | supposed to process some sensitive data and not export it off
         | to the attacker, because as soon as the attacker can execute
         | their own code, the process _already_ had access to the
         | sensitive data and to the internet. Or there is no sensitive
         | data but the process already had access to the internet, so now
         | the attacker is using your hardware to mine cryptocurrency or
         | route their network traffic through your IP address.
         | 
         | One solution to this is to stop giving network access to
         | processes that otherwise shouldn't need it, but that requires
         | overcoming the incumbent economic forces that use network
         | access for telemetry and advertising. So there are a lot of
         | people hoping that the alternative of making things that aren't
         | Turing-complete is easy. But it turns out to be pretty hard.
        
         | tdullien wrote:
         | Author of one of the cited papers here. The author of the post
         | falls into a common misconception of the weird machine
         | literature (which led me to write my paper): conflating TC (the
         | ability to compute any function worth computing) with the
         | ability to transition the victim machine into states that
         | should be unreachable via paths that should not exist ("weird
         | machine programming"). It is a bit unfortunate that this
          | misunderstanding is pervasive in the early WM papers :-/ -
          | it ensures the confusion keeps getting perpetuated.
        
         | zokier wrote:
          | I do want to add (as my parent comment was a bit negative)
          | that I do think non-TC programming is an area that deserves
          | more research, as is the research around better
          | characterizing TC programs (http://raml.co/ being an example
          | of that).
        
       | bertr4nd wrote:
       | A related idea that I'm interested in but find a bit hard to
       | articulate is to describe "simple" Turing complete languages,
       | where simplicity is defined more by ease of reasoning for a human
       | than by any objective metric.
       | 
       | Basically, if I wanted to provide someone with a Turing complete
       | language, what's the simplest/easiest thing I could provide, that
       | would still be useful?
        
         | jfkebwjsbx wrote:
          | The simplest thing you could give someone is probably the
          | Turing machine itself, the Brainfuck language, or the lambda
          | calculus.
         | 
         | Simplifying, to have a TC programming language you need two
         | things: RAM and the ability to decide your next state based on
         | the memory contents.
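          | 
          | One well-known way to package exactly those two ingredients
          | is a SUBLEQ one-instruction machine (a sketch of mine, not
          | tied to any particular implementation): all it has is memory
          | and a subtract-and-branch-if-not-positive instruction, and
          | that alone is TC.
          | 
          |   def subleq(mem, pc=0):
          |       # each instruction is three cells: a, b, c
          |       # mem[b] -= mem[a]; if the result <= 0, jump to c
          |       while 0 <= pc < len(mem):
          |           a, b, c = mem[pc:pc + 3]
          |           mem[b] -= mem[a]
          |           pc = c if mem[b] <= 0 else pc + 3
          |       return mem
          | 
          |   # subtracts mem[3] from mem[4], then halts (jump to -1)
          |   print(subleq([3, 4, -1, 9, 5]))  # [3, 4, -1, 9, -4]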
        
       | tdullien wrote:
       | Author of one of the linked weird machine papers here. The use of
       | "Turing complete" in both the ROP and the weird machine
       | literature is both incorrect and misleading; I wrote some
       | comments on this here:
       | http://addxorrol.blogspot.com/2018/10/turing-completeness-we...
       | 
       | This does not detract from this post being a good, fun, and
       | interesting read, but for anyone that is puzzled why "Turing
       | complete" should imply "insecure": It doesn't.
        
       ___________________________________________________________________
       (page generated 2020-04-11 23:00 UTC)