[HN Gopher] A functioning Turing Machine using Notepad++ and its...
       ___________________________________________________________________
        
       A functioning Turing Machine using Notepad++ and its find/replace
       regex engine
        
       Author : MarcellusDrum
       Score  : 361 points
       Date   : 2022-02-22 08:18 UTC (14 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | kats wrote:
       | Not a Turing machine if the user has to press a button for each
       | step.
        
         | coldtea wrote:
         | That's not relevant as to whether it's a turing machine or not,
         | tho...
         | 
         | The possible calculations (and genericity) is what matters. The
         | "button for each step" could be analogous to powering the
         | turing machine, or turning some crank for Babbage's machine, or
         | whatever..
        
           | Banana699 wrote:
           | This is absolutely relevant, turing's original definition
           | described a completely autonomous and automatic machine,
           | that's why every turing machine has a halting state that
           | locks it in a loop when computation is finished.
           | 
           | >could be analogous to powering the turing machine, or
           | turning some crank for Babbage's
           | 
           | Those things are done once for those machines, you press
           | power-on or turn a crank for just one time and the machine
           | starts, this is not the case here, here the human is acting
           | as the control logic for the machine, repeatedly pulsing to
           | drive the computation.
           | 
           | Your computer is not a computer without a hardware clock, the
           | repeated pressing of a button is acting as a hardware clock
           | here.
        
           | floatboth wrote:
           | ...or clicking the On-Click Transitions in Microsoft(r)
           | PowerPoint(tm)...
        
         | KptMarchewa wrote:
         | Not a Turing machine if it's powered by human created
         | electricity.
        
           | Banana699 wrote:
           | Terrible and fallacious analogy, the repeated press of a
           | button doesn't power the search+replace process, it still
           | needs electricity to run, the repeated pressing is more like
           | a hardware clock that repeatedly pulses to advance the
           | computation, which does indeed makes the search+replace
           | process alone not turing complete, every turing machine has
           | control logic to automate the execution loop.
        
         | dzqhz wrote:
         | If you had to press different buttons you would have a point.
        
       | numeri wrote:
       | I did essentially this in vim once - although I made it a little
       | more "user friendly" with vim macros' ability to both call and
       | modify themselves, so it would stop itself if the Turing Machine
       | halted.
        
       | midjji wrote:
       | Neat, Though it does make me wonder if a python parser wont soon
       | become standard in editors. Its just too damned convenient for
       | many things, sure you might be able to come up with a regexp
       | search and replace that does the same thing, but odds are it will
       | take longer than coding a few loops. BASH sucked far too much,
       | and C++ and most languages lacked the convenient filesystem libs
       | required, but python just works and is easy to read and modify.
       | The downsides of python, like the atrocious speed and low
       | maintainability dont matter for scripts you will use just once.
        
         | vesinisa wrote:
         | That sounds like a felony abuse of Python. Despite, wasn't perl
         | originally a tragic attempt at such language? :)
        
         | 30f0fn wrote:
         | How about elisp?
        
         | tjoff wrote:
         | Python just works... Every now and then I'm still bitten by the
         | 2->3 transition. And I'd very much not like it to be a required
         | dependency in my systems either.
         | 
         | And no, regexps get a bad rep but they are for the easy 99% and
         | insanely quick to come up with. Learn basic syntax and you'll
         | be thankful for decades to come.
        
           | kadoban wrote:
           | Learn basic syntax and you'll spend the next few decades
           | wondering each time _which_ basic syntax is expected because
           | none of them ever say.
        
             | metalliqaz wrote:
             | I know the basic syntax and I never have that problem.
             | Perhaps you mean the advanced syntax? However I don't see
             | the problem, there either. I usually don't need it, and to
             | be honest, when I do, I find it more maintainable to use
             | multiple simpler expressions combined with some
             | programming.
        
               | pstoll wrote:
               | You must not have suffered enough, I mean used regexs
               | widely enough.
               | 
               | He meant "syntax" in the sense that different regex
               | engines have different syntax and capabilities - can I do
               | a negative look ahead assertion in engines Z, how do I do
               | a zero width lookaround in pcre, gnu, python, posix, etc.
               | 
               | Depending how far down the rabbit hole you want to go,
               | start here:
               | 
               | https://swtch.com/~rsc/regexp/regexp1.html
        
               | metalliqaz wrote:
               | I just don't count those cases as 'basic' syntax.
        
               | kadoban wrote:
               | Some by "regex" mean "globs" and you can just use "*" and
               | "?" for stand-ins for some number of characters.
               | 
               | Some allow "|", some allow backrefs, some allow "()", or
               | require them escaped with \, or allow them but not with *
               | after. Some are case insensitive, some not. Some allow
               | "{0-5}", some allow "[0-9]", some have handy things like
               | "\w".
               | 
               | It's just the guessing game of exactly what they want. It
               | should be required that each regex box has an example
               | next to it using as many allowed features as possible.
        
               | Semaphor wrote:
               | > Some by "regex" mean "globs" and you can just use "*"
               | and "?" for stand-ins for some number of characters.
               | 
               | What kind of scary tool is that?
        
               | RHSeeger wrote:
               | Glob is used in most shells, plus in several languages
               | (Tcl has both glob and regexp matching).
        
               | Semaphor wrote:
               | Yeah, but they don't call it "regex" which would be the
               | scary part.
        
               | RHSeeger wrote:
               | Ah, I interpreted the original statement as
               | 
               | > Some [people] by "regex" mean "globs"
        
               | _jal wrote:
               | Look for 'pcre' on the label. Accept no substitutes and
               | you'll be fine.
        
               | hexane360 wrote:
               | *almost always: https://blog.cloudflare.com/details-of-
               | the-cloudflare-outage...
               | 
               | >The Lua WAF uses PCRE internally and it uses
               | backtracking for matching and has no mechanism to protect
               | against a runaway expression.
               | 
               | https://blog.cloudflare.com/making-the-waf-40-faster/
               | 
               | >Back in July 2019, the WAF transitioned from using a
               | regular expression engine based on PCRE to one inspired
               | by RE2, which is based around using a deterministic
               | finite automaton (DFA) instead of backtracking
               | algorithms. This change came as a result of an outage
               | where an update added a regular expression which
               | backtracked enormously on certain HTTP requests,
               | resulting in exponential execution time.
               | 
               | >After the migration was finished, we saw no measurable
               | difference in CPU consumption at the edge, but noticed
               | execution time outliers in the 95th and 99th percentiles
               | decreased, something we expected given RE2's guarantees
               | of a linear time execution with the size of the input.
        
               | RHSeeger wrote:
               | > Some by "regex" mean "globs" and you can just use "*"
               | and "?" for stand-ins for some number of characters.
               | 
               | But that's _not_ regexp, it's glob; a totally different
               | pattern matching system. I mean, you can call a duck a
               | horse, but that doesn't mean you're right... or that
               | there's anything confusing about horses.
        
               | aasasd wrote:
               | Quick: which programs use '|' and which use '\|'?
        
         | oldsecondhand wrote:
         | You can record and edit beanshell macros in JEdit. It does
         | exactly what you described but you also have access to all the
         | Java Standard Library (including regex).
        
         | badsectoracula wrote:
         | I've needed such functionality often enough that i wrote a
         | small utility[0] that uses my LIL scripting language[1]
         | (similar to Tcl) to process some text and have its output. A
         | couple of examples are in the shots [2][3] (note that as this
         | is written in Lazarus it also works under Linux and Mac too,
         | though the site only has a Windows executable).
         | 
         | Before that i used to write scripts or even full programs (in
         | Free Pascal which has some simple string handling) to do
         | similar processing and i did find it much more cumbersome to go
         | through that route.
         | 
         | Of course this only works for editing/generated/transforming
         | text pieces (and i pretty much always use it via clipboard),
         | for processing files i still end up writing full scripts or
         | programs (depending on the case, if i need to preprocess stuff
         | i use another LIL-based tool, lip[4]).
         | 
         | [0] http://runtimeterror.com/tools/liteproc/
         | 
         | [1] http://runtimeterror.com/tech/lil/
         | 
         | [2] http://runtimeterror.com/tools/liteproc/shot.png
         | 
         | [3] http://runtimeterror.com/tools/liteproc/shot2.png
         | 
         | [4] http://runtimeterror.com/tools/lip/
        
           | throwanem wrote:
           | I used to have a Perl script, bound to a key chord, that
           | would pop a dialog box prompting for a s/// regexp-replace
           | expression and apply it to the clipboard contents.
           | 
           | Never got around to making it properly interactive, because I
           | stopped needing it when I switched to Emacs and had both its
           | native capabilities and C-u M-| available.
        
         | 29athrowaway wrote:
         | How do you sandbox python?
        
         | aasasd wrote:
         | If anyone whose life line says 'makes an editor' is reading the
         | above: I beg you to use Lua instead of Python. It's _a lot_
         | faster, and that matters in productivity apps. Speed can be the
         | difference between 'I wrote a script' and 'I could write a
         | script but didn 't'.
        
         | vidarh wrote:
         | I wouldn't be happy with Python, but I do agree with the
         | general premise that extensibility or automation ought to be
         | expected of editors.
         | 
         | My own editor is written in Ruby, and so all extension is done
         | by loading Ruby code into the running process, and I can drop
         | into the Pry debugger with a keypress, or another keypress
         | gives me a prompt to enter a single-line expression instead.
         | The latter is literally a one-line method. Adding a binding to
         | eval() a whole buffer would be equally trivial... Being able to
         | extend everything trivially in a language I'm comfortable with
         | (so not Emacs lisp) makes such a difference to usability.
         | 
         | Incidentally, the ability to interact with the open buffers
         | using a script also from _outside_ the editor is another thing
         | I love as an extension mechanism for editors - an idea I first
         | saw in FrexxEd (co-written by the founder of Curl) for the
         | Amiga, which exposed the open buffers in the filesystem (think
         | the Amiga equivalent of a FUSE filesystem), which would have
         | the added benefit of not being language specific. It doesn 't
         | need to involve any FUSE-like stuff either - just a command
         | line utility to "cat" an open buffer and to replace the open
         | buffer from stdin would be sufficient.
        
           | virchau13 wrote:
           | Out of curiosity, what do you actually use the Ruby
           | liveloading feature for? I use Neovim, which has a similar
           | (but less powerful) feature that allows one to live-execute
           | Lua code (or Vimscript, I suppose). But I almost never use it
           | outside of testing code for my configuration, because the
           | rest of the editor's feature set suffices fairly well for
           | text editing.
        
             | coliveira wrote:
             | Vim can run Ruby scripts, so it also provides this feature.
        
             | vidarh wrote:
             | I don't very often load code "live" other than on starting
             | a client in the form of "macros" which are pretty much
             | stuff I don't want to add into the editor core because it
             | e.g. might be short lived or is just too esoteric. Even
             | though my editor is mostly for myself, imagine anything you
             | might put in your .vimrc, except it's only self-discipline
             | that separates this from core editor code.
             | 
             | E.g. I have functions in there to insert headers in my
             | journal for example.
             | 
             | In terms of executing code "actually" live, it's ~80% for
             | debugging when I work on the editor itself. Pry provides
             | all of the plumbing so all the code needed to add that is
             | just to suspend the editors own input handling and call
             | pry, coupled with an exception handler that calls pry as
             | well, so there was no reason not to, basically.
             | 
             | But once I'd added it, the 20% left felt worthwhile as a
             | means of e.g. do complex searches, or generate tables or
             | otherwise do search and replace of content that requires
             | more (e.g. parsing timestamps and replacing them with
             | another format....) and any number of things I don't do
             | very often but that feels very comfortable when you do need
             | it. To be clear it's not like it does much you can't do
             | easily without it. E.g. after all I could just dump the
             | buffer to a file, load it in my repl of choice, manipulate
             | it and write it out again and reload it in my editor. It's
             | one of those small things that feels unimportant when you
             | don't have it there, that doesn't save you a huge amount of
             | time, but that just makes things feel nicer when you get
             | used to them.
        
       | vermilingua wrote:
       | I was under the impression that RegEx was not turing complete. Is
       | there something special about N++'s regex engine that allows
       | this?
        
         | detaro wrote:
         | No, nothing special about N++ (well, lookahead, but many regex
         | engines have that). Repeated Search+Replace is the key, and not
         | part of regex.
        
           | dspillett wrote:
           | I find myself wondering if anyone has done a proper analysis
           | to prove that human activity is Turing complete!
        
             | dTal wrote:
             | It is so, trivially; humans wrote the find/replace box in
             | Notepad++.
        
               | teawrecks wrote:
               | I mean, I can write a FSM that outputs the code for
               | find/replace in Notepad++.
        
             | assbuttbuttass wrote:
             | Turing's original idea of a "computer" was a human
             | manipulating symbols on paper, so I would say trivially yes
        
             | tartoran wrote:
             | We are guaranteed to halt, e.g lifespan. After death cells
             | enter apoptosis stage where cells do final stages upon
             | shutting down. I'd say it appears we are Turing complete,
             | or at least it appears so from this angle
        
               | skykooler wrote:
               | Given that Turing machines cannot be guaranteed to halt
               | (and it's trivial to write an infinitely-running Turing
               | machine), therefore humans are not, technically, Turing-
               | complete.
        
               | chhickman wrote:
               | Well, I would say that any physical implementation of a
               | Turing machine will eventually halt because you cannot
               | guarantee an infinite energy supply. So if we're willing
               | to ignore physical limitations then humans can be Turing
               | complete - if not, then no machine be either.
        
               | walnutclosefarm wrote:
               | Turing completeness is a theoretical construct that
               | ignores that errors in execution inevitably occur during
               | a sufficiently long program, due to the second law of
               | thermodynamics (entropy must increase in a closed
               | system). Any realization of a Turing complete engine
               | eventually fails. That's different than halting, though,
               | because it doesn't answer the question as to whether the
               | program of the machine eventually reaches a halt
               | instruction, when properly executed.
        
               | [deleted]
        
               | thehappypm wrote:
               | How about a human population?
        
               | tartoran wrote:
               | Oh, that too though we're looking at it from a different
               | angle. Most species so far have gone extinct though we're
               | a special kind, we may as well cause that without any
               | external factors.
        
             | vidarh wrote:
             | We can manually execute the steps of a Turing machine.
             | That's all that's needed.
        
           | vermilingua wrote:
           | Aaah of course, thanks
        
           | lupire wrote:
           | So, Notepad++ basic interface is Turing complete because it's
           | a place you can write a Turing machine and execute it.
        
           | lifthrasiir wrote:
           | And if you allow multiple pattern-replacement pairs, you
           | don't even need regex (the pattern can be a single fixed
           | string) as it is enough to simulate semi-Thue systems [1].
           | 
           | [1] https://en.wikipedia.org/wiki/Semi-Thue_system
        
         | [deleted]
        
         | Banana699 wrote:
         | Regular Expressions as defined mathematically is strictly less
         | powerful than turing machines. (by several levels, they are
         | strictly less powerful than general context-free grammars,
         | which are strictly less powerful than general context-sensitive
         | grammars, which are strictly less powerful than arbitary
         | grammars)
         | 
         | The 'regexes' in modern languages and tools, however, are not
         | actually regular in the original mathematical sense, they have
         | been augmented by several constructs that are not regular. Perl
         | is the leader in this field, its latest addition is regexes
         | which can reference itself recursively, making (presumably,
         | never seen a proof) it at least context-free. Here, the non-
         | regular constructs used is capture groups and arbitary forward
         | lookahead.
         | 
         | In addition to that, the author is doing something sneaky by
         | making the user press a button continuesly to advance the state
         | of the turing machine, so it's not actually search+replace that
         | is turing complete, it's search+replace+"repeatedly pressing
         | replace all". Unlike what some other people say in this thread,
         | repeatedly pressing a button to simulate the machine doesn't
         | count as 'turing complete' and is not comparable to plugging
         | the machine in power.
        
           | notahacker wrote:
           | Seems like you could use automated hardware to repeatedly
           | press the button to facilitate the computation without any
           | additional software or human interaction required (I'm
           | reminded of the episode of the Simpsons where Homer tries to
           | automate his job with a nodding bird!)
        
         | ptspts wrote:
         | It is proven that Turing machines can recognize a wider class
         | of languages than Chomsky regular expressions. However, the
         | article uses something more powerful than Chomsky regular
         | expressions, because they contain backreferences (as \2 and
         | \4), and also there is a repeated search-and-replace involved,
         | which also adds to their power.
        
           | louhike wrote:
           | Do you mean Chomsky hierarchy by Chomsky regex?
        
             | nsajko wrote:
             | Regular languages are an element included in Chomsky's
             | hierarchy: https://en.wikipedia.org/wiki/Chomsky_hierarchy
             | https://en.wikipedia.org/wiki/Formal_language
        
               | [deleted]
        
         | baryphonic wrote:
         | It seems Notepad++'s "search + replace all" feature replaces
         | text in place inside a search context rather than buffering the
         | changes and applying them outside a search context after the
         | search is complete. This makes search recursive.
         | 
         | When combined with enhanced regexes (backreferences and
         | lookahead), you have the ingredients for Turing completeness.
        
       | tromp wrote:
       | Cool demonstration of regular expression power! Note that as
       | implemented, this is a space bounded Turing Machine with a
       | hardcoded tape length limit:
       | 
       | > With the current implementation, the read/write head looks at
       | the 22nd element of the tape whenever we want to read the current
       | position. With complicated machines that utilize long lengths of
       | the tape, you would need to increase this number so that you
       | never delete a necessary tape element as you scroll along it.
       | This can be done by increasing the {20} that appears at the
       | beginning of the expression.
       | 
       | Embedding the tape head pointer ^ within the tape, just to the
       | left of the scanned bit, should remove this restriction.
        
       | NextHendrix wrote:
       | So what you're saying is I could use this to parse HTML?
        
       | dom111 wrote:
       | If you like this, you might also like Retina[0]!
       | 
       | [0]: https://github.com/m-ender/retina/tree/master/Examples
        
       | armchairhacker wrote:
       | Ah yes, regular expressions, my favorite Turing-complete
       | language.
        
       | JadeNB wrote:
       | I made a harder-to-understand investigation using Perl (non-
       | regular, of course) regexps a while back.
       | 
       | https://perlmonks.org/?node_id=809842
        
       | lupire wrote:
       | If you like this, you'll love Save Endo, one of the top two most
       | fun ICFP Programming Contests ever. (The other is Cult of the
       | Bound Variable)
       | 
       | https://save-endo.cs.uu.nl/
        
       | DeathArrow wrote:
       | Great, now someone should write a Javascript interpreter for it!
        
       | wjd2030 wrote:
       | "Yeah, but your scientists were so preoccupied with whether or
       | not they could, they didn't stop to think if they should" - Dr.
       | Ian Malcolm.
        
         | wjd2030 wrote:
         | jokes, this is neat.
        
       | fouronnes3 wrote:
       | Nice one to add to the list of accidentally Turing complete
       | systems. What's your favourite :)?
        
         | teawrecks wrote:
         | I wouldn't say that it's on accident. The theoretical
         | definition of Regular Expressions is specifically not Turing
         | complete. But Regex as a tool just isn't as useful in that
         | form, so it was deliberately extended to be Turing complete
         | with new operators.
        
         | Pompidou wrote:
         | A bit off topic but in the french internet there is an add,
         | these days, asking people to try the Turing test to find a
         | (good) job. I find it very funny. Imagine a job interview where
         | you have to be turing-complient to get the job : you play the
         | tape and the recruiter act like a scanning device...
        
         | giu wrote:
         | Still one of my favorites: On The Turing Completeness of
         | PowerPoint, https://www.youtube.com/watch?v=uNjxe8ShM-8
        
       | unixhero wrote:
       | So it can play Towers Of Hanoi?
        
         | tromp wrote:
         | Certainly; that's a space bounded computation, only needing
         | O(n) tape for n disks.
        
           | drexlspivey wrote:
           | But will it run Doom?
        
       | harpiaharpyja wrote:
       | The instruction sets are the remaining lines, formatted like
       | >C.I:WMN        C current instruction name.        I the input
       | from the current tape position. Either 0 or 1. Each instruction
       | has         execution parameters for both inputs.        W the
       | output to be written to the tape at the current position. Either
       | 0 or 1.        M the movement of the read/write head. A 0 moves
       | the head one position to the         left. A 1 moves it to the
       | right.        N the name of the next instruction to be executed
       | at the new tape position.
       | 
       | Would it have been clearer to use "<" or ">" to specify whether
       | to move left/right from the current position?
        
         | 0xTJ wrote:
         | As it's structured, that wouldn't work. To move, it replaces
         | the leading zero of the tape with:                 * If the
         | movement is "0": "00"       * If the movement is "1": ""
         | 
         | The choice of 1 to move right is arbitrary (but makes sense to
         | match the 0), it could be any other symbol, but the 0 is needed
         | to because that's what's prepended to the tape.
         | 
         | Someone could probably find a way around it, but it would
         | likely make it harder to understand.
        
       ___________________________________________________________________
       (page generated 2022-02-22 23:00 UTC)