[HN Gopher] Building the fastest Lua interpreter automatically
       ___________________________________________________________________
        
       Building the fastest Lua interpreter automatically
        
       Author : hr0m
       Score  : 83 points
       Date   : 2022-11-22 21:11 UTC (1 hour ago)
        
 (HTM) web link (sillycross.github.io)
 (TXT) w3m dump (sillycross.github.io)
        
       | tromp wrote:
       | Nice to see Haskell make an appearance in an article about Lua
       | and C/C++:
       | 
       | > For example, at LLVM IR level, it is trivial to make a function
       | use GHC calling convention (a convention with no callee-saved
       | registers)
       | 
       | This refers to the following section of [1]
       | 
       | "cc 10" - GHC convention This calling convention has been
       | implemented specifically for use by the Glasgow Haskell Compiler
       | (GHC). It passes everything in registers, going to extremes to
       | achieve this by disabling callee save registers. This calling
       | convention should not be used lightly but only for specific
       | situations such as an alternative to the register pinning
       | performance technique often used when implementing functional
       | programming languages. At the moment only X86 supports this
       | convention and it has the following limitations:
       | 
       | * On X86-32 only supports up to 4 bit type parameters. No
       |   floating-point types are supported.
       | * On X86-64 only supports up to 10 bit type parameters and 6
       |   floating-point parameters.
       | 
       | This calling convention supports tail call optimization but
       | requires both the caller and callee are using it.
       | 
       | [1] https://llvm.org/docs/LangRef.html#calling-conventions
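       | 
       | A minimal sketch, in portable C++, of the handler-per-opcode
       | shape that such a convention is suited to: each handler takes
       | the interpreter state as arguments and ends by calling the next
       | handler in tail position. Standard C++ can neither select
       | "cc 10" nor guarantee the tail call; per the quoted sentence,
       | that is done at the LLVM IR level. The opcodes and all names
       | (Insn, Dispatch, Run, ...) are hypothetical and not taken from
       | the article.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical three-opcode bytecode, for illustration only.
enum Op : std::uint8_t { kAddImm, kDouble, kHalt };

struct Insn {
    Op op;
    std::int64_t imm;  // operand, only used by kAddImm
};

// Every handler shares one signature, so the interpreter state (program
// counter and accumulator) travels as arguments instead of living in
// globals or manually pinned registers.
using Handler = std::int64_t (*)(const Insn* pc, std::int64_t acc);

std::int64_t Dispatch(const Insn* pc, std::int64_t acc);

std::int64_t HandleAddImm(const Insn* pc, std::int64_t acc) {
    return Dispatch(pc + 1, acc + pc->imm);  // call in tail position
}

std::int64_t HandleDouble(const Insn* pc, std::int64_t acc) {
    return Dispatch(pc + 1, acc * 2);  // call in tail position
}

std::int64_t HandleHalt(const Insn*, std::int64_t acc) {
    return acc;  // end of program: hand back the accumulator
}

// Look up the handler for the current opcode and jump to it.
std::int64_t Dispatch(const Insn* pc, std::int64_t acc) {
    static constexpr Handler kTable[] = {HandleAddImm, HandleDouble, HandleHalt};
    return kTable[pc->op](pc, acc);
}

// The program must end with kHalt; there is no bounds checking here.
std::int64_t Run(const std::vector<Insn>& program) {
    return Dispatch(program.data(), 0);
}
```

       | For example, Run({{kAddImm, 3}, {kDouble, 0}, {kAddImm, 1},
       | {kHalt, 0}}) computes (0 + 3) * 2 + 1. Without a guaranteed
       | tail call, each instruction may cost a stack frame, which is
       | one reason the guarantee matters for this design.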
        
       | JZL003 wrote:
       | wow this is a great article, so readable
        
       | Vt71fcAqt7 wrote:
       | Can this be done for javascript?
        
         | Ericson2314 wrote:
         | Yes
        
       | presheaf wrote:
       | This is very cool.
        
       | abecedarius wrote:
       | There's some prior art on DSLs for efficient interpreters, like
       | Anton Ertl's vmgen.
       | https://www.complang.tuwien.ac.at/anton/vmgen/
        
         | touisteur wrote:
         | Nice, do you know whether this DSL was ever used successfully
         | for other endeavours (abstract interpretation, lowering to
         | Why3, any kind of symbolic execution, or a language server)?
         | I've been looking for a 'flex/bison with complete semantics'
         | and it might be a piece of the puzzle.
        
       | dingdingdang wrote:
       | Very very impressive.
        
       | ufo wrote:
       | Fascinating! I wonder if there's a way to keep this 100% inside
       | the C ecosystem, without having to reach for an LLVM
       | dependency...
        
         | IshKebab wrote:
         | No need to wonder - the article clearly explains why there
         | isn't. That's why they made this tool in the first place!
        
         | runevault wrote:
         | In general, probably: you'd have to replace the part where he
         | writes the LLVM IR directly with, say, GCC's IR. If you mean
         | no IR at all, it doesn't sound possible, based on the part
         | about replacing the assembly from LuaJIT.
        
           | kryptiskt wrote:
           | GCC is written in C++ these days, so something like
           | QBE (https://c9x.me/compile/) would be needed.
        
       | gatane wrote:
       | >More importantly, it is the world's fastest Lua interpreter to
       | date, outperforming LuaJIT's interpreter by 28% and the official
       | Lua interpreter by 171% on average on a variety of benchmarks
       | 
       | Wait what the hell
        
         | hinkley wrote:
         | Faster than the LuaJIT's _interpreter_.
         | 
         | People who focus on JIT often focus on the JIT and not the
         | interpreter. Which is a shame because if you make the uncommon
         | paths cheaper then you can tune your code for the hot paths a
         | bit more aggressively. You get fewer situations where you are
         | sacrificing Best Case for Average Case performance.
        
       | ufo wrote:
       | One caveat to pay attention to... Ever since LuaJIT and PUC-Lua
       | diverged, PUC-Lua has made some significant changes to the
       | garbage collector. I wonder if that might affect the comparison
       | vs LuaJIT for memory-intensive benchmarks such as binarytrees and
       | tablesort.
        
       | gary_0 wrote:
       | I think there's a typo where the Add() example code goes
       | `lhs.As<tDouble>() && rhs.As<tDouble>()`. I'm assuming that
       | should be a `+`?
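       | 
       | A small sketch of why the operator matters, assuming
       | As<tDouble>() yields a plain double. The TValue below is a
       | hypothetical stand-in, not the article's class: with `&&` both
       | operands are converted to bool, so the "sum" can only be 0 or 1.

```cpp
// Hypothetical type tag; only the name tDouble comes from the thread.
struct tDouble { using Type = double; };

// Hypothetical stand-in for a boxed value that only ever holds a double.
class TValue {
public:
    explicit TValue(double d) : d_(d) {}

    // Unbox as the C++ type named by the tag.
    template <typename Tag>
    typename Tag::Type As() const { return d_; }

private:
    double d_;
};

// As quoted in the comment: `&&` converts both doubles to bool, and the
// resulting bool is converted back to double (0.0 or 1.0).
double AddAsWritten(TValue lhs, TValue rhs) {
    return lhs.As<tDouble>() && rhs.As<tDouble>();
}

// With `+`, the two unboxed doubles are actually added.
double AddAsIntended(TValue lhs, TValue rhs) {
    return lhs.As<tDouble>() + rhs.As<tDouble>();
}
```

       | With operands 1.5 and 2.5, the `&&` version returns 1.0 and the
       | `+` version returns 4.0.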
        
       ___________________________________________________________________
       (page generated 2022-11-22 23:00 UTC)