[HN Gopher] Stack Based Virtual Machines (2015)
       ___________________________________________________________________
        
       Stack Based Virtual Machines (2015)
        
       Author : signa11
       Score  : 50 points
       Date   : 2020-12-27 13:12 UTC (2 days ago)
        
 (HTM) web link (andreabergia.com)
 (TXT) w3m dump (andreabergia.com)
        
       | wokkel wrote:
       | Something overlooked in this (and many other articles about stack
       | based VMs): if the stack based VM requires a well balanced stack
       | (as is the case for JVM), the bytecode can be trivially rewritten
       | to a register based VM: number each stack location and trace all
       | basic-blocks to see which stackslot they hit. Then just replace
       | all the push 10, pop with "move 10 to slot/register-1", "load
       | slot/register-1" etc. The converse is much harder, so I really
       | prefer to work with stack based VMs as they are more condense and
       | if speed is needed a linear (in function size) scan can rewrite
       | it to make the mapping to cpu registers much easier.
        
         | yodsanklai wrote:
         | > a well balanced stack
         | 
         | what is a well-balanced stack?
        
           | abecedarius wrote:
           | They mean the stack depth is guaranteed to be statically
           | apparent at each point in the control flow. In the JVM the
           | class loader checks this with an analysis phase during
           | loading.
           | 
           | (The JVM design makes this harder than it really needs to be,
           | since it doesn't insist on reducible control flow, iirc. You
           | could say OTOH that irreducible control flow is more valuable
           | than trivial analyzability, which is a reasonable take, but
           | conflicts with the above comment's design point of a stack
           | machine being both very simple for an interpreter and not an
           | obstacle to quickly compiling to decent register-machine
           | code. Needing a fancier stack-depth analysis would be that
           | sort of obstacle.)
        
           | RodgerTheGreat wrote:
           | The stack is balanced if everything pushed at a given level
           | of scope is popped as that scope closes.
           | 
           | This precludes variable-stack-effect sequences as one might
           | have in a Forth definition like:                   : ?drop
           | dup if drop then ;
        
       | ufo wrote:
       | > Closely related to stack based VM are register based virtual
       | machines. They are also interpreters of bytecode, but their
       | design is quite different, since they don't use the stack for the
       | operands but rather a set of registers.
       | 
       | This is a common source of confusion but register based VMs also
       | use a stack.
       | 
       | In a stack based VM, the arguments and results to all operations
       | are implicitly pushed and popped from the top of the stack. For
       | example, the instructions to add 10 plus 20 would look similar to
       | this:                   PUSH 10         PUSH 20         ADD
       | 
       | In a register based VM, the arguments and results are stored in
       | virtual registers. For example, to do the same as before the
       | instructions might look more like the following:
       | STORE R1 10         STORE R2 20         ADD R3 R1 R2
       | 
       | However, those registers are also part of the stack! The way it
       | works is that the interpreter maintains a stack of values, just
       | as in a stack based interpreter. When a function starts running,
       | it grows the stack by a fixed amount, allocating an "activation
       | record". When it executes, the registers in the instructions
       | refer to slots in this activation record, which is part of the
       | stack. It's not like in physical computer, where the registers
       | and the stack are stored in separate places.
       | 
       | The main interpreter loop of a register based VM is actually
       | quite similar to the main loop of a stack based VM. The part that
       | is more different is the bytecode generation.
        
         | jdmichal wrote:
         | Follow-up question: Is the difference between a stack + locals
         | and registers just the instruction set? I imagine locals are
         | allocated to and stored on the stack, exactly like you discuss
         | for the registers. But in a stack VM, you can't work on them
         | directly. Instead you have to load them into this special other
         | place (the stack) and work on them there. Instruction set is
         | simpler but at the cost of requiring additional operations.
        
           | ufo wrote:
           | Yes, the main difference is the instruction set.
           | 
           | One common way to design a stack-based instruction set is to
           | refer to the local variables using stack offsets, just as in
           | a register-based instruction set. The operation for reading a
           | local variable makes a copy of the stack slot at the given
           | offset and pushes it into the top of the stack, where it may
           | be used as the input for subsequent operations. For example,
           | to implement "X = X + Y" it might use the following code:
           | LOAD 0 (pushes the contents of X to the stack)        LOAD 1
           | (pushes the contents of Y to the stack)        ADD    (pops
           | two values and pushes the result)        STORE 0 (pops one
           | value, and stores it into X)
           | 
           | Meanwhile, in a register-based VM the ADD operation
           | manipulates the relevant stack slots directly:
           | ADD R0 R0 R1 (compute R0 + R1 and store in R0)
           | 
           | > Instruction set is simpler but at the cost of requiring
           | additional operations.
           | 
           | Well put. This is one of the key differences between a stack
           | VM and a register VM. In register VM the instructions are
           | larger, because they need to specify all their arguments.
           | However, a stack VM may need to generate more instructions to
           | fiddle with the top of the stack.
           | 
           | If you want to read more, one good example of a stack-based
           | VM is the one in the Crafting Interpreters book. For a good
           | example of a register based VM my suggestion would be the
           | paper about the implementation of Lua (section 7 in
           | particular).
           | 
           | https://craftinginterpreters.com/local-variables.html
           | 
           | http://www.jucs.org/jucs_11_7/the_implementation_of_lua
        
         | titzer wrote:
         | > However, those registers are also part of the stack!
         | 
         | This is a confusion of two kinds of stacks. There is the _call_
         | stack, and the _operand_ stack. The call stack stores frames
         | for executing procedures (functions) and the operand stack (if
         | any) is used for local data flow. The operand stack, again, if
         | any, may be nested inside of the storage of the call stack
         | (eliminating one of two stack pointers), but need not
         | necessarily be.
        
           | CalChris wrote:
           | Yes, and Forth had these dual stacks, operand stack and
           | return stack, back in 1970. The abstract SECD machine had
           | dual stacks in 1964.
        
       | jermaustin1 wrote:
       | This reminds me of my teenage years in the 00s. I was a member of
       | this forum called pagemac, and what drew all of us kids together
       | was the love of building small esoteric languages. A lot of
       | brainfuck derivatives (one called stackfuck) and assembly like
       | languages, and bytecodes/VMs.
        
       | transfire wrote:
       | Beam is a register based VM that's probably used more than LUA.
        
         | HappyJoy wrote:
         | SQLite implements a register based VM too.
        
         | toolslive wrote:
         | I guess the most popular one is either the JVM or the .Net CLR
         | ?
        
           | pansa2 wrote:
           | Those are both stack-based.
        
             | toolslive wrote:
             | right. I was misreading this. So for a register based VM, I
             | would have to refer to the Parrot virtual machine then.
        
             | titzer wrote:
             | The JVM used an operand stack for local data flow, while
             | the CLR uses registers.
        
       ___________________________________________________________________
       (page generated 2020-12-29 23:02 UTC)