[HN Gopher] Stack Based Virtual Machines (2015) ___________________________________________________________________ Stack Based Virtual Machines (2015) Author : signa11 Score : 50 points Date : 2020-12-27 13:12 UTC (2 days ago) (HTM) web link (andreabergia.com) (TXT) w3m dump (andreabergia.com) | wokkel wrote: | Something overlooked in this (and many other articles about stack | based VMs): if the stack based VM requires a well balanced stack | (as is the case for JVM), the bytecode can be trivially rewritten | to a register based VM: number each stack location and trace all | basic-blocks to see which stackslot they hit. Then just replace | all the push 10, pop with "move 10 to slot/register-1", "load | slot/register-1" etc. The converse is much harder, so I really | prefer to work with stack based VMs as they are more condense and | if speed is needed a linear (in function size) scan can rewrite | it to make the mapping to cpu registers much easier. | yodsanklai wrote: | > a well balanced stack | | what is a well-balanced stack? | abecedarius wrote: | They mean the stack depth is guaranteed to be statically | apparent at each point in the control flow. In the JVM the | class loader checks this with an analysis phase during | loading. | | (The JVM design makes this harder than it really needs to be, | since it doesn't insist on reducible control flow, iirc. You | could say OTOH that irreducible control flow is more valuable | than trivial analyzability, which is a reasonable take, but | conflicts with the above comment's design point of a stack | machine being both very simple for an interpreter and not an | obstacle to quickly compiling to decent register-machine | code. Needing a fancier stack-depth analysis would be that | sort of obstacle.) | RodgerTheGreat wrote: | The stack is balanced if everything pushed at a given level | of scope is popped as that scope closes. | | This precludes variable-stack-effect sequences as one might | have in a Forth definition like: : ?drop | dup if drop then ; | ufo wrote: | > Closely related to stack based VM are register based virtual | machines. They are also interpreters of bytecode, but their | design is quite different, since they don't use the stack for the | operands but rather a set of registers. | | This is a common source of confusion but register based VMs also | use a stack. | | In a stack based VM, the arguments and results to all operations | are implicitly pushed and popped from the top of the stack. For | example, the instructions to add 10 plus 20 would look similar to | this: PUSH 10 PUSH 20 ADD | | In a register based VM, the arguments and results are stored in | virtual registers. For example, to do the same as before the | instructions might look more like the following: | STORE R1 10 STORE R2 20 ADD R3 R1 R2 | | However, those registers are also part of the stack! The way it | works is that the interpreter maintains a stack of values, just | as in a stack based interpreter. When a function starts running, | it grows the stack by a fixed amount, allocating an "activation | record". When it executes, the registers in the instructions | refer to slots in this activation record, which is part of the | stack. It's not like in physical computer, where the registers | and the stack are stored in separate places. | | The main interpreter loop of a register based VM is actually | quite similar to the main loop of a stack based VM. The part that | is more different is the bytecode generation. | jdmichal wrote: | Follow-up question: Is the difference between a stack + locals | and registers just the instruction set? I imagine locals are | allocated to and stored on the stack, exactly like you discuss | for the registers. But in a stack VM, you can't work on them | directly. Instead you have to load them into this special other | place (the stack) and work on them there. Instruction set is | simpler but at the cost of requiring additional operations. | ufo wrote: | Yes, the main difference is the instruction set. | | One common way to design a stack-based instruction set is to | refer to the local variables using stack offsets, just as in | a register-based instruction set. The operation for reading a | local variable makes a copy of the stack slot at the given | offset and pushes it into the top of the stack, where it may | be used as the input for subsequent operations. For example, | to implement "X = X + Y" it might use the following code: | LOAD 0 (pushes the contents of X to the stack) LOAD 1 | (pushes the contents of Y to the stack) ADD (pops | two values and pushes the result) STORE 0 (pops one | value, and stores it into X) | | Meanwhile, in a register-based VM the ADD operation | manipulates the relevant stack slots directly: | ADD R0 R0 R1 (compute R0 + R1 and store in R0) | | > Instruction set is simpler but at the cost of requiring | additional operations. | | Well put. This is one of the key differences between a stack | VM and a register VM. In register VM the instructions are | larger, because they need to specify all their arguments. | However, a stack VM may need to generate more instructions to | fiddle with the top of the stack. | | If you want to read more, one good example of a stack-based | VM is the one in the Crafting Interpreters book. For a good | example of a register based VM my suggestion would be the | paper about the implementation of Lua (section 7 in | particular). | | https://craftinginterpreters.com/local-variables.html | | http://www.jucs.org/jucs_11_7/the_implementation_of_lua | titzer wrote: | > However, those registers are also part of the stack! | | This is a confusion of two kinds of stacks. There is the _call_ | stack, and the _operand_ stack. The call stack stores frames | for executing procedures (functions) and the operand stack (if | any) is used for local data flow. The operand stack, again, if | any, may be nested inside of the storage of the call stack | (eliminating one of two stack pointers), but need not | necessarily be. | CalChris wrote: | Yes, and Forth had these dual stacks, operand stack and | return stack, back in 1970. The abstract SECD machine had | dual stacks in 1964. | jermaustin1 wrote: | This reminds me of my teenage years in the 00s. I was a member of | this forum called pagemac, and what drew all of us kids together | was the love of building small esoteric languages. A lot of | brainfuck derivatives (one called stackfuck) and assembly like | languages, and bytecodes/VMs. | transfire wrote: | Beam is a register based VM that's probably used more than LUA. | HappyJoy wrote: | SQLite implements a register based VM too. | toolslive wrote: | I guess the most popular one is either the JVM or the .Net CLR | ? | pansa2 wrote: | Those are both stack-based. | toolslive wrote: | right. I was misreading this. So for a register based VM, I | would have to refer to the Parrot virtual machine then. | titzer wrote: | The JVM used an operand stack for local data flow, while | the CLR uses registers. ___________________________________________________________________ (page generated 2020-12-29 23:02 UTC)