[HN Gopher] Robot Game: Comparing 6502 C, Assembly, and Forth
       ___________________________________________________________________
        
       Robot Game: Comparing 6502 C, Assembly, and Forth
        
       Author : druzyek
       Score  : 73 points
       Date   : 2020-07-12 15:47 UTC (7 hours ago)
        
 (HTM) web link (calc6502.com)
 (TXT) w3m dump (calc6502.com)
        
       | siraben wrote:
       | This nicely provides empirical evidence on the slowdown of using
       | Forth, which seems to be around 10 to 20 times slower than
       | optimized assembly. I was initially surprised by how poorly Forth
       | performed on all counts of speed, memory usage and development
       | time. For something as complex as a game the lack of a type
       | system and postfix nature make Forth quite unsuitable.
       | 
       | When I use Forth, it's often for high-level applications[0] with
       | the performance critical words written in assembly. So this keeps
       | the complexity relatively low as opposed to writing the whole
       | thing in Forth.
       | 
       | [0]
       | https://github.com/siraben/zkeme80/blob/master/src/bootstrap...
        
         | asguy wrote:
         | If you program Forth like it's C, and then use a non-optimizing
         | compiler, you're going to have a bad time (e.g. 10/20x
         | slowdowns).
         | 
         | I've written Forth compilers that generated output that was
         | ~25% slower than optimized assembly, without putting that much
         | work into it. There are companies that will sell you a much
         | better compiler [0], at prices even approachable to hobbyists.
         | 
         | [0] - https://www.mpeforth.com/software/pc-systems/vfx-forth-
         | for-l...
        
         | coliveira wrote:
         | I think the writer's use case gets no benefit from Forth. He
         | already has some code and is trying to translate and optimize
         | it. In this case, none of the advantages of Forth matter
         | because he is not doing the development interactively. He was
         | also tied to an existing code architecture that is alien to
         | Forth. A Forth programmer starting from scratch would certainly
         | design things in a way that makes more sense to Forth. Instead,
         | his design clearly benefits a C implementation and goes against
         | the way Forth code is structured.
        
           | druzyek wrote:
           | OP here. This is a really interesting point. My strategy was
           | to keep the highest level functions/words like DrawTile or
           | DrawMenu while writing the underlying Forth and subwords from
           | scratch focusing on performance. Do you have any suggestions
           | that would make "more sense to Forth" as you say?
        
             | astrobe_ wrote:
             | Although it does not directly help, it gives the general
             | idea:
             | 
             | http://www.ultratechnology.com/levels.htm
        
             | asguy wrote:
             | Not the GP, but I'd recommend the latest edition of
             | _Thinking Forth_[0]; there's a PDF but the soft-cover is
             | worth picking up. It helped change the way I think about
             | Forth (hah), and really, programming in general.
             | 
             | [0] - http://thinking-forth.sourceforge.net/
        
         | zozbot234 wrote:
         | Yes, FORTH is an elegant alternative to BASIC on very small
         | systems, not something you should expect high performance from.
        
           | MaxBarraclough wrote:
           | To mirror asguy's comment, this only tells us about the
           | performance of this particular FORTH engine, it doesn't tell
           | us about the performance inherent to FORTH (which of course
           | doesn't exist, it's down to the engine).
           | 
           | Whether anyone has made a sophisticated optimising FORTH for
           | 6502, I don't know.
        
             | Dr_Jefyll wrote:
             | >this only tells us about the performance of this
             | particular FORTH engine
             | 
             | Yes, exactly. Whatever its other qualities may be, I
             | suspect this particular Forth has overlooked some pretty
             | obvious low hanging fruit, performance-wise. In this[1]
             | post we learn that LOOP puts a 1 on the stack then falls
             | into +LOOP. Although there's elegance (and a memory saving)
             | to that approach, I'm startled that they didn't provide a
             | dedicated definition for LOOP instead. AIUI, implementing
             | LOOP as an instance of +LOOP substantially and needlessly
             | increases the complexity of what gets executed. Yes, I know
             | premature optimization should be viewed with suspicion, but
             | if profiling were performed it's hard to believe LOOP
             | _wouldn't_ be a hot spot! So, I constructively suggest
             | that in this respect at least (and perhaps there are
             | others) this Forth engine could benefit from some tuning
             | up.
             | 
             | [1] http://forum.6502.org/viewtopic.php?p=76849#p76849
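             | 
             | To make the cost difference concrete, here is a rough C
             | sketch. It is not the code of the Forth discussed in [1];
             | LoopFrame, step_loop, step_plus_loop and the two counting
             | helpers are hypothetical names, and a real 6502 Forth keeps
             | index and limit as 16-bit cells on the return stack. The
             | dedicated LOOP needs one increment and one equality test,
             | while the general +LOOP has to take a step from the data
             | stack and detect crossing the limit in either direction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical loop frame; a real Forth keeps these cells on the return stack. */
typedef struct {
    int index;
    int limit;
} LoopFrame;

/* Dedicated LOOP: one increment, one equality test.
 * Returns true when the loop should exit. */
static bool step_loop(LoopFrame *f) {
    f->index += 1;
    return f->index == f->limit;
}

/* General +LOOP: the step n may be negative, so the exit test must detect
 * that the index crossed the boundary between limit-1 and limit in either
 * direction. Returns true when the loop should exit. */
static bool step_plus_loop(LoopFrame *f, int n) {
    int before = f->index - f->limit;
    f->index += n;
    int after = f->index - f->limit;
    return (before < 0) != (after < 0);
}

/* Number of body executions of "limit start DO ... LOOP". */
static int count_loop(int start, int limit) {
    assert(start < limit);  /* avoid the wrap-around case in this sketch */
    LoopFrame f = { start, limit };
    int iterations = 0;
    do {
        iterations++;       /* the loop body would run here */
    } while (!step_loop(&f));
    return iterations;
}

/* Number of body executions of "limit start DO ... n +LOOP". */
static int count_plus_loop(int start, int limit, int n) {
    /* the step must move the index toward the boundary */
    assert(n != 0 && ((n > 0) ? (start < limit) : (start >= limit)));
    LoopFrame f = { start, limit };
    int iterations = 0;
    do {
        iterations++;       /* the loop body would run here */
    } while (!step_plus_loop(&f, n));
    return iterations;
}
```

             | With a step of 1 both helpers run the body the same number
             | of times, but step_plus_loop does two subtractions, an
             | addition and a sign comparison per pass where step_loop
             | does an increment and a compare. On a 6502 each of those
             | is a multi-instruction 16-bit operation, which is why I'd
             | expect the dedicated version to pay off.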
        
       | orphean wrote:
       | Interesting that OP ignored ca65 and went to NASM for the second
       | assembler option, especially since they were already using cc65.
       | 
       | Nice use of High Level Assembly with the macros!
        
       | Marazan wrote:
       | I don't know Forth on the 6502 but I think the statement that STC
       | is the fastest form of Forth is not always true. But it will lead
       | to larger programs.
        
       | Koshkin wrote:
       | Unfortunately, it seems that for all practical purposes, and as
       | much as we love it, 6502 in all its (still existing) incarnations
       | is all but dead. Even MIPS has been dying, albeit slowly and
       | painfully. The unpleasant truth, as we all now know, is that
       | ARM is currently the only sane choice for a hobbyist project
       | (see, for example, Color Maximite 2), and we can only hope that
       | RISC-V will someday replace it in the hands of CPU aficionados,
       | but in the times when for a few pennies we can have an 8-pin ARM
       | microcontroller, this is a no-brainer. (ESP32, being quite
       | popular, due to its highly integrated nature and the focus on IoT
       | is a somewhat different proposition.)
        
         | the_af wrote:
         | I suppose it depends on the kind of hobbyist. If you want to
         | learn how to code for the 6502, for nostalgia's sake -- a
         | powerful motivator for many hobbyists -- then learning to code
         | for ARM will not scratch that itch.
         | 
         | You can always emulate the 6502, it doesn't have to be the real
         | hardware.
        
         | LIV2 wrote:
         | "What is sane for a hobbyist project" depends on the project
         | and the wills of the designer.
         | 
         | I've built a couple of 6502 based systems now and added VGA
         | output and floppy support which probably would've been easier
         | with an ESP32 or something but my goal was to learn how to do
         | this the hard way.
        
         | chongli wrote:
         | A big part of the appeal of the 6502 is with the retro
         | community. If your goal is to write an NES game (and there's
         | quite an active community doing just that) then you have no
         | other choice than to learn the 6502.
         | 
         | That's fine though. The 6502 is a classic CISC processor with a
         | fairly high level instruction set. This makes it fun for humans
         | to program. I think where people get into trouble is when they
         | try to bring up these complicated tool chains and high level
         | languages. That's a mistake. Program the 6502 directly in
         | assembly with a nice macro assembler. It's the best way to go.
        
       | albertzeyer wrote:
       | It's kind of a funny coincidence that I also wrote a very similar
       | looking game which I also called "Robot Game" at some point, and
       | I also rewrote it at multiple points in time in different
       | languages:
       | 
       | * Starting in Visual Basic 3 in 1999
       | (http://www.az2000.de/projects/robot/),
       | 
       | * then later in Object Pascal / Lazarus in 2006
       | (http://www.az2000.de/projects/robot2/),
       | 
       | * and recently in Python in 2017
       | (https://github.com/albertz/PyOverheadGame).
        
       | astrobe_ wrote:
       | > This brings us to one of the main shortcomings of Forth. If you
       | need to access more than three local variables at once (in the
       | body of a loop for example) there is just no convenient way to do
       | so. [...] Consider the variables used in this C function that
       | draws 1 bit-per-pixel images with optional rounded corners:
       | > int DrawTile1bpp
       | 
       | Looking at the C version, the argument "tile" is used once in the
       | function, to get the pointer to the tile. The pointer could be
       | passed directly: one less local. x and y are passed only to
       | calculate an address, this address could be passed directly -
       | there are only three calls to this function, it might be worth
       | the DRY exception.
       | 
       | t0 seems to be always 0, except when it is undefined.
       | 
       | t_height and t_width are just offsets in the tile structure.
       | These locals can go away too - eating the cost of fetching them
       | each time is probably cheaper than all the stack juggling done
       | to avoid it. This reminds me of a saying from a chess
       | grandmaster: "don't waste time trying to save one".
       | 
       | trans_row only exists to set skip_pixel, it seems one of them can
       | go away. The logic seems to be "unless trans_row and something,
       | do something". The C version might actually be more verbose than
       | necessary.
       | 
       | Finally, edge_style complicates the logic a lot. Using one
       | function to do different things because it is so simple to "just
       | add another parameter" is typical in many HLLs, and often results
       | in awful spaghetti code.
       | 
       | What a Forth programmer would do is to write one function for
       | each edge style and see what they have in common. More often than
       | not, they share a lot of subwords, so that having one function
       | for each case is not more expensive than one function for all
       | cases.
       | 
       | I suggest doing that with the C version and see what happens.
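       | 
       | A minimal sketch of what I mean, in C. This is not the OP's
       | DrawTile1bpp: the 8-pixel-wide tile layout, the one-byte-per-row
       | destination, and the names blit_row, blit_rows, draw_tile_square
       | and draw_tile_rounded are all assumptions made for illustration.
       | The callers pass the tile pointer and destination address
       | directly, and each edge style gets its own small function built
       | from shared helpers instead of an edge_style parameter.

```c
#include <stdint.h>

enum { TILE_HEIGHT = 8 };  /* assumed tile size: 8 rows of 8 pixels */

/* Shared helper ("subword"): OR one 8-pixel row into the destination,
 * keeping only the pixels selected by mask. */
static void blit_row(uint8_t *dst, uint8_t bits, uint8_t mask) {
    *dst = (uint8_t)(*dst | (bits & mask));
}

/* Shared helper: blit count full rows. */
static void blit_rows(uint8_t *dst, const uint8_t *src, int count) {
    for (int i = 0; i < count; i++) {
        blit_row(&dst[i], src[i], 0xFF);
    }
}

/* Square edges: every row is drawn in full. */
static void draw_tile_square(uint8_t *dst, const uint8_t *tile) {
    blit_rows(dst, tile, TILE_HEIGHT);
}

/* Rounded edges: drop the outermost pixel on each side of the first and
 * last row, draw the middle rows in full. */
static void draw_tile_rounded(uint8_t *dst, const uint8_t *tile) {
    blit_row(&dst[0], tile[0], 0x7E);
    blit_rows(&dst[1], &tile[1], TILE_HEIGHT - 2);
    blit_row(&dst[TILE_HEIGHT - 1], tile[TILE_HEIGHT - 1], 0x7E);
}
```

       | Neither drawing function needs a conditional on the edge style,
       | and each one only touches two values: a source pointer and a
       | destination pointer. That is the shape that maps well onto a
       | two-item stack.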
       | 
       | > Without parenthesis or commas, there is no way to tell just by
       | looking at the code which of those are functions, which ones are
       | variables, which of them are inputs to the others, or what any of
       | them return. Forth fans will say that that information is
       | available in the stack comment for the word's declaration, but
       | that assumes that the programmer bothered to create one. Even if
       | they did, you have to search in several places in the file to
       | figure out what is instantly obvious if the same code were
       | written in a language like C or Python. It's easy to see why
       | people criticize Forth for being "write only." If you translated
       | the line above into C, it could be doing any of the following:
       | 
       | Actually even in C some follow certain naming conventions - like
       | #defines being all caps, member things being prefixed by m_, etc.
       | And you can do that in Forth too. Once again, "write only" has
       | more to do with the author than with the language.
        
       | compiler-guy wrote:
       | The 6502 is still doing hundreds of millions of units a year, and
       | is still a terrific introductory processor that is comparatively
       | easy to wrap your head around, especially if you are interested
       | in board bring up from scratch. (Compare the requirements to boot
       | a 68000 to a 6502 into a nop loop, for example.)
       | 
       | The 6502 is far from dead.
        
         | chris_st wrote:
         | Glad to hear it! I first learned assembly for 6502 on the Apple
         | ][, and found it indeed easy to wrap my head around. Great
         | machine, great architecture, TED II Editor/assembler was
         | amazing...
        
       ___________________________________________________________________
       (page generated 2020-07-12 23:00 UTC)