hngopher.com

       [HN Gopher] What is a flop?
       ___________________________________________________________________
        
       What is a flop?
        
       Author : RafelMri
       Score  : 43 points
       Date   : 2023-09-05 08:50 UTC (14 hours ago)
        
 (HTM) web link (nhigham.com)
 (TXT) w3m dump (nhigham.com)
        
       | paulddraper wrote:
       | Floating Point Operations Per
        
       | nightmonkey wrote:
       | My last two startups.
        
         | HPsquared wrote:
         | You could use it as a unit of accounting for investing losses.
         | A gigaflop would be a loss of 1 billion, etc.
        
       | yujian wrote:
       | this
        
       | jjgreen wrote:
       | I'm usually a fan of Higham, but the last few posts have been
       | weak.
        
         | liotier wrote:
         | Flops ?
        
           | jjgreen wrote:
           | Very witty Wilde ...
        
       | MattyDub wrote:
       | Can somebody explain why a square root is also considered a flop?
       | Surely that involves more work than the other four operations the
       | article listed. Is there some hardware algorithm for the square
       | root that is as fast as (e.g.) division?
        
       | AlbertCory wrote:
       | Disappointed this is not about basketball.
        
       | dragontamer wrote:
       | My personal notes on this subject:
       | 
       | * MIPS was perhaps the integer equivalent to FLOP, still used in
       | modern microcontrollers because the 8051 at 12 MHz would only
       | execute 1 MIPS (12 clocks per instruction). Modern 8051 chips
       | obviously have sped up to 1 clock per instruction, but MIPS (and
       | Dhrystone MIPS in particular) are still a common benchmark today.
       | 
       | * FLOPs is very difficult to calculate in theory because modern
       | CPUs have vector units, and multiple pipelines per core. You
       | could have 3x AVX512 instructions in parallel on today's CPUs on
       | a single core.
       | 
       | * FLOPs were traditionally a 64-bit operation for the
       | supercomputer community. Today, most FLOPs are 32-bit for video
       | games. Finally, the deep learning / neural net guys have
       | popularized 16-bit flops, and even 8-bit iops.
       | 
       | * 'The' flop is a misnomer because it's almost always the
       | multiply-and-accumulate instruction: X = A + B * C. Which... Is
       | two operations per instruction (per shader/SIMD lane). Eeehhh
       | whatever. Who cares about these details?
       | 
       | * As 'Dhrystone' is the benchmark for MIPS, the benchmark for
       | 64-bit flops is Linpack.
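
The peak-FLOPS arithmetic implied by the notes above (vector width, pipelines per core, operand size, and two operations per multiply-accumulate) can be sketched roughly as follows. This is only an illustration: the function name `peak_flops` and the example chip (8 cores, 3.0 GHz, two 512-bit FMA pipes per core) are assumptions, not figures from the thread or from any particular CPU.

```python
def peak_flops(cores, clock_hz, fma_units_per_core, vector_bits,
               element_bits, flops_per_fma=2):
    """Theoretical peak floating-point operations per second.

    Every argument is an illustrative assumption supplied by the caller.
    """
    lanes = vector_bits // element_bits  # SIMD lanes per vector instruction
    return cores * clock_hz * fma_units_per_core * lanes * flops_per_fma


# Hypothetical chip: 8 cores at 3.0 GHz, two 512-bit FMA pipes per core.
fp64 = peak_flops(8, 3.0e9, 2, 512, 64)
fp32 = peak_flops(8, 3.0e9, 2, 512, 32)

print(f"FP64 peak: {fp64 / 1e9:.0f} GFLOPS")
print(f"FP32 peak: {fp32 / 1e9:.0f} GFLOPS")
```

Halving the element size doubles the lane count, which is why the same hypothetical chip quotes twice the figure at 32 bits as at 64 bits.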
        
         | gumby wrote:
         | > MIPS was perhaps the integer equivalent to FLOP, still used
         | in modern microcontrollers because the 8051 at 12 MHz would only
         | execute 1 MIPS (12 clocks per instruction)
         | 
         | Actually the origin of this term was VAX MIPS (VAX 780
         | specifically) because that was a ubiquitous minicomputer,
         | pretty fast for its time. There were faster machines, and
         | slower mainframes still being built, but that was what the
         | late 70s were like.
         | 
         | When the 8051 was released in 1980 it surely didn't run at 12
         | MHz! Back then the Z80 sold because it could run 8080 code at a
         | blistering 2 MHz.
         | 
         | BTW the benchmark for FLOPS in those days was Whetstone, hence
         | the otherwise weird name "Dhrystone".
        
           | dragontamer wrote:
           | The 1981 manual for the 8051 contains numerous references to
           | 12 MHz.
           | 
           | http://bitsavers.informatik.uni-stuttgart.de/components/inte...
           | 
           | It was 12T clocked: even though the clock was 12 MHz, it would
           | only operate at 1 MHz / 1 MIPS, because it took 12 clock ticks
           | to even perform one addition.
           | 
           | IIRC, there was a standard crystal (11.0592 MHz crystal?? I
           | forget exactly) for the communications at the time. So going
           | just above 11 MHz (or really, just above 11.0592 MHz) was
           | needed for reliable serial comms.
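
The arithmetic behind the 12-clock machine cycle and the odd crystal frequency can be checked directly. A minimal sketch, assuming the commonly documented 8051 arrangement in which Timer 1 in 8-bit auto-reload mode clocks the UART, so that baud = f_osc / (12 * 32 * count) with SMOD = 0. That formula is recalled from general 8051 documentation rather than taken from the thread, and the helper names are hypothetical.

```python
from fractions import Fraction

CLOCKS_PER_MACHINE_CYCLE = 12  # classic "12T" 8051 core


def mips(f_osc_hz, clocks_per_instruction=CLOCKS_PER_MACHINE_CYCLE):
    """Millions of one-machine-cycle instructions per second."""
    return Fraction(f_osc_hz, clocks_per_instruction) / 1_000_000


def timer_counts_per_baud_tick(f_osc_hz, baud):
    """Timer 1 counts needed per baud tick (assumed formula, SMOD = 0)."""
    return Fraction(f_osc_hz, 12 * 32 * baud)


print(mips(12_000_000))                              # 12 MHz clock
print(timer_counts_per_baud_tick(11_059_200, 9600))  # whole number of counts
print(timer_counts_per_baud_tick(12_000_000, 9600))  # fractional, so baud error
```

Under that assumption, an 11.0592 MHz crystal divides down to 9600 baud with a whole-number timer count, while a round 12 MHz crystal does not.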
        
             | gumby wrote:
             | Wow, right on page 1-2! I'm surprised -- I don't remember
             | anything running that fast back then. Thanks.
             | 
             | (Love those old Intel books too)
             | 
             | Nevertheless, FWIW, MIPS started out as VAX MIPS, and at
             | first people often used to write "VAX MIPS".
        
         | segfaultbuserr wrote:
         | > 'The' flop is a misnomer because it's almost always the
         | multiply-and-accumulate instruction: X = A + B * C. Which... Is
         | two operations per instruction (per shader/SIMD lane). Eeehhh
         | whatever. Who cares about these details?
         | 
         | If FMA is supported, it can either be counted as one or two
         | operations, depending on the rule of the benchmark involved or
         | the marketing of the processor. The marketing specification of
         | a processor's theoretical peak performance sometimes counts a
         | single-instruction FMA as two operations. On the other hand,
         | for the purpose of code profiling, counting FMA as one
         | operation is more realistic... As you said, who cares about
         | these details?
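
The one-versus-two counting question can be made concrete with a dot product, where each loop step is one multiply-accumulate. This is a sketch only: Python does not actually fuse the multiply and add, the names `dot_with_fma_tally` and `flop_count` are hypothetical, and the vectors are made-up data.

```python
def dot_with_fma_tally(x, y):
    """Dot product that also tallies multiply-accumulate steps."""
    acc = 0.0
    fma_steps = 0
    for a, b in zip(x, y):
        acc = acc + a * b  # one multiply-accumulate step (not fused in Python)
        fma_steps += 1
    return acc, fma_steps


def flop_count(fma_steps, flops_per_fma):
    """Flops reported under a given counting convention (1 or 2 per FMA)."""
    return fma_steps * flops_per_fma


x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, 0.25, 2.0, 1.0]
result, steps = dot_with_fma_tally(x, y)

for convention in (1, 2):
    print(f"FMA counted as {convention}: {flop_count(steps, convention)} flops")
```

The same run yields a flop count that differs by a factor of two depending only on the convention, so quoted figures are comparable only when the convention is stated.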
        
           | fluoridation wrote:
           | Given that the point of the FLOPS unit is to compare
           | processors, it does make more sense to count complex
           | instructions as more than a single floating-point operation.
           | If one CPU could multiply a 4x4 matrix by a vector in a
           | single instruction that can run a million times per second,
           | and another CPU needed ~32 instructions and so can only
           | multiply 500k matrices per second but retires 16 million
           | instructions in that same second, it would be silly to
           | compare instructions instead of multiplications and
           | additions.
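
The numbers in this example can be worked through. A small sketch, assuming a dense 4x4 matrix-vector product is counted as 16 multiplications plus 12 additions, and treating "~32 instructions" as exactly 32. The labels "A" and "B" and the function names are hypothetical.

```python
def matvec_flops(n):
    """Multiplications plus additions in a dense n x n matrix-vector product."""
    return n * n + n * (n - 1)


FLOPS_PER_MATVEC = matvec_flops(4)  # 16 multiplies + 12 adds

# name: (instructions retired per second, instructions per 4x4 matvec)
cpus = {
    "A": (1_000_000, 1),    # single matvec instruction
    "B": (16_000_000, 32),  # about 32 scalar instructions per matvec
}


def flops_per_second(instr_per_s, instr_per_matvec):
    matvecs_per_s = instr_per_s // instr_per_matvec
    return matvecs_per_s * FLOPS_PER_MATVEC


for name, (instr_per_s, instr_per_matvec) in cpus.items():
    rate = flops_per_second(instr_per_s, instr_per_matvec)
    print(f"CPU {name}: {instr_per_s:>10,} instr/s, {rate:>10,} flop/s")
```

Ranked by instructions retired, B looks 16 times faster. Ranked by multiplications and additions completed, A does twice the work.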
        
             | dragontamer wrote:
             | Speaking as a computer engineer: the circuit needed for a
             | fast multiplication (i.e. a Wallace tree or similar) is an
             | order of magnitude larger than the circuit needed for a
             | fast addition (i.e. a Kogge-Stone carry-lookahead adder).
             | 
             | This idea that additions and multiplications can be
             | combined like this as "equivalent operations" is kinda
             | bullshit. But hey, if it's "how it's done" (and it's done
             | this way because multiply-then-add is how you do matrix
             | multiplications...) then so be it.
             | 
             | Just remember that this is an arbitrary subdivision of a
             | matrix multiplication operation, that may not have much
             | relevance as a benchmark outside of matrix multiplications.
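
For readers unfamiliar with the adder named above, the Kogge-Stone carry scheme can be modelled in software with bitwise operations. This is a behavioural sketch under stated assumptions, not a gate-level design: it shows the logarithmic number of carry-combining stages but says nothing about circuit area, and the function name and 8-bit default width are illustrative choices.

```python
def kogge_stone_add(a, b, width=8):
    """Add two unsigned integers modulo 2**width with Kogge-Stone prefix carries.

    Returns (sum, number_of_prefix_stages).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    generate = a & b      # bit i creates a carry on its own
    propagate = a ^ b     # bit i passes an incoming carry along
    half_sum = propagate  # kept for the final sum stage
    distance = 1
    stages = 0
    while distance < width:
        # Merge each position with the group ending `distance` bits below it.
        generate |= propagate & (generate << distance)
        propagate &= propagate << distance
        generate &= mask
        propagate &= mask
        distance <<= 1
        stages += 1
    carries = (generate << 1) & mask  # carry into bit i comes from bits below i
    return half_sum ^ carries, stages


total, stages = kogge_stone_add(200, 100)
print(total, stages)
```

In this model every stage updates all bit positions at once, and the number of stages grows with the logarithm of the word width.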
        
               | fluoridation wrote:
               | It was just an example, not necessarily a realistic one.
               | The point is that we want to compare how quickly a
               | processor will compute our problem, not how many
               | instructions it's going to execute. If it were a car, you'd
               | want to compare things like its top speed and
               | acceleration, not something inane like engine revolutions
               | per kilometer. You measure and compare things that are
               | relevant to the user, not implementation details.
        
       | namirez wrote:
       | Floating point operations per ...?
        
         | GenericDev wrote:
         | Floating Point Operations Per Second [1]
         | 
         | [1] https://academickids.com/encyclopedia/index.php/FLOPS
        
           | Zambyte wrote:
           | > One should speak in the singular of a FLOPS and not of a
           | FLOP, although the latter is frequently encountered. The
           | final S stands for second and does not indicate a plural.
           | 
           | The author of this post seems to have fallen for this error.
        
           | dataflow wrote:
           | I think their point was that the p stood for "per", not for
           | the second letter of "operation".
        
       ___________________________________________________________________
       (page generated 2023-09-05 23:00 UTC)