[HN Gopher] Are your memory-bound benchmarking timings normally ...
       ___________________________________________________________________
        
       Are your memory-bound benchmarking timings normally distributed?
        
       Author : mfiguiere
       Score  : 18 points
       Date   : 2023-04-06 20:47 UTC (2 hours ago)
        
 (HTM) web link (lemire.me)
 (TXT) w3m dump (lemire.me)
        
       | sliken wrote:
       | I've been writing micro memory benchmarks and have been rather
       | surprised how hard something as simple as quantifying latency and
       | bandwidth under multicore loads can be. The memory hierarchy is
       | getting ever more complex. Cacheline sizes, prefetch, 3 levels of
       | cache, TLB effects, page alignments, cache associativity, etc.
       | Also have to be careful that the compiler doesn't optimize away
       | parts of your code. It's quite tricky to get a nice clean array
       | size vs latency graph, doubly so when multiple cores are
       | involved.
       | 
       | Some of my assumptions about latency were wrong. One thing I
       | didn't realize is that missing through L1, L2, and L3 takes about
       | half of the total latency to main memory. Another is that you
       | need around 2x as many memory references pending as memory
       | channels to keep the memory system busy. It makes sense in
       | retrospect: you want 16 pending memory references to keep 8
       | memory channels busy, otherwise a channel will return a cache
       | line and there won't be another L3 cache miss pending for that
       | channel.
       | 
       | Generally I like to keep a small histogram of cycle counters to
       | make sure I'm seeing the distribution I expect. Spotting an
       | unusual distribution is key to tracking down something you
       | didn't account for.
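       | 
       | A minimal sketch of that kind of histogram, in Python. It
       | assumes time.perf_counter_ns() as a stand-in for a raw cycle
       | counter, and the helper names (time_once, log2_histogram,
       | print_histogram) are made up for illustration. Power-of-two
       | buckets keep the histogram small while still exposing a second
       | mode or a long tail.

```python
import time
from collections import Counter


def time_once(fn):
    """Return the wall-clock duration of one call to fn, in nanoseconds."""
    start = time.perf_counter_ns()
    fn()
    return time.perf_counter_ns() - start


def log2_histogram(samples_ns):
    """Bucket samples by power of two: bucket b holds values in [2**b, 2**(b+1))."""
    hist = Counter()
    for s in samples_ns:
        # Clamp to 1 so that a zero reading lands in bucket 0.
        hist[max(int(s), 1).bit_length() - 1] += 1
    return dict(sorted(hist.items()))


def print_histogram(hist, width=50):
    """Print one text bar per bucket, scaled to the fullest bucket."""
    peak = max(hist.values())
    for bucket, count in hist.items():
        bar = "#" * max(1, count * width // peak)
        print(f">= {2 ** bucket:>10} ns | {count:>6} | {bar}")


if __name__ == "__main__":
    # Hypothetical workload: summing a list that is larger than a typical L1 cache.
    data = list(range(1 << 16))
    samples = [time_once(lambda: sum(data)) for _ in range(200)]
    print_histogram(log2_histogram(samples))
```

       | If the output shows two separated clusters of buckets rather
       | than one, that is usually the cue to look for something
       | unaccounted for.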
        
       | darksaints wrote:
       | You might want to check out the gamma distribution. It is also
       | zero-bounded just like the log normal distribution, but it was
       | originally created to model waiting times within queueing theory,
       | which is actually an excellent parallel to the idea of measuring
       | compute latency.
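       | 
       | As a rough illustration, a gamma distribution can be fitted to
       | timings with the method of moments: shape = mean^2 / variance
       | and scale = variance / mean. The sketch below assumes that
       | estimator (maximum likelihood would be a reasonable
       | alternative); fit_gamma_moments is a hypothetical helper name
       | and the sample data is synthetic.

```python
import random
import statistics


def fit_gamma_moments(samples):
    """Method-of-moments estimate of a gamma distribution's (shape, scale)."""
    mean = statistics.fmean(samples)
    var = statistics.variance(samples)  # sample variance (n - 1 denominator)
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


if __name__ == "__main__":
    # Synthetic "timings" drawn from a known gamma, to check that the fit recovers it.
    rng = random.Random(0)
    timings = [rng.gammavariate(4.0, 2.5) for _ in range(20000)]
    shape, scale = fit_gamma_moments(timings)
    print(f"shape ~ {shape:.2f}, scale ~ {scale:.2f}")
```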
        
         | pclmulqdq wrote:
         | Gamma and delta distributions have been very helpful to me in
         | performance work, as well as non-parametric statistical tests.
         | However, when you try to tell a lot of other engineers about
         | them, they don't really understand why a t-test and a standard
         | deviation don't work.
        
       | kelseyfrog wrote:
       | I mean it's physically impossible for the generating process of
       | positive timings to be normally distributed. The normal
       | distribution has support x ∈ ℝ.
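       | 
       | One way to see this concretely: any normal model fitted to
       | timings puts some probability mass below zero, which a real
       | timing can never have. A small sketch using the standard
       | library's NormalDist; negative_mass is a hypothetical helper
       | name and the mean/stdev values are invented.

```python
from statistics import NormalDist


def negative_mass(mean, stdev):
    """Probability that a normal model with this mean and stdev assigns to values below zero."""
    return NormalDist(mean, stdev).cdf(0.0)


if __name__ == "__main__":
    # Invented example: mean 100 ns with a stdev of 50 ns.
    print(f"P(t < 0) = {negative_mass(100.0, 50.0):.4f}")
```

       | The mass shrinks quickly as the mean moves away from zero in
       | units of the standard deviation, but mathematically it is
       | never exactly zero.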
        
       | ericpauley wrote:
       | Great article!
       | 
       | Statistical fallacies are rampant in performance eval, even in
       | academic settings. When designing statistical tests for
       | performance, the keyword you want to use here is non-parametric.
       | E.g., a U-test is a non-parametric analog to the t-test. It just
       | looks at the rank statistics of results instead of their values,
       | thus eliminating dependence on the underlying distribution.
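       | 
       | A minimal sketch of a Mann-Whitney U-test in plain Python, to
       | show that only ranks enter the statistic. It assumes the
       | normal approximation for the p-value with no tie or continuity
       | correction, so it is only reasonable for moderately large
       | samples; the function names are made up for illustration.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U statistic for sample a, two-sided p-value by normal approximation)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u_a - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, p


if __name__ == "__main__":
    # Invented timings for two variants of the same benchmark.
    old = [105, 98, 110, 102, 99, 250, 101, 97, 103, 100]
    new = [92, 90, 95, 91, 240, 89, 93, 94, 88, 96]
    print(mann_whitney_u(old, new))
```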
       | 
       | Another issue that pops up is sample independence. Statistical
       | tests are often predicated on each sample being independent and
       | identically distributed (i.i.d.), but in reality this is often
       | not the case. For instance, running all the tests of one group
       | and then all the tests of the other could heat the CPU and cause
       | reduced performance in the second trial.
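       | 
       | One common mitigation is to interleave the two groups in a
       | shuffled order, so that slow drift (such as heating) is spread
       | across both instead of landing on one. A minimal sketch,
       | assuming each variant is a plain callable; run_interleaved is
       | a hypothetical helper name. Shuffling does not remove the
       | drift, it only stops it from being confounded with the group.

```python
import random
import time


def run_interleaved(variants, repeats, seed=None):
    """Time each callable in `variants` `repeats` times, in one shuffled schedule.

    variants: dict mapping a name to a zero-argument callable.
    Returns a dict mapping each name to its list of timings in nanoseconds.
    """
    rng = random.Random(seed)
    schedule = [name for name in variants for _ in range(repeats)]
    rng.shuffle(schedule)
    timings = {name: [] for name in variants}
    for name in schedule:
        start = time.perf_counter_ns()
        variants[name]()
        timings[name].append(time.perf_counter_ns() - start)
    return timings


if __name__ == "__main__":
    # Invented workloads standing in for the two groups under test.
    results = run_interleaved(
        {"sum": lambda: sum(range(10000)), "sort": lambda: sorted(range(10000))},
        repeats=20,
        seed=1,
    )
    for name, samples in results.items():
        print(name, min(samples), max(samples))
```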
        
         | zX41ZdbW wrote:
         | We use non-parametric statistics for performance testing in
         | ClickHouse[1], picked from the article "A Randomized Design
         | Used in the Comparison of Standard and Modified Fertilizer
         | Mixtures for Tomato Plants".
         | 
         | [1] https://clickhouse.com/blog/testing-the-performance-of-
         | click...
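         | 
         | For readers unfamiliar with randomized designs, here is a
         | generic permutation (randomization) test on the difference of
         | medians. It is a sketch of the general technique only, not
         | necessarily what ClickHouse implements; the function name,
         | the choice of the median, and the sample data are
         | assumptions.

```python
import random
import statistics


def permutation_test_median(a, b, rounds=10000, seed=None):
    """Two-sided permutation test on the absolute difference of medians.

    Returns the fraction of random relabelings whose median difference is
    at least as large as the observed one (with a +1 correction so the
    estimate is never exactly zero).
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    n = len(a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.median(pooled[:n]) - statistics.median(pooled[n:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)


if __name__ == "__main__":
    # Invented query timings (ms) for an old and a new build.
    old = [101, 99, 103, 100, 98, 102, 180, 100, 99, 101]
    new = [95, 94, 96, 93, 97, 170, 95, 94, 96, 95]
    print(permutation_test_median(old, new, rounds=5000, seed=0))
```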
        
       ___________________________________________________________________
       (page generated 2023-04-06 23:00 UTC)