[HN Gopher] Parallel Programming in Multicore OCaml
       ___________________________________________________________________
        
       Parallel Programming in Multicore OCaml
        
       Author : pjmlp
       Score  : 76 points
       Date   : 2020-07-05 18:41 UTC (4 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | sadiq wrote:
       | (Multicore OCaml hacker here - speaking for myself though)
       | 
       | Nice to see people having a look at this. It's an early draft
       | chapter that Sudha, one of the Multicore team, has been working
       | on.
       | 
       | If you have feedback or suggestions I'm sure she'd welcome it on
       | the draft's PR: https://github.com/prismlab/parallel-programming-
       | in-multicor...
       | 
       | If you're new to Multicore OCaml the repo is at
       | https://github.com/ocaml-multicore/ocaml-multicore . It has
       | installation instructions near the bottom (easiest way is via
       | OPAM).
       | 
       | The latest June update of our progress is available at
       | https://discuss.ocaml.org/t/multicore-ocaml-june-2020/6047
       | 
       | For a technical deep dive on how Multicore OCaml works our recent
       | ICFP paper has a lot of detail: https://arxiv.org/abs/2004.11663
        
       | badfrog wrote:
       | How is Domain different/better than Deferred?
       | http://dev.realworldocaml.org/concurrent-programming.html
        
         | sadiq wrote:
         | Domain is a unit of parallelism in Multicore OCaml - it's
         | effectively a heavyweight thread. The intention is that you
         | tend to have as many Domains as you have cores you want to use
         | in the computation.
         | 
          | Deferred relates to concurrency rather than parallelism: it lets
          | you have multiple overlapping computations. You can have
         | concurrency without parallelism. OCaml has ways you can have
         | parallel computations going on that don't hold the interpreter
         | lock and I think some of the concurrent libraries can utilise
         | this.
         | 
          | Multicore OCaml splits parallelism and concurrency: the former
         | is via Domains and the latter is with Fibers. The paper I
         | linked in my other comment on this thread touches on this a
         | little but kc also has a short write-up of how you can use
         | effects to write a scheduler for multicore's fibers:
         | https://kcsrk.info/ocaml/multicore/2015/05/20/effects-multic...
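
          To illustrate the "one Domain per core" idea described above, here
          is a minimal sketch. It assumes the Domain module as shipped with
          Multicore OCaml / OCaml 5 (Domain.spawn and Domain.join); the
          names fib and parallel_fibs are hypothetical and chosen only for
          this example.

```ocaml
(* Sketch only: assumes the Domain module from Multicore OCaml / OCaml 5. *)

(* A deliberately slow, CPU-bound function so each domain has real work. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Hypothetical helper: compute fib for every input, one domain per input.
   In practice you would keep the number of domains close to the number of
   cores you want to use, as the comment above suggests. *)
let parallel_fibs inputs =
  (* Domain.spawn starts the closure running in parallel on a new domain. *)
  let domains = List.map (fun n -> Domain.spawn (fun () -> fib n)) inputs in
  (* Domain.join blocks until that domain finishes and returns its result. *)
  List.map Domain.join domains

let () =
  parallel_fibs [20; 21; 22]
  |> List.iter (fun r -> Printf.printf "%d\n" r)
```

          Because domains are heavyweight, spawning one per small task would
          likely cost more than it saves; this sketch only works well when
          the list of inputs is short and each computation is long.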
        
       | Athas wrote:
       | Some months ago, I used my lockdown time to make a repository
       | containing implementations of a (bad) ray tracer in various
       | parallel functional programming languages[0]. I was quite happy
       | that someone contributed a multicore OCaml implementation[1]. I
       | think the most interesting part was the series of pull requests
       | where an expert (I think even a multicore OCaml contributor)
       | significantly improved the performance of the ray tracer [2,3,4].
       | I think being able to see the individual steps is valuable,
       | because functional programming has long promised "effortless
       | parallelism", and hence it is enlightening to see how different
       | well-optimised parallel functional code is from an original more
       | or less idiomatic sequential implementation. In OCaml's case, I
       | don't think the difference is all that great. Most of the
       | improvements came from using a library that provides a proper
       | task pool, since "multicore OCaml" on its own seems to just
       | provide relatively low-level (but very well implemented)
       | primitives. It's a little bothersome that the programmer has to
       | manually provide a chunk size, rather than letting the runtime
       | figure such things out on its own, but it's not terrible. It was
       | far more annoying that multicore OCaml is (or was) missing
       | functions for measuring wall clock time...
       | 
       | Although there was a later PR that improved performance 40% just
       | by moving code around to minimise allocations... [5] Unfortunate
       | that such minor changes can have such dramatic impact.
       | 
       | [0]: https://github.com/athas/raytracers
       | [1]: https://github.com/athas/raytracers/tree/master/ocaml
       | [2]: https://github.com/athas/raytracers/pull/4
       | [3]: https://github.com/athas/raytracers/pull/5
       | [4]: https://github.com/athas/raytracers/pull/6
       | [5]: https://github.com/athas/raytracers/pull/7
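
       The chunk-size point above can be made concrete with a small sketch
       of a hand-rolled parallel loop. This is not the task-pool library
       used in the linked pull requests, nor the code from them; it only
       uses Domain and Atomic from the OCaml 5 standard library, and
       parallel_sum with its parameters is a hypothetical helper written
       for this example.

```ocaml
(* Sketch only: a manual chunked parallel loop, assuming OCaml 5's Domain
   and Atomic modules. parallel_sum is a hypothetical helper. *)

(* Sum [f i] for i in [0, n), handing out work in blocks of [chunk_size]
   indices to [num_domains] domains (including the calling one). *)
let parallel_sum ~num_domains ~chunk_size ~n f =
  if num_domains < 1 || chunk_size < 1 then
    invalid_arg "parallel_sum: num_domains and chunk_size must be >= 1";
  (* Shared counter holding the start index of the next unclaimed chunk. *)
  let next = Atomic.make 0 in
  let worker () =
    let rec loop acc =
      (* Atomically claim a chunk; fetch_and_add returns the old value. *)
      let start = Atomic.fetch_and_add next chunk_size in
      if start >= n then acc
      else begin
        let stop = min n (start + chunk_size) in
        let local = ref 0 in
        for i = start to stop - 1 do
          local := !local + f i
        done;
        loop (acc + !local)
      end
    in
    loop 0
  in
  (* The calling domain also does work, so spawn one fewer than requested. *)
  let others = List.init (num_domains - 1) (fun _ -> Domain.spawn worker) in
  let mine = worker () in
  List.fold_left (fun acc d -> acc + Domain.join d) mine others

let () =
  let total = parallel_sum ~num_domains:4 ~chunk_size:16 ~n:1000 (fun i -> i) in
  Printf.printf "%d\n" total
```

       The trade-off the programmer has to tune is visible here: a small
       chunk_size means more contention on the shared counter, while a
       large one can leave some domains idle when the work per index is
       uneven.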
        
       | wk_end wrote:
       | > The Multicore OCaml compiler comes with two variants of Garbage
       | Collector, namely a concurrent minor collector (ConcMinor) and a
       | stop-the-world parallel minor collector (ParMinor). Our
       | experiments have shown us that ParMinor performs better than
       | ConcMinor in majority of the cases. ParMinor also does not need
       | any changes in the C API of the compiler, unlike ConcMinor which
       | breaks the C API. So, the consensus is to go forward with
       | ParMinor during upstreaming of the Domains-only Multicore.
       | 
       | Is there any use for ConcMinor at this point, or is it at a
       | dead end?
        
         | sadiq wrote:
         | The plans, and implementation effort, are focused on getting
         | ParMinor upstreamed so I suspect ConcMinor will stay at version
         | 4.06.
         | 
         | I think we were all surprised by how well ParMinor performed.
         | There's work on-going to see just how far ParMinor can scale
         | and there's also a few tricks we might be able to do to make it
         | scale even better.
        
       | sigrlami wrote:
       | Are the C API changes stable? Last time I checked there were some
       | breaking changes.
        
       ___________________________________________________________________
       (page generated 2020-07-05 23:00 UTC)