[HN Gopher] Parallel Programming in Multicore OCaml
___________________________________________________________________
Author : pjmlp
Score  : 76 points
Date   : 2020-07-05 18:41 UTC (4 hours ago)

| sadiq wrote:
| (Multicore OCaml hacker here - speaking for myself though)
|
| Nice to see people having a look at this. It's an early draft
| chapter that Sudha, one of the Multicore team, has been working
| on.
|
| If you have feedback or suggestions I'm sure she'd welcome them on
| the draft's PR:
| https://github.com/prismlab/parallel-programming-in-multicor...
|
| If you're new to Multicore OCaml the repo is at
| https://github.com/ocaml-multicore/ocaml-multicore . It has
| installation instructions near the bottom (easiest way is via
| OPAM).
|
| The latest June update of our progress is available at
| https://discuss.ocaml.org/t/multicore-ocaml-june-2020/6047
|
| For a technical deep dive on how Multicore OCaml works, our recent
| ICFP paper has a lot of detail: https://arxiv.org/abs/2004.11663
|
| badfrog wrote:
| How is Domain different/better than Deferred?
| http://dev.realworldocaml.org/concurrent-programming.html
|
| sadiq wrote:
| Domain is a unit of parallelism in Multicore OCaml - it's
| effectively a heavyweight thread. The intention is that you
| tend to have as many Domains as you have cores you want to use
| in the computation.
|
| Deferred relates to concurrency rather than parallelism: it lets
| you have multiple overlapping computations. You can have
| concurrency without parallelism. OCaml has ways you can have
| parallel computations going on that don't hold the interpreter
| lock and I think some of the concurrent libraries can utilise
| this.
|
| Multicore OCaml splits parallelism and concurrency: the former
| is via Domains and the latter is with Fibers.
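As a minimal sketch of the "roughly one Domain per core" idea described above, the following splits a sum across a fixed number of domains. It assumes the `Domain.spawn` and `Domain.join` functions of the `Domain` module as shipped in the OCaml 5 standard library; `sum_range`, `parallel_sum` and the choice of four domains are hypothetical names and values picked for illustration, not taken from the draft chapter.

```ocaml
(* Sum the integers in the half-open range [lo, hi) sequentially. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi - 1 do
    acc := !acc + i
  done;
  !acc

(* Hypothetical helper: split [0, n) into num_domains contiguous slices,
   sum each slice in its own domain, then combine the partial sums. *)
let parallel_sum ~num_domains n =
  (* Slice length, rounded up so that the slices cover the whole range. *)
  let slice = (n + num_domains - 1) / num_domains in
  let workers =
    List.init num_domains (fun d ->
        let lo = min n (d * slice) in
        let hi = min n (lo + slice) in
        (* Each spawned domain can run in parallel on its own core. *)
        Domain.spawn (fun () -> sum_range lo hi))
  in
  (* Domain.join blocks until the domain finishes and returns its result. *)
  List.fold_left (fun total w -> total + Domain.join w) 0 workers

let () = Printf.printf "%d\n" (parallel_sum ~num_domains:4 1_000_000)
```

Because domains are heavyweight, a sketch like this would normally pick `num_domains` to match the number of cores it wants to use rather than spawning one domain per small task.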
| The paper I linked in my other comment on this thread touches on
| this a little but kc also has a short write-up of how you can use
| effects to write a scheduler for multicore's fibers:
| https://kcsrk.info/ocaml/multicore/2015/05/20/effects-multic...
|
| Athas wrote:
| Some months ago, I used my lockdown time to make a repository
| containing implementations of a (bad) ray tracer in various
| parallel functional programming languages[0]. I was quite happy
| that someone contributed a multicore OCaml implementation[1]. I
| think the most interesting part was the series of pull requests
| where an expert (I think even a multicore OCaml contributor)
| significantly improved the performance of the ray tracer [2,3,4].
| I think being able to see the individual steps is valuable,
| because functional programming has long promised "effortless
| parallelism", and hence it is enlightening to see how different
| well-optimised parallel functional code is from an original more
| or less idiomatic sequential implementation. In OCaml's case, I
| don't think the difference is all that great. Most of the
| improvements came from using a library that provides a proper
| task pool, since "multicore OCaml" on its own seems to just
| provide relatively low-level (but very well implemented)
| primitives. It's a little bothersome that the programmer has to
| manually provide a chunk size, rather than letting the runtime
| figure such things out on its own, but it's not terrible. It was
| far more annoying that multicore OCaml is (or was) missing
| functions for measuring wall clock time...
|
| Although there was a later PR that improved performance 40% just
| by moving code around to minimise allocations... [5] Unfortunate
| that such minor changes can have such dramatic impact.
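To illustrate what a manually supplied chunk size controls, here is a rough sketch built only on the OCaml 5 standard library (`Domain` and `Atomic`). It is not the task-pool library used in the ray tracer PRs and does not reproduce that library's API; `chunked_for` and its parameters are hypothetical names invented for this example.

```ocaml
(* Hypothetical helper, not a library API: apply [body] to every index in
   [0, n), handing out work to domains in chunks of [chunk_size] indices. *)
let chunked_for ~num_domains ~chunk_size ~n body =
  (* Shared cursor holding the start of the next unclaimed chunk. *)
  let next = Atomic.make 0 in
  let rec worker () =
    (* fetch_and_add returns the old value, so each chunk is claimed once. *)
    let lo = Atomic.fetch_and_add next chunk_size in
    if lo < n then begin
      for i = lo to min n (lo + chunk_size) - 1 do
        body i
      done;
      worker ()
    end
  in
  let helpers = List.init (num_domains - 1) (fun _ -> Domain.spawn worker) in
  (* The calling domain also takes chunks instead of sitting idle. *)
  worker ();
  List.iter Domain.join helpers

let () =
  let n = 10_000 in
  let out = Array.make n 0 in
  (* Each index is written by exactly one domain, so no locking is needed. *)
  chunked_for ~num_domains:4 ~chunk_size:64 ~n (fun i -> out.(i) <- i * i);
  assert (out.(99) = 9801)
```

The trade-off is visible in the shared cursor: a small `chunk_size` balances uneven work better but makes every domain hit the cursor more often, while a large one reduces that overhead at the risk of leaving some domains idle near the end.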
|
| [0]: https://github.com/athas/raytracers
| [1]: https://github.com/athas/raytracers/tree/master/ocaml
| [2]: https://github.com/athas/raytracers/pull/4
| [3]: https://github.com/athas/raytracers/pull/5
| [4]: https://github.com/athas/raytracers/pull/6
| [5]: https://github.com/athas/raytracers/pull/7
|
| wk_end wrote:
| > The Multicore OCaml compiler comes with two variants of Garbage
| Collector, namely a concurrent minor collector (ConcMinor) and a
| stop-the-world parallel minor collector (ParMinor). Our
| experiments have shown us that ParMinor performs better than
| ConcMinor in majority of the cases. ParMinor also does not need
| any changes in the C API of the compiler, unlike ConcMinor which
| breaks the C API. So, the consensus is to go forward with
| ParMinor during upstreaming of the Domains-only Multicore.
|
| Is there any use for ConcMinor at this point or is it at a
| dead-end?
|
| sadiq wrote:
| The plans, and implementation effort, are focused on getting
| ParMinor upstreamed so I suspect ConcMinor will stay at version
| 4.06.
|
| I think we were all surprised by how well ParMinor performed.
| There's work ongoing to see just how far ParMinor can scale
| and there are also a few tricks we might be able to do to make it
| scale even better.
|
| sigrlami wrote:
| Are the C API changes stable? Last time I checked there were some
| breaking changes.
___________________________________________________________________
(page generated 2020-07-05 23:00 UTC)