[HN Gopher] Teleforking a Process onto a Different Computer
___________________________________________________________________

Teleforking a Process onto a Different Computer

Author : kaladin-jasnah
Score  : 56 points
Date   : 2022-05-31 20:37 UTC (2 hours ago)

(HTM) web link (thume.ca)
(TXT) w3m dump (thume.ca)

| cl0ckt0wer wrote:
| This sounds like Kafka, but more low level.

| latenightcoding wrote:
| Not even close

| basementcat wrote:
| Both MOSIX and openMOSIX supported fork()ing to another node on
| the network. https://en.m.wikipedia.org/wiki/MOSIX

| AaronFriel wrote:
| The Cloud Haskell project and language is likely the only one to
| get this right, thanks to strictly enforced purity. It's much
| simpler to understand, absent global mutable state, whether it's
| safe and possible to serialize a closure and run it somewhere
| else. (fork(2) being a closure by another name.)
|
| In almost all other languages there's just no way to know if a
| closure is holding on to a file descriptor.
|
| Critics may say the Haskell closures could contain
| `unsafePerformIO`, but as the saying goes: now you have two
| problems.

| gpderetta wrote:
| Isn't fork more of a continuation?
|
| /pedantic

| gnufx wrote:
| Is Cloud [sigh] Haskell still alive?
|
| For comparison, two old distributed lexical scope systems were
| Cardelli's Obliq and Kelsey's(?) Kali Scheme. From what I
| remember, not like remote forking, though.

| fleddr wrote:
| In the late 90s I attended a tour at Holland Signaal, a Dutch
| defense company producing radar and anti-missile systems.
|
| I remember vividly how they demonstrated an unbreakable process.
| They had a computer running a process, and no matter what
| happened to that computer, the next one would flawlessly continue
| the process down to the cycle, with no chance of corruption or
| skipping a beat.
|
| It may very well be that this is actually not very difficult, but
| it seemed difficult and impressive.
|
| Perhaps more shocking were ultra-high-resolution radar screens,
| some three generations ahead of anything I had seen in the
| consumer space, showing an incredible visualization of the
| airspace, live. Showing exactly which plane is where, the
| model/type, age, fuel on board, hostile/friendly, all of it.
|
| They even had a "situation room" with a holodeck chair in the
| middle, full of controls. The entire room was covered in
| wall-size screens basically showing the airspace of the entire
| country, being analyzed live.
|
| Sounds very 2022, not 1998.

| a-dub wrote:
| this is a fun hack. it would be interesting to look at some real
| world workloads and compare whether this sort of init-once,
| ship-the-initialized-memory-image-everywhere style is faster
| than just initializing everywhere.

| tenken wrote:
| Doesn't Erlang support these ideas of distributed computing ....
| And if I recall correctly, Clipper supported remote execution of
| objects, or sharing object code in a distributed fashion.

| mghfreud wrote:
| isn't this exactly what the vm migration in cloud is?

| mlyle wrote:
| No. VM migration moves entire virtual computers. Forking makes
| a copy of a process with its current state; this moves that
| single duplicated process to a different machine.

| mghfreud wrote:
| A virtual computer is a bunch of processes.

| speed_spread wrote:
| And a kernel. And drivers. And devices. And busses. And
| interrupts.

| Animats wrote:
| That used to be in some UNIX variants, such as UCLA Locus and the
| IBM derivatives of that. But it never got to be a Linux thing.

| Fnoord wrote:
| Was VMS capable of achieving this as well?

| jonathaneunice wrote:
| Congratulations! You have just reinvented the core idea of UCLA's
| LOCUS distributed computing project from 1979.
| https://en.wikipedia.org/wiki/LOCUS
|
| Reinventing LOCUS also has a strong heritage. Bell Labs' Plan 9,
| for example, did so in part in the late 1980s.
|
| While never a breakout commercial success, tele-forking and its
| slightly more advanced cousins, machine-to-machine process
| migration and cluster-wide process pools, intrigued some of the
| best minds in distributed computing for 20+ years.
|
| Unfortunately "it's complicated" to implement well, especially
| when you try to tele-spawn and manage resources beyond compute
| cycles (network connections, files, file handles, ...) that are
| important to scale up the idea.

| zozbot234 wrote:
| > Unfortunately "it's complicated" to implement well,
| especially when you try to tele-spawn and manage resources
| beyond compute cycles (network connections, files, file
| handles, ...)
|
| Aren't all of these resources namespaced/containerized in
| modern Linux? This should make it feasible to checkpoint and
| restore them on the same machine (via, e.g., the CRIU patchset),
| and true location-independence is not _that_ much harder. One
| of the hardest parts (not even implemented in Plan 9, AFAICT) is
| distributed shared memory (allowing for sharing a _single_
| virtual address space across cluster nodes), but even that AIUI
| has some research-level implementations.

| gowld wrote:
| Also, Haskell Control.Distributed.Fork
|
| > This module provides a common interface for offloading an IO
| action to remote executors.
|
| > It uses StaticPointers language extension and distributed-
| closure library for serializing closures to run remotely. This
| blog post[1] is a good introduction for those.
|
| > In short, if you need a Closure a:
|
| > One important constraint when using this library is that it
| assumes the remote environment is capable of executing the
| exact same binary. In most cases, this requires your host
| environment to be Linux.
|
| https://hackage.haskell.org/package/distributed-fork-0.0.1.3...
|
| [1] https://blog.ocharles.org.uk/blog/guest-
| posts/2014-12-23-sta...
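[Editor's note: a toy Python sketch, not from the thread, of the
point made above about resources beyond compute cycles. Plain data
can be serialized and shipped elsewhere, but state that holds an
OS resource, such as an open file descriptor, fails at
serialization time; the descriptor number is meaningless on
another machine, and Python's pickle refuses it outright.]

```python
import os
import pickle
import tempfile

# Plain data serializes fine: this state could be shipped to
# another machine and reconstructed there.
state = {"step": 42, "weights": [0.1, 0.2, 0.3]}
blob = pickle.dumps(state)
assert pickle.loads(blob) == state

# But state holding an OS resource (an open file handle) cannot
# be serialized: pickle raises TypeError at runtime.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, "w")
try:
    pickle.dumps({"step": 42, "log": f})
    outcome = "serialized"
except TypeError:
    outcome = "refused"
f.close()
os.remove(path)
print(outcome)  # refused
```

This is the runtime analogue of the Cloud Haskell point earlier in
the thread: without language-level tracking, whether a piece of
state is safely serializable is only discovered when you try.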
| felixgallo wrote:
| Always worth a reread:
| https://joearms.github.io/published/2013-11-21-My-favorite-e...

| eointierney wrote:
| <About a year later I had to write a paper. One of the
| disadvantages of being a researcher is that in order to get
| money you have to write a paper about something or other; the
| paper is never about what currently interests you at the
| moment, but must be about what the project that financed your
| research expects to read about.
|
| Well, I had my gossip network set up on PlanetLab and I could
| tell it to become anything, so I told it to become a content
| distribution network and used a gossip algorithm to make
| copies of the same file on all machines on the network and wrote
| a paper about it, and everybody was happy.>
|
| I miss Joe, not that I ever met him, but his attitude and good
| humour are inspiring.

| gnufx wrote:
| For what it's worth, at least for HPC-ish distributed computing,
| this sort of thing turns out not to be terribly worthwhile. We
| have a standard for distribution of computation, shared memory,
| i/o, and process starting in MPI (and, for instance, DMTCP to
| migrate the distributed application if necessary, though I think
| DMTCP needs a release).
|
| I don't know what its current status is, but the HPC-ish Bproc
| system has/had an rfork [1]. Probably the most HPC-oriented SSI
| system, Kerrighed, died, as did the Plan 9-ish xcpu, though that
| was a bit different.
|
| 1.
| https://www.penguinsolutions.com/computing/documentation/scy...

| zozbot234 wrote:
| The biggest benefit is arguably that codes designed for
| "telefork" and perhaps remote threads can also be scaled
| _down_ to a single shared-memory machine, and run way more
| efficiently than if they had been coded using the MPI approach,
| whilst adding little overhead when running in a cluster,
| assuming the codes are designed properly.
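[Editor's note: the init-once, ship-the-initialized-image question
raised upthread has a degenerate single-machine form that plain
fork(2) already answers. A minimal POSIX-only Python sketch, not
from the thread: the child inherits the parent's fully initialized
memory via copy-on-write pages instead of rebuilding it; telefork
extends the same idea across machines.]

```python
import os
import time

# Expensive one-time initialization in the parent (stands in for
# loading a model, warming caches, parsing config, ...).
BIG_TABLE = [i * i for i in range(1_000_000)]

start = time.perf_counter()
pid = os.fork()  # POSIX only
if pid == 0:
    # Child: the table is already present via copy-on-write
    # pages inherited from the parent; nothing is re-initialized.
    assert BIG_TABLE[1000] == 1_000_000
    os._exit(0)
os.waitpid(pid, 0)
elapsed = time.perf_counter() - start

# Forking the initialized image is typically far cheaper than
# rebuilding BIG_TABLE from scratch in every worker.
print(f"fork+inherit took {elapsed:.3f}s")
```

Whether this beats initializing everywhere, as a-dub asks, depends
on how expensive the initialization is relative to shipping the
memory image over the network.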
| daenz wrote:
| Lambda/Cloud Functions are starting to converge on this idea. It
| will eventually get streamlined and ergonomic enough that it
| appears you're executing an expensive or async function locally,
| except you aren't.
___________________________________________________________________
(page generated 2022-05-31 23:00 UTC)
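[Editor's note: a minimal POSIX-only Python sketch, not from the
thread, of the "looks local, runs elsewhere" model described
above. Here a forked child process stands in for the remote
executor; the function name `run_in_child` is this sketch's own,
not any library's API. A real cloud-function runtime would ship
the closure over the network instead of a pipe.]

```python
import os
import pickle

def run_in_child(fn, *args):
    """Execute fn(*args) in a forked child and ship the pickled
    result back over a pipe. The call site looks like an ordinary
    local function call, but the work happens elsewhere."""
    r, w = os.pipe()
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: compute, serialize the result, and exit.
        os.close(r)
        with os.fdopen(w, "wb") as out:
            pickle.dump(fn(*args), out)
        os._exit(0)
    # Parent: read the result back, then reap the child.
    os.close(w)
    with os.fdopen(r, "rb") as src:
        result = pickle.load(src)
    os.waitpid(pid, 0)
    return result

total = run_in_child(sum, range(10))
print(total)  # 45
```

The same shape, with the pipe replaced by an RPC transport and the
fork replaced by a remote executor, is roughly what the
Lambda-style convergence daenz describes would streamline.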