[HN Gopher] Teleforking a Process onto a Different Computer
       ___________________________________________________________________
        
       Teleforking a Process onto a Different Computer
        
       Author : kaladin-jasnah
       Score  : 56 points
       Date   : 2022-05-31 20:37 UTC (2 hours ago)
        
 (HTM) web link (thume.ca)
 (TXT) w3m dump (thume.ca)
        
       | cl0ckt0wer wrote:
       | This sounds like Kafka, but more low level.
        
         | latenightcoding wrote:
         | Not even close
        
       | basementcat wrote:
       | Both MOSIX and openMOSIX supported fork()ing to another node on
       | the network. https://en.m.wikipedia.org/wiki/MOSIX
        
       | AaronFriel wrote:
        | The Cloud Haskell project and language is likely the only one to
        | get this right, thanks to strictly enforced purity. Absent global
        | mutable state, it's much simpler to understand whether it's safe
        | and possible to serialize a closure and run it somewhere else.
        | (fork(2) being a closure by another name.)
       | 
       | In almost all other languages there's just no way to know if a
       | closure is holding on to a file descriptor.
       | 
       | Critics may say the Haskell closures could contain
       | `unsafePerformIO`, but as the saying goes: now you have two
       | problems.
        
         | gpderetta wrote:
         | Isn't fork more of a continuation?
         | 
         | /pedantic
        
         | gnufx wrote:
         | Is Cloud [sigh] Haskell still alive?
         | 
         | For comparison, two old distributed lexical scope systems were
         | Cardelli's Obliq and Kelsey's(?) Kali Scheme. From what I
         | remember, not like remote forking, though.
        
       | fleddr wrote:
        | In the late 90s I attended a tour at Holland Signaal, a Dutch
        | defense company producing radar and anti-missile systems.
       | 
       | I remember vividly how they demonstrated an unbreakable process.
       | They had a computer running a process and no matter what happened
       | to that computer, the next one would flawlessly continue the
        | process down to the cycle, with no chance of corruption or
       | skipping a beat.
       | 
       | It may very well be that this is actually not very difficult, but
       | it seemed difficult and impressive.
       | 
       | Perhaps more shocking were ultra high resolution radar screens,
       | some 3 generations ahead of anything I had seen in the consumer
       | space, showing an incredible visualization of the air space,
       | live. Showing exactly which plane is where, the model/type, age,
       | fuel on board, hostile/friendly, all of it.
       | 
       | They even had a "situation room" with a holodeck chair in the
       | middle, full of controls. The entire room was covered in wall-
       | size screens basically showing the air space of the entire
       | country, being live analyzed.
       | 
       | Sounds very 2022, not 1998.
        
       | a-dub wrote:
        | this is a fun hack. it would be interesting to look at some
        | real-world workloads and compare whether this "init once, ship
        | the initialized memory image everywhere" style is faster than
        | just initializing everywhere.
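The local analogue of init-once-ship-everywhere is plain fork() with copy-on-write: the child inherits the already-initialized image instead of redoing the setup, and telefork extends the same trick across machines. A POSIX-only Python sketch (the `expensive_init` stand-in is hypothetical):

```python
import os

def expensive_init():
    # Stand-in for a costly setup phase (parsing configs, warming
    # caches, loading a model, ...).
    return {"table": list(range(1000))}

STATE = expensive_init()  # runs once, in the parent only

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: inherits the initialized memory image via copy-on-write,
    # so expensive_init() is never re-run here.
    os.close(r)
    os.write(w, str(sum(STATE["table"])).encode())
    os._exit(0)
else:
    os.close(w)
    result = int(os.read(r, 64).decode())
    os.waitpid(pid, 0)
    print(result)  # 499500
```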
        
       | tenken wrote:
       | Doesn't Erlang support these ideas of distributed computing ....
       | And if I recall correctly Clipper supported remote execution of
       | objects, or sharing object code in a distributed fashion.
        
       | mghfreud wrote:
        | isn't this exactly what VM migration in the cloud is?
        
         | mlyle wrote:
         | No. VM migration moves entire virtual computers. Forking makes
         | a copy of a process with the current state; this moves that
         | single duplicated process to a different machine.
        
           | mghfreud wrote:
            | A virtual computer is a bunch of processes.
        
             | speed_spread wrote:
             | And a kernel. And drivers. And devices. And busses. And
             | interrupts.
        
       | Animats wrote:
       | That used to be in some UNIX variants, such as UCLA Locus and the
       | IBM derivatives of that. But it never got to be a Linux thing.
        
         | Fnoord wrote:
         | Was VMS capable of achieving this as well?
        
       | jonathaneunice wrote:
       | Congratulations! You have just reinvented the core idea of UCLA's
       | LOCUS distributed computing project from 1979.
       | https://en.wikipedia.org/wiki/LOCUS
       | 
        | Reinventing LOCUS also has a strong heritage. Bell Labs' Plan 9,
        | for example, did so in part in the late 1980s.
       | 
       | While never a breakout commercial success, tele-forking and its
       | slightly more advanced cousins machine-to-machine process
       | migration and cluster-wide process pools intrigued some of the
       | best minds in distributed computing for 20+ years.
       | 
       | Unfortunately "it's complicated" to implement well, especially
       | when you try to tele-spawn and manage resources beyond compute
       | cycles (network connections, files, file handles, ...) that are
       | important to scale up the idea.
        
         | zozbot234 wrote:
         | > Unfortunately "it's complicated" to implement well,
         | especially when you try to tele-spawn and manage resources
         | beyond compute cycles (network connections, files, file
         | handles, ...)
         | 
         | Aren't all of these resources namespaced/containerized in
         | modern Linux? This should make it feasible to checkpoint and
         | restore them on the same machine (via, e.g. the CRIU patchset)
         | and true location-independence is not _that_ much harder. One
         | of the hardest parts (not even implemented in plan9, AFAICT) is
         | distributed shared memory (allowing for sharing a _single_
         | virtual address space across cluster nodes), but even that AIUI
         | has some research-level implementations.
        
         | gowld wrote:
          | Also, Haskell's Control.Distributed.Fork
         | 
         | > This module provides a common interface for offloading an IO
         | action to remote executors.
         | 
         | > It uses StaticPointers language extension and distributed-
         | closure library for serializing closures to run remotely. This
         | blog post[1] is a good introduction for those.
         | 
          | > In short, if you need a Closure a:
         | 
         | > One important constraint when using this library is that it
         | assumes the remote environment is capable of executing the
          | exact same binary. In most cases, this requires your host
         | environment to be Linux.
         | 
         | https://hackage.haskell.org/package/distributed-fork-0.0.1.3...
         | 
         | [1] https://blog.ocharles.org.uk/blog/guest-
         | posts/2014-12-23-sta...
        
       | felixgallo wrote:
       | Always worth a reread:
       | https://joearms.github.io/published/2013-11-21-My-favorite-e...
        
         | eointierney wrote:
         | <About a year later I had to write a paper. One of the
         | disadvantages of being a researcher is that in order to get
         | money you have to write a paper about something or other, the
         | paper is never about what currently interests you at the
         | moment, but must be about what the project that financed your
         | research expects to read about.
         | 
          | Well I had my gossip network set up on PlanetLab and I could
          | tell it to become anything, so I told it to become a content
          | distribution network and used a gossip algorithm to make
          | copies of the same file on all machines on the network and
          | wrote a paper about it and everybody was happy.>
         | 
         | I miss Joe, not that I ever met him, but his attitude and good
         | humour are inspiring.
        
       | gnufx wrote:
       | For what it's worth, at least for HPC-ish distributed computing,
       | this sort of thing turns out not to be terribly worthwhile. We
       | have a standard for distribution of computation, shared memory,
       | i/o, and process starting in MPI (and, for instance, DMTCP to
       | migrate the distributed application if necessary, though I think
       | DMTCP needs a release).
       | 
       | I don't know what its current status is, but the HPC-ish Bproc
       | system has/had an rfork [1]. Probably the most HPC-oriented SSI
       | system, Kerrighed died, as did the Plan 9-ish xcpu, though that
       | was a bit different.
       | 
       | 1.
       | https://www.penguinsolutions.com/computing/documentation/scy...
        
         | zozbot234 wrote:
          | The biggest benefit is arguably that codes designed for
          | "telefork" and perhaps remote threads can also be scaled
          | _down_ to a single shared-memory machine, and run way more
          | efficiently than if they had been coded using the MPI
          | approach, while not adding much of any overhead when running
          | in a cluster, assuming the codes are designed properly.
        
       | daenz wrote:
       | Lambda/Cloud Functions are starting to converge on this idea. It
       | will eventually get streamlined and ergonomic enough that it
       | appears you're executing an expensive or async function locally,
       | except you aren't.
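A toy version of that convergence — a call that looks local but runs elsewhere — can be sketched by shipping a function's source and pickled arguments to an executor. In this sketch the "executor" is just a local subprocess standing in for a cloud-function endpoint (the `offload` helper and worker script are hypothetical):

```python
import pickle
import subprocess
import sys
import textwrap

# The "remote" side: read a (source, name, args) payload from stdin,
# rebuild the function, run it, and write the pickled result back.
WORKER = textwrap.dedent("""
    import pickle, sys
    fn_src, name, args = pickle.loads(sys.stdin.buffer.read())
    ns = {}
    exec(fn_src, ns)
    sys.stdout.buffer.write(pickle.dumps(ns[name](*args)))
""")

def offload(fn_src, name, *args):
    # Hypothetical helper: serialize source + args, run on an
    # "executor" (a subprocess here, a cloud function in the real thing).
    proc = subprocess.run([sys.executable, "-c", WORKER],
                          input=pickle.dumps((fn_src, name, args)),
                          capture_output=True, check=True)
    return pickle.loads(proc.stdout)

result = offload("def square(x):\n    return x * x\n", "square", 12)
print(result)  # 144
```

The caller never sees the process boundary — which is the ergonomics the comment above is describing, minus all the hard parts (shared state, open descriptors, failure handling) the rest of this thread is about.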
        
       ___________________________________________________________________
       (page generated 2022-05-31 23:00 UTC)