[HN Gopher] Powerful AI Can Now Be Trained on a Single Computer
       ___________________________________________________________________
        
       Powerful AI Can Now Be Trained on a Single Computer
        
       Author : MindGods
       Score  : 61 points
       Date   : 2020-07-17 20:48 UTC (2 hours ago)
        
 (HTM) web link (spectrum.ieee.org)
 (TXT) w3m dump (spectrum.ieee.org)
        
       | rgovostes wrote:
       | Is SLIDE being used anywhere, or were flaws discovered? It was
       | supposed to massively accelerate training on CPUs.
       | 
       | https://www.hpcwire.com/off-the-wire/rice-researchers-algori...
        
       | RcouF1uZ4gsC wrote:
       | > His group took advantage of working on a single machine by
       | simply cramming all the data to shared memory where all processes
       | can access it instantaneously.
       | 
        | If you can get all your data into RAM on a single computer, you
        | can get a huge speedup, even over a cluster that has more
        | resources in aggregate.
       | 
        | Frank McSherry has written more about this, though not directly
        | about ML training.
       | 
       | http://www.frankmcsherry.org/graph/scalability/cost/2015/01/...
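        | 
        | A minimal sketch of the shared-memory idea (not the paper's
        | code; Python 3.8+ and NumPy assumed, and the buffer shape is
        | made up): the parent fills one block, and worker processes
        | attach to it and read it without copying.
        | 
        |   import numpy as np
        |   from multiprocessing import Process, shared_memory
        | 
        |   def worker(shm_name, shape, dtype):
        |       # Attach to the existing block; nothing is copied.
        |       shm = shared_memory.SharedMemory(name=shm_name)
        |       obs = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        |       print("worker sees mean:", obs.mean())
        |       shm.close()
        | 
        |   if __name__ == "__main__":
        |       shape, dtype = (10_000, 84, 84), np.float32
        |       nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
        |       shm = shared_memory.SharedMemory(create=True, size=nbytes)
        |       obs = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        |       obs[:] = np.random.rand(*shape)  # fill once in the parent
        | 
        |       p = Process(target=worker, args=(shm.name, shape, dtype))
        |       p.start(); p.join()
        |       shm.close(); shm.unlink()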
        
       | ladberg wrote:
       | So this basically boils down to keeping your training data in
       | memory? Is there something else I missed?
        
         | dan-robertson wrote:
          | It looks obvious when you write it like that, but I think many
          | people are surprised by just how much slower distributed
          | computations can be compared to non-distributed systems. See,
          | e.g., the COST paper [1].
         | 
         | [1]
         | https://www.usenix.org/system/files/conference/hotos15/hotos...
        
       | datameta wrote:
        | The machine used is a 36-core workstation with a single GPU, so
        | not quite a home computer yet, but this is some serious
        | progress!
       | 
       | Paper: https://arxiv.org/abs/2006.11751
       | 
       | Source: https://github.com/alex-petrenko/sample-factory
        
         | cheez wrote:
         | That's my machine!
        
         | cbozeman wrote:
          | I dunno... $8000 builds a 64c/128t workstation with 256 GB of
          | RAM and the same GPU these researchers used
          | (https://pcpartpicker.com/list/P6WTL2). That's arguably in the
          | realm of a home computer for just about anyone making $90,000
          | or more, and I would think anyone working in those fields
          | could command at least that salary unless they're in a truly
          | entry-level position. It seems like a reasonable investment
          | for someone actively working in machine learning / artificial
          | intelligence.
        
           | eanzenberg wrote:
            | Anyone who can afford a car can afford this.
        
           | nqzero wrote:
            | What's the per-hour spot price of a comparable machine on
            | AWS?
        
       | rbanffy wrote:
        | It could always be trained on a single computer. It was just a
        | matter of physical size versus time.
        
       | dan-robertson wrote:
       | Lots of people are focusing on this being done on a particularly
       | powerful workstation, but the computer described seems to have
       | power at a similar order of magnitude to the many servers which
       | would be clustered together in a more traditional large ML
       | computation. Either those industrial research departments could
       | massively cut costs/increase output by just "magically keeping
        | things in RAM," or these researchers have actually found a way to
       | reduce the computational power that is necessary.
       | 
       | I find the efforts of modern academics to do ML research on
       | relatively underpowered hardware by being more clever about it to
        | be reminiscent of Soviet researchers who, lacking anything like
       | the access to computation of their American counterparts, were
       | forced to be much more thorough and clever in their analysis of
       | problems in the hope of making them tractable.
        
       | fxtentacle wrote:
        | I'm surprised that this is IEEE-worthy and not just common
        | sense. Of course there'll be huge speedups if, and only if, your
        | dataset fits into main RAM and your model fits into GPU RAM.
        | 
        | But for most state-of-the-art models (think GPT, with billions
        | of parameters) that is far from being the case.
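        | 
        | Rough arithmetic (my own numbers, not from the article): the
        | weights of a 1.5B-parameter model in fp32 alone are ~6 GB, and
        | training with Adam roughly quadruples that before you even
        | count activations, so a single consumer GPU is already out.
        | 
        |   # Back-of-envelope GPU memory estimate for training in fp32.
        |   params = 1.5e9           # roughly GPT-2 XL scale
        |   bytes_per_param = 4      # fp32
        |   weights_gb = params * bytes_per_param / 1e9       # ~6 GB
        |   grads_gb = weights_gb                             # ~6 GB
        |   adam_state_gb = 2 * weights_gb                    # m and v
        |   total_gb = weights_gb + grads_gb + adam_state_gb  # ~24 GB
        |   print(total_gb)   # 24.0, excluding activations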
        
       | softwrdethknell wrote:
        | Apple's in-house SoC is the future.
       | 
       | I suspect the cloud has a decade, maybe less, of hype to grift
       | on.
       | 
        | Huge datasets on a personal computer and opt-in data sharing
        | with business, healthcare, etc. will be the new norm.
       | 
        | Further out, software as we know it will cease to exist as
        | entirely custom chips per application become the norm. IN TIME.
       | 
       | New hardware wars to capture consumer attention incoming.
        
         | rbanffy wrote:
          | I don't need to own a fast workstation unless I want to
          | continuously train my models. I can, however, quickly spin up
          | a cloud instance that's much larger than that and train the
          | model in a fraction of the time, at a fraction of the cost of
          | a desktop workstation.
        
       | vz8 wrote:
       | How much RAM did their test workstation have? I can't seem to
       | spot it.
        
         | neatze wrote:
          | According to the article, System 1 had 128 GB of DDR4 and
          | System 2 had 256 GB of DDR4.
        
         | genpfault wrote:
          | The paper lists three configs: one with 128 GB and two with
          | 256 GB.
        
         | m463 wrote:
            | I'm guessing most of the perf comes from the GPU memory
            | size.
        
           | Enginerrrd wrote:
            | While that's likely true, I generally find that it's quite
            | rare for enough attention to be paid to how information is
            | moved between disk, RAM, CPU, and GPU, and paying close
            | attention to that can be extremely helpful. Taking the RAM
            | up to 11 can eliminate a lot of the art involved, which is a
            | good thing.
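            | 
            | As a small illustration of what I mean (a sketch, not the
            | paper's code; PyTorch assumed, sizes made up): keep the
            | dataset in host RAM, pin the batches, and let the host-to-
            | GPU copies overlap with compute instead of re-reading from
            | disk every epoch.
            | 
            |   import torch
            |   from torch.utils.data import TensorDataset, DataLoader
            | 
            |   # Whole (made-up) dataset lives in host RAM as tensors;
            |   # no disk reads happen inside the training loop.
            |   x = torch.randn(10_000, 84 * 84)
            |   y = torch.randint(0, 10, (10_000,))
            |   loader = DataLoader(TensorDataset(x, y),
            |                       batch_size=1024, shuffle=True,
            |                       pin_memory=True)
            | 
            |   device = "cuda" if torch.cuda.is_available() else "cpu"
            |   for xb, yb in loader:
            |       # non_blocking copies can overlap with GPU compute
            |       # because the batches come from pinned host memory.
            |       xb = xb.to(device, non_blocking=True)
            |       yb = yb.to(device, non_blocking=True)
            |       # ... forward/backward pass would go here ...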
        
             | rbanffy wrote:
             | A machine in this class can easily have a terabyte of RAM.
             | Add a couple Optane DC sticks and you have enormous storage
             | at exceedingly high bandwidth.
        
       ___________________________________________________________________
       (page generated 2020-07-17 23:00 UTC)