[HN Gopher] The Computational Limits of Deep Learning
       ___________________________________________________________________
        
       The Computational Limits of Deep Learning
        
       Author : ozdave
       Score  : 39 points
       Date   : 2020-08-17 21:07 UTC (1 hour ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | xiphias2 wrote:
       | "This article reports on the computational demands of Deep
       | Learning applications in five prominent application areas and
       | shows that progress in all five is strongly reliant on increases
       | in computing power."
       | 
       | I don't agree with the conclusion of the paper. Computing
       | architectures have been improving dramatically over the last few
       | years, and almost any task that was achievable 5 years ago with
       | deep learning is now orders of magnitude cheaper to train.
       | 
       | The energy consumed by deep learning is increasing because of
       | the huge ROI for companies, but growth will probably slow once
       | the compute cost approaches the cost of software engineers (or
       | the profit of a company), because at that point researching
       | improvements to the models becomes relatively cheap again.
        
       | cs702 wrote:
       | We can look at Deep Learning's growing demands for computation
       | and despair, or view those growing demands as an economic
       | incentive to develop more powerful hardware that uses energy
       | more efficiently, at lower marginal cost, and in a more
       | sustainable manner.
       | 
       | In other words, Deep Learning's growing need for computing power
       | seems to have reached a point at which it is now motivating
       | fundamental research to find greener, cheaper, more energy-
       | efficient hardware.
       | 
       | The economic incentives are _very_ powerful: Whichever companies
       | (or organizations, or countries) find ways to harness the most
       | computing power at the lowest marginal cost will win the race in
       | this market.
       | 
       | --
       | 
       | PS. The same could be said for Bitcoin mining: it is also
       | motivating fundamental research to develop greener, cheaper, more
       | energy-efficient, more powerful hardware. Whoever finds ways to
       | harness the most computing power at the lowest marginal cost will
       | make the most money processing transactions on the network.
        
         | rocqua wrote:
         | I think the case here is a lot easier than the case for
         | bitcoin mining. Bitcoin miners are so stupidly single-purpose
         | that development there doesn't help much elsewhere. Maybe in
         | general it helps create an industry for designing and
         | manufacturing ASICs, and I suppose that might feed into making
         | ASICs for deep learning at some point.
        
           | saddlerustle wrote:
           | Bitmain is one of TSMC's largest customers, and that revenue
           | has absolutely been reinvested by TSMC in developing more
           | advanced fabrication techniques.
           | 
           | Also, bitcoin mining chips are actually a lot like deep
           | learning chips, in that both scale out large numbers of
           | simple operations. And indeed, Bitmain now produces deep
           | learning chips too.
        
         | peterthehacker wrote:
         | Aren't companies like Google [0] and Nvidia [1] already doing
         | this?
         | 
         | The paper's point is that eventually we will hit limits on
         | computing power, and then we will have to make deep learning
         | algorithms more efficient in order to keep making progress.
         | From the abstract:
         | 
         | > Continued progress in these applications will require
         | dramatically more computationally-efficient methods, which will
         | either have to come from changes to deep learning or from
         | moving to other machine learning methods.
         | 
         | [0] https://cloud.google.com/tpu [1] https://www.nvidia.com/en-
         | us/data-center/v100/
        
       | 256lie wrote:
       | I wonder how long we, as a community of researchers, can keep
       | overfitting to these benchmark datasets. How much of ImageNet
       | is labeled incorrectly or suboptimally?
        
       | freeone3000 wrote:
       | Will it? Or, like capital-intensive industries of the past, will
       | deep learning simply funnel its profits into bigger and bigger
       | computers?
        
         | rabidrat wrote:
         | These are order-of-magnitude increases. If it costs $5m to
         | train GPT-3, which is 100x more compute than GPT-2, then it may
         | cost $500m to train GPT-4, and $50b to train GPT-5. This is
         | what is meant by economically (not to mention environmentally)
         | unsustainable.
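           | The geometric scaling above can be sketched as a toy
           | extrapolation. Note the 100x-per-generation multiplier and
           | the $5m base figure are the comment's illustrative
           | assumptions, not measured costs:

```python
# Toy extrapolation of training cost, assuming (as in the comment
# above) that each GPT generation needs ~100x the compute of the
# previous one and that cost scales linearly with compute.
def projected_cost(base_cost_usd: float, generations_ahead: int,
                   factor: float = 100.0) -> float:
    """Projected training cost `generations_ahead` steps past the
    base model, under a constant per-generation compute multiplier."""
    return base_cost_usd * factor ** generations_ahead

GPT3_COST = 5e6  # ~$5m, the figure quoted in the comment above

print(projected_cost(GPT3_COST, 1))  # GPT-4: 500000000.0 ($500m)
print(projected_cost(GPT3_COST, 2))  # GPT-5: 50000000000.0 ($50b)
```

           | A constant multiplier means each generation's cost grows by
           | the same factor, which is why the totals quickly become
           | economically unsustainable.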
        
           | freeone3000 wrote:
           | Environment aside (I don't even think the CURRENT rate of
           | training is environmentally safe) -- I see no inherent
           | problem with a 100x increase in cost per step holding anybody
           | back. Once it's trained, you can run it much cheaper. Who's
           | to say $50 billion of value can't be extracted from GPT-5?
        
             | AnimalMuppet wrote:
             | > Who's to say $50 billion of value can't be extracted from
             | GPT-5?
             | 
             | Perhaps it can. (Though the number of companies that can
             | afford to train it is rather small.) But can $5 trillion be
             | extracted from GPT-6? Even if it can, who can afford to
             | train it?
        
             | saddlerustle wrote:
             | Using electricity is not inherently damaging to the
             | environment. Very low-cost, high-output zero-carbon
             | generation sources exist (hydro and nuclear). For scale,
             | also keep in mind that all the datacenters in the world
             | still use much less power than aluminium smelting does.
        
       ___________________________________________________________________
       (page generated 2020-08-17 23:00 UTC)