[HN Gopher] DeepNet: Scaling Transformers to 1k Layers
       ___________________________________________________________________
        
       DeepNet: Scaling Transformers to 1k Layers
        
       Author : homarp
       Score  : 4 points
       Date   : 2022-03-02 22:10 UTC (50 minutes ago)
        
 (HTM) web link (arxiv.org)
        
       ___________________________________________________________________
       (page generated 2022-03-02 23:00 UTC)