[HN Gopher] DeepNet: Scaling Transformers to 1k Layers
___________________________________________________________________
DeepNet: Scaling Transformers to 1k Layers

Author : homarp
Score  : 4 points
Date   : 2022-03-02 22:10 UTC (50 minutes ago)

(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
___________________________________________________________________
(page generated 2022-03-02 23:00 UTC)