[HN Gopher] How Sber Built ruDALL-E
___________________________________________________________________
How Sber Built ruDALL-E

Author : aroccoli
Score  : 46 points
Date   : 2021-12-29 20:12 UTC (1 days ago)

(HTM) web link (serokell.io)
(TXT) w3m dump (serokell.io)

| [deleted]

| amelius wrote:
| These models all seem to have the flaw that faces don't come out
| symmetrically. The eyes especially look like they are in the
| wrong location.

| criticaltinker wrote:
| _> The model is considered the greatest computational project in
| Russia for now, totaling 24,256 GPU days to train the models._
|
| _> We don't know for sure why OpenAI hasn't shown its work in a
| more reproducible way. But this step is definitely done to
| stimulate the further openness and progress of such models._
|
| Super interesting and great commentary, thanks for sharing!

| f311a wrote:
| Sber also has an open-source version of GPT-3 for Russian.
|
| Sber is a state-owned Russian bank, which is a pretty funny detail
| given that a lot of banks can't even build a decent mobile app.

| cpursley wrote:
| The Sberbank mobile app in Russia is an order of magnitude
| better than anything I've used in the US. The other large
| Russian tech and service company apps are very, very good
| (Ozone, anything Yandex puts out). Even the federal services
| apps are well executed - you can pay your property taxes and
| other services by scanning a QR code. Some great tech coming
| out of that country (accusations of hacking aside).

| baybal2 wrote:
| Sberbank is a joke of a bank, mostly serving the older
| generation, who kept using it out of inertia from the time it
| was the only bank in the country.
|
| Generally, it's bureaucratic, Kafkaesque, and ill, like the
| country which once made it.
| trhway wrote:
| The Sberbank CEO (he is a Russian German and has some typical
| traits making him noticeably different from the typical Russian
| bureaucrat) and his posse are the leading part of the
| technocratic wing of the political elite in Russia. Their
| people also lead another important bank - VTB
| (international payments/etc for large corps), and Sber has a
| strong hold on various national networks: naturally
| anything money related, such as municipal services and traffic
| ticket payments, as well as generic network
| infra and datacenters. If Putin is gone tomorrow there is a
| strong chance that those technocrats will take power (I
| haven't noticed any significant animosity between them and the
| FSB, which would otherwise be a complication). A particularly
| important aspect showing their power is that there have
| been no corruption scandals associated with them, at least
| not that I can remember in the last decade. They
| tread very carefully, not making any open political claims
| while presenting themselves basically as an apolitical
| tech-infrastructure/platform for efficient government and
| society and doubling down on the source of their shadow
| power - network/infra/technocracy. Thus they can't allow
| themselves to suck too much technically, and thus they
| naturally hire decent technical people (I have some first
| handshakes among the upper management in technology there).

| kgeist wrote:
| Have you used Sberbank lately? I have a different
| experience and I'm not from the "older generation". Its
| mobile app is pretty decent; this year I got a mortgage
| loan and it went pretty smoothly. I didn't notice anything
| bureaucratic or Kafkaesque about it. I've been its client for
| 4 years now and I'm struggling to remember a negative
| experience with it. They've been having an overhaul lately,
| maybe it was far worse before. Yeah, the cool kids prefer
| Tinkoff nowadays, but it's not true that only old people use
| Sberbank.
| cpursley wrote:
| Even so, they do a pretty good job for a state-backed bank.
| Better than anything state run I've experienced in the US.
|
| But I agree in principle with you - and from what I hear,
| Tinkoff is one of the better choices and the founder is
| well respected.

| zkid18 wrote:
| Well, so do 95% of retail banks across the globe.

| another_kel wrote:
| It's a shitty bank by Russian standards indeed, but this
| has nothing to do with the fact that
|
| >The Sberbank mobile app in Russia is an order of magnitude
| better than anything I've used in the US.

| minimaxir wrote:
| A very curious effect of ruDALL-E is that the finetuning works on
| small datasets with unexpectedly good results. The Sneakers
| example they note in this article is on about 10k images.
|
| As an experiment, I finetuned ruDALL-E on about 1000 images of
| Pokemon and generated from that, which yielded incredible results
| that went viral:
| https://twitter.com/minimaxir/status/1470913487085785089
|
| I then tried finetuning ruDALL-E on _1_ Pokemon, and still got
| good/horrifying results:
| https://twitter.com/minimaxir/status/1474913997807755268
|
| Unfortunately it's still a convoluted process to finetune
| ruDALL-E; I hope they end up releasing a smaller model to make it
| possible to do on a smaller/free GPU. (If they do, I'll release a
| streamlined Colab notebook + blog post on how to do it.)

| [deleted]

| [deleted]

| etaioinshrdlu wrote:
| How much GPU RAM and time does it currently take to fine-tune
| the current model?

| minimaxir wrote:
| Essentially all of a 16GB GPU's VRAM, even with some layers
| frozen.
|
| The more diverse the input images, the longer/more epochs the
| finetuning process should take in order to get stable
| results. The first Pokemon model was trained for about 4.5
| hours; the one-shot model was about 2 minutes.

| lostmsu wrote:
| Curious. How does freezing layers save you memory? Does it
| save much compute time?
|
| I understand the frozen layers do not need gradients to be
| stored?

| minimaxir wrote:
| Essentially yes. That technique is not exclusive to
| ruDALL-E; large models often freeze early layers and
| train lower layers only due to VRAM constraints.

| lostmsu wrote:
| Oh, right, only freezing early layers makes sense. I was
| thinking you froze inner ones, but gradients would need
| to be computed and kept for them to backpropagate to the
| unfrozen early ones.
___________________________________________________________________
(page generated 2021-12-30 23:00 UTC)
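
The memory question in the last exchange (why freezing layers helps, and why freezing only inner layers helps less) can be illustrated with some back-of-the-envelope arithmetic. The following is only a rough sketch under stated assumptions, not ruDALL-E's actual training code: the layer count, the per-layer parameter count, fp32 storage, and an Adam-style optimizer with two moment buffers are all hypothetical choices, and activation memory is not counted in bytes at all.

```python
from dataclasses import dataclass
from typing import List, Set

BYTES_PER_FP32 = 4  # assumption: all values are stored as 32-bit floats


@dataclass
class Layer:
    name: str
    n_params: int
    frozen: bool = False


def training_state_bytes(layers: List[Layer]) -> int:
    """Bytes for weights, gradients and Adam moments (activations ignored)."""
    total = 0
    for layer in layers:
        values_per_param = 1       # weights are always resident
        if not layer.frozen:
            values_per_param += 3  # gradient + two Adam moment buffers
        total += layer.n_params * values_per_param * BYTES_PER_FP32
    return total


def layers_needing_backward(layers: List[Layer]) -> List[str]:
    """Names of the layers the backward pass has to traverse.

    Gradients must flow from the output back to the earliest trainable
    layer, so every layer from that one onward is involved, frozen or not.
    """
    for i, layer in enumerate(layers):
        if not layer.frozen:
            return [later.name for later in layers[i:]]
    return []


def make_stack(n_layers: int, params_per_layer: int, frozen: Set[int]) -> List[Layer]:
    """Build a hypothetical stack of equally sized blocks."""
    return [Layer(f"block{i}", params_per_layer, i in frozen) for i in range(n_layers)]


# Hypothetical model: 24 blocks of 50M parameters each.
all_trainable = make_stack(24, 50_000_000, set())
early_frozen = make_stack(24, 50_000_000, set(range(18)))     # freeze blocks 0..17
inner_frozen = make_stack(24, 50_000_000, set(range(3, 21)))  # freeze blocks 3..20

for label, stack in [("all trainable", all_trainable),
                     ("early frozen", early_frozen),
                     ("inner frozen", inner_frozen)]:
    gib = training_state_bytes(stack) / 2**30
    depth = len(layers_needing_backward(stack))
    print(f"{label}: {gib:.1f} GiB of weight/optimizer state, "
          f"backward pass through {depth} blocks")
```

In this toy accounting, both freezing schemes drop the same amount of gradient and optimizer state, but only freezing the early blocks shortens the backward pass. With inner blocks frozen, the backward pass still has to run through them (and their activations have to be kept) to reach the trainable blocks at the start.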