[HN Gopher] Pathways Autoregressive Text-to-Image Model (Parti)
       ___________________________________________________________________
        
       Pathways Autoregressive Text-to-Image Model (Parti)
        
       Author : amrrs
       Score  : 56 points
       Date   : 2022-06-22 17:37 UTC (5 hours ago)
        
 (HTM) web link (parti.research.google)
 (TXT) w3m dump (parti.research.google)
        
       | minimaxir wrote:
        | Like Imagen, Parti is not open-sourced or easily accessible, and
        | for the same reasons.
        
       | htrp wrote:
       | https://news.ycombinator.com/item?id=31484562
       | 
        | Relevant discussion of the previous model (Imagen)
        
       | davikr wrote:
       | It's interesting how LAION-400M, an open-access dataset for
       | democratized AI, was used to train this model which will
       | seemingly never be truly available in its full capacity for the
       | lay population. Is it time for open-access datasets to consider
       | licensing measures to prevent this?
        
         | 6gvONxR4sf7o wrote:
         | More restrictive licensing wouldn't be enough. This stuff is
         | sufficiently transformative to count as fair use without any
         | permission at all from the data owner. New laws will be
         | required for stuff like this.
        
       | isoprophlex wrote:
       | By now these researchers could show me a deep learning model that
       | accurately predicts the future, and I'd shrug my shoulders and
       | say "so what?".
       | 
       | As a mortal, there's not much to learn from these insanely big
       | models anymore, which makes me kinda sad. Training them is
       | prohibitively expensive, the data and code are often
        | inaccessible, and I highly suspect that the learning rate
       | schedules to get these to converge are also black magic-ish...
        
         | [deleted]
        
         | albertzeyer wrote:
         | There is public code and data available to train similar models
         | (text generation, image generation, whatever you like).
         | Training details are also often available. The learning rate
         | schedule is actually nothing special.
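          | 
          | To make that concrete, here is a minimal sketch of the kind of
          | warmup-plus-decay schedule commonly used for large Transformer
          | models (the constants are illustrative, not the ones used for
          | Parti):
          | 
          |     # Illustrative only: linear warmup, then inverse-sqrt decay.
          |     def learning_rate(step, d_model=1024, warmup_steps=10_000):
          |         step = max(step, 1)
          |         return (d_model ** -0.5) * min(
          |             step ** -0.5, step * warmup_steps ** -1.5)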
         | 
         | However, you are fully right that the computation costs are
         | very high.
         | 
          | One thing we can learn is: it really works. It scales up and
          | gets better without anything particularly special being done,
          | which was unexpected to most people and is really interesting.
          | Most people expected that there is some limit at which the
          | performance would level out, but so far this does not seem to
          | be the case. It rather looks like you could keep scaling up as
          | much as you want and keep getting better and better
          | performance, without any obvious limit.
         | 
         | So, what to do now with this knowledge?
         | 
          | Maybe we should focus the research on reducing the computation
          | costs, e.g. through better hardware (maybe neuromorphic) or
          | more computationally efficient models.
        
         | adamsmith143 wrote:
         | >By now these researchers could show me a deep learning model
         | that accurately predicts the future, and I'd shrug my shoulders
         | and say "so what?".
         | 
         | So what? Are you just taking the piss? You are saying a literal
         | Oracle wouldn't be impressive because the "learning rate
         | schedules" are black magic??
        
       | moritonal wrote:
        | I can't tell whether it's a curse or a blessing that the company
        | seemingly most invested in, and most successful at, AI is also
        | the one that seems least capable of commercially leveraging it,
        | having failed to do so with most of its products.
        
         | albertzeyer wrote:
         | But they actually have machine learning in almost all their
         | products. E.g. Google Search, YouTube, GMail, Maps, AdSense,
         | all have machine learning at their core.
        
           | htrp wrote:
            | The ML in Google Search is apparently atrocious... see all
            | of the posts about how Search doesn't work anymore.
        
       | narrator wrote:
       | It's funny how they never release the model. I guess they are
       | scared of spammers, 4chan or worse, the Russians. This is the
       | harbinger of the future, isn't it? Technology that's too powerful
       | to be widely deployed is kept under lock and key by priests who
       | deal in secret knowledge only available to the properly
       | initiated.
        
         | jasonwatkinspdx wrote:
         | I really doubt that the motivation for not publishing the model
         | for "turns text into trippy images" is "it's too powerful to
         | trust the world with it" vs banal business reasons.
        
         | nootropicat wrote:
          | Eh, the cost of training is already within the reach of wealthy
          | individuals, and as better TPUs/GPUs appear in the cloud, the
          | cost drops.
         | 
         | This is definitely going to become commonly available tech this
         | decade.
        
       | albertzeyer wrote:
        | There is equal contribution, core contribution, and then the
        | order of authors. Which of these attributes actually carries
        | which meaning?
        | 
        | I thought the order indicates how much someone has contributed.
        | Core contribution sounds like it should be the most, so those
        | authors should come first, but that is not the case here. Equal
        | contribution sounds like those authors should appear right next
        | to each other in the order, but that is also not the case here.
        
       | aantix wrote:
       | What are the inputs derived from the training images?
       | 
        | Do they do object detection beforehand, and if so, at what
        | granularity (toes, eyes, hat, gloves, etc.)? Or is it at the
        | pixel level?
        
         | Imnimo wrote:
         | I think the training data is just image/caption pairs. I don't
         | think there's any notion of localizing or detecting specific
         | objects in the training images.
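          | 
          | A minimal sketch of what such training examples might look like
          | (field names are illustrative, not taken from the paper):
          | 
          |     # Illustrative only: plain (image, caption) pairs, with no
          |     # object-level annotations such as boxes or masks.
          |     from dataclasses import dataclass
          | 
          |     @dataclass
          |     class Example:
          |         image_path: str  # raw pixels on disk
          |         caption: str     # free-form text description
          | 
          |     dataset = [
          |         Example("imgs/0001.jpg",
          |                 "a blue bicycle leaning against a brick wall"),
          |         Example("imgs/0002.jpg",
          |                 "two dogs playing fetch on a sandy beach"),
          |     ]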
        
       | ceeplusplus wrote:
       | Yet another LLM that is not released because the model doesn't
       | produce outputs which align with the researchers' Western,
       | liberal viewpoint. If the authors care so much then why are they
       | even releasing the architecture? To do the research but not
       | release the model weights because your feelings were hurt by the
       | output of some matrix multiplication is hypocrisy at its finest -
       | the authors get all the PR attention and benefits of publishing
       | with the veneer of being politically correct, but the actual
       | negative impact is not mitigated in the slightest. The real
       | difficulty is not reproducing the research but identifying the
       | architecture that works best in the first place, and the authors
       | have done that for any would-be malicious actors.
        
         | ghostly_s wrote:
         | > because the model doesn't produce outputs which align with
         | the researchers' Western, liberal viewpoint.
         | 
         | What evidence do you have for claiming this motivation?
        
         | davikr wrote:
         | Yes, thankfully Google has saved us from this one-in-a-century
         | world-ending catastrophe.
        
       ___________________________________________________________________
       (page generated 2022-06-22 23:00 UTC)