hngopher.com

       [HN Gopher] Emu Video and Emu Edit, our latest generative AI res...
       ___________________________________________________________________
        
       Emu Video and Emu Edit, our latest generative AI research
       milestones
        
       Author : ot
       Score  : 112 points
       Date   : 2023-11-16 15:59 UTC (7 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
       | dougmwne wrote:
       | Emu Edit is awesome. I think we have officially brought this
       | scene from Star Trek to life.
       | 
       | https://m.youtube.com/watch?v=NXX0dKw4SjI&pp=ygUII3Npbm50ZWs...
        
         | clows wrote:
         | I thought of this Running Man scene
         | https://www.youtube.com/watch?v=BVdOr0z6X7Y
        
         | bane wrote:
         | With the advent of these models my head canon now insists that
         | when Star Trek characters say they "programmed" something, they
         | really mean that they have a log of all of their iterative
         | prompts and that there's some optimization the computer can use
         | to aggregate all of those into the final resulting warp
         | model/holodeck simulation/transporter filter/biobed pathogen
         | detector/etc without having to reiterate through all of those
         | prompts again...kind of like a NixOS declarative build.
         | 
         | And when somebody comes along and fixes their program or
         | reprograms what they did, they simply insert or change some of
         | the prompts along the way and get a different effect.
         | 
         | When the characters add new data to the computer (like the
         | episode where Geordi added the psych profile of the Enterprise
         | engine designer), they're just tuning the foundational model
         | with some new input data.
         | 
         | Yeah....that _feels_ right for now to me.
        
         | cma wrote:
         | "Computer, load up CELERY MAN, please"
         | 
         | https://www.youtube.com/watch?v=a8K6QUPmv8Q
        
           | echelon wrote:
           | Tim and Eric are going to go crazy with Gen AI. They won't
           | need Adult Swim to toss them shoestring budgets.
        
         | Ajedi32 wrote:
         | > Computer, show me a table.
         | 
         | > There are 5047 classifications of tables on file. Specify
         | design parameters.
         | 
         | Interestingly enough, it seems existing AI models are already
         | better than the Star Trek computer at dealing with ambiguity.
         | Stable Diffusion would just generate a "normal" table and let
         | you go from there.
        
           | dougmwne wrote:
           | Yes, they seem to handle emotion, humor and ambiguity better
           | than Data or any computer ever on the shows. 24th century
           | technology, today.
        
       | colesantiago wrote:
       | Does anyone know where the source code is? I can't seem to find
       | it anywhere.
        
         | dado3212 wrote:
         | I don't think either of these (or the base Emu model) are open
         | source.
        
           | acheong08 wrote:
           | That's a bit disappointing. Meta has been on an "open source"
           | roll lately.
        
             | JaDogg wrote:
             | First dose of gen AI is free
        
             | satvikpendem wrote:
             | Technically none of their models are actually open source.
        
         | burningion wrote:
         | There's some source code in the paper for Emu Edit at least. If
         | you look at the supplementary material in the paper, you'll see
         | they spell out the techniques used there too.
         | 
         | I didn't see a repository, but I think in this case, the paper
         | is actually a perfect balance of detail? I think Meta benefits
         | from startups building using their tooling (startups usually
         | buy ads), and so the lack of a full implementation leaves a bit
         | of room for startups to turn the work into something a bit
         | more production ready.
         | 
         | The cool techniques from the paper are:
         | 
         | Generating a bunch of example images in one go, and using CLIP
         | to score your generated images
         | 
         | And mixing pre-built pipelines and grammars to execute common
         | tasks.
         | 
         | These two ideas alone (with examples) give people in the space
         | plenty to run with.
         | 
         | Great paper!
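
The "generate many candidates, then score them" idea mentioned above can be sketched roughly as follows. This is only an illustration, not the paper's implementation: it assumes text and image embeddings have already been produced by a CLIP-style model, and uses small toy vectors in their place. The function names `cosine_similarity` and `rank_candidates` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    text_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> List[int]:
    """Return candidate indices ordered from best to worst match with the text."""
    scores = [cosine_similarity(text_embedding, e) for e in image_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for embeddings (assumption: real ones would come from a
# CLIP-style text/image encoder and have hundreds of dimensions).
prompt_embedding = [1.0, 0.0]
candidate_embeddings = [
    [0.0, 1.0],  # unrelated to the prompt
    [1.0, 0.0],  # closely matches the prompt
    [1.0, 1.0],  # partially matches the prompt
]

ranking = rank_candidates(prompt_embedding, candidate_embeddings)
best_index = ranking[0]
```

In a real pipeline you would keep the top-ranked image (here `candidate_embeddings[best_index]`) and discard the rest, or apply a score threshold to filter the whole batch.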
        
       | enonimal wrote:
       | Is anyone able to determine how long it takes to generate a video
       | with one of these methods? Can't find it in the paper.
        
         | liuliu wrote:
         | Emu image is not significantly slower than SDXL or similar, so
         | you would expect performance similar to Hotshot. The upscaler
         | (8 frames to 37 frames) version would probably take
         | significantly longer.
        
       | tomdell wrote:
       | An impressive technical achievement, yes - but the
       | presentation/marketing of this is absurd.
       | 
       | The generated videos are aesthetically horrendous. I don't know
       | what kind of mental gymnastics are going on that they can
       | confidently describe something where the body shapes are
       | nonsensically in flux with every change of frame (look at the
       | eagle's talons, or the dog's leg movements as it runs) as "high-
       | quality video".
       | 
       | Is generative AI hype blinding them to how hideous these videos
       | are, or do they know and they just pretend like it's something it
       | isn't?
        
         | BoorishBears wrote:
         | Check out what AI generated images looked like 24 months ago
         | and this comment may feel a lot less pithy.
        
         | ShamelessC wrote:
         | Compared to prior work, it looks unbelievable. Is this just an
         | armchair criticism or have you been paying any attention?
        
       | davesque wrote:
       | Definitely looks like progress, but they're still firmly in the
       | center of the uncanny valley.
        
       | scudsworth wrote:
       | a huge pile of money on fire forever
        
       ___________________________________________________________________
       (page generated 2023-11-16 23:00 UTC)