[HN Gopher] Emu Video and Emu Edit, our latest generative AI res...
___________________________________________________________________
Emu Video and Emu Edit, our latest generative AI research milestones
Author : ot
Score : 112 points
Date : 2023-11-16 15:59 UTC (7 hours ago)
(HTM) web link (ai.meta.com)
(TXT) w3m dump (ai.meta.com)
| dougmwne wrote:
| Emu Edit is awesome. I think we have officially brought this
| scene from Star Trek to life.
|
| https://m.youtube.com/watch?v=NXX0dKw4SjI&pp=ygUII3Npbm50ZWs...
| clows wrote:
| I thought of this Running Man scene
| https://www.youtube.com/watch?v=BVdOr0z6X7Y
| bane wrote:
| With the advent of these models my head canon now insists that
| when Star Trek characters say they "programmed" something, they
| really mean that they have a log of all of their iterative
| prompts and that there's some optimization the computer can use
| to aggregate all of those into the final resulting warp
| model/holodeck simulation/transporter filter/biobed pathogen
| detector/etc without having to reiterate through all of those
| prompts again...kind of like a NixOS declarative build.
|
| And when somebody comes along and fixes their program or
| reprograms what they did, they simply insert or change some of
| the prompts along the way and get a different effect.
|
| When the characters add new data to the computer (like the
| episode where Geordi added the psycho profile of the Enterprise
| engine designer), they're just tuning the foundational model
| with some new input data.
|
| Yeah....that _feels_ right for now to me.
| cma wrote:
| "Computer, load up CELERY MAN, please"
|
| https://www.youtube.com/watch?v=a8K6QUPmv8Q
| echelon wrote:
| Tim and Eric are going to go crazy with Gen AI. They won't
| need Adult Swim to toss them shoestring budgets.
| Ajedi32 wrote:
| > Computer, show me a table.
|
| > There are 5047 classifications of tables on file. Specify
| design parameters.
|
| Interestingly enough, it seems existing AI models are already
| better than the Star Trek computer at dealing with ambiguity.
| Stable Diffusion would just generate a "normal" table and let
| you go from there.
| dougmwne wrote:
| Yes, they seem to handle emotion, humor and ambiguity better
| than Data or any computer ever on the shows. 24th century
| technology, today.
| colesantiago wrote:
| Does anyone know where the source code is? I can't seem to find
| it anywhere.
| dado3212 wrote:
| I don't think either of these (or the base Emu model) are open
| source.
| acheong08 wrote:
| That's a bit disappointing. Meta had been on an "open source"
| roll lately
| JaDogg wrote:
| First dose of gen AI is free
| satvikpendem wrote:
| Technically none of their models are actually open source.
| burningion wrote:
| There's some source code in the paper for Emu Edit at least. If
| you look at the supplementary material in the paper, you'll see
| they spell out the techniques used there too.
|
| I didn't see a repository, but I think in this case, the paper
| is actually a perfect balance of detail? I think Meta benefits
| from startups building using their tooling (startups usually
| buy ads), and so the lack of a full implementation leaves a bit
| of room for startups to turn the work into something a bit
| more production ready.
|
| The cool techniques from the paper are:
|
| Generating a bunch of example images in one go, and using CLIP
| to score your generated images
|
| And mixing pre-built pipelines and grammars to execute common
| tasks.
|
| These two ideas alone (with examples) give people in the space
| plenty to run with.
|
| Great paper!
| enonimal wrote:
| Is anyone able to determine how long it takes to generate a video
| with one of these methods? Can't find it in the paper.
| liuliu wrote:
| Emu image is not significantly slower than SDXL or similar. So
| you would expect to have similar performance as Hotshot.
| The upscaler (8 frame to 37 frame) version probably would take
| significantly longer.
| tomdell wrote:
| An impressive technical achievement, yes - but the
| presentation/marketing of this is absurd.
|
| The generated videos are aesthetically horrendous. I don't know
| what kind of mental gymnastics are going on that they can
| confidently describe something where the body shapes are
| nonsensically in flux with every change of frame (look at the
| eagle's talons, or the dog's leg movements as it runs) as
| "high-quality video".
|
| Is generative AI hype blinding them to how hideous these videos
| are, or do they know and they just pretend like it's something it
| isn't?
| BoorishBears wrote:
| Check out what AI generated images looked like 24 months ago
| and this comment may feel a lot less pithy.
| ShamelessC wrote:
| Compared to prior work, it looks unbelievable. Is this just an
| armchair criticism or have you been paying any attention?
| davesque wrote:
| Definitely looks like progress, but they're still firmly in the
| center of the uncanny valley.
| scudsworth wrote:
| a huge pile of money on fire forever
___________________________________________________________________
(page generated 2023-11-16 23:00 UTC)