[HN Gopher] Animate Anyone: Image-to-video synthesis for character animation
       ___________________________________________________________________
        
       Animate Anyone: Image-to-video synthesis for character animation
        
       Author : jasondavies
       Score  : 66 points
       Date   : 2023-11-30 17:45 UTC (5 hours ago)
        
 (HTM) web link (humanaigc.github.io)
 (TXT) w3m dump (humanaigc.github.io)
        
       | esotericsean wrote:
       | Pretty huge breakthrough. Hopefully we'll be able to access this
       | soon. Between this, SVD, and others, 2024 is going to be the year
       | of AI Video.
        
         | esafak wrote:
         | I suppose SVD means something other than singular value
         | decomposition, because that is not new?
        
       | allanrbo wrote:
       | Very impressive quality.
        
       | EwanG wrote:
       | I'm just waiting for the tool or toolchain where I can take a
       | manga that I like that doesn't have an anime, and get a season or
       | two out to watch when I feel like it rather than wait for it to
       | get an official release.
       | 
        | Bonus points if I can let the tool ingest season 1 or an OVA of
        | said material when a season 2 is never going to come (looking at
        | you, "No Game, No Life").
        
         | Pxtl wrote:
         | "Hey Bing, can you make me a live action version of the
         | Scouring of the Shire as if it were part of the Peter Jackson
         | Lord of the Rings movies?".
        
         | all2 wrote:
          | To be honest, all the pieces to build such a pipeline exist
          | right now. There's still a lot of work left on the human side
          | for shot composition, camera movement, etc., but nothing is
          | missing to make this a reality.
        
       | all2 wrote:
        | I'll mention Corridor Crew's _Rock, Paper, Scissors_ [0] as the
        | previous state of the art in character animation/style
        | transfer/etc. using AI tooling.
       | 
       | I imagine this will make the barrier to entry for animated stuff
       | very, very low. You literally only need a character sheet.
       | 
       | Also, the creep factor for AI girlfriends has ratcheted up
       | another notch.
       | 
       | [0] https://www.youtube.com/watch?v=7QAGEvt-btI
        
       | elpocko wrote:
        | Why would you publish your findings on GitHub of all places,
        | but not release any code? I think this trend is really weird.
        
         | crazygringo wrote:
         | Because it's basically free webhosting and you don't need to
         | manage registering a domain?
         | 
          | I don't know for sure, but that's my guess. You could achieve
          | something similar with S3, but you need a credit card
          | attached, and then you need to worry about whose credit card
          | it is, what happens if it gets unexpected traffic, and who
          | will pay...
          | 
          | You could use Google Sites as well, but then you need to buy
          | a domain, which again requires a credit card, and then whose
          | responsibility is it to pay, and for how many years?
          | 
          | I don't think it's mostly about the cost; I think it's mostly
          | about just not having to link a credit card?
        
         | octagons wrote:
          | Just guessing based on the authors' names and their
          | affiliation with Alibaba Group, but I think this research was
          | published exclusively by Chinese citizens.
          | 
          | In my experience, it's difficult to operate a small personal
          | website from within China because of the regulations
          | regarding non-government websites. Because of this, you will
          | often find that Chinese citizens use approved (or at least
          | unrestricted) services like GitHub Pages.
         | 
         | Having worked closely with many businesses based in China due
         | to my hobbies, I have noted that services like Google Docs and
         | Drive are favored for this reason.
         | 
          | I would guess there are ways to host content like this more
          | easily on platforms that are only accessible within China, or
          | that are not navigable without understanding Chinese.
         | 
         | I would also guess that this is part of the reason why services
         | that target customers in China tend to become "super apps" that
         | combine several services that non-Chinese users would expect to
         | find on disparate sites. For example, services may combine a
         | social media style newsfeed/interaction API, banking, email,
         | shopping, and food delivery into a single platform.
        
         | Kiro wrote:
         | What's the alternative? I haven't found anything that's easier
         | to deploy and manage than GitHub Pages.
        
       | crazygringo wrote:
       | Just wow. This is the first time I've seen AI generate convincing
       | human movement. (And the folds of fabric in dresses seem to move
       | realistically too!)
       | 
        | Of course, the actual movement skeleton presumably comes from
        | real motion capture, but still.
       | 
       | I'm curious what the current state is of _generating_ the
       | movement skeletons, which obviously has a ton of relevance to
        | video games. Where's the progress in models where you can type
       | "a burly man stands with erect posture, then crouches down and
       | sneaks five steps forward, then freezes in fear" and output an
       | appropriate movement skeleton?
        
         | netruk44 wrote:
          | The input poses appear to be generated with OpenPose [0],
          | which takes regular images as input. With the arrival of
          | Stable Video Diffusion, you could theoretically prompt it to
          | generate a video of what you want, then run that through
          | OpenPose.
         | 
         | But I think the current approach is to take a picture/video of
         | yourself doing the motions you want the AI to generate. It's a
         | pretty low barrier to entry. Just about anyone with a
         | smartphone or webcam could make one.
         | 
          | Using just words leaves a lot to the model's interpretation.
          | I feel like you might wind up spending a lot of time manually
          | fixing little things, similar to how you might infill the
          | "wrong" parts of an AI-generated image. It might be easier to
          | just take a 15-second video to get the exact skeleton
          | animation you want.
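          | 
          | As a rough sketch of the video-to-skeleton step (assuming
          | OpenPose was built with its Python bindings enabled; the
          | function name extract_pose_sequence is made up for
          | illustration, not part of OpenPose):
          | 
          |     # Extract per-frame body keypoints from a video using
          |     # OpenPose's Python API (pyopenpose). "models/" must
          |     # point at OpenPose's downloaded model folder.
          |     import cv2
          |     import pyopenpose as op
          | 
          |     def extract_pose_sequence(video_path):
          |         wrapper = op.WrapperPython()
          |         wrapper.configure({"model_folder": "models/"})
          |         wrapper.start()
          |         poses = []
          |         cap = cv2.VideoCapture(video_path)
          |         while True:
          |             ok, frame = cap.read()
          |             if not ok:
          |                 break
          |             datum = op.Datum()
          |             datum.cvInputData = frame
          |             wrapper.emplaceAndPop(op.VectorDatum([datum]))
          |             # poseKeypoints: (people, 25 joints, [x, y, conf])
          |             poses.append(datum.poseKeypoints)
          |         cap.release()
          |         return poses
          | 
          |     skeletons = extract_pose_sequence("my_dance_clip.mp4")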
         | 
         | [0] https://github.com/CMU-Perceptual-Computing-Lab/openpose
        
         | anonylizard wrote:
          | This is already highly, highly relevant to 2D animation.
          | 
          | Many complex moves (especially dancing) are filmed on video
          | first, and then the movement is traced over by hand. This is
          | called rotoscoping.
          | 
          | This is basically auto-rotoscoping, and I expect to see it in
          | commercial use in popular high-budget projects within two
          | years. Previously, even the highest-budget anime couldn't
          | really afford 2D dance scenes because of the insane number of
          | drawings required.
        
       | tobr wrote:
       | That's quite astonishing. In just a few years this might even be
       | generalized to work for characters other than conventionally
       | attractive young women.
        
         | nwoli wrote:
         | You really want that to be a comment you make on revolutionary
         | new tech? Think of what you'd think of finding dismissive
         | comments like that about the invention of the telephone
        
           | dvngnt_ wrote:
            | For ML, questioning the data seems okay.
        
           | redleggedfrog wrote:
            | I'll go one further. The uncomfortable fixation on
            | attractive women for the models is only an _inkling_ of
            | what will be the primary application of this technology:
            | porn. No matter how amazing the tech underlying these
            | animated stills may be, the race for the lowest common
            | denominator is already shining through in their examples.
            | Don't think for one moment they didn't choose those on
            | purpose. They know where it's heading.
        
       | hombre_fatal wrote:
       | This is absolutely insane. The DreamPose output they compare
       | themselves to is less than one year old.
       | 
        | It's funny to go back to the first avocado chair or DeepDream
        | images that wowed me just a couple of years ago.
       | 
       | I can't help but feel lost in the pace of tech.
        
         | johnyzee wrote:
         | It is a massive seismic shift. Almost any project in the works
         | right now, that involves visual content, looks like it will be
         | antiquated in a very short time. Including some that I am
         | working on :(. The new possibilities though... Breathtaking to
         | think about.
        
       | sys32768 wrote:
       | Just imagine when this merges with 3D modeling and VR.
       | 
       | The VR pr0n, the video games with dynamic AI characters. Dead
       | actors and historical figures resurrected into movies and
       | education.
       | 
       | I'm not so scared about my future nursing home now.
        
       | huytersd wrote:
       | How do you generate the movement data?
        
         | netruk44 wrote:
         | It looks like they're using OpenPose [0] images fed to a
         | special "pose guider" model. You can make them from just
         | regular video.
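          | 
          | If you just want the rendered skeleton images (the kind of
          | pose input a "pose guider" would consume) rather than raw
          | keypoints, here's a hypothetical sketch using the
          | controlnet_aux package's OpenposeDetector, a reimplementation
          | rather than the original CMU code:
          | 
          |     # Render OpenPose-style skeleton images from each frame
          |     # of a reference video using controlnet_aux.
          |     import cv2
          |     from PIL import Image
          |     from controlnet_aux import OpenposeDetector
          | 
          |     detector = OpenposeDetector.from_pretrained(
          |         "lllyasviel/Annotators")
          | 
          |     cap = cv2.VideoCapture("reference_motion.mp4")
          |     pose_frames = []
          |     while True:
          |         ok, frame = cap.read()
          |         if not ok:
          |             break
          |         rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
          |         # Returns a PIL image of the detected skeleton.
          |         pose_frames.append(detector(Image.fromarray(rgb)))
          |     cap.release()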
         | 
         | [0] https://github.com/CMU-Perceptual-Computing-Lab/openpose
        
       ___________________________________________________________________
       (page generated 2023-11-30 23:00 UTC)