[HN Gopher] Text2LIVE: Text-Driven Layered Image and Video Editing
___________________________________________________________________
Text2LIVE: Text-Driven Layered Image and Video Editing

Author : lnyan
Score  : 96 points
Date   : 2022-07-10 09:49 UTC (1 day ago)

(HTM) web link (text2live.github.io)
(TXT) w3m dump (text2live.github.io)

| cube2222 wrote:
| That's really cool, and could realistically end up being very
| useful as an end-user product.
|
| I'm just waiting for the GIF keyboard that creates GIFs based on
| your prompt instead of searching through an existing database of
| them.
|
| That will be truly next-level.
| egfx wrote:
| [deleted]
| moritonal wrote:
| Wtf, after waiting about 2 mins for your site to load all I
| could tell from your product page is that I could somehow
| spend $1,299.00 on whatever it was.
| metadat wrote:
| I also had the same experience, this feels like a huckster
| product plug for SEO.
|
| The immediate redirect is a little odd as well, malware
| payload delivery anyone?
| upupandup wrote:
| How long until we can write stuff like: make everybody nude in
| this kpop video?
| Existenceblinks wrote:
| Looks promising. I think this would end up with some well-defined
| schema + some DSL.
| minimaxir wrote:
| This just reinforces my hypothesis that OpenAI's release of CLIP
| in 2021 was more impactful to image research than the DALL-E
| paper.
| zmgsabst wrote:
| If you missed it, like I did, here's a blog post by OpenAI:
|
| https://openai.com/blog/clip/
| metadat wrote:
| Our best minds are working on this amazing near-magical new
| technology... which will end up being productized into an
| Instagram Filter service to dynamically inject a stained glass
| unicorn in place of a horse in a video.
|
| That's cool and all, but also really stupid and a pointless
| distraction compared to how novel the underlying mathematics and
| science are. This will quickly become a commodity and humans will
| acclimate to seeing such tricks.
| The content produced won't even
| be considered particularly impressive.
|
| Damn. I was hoping the singularity would be better.
| skybrian wrote:
| Yes, lowering the cost of special effects means that they're
| not that special and there will be lots of crap. On the other
| hand, it lowers the cost of filmmaking, so there should be
| storytellers who use this to good effect, where the special
| effect isn't that noticeable but serves the story?
|
| Though, a question is whether the good storytellers can be
| found easily? It seems like the situation is similar in fan
| fiction.
| naillo wrote:
| I think now is a good time (now that they're out of research
| zone and actually working) for a flood of new people to enter
| this space and try to come up with more creative ideas than
| that. It's an exciting time for cool ideas and cool projects if
| we take off our cynicism hat for a bit.
| metadat wrote:
| How is this going to be accessible to the new wave of people
| you're imagining?
|
| I'd be all for it! Just not clear on a plausible path for how
| this better future comes to pass.
| naillo wrote:
| Well you will have to do some work to understand them (read
| papers etc). But distilled (from larger models) versions of
| these things are fairly capable of being computed even on
| the web etc. There's definitely cool low-hanging fruit here
| (though it's not plug-and-play, import a library). The main
| thing is that these have been proven to be as powerful as
| they are (last year it wasn't clear they'd be able to get
| this good), so with some effort there's definitely cool
| stuff to be built. I'm excited (and working on it myself).
| TaylorAlexander wrote:
| I see it differently. As a robotics engineer I know the biggest
| impediment to robotics development is getting computers to
| understand the real world.
| The work on multimodal neurons,
| which see the word cake and know to associate it with images of
| cake, is a key stepping stone along the way to a fully
| functional embodied AI that can solve difficult real world
| problems. CLIP, DALL-E, and all these offshoots are
| representations of what we can pull from these efforts today.
| But long term this work will be incorporated into bigger and
| more capable AI systems.
|
| Just think: when I ask you "walk into the workshop, grab a
| hammer and a box of nails, and meet me on the roof to help me
| secure some loose shingles" your mind is already imagining the
| path you will take to get there, what it will look like when
| you locate and grab the hammer and nails, and you've filled in
| that to get on the roof you have to meet me in the back yard to
| climb the ladder, which I never mentioned.
|
| All these tiny details your mind handles effortlessly take huge
| efforts like CLIP to sort out how to make them work. And even
| CLIP is only text and images. There is a lot more to go from
| there.
|
| A lot of people focus on DALL-E and the artifacts that come out
| along the way, but these are not the destination, just little
| stops showing the progress we are making on a much larger
| journey.
| [deleted]
| zitterbewegung wrote:
| The thesis "Best new minds" is not even correct. If the "Best
| new minds" didn't create things like social networking and
| Instagram we wouldn't even have the data sources these new
| algorithms build upon. Also, without a bunch of video game
| makers actually pushing graphics cards we also wouldn't have
| the hardware to do these things. So the "best new minds"
| accidentally enabled more "best new minds".
| avgcorrection wrote:
| So what?
|
| 1. Their priorities are wrong so they are not the best minds
|
| 2. If (1) is false because the best minds can have stupid
| priorities, then The Best Minds is not the be-all-end-all of
| everything
| graiz wrote:
| We're still in the first innings of this stuff. Zero-shot/CLIP
| work will extend to audio, music, music videos and perhaps full
| movies. I would love to see: "The Phantom Menace with Jar Jar
| Binks replaced by Leonardo DiCaprio using the voice of James Earl
| Jones"
___________________________________________________________________
(page generated 2022-07-11 23:00 UTC)