[HN Gopher] Stable Diffusion Textual Inversion
___________________________________________________________________
 Stable Diffusion Textual Inversion
 Author : antman
 Score  : 94 points
 Date   : 2022-08-29 21:08 UTC (1 hour ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
 | bottlepalm wrote:
 | Wow, this is pretty cool. Instead of turning a picture back into
 | text, turn it into a unique concept, expressed as a variable S*,
 | that can be used in later prompts.
 |
 | It's like how humans create new words for new ideas: use AI to
 | process a visual scene and generate a unique 'word' for it that
 | can be used in future prompts.
 |
 | What would a 'dictionary' of these variables enable? AI with its
 | own language, with orders of magnitude more words. Will a
 | language be created that interfaces between all these image
 | generation systems? Feels like just the beginning here..
 | vinkelhake wrote:
 | RIP promptbase.com - they had a good run.
 | kelseyfrog wrote:
 | They won't even need to renew the domain name when it expires on
 | 2023-02-28.
 | daenz wrote:
 | This is a big deal! This adds a superpower to communication,
 | similar to how a photo is worth 1000 words. An inversion is
 | worth 1000 diffusions!
 | hbn wrote:
 | I saw talk the other day about how these ML art models aren't
 | really suited to something like illustrating a picture book,
 | because they can synthesize a character once but can't reliably
 | recreate that character in other situations.
 |
 | Didn't take long for someone to resolve that issue!
 | [deleted]
 | zone411 wrote:
 | It's not quite at that level yet. The paper introducing it
 | recommends using only 5 images as the fine-tuning set, so the
 | results are not yet very accurate.
 | zone411 wrote:
 | It should be noted that the official repo now also supports
 | Stable Diffusion: https://github.com/rinongal/textual_inversion.
 | ionwake wrote:
 | Anyone else starting to feel uncomfortable with the rate of
 | progress?
 | nmca wrote:
 | Yes.
 | https://80000hours.org/problem-profiles/artificial-intellige...
 | kelseyfrog wrote:
 | And in the blink of an eye, the career potential of all aspiring
 | "Sr. Prompt Engineer"s vanished into the whirlpool of automatable
 | tasks.
 |
 | On a more serious note, this opens the door to exploring fixed
 | points of txt2img -> img2txt and img2txt -> txt2img. It may also
 | open the door to more model interpretability.
 | keepquestioning wrote:
 | ELI5 - why has there been a cavalcade of Stable Diffusion spam
 | on HN recently? What does it all mean?
 | fxtentacle wrote:
 | You can now fire artists/designers and replace them with AI.
 | Obviously, that's cheaper.
 | dougmwne wrote:
 | Someone will surely come by soon and tell us, "well actually...
 | artists and graphic designers are irreplaceable."
 |
 | But for real, plenty of people are going to start rolling
 | their own art and skipping the artist. Not Coca-Cola, but
 | small to medium businesses doing a brochure or PowerPoint?
 | Sure!
 | visarga wrote:
 | I think there's going to be plenty of work in stacking
 | multiple AI prompts or manual retouching to fix rough spots.
 | It automates a task, not a job. Some people won't use it at
 | all, and other people will use it only for reference - in the
 | end doing everything by hand, as usual, because they have more
 | control and because AI art has a specific smell to it; people
 | will associate it with cheap.
 |
 | But it's not just for art and design; it has uses in
 | brainstorming, planning, and just visualising your ideas and
 | extending your imagination. It's a bicycle for the mind.
 | People will eat it up, old copyrights and jobs be damned.
 | It's a cyborg moment when we extend our minds with AI, and it
 | feels great. By the end of the decade we'll have mature models
 | for all modalities. We'll extend our minds in many ways, and
 | the applications will be countless. There's going to be a lot
 | of work created around it.
 | djmips wrote:
 | For sure it automates some work.
 | For example, my sometime hobby of making silly photoshops looks
 | like it will now be a whole lot easier... Visual memes can just
 | be a sentence now. For more serious work, I wonder... But it
 | does give pause about what it means for other forms of work.
 | Eji1700 wrote:
 | It'll be an interesting line to be sure.
 |
 | Right now the tech still requires some nuance to slap it all
 | together into what I think most people would want.
 |
 | While I expect the interface and the like to get a lot better,
 | all good tutorials of this tech so far show many iterations
 | over many different parts of an image to get something
 | "cohesive". Blending those little mini iterations together is
 | VASTLY easier than just making the whole thing, but not just
 | plug and play for something professional.
 |
 | Still, there will be a huge dent in how long it takes to make
 | certain styles of work, and that will lower demand
 | considerably; there's a large market of artists who thrive on
 | casual commissions which this might replace.
 | CuriouslyC wrote:
 | The Stable Diffusion model just got officially released
 | recently, and in the last week a lot of easy-to-install
 | repositories have been forked off the main one, so it's very
 | accessible for people to do this at home. Additionally, the
 | model is very impressive and a lot of fun to use.
 | keepquestioning wrote:
 | How does it compare to DALL-E?
 | CuriouslyC wrote:
 | Worse at image cohesion and prompt matching, but competitive
 | in terms of final image quality in the better cases.
 | hbn wrote:
 | It's an impressive new technology, and there's nothing else
 | out there like it in terms of the model being publicly
 | available and able to run on consumer GPUs.
 | dougmwne wrote:
 | First, it was recently released, so there's novelty. Second,
 | the code and model weights were also released, so it is open
 | and extensible, which this community loves.
 | Third, these high-quality image generation models are mind-
 | blowing to most, and it's not hard to imagine how
 | transformative they will be to the arts and design space.
 |
 | If it has any greater meaning, we might all be a little
 | nervous that it'll come for our jobs next, or some piece of
 | them. First it came for the logo designers, but I was not a
 | logo designer, and so on.
 | hwers wrote:
 | We could always do img2txt via just CLIP-embedding the image.
 | The idea that you could hide/sell prompts is silly. (Having
 | human-interpretable img2txt is cool tho.)
 | bravura wrote:
 | Is there a colab or easy-to-use demo of this?
 | zone411 wrote:
 | It requires a couple hours of actual training, so the barrier
 | to entry is higher.
 | frebord wrote:
 | How many years until we can generate a feature-length film from
 | a script?
 | bitwize wrote:
 | I want to see the Batman film where the Joker gives Batman a
 | coupon for new parents but it is expired. That should really be
 | a real film in theatres.
 | djmips wrote:
 | You 'might' enjoy Teen Titans fixing the timeline.
 | goldenkey wrote:
 | I loled.
 | anigbrowl wrote:
 | 5
 |
 | You could do storyboards from a shooting script* now, but
 | generalizing to synthesizing character and camera movement as
 | well as object physics is a ways off.
 |
 | * A version of the script used mainly by the director and
 | cinematographer, with details of each different angle to be
 | used covering the scene.
 | bottlepalm wrote:
 | It looks like this is trending towards making our
 | dreams/thoughts reality, in that what we imagine can easily be
 | turned into media - music, books, movies, etc. Pair this up
 | with VR, 'the metaverse', and you literally get the ability to
 | turn thoughts into personalized explorable realities.. what
 | happens after that?
 |
 | * Do we get lost in it?
 |
 | * Does today's 'professional' fiction become a lot less
 | lucrative when we can create our own?
 |
 | * Is there a way to leverage this technology to improve the
 | human condition somehow?
 | Loveaway wrote:
 | this probably already all happened before mate
 | afro88 wrote:
 | I think it will encourage novel ideas in all forms of art. In
 | other words, genuinely new styles and expressions will be
 | scarce, because there weren't thousands of examples of them to
 | train a model on yet.
 |
 | We will also adjust to AI-generated art like we have to other
 | creative technologies, and the novelty will wear off. We will
 | become good at identifying AI-generated art and think of it as
 | cheap.
 |
 | Still, extremely exciting.
 | xdfgh1112 wrote:
 | I can create and explore realities using my imagination alone,
 | though. I personally don't think having it become actual 2D or
 | 3D art will have a lasting impact. It might be fun for a
 | while, but it will get old.
 | bottlepalm wrote:
 | It's kind of like your imagination on steroids: the system
 | creates worlds using your imagination as the seed and
 | augments it with the summation of all the human creations
 | used to train the network. Give Stable Diffusion a sentence,
 | for example, and it will create something way beyond what you
 | could have imagined and/or created on your own.
 | Eji1700 wrote:
 | I suspect at least 10+, depending on your definition.
 |
 | Tools like this will absolutely be used by professionals to
 | cut out portions of the workload, but there's still a large
 | gap between something like this and actually making a
 | coherent, cohesive, consistent, paced, well-framed and lit
 | story from text alone.
 | globalvisualmem wrote:
 | There is also recent work by Google called DreamBooth, though,
 | similar to Imagen/Parti, Google refuses to release any model
 | or code.
 |
 | https://dreambooth.github.io/
 | cube2222 wrote:
 | It's impressive how all of this is quickly picking up steam
 | thanks to the Stable Diffusion model being open source with
 | pretrained weights available. It's like every week there's
 | another breakthrough or two.
 |
 | I think the main issue here is the computational cost, as - if
 | I understand correctly - you basically have to do training for
 | each concept you want to learn. Are pretrained embeddings
 | available anywhere for common words?
___________________________________________________________________
(page generated 2022-08-29 23:00 UTC)
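
The per-concept training cube2222 asks about is the core of textual inversion: a single new token embedding S* is optimized by gradient descent against a handful of example images while the rest of the model stays frozen. A toy sketch of that idea in plain numpy - everything here is illustrative: random vectors stand in for image features, and a mean-squared-error loss stands in for the diffusion model's actual denoising loss:

```python
# Toy sketch of the textual-inversion idea (illustrative only).
# We learn one new "word" embedding, s_star, so that it matches the
# features of ~5 example images of a concept. The real method instead
# backpropagates the frozen diffusion model's denoising loss into the
# embedding; plain MSE to fake image features stands in for that here.
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Pretend features extracted from 5 images of one concept.
concept = rng.normal(size=dim)
image_feats = concept + 0.1 * rng.normal(size=(5, dim))

s_star = rng.normal(size=dim)  # the new embedding, randomly initialised
lr = 0.1
for _ in range(500):
    # Gradient of the mean squared error between s_star and each
    # image feature vector: 2 * mean(s_star - feats).
    grad = 2.0 * (s_star - image_feats).mean(axis=0)
    s_star -= lr * grad

# Under this toy loss, s_star converges to the mean of the image
# features; in the real method it lands wherever in embedding space
# best reconstructs the training images through the frozen model.
```

Because only this one vector is trained per concept - a couple of GPU-hours, per zone411's comment above - sharing learned embeddings (rather than retraining) is exactly what a library of "pretrained embeddings for common words" would amount to.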