[HN Gopher] Stable Diffusion Textual Inversion
       ___________________________________________________________________
        
       Stable Diffusion Textual Inversion
        
       Author : antman
       Score  : 94 points
       Date   : 2022-08-29 21:08 UTC (1 hour ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | bottlepalm wrote:
       | Wow, this is pretty cool. Instead of turning a picture back into
       | text, turn it into a unique concept expressed as variable S* that
       | can be used in later prompts.
       | 
       | It's like how humans create new words for new ideas, use AI to
       | process a visual scene and generate a unique 'word' for it that
       | can be used in future prompts.
       | 
        | What would a 'dictionary' of these variables enable? AI with its
        | own language with orders of magnitude more words. Will a language
        | be created that interfaces between all these image generation
        | systems? Feels like just the beginning here..
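The pseudo-word idea above can be sketched in a few lines: textual inversion freezes the model's existing vocabulary embeddings and gradient-trains only one new vector, S*, until it reproduces the user's example images. A toy numpy sketch of that training loop (the 64-dim embeddings, the vocabulary size, and the quadratic loss are hypothetical stand-ins for the real diffusion objective):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "vocabulary" of word embeddings (toy size; never updated).
vocab = rng.normal(size=(1000, 64))

# Stand-in for the conditioning target distilled from the user's
# example images (the real method backprops through the diffusion loss).
target = rng.normal(size=64)

# The new pseudo-word S*: the ONLY parameter we train.
s_star = rng.normal(size=64)

lr = 0.1
for _ in range(500):
    grad = 2.0 * (s_star - target)   # d/ds ||s - target||^2
    s_star -= lr * grad

# Once trained, S* drops into any prompt's embedding sequence
# right alongside the frozen vocabulary vectors.
prompt_embeddings = np.stack([vocab[12], s_star, vocab[99]])
```

Because only one vector is learned, the base model stays untouched and the concept composes freely with ordinary words in later prompts.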
        
       | vinkelhake wrote:
       | RIP promptbase.com - they had a good run.
        
         | kelseyfrog wrote:
          | They won't even need to renew the domain name when it expires
          | on 2023-02-28.
        
       | daenz wrote:
       | This is a big deal! This adds a super power to communication,
       | similar to how a photo is worth 1000 words. An inversion is worth
       | 1000 diffusions!
        
       | hbn wrote:
       | I saw talk the other day how these ML art models aren't really
       | suited to doing something like illustrating a picture book
       | because it can synthesize a character once but wouldn't be able
       | to reliably recreate that character in other situations.
       | 
       | Didn't take long for someone to resolve that issue!
        
         | [deleted]
        
         | zone411 wrote:
         | It's not quite at that level yet. The paper introducing it
         | recommends using only 5 images as the fine-tuning set so the
         | results are not yet very accurate.
        
       | zone411 wrote:
       | It should be noted that the official repo now also supports
       | Stable Diffusion: https://github.com/rinongal/textual_inversion.
        
       | ionwake wrote:
       | Anyone else starting to feel uncomfortable with the rate of
       | progress?
        
         | nmca wrote:
         | Yes. https://80000hours.org/problem-profiles/artificial-
         | intellige...
        
       | kelseyfrog wrote:
        | And in the blink of an eye, the career potential of all aspiring
       | "Sr. Prompt Engineer"s vanished into the whirlpool of automatable
       | tasks.
       | 
       | On a more serious note, this opens up the door to exploring fixed
       | points of txt2img->img2txt and img2txt -> txt2img. It may open
       | the door to more model interpretability.
        
         | keepquestioning wrote:
         | ELI5 - why has there been a cavalcade of Stable Diffusion spam
         | on HN recently? What does it all mean?
        
           | fxtentacle wrote:
           | You can now fire artists/designers and replace them with AI.
           | Obviously, that's cheaper.
        
             | dougmwne wrote:
             | Someone will surely come by soon and tell us, "well
             | actually... artists and graphic designers are
             | irreplaceable."
             | 
             | But for real, plenty of people are going to start rolling
             | their own art and skipping the artist. Not Coca-Cola, but
             | small to medium businesses doing a brochure or PowerPoint?
             | Sure!
        
               | visarga wrote:
               | I think there's going to be plenty of work in stacking
               | multiple AI prompts or manual retouching to fix rough
               | spots. It automates a task, not a job. Some people won't
               | use it at all and other people will use it only for
               | reference - in the end doing everything by hand, as
               | usual, because they have more control and because AI art
               | has a specific smell to it and people will associate it
               | with cheap.
               | 
               | But it's not just for art and design, it has uses in
               | brainstorming, planning, and just to visualise your ideas
               | and extend your imagination. It's a bicycle for the mind.
               | People will eat it up, old copyrights and jobs be damned.
               | It's a cyborg moment when we extend our minds with AI and
               | it feels great. By the end of the decade we'll have
               | mature models for all modalities. We'll extend our minds
               | in many ways, and applications will be countless. There's
               | going to be a lot of work created around it.
        
               | djmips wrote:
               | For sure it automates some work. For example, my sometime
               | hobby of making silly photoshops looks like it will now
               | be a whole lot easier... Visual memes can just be a
               | sentence now. For more serious work I wonder... But it
               | does give pause about what it means for other forms of
               | work.
        
               | Eji1700 wrote:
               | It'll be an interesting line to be sure.
               | 
               | Right now the tech still requires some nuance to be able
               | to slap it all together into what I think most people
               | would want.
               | 
                | While I expect the interface and the like to get a lot
               | better, all good tutorials of this tech so far show many
               | iterations over many different parts of an image to get
               | something "cohesive". Blending those little mini
               | iterations together is VASTLY easier than just making the
               | whole thing, but not just plug and play for something
               | professional.
               | 
               | Still there will be a huge dent in how long it takes to
               | make certain styles of work and that will lower demand
               | considerably, and there's a large market of artists who
               | thrive on casual commissions which this might replace.
        
           | CuriouslyC wrote:
           | The stable diffusion model just got officially released
           | recently, and in the last week a lot of easy to install
           | repositories have been forked off the main one, so it's very
           | accessible for people to do this at home. Additionally, the
           | model is very impressive and it's a lot of fun to use it.
        
             | keepquestioning wrote:
              | How does it compare to DALL-E?
        
               | CuriouslyC wrote:
               | Worse at image cohesion and prompt matching, but
               | competitive in terms of final image quality in the better
               | cases.
        
           | hbn wrote:
           | It's an impressive new technology, and there's nothing else
           | out there like it in terms of the model being publicly
           | available and able to be run on consumer GPUs.
        
           | dougmwne wrote:
           | First, it was recently released, so there's novelty. Second,
           | the code and model weights were also released so it is open
           | and extensible, which this community loves. Thirdly, these
           | high quality image generation models are mind blowing to most
           | and it's not hard to imagine how transformative it will be to
           | the arts and design space.
           | 
            | If it has any greater meaning, we might all be a little
           | nervous that it'll come for our jobs next, or some piece of
           | them. First it came for the logo designers, but I was not a
           | logo designer, and so on.
        
          | hwers wrote:
          | We could always do img2txt via just CLIP-embedding the image.
          | The idea that you could hide/sell prompts is silly. (Having
          | human-interpretable img2txt is cool tho.)
        
       | bravura wrote:
       | Is there a colab or easy to use demo of this?
        
         | zone411 wrote:
         | It requires a couple hours of actual training, so the barrier
         | to entry is higher.
        
       | frebord wrote:
       | How many years until we can generate a feature length film from a
       | script?
        
         | bitwize wrote:
         | I want to see the Batman film where the Joker gives Batman a
         | coupon for new parents but it is expired. That should really be
         | a real film in theatres.
        
           | djmips wrote:
            | You 'might' enjoy Teen Titans fixing the timeline.
        
           | goldenkey wrote:
           | I loled.
        
         | anigbrowl wrote:
         | 5
         | 
         | You could do storyboards from a shooting script* now, but
         | generalizing to synthesizing character and camera movement as
         | well as object physics is a ways off.
         | 
         | * A version of the script used mainly by director and
         | cinematographer with details of each different angle to be used
         | covering the scene.
        
         | bottlepalm wrote:
          | It looks like this is trending towards making our
          | dreams/thoughts reality, in that what we imagine can easily be
          | turned into media - music, books, movies, etc. Pair this up
          | with VR 'the metaverse' and you literally get the ability to
          | turn thoughts into personalized explorable realities.. what
          | happens after that?
         | 
         | * Do we get lost in it?
         | 
         | * Does today's 'professional' fiction become a lot less
         | lucrative when we can create our own?
         | 
          | * Is there a way to leverage this technology to improve the
          | human condition somehow?
        
           | Loveaway wrote:
           | this probably already all happened before mate
        
           | afro88 wrote:
            | I think it will encourage novel ideas in all forms of art.
            | In other words, genuinely new styles and expression will be
            | scarce, because there aren't thousands of examples of them
            | to train a model on yet.
           | 
            | We will also adjust to AI-generated art as we have to other
            | creative technologies, and the novelty will wear off. We
            | will become good at identifying AI-generated art and think
            | of it as cheap.
           | 
           | Still, extremely exciting.
        
           | xdfgh1112 wrote:
           | I can create and explore realities using my imagination
           | alone, though. I personally don't think having it become
           | actual 2d or 3d art will have a lasting impact. It might be
           | fun for a while, but it will get old.
        
             | bottlepalm wrote:
              | It's kind of like your imagination on steroids: the
              | system creates worlds using your imagination as the seed
              | and augments it with the summation of all the human
              | creations used to train the network. Give Stable Diffusion
              | a sentence, for example, and it will create something way
              | beyond what you could have imagined and/or created on
              | your own.
        
         | Eji1700 wrote:
         | I suspect at least 10+ depending on your definition.
         | 
         | Tools like this will absolutely be used by professionals to cut
         | out portions of the workload, but there's still a large gap
         | between something like this and actually making a coherent,
         | cohesive, consistent, paced, well framed and lit story from
         | text alone.
        
       | globalvisualmem wrote:
       | There is also recent work by Google called DreamBooth, though
       | similar to Imagen/Parti Google refuses to release any model or
       | code.
       | 
       | https://dreambooth.github.io/
        
       | cube2222 wrote:
       | It's impressive how all of this is quickly picking up steam
       | thanks to the Stable Diffusion model being open source with
        | pretrained weights available. It's like every week there's another
       | breakthrough or two.
       | 
       | I think the main issue here is the computational cost, as - if I
       | understand correctly - you basically have to do training for each
       | concept you want to learn. Are pretrained embeddings available
       | anywhere for common words?
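On the sharing question: the expensive part is the per-concept training, but the artifact it produces is just one small vector, so distributing learned embeddings is cheap. A toy numpy sketch (hypothetical dimensions, token ids, and placeholder convention) of serializing a learned vector and splicing it in for a placeholder token at prompt-encoding time:

```python
import io
import numpy as np

rng = np.random.default_rng(0)

# Pretend this is the vector produced by a couple hours of training.
s_star = rng.normal(size=64)

# Serialize after training (an in-memory buffer stands in for a file)...
buf = io.BytesIO()
np.save(buf, s_star)

# ...and anyone running the same base model can load it back and
# substitute it for a placeholder token's embedding.
buf.seek(0)
loaded = np.load(buf)

vocab = rng.normal(size=(1000, 64))   # the model's frozen embeddings
token_ids = [5, -1, 42]               # -1 marks the placeholder token
embs = np.stack([loaded if t == -1 else vocab[t] for t in token_ids])
```

The catch is that such vectors are only meaningful relative to the exact base model they were trained against, so a shared "dictionary" would be model-specific.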
        
       ___________________________________________________________________
       (page generated 2022-08-29 23:00 UTC)