[HN Gopher] I used Stable Diffusion and Dreambooth to create an ...
       ___________________________________________________________________
        
       I used Stable Diffusion and Dreambooth to create an art portrait of
       my dog
        
       Author : jakedahn
       Score  : 295 points
       Date   : 2023-04-16 18:29 UTC (4 hours ago)
        
 (HTM) web link (www.shruggingface.com)
 (TXT) w3m dump (www.shruggingface.com)
        
       | cogitoergofutuo wrote:
       | This is really interesting. I do wish the author included the
       | cost to train the model from replicate though.
        
       | wincy wrote:
       | He mentions the Colab for Dreambooth, that only takes ten minutes
       | or so to train using an A100 (the premium GPU) and you can have
       | it turn off after it finishes, and saves to Google Drive. Super
       | easy.
        
         | jakedahn wrote:
         | Yeah!
         | 
         | Here's the colab notebook, in case anyone is interested:
         | https://github.com/TheLastBen/fast-stable-diffusion
         | 
         | I've trained a few smaller models using their Dreambooth
         | notebook, but I think for 4000 training steps, an A100 will
         | usually take 30-40min. I believe replicate also uses A100s for
         | their dreambooth training jobs.
        
       | [deleted]
        
       | sinman wrote:
       | I did something loosely related. As a present for my girlfriend's
       | birthday, I made her a "90s website" with AI portraits of her
       | dog: https://simoninman.github.io/
       | 
       | It wasn't actually particularly hard - I used a Colab notebook on
       | the free tier to fine-tune the model, and even got chatGPT to
       | write some of the prompts.
        
         | jakedahn wrote:
         | hah, these are pretty cool! Well done!
        
         | AuryGlenz wrote:
         | In my (limited) experience, dogs seem to be easier than people
         | for fine-tuning - especially if your end result is going to be
         | artsy. Faces of people you know well being off in slight ways
         | really throws you off, but with dogs there's a bit more leeway.
        
       | amelius wrote:
       | But why pick a dog as an example?
       | 
       | Humans are much worse at telling dogs apart than other humans
       | (except perhaps the owner of the particular dog).
       | 
       | So for all we know, the AI didn't generate a portrait of this
       | particular dog but instead a generic picture of this breed of
       | dog.
        
         | chipgap98 wrote:
         | Because you invent a new word when you train dreambooth and
         | teach it that your subject is an example of that word. The fact
         | that the word you've created returns photos similar to subject
         | is a sign that it worked.
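
A minimal sketch of the "invented word" idea described above, assuming the common Dreambooth convention of pairing a rare identifier token with the subject's class noun. The identifier `sks`, the class noun, and the prompt templates are illustrative assumptions, not values taken from the article or from any particular notebook.

```python
# Hypothetical identifier: any rare token the base model has little
# prior meaning for. "sks" is only an illustrative choice.
IDENTIFIER = "sks"
CLASS_NOUN = "dog"


def instance_prompt(identifier: str, class_noun: str) -> str:
    """Caption paired with the handful of photos of the specific subject."""
    return f"a photo of {identifier} {class_noun}"


def class_prompt(class_noun: str) -> str:
    """Caption for generic class images, so the model keeps its notion of 'dog'."""
    return f"a photo of a {class_noun}"


def styled_prompt(identifier: str, class_noun: str, style: str) -> str:
    """Prompt used after fine-tuning to ask for the subject in a new style."""
    return f"{style} of {identifier} {class_noun}"


print(instance_prompt(IDENTIFIER, CLASS_NOUN))
print(class_prompt(CLASS_NOUN))
print(styled_prompt(IDENTIFIER, CLASS_NOUN, "an oil painting"))
```

If prompts containing the identifier produce the specific subject while the plain class prompt still produces generic dogs, that is the sign of success the comment refers to.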
        
           | amelius wrote:
           | I suppose that dreambooth is pretrained on a large dataset
           | that includes many different dogs.
           | 
           | My point is that it is difficult to judge (for us) that the
           | returned photos are actually similar to the subject.
        
             | ModernMech wrote:
             | The paper shows dogs with very distinctive fur coloring.
             | Particularly the corgi with a white strip between its eyes.
             | I think the paper would be completely fraudulent if this
             | dog were also featured heavily in the training set. So the
             | point is the white stripe corgi isn't in the set, and with
             | a few examples, the model could then generate brand new
             | images of corgis with a similar fur pattern. Maybe all it
             | can do is fur patterns but it's a start.
        
         | jakedahn wrote:
         | Mostly because I thought of it more as an art project than a
         | technical accuracy project. However, the honest answer to your
         | question is that I have a ridiculous number of photos of my
         | dog on my phone. Getting training data is hard work.
         | 
         | But this is totally true, I found that maybe 30% of the images
         | I generated did not look like my dog at all. However the rest
         | do a good job at capturing his eyes and facial expressions that
         | he actually makes. I thought that the chosen image I worked
         | from captured the look of his eyes super well.
         | 
         | But yeah, nobody but me would really appreciate that.
        
         | AuryGlenz wrote:
         | I linked this elsewhere but here are Pokemon image generations
         | of my (mutt) dog: https://imgur.com/a/11OxoSA
         | 
         | She's pretty unique looking and it comes through even with
         | heavy styling.
        
       | asadlionpk wrote:
       | If anyone wants to try Dreambooth online, I made a free website
       | for this: https://trainengine.ai
        
       | itronitron wrote:
       | [flagged]
        
         | steve_adams_86 wrote:
         | If they like it, then it's not garbage for them. Mission
         | accomplished. What you think of art on a stranger's wall isn't
         | really the point -- it's more so about the technology behind
         | it.
         | 
         | I suppose you could be indirectly commenting about how you
         | think the technology does a bad job generating art, but there
         | are better ways to say it.
        
         | spikej wrote:
         | They like it. And it was a good excuse to work with new tech.
         | Why poo poo on it?
        
           | itronitron wrote:
           | I like my comment, and it was a good excuse to work with new
           | tech, why poo poo on my comment?
        
             | mdp2021 wrote:
             | Because your comment was pretty objectively inappropriate
             | and unproductive - gratuitous. Or did you mean something
             | productive that we should have guessed?
        
           | Fricken wrote:
           | Leaving poo poo on things is a popular pastime for many dog
           | people.
        
             | mdp2021 wrote:
             | 'is a popular pastime for many [] people'
             | 
             | Fixed That For You.
             | 
             | Interestingly for the ethologist, they have habitats: for
             | example, the bottom comments in the stacks in YouTube...
        
         | mdp2021 wrote:
         | Good, so in order to produce good AI-aided graphics the
         | producers will have to become _critics_, arts experts, with
         | the important side effect of personal elevation and the
         | collective gain of society. "Wins" on all sides.
         | 
         | Update: three minutes later, it seems that somebody did not get
         | the irony.
        
         | lxe wrote:
         | What a useful and nuanced critique. Thanks!
        
         | beezlewax wrote:
         | Highly subjective comment. Art is not something that is either
         | "good" or "not good". It can hold value to the creator
         | intrinsically. Like a kid's crayon scribbles.
        
       | simonw wrote:
       | I love how much work went into this.
       | 
       | There's a great deal of pushback against AI art from the wider
       | online art community at the moment, a lot of which is motivated
       | by a sense of unfairness: if you're not going to put in the time
       | and effort, why do you deserve to create such high quality
       | imagery?
       | 
       | (I do not share this opinion myself, but it's something I've seen
       | a lot)
       | 
       | This is another great counter-example showing how much work it
       | takes to get the best, deliberate results out of these tools.
        
         | quadcore wrote:
         | _a lot of which is motivated by a sense of unfairness_
         | 
         | Say you generate a picture with midjourney - who is/are the
         | closest artist(s) you can find for that picture?
         | 
         | Not the AI, not the prompter, so the closest artists you can
         | find for that picture are the ones who made the pictures in the
         | training set. So generating a picture is outright copyright
         | infringement. Nothing to do with unfairness in the sense of
           | "artists get outcompeted". Artists don't get outcompeted -
           | they are stolen from.
        
           | ModernMech wrote:
           | Typical Midjourney workflow involves constantly reprompting
           | and fine tuning based on examples and input images. When you
           | arrive at a given image in Midjourney, it's often impossible
           | to recreate it even with the same seed. You'll need the input
           | image as well, and the input image is often the result of a
           | long creative process.
           | 
           | Why is it you discount the creative input of the user? Are
           | they not doing work by guiding the agent? Don't their choices
           | of prompt, input image, and the refinement of subsequent
           | generated images represent a creative process?
        
             | quadcore wrote:
             | I agree with you on the technicality - if we say the
               | prompter is an artist, then the picture belongs to him.
        
         | quadcore wrote:
         | From what I read on the internet, people assume AI generated
         | art is a difficult question legally speaking. Some literally
         | assume artists complain only because they are outcompeted.
         | 
         | I disagree - I think that AI generative art is an easy case of
         | copyright infringement and an easy win for a bunch of good
         | lawyers.
         | 
         | That's because you can't find an artist for a generated picture
         | other than the ones in the training set. If you can't find a
         | new artist, then the picture belongs to the old ones, so to
         | speak. I really don't see what's difficult about that case. I
         | think the internet assumes a bit too quickly it's a difficult
         | question and a grey area when maybe it just isn't.
         | 
         | It's noteworthy that Adobe did things differently than the
         | others and the way they did things goes in the direction I'm
         | describing here. Maybe it's just confirmation bias.
        
           | stavros wrote:
           | I agree. This is a clear-cut case of copyright infringement,
           | as is all art. After all, people painting images have only
           | seen paintings other people painted.
        
           | GuB-42 wrote:
           | > That's because you can't find an artist for a generated
           | picture other than the ones in the training set. If you can't
           | find a new artist, then the picture belongs to the old ones,
           | so to speak.
           | 
           | It doesn't belong to the "old ones", it is at best a
           | derivative work. And even writing a prompt, as trivial as it
           | might seem, makes you an artist. There are modern artists
           | exhibiting random shit as art, and you may or may not like
           | it, but they are legally artists, and it is their work.
           | 
           | The question is about fair use. That is, are you allowed to
           | use pictures in the dataset without permission. It is a
           | tricky question. On one extreme, you won't be able to do
           | anything without infringing some kind of copyright. Used the
           | same color as I did? I will sue you. On the other extreme,
           | you essentially abolish intellectual property. Copying
           | another artist's style in your own work is usually fair use,
           | and that's essentially what generative AI does, so I guess
           | that's how it will go, but it will most likely depend on how
           | judges and legislators see the thing, and different countries
           | probably will have different ideas.
        
           | circuit10 wrote:
           | It's not as simple as that though because the algorithm does
           | learn by itself and mostly just uses the training data to
           | score itself against, it doesn't directly copy it as some
           | people seem to think. It can end up learning to copy things
           | if it sees them enough times though
           | 
           | "you can't find an artist for a generated picture other than
           | the ones in the training set. If you can't find a new artist,
           | then the picture belongs to the old ones, so to speak"
           | 
           | I don't think that's valid on its own as a way to completely
           | discount considering how directly it's using the data. As an
           | extreme example, what if I averaged all the colours in the
           | training data together and used the resulting colour as the
           | seed for some randomly generated fractal or something? You
           | could apply the same arguments - there is no artist except
           | the original ones in the training set - and yet I don't think
           | any reasonable person would say that the result obviously
           | belongs to every single copyright owner from the training set
        
           | ModernMech wrote:
           | But this person's dog isn't in the training set, so why
           | should some artist be credited for a picture they never drew?
           | Not a single person has drawn his dog before, now there is a
           | drawing of his dog, and you want to credit someone who had no
           | input to the creative process here?
        
             | quadcore wrote:
             | If you can find a new artist then I think the picture
             | belongs to him.
        
             | austinjp wrote:
             | "Input into the creative process" is surely broader than
             | simply "painted the portrait". Artists most certainly never
             | consented to have their works used as training data. To
             | this extent, they might be justifiably pissed off.
             | 
             | Artists and designers have furthered their careers (and
             | gained notoriety) by 'ripping off' others since the dawn of
             | time. This used to require technical artistic ability; now
             | less so. The barrier to entry is.... not necessarily lower
             | now, but different.
        
           | mdp2021 wrote:
           | > _an artist for a generated picture_
           | 
           | Normally - outside the specific context of AI generated art
           | -, there is not a relation "work[1] - past author", but "work
           | - large amount of past experience". ([1] "work": in the sense
           | of product, output etc.)
           | 
           | If the generative AI is badly programmed, it will copy the
           | style of Smith. If properly programmed, it will "take into
           | account" the style of Smith. There is a difference between
           | learning and copying. Your tool can copy - if you do it
           | properly, it can learn.
           | 
           | All artists work in a way "post consideration of a finite
           | number of past artists in their training set".
        
         | brucethemoose2 wrote:
         | TBH it would be much easier with more streamlined tooling,
         | especially if doing it locally with lora/lycoris.
         | 
         | It's kinda like using ffmpeg or vapoursynth for video editing
         | instead of a video editing GUI.
         | 
         | That being said the training parameter/data tuning is
         | definitely an art, as is the prompting.
        
         | asddubs wrote:
         | most of the criticism I've seen is that it's all trained on
         | uncompensated stolen artwork. Much like how copilot is trained
         | on GPL code, disregarding its license terms.
        
           | minimaxir wrote:
           | The general argument (IANAL) is that it's Fair Use, in the
           | same vein as Google Images or Internet Archive scraping and
           | storing text/images. Especially since the outputs of
           | generated images are not 1:1 to their source inputs, so it
           | could be argued that it's a unique derivative work. The
           | current lawsuits against Stability AI are testing that,
           | although I am skeptical they'll succeed (one of the lawsuits
           | argues that Stable Diffusion is just "lossy compression"
           | which is factually and technically wrong).
           | 
           | There is an irony, however, that many of the AI art haters
           | tend to draw fanart of IP they don't own. And if Fair Use
           | protections are weakened, their livelihood would be hurt far
           | more than those of AI artists.
           | 
           | The Copilot case/lawsuit IMO is stronger because the
           | associated code output is a) provably verbatim and b) often
           | has explicit licensing and therefore intent on its usage.
        
             | bendmorris wrote:
             | >it could be argued that it's a unique derivative work
             | 
             | Creating a derivative work of a copyrighted image requires
             | permission from the copyright holder (i.e., a license)
             | which many of these services do not have. So the real
             | question is whether AI-generated "art" counts as a
             | derivative work of the inputs, and we just don't know yet.
             | 
             | >b) often has explicit licensing and therefore intent on
             | its usage
             | 
             | It doesn't matter. In the absence of a license, the default
             | is "you can't use this." It's not "do whatever you want
             | with it." Licenses grant (limited) permission to use;
             | without one you have no permission (except fair use, etc.
             | which are very specifically defined.)
        
               | adamm255 wrote:
               | If a person trained themselves on the same resources, and
               | picked up a brush or a camera and created some stunning
               | art in a similar vein, would we look at that as a
               | derivative work? Very interesting discussion. Art of all
               | forms is inspired by those who came before.
               | 
               | Inspired/trained... I think these could be seen as the
               | same.
        
               | bendmorris wrote:
               | Training a human and training a model may use the same
               | verb but are very different.
               | 
               | If the person directly copied another work, that's a
               | derivative work and requires a license. But if a person
               | learned an abstract concept by studying art and later
               | created art, it's not derivative.
               | 
               | Computers can't learn abstract concepts. What they can do
               | is break down existing images and then numerically
               | combine them to produce something else. The inputs are
               | directly used in the outputs. It's literally derivative,
               | whether or not the courts decide it's legally so.
        
               | [deleted]
        
               | Ukv wrote:
               | > Computers can't learn abstract concepts
               | 
               | Goalposts can be moved on whether it has "truly learned"
               | the abstract concept, but at the very least neural
               | networks have the ability to work with concepts to the
               | extent that you can ask to make an image more "chaotic",
               | "mysterious", "peaceful", "stylized", etc. and get
               | meaningfully different results.
               | 
               | When a model like Stable Diffusion has 4.1GB of weights
               | and was trained on 5 billion images, the primary impact
               | of one particular training image may be very slightly
               | adjusting what the model associates with "dramatic".
               | 
               | > If the person directly copied another work, that's a
               | derivative work and requires a license
               | 
               | Not if it falls under Fair Use. Here's a fairly extreme
               | example for just how much you can get away with while
               | still (eventually) being ruled Fair Use:
               | https://www.artnews.com/art-in-america/features/landmark-
               | cop... - though I wouldn't recommend copying as much as
               | Richard Prince did.
               | 
               | > The inputs are directly used in the outputs
               | 
               | Not "directly" - during generation, normal prompt to
               | image models don't have access to existing images and
               | cannot search the Internet.
        
               | asddubs wrote:
               | I don't think we should hold technology to the same
               | standards as humans. I'm also allowed to memorize what
               | someone said, but that doesn't mean I'm allowed to record
               | someone without their knowledge (depending on the
               | location)
        
               | simonw wrote:
               | "Creating a derivative work of a copyrighted image
               | requires permission from the copyright holder"
               | 
               | That's why "fair use" is the key concept here. Under US
               | copyright law "fair use" does not require a license. The
               | argument is that AI generated imagery qualifies as "fair
               | use" - that's what's about to be tested in the courts.
               | 
               | https://arstechnica.com/tech-policy/2023/04/stable-
               | diffusion... is the best explanation I've seen of the
               | legal situation as it stands.
        
             | [deleted]
        
           | userbinator wrote:
           | AI is just showing us a fact that many are unwilling to
           | admit: _everything_ is a derivative work. Much like humans
           | will memorise and regurgitate what they've seen.
        
           | simonw wrote:
           | The trained on stolen artwork critique is reasonable - I
           | helped with one of the first big investigations into how that
           | training data worked when Stable Diffusion first came out:
           | https://simonwillison.net/2022/Sep/5/laion-aesthetics-
           | weekno...
           | 
           | It's interesting to ask people who are concerned about the
           | training data what they think of Adobe Firefly, which is
           | strictly trained on correctly licensed data.
           | 
           | I'm under the impression that DALL-E itself used licensed
           | data as well.
           | 
           | I find some people are comfortable with that, but others will
           | switch to different concerns - which indicates to me that
           | they're actually more offended by the idea of AI-generated
           | art than the specific implementation details of how it was
           | trained.
        
             | bugglebeetle wrote:
             | I think the more correct argument is that Stable Diffusion
             | effectively did a Napster to force artists into shit
             | licensing deals with large players who can handle the
             | rights management. It's unlikely that artists would've ever
             | agreed to them otherwise, but since the alternative now is
             | to have your work duplicated by a pirate model or legally
             | gray service, what are you going to do? This seems borne
             | out by the fact that Stability AI themselves are now
             | retreating behind Amazon for protection.
        
             | adamm255 wrote:
             | When I did Photography at college, a lot of the work was
             | looking at other works of art. I spent a lot of time in
             | Google Images, diving through books from the Art section
             | and going to galleries. Lots of photo copying was involved!
             | 
             | I then did works in the style of what I'd researched. I
             | trained myself on works I didn't own, and then produced my
             | own.
             | 
             | I kind of see the AI training as similar work, just done
             | programmatically vs physically.
             | 
             | Certainly a very interesting topic.
             | 
             | I can't get my head around how far we've come on this in
             | the last 6-12 months. From pretty awful outputs to works
             | winning Photography awards. And prints of a dog called
             | Queso you'd have paid a lot of money to an illustrator for.
        
               | rgbrgb wrote:
               | I think it's more analogous to if you had tweaked one of
               | those famous works directly in photoshop then turned it
               | in. The model training likely results in near replicas of
               | some of the training data encoded in the model. You might
               | have a near replica of a famous photograph encoded in
               | your head, but to make a similar photograph you would
               | recreate it with your own tools and it would probably
               | come out pretty different. The AI can just output the
               | same pixels.
               | 
               | That's not to say there aren't other ways you might use
               | the direct image (e.g. collage or sampling in music) but
               | you'll likely be careful with how it's used, how much you
               | tweak it, and with attribution. I think the weird problem
               | we're butting up against is that AFAIK you can't figure
               | out post-facto what the "influence" is from the model
               | output aside from looking at the input (which does
               | commonly use names of artists).
               | 
               | I work on an AI image generator, so I really do think the
               | tech is useful and cool, but I also think it's
               | disingenuous (or more generously misinformed) to compare
               | it to an artist studying great works or taking
               | inspiration from others. These are computers inputting
               | and outputting bits. Another human analog would be
               | memorizing a politician's speech and using chunks of it
               | in your own speech. We'd easily call that plagiarism, but
               | if instead every 3 words were exactly the same? Hard to
               | say... it's both more and less plagiarism.
               | 
               | Just how much do you need to process a sampled work
               | before you need to get permission of the original artist?
               | It seems to be in music that if the copyright holder can
               | prove you sampled them, even if it's unrecognizable, then
               | you're going to be on the hook for some royalties.
        
               | simonw wrote:
               | "The model training likely results in near replicas of
               | some of the training data encoded in the model."
               | 
               | I don't think that's true.
               | 
               | My understanding is that any image generated by Stable
               | Diffusion has been influenced by every single parameter
               | of the model - so literally EVERY image in the training
               | data has an impact on the final image.
               | 
               | How much of an impact is the thing that's influenced by
               | the prompt.
               | 
               | One way to think about it: the Stable Diffusion model can
               | be as small as 1.9GB (Web Stable Diffusion). It's trained
               | on 2.3 billion images. That works out as 6.6 bits of data
               | per image in the training set.
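
The back-of-the-envelope figure above can be checked with a few lines of arithmetic. This sketch assumes decimal gigabytes (10^9 bytes) and uses the sizes quoted in this thread (1.9GB for 2.3 billion images, and 4.1GB for 5 billion images from an earlier comment); it says nothing about how information is actually distributed across the weights.

```python
GB = 10**9  # assumption: decimal gigabytes


def bits_per_image(model_bytes: float, n_images: float) -> float:
    """Model size in bits divided evenly over the number of training images."""
    return model_bytes * 8 / n_images


# Figures as quoted in the thread.
print(round(bits_per_image(1.9 * GB, 2.3e9), 1))  # 6.6
print(round(bits_per_image(4.1 * GB, 5e9), 1))    # 6.6
```

Both pairs of figures come out at roughly 6.6 bits per training image, far less than a single pixel of 24-bit colour.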
        
             | Jevon23 wrote:
             | >It's interesting to ask people who are concerned about the
             | training data what they think of Adobe Firefly, which is
             | strictly trained on correctly licensed data.
             | 
             | If they truly got an appropriate license agreement for
             | every image in the training set then I have no issues with
             | that.
             | 
             | >I'm under the impression that DALL-E itself used licensed
             | data as well.
             | 
             | DALL-E clearly used images they did not have a license for.
             | Early on it was able to output convincing images of Pikachu
             | and Homer Simpson. OpenAI certainly didn't get licensing
             | rights for those characters.
        
           | einpoklum wrote:
           | Stolen artwork? Why, I'm shocked! Shocked and chagrined!
           | Where, pray tell, does OpenAI keep that vast warehouse full
           | of stolen paintings? And have you alerted Interpol?
        
         | minimaxir wrote:
         | Unfortunately it's become a meme among AI art haters that AI
         | art is "just inputting text into a text box" despite the fact
         | that is far from the truth, particularly if you want to get
         | specific results as this blog post demonstrates.
         | 
         | Some modern AI art workflows often require _more_ effort than
         | actually illustrating using conventional media. And this blog
         | post doesn't even get into ControlNet.
        
           | squidsoup wrote:
           | Only if you exclude the countless hours an illustrator has
           | spent developing their craft.
        
             | yieldcrv wrote:
             | being sympathetic to that requires pretending that the user
             | would have _ever_ commissioned an artist for that idea at
             | all. both the transaction and the idea would have simply
             | never happened. it was _never_ valuable enough or important
             | enough to commission a human, hope you got the correct
             | human, wait week after week for revision after revision.
             | 
             | people that want to hone a niche discipline _for
             | themselves_ still can do that. just be honest about doing
             | it for yourself.
        
             | libraryatnight wrote:
             | Using AI as a tool to create art takes nothing away from
             | anyone who spent time learning a skill or craft that they
             | use in their own pursuit of expression.
             | 
             | People will be arguing about whether or not art made with
             | AI is art, and artists will just be using it or not. I
             | remember an interview about electronic music where Bjork
             | addressed concerns that if you use a computer to make
             | music, it has no soul, and she said if the person using the
             | machine to make the music puts soul into it, it will have a
             | soul.
             | 
             | I remember David Bowie in the mid 90s saying if he was
             | young in that decade he might not have been a musician,
             | because in the 60s being a musician seemed subversive and
             | at the time of the interview the internet was carrying the
             | flag of subversion.
             | 
             | Anyway, it's interesting to watch these conversations. I'd
             | never claim to know what art is or try to tell someone, but
             | it seems to me that already because of the controversy
             | artists are drawn to AI and further exciting the
             | conversation. Commercial artists seem the most threatened;
             | animators, designers, etc. I understand why, but I don't
             | think arguing that AI isn't "art" is going to help their
             | cause any more than protesting digital painting wasn't art,
             | electronic music wasn't art, and much earlier that
             | photography wasn't art.
             | 
             | All the time these conversations are happening, the art's
             | getting made and we're barreling towards the next 'not art'
             | movement.
        
           | capableweb wrote:
           | > Some modern AI art workflows often require more effort than
           | actually illustrating using conventional media. And this blog
           | post doesn't even get into ControlNet.
           | 
           | Indeed. Another criticism that I can definitely somewhat see
           | the idea behind, is that the barrier to entry is very
           | different from for example drawing. To draw, you need a pen
           | and a paper, and you can basically start. To start with
           | Stable Diffusion et al, you need either A) paid access to a
           | service, B) money to purchase moderately powerful hardware or
           | C) money to rent moderately powerful hardware. One way or
           | another, if you want to practice AI generated art, you need
           | more money than what a pen and paper cost.
        
             | MayeulC wrote:
             | > is that the barrier to entry is very different from for
             | example drawing
             | 
             | That got me thinking. I agree, but from another
             | perspective: the skillset is different. Traditionally, the
             | approach to art was very bottom-up. Start with a pen and
             | basic contouring techniques. Understanding more advanced
             | techniques requires a lot of work (perspective, shadows,
             | etc).
             | 
             | "AI" art generally does away with basic techniques. The
             | emphasis is more on composing, styling. A top-down
             | approach. "AI" artists may be able to iterate quicker by
             | seeing "almost-finished" versions quickly (though a skilled
             | artist can most likely imagine their work pretty well).
             | 
             | But most of all, the tools and required skills are very
             | different. You don't need to know a lot about machine
             | learning, but it certainly helps. Probably pretty far from
             | the skillset of most current artists. And people generally
             | fear what they don't understand. And if I was an artist,
             | I'd be at least a bit concerned about (i) it undercutting
             | the value of my art, (ii) having to learn this alien way of
             | doing things to remain competitive (by way of selection,
             | artists probably enjoy their current tools).
             | 
             | Anyway, I imagine photography was similarly upsetting in a
             | lot of ways. It also didn't happen overnight. I also
             | suspect we are going to see similar improvements to output
             | quality as in early days of photography.
             | 
             | Another similarity is with digital music (and
             | recording/remixing before that). I wonder if we're going to
             | see new genres emerge as a result (the equivalent of
             | techno/electro).
        
               | Paul-Craft wrote:
               | Your comment in particular captures it, but I can imagine
               | a lot of the same sort of comments on this post being
               | made about film cameras when they came out, then again
               | about digital cameras.
        
               | prpl wrote:
               | Digital cameras made burst photos go from $.25+ a frame
               | at 5FPS to effectively free with rates at 30+FPS now.
               | That was transformative but also led to all sorts of
               | lamentations about lack of skill
        
               | rprospero wrote:
               | I remember my university photography club trying to get
               | digital cameras banned from campus because "art only
               | happens in the darkroom".
        
             | dragonwriter wrote:
             | > To draw, you need a pen and a paper, and you can
             | basically start. To start with Stable Diffusion et al, you
             | need either A) paid access to a service, B) money to
             | purchase moderately powerful hardware or C) money to rent
             | moderately powerful hardware
             | 
             | A 4GB NVidia GPU (sufficient to run Stable Diffusion with
             | the A1111 UI) is hardly "moderately powerful hardware",
             | and, beyond that, Stable Horde (AI Horde) exists.
             | 
             | OTOH, a computer and internet connection are more expensive
             | than a pencil, even if nearly ubiquitous.
        
             | rob74 wrote:
             | Yeah, well then, please draw an image of my dog in the
             | style of van Gogh, using pen and paper. I would say that
             | for most of us, the more cost-effective way to get high
             | quality artwork will still be Stable Diffusion...
        
             | realusername wrote:
             | Stable Diffusion doesn't really need powerful hardware; any
             | graphics card will do, it will just take a bit longer. There
             | are even ports for smartphones nowadays.
        
             | simonw wrote:
             | There are plenty of traditional art mediums that require
             | significant financial outlays to get started: oil painting,
             | ceramics, glass blowing etc.
             | 
             | There are plenty of free online tools for using all kinds
             | of AI image generation techniques, and they don't require
             | powerful hardware, just something that can browse websites
             | or run Discord.
        
               | adamm255 wrote:
               | Plus training, lessons and inspiration. And talent.
               | 
               | It's like with dreams. They can be terribly intricate and
               | detailed, but ask me to draw something creative and I'm
               | out.
        
             | smallerfish wrote:
             | Stable Diffusion works fine on a CPU - on an AMD Ryzen
             | 5700, approx 90s per image (and I believe comparable or
             | faster on my old i7-6700). If you want to kick off a batch
             | in the background while you work on something else, that's
             | plenty fast. (I use:
             | https://github.com/brycedrennan/imaginAIry).
        
             | minimaxir wrote:
             | The cost has gone _way_ down in the last couple months.
             | 
             | With a super-cheap T4 GPU (free in Google Colab), PyTorch
             | 2.0, and the latest diffusers package, you can now generate
             | batches of 9-10 images in about the same time it took to
             | generate 4 images when Stable Diffusion was first released.
             | This drastically speeds up the cherry-picking and iteration
             | processes:
             | https://pytorch.org/blog/accelerated-diffusers-pt-20/
             | 
             | Google Cloud Platform also now has preview access to L4
             | GPUs, which are 1.5x the cost of a T4 GPU but 3x throughput
             | for Stable diffusion workflows (maybe more given the
             | PyTorch 2.0 improvements for newer architectures), although
             | I haven't tested it:
             | https://cloud.google.com/blog/products/compute/introducing-g...
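
The T4 vs. L4 comparison above comes down to simple arithmetic, sketched
here. It uses only the two ratios quoted in the comment (1.5x the cost,
3x the throughput); the function name is made up for illustration, and
the result is only as good as those untested figures.

```python
def relative_cost_per_image(price_ratio: float, throughput_ratio: float) -> float:
    """Cost per image on the new GPU divided by cost per image on the baseline.

    price_ratio: hourly price of the new GPU / hourly price of the baseline.
    throughput_ratio: images per hour on the new GPU / images per hour on the baseline.
    """
    if price_ratio <= 0 or throughput_ratio <= 0:
        raise ValueError("ratios must be positive")
    return price_ratio / throughput_ratio


# Figures quoted in the comment: an L4 costs 1.5x a T4 and delivers 3x the throughput.
l4_vs_t4 = relative_cost_per_image(price_ratio=1.5, throughput_ratio=3.0)
print(f"L4 cost per image relative to T4: {l4_vs_t4:.2f}x")
```

If the quoted ratios hold, each image on an L4 would cost about half of
what it costs on a T4, even though the hourly price is higher.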
        
               | tough wrote:
               | We're minmaxing those costs thanks for the data
        
           | tester457 wrote:
           | It's a meme because 99% of the ai art creators don't go that
           | deep, they only prompt.
           | 
           | Even if they did have a more complex workflow most of them
           | are still based on copyrighted training data, so there will
           | be many lawsuits.
        
         | basisword wrote:
         | > if you're not going to put in the time and effort, why do you
         | deserve to create such high quality imagery?
         | 
         | This isn't high quality imagery. Don't get me wrong, the tech
         | is cool and I love the work that's gone into making this
         | picture. But this isn't something I would ever hang on my wall.
         | There's probably a market for it, but I get the strong
         | impression it's the "live, laugh, love" market. The people that
         | buy pictures for their wall in the supermarket. The kind of
         | people who pay individual artists money to paint bespoke images
         | of their pet are not going to frame AI art. I don't think the
         | artists need to worry.
        
           | yellow_postit wrote:
           | I would expect it's only a matter of time till those
           | "traditional" artists also adopt these tools into their
           | workflows. Similar to the initial pushback against the
           | "digital darkroom" which is now the mainstay of photography.
           | 
           | Non-AI-aided art, like manually developed film, will trend
           | towards a niche.
        
           | theaiquestion wrote:
           | > This isn't high quality imagery. Don't get me wrong, the
           | tech is cool and I love the work that's gone into making this
           | picture. But this isn't something I would ever hang on my
           | wall.
           | 
           | Well yeah but that doesn't change the OP commenter's point
           | that it takes a lot of work to get high quality art still.
           | 
           | > I don't think the artists need to worry.
           | 
           | I disagree here but only on the basis of what type of art it
           | is. Stock art/photography and a lot of media design work are
           | likely at risk because we can now create "good enough" art at
           | the click of a button for almost no cost. I agree that the
           | "hang on the wall level good" artists aren't at risk just
           | yet, but the artists doing filler art and, uh, "anime/furry"
           | commissions are definitely at risk right now, except for the
           | highest quality ones.
        
         | mdp2021 wrote:
         | The shruggingface submission is very interesting and very
         | instructive.
         | 
         | Nonetheless, it would be odd and a weak argument to point
         | criticism towards not spending adequate <<time and effort>> (as
         | if it made sense to renounce tools and work through unnecessary
         | fatigue and wasting time). More proper criticism could be in
         | the direction of "you can produce pleasing graphics but you may
         | not know what you are doing".
         | 
         | This said, I'd say that Stable Diffusion is a milestone of a
         | tool, incredible to have (though difficult to control). I'd
         | also say that the results of the latest Midjourney (though
         | quite resistant to control) are at "speechless" level. (Noting
         | in case some had not yet checked.)
        
           | Paul-Craft wrote:
           | > More proper criticism could be in the direction of "you can
           | produce pleasing graphics but you may not know what you are
           | doing".
           | 
           | I don't get this. If one "can produce pleasing graphics," how
           | does that not equal knowing what they're doing? I only see
           | this as being true in the sense of "Sure, you can get places
           | quickly in a car, but you don't really know how it works."
        
             | mdp2021 wrote:
             | > _how does that not equal knowing what they're doing_
             | 
             | The goal may not be to produce something pleasant. The
             | artist will want some degree of artistic value; the
             | communicator will want a high degree of effectiveness etc.
             | The professional will implicitly decide a large number of
             | details, in a supposedly consistent idea of the full aim.
             | The non professional armed with some generative AI tool may
             | on the contrary leave a lot to randomness - and obtain a
             | "pleasant" result, but without real involvement, without
             | being the real author nor, largely, the actual director.
        
       | indigodaddy wrote:
       | Pretty cool stuff. Personally though, not a huge fan of his "the
       | one" choice. Some of the other images in his assortment were much
       | better imo. Each to their own of course though!
        
         | steve_adams_86 wrote:
         | I agree, but I find it pretty cool that they were able to
         | generate and pick from what they wanted. This seems like one of
         | the real strengths of generative AI -- people can tune outputs
         | they otherwise couldn't create (unable to paint, draw, play
         | guitar, etc).
         | 
         | People can debate if it's actually good that people can create
         | art without being artists, but again, I think it's great that
         | the author had the freedom to create what they had in mind
         | without much outside influence. This has been a goal for
         | computers in general for so long, and it seems like we're
         | actually arriving with some mediums.
        
       | lxe wrote:
       | This is a great writeup on some of the nuances and gotchas you
       | have to watch out for when finetuning using dreambooth and the
       | generative creative process in general.
        
       | cinntaile wrote:
       | It's unfortunate a lot of the nice artsy detail disappeared when
       | he had to recreate part of the head, but I guess that is
       | inevitable. Great work and interesting writeup.
        
       | yieldcrv wrote:
       | Results at the top of your article/project please
        
       | spaceman_2020 wrote:
       | I would highly recommend using Photoroom's background removal
       | tool. Does a far, far better job than Photoshop.
        
       | EGreg wrote:
       | What are the tools we can run on a Linux machine?
       | 
       | EDIT: four downvotes and zero answers how to run it on a Linux
       | machine...
        
         | minimaxir wrote:
         | You were likely downvoted because you asked how to use it for
         | NFTs, which you just edited out.
        
         | cogitoergofutuo wrote:
         | The only piece of software mentioned in the article that
         | doesn't run on Linux is Draw Things.
        
       | [deleted]
        
       | [deleted]
        
       | liuliu wrote:
       | There might be a few things Draw Things missing from this
       | article: no mask blur, not selecting the inpainting model for
       | inpainting work.
       | 
       | Tomorrow's release should contain both mask blur and inpainting
       | ControlNet, which might help these use cases.
        
         | jakedahn wrote:
         | Yeah, it was likely just user error. I actually really love
         | Draw Things, because I can run it locally on my mac and quickly
         | experiment without having to sling HTTP requests or spin up
         | GPUs.
         | 
         | I did the actual work back on March 11th, so I was likely on an
         | older build; but I was seeing issues where inpainting was just
         | replacing my selection/mask with a white background. I had the
         | inpainting model loaded, but couldn't figure it out.
         | 
         | I'm planning to continue playing with Draw Things locally, and
         | exploring the inpainting stuff. For such an iterative process I
         | feel like a local client would make for the best experience.
        
           | liuliu wrote:
           | There is no user error but UX issues :)
           | 
           | That being said, you probably used the paintbrush rather than
           | the eraser? There would be more help on the Discord server
           | though! https://discord.gg/5gcBeGU58f
        
       | skor wrote:
       | I liked the original more than the final version. The vector
       | style drawing was much more futuristic and more interesting.
       | 
       | Seems like lots of work went into that and I hope the author
       | enjoyed the process and enjoys the final result.
        
       | bigbillheck wrote:
       | Personally I paid a friend $200 to create an art portrait of my
       | dog.
        
       | AuryGlenz wrote:
       | I've done so much with a fine-tuned model of my dog.
       | 
       | I previously made coloring pages for my daughter of our dog as an
       | astronaut, wild west sheriff, etc. They're the first pages she
       | ever "colored," which was pretty special for us. Currently I'm
       | working on making her into every type of Pokemon, just for fun.
        
         | mdp2021 wrote:
         | Using which tools, specifically?
        
           | AuryGlenz wrote:
           | Stable Diffusion, generically.
           | 
           | StableTuner to fine tune the model - I can't recall the name
           | of the model I trained on top of, but it was one of the top
           | "broad" 1.5 based models on Civitai. Automatic1111 to do the
           | actual generating. I used an anime line art LoRA (at a low
           | weight) along with an offset noise LoRA for the coloring book
           | pages as otherwise SD makes images be perfectly exposed. For
           | something like that you obviously want a lot more white than
           | black.
           | 
           | EveryDream2 would be another good tuning solution.
           | Unfortunately that end of things is far from easy. There are
           | a lot of parameters to change and it's all a bit of a mess. I
           | had an almost impossible time doing it with pictures of my
           | niece, my wife is hit or miss, her sister worked really well
           | for some reason, and our dog was also pretty easy.
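
For readers wondering what "a LoRA at a low weight" looks like in
practice: in the Automatic1111 UI, a LoRA is applied by adding a
`<lora:filename:multiplier>` tag to the prompt. The helper below is just
a sketch that builds such a prompt string. The LoRA file names, the
weights, and the prompt text are hypothetical placeholders, not the ones
used in the comment above.

```python
def lora_tag(name: str, weight: float) -> str:
    """Format one LoRA tag in the <lora:filename:multiplier> prompt syntax."""
    return f"<lora:{name}:{weight:g}>"


def build_prompt(subject: str, loras: list[tuple[str, float]]) -> str:
    """Append one tag per (name, weight) pair to the subject text."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras)
    return f"{subject} {tags}".strip()


# Hypothetical file names and weights, chosen only to illustrate the syntax.
prompt = build_prompt(
    "coloring book page of a dog as an astronaut, line art",
    [("animeLineart", 0.3), ("offsetNoise", 0.5)],
)
print(prompt)
```

Lowering the multiplier reduces how strongly the LoRA pulls on the
result, which is presumably why the line art LoRA was kept at a low
weight here.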
        
         | go_discover wrote:
         | Do you need an m1 macbook to do this? I have a 2015 macbook
         | pro..
        
         | AuryGlenz wrote:
         | I uploaded a couple of the Pokemon generations really quick as
         | examples. I still need to go through and do quick fixes for
         | double tails (the tails on Pokemon are _not_ where they are on
         | regular animals, apparently), watermarks, etc. and do a quick
         | Img2Img on them.
         | 
         | https://imgur.com/a/11OxoSA
        
           | jakedahn wrote:
           | These are great!
        
             | AuryGlenz wrote:
             | Thanks. They aren't necessarily the best ones - I just
             | uploaded some quickly. Like I said, they still need final
             | touches too. I probably should have worked on the prompt a
             | bit more before I went all in too.
             | 
             | For anyone else doing it, the ability to do something like
             | [vaporeon:cinderdog:.5] so it starts with a specific
             | Pokemon and transitions into the dog later was great for
             | some types.
             | 
             | One of the fun things about this sort of thing is happy
             | accidents. One of the fire types generated as two side by
             | side - a puppy and an evolution.
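
The `[vaporeon:cinderdog:.5]` form mentioned above is the prompt-editing
syntax `[from:to:when]`: sampling starts with the first text and
switches to the second partway through. The sketch below is a simplified
stand-in for that behaviour, not the UI's actual parser. It assumes that
a `when` below 1 is a fraction of the total steps and that anything else
is an absolute step count, and it does not handle nested groups.

```python
import re

# Matches one "[from:to:when]" group; the two texts may not contain brackets or colons.
_EDIT = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):([0-9]*\.?[0-9]+)\]")


def switch_step(when: float, total_steps: int) -> int:
    """Number of steps that use the 'from' text before switching to 'to'."""
    # Assumption: values below 1 are a fraction of all steps, others a step count.
    if when < 1:
        return int(when * total_steps)
    return min(int(when), total_steps)


def prompt_at_step(prompt: str, step: int, total_steps: int) -> str:
    """Resolve every [from:to:when] group for a 0-based sampling step."""
    def resolve(match):
        before, after = match.group(1), match.group(2)
        when = float(match.group(3))
        return before if step < switch_step(when, total_steps) else after

    return _EDIT.sub(resolve, prompt)


prompt = "a portrait of [vaporeon:cinderdog:.5], watercolor"
for step in (0, 9, 10, 19):
    print(step, prompt_at_step(prompt, step, total_steps=20))
```

With 20 steps, the first 10 would be guided by the Pokemon name and the
rest by the fine-tuned dog token, which matches the "starts with a
specific Pokemon and transitions into the dog later" description.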
        
           | minimaxir wrote:
           | For generating Pokemon, I recommend using this model along
           | with a textual inversion of your pet:
           | https://huggingface.co/lambdalabs/sd-pokemon-diffusers
        
       ___________________________________________________________________
       (page generated 2023-04-16 23:00 UTC)