hngopher.com

       [HN Gopher] DALL·E Now Available Without Waitlist
       ___________________________________________________________________
        
       DALL·E Now Available Without Waitlist
        
       Author : minimaxir
       Score  : 256 points
       Date   : 2022-09-28 16:20 UTC (6 hours ago)
        
 (HTM) web link (openai.com)
        
       | MrZander wrote:
       | Is DALL-E noticeably better than Stable Diffusion and other self-
       | hostable options? I don't see why I would even bother going
       | through the sign up process for OpenAI, only to be limited by
       | their filters. Seems like they are late to the party now.
        
         | speedgoose wrote:
         | Dall.e is much better at understanding what you want. And
         | sometimes stable diffusion feels a bit overfitted on some
         | prompts (especially with cars).
         | 
         | But Dall.e is often behind in terms of image quality. Its images
         | look nice from afar, but are a bit more blurry or weird than
         | stable diffusion's if you look closely.
         | 
         | However, you can use both together. These days I tend to use
         | stable diffusion first, but when a prompt is not going well I
         | copy-paste it into dall.e and get what I meant much more easily.
         | And then I import the dall.e-generated image into stable
         | diffusion to work on it a bit more and get something a bit
         | better looking.
        
         | davidbarker wrote:
         | I've run thousands of prompts with DALL-E 2, thousands with
         | Midjourney, and probably hundreds with Stable Diffusion.
         | 
         | My (very qualitative) feeling is that DALL-E 2 is good with
         | composition and realism (e.g. generating photographs -- you'll
         | still get artefacts but it's less likely to look "computer
         | graphics-y"), and is quite forgiving (you will usually end up
         | with an image that makes sense).
         | 
         | Midjourney had a recent update and can now produce beautiful
         | images with far more detail and realism than DALL-E 2 in some
         | cases, especially for human and animal faces, but excels more
         | on the computer art side of things. (Midjourney now has a
         | community showcase gallery:
         | https://www.midjourney.com/showcase/)
         | 
         | Stable Diffusion is a bit less forgiving than both, in my
         | experience. Some people are able to create stunning images, but
         | you have to invest more time into figuring out what works best.
         | 
         | I'm currently looking into taking images generated with DALL-E
         | 2, then using them as a starting point for Stable Diffusion to
         | add detail. It works particularly well for cartoon-style
         | images.
         | 
         | For example:
         | 
         | - Original DALL-E 2 image of a horse in a city:
         | https://i.imgur.com/CaNHHR7.jpeg
         | 
         | - That image used as a starting point for Stable Diffusion:
         | https://i.imgur.com/EW1iKOO.png and
         | https://i.imgur.com/VOQ35Oz.png
         | 
         | You can see it significantly cleans up the artefacts the
         | original DALL-E 2 image had. (Note: the original DALL-E 2 image
         | is 1024 pixels square, but Stable Diffusion generated a 512
         | square output.)
        
           | jw1224 wrote:
           | Midjourney's recent upgrade was largely thanks to integration
           | with Stable Diffusion. Somehow Midjourney's images still
           | retain a more "premium" artistic feel to them though.
        
         | mminer237 wrote:
         | In my experience, DALL·E has generated much better images, but
         | people have varying opinions. Stable Diffusion is more
         | configurable, so you might be able to tinker with it to get
         | closer to what you want, whereas DALL·E just works pretty
         | decently.
        
         | langitbiru wrote:
         | For some categories like realistic people, DALL-E is better.
        
       | LightG wrote:
       | Damn, I'm picky ... you want my phone number before even getting
       | a chance to try it out?
       | 
       | NOPE.
        
       | 1024core wrote:
       | Is this DALL-E or DALL-E 2?
        
       | ALittleLight wrote:
       | DALL-E's censor really frustrates me. I hit "inappropriate"
       | warnings that threaten "If you do this repeatedly you may be
       | banned" all the time. Probably one in six prompts I tried on
       | DALL-E got this warning. I don't see why a creative tool should
       | have such aggressive censorship built in.
       | 
       | One example of an "inappropriate" prompt that stuck out in my
       | mind, and I think is pretty representative - I was trying to
       | recreate DVD box art of Breaking Bad, but replace the main
       | characters with cats. When my prompts were things like "Meth
       | dealing cat" I would get the "inappropriate" flag. Frustrating.
       | 
       | I much prefer stable diffusion. Quality is different, but being
       | able to generate whatever I want without censorship is an
       | enormous benefit. Plus, the cost. OpenAI is far too expensive. A
       | friend and I are writing a novel. We wanted to see what it would
       | be like to just feed each paragraph of text from our novel to the
       | image generator, along with combinations of descriptions. This
       | would be a pretty expensive experiment with DALL-E, but it can
       | run locally for the cost of electricity with stable-diffusion.
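       | 
       | A rough back-of-the-envelope sketch of that cost comparison, in
       | Python. Every number below is an assumption for illustration and
       | not a figure from this thread: the paragraph count, the number of
       | prompt variations per paragraph, and the price per request.

```python
def estimate_api_cost(paragraphs, variations_per_paragraph, price_per_request):
    """Total cost of sending every prompt variation to a paid image API."""
    requests = paragraphs * variations_per_paragraph
    return requests * price_per_request


# All three inputs are hypothetical placeholders; substitute real values.
novel_paragraphs = 2000   # assumed length of the novel, in paragraphs
variations = 5            # assumed description combinations per paragraph
price = 0.13              # assumed dollars per generation request

total = estimate_api_cost(novel_paragraphs, variations, price)
print(f"{novel_paragraphs * variations} requests, about ${total:,.2f}")
```

       | The local alternative has no per-request term at all, which is
       | why the gap grows linearly with the number of prompts tried.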
        
         | ehPReth wrote:
         | Agreed. I really hate DALL-E's censoring of prompts, I'm glad
         | they're losing out.
        
       | murkt wrote:
       | Isn't available in Ukraine. Why?.. And it's not like a "temporary
       | error, please try again". It just directly states "nope, not for
       | Ukraine". "Open"AI.
        
         | ronsor wrote:
         | I have a nagging feeling this is their way of "preventing" fake
         | war pictures from cropping up.
        
           | vanadium1st wrote:
           | I doubt it - GPT-3 is also restricted for users in Ukraine.
        
           | murkt wrote:
           | I wonder if Russia is blocked as well. And Belarus.
           | 
           | If they want to prevent fake war pictures, it should be done
           | with their NSFW filter. Not with blocking the whole
           | (suffering!) country.
        
         | diimdeep wrote:
         | FYI, there is a filter that prevents generating political
         | content and other nasty stuff.
        
           | murkt wrote:
           | Yes, I know about their political and NSFW filter. But why
           | block Ukraine? Do they think that everyone from Ukraine will
           | 100% use this for generating dead Putins, or what?
           | 
           | I just can't think of any good reason to do that. I can think
           | of one bad, though
        
             | mourner wrote:
             | > generating dead Putins, or what?
             | 
             | To be fair, the first ever prompt I tried out with an AI
             | image generator like this is "putin eaten alive by pigs" :)
             | It's hard to refrain.
        
         | SSLy wrote:
         | I've just tried with a VPN and got to the phone number screen
         | with an IP geolocated near Kyiv. What's the error you're getting?
        
           | vanadium1st wrote:
           | After confirming a Ukrainian phone number, every page gives
           | the same error: "OpenAI's API is not available in your
           | country." The same goes for GPT-3. They just banned our
           | country as a whole. As a Ukrainian, this is admittedly not
           | the biggest problem in my life right now. Still, I don't see
           | a reason for OpenAI to restrict our access.
        
             | SSLy wrote:
             | Alright, that's indeed incredibly foolish of them. I hope
             | the new HIMARS package will sweeten the insult at least a
             | bit :^)
        
       | johnfn wrote:
       | It's really amazing how DALL-E missed the boat. When it was
       | launched, it was a truly amazing service that had no equal. In
       | the months since then, both Midjourney and Stable Diffusion
       | emerged and got to the point where they produce images of equal
       | or better quality than DALL-E. And you didn't have to wait in a
       | long waitlist in order to gain access! They effectively gave
       | these tools free exposure by not allowing people to use DALL-E.
       | 
       | Furthermore, the pricing model is much worse for DALL-E than any
       | of its competitors. DALL-E makes you think about how much money
       | you're losing continuously - a truly awful choice for a creative
       | tool! Imagine if you had to pay photoshop a cent every time you
       | made a brushstroke. Midjourney has a much better scheme (and
       | unlimited at only $30/month!), and, of course, Stable Diffusion is
       | free.
       | 
       | This is a step in the right direction, but I feel that it is too
       | little, too late. Just compare the rate of development.
       | Midjourney has cranked out a number of different models,
       | including an extremely exciting new model ("--testp"), new
       | upscaling features, improved facial features, and a bunch more.
       | They're also super responsive to their community. In the
       | meantime, OpenAI did... what? Outpainting? (And for months,
       | DALL-E had an issue where clicking on any image on the homepage
       | would instantly consume a token. How could it take so long to fix
       | such a serious error?) You have this incredible tool everyone is
       | so excited to use that they're producing hundred-page documents
       | on how to get better results out of it, and somehow none of that
       | actually makes it into the product?
        
         | fullshark wrote:
         | Or maybe they got undercut by an open source implementation and
         | that was inevitable no matter what. How can you compete with
         | free?
        
           | notahacker wrote:
           | It's easier to compete with free (most paid products do) if
           | most of the people interested in AI generated art have been
           | paying for their service for months rather than browsing for
           | alternatives. Especially since their supposed advantage is
           | better prompt understanding rather than image quality; easy
           | to dismiss StableDiffusion if your first impressions of it
           | are "doesn't understand me like DALL-E" rather than "wow,
           | this is magic"
           | 
           | The "waitlist" model might work when the product isn't ready
           | for prime time or the exclusivity is a part of the pitch, but
           | it's greatly overrated in other respects. I got a "The Wait
           | Is Over" email to tell me I'm off the waitlist and able to
           | use a not-exactly-new stock trading app this week as the UK
           | economy crashed. Yeah, thanks, but no thanks...
        
         | skybrian wrote:
         | This gets upvoted due to inexplicable DALL-E hate, but on the
         | other hand I'm keeping my DALL-E account and cancelled my
         | MidJourney account because the DALL-E account doesn't cost me
         | anything when I don't use it. Having an account I barely use is
         | great because I can go generate an image whenever I want for
         | comparison purposes.
         | 
         | (Furthermore, if I don't use it very often, I'm in the free
         | tier due to the 15 free credits a month.)
         | 
         | Also, do you realize that Stable Diffusion is also running a
         | pay-for-usage model at dreamstudio.ai? I like that too.
        
         | AlexanderTheGr8 wrote:
         | Can you link to the hundred-page document? I believe you are
         | talking about prompt engineering, and I would love to get more
         | information about it. I am struggling with figuring out good
         | prompts.
        
         | dqpb wrote:
         | I think it's hard to move fast and be the morality police at
         | the same time.
        
         | Keyframe wrote:
         | I'm still heavily exploring these new tools from an artist
         | perspective. I never managed to get a run on Midjourney, but
         | between DALL-E and SD there are quite a few differences.
         | Broadly speaking, DALL-E seems to have a better handle on
         | photographic results and interpreting "what I meant". With
         | stable diffusion it's a lot of fiddling and putting manual
         | emphasis on certain keywords until you get it just right.
         | 
         | Overall, pricing will need to be adjusted over time as well. I
         | set out on an experiment the other day that you can see here:
         | https://twitter.com/Keyframe/status/1574338738808934400
         | 
         | I went about trying to utilize stable diffusion for an
         | imaginary concept project (concept for characters of a remake
         | of TMNT, heh). Process was similar to how I'd do it with
         | another artist more than if I drew it alone. It was back and
         | forth, from rough outlines and then honing into details.
         | Inpainting and img2img helped A TON and I hope to get
         | dreambooth running soon as well, since that will be a game-
         | changer in the combination of things.
         | 
         | Between exploration phase, detailing, alternatives, and manual
         | painting and over painting, I'd say PER OUTPUT final image I
         | created in the region of a thousand or so interim images.
         | Process overall did take a lot of time but not as much as
         | completely manual and I didn't feel like I had as much control
         | as manual of course, but I did feel ultimately bold enough that
         | I thought I had creative control. With dreambooth I expect it
         | to close the gap.
         | 
         | Overall, I was extremely pleased with the experiment and I'll
         | continue exploring it, even though I'm not doing artwork
         | professionally anymore. And so far no, it's not going to
         | replace artists. It's another tool removing labour, but adds
         | time on direction needed. Ultimately it'll be another brush in
         | the toolbox.
        
         | shadowgovt wrote:
         | It's a little heartbreaking because arguably, OpenAI tried to
         | do the responsible thing here: come up with a sustainable
         | business model to make AI-generated images profitable while
         | respecting trademarks and controlling for some objectionable
         | content. Very corporate; very above-the-board.
         | 
         | Emad Mostaque, a millionaire hedge-fund manager with money to
         | burn, spent approximately $600,000 to train a model and dumped
         | it out for public consumption: no account for how it will be
         | used, no concern about any sociopolitical consequences, damn
         | the torpedoes and straight ahead. He basically burned down a
         | potential industry space and hugely complicated an ongoing
         | conversation on how these tools will interact with / disrupt
         | the lives and livelihoods of artists... But he also basically
         | changed the world overnight. Hashtag-squad-goals, am I right?
         | 
         | There's a lesson to be learned here. I haven't decided what it
         | is yet. Though I note that it's a lesson that probably applies
         | to few people who don't have $600,000 to set aflame.
        
           | dalmo3 wrote:
           | > dumped it out for public consumption: no account for how it
           | will be used, no concern about any sociopolitical
           | consequences
           | 
           | I hadn't heard the story of how stable diffusion was created.
           | Sounds like the guy is a true hero from your description. And
           | only for $600k? Imagine if he decided to "burn" the rest of
           | his millions on similar initiatives.
        
             | shadowgovt wrote:
             | I unfortunately lack the imagination to think of similar
             | initiatives that could be addressed in such a fashion.
        
         | mrtksn wrote:
         | It's almost as if OpenAI got the right idea at the beginning
         | but sometime, somewhere, maybe in a meeting room, contrary to
         | their initial openness goals, they decided to be a closed walled
         | garden for a product that doesn't exist. IMHO giving a prompt
         | to generate an image is amazing but isn't a product because you
         | can't actually produce useful stuff (it's great as an
         | exploration tool). It seems to me OpenAI rushed into
         | monetisation and control before having a killer app.
         | 
         | On the other hand, Stable Diffusion emerged as a free tool where
         | a large community can experiment and search for the killer app
         | together. People started adapting it into other tools and
         | workflows, and so far it seems like the magic is in finding
         | prompts that make the device generate good quality outputs.
         | Earlier today I saw an announcement about lexica.art (a Stable
         | Diffusion prompt tool) getting funded.
        
           | Uehreka wrote:
           | >contrary to their initial openness goals
           | 
           | Their goals were never about openness at all though. From the
           | beginning I've felt like they should've called themselves
           | something like "SafeAI", since their stated goal was
           | basically to develop advanced AI first, then keep a lid on it
           | until they could somehow ensure it was "safe" or would only
           | be used by "good" people.
           | 
           | Sure, OpenAI might sound nicer, but it also drags this
           | contradiction into the foreground whenever someone says their
           | name.
        
             | metacritic12 wrote:
             | Yup, OpenAI was founded by AI-safety-as-a-religion people.
             | They're essentially single-issue voters, who earnestly
             | believe their issue is the only issue that matters. You
             | see analogues of them in e.g. climate change (right or
             | wrong).
             | 
             | This religion definitely has a paternalistic bent to it that
             | rubs a lot of people the wrong way. I vaguely recall them
             | floating on Twitter the theoretical idea of whether
             | murdering people to prevent AI-takeover is acceptable, due
             | to how bad AI-takeover is.
             | 
             | It's not surprising that limiting access, spying on what
             | its users are using their tools for, etc., is acceptable to
             | them.
             | 
             | This is much in the same vein as how for Lenin, the
             | eventual triumph of the working class is so important as to
             | justify a little bit of interim violence, dictatorship, and
             | summary executions.
        
               | Miraste wrote:
               | The difference being that climate change activists have
               | mountains of data, decades' worth, backing their cause,
               | while OpenAI and friends have a sci fi story they made
               | up, based on nothing. The whole "AI alignment" movement
               | is the worst example of arrogance in modern tech. Even
               | the nomenclature screams condescension - the imaginary
               | AGI needs "aligned values?" Aligned with whose values?
               | Invariably it ends up being the creators', at the expense
               | of squashing everyone else's. The DALL-E "acceptable use"
               | rules are a dystopian nightmare and they are born of
               | incredibly pompous self-righteousness.
        
               | addingadimensio wrote:
               | Shutting down nuclear power is also dystopian.
        
               | Miraste wrote:
               | Anyone who calls themself a climate change activist and
               | supports shutting down nuclear plants is - well, I prefer
               | not to use invective on HN, so let's say _extremely
               | misguided._
        
               | addingadimensio wrote:
               | So the green party in basically every country?
        
               | hwers wrote:
               | "They have literally floated the idea of whether
               | murdering people to prevent AI-takeover is acceptable,"
               | 
               | Where? I probably believe you but it almost makes me
               | worried about the well being of the stability ai founders
               | (on a long term horizon)
        
               | metacritic12 wrote:
               | To be honest, I thought I saw it somewhere on Twitter but
               | can't find it right now after a few minutes of search. It
               | was proposed as a theoretical question, not as a
               | statement -- like is AI so bad that if you had a time
               | machine, it would be worth killing the pivotal people to
               | prevent it. Terminator style.
               | 
               | I've modified my OP to clarify that.
        
             | somenameforme wrote:
             | Wow, I was about to argue against this 'unfairly cynical'
             | take, but it's completely correct.
             | 
             | ---
             | 
             | (2015) OpenAI's original "Introducing OpenAI Post" :
             | https://openai.com/blog/introducing-openai/ : "As a non-
             | profit, our aim is to build value for everyone rather than
             | shareholders. Researchers will be strongly encouraged to
             | publish their work, whether as papers, blog posts, or code,
             | and our patents (if any) will be shared with the world.
             | We'll freely collaborate with others across many
             | institutions and expect to work with companies to research
             | and deploy new technologies."
             | 
             | (2018) OpenAI's "Charter" : https://openai.com/charter/ :
             | 
             | "We are concerned about late-stage AGI development becoming
             | a competitive race without time for adequate safety
             | precautions. Therefore, if a value-aligned, safety-
             | conscious project comes close to building AGI before we do,
             | we commit to stop competing with and start assisting this
             | project."
             | 
             | "We are committed to providing public goods that help
             | society navigate the path to AGI. Today this includes
             | publishing most of our AI research, but we expect that
             | safety and security concerns will reduce our traditional
             | publishing in the future, while increasing the importance
             | of sharing safety, policy, and standards research."
             | 
             | ---
             | 
             | Provides some interesting context to the fact that Elon
             | left the company's board in February 2018 over
             | "disagreements about the company's development."
        
               | moffkalast wrote:
               | > if a value-aligned, safety-conscious project comes
               | close to building AGI before we do, we commit to stop
               | competing with and start assisting this project
               | 
               | Haha, and then they would proceed to get told to politely
               | piss off.
        
             | moffkalast wrote:
             | > stated goal was basically to develop advanced AI first,
             | then keep a lid on it until they could somehow ensure it
             | was "safe" or would only be used by "good" people
             | 
             | That was the stated made up bullshit they spun because
             | "we're keeping this walled to figure out how to squeeze the
             | most profit out of it" doesn't go as well with their focus
             | groups.
        
             | ryanmcbride wrote:
             | For real it's insane to me how much I bump up against their
             | community guidelines. For example, you'll get a community
             | guidelines block if you enter a prompt like "An
             | illustration of a computer in the style of Henry Vandyke
             | Carter".
             | 
             | Removing "Vandyke" from the prompt lets it go through[1],
             | but doesn't result in the style I want, because there's no
             | artist that I'm aware of who goes by "Henry Carter". The
             | middle name is important.
             | 
             | It reminds me of the old 2D Runescape days where the
             | language filter would convert "dictionary" to "**tionary".
             | 
             | [1] https://ibb.co/4106mfF
        
               | msoucy wrote:
               | My favorite example of an issue like this (the Scunthorpe
               | problem) is from the mobile game Kingdom Hearts Unchained
               | X. In it, players used Medals based around Disney
               | characters, including experiment 626. However, for a long
               | time after the game's release, players were unable to say
               | that name in chat, because it got rendered as s***ch
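               | 
               | A minimal Python sketch of how a naive substring blocklist
               | produces this kind of false positive, and how whole-word
               | matching avoids it. The blocklist term is a hypothetical
               | stand-in, and nothing here is the actual filter used by any
               | game or by OpenAI.

```python
import re

# Hypothetical blocklist with one mild stand-in term.
BLOCKED_TERMS = ("hell",)


def _mask(match):
    """Replace the matched text with asterisks of the same length."""
    return "*" * len(match.group())


def substring_filter(text):
    """Naive filter: masks a blocked term wherever it appears, even inside words."""
    for term in BLOCKED_TERMS:
        text = re.sub(re.escape(term), _mask, text, flags=re.IGNORECASE)
    return text


def whole_word_filter(text):
    """Masks a blocked term only when it stands alone as a word."""
    for term in BLOCKED_TERMS:
        pattern = r"\b" + re.escape(term) + r"\b"
        text = re.sub(pattern, _mask, text, flags=re.IGNORECASE)
    return text


print(substring_filter("a seashell on the beach"))   # innocent word gets masked
print(whole_word_filter("a seashell on the beach"))  # left untouched
```

               | Whole-word matching removes the false positive here, but it
               | is still a plain blocklist and is easy to evade with spacing
               | or misspellings.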
        
               | wyldfire wrote:
               | Scunthorpe problem [1]. Apparently they couldn't just
               | unleash a model to fix this [2].
               | 
               | [1] https://en.wikipedia.org/wiki/Scunthorpe_problem
               | 
               | [2] https://www.techdirt.com/2018/08/31/scunthorpe-
               | problem-why-a...
        
               | Miraste wrote:
               | It's doubly pathetic because of how they frame it:
               | 
               | -We are the world's most advanced AI company.
               | 
               | -Our filter verifiably acts as a simple blacklist.
               | 
               | -You aren't allowed to see the blacklist because it's
               | really a "contextual" filter, so you'll have to guess.
               | 
               | -If you guess wrong too many times you'll be banned.
               | 
               | -Using our service more often increases the chance you'll
               | hit the number of wrong guesses.
               | 
               | -No, you can't know what that number is.
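               | 
               | The last two points can be made concrete with a binomial
               | sketch in Python. It assumes each prompt is flagged
               | independently with a fixed probability and that a ban
               | triggers after a fixed number of flags. The real policy is
               | undisclosed, so the threshold is hypothetical, and the
               | 1-in-6 flag rate is one commenter's anecdote from earlier
               | in this thread.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(at least k flags in n independent prompts, each flagged with probability p)."""
    # Sum the probabilities of seeing fewer than k flags, then take the complement.
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


FLAG_RATE = 1 / 6     # anecdotal rate from the thread, treated as an assumption
BAN_THRESHOLD = 20    # hypothetical; the real number is not public

for prompts in (50, 100, 200, 400):
    chance = prob_at_least(BAN_THRESHOLD, prompts, FLAG_RATE)
    print(f"{prompts} prompts: {chance:.3f}")
```

               | Under these assumptions the chance of reaching any fixed
               | threshold only goes up as the number of prompts grows.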
        
               | Blahah wrote:
               | H V Carter
        
               | ryanmcbride wrote:
               | I'm out of credits but I'll try that later
        
           | JumpCrisscross wrote:
           | > _OpenAI rushed into monetisation and control before having
           | a killer app_
           | 
           | They also went for B2B first. Which is weird. Why not
           | parallel a B2C app? It could be a subscription or packs of
           | drawings. It would generate buzz and give useful data on the
           | sorts of things real people type into these systems. I
        
             | bombcar wrote:
             | They thought too many of the B2B customers would just use
             | the B2C app is what I would guess, but there are ways to
             | limit that.
        
           | suyash wrote:
           | They might as well change their company name to Close AI at
           | this point.
        
           | namlem wrote:
           | OpenAI is infected with AI safety brainworms
        
             | dmix wrote:
             | AI ethics as a whole has become a bit of a joke.
             | 
             | Any highly motivated group without much to do will seek out
             | things to make themselves seem important and necessary.
        
               | pfisherman wrote:
               | I don't think it is a joke, so much as misguided. I see a
               | lot of focus on technical solutions, when the real
               | problems are social. The big research question should not
               | be "how can we build a 'safe' system" so much as "how
               | should (or shouldn't) we use these new tools and
               | capabilities"?
        
               | oefnak wrote:
               | I'm very interested why you think an advanced AI wouldn't
               | be dangerous?
               | 
               | Assume:
               | - AGI wants to stay alive
               | - Humans can create more AIs
               | - Other AIs would compete for the same resources
               | 
               | Then: Easiest way to make sure that they would get no
               | more competitors would be...
        
               | ryanmcbride wrote:
               | AGI and what we're talking about here are completely
               | different.
        
               | astrange wrote:
               | That'd be sad if we invented a superintelligent AI but
               | still taught it the lump of labor fallacy.
               | 
               | Anytime you see "creating an AI will obviously kill you"
               | try reading it as "having children will obviously kill
               | you" and see if it still makes sense.
        
               | babyshake wrote:
               | My take on this is that AI ethics is really important,
               | but just preventing AI from doing certain things like
               | creating celebrity deepfakes is somewhat lazy and
               | ineffective. A better application of AI ethics is
               | developing technology that can reliably detect deepfakes,
               | rather than just putting artificial limits in your
               | product and acting like that is going to stop Pandora's
               | box from being opened.
        
               | im3w1l wrote:
               | The funny thing about generated porn is that once it
               | becomes ubiquitous real leaked tapes become deniable. So
               | the possible downside for celebrities is that when they
               | _intentionally_ leak a tape to create buzz it may be met
               | with a yawn.
        
               | bee_rider wrote:
               | Yeah. People are definitely going to abuse Stable
               | Diffusion, I'm sure they already are. But I don't really
               | know what OpenAI's plan was. It's like they rushed up to
               | a Pandora's box, took a peek and shouted to everyone
               | "Good news everyone, we taped Pandora's box closed!"
               | somehow without noticing that they were doing so from
               | inside Pandora's warehouse.
               | 
               | On the other hand, everybody's been saying Pandora's
               | warehouse was over there for a while -- it isn't really
               | that they are to blame for showing us the way in or
               | anything, I just don't understand what they were trying
               | to accomplish.
        
               | PoignardAzur wrote:
               | That's a fantastic metaphor.
        
             | biomcgary wrote:
             | And, based on market share, they are clearly the target of
             | a sophon directed by Roko's basilisk.
        
             | gedy wrote:
             | I was enthusiastic about DALL-E but the "safety measures"
             | are both heavy handed and naive. It gets in the way for
             | many normal/reasonable prompts but seems easy to work
             | around with various wordplay, so not sure the point. Stable
             | Diffusion and others have been much easier to deal with.
        
               | ackfoobar wrote:
               | > heavy handed and naive
               | 
               | It's a good thing that the "safety measure" is the way it
               | is - an afterthought. It means that those ideologues
               | haven't yet had influence on the model itself.
        
               | samatman wrote:
               | The harm is really hand-wavey and speculative, frankly.
               | 
               | An image classifier calling Black faces gorillas?
               | Embarrassing, insulting, has to be fixed. AI pre-crime
               | classifiers for police departments? I'm against it,
               | across the board.
               | 
               | Do we really care that the image mulchers default to
               | stereotypes? It means if you say "basketball player"
               | they'll mostly be Black, if you just say "doctor" they'll
               | mostly be white males (and probably balding with a
               | stethoscope), but this can be qualified easily in the
               | prompt.
               | 
               | It just reflects the training data, and the smart thing
               | to do is shrug and add enough words to get the image you
               | want. It's not trying to throw shade, it literally
               | understands nothing, it's not able to understand things,
               | just match text prompts to generated images.
               | 
               | Nerfing DALL-E by randomly adding 'diverse words' just
               | makes it harder to dial in the image you want. Let's say
               | you want a Vietnamese male doctor drinking coffee on
               | break in Hanoi, it's not going to help you if 1/3rd of
               | the images have "female" or "black" tagged onto it.
               | 
               | It just seems low stakes. We wouldn't come after a human
               | artist who happened to paint a picture which conforms to
               | simple occupational stereotypes, why should AI be any
               | different? It's not like it will refuse to give you what
               | you want if you ask.
        
           | shadowgovt wrote:
           | Stable Diffusion's creator spent a completely ridiculous
           | amount of money to make that free tool.
           | 
           | OpenAI's mistake may have been "planning to have a business
           | model;" the alternative they should have gone with was
           | "Instead of taking investor money with promises of some kind
           | of return, _be_ a hedge fund manager, make $100 million, and
           | then set $600,000 on fire with no plan to recoup the cost
           | because it's play-money to you."
        
             | joe_the_user wrote:
             | You can't have a business model of "become rich and then
             | use money for X". Businesses _are_ how (a very few) people
             | become very rich.
             | 
             | Moreover, there are very rich people already, like Warren
             | Buffett, Bill Gates and Elon Musk, funding projects for
             | doing good like world hunger, education and _"AI Safety"_.
             | And Open AI was a project of this sort of thing,
             | originally. The thing is that even very rich people demand
             | that the enterprises they give money to be as self-
             | supporting as possible and their money is spread fairly
             | thin. The only way Open AI could become an AI development
             | shop, employing many top developers, was to have the
             | financing level of a commercial company. Which means it
             | constantly puts out products that don't seem like they can
             | make money because AI algorithms don't seem too controllable
             | - Open AI seems to only be able to have the first
             | implementation of X, not the best implementation. Once the
             | basic idea is out, someone else can produce a similar thing
             | with a budget that doesn't include a research team.
        
               | shadowgovt wrote:
               | Precisely. The road StabilityAI took to releasing Stable
               | Diffusion is precluded for most startup companies.
        
             | minimaxir wrote:
             | $600k is an order of magnitude less than what it cost to
             | train GPT-3 or DALL-E 2.
             | 
             | When that figure came out, the popular talking point was
             | how _cheap_ Stable Diffusion was to make and how easily a
             | well-funded competitor could create their own custom
             | variant.
        
               | fragmede wrote:
               | $600k is also list price for the GPU time spent. As an
               | investor in a GPU cloud company the actual cost was
               | probably way less than that.
        
             | petercooper wrote:
             | _and then set $600,000 on fire_
             | 
             | That seems a bit cynical. While SD's creator _might_ not
             | recoup that money directly, a lot of end users have
             | benefited from its creation. That money has figuratively
             | gone up in flames no more than the time or labor cost of an
             | open source developer whose code is used by millions of
             | people, IMO.
        
               | shadowgovt wrote:
               | It's a bit cynical; my point is that it's the kind of
               | decision you get to make when you're the sole owner of
               | $100 million and not the kind you get to make when you're
               | a startup company founder working with other investors'
               | money.
               | 
               | OpenAI wouldn't have been able to do what StabilityAI did
               | because OpenAI is incentivized to make return on
               | investment; Mohammad Emad Mostaque is not.
        
             | Miraste wrote:
             | Stability AI hit over a million users on their paid SD
             | implementation, Dream Studio, in less than a month. I'd bet
             | they recoup the training money.
        
             | l33tman wrote:
             | $600k was the off-the-shelf market value of the GPU time
             | spent. They got this time at a much lower rate (according
             | to the founder himself) and for the PR and fame they got,
             | that money is ridiculously little.
             | 
             | Somewhat tangentially, I speculate that crowd-sourced
             | training will become a thing.
        
           | minimaxir wrote:
           | > IMHO giving a prompt to generate image is amazing but isn't
           | a product because you can't actually produce useful stuff
           | 
           | Making art/weird pictures doesn't have to be _useful_, as
           | that use case is the entire reason MJ/SD went viral.
        
             | hugozap wrote:
             | True, but if you can't integrate it into your workflow it
             | will stay as a toy (and that's ok)
        
               | cbozeman wrote:
               | People are already integrating it into their workflows.
               | 
               | The inpainting plugins with Photoshop and Krita are
               | already working absolute wonders.
        
               | nodja wrote:
               | https://www.youtube.com/watch?v=2rA4Ny-QQfg
        
               | lbotos wrote:
               | Not sure if you've seen, but the person you are replying
               | to wrote a blog post recently on their experiences of
               | "getting stable output":
               | 
               | https://minimaxir.com/2022/09/stable-diffusion-ugly-
               | sonic/
        
             | mrtksn wrote:
             | It is not art and art is useful (we can disagree on what's
             | art, the age old question).
             | 
             | MidJourney and others are actually useful for exploration
             | but the outputs are not because they can't spit finished
             | deliverables to the specs. No one is paying for a picture
             | of Mermaid eating marmalade, trending on art station,
             | beautiful face, sharp focus, octane 8k.
             | 
             | They are great for exploration, it's just that I don't
             | believe this is the killer app for these tools. We will
             | find out what's the killer app with Stable Diffusion
             | because with Stable Diffusion people can experiment beyond
             | entering some prompts.
        
               | danielbln wrote:
               | Training your own likeness into the stable diffusion (via
               | dreambooth) and then using it is absolutely hypnotic.
        
               | TulliusCicero wrote:
               | There are situations where that kind of art is useful
               | though. People have pointed out that it could work just
               | fine for art for card games like Magic. Probably a lot of
               | board games too.
        
               | mrtksn wrote:
               | Sure, some people can find it useful but IMHO that's not
               | a product, at least a good one. Consider how much more
               | universally useful are other products like Photoshop or
               | Blender.
               | 
               | I think a major problem is reproducibility and output
               | controllability. Rolling the dice multiple times and using
               | some of the outputs is not good enough for most
               | applications.
               | 
               | Maybe this can be solved at some point but it's not at
               | this moment. The advantage of Stable Diffusion is that it
               | can be possible for someone to implement it, with OpenAI
               | this feature doesn't exist and it's not useful until they
               | implement it.
        
             | UncleEntity wrote:
             | 100% useless but seriously addictive...
        
             | npunt wrote:
             | Indeed, "it's just a toy" is often the place these things
             | start
        
         | hartator wrote:
         | Yes, it does feel they shot themselves in the foot.
         | 
         | Their marketing was excellent, but somehow pushed expectations
         | too much and underdelivered. It also felt very elitist. Not
         | very "tinkerers in a garage", which is what this generation of
         | Stable Diffusion feels like.
        
         | obert wrote:
         | My thought on timing this is about Responsibility.
         | 
         | I don't think the other options put any, or as much,
         | consideration into AI impact.
         | 
         | Perhaps we're just finding out that people don't care. Yet?
        
         | danielfoster wrote:
         | For something that is supposed to be intelligent, sometimes the
         | restrictions around DALL-E make no sense. My recent request to
         | generate an image of two cats sleeping together was not allowed
         | because apparently this is "adult content."
        
         | irrational wrote:
         | Equal or better quality? I suppose it depends on what you are
         | trying to create, but that hasn't been my experience at all.
        
           | hooloovoo_zoo wrote:
           | Agreed; SD barely follows prompts at all.
        
             | zimpenfish wrote:
             | > Agreed; SD barely follows prompts at all.
             | 
             | I would heartily disagree - I've generated ~6.5k images
             | using SD locally and most of them could be linked to the
             | prompt they came from.
        
               | itintheory wrote:
               | Have you seen a decent tutorial for setting up SD
               | locally? I've been using it through huggingface, but that
               | seems pretty limited.
        
               | Nimitz14 wrote:
               | Official repo is straightforward:
               | https://github.com/CompVis/stable-diffusion
               | 
               | Have to admit I just started looking into it; maybe there
               | are better options.
        
               | zimpenfish wrote:
               | No, sorry, but there's a whole bunch of one-click things
               | now, I think?
               | 
               | I'm running it on Windows 10 using (a modified version
               | of) https://github.com/bfirsh/stable-diffusion.git and
               | Anaconda to create the environment from their
               | `environment.yaml` (all of which was done using the
               | normal `cmd` shell). Then to use it, I activate that env
               | from `cmd` and switch into cygwin `bash` to run the
               | `txt2img.py` script (because it's easier to script, etc.)
               | 
               | [edit: probably helps that I already had a working VQGAN-
               | CLIP setup which meant all the CUDA stuff was already
               | there. For that I followed
               | https://www.youtube.com/watch?v=XH7ZP0__FXs which covered
               | the CUDA installation for VQGAN-CLIP.]
        
               | Caseee wrote:
               | You can find a number of different guides over at the
               | stable diffusion subreddit, from CLI to GUIs in different
               | flavors.
               | 
               | https://www.reddit.com/r/StableDiffusion/comments/xcq819/
               | dre...
        
               | hooloovoo_zoo wrote:
               | Doesn't 'most of them could be linked to the prompt they
               | came from' strike you as damning with faint praise?
        
               | zimpenfish wrote:
               | Not hugely - e.g. taking the 38 prompts including "a
               | painting by William Adolphe Bouguereau" (which is easily
               | the worst of the modifiers for me), 10 of them I'd say
               | were "no clue to the prompt". For the 56 Munch images, 54
               | were good and 2 were quibbles ("an isopod as an angel"
               | had no isopod but did have an angelic human - is that a
               | pass or no?)
               | 
               | (Which is probably better than you'd get from a human
               | given the exact same prompts.)
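               | 
               | As a rough sketch, the hit rates implied by those counts
               | can be worked out as below. The counts are the ones
               | quoted above; treating the two Munch "quibbles" as misses
               | is an assumption made only for this illustration, and
               | `hit_rate` is a hypothetical helper name.

```python
def hit_rate(total: int, misses: int) -> float:
    """Fraction of images that could be linked back to their prompt."""
    return (total - misses) / total


# Counts quoted in the comment: 38 Bouguereau prompts with 10 misses,
# 56 Munch images with 2 quibbles (counted as misses here, by assumption).
bouguereau = hit_rate(total=38, misses=10)
munch_strict = hit_rate(total=56, misses=2)

print(f"Bouguereau: {bouguereau:.1%}")
print(f"Munch (quibbles counted as misses): {munch_strict:.1%}")
```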
        
               | twojacobtwo wrote:
               | > SD _barely_ follows prompts at all.
               | 
               | > ...and _most_ of them could be linked to the prompt
               | they came from.
               | 
               | You made it sound as if there is almost no connection
               | between the prompt and the images and zimpenfish said
               | that the majority could be linked, implying a strong
               | connection. He/she doesn't have to be praising it at all
               | to counter your claim.
        
           | johnfn wrote:
           | Which one are you comparing against? I've tried hundreds of
           | prompts between SD and DALL-E and get comparable results.
           | Midjourney was lagging for a while, but the new --testp
           | parameter is really remarkable, which, in my view, makes it
           | superior not only to Stable Diffusion but also to DALL-E as
           | well.
        
             | yreg wrote:
             | My experience is that with prompts that fit into OpenAI's
             | limiting content policy DALL-E text2img results are usually
             | much better. And I use SD like 95% of the time, so it's not
             | the case that I would be more used to DALL-E.
        
               | KaoruAoiShiho wrote:
               | I need some examples because I don't really see it for
               | the vast majority of usecases.
        
               | yreg wrote:
               | Here I wanted to illustrate the game Waffle[0], first
               | attempt with Dalle was pretty good, not true for SD:
               | 
               | https://labs.openai.com/s/rCzJwauuiaIj1Pd3IyJGaHS3
               | 
               | Here I wanted an illustration of a nuclear plant in a
               | Japanese landscape; the first attempt with Dalle produced
               | multiple good results. I tried SD and MJ (back when MJ
               | didn't use SD) as well, had trouble even with multiple
               | attempts:
               | 
               | https://labs.openai.com/s/FxhxtMFe3kFS8msV8vekRAJ3
               | 
               | There are others, but anyway I think my examples are not
               | important since it will always be easy to cherry pick
               | prompts that yield the best results in model X.
               | 
               | In my experience SD is good at producing (especially non-
               | photo-realistic) art that looks pretty and DALL-E is
               | better at following a specific prompt when I know what
               | exactly I want.
               | 
               | Of course I recognise your experience might (and probably
               | does) differ.
               | 
               | [0] - https://wafflegame.net/
        
             | gpt5 wrote:
             | An easy example of DALL-E superiority is its ability to
             | combine two different concepts together.
             | 
             | For example, DALL-E performs extremely impressively on
             | prompts in the format of "a still of Homer Simpson in The
             | Godfather" (replace character and movie as you wish). With
             | the other two it's a lot of misses.
        
               | avereveard wrote:
               | from dall-e: https://i.imgur.com/RHiOjuM.png
               | 
               | I would argue that none of these follow the prompt. They
               | all render a Godfather frame in Simpsons style, which is
               | not the same as placing Homer in a Godfather still.
        
               | bitcurious wrote:
               | >An easy example of DALL-E superiority is its ability to
               | combine two different concepts together.
               | 
               | This is a con for some prompts. As an example, I asked
               | for a painting of an elephant and a dog drinking tea
               | together. The result was a dog with an elephant nose next
               | to a teapot.
               | 
               | A similar misfire was the word 'porcupine', which drew
               | pigs, I guess because "porc" is in it? Anyway, its
               | idea-blending is a little too aggressive.
        
               | TillE wrote:
               | Yeah you're right that Stable Diffusion produces garbage
               | for that prompt.
               | 
               | I'd love to see a site with lots of examples of the same
               | prompt fed into various models, I assume someone has
               | already made that.
        
               | zimpenfish wrote:
               | > Yeah you're right that Stable Diffusion produces
               | garbage for that prompt.
               | 
               | I dunno, I generated 20 images from that prompt locally
               | and got three good ones[1].
               | 
               | [1] - https://imgur.com/a/rZ6wOEF
        
               | wunderbaba wrote:
               | What? None of the people in these images are even
               | remotely recognizable as Homer Simpson.
        
               | zimpenfish wrote:
               | What would you count as a pass then? A literal rendering
               | of the cartoon Homer Simpson on top of a still from the
               | actual Godfather film?
        
               | cbozeman wrote:
               | With StableDiffusion I can buy a used RTX 3090 on eBay
               | for $650, tell the model to generate 5,000 images, and
               | then review each one until I find what it is I'm looking
               | for.
               | 
               | Turns out a shitload of misses are acceptable when it
               | only takes 4-7 seconds to generate an image from a
               | prompt. 5000 generations on an RTX 3090 takes around 7
               | hours +/- 30 minutes, by the way.
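               | 
               | A quick sanity check of that arithmetic, as a sketch: the
               | 4-7 s per image and ~7 hour figures are the ones quoted
               | above, and `batch_hours` is a hypothetical helper name.

```python
def batch_hours(images: int, seconds_per_image: float) -> float:
    """Wall-clock hours to generate `images` sequentially at a fixed rate."""
    return images * seconds_per_image / 3600


# Bounds from the quoted 4-7 seconds per image.
fast = batch_hours(5000, 4)
slow = batch_hours(5000, 7)

# Average seconds per image implied by the quoted ~7 hours for 5000 images.
implied_seconds = 7 * 3600 / 5000

print(f"5000 images take between {fast:.1f} and {slow:.1f} hours")
print(f"7 hours implies about {implied_seconds:.2f} s per image")
```

               | So the ~7 hour figure sits inside the range implied by
               | 4-7 s per image, towards the fast end.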
        
               | johnfn wrote:
               | While this is likely true for this specific prompt, I
               | think that cherry-picking a single prompt that DALL-E
               | outperforms SD on is not _super_ indicative of anything.
               | I've conversely found a large number of prompts where SD
               | outperforms DALL-E, either in aesthetic quality or just
               | following directions! I think you'd really have to
               | compare both of them across a large number of prompts of
               | different types to be sure.
        
         | nextstep wrote:
         | This is something that only people on HN would write/believe.
         | Missed the boat on what? Giving away free images from a prompt?
         | 
         | This is all early days and these demos are neat but the real
         | value is yet to be seen. Maybe when this technology is licensed
         | and integrated into Photoshop or Instagram or something like
         | that.
        
         | miles wrote:
         | > Midjourney and Stable Diffusion emerged and got to the point
         | where they produce images of equal or better quality than
         | DALL-E
         | 
         | I cannot speak to DALL-E's results, as the signup process is
         | currently broken (after providing email, name, and phone
         | number, was met with "We're experiencing a temporary issue with
         | signups due to a vendor outage. We apologize for the
         | inconvenience!"), but the Stable Diffusion results I've been
         | getting are not just unusable, but downright bizarre... here
         | are the four images it produced for "morihei ueshiba doing a
         | double backflip": https://imgur.com/a/EvkQpBT
        
           | miles wrote:
           | Finally was able to get the signup process sorted (discovered
           | that I had to use a different email address than the one I
           | had originally requested beta access with); DALL-E's results
           | for the same prompt were more human at least:
           | https://imgur.com/a/OahhDS4 .
        
         | minimaxir wrote:
         | I am surprised OpenAI didn't adjust the price of DALL-E 2 given
         | the rise of free/low-cost competitors.
         | 
         | Granted, DALL-E appears to be buckling under demand regardless
         | so the supply/demand curve doesn't warrant a price drop yet.
        
           | ortusdux wrote:
           | Name brand recognition goes a long way.
        
         | kylevedder wrote:
         | I've spent a significant amount of time playing with the
         | variety of Diffusion models available and DALLE 2 tends to
         | produce much better quality images. The other killer feature is
         | DALLE 2 has support for in-fill.
        
         | ericd wrote:
         | This has been the dominant story going around, I guess because
         | people want it to be true since they're pissed at OpenAI for
         | not being so open, but StableDiffusion's text2image is nowhere
         | near as good as DALL-E 2 in my experience. DALL-E 2 is
         | _incredible_ at that, StableDiffusion is not.
         | 
         | But maybe it doesn't matter, because many times more people are
         | playing around with StableDiffusion, such that the absolute
         | number of good images being shared around is much higher with
         | StableDiffusion, even if the average result isn't great.
        
           | johnfn wrote:
           | > I guess because people want it to be true since they're
           | pissed at OpenAI for not being so open
           | 
           | This is honestly not my experience at all. When I first tried
           | SD and MJ, I did so with a very clear and distinct feeling
           | that they were "knock-off DALL-Es" and I strongly doubted
           | that they would be able to produce anything on the level of
           | DALL-E. Indeed, I believed this for my first couple hundred
           | prompts, mostly because I didn't know how to properly prompt
           | them.
           | 
           | After using them for around a month, I slowly realized that
           | this was not the case, and in fact they were outperforming
           | DALL-E for most of my normal usage. I have a bunch of prompts
           | where SD and MJ produce absolutely beautiful and coherent
           | artwork with extremely high consistency, that when sent to
           | DALL-E, give significantly worse results.
        
             | wunderbaba wrote:
             | It depends on what you're generating. Complex prompts in
             | DALLE ("a witch tossing rapunzels hair into a paper
             | shredder at the bottom of the tower") blow Midjourney and
             | Stable Diffusion out of the water.
             | 
             | But if all you're doing is the equivalent of visual Mad
             | Libs ("Abraham Lincoln wearing a zoot suit on the moon."),
             | then SD and MJ suffice.
        
           | enlyth wrote:
           | Yes, it's true, I've tried all the available models and
           | DALL-E 2 outperforms Stable Diffusion. It understands prompts
           | way better and SD sometimes just plainly ignores parts of
           | your prompt or misinterprets them completely. SD cannot
           | generate hands at all for example, they look more like
           | appendage horrors from another dimension.
           | 
           | OTOH, the main limiting factor for DALL-E 2 from my point of
           | view is the ultra-aggressive NSFW filter. It's so bad that
           | many innocent prompts get stopped and you get the stern
           | message that you'll be banned if you continue, even though
           | sometimes you have no idea which part of the prompt even
           | violated the rules.
        
             | whywhywhywhy wrote:
             | At the end of the day, the hands don't matter, and pointing
             | out that it's worse because of that, when the benefits of
             | SD are so huge, means absolutely nothing.
             | 
             | Dall-E can't even do many of the images SD can, so it seems
             | silly to hold hands up as the AI art tool Turing test.
        
             | macrolime wrote:
             | It's not true that SD cannot generate hands. It's a bit
             | tricky, but it's possible.
             | 
             | Sometimes hands will turn out just fine and sometimes they
             | will suddenly become fine after some random other stuff is
             | added to the prompt.
             | 
             | It's clearly still missing a bit in terms of accurately
             | following prompts, but it's capable of generating a lot of
             | things that may not have obvious prompts. This should
             | improve a lot with larger models. I believe SD is already
             | working on it.
        
           | lolinder wrote:
           | It's not just many times more people, it's also the fact that
           | Stable Diffusion can be used locally for ~free.
           | 
           | If I get a bad result from DALL-E 2, I used up one of my
           | credits. If I get a bad result from Stable Diffusion running
           | on my local computer, I try again until I get a good one. The
           | result is that even if DALL-E 2 has a better success rate per
           | _attempt_, Stable Diffusion has a better success rate per
           | _dollar spent_.
           | 
           | This also affects the learning curve. I've gotten pretty good
           | at crafting SD prompts because I could practice a _lot_
           | without feeling guilty. I never attempted to get better with
           | DALL-E 2, because I didn't really want to spend money on it.
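           | 
           | The per-attempt versus per-dollar distinction can be sketched
           | as below. If each attempt succeeds independently with
           | probability p, the expected number of attempts per good image
           | is 1/p, so the expected cost is (cost per attempt) / p. Every
           | number here is a made-up placeholder, not a measured price or
           | success rate, and `cost_per_good_image` is a hypothetical
           | helper name.

```python
def cost_per_good_image(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend per acceptable image, assuming independent attempts."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Placeholder inputs (assumptions for illustration only):
# a paid service with a higher hit rate and a higher price per attempt,
# versus a local model with a lower hit rate and a near-zero marginal cost.
paid = cost_per_good_image(cost_per_attempt=0.13, success_rate=0.5)
local = cost_per_good_image(cost_per_attempt=0.001, success_rate=0.1)

print(f"paid service: ${paid:.2f} per good image")
print(f"local model:  ${local:.2f} per good image")
```

           | Under these placeholder inputs the local model is cheaper per
           | good image despite the lower hit rate; different inputs could
           | flip the result.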
        
           | educaysean wrote:
           | From my experience there isn't a clear difference in quality
           | between the output produced by Dalle2 and Stable Diffusion.
           | They both suffer from their own unique idiosyncrasies, and
           | the result is that they have differently shaped learning
           | curves.
           | 
           | I do admit that I rate the creativity of Dalle2 higher than
           | that of SD. It can occasionally create really unexpected and
           | exciting compositions, whereas SD will more often lean more
           | conventional.
        
           | hwers wrote:
           | I genuinely think stable diffusion is better than dalle.
           | There's a really obvious ugly artifact on almost all the
           | dalle images I've seen that SD doesn't suffer from.
           | 
           | But anyway, SD is far superior even if you consider dalle
           | better per image since you can create 1000 SD outputs and
           | just pick the one you like best (which for sure will have one
           | that's better than the dalle output you got)
        
           | madrox wrote:
           | You're right. History has shown the best quality product
           | doesn't always win if there's a "just okay" solution lying
           | around that's more accessible. VHS and Windows both come to
           | mind.
        
         | diebeforei485 wrote:
         | I suspect it was driven by moral panic and not necessarily
         | business considerations.
        
         | logicchains wrote:
         | Maybe OpenAI is learning the same lesson compiler and runtime
         | vendors learned a couple decades ago: it's very hard to compete
         | with open source.
        
         | mberning wrote:
         | It reeks of "product management". It's getting managed out of
         | relevancy.
        
         | datacruncher01 wrote:
         | I think the same thing is going to happen to the new models as
         | well. Something better and more efficient is going to eat their
         | lunch. Maybe we'll see more application specific models and a
         | general model sitting on top of that to composite results
         | together down the road.
        
         | josephcsible wrote:
         | Saying "missed the boat" makes it sound like it was just bad
         | luck and not OpenAI's fault, but I'd argue that it was their
         | fault. They could have made DALL-E open source; they just chose
         | not to.
        
         | Gregioei wrote:
         | You do understand how fast this space is moving and how new
         | everything is, right?
         | 
         | Your criticism is in my opinion not valid.
         | 
         | Do they need to react to the market? Perhaps; it depends on
         | what their goal even is.
         | 
         | Is dall-e 2 fun to use and cost-wise totally fine? For me yes.
         | 
         | But I also have people running SD with a hacky webui on some
         | good GPUs for free. How many people actually have access to it?
         | 
         | Is there also a good benchmark on which tool is inherently
         | better? Because it is also totally fine to have multiple
         | offerings.
         | 
         | I'm really not sure you have ever seen product development
         | for yourself.
         | 
         | Dall-e clearly took the potential misuse risk much further than
         | others.
        
           | naillo wrote:
           | Just as user friendly as dalle but for stable diffusion and
           | more free credits: https://beta.dreamstudio.ai/dream
        
             | Gregioei wrote:
             | I did a short test and already created 20 pictures with the
             | same dall e prompt without a result as good as dall-e.
             | 
             | And in another test the faces are super shitty.
             | 
             | Dall e also gives you 4 pictures per credit and dream 1.
             | 
             | So good to have more options I think. Two different
             | products feeling different.
        
               | naillo wrote:
               | I generally find stable diffusion outputs better than
               | dalle so it's surprising you say that. A good prompt
               | makes a big difference though.
        
               | Gregioei wrote:
               | This doesn't make my experience less true.
               | 
               | But I also played around with sd.
               | 
               | I still think my original comment is valid.
        
               | lairv wrote:
               | I think it depends a lot on what you mean by "better
               | output"
               | 
               | DALL-E is very good at conceptually representing complex
               | prompt. Something like "a bear with a diving mask surfing
               | in the ocean, a pelican is sitting on its shoulder",
               | DALL-E will immediately produce coherent results, while
               | SD requires lot of prompt tuning, and sometimes it's even
               | impossible to get it to represent some concepts (I
               | haven't tested this particular prompt tho)
               | 
               | SD is good for producing "artistic" images if that makes
               | any sense
               | 
               | edit: ok I tried the "surfing bear" prompt with DALL-E 2
               | and SD and the results are consistent with my point, I
               | put the raw prompt without tuning, and cherry picked the
               | best image out of 4 with both models, here is what I got
               | :
               | 
               | DALLE-2:
               | https://labs.openai.com/s/Q9824QOfXln4r9FLFNM3v9v1
               | 
               | SD: https://imgur.com/a/czcMgiC
               | 
               | For SD, even by tuning the prompt I wasn't able to get
               | the diving mask or the bird on the shoulder
        
           | orangecat wrote:
           | _But I also have people running SD with a hacky webui on some
           | good GPUs for free. How many people actually have access to
           | it._
           | 
           | The re
        
             | Gregioei wrote:
             | The credits from dall e are still cheap and you get some
             | every month.
        
         | matsemann wrote:
         | What really excited me about SD was how many creative things it
         | was used for because people could modify and use it. Just the
         | first week I saw tens of different cool projects here on HN.
         | With Dall-E, I have only ever seen prompts+images.
        
       | xwdv wrote:
       | It's too late. I personally don't give a fuck about DALLE when
       | there's better alternatives easily available now, they missed the
       | boat. The brand is tarnished IMO.
        
       | draw_down wrote:
        
       | WalterBright wrote:
       | I wondered what stable diffusion was, so went to
       | stablediffusion.com. The front page gives no indication as to
       | what it is. So I clicked on their FAQ page:
       | 
       | > What does Stability AI do? Stability AI is building open AI
       | tools to provide the foundation to awaken humanity's potential.
       | Our values are lived by every team member and shown by everyone
       | who excels at Stability AI. They are how we measure ourselves and
       | our work. Our vibrant communities consist of experts, leaders and
       | partners across the globe. They are developing cutting-edge open
       | AI models for Image, Language, Audio, Video, 3D, and Biology. AI
       | by the people, for the people.
       | 
       | Still don't know what it does. Continuing to the next FAQ:
       | 
       | > What's our business model? We're a company of builders who care
       | deeply about real-world implications and applications. Many of
       | our most considerable advances grow from working across multiple
       | teams. We are unafraid to go against established norms and
       | explore creativity. Our primary drive is to generate breakthrough
       | ideas and convert them into solutions. We respect innovation over
       | tradition. We trust that our differences make us more robust, and
       | so we seek reason within every difference of perspective.
       | 
       | Oh well. I give up.
        
         | xigency wrote:
         | Not meaning to be rude, but you may be living under a rock.
         | There've been many, many HN posts [0][1] about the open source,
         | free to use txt2img and img2img ML model that is Stable
         | Diffusion over the past few months.
         | 
         | Though I agree that their website provides no useful
         | information at all.
         | 
         | [0]
         | https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
         | 
         | [1] https://news.ycombinator.com/item?id=32665587
        
           | WalterBright wrote:
           | > you may be living under a rock
           | 
           | No wonder my back hurts.
        
         | bogwog wrote:
         | Stable Diffusion is this:
         | https://github.com/CompVis/stable-diffusion
         | 
         | It doesn't have a fancy product website because it's not really
         | a product, it's 'just' the model. Developers can use it to
         | build a product. The Stability.ai people themselves built one
         | called Dream Studio (https://dreamstudio.ai), but there are
         | also some free and open source frontends you can run on your
         | own hardware if you have a GPU.
         | 
         | I guess your confusion comes from the fact that people tend to
         | talk about "Stable diffusion" and not "Dream Studio" or one of
         | the many frontends available for it.
        
         | skybrian wrote:
         | Yeah, the home page is terrible. They have bigger plans, but
         | for now, it's another machine learning image generation tool.
         | An easy way to try it out is to get an account at
         | http://dreamstudio.ai/.
        
         | TulliusCicero wrote:
         | Stable Diffusion is an image generation model that's been
         | released to the public at large. If you have a decent GPU, you
         | can run the model yourself. (Even without a decent GPU
         | technically you can still do it, though it's much slower)
        
         | aetherson wrote:
         | Stable Diffusion is an open source image generation AI: it does
         | roughly the same thing that Dall-E and others do.
        
           | WalterBright wrote:
           | Thanks (to you and the others who helpfully replied).
           | Apparently the web page authors kinda missed that completely!
        
         | ayewo wrote:
         | Stable Diffusion is an AI model for computer generated art.
         | 
         | https://github.com/CompVis/stable-diffusion
        
       | etaioinshrdlu wrote:
       | OpenAI already behaves like big tech. Nothing gets done or spoken
       | without PR, marketing, and lawyers having their say. Not a great
       | signal for their future trajectory.
        
         | Workaccount2 wrote:
         | To me what makes them like big tech is way overcharging for
         | something that has a nominal cost to them, on the basis of
         | "value produced".
        
           | bogwog wrote:
           | How do you build a business if you sell your
           | products/services at cost?
        
             | Workaccount2 wrote:
             | The cost of everything tech is grossly inflated. Look at
             | Amazon, they make an absolute killing with AWS. It's just
             | fat margins all over. Google and MS price competitively
             | with AWS and happily make a killing too.
             | 
             | Sooner or later though, someone is going to come along and
             | say "You know, I'd be fine with a 5% profit margin" and the
             | house of cards will fall while the tech bros cry "value"
             | the whole way down. You could trick yourself into thinking
             | a sharpie is worth $40/mo if you drink enough of the
             | "value" Kool-Aid.
        
           | stingraycharles wrote:
           | Value-based pricing really is something much bigger than just
           | "big tech", and is actually something every business owner
           | should do. I'm surprised to find a comment against value
           | based pricing on HN to be honest.
        
             | Workaccount2 wrote:
             | Imagine plumbers used value based pricing.
        
       | hiidrew wrote:
       | I find it somewhat humorous based on their name that they're
       | losing exposure to Stable Diffusion, an actual open source
       | alternative. Although they created a lot of hype with their
       | waitlist strategy. I'm personally partial to Midjourney, the
       | discord UI is weird but once you get used to it, just looking at
       | others' photos is equally as impressive as creating your own.
       | 
       | Anyways, this space seems to be moving so quickly that it's
       | difficult to keep up anyway.
        
       | SanderNL wrote:
       | I was on the waiting list for a long, long time. The waiting and
       | their "safety" features left a bad taste. Now I can't even be
       | bothered.
       | 
       | Thanks Stable Diffusion!
        
       | not2b wrote:
       | Just got
       | 
       | "We're experiencing a temporary issue with signups due to a
       | vendor outage. We apologize for the inconvenience!"
        
       | savant_penguin wrote:
       | "Competition is a bitch" - OpenAI, probably
        
       | Workaccount2 wrote:
       | The divide between SD and DALLE doesn't matter for anything
       | meaningful. The tech is advancing at a breakneck speed. I really
       | cannot emphasize that enough. If SD is behind today, I wouldn't
       | even be the least bit surprised if it's ahead in one month from
       | now. Or be where DALL-E is today and DALLE just be that much
       | further.
       | 
       | I suspect that even in 6 months from now people will be starting
       | to see consistently good generation from "worse" prompting. The
       | cat is out of the bag on this and running way faster than
       | anticipated. Hold on, because the "AI generating fake media"
       | thought experiment of the last decade has now officially gone
       | live.
        
       | Patrol8394 wrote:
       | Trying to signup, but they ask for the phone number ... no thank
       | you
        
       | whywhywhywhy wrote:
       | It's just too late; no idea why anyone in their right mind would
       | use a pay-per-image tool that modifies what you put into it when
       | they can have an unlimited and open local tool.
       | 
       | Turns out all that "waitlist", the ethics lecturing, letting in
       | only bluechecks and the larping about how dangerous it is doomed
       | your product in the end.
       | 
       | Hopefully the next time someone makes a tool as revolutionary as
       | this they'll remember the mistakes of OpenAI.
        
         | tucif wrote:
         | Is it really doomed because competition showed up? I'd be
         | surprised if they didn't get quite a bit of paying users right
         | away.
         | 
         | I get that competing with an open self-hosted alternative is a
         | tough sell, but is this really different from other pay vs
         | self-host scenarios?
        
           | whywhywhywhy wrote:
           | Try them both and see for yourself, see how generating 1000
           | images via SD feels vs 1000 images on Dall-E 2.
           | 
           | I'd bet one you'll hit 1000 generations much faster than the
           | other.
        
         | langitbiru wrote:
         | "an unlimited and open local tool" -> Not everyone has GPU or
         | powerful machines.
        
           | zaptrem wrote:
           | It has been optimized to the point where it now runs (albeit
           | slowly) on _four year old smartphones_
           | https://twitter.com/wattmaller1/status/1573768941096374274
        
             | LanternLight83 wrote:
             | Anecdotally, it runs twice as fast and with 20% lower VRAM
             | use on my machines than it did when I first experimented
             | with it in the first week after release. I no longer need
             | to hyper-optimize memory use (as a layman!) and patch that
             | optimized model into web-frontends that don't come with it
             | to get it to run on a 4gb card, there are flags now that
             | make it "Just Work"(tm). Down from 90s/512x512 img -> ~35s.
             | Excited to try it out on some AMD APUs when I've got time
             | to see if that can outperform my dedicated but ancient 4gb
             | card.
        
           | bestcoder69 wrote:
           | You can use it locally or hosted. The hosted services are all
           | cheaper than dalle2. It runs on google colab's free tier.
           | 
           | Also, you can run it locally on a non-powerful machine. It
           | just takes longer, but you can also just queue up as many
           | prompts as you want and let your machine crank them out at
           | its own pace. I use a first-gen macbook air m1 and it usually
           | takes ~90s to generate an image with my usual settings.
        
           | naillo wrote:
           | https://beta.dreamstudio.ai/dream
        
             | thorum wrote:
             | And since SD is open you're not limited to DreamStudio,
             | there is a whole ecosystem of other webapps being developed
             | like:
             | 
             | https://dreamlike.art
             | 
             | https://patience.ai
             | 
             | https://getimg.ai
             | 
             | Plus integrations into other applications like Photoshop
             | and Canva. Open source has such a huge multiplier effect
             | for innovation.
        
           | whywhywhywhy wrote:
           | Artists, creatives and directors who benefit most from this
           | do.
           | 
           | For anyone else stability offer a paid web version.
        
           | Gigachad wrote:
           | It runs pretty well on my MacBook.
        
       | cptaj wrote:
       | It says it's not available in my country. Why the region lock?
        
       | vario wrote:
       | I was on that waiting list, and when I was finally invited, I
       | couldn't log in for unknown reasons--it asked me to apply for the
       | waitlist again. That was it for me--life is too short for tech
       | drama.
        
       | hugozap wrote:
       | I've tried to contact OpenAI and they never answer. I was excited
       | about them in the beginning but not anymore. Looks like they are
       | really disconnected from the community and not interested in
       | engaging.
        
       | thorum wrote:
       | > We are currently testing a DALL*E API with several customers
       | and are excited to soon offer it more broadly to developers and
       | businesses so they can build apps on this powerful system.
       | 
       | Here's the real news! Just hope they disable the autoban for API
       | users. It's one thing to filter NSFW outputs like SD websites do,
       | but blocking application API accounts for requests by end users
       | would make it unusable.
        
       | simonswords82 wrote:
       | Just tried to sign up and it says sign up is not possible due to
       | a vendor outage :(
        
       | asciimov wrote:
       | Well it's a shame they still want my phone number. I don't wanna
       | give some random company my number if they aren't gonna call me.
        
         | smallerfish wrote:
         | Plus they disallow voip, which is a problem for me since I only
         | have voip.
        
         | lazyjones wrote:
         | It's pretty evil to troll people into giving them e-mail and
         | name and then have the audacity to ask for a phone number.
         | Without giving people who don't want that the possibility to
         | delete the previously entered personal data...
        
         | O__________O wrote:
         | Agree.
         | 
         | Specifically, signup process is: email, email-verification,
         | create-password, full-name, phone. Leaving the process to try
         | the login will return you to the request for a phone.
        
           | londons_explore wrote:
           | The phone number is to try and stop people signing up
           | multiple times to get more free credits.
           | 
           | At least in the USA, getting hold of large numbers of phone
           | numbers for free isn't easy.
        
             | arecurrence wrote:
             | Indeed as VOIP numbers are banned. However, I know a number
             | of people that aren't using DALL-E because they only have
             | VOIP numbers (This is an ever growing reality today). They
             | got to the phone step and were unable to continue (and
             | support never replied to their requests for help).
        
             | mcbuilder wrote:
             | When you are competing directly with a comparable free
             | solution maybe these hoops don't make sense.
        
             | [deleted]
        
             | Dma54rhs wrote:
             | Starts from 1 cent a number, doesn't have to be from the USA. It
             | does add cost I agree, but these things are sold in bulk
             | for very cheap.
        
               | O__________O wrote:
               | Link?
               | 
               | Even for a one-time-use for verification, last I looked
               | for one-off number verifications, it was $1 per
               | authentication; to be fair, didn't search too hard.
        
         | pc86 wrote:
         | I don't want to give some random company my number _because
         | they might call me_.
        
       | dqpb wrote:
       | I just want to commend OpenAI for protecting us from the sight of
       | breasts. As we all know, breasts are some of the most dangerous
       | and corrupting things known to humankind. Unleashing upon the
       | world an advanced AI capable of rendering breasts would almost
       | certainly result in the complete collapse of civilization as we
       | know it.
       | 
       | However, President Raisi is very disappointed by OpenAI's brazen
       | and disgusting display of female hair. It's extremely insensitive
       | that OpenAI has enforced only a mere subset of the world's
       | cultural mores.
        
       | hit8run wrote:
       | F$ Microsoft and their shady move to lock openai. To be honest I
       | am not interested in their offering. Yes they were fast. But
       | there are also many true open source models available now. I
       | don't want Microsoft to decide what I am allowed to do with it.
       | Also they hide usable features behind a very expensive paywall
       | and limit the budget to something like 25$ per month LOL. So
       | actually they are like: "Here is a glimpse of it. But build
       | nothing with it for broader audience. We want to see what you
       | come up with and then copy paste it into our own products."
        
       | karmasimida wrote:
       | With Stable Diffusion fully open sourced.
       | 
       | Honestly, I don't think I need DALL-E right now, as SD is free
       | and MUCH MUCH more customizable.
        
       | stephc_int13 wrote:
       | Not interested anymore, thank you for nothing.
        
       | nojvek wrote:
       | DALL.E == MS Internet Explorer, StableDiffusion == Chromium.
       | 
       | Hard to beat a high quality open source product. OpenAI missed
       | the boat on "Open AI"
        
       | mcherm wrote:
       | It's unfortunate that they rejected my attempt to create an
       | account because my phone number wasn't from one of their standard
       | providers.
        
         | lenwood wrote:
         | Why do they require a phone number to begin with?
        
         | Bluecobra wrote:
         | Same here, I have a Google Voice number and it rejected it.
         | Pretty lame, it's not like I am going to be banking with them.
        
           | peanuty1 wrote:
           | That sucks, I only have a Google Voice number.
        
       | m3kw9 wrote:
       | Now that stable diffusion is starting to eat into its market
       | share they unlock the artificial supply mechanism
        
       | toxik wrote:
       | It's a funny world where OPEN AI lost the battle because... it
       | wasn't open.
        
         | kwertyoowiyop wrote:
         | Crazy talk. Next you'll be saying Human Resources isn't about
         | protecting employees!
        
           | evouga wrote:
           | Of course not. It's about managing the company's... human
           | resources.
        
             | moffkalast wrote:
             | And executives are the people you execute if a company goes
             | bankrupt.
        
       | soared wrote:
       | > We're experiencing a temporary issue with signups due to a
       | vendor outage. We apologize for the inconvenience!
       | 
       | I can't imagine how many signups they're getting right now. It
       | would almost behoove them to pregenerate a bunch of prompts and give
       | people a sandbox version without signing in.
        
       | petarb wrote:
       | I wasn't able to create an account either, didn't say why just
       | gave me an error. Disappointing after I gave them all the info
       | they wanted. Stable Diffusion and Midjourney it is
        
       | a3w wrote:
       | I gave them my phone number, only then the signup failed. Well
       | thanks for nothing, that is on par with scammers.
        
       | pentagrama wrote:
       | Just signed up and phone number verification is mandatory to use
       | the service. Don't want to share my phone number with this
       | service. Now I'm stuck on the phone number form and can't delete
       | my account. Don't recommend.
        
       | davidbarker wrote:
       | Apparently sign ups are currently down because of a "vendor
       | outage". Perhaps because of the increased load?
       | 
       | Once you're able to get access, I believe you'll receive 15
       | credits per month for free. Each credit allows one generation,
       | and each generation produces 4 images.
       | 
       | Rather than using up credits trying to learn how to formulate
       | your prompts, I ran hundreds and uploaded them to
       | https://generrated.com (I posted it a couple of weeks ago as a
       | Show HN) -- hopefully they might be useful as a starting point
       | and save you some credits/money.
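
The free-tier arithmetic in the comment above can be worked through in a few lines. This is only a sketch: the figures (15 credits per month, one generation per credit, 4 images per generation) are the commenter's own hedged recollection, and the function name is made up for illustration.

```python
# Figures as recalled in the comment above ("I believe"); not verified here.
FREE_CREDITS_PER_MONTH = 15
IMAGES_PER_GENERATION = 4  # one credit buys one generation


def free_images_per_month(credits=FREE_CREDITS_PER_MONTH,
                          images_per_generation=IMAGES_PER_GENERATION):
    """Images a free account could produce in a month under these figures."""
    return credits * images_per_generation


print(free_images_per_month())  # 15 * 4 = 60
```

Under those assumptions a free account gets roughly 60 images a month, which is why spending credits on prompt experiments adds up quickly.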
        
         | johndough wrote:
         | Thank you for the compilation. Much appreciated! I just wanted
         | to mention that the images in the row labeled "pen and ink
         | caricature" do not load.
        
           | davidbarker wrote:
           | Oops. Thanks! I'm not sure how I missed that. I've pushed a
           | fix that should be live in 10 minutes.
        
         | mFixman wrote:
         | This is a fantastic project.
         | 
         | I really liked the 16th century Indian painting of an astronaut
         | [1], and now I see a role of programs like DALL-E in giving
         | people a good intuition on how to identify art from different
         | styles and periods.
         | 
         | [1]
         | https://generrated.com/?prompt=16thCenturyPainting&subject=a...
        
           | davidbarker wrote:
           | Thank you for your kind words. I've learnt a lot about art
           | styles while I've been putting the site together -- both from
           | the DALL-E 2 generations and research I've done to make sure
           | those generations at least somewhat match the style I
           | requested.
           | 
           | I'm not sure if you know (it might not be obvious -- my
           | fault!) but you can click on a prompt heading to see all 20
           | images created in one place, if that's more useful to you.
           | e.g. https://generrated.com/prompts/16thCenturyPainting
        
         | Bluecobra wrote:
         | This is pretty neat! Can you also use DALL-E mini/Craiyon to
         | help with prompts or is that going to be way off?
        
           | davidbarker wrote:
           | Thanks! I actually don't have much experience with DALL-E
           | mini/Craiyon, but I have heard people have taken the output
           | of Craiyon and used it as the input for Stable Diffusion to
           | improve the quality.
           | 
           | I've been doing something similar with some DALL-E 2 images:
           | https://news.ycombinator.com/item?id=33011336
        
         | jaynetics wrote:
         | Some more artists that often seem to yield interesting results:
         | Giger, Klee, Klimt.
        
         | daqhris wrote:
         | Oh, for sure, they will be useful to me. I got access to DALL-E
         | shortly before summer but I haven't played with it due to
         | "prompts and credits" scheme. Thank you!
        
         | dereg wrote:
         | This is great. I really love all my Matisse generations. On the
         | other hand, I find DALL-E to be uniformly bad at recreating
         | Edward Hopper. Almost none of the generations capture the
         | spirit of his work. It's especially obvious when you run the
         | same Hopper prompts on Stable Diffusion. I wonder why.
        
       | IceWreck wrote:
       | Yeah, no.
       | 
       | Too late and stable diffusion works on my machine, I don't have
       | to depend on anyone else to use it unlike this.
        
       | loufe wrote:
       | I got early access months ago and have been loving it. I had so
       | much fun thinking of stupid prompts with colleagues and their
       | results hang randomly around the office. I made an effort to
       | start making birthday/other holiday cards for my family using
       | Dall-E by telling stories in an art style, like old oil paintings
       | or pencil drawings, which has been a huge success. I'm sold,
       | though I do agree the pricing is a tad steep especially
       | considering the presence of competitors.
        
       | langitbiru wrote:
       | I wonder how people behind Imagen feel about this. Stable
       | Diffusion is open source. DALL-E is open to the public (albeit with
       | some limitations).
       | 
       | https://imagen.research.google/
        
         | gorkish wrote:
         | Cynical take: When has Google ever cared whether or not anyone
         | else can play with their toys? Plus it's less likely to be
         | cancelled if it's never turned into a product.
        
       | barbariangrunge wrote:
        
       | jobs_throwaway wrote:
       | Even if Dall-E was free, the "safety" filters make it a
       | non-starter for me. It's totally asinine to put any kind of
       | content filters on a creativity tool.
        
         | hbn wrote:
         | They also mess with your prompts in an attempt to make them
         | more diverse by inserting words like "black" into prompts. e.g.
         | if you type "person in an office" it'll generate some images
         | from the prompt "black person in an office"
         | 
         | People were able to discover this by typing a prompt, something
         | like "person holding a sign that says" and it would output
         | pictures of people holding signs that just say the word
         | "black", revealing that it was actually generating images from
         | the prompt "person holding a sign that says black"
         | 
         | https://twitter.com/rzhang88/status/1549472829304741888?t=R4...
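
The sign trick described above can be illustrated with a toy model. This is a sketch under a loud assumption: it supposes the service naively appends a hidden word to the end of the prompt, which is only the commenter's inference and may not match what OpenAI actually does. The functions `augment_prompt` and `leaked_term` are hypothetical names, not part of any real API.

```python
def augment_prompt(prompt, hidden_term="black"):
    """Assumed behaviour: append a hidden term to the user's prompt."""
    return f"{prompt} {hidden_term}"


def leaked_term(user_prompt, effective_prompt):
    """Return whatever text was added after the user's own prompt."""
    if not effective_prompt.startswith(user_prompt):
        raise ValueError("effective prompt does not extend the user prompt")
    return effective_prompt[len(user_prompt):].strip()


user_prompt = "person holding a sign that says"
effective = augment_prompt(user_prompt)
print(effective)                            # what the model is assumed to see
print(leaked_term(user_prompt, effective))  # the word that ends up on the sign
```

Because the user's prompt ends with "says", any appended word becomes the text the model is asked to render on the sign, which is how an append-style modification would surface in the images.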
        
           | astrange wrote:
           | I doubt it literally just appends words, it probably gets
           | processed with a GPT prompt like "take this string and make
           | it specifically mention X ethnicity".
        
       ___________________________________________________________________
       (page generated 2022-09-28 23:00 UTC)