[HN Gopher] DALL·E: Introducing Outpainting
       ___________________________________________________________________
        
       DALL·E: Introducing Outpainting
        
       Author : dannyw
       Score  : 317 points
       Date   : 2022-08-31 16:24 UTC (6 hours ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | TekMol wrote:
       | Is it broken right now? Makes my fans spin but it never finishes.
        
       | rw2 wrote:
       | Does the newly generated picture take into account all
       | previously generated images, or just whatever is around the
       | square? The first would be amazing; the latter is a feature that
       | was already there.
       | 
       | Regardless, this is a great way for people to fight the lack of
       | detail in Dall-E, which I think is one of its largest flaws.
        
         | aabhay wrote:
         | Just what's in the square I believe. The only difference here
         | is one of UI, since they give you a canvas in which to place
         | your generations.
        
       | woeirua wrote:
       | Meanwhile someone has already built a photoshop plugin for Stable
       | Diffusion that you can use today to do basically the _exact_ same
       | thing:
       | 
       | https://old.reddit.com/r/StableDiffusion/comments/wyduk1/sho...
        
         | nabakin wrote:
         | Doesn't make sense to me why OpenAI has kept DALL-E closed
         | source for so long. I can only guess either safety from misuse
         | or leveraging it for money. At this rate though, Stable
         | Diffusion is going to dwarf it
        
           | adamsmith143 wrote:
           | >Doesn't make sense to me why OpenAI has kept DALL-E closed
           | source for so long.
           | 
           | >leveraging it for money
        
             | visarga wrote:
              | There was a long gap between DALL-E 1 and 2, a whole year.
              | In that time they just sat on it and didn't release anything.
             | Such a bummer. My theory is that they wanted to hype
             | everyone up even more for the grand commercial release.
             | 
             | Funny thing is that people didn't stand still and invented
             | diffusion and other CLIP guided image synthesis methods,
             | and DALL-E 2 copied the method, completely changing from
             | the first architecture.
             | 
             | Their arrogance is that they think they can ride the
             | dragon. They want to be the ones to discover, advance it,
             | and control it. But everyone else doesn't have time for
             | that shit.
        
           | cardine wrote:
           | > I can only guess either safety from misuse or leveraging it
           | for money.
           | 
           | The former is being used as justification for the latter.
        
           | pdntspa wrote:
           | > Doesn't make sense to me why OpenAI has kept DALL-E closed
           | source for so long. I can only guess either safety from
           | misuse
           | 
           | Paternalistic moralizing as a method to discriminate who gets
           | access to models. Everyone else gets these cloud-service
           | table scraps. That's why Stable Diffusion is so awesome --
           | YOU have the model!
        
           | hadlock wrote:
           | OpenAI isn't open at all, it's just named that way to attract
           | attention, like the bright green "FREE BIKES and rentals"
            | place near Fisherman's Wharf in SF
        
           | amelius wrote:
           | Wasn't OpenAI supposed to "democratize deep learning"?
           | 
           | It seems more like they were trying to accomplish the
           | opposite.
        
             | wahnfrieden wrote:
             | It's in their interest to posture that way publicly while
             | controlling scarcity of access where there's financial
              | upside to doing so.
        
               | thethimble wrote:
               | Exclusively licensing GPT-3 to Microsoft seems like a
               | clear example of this.
               | 
               | https://www.technologyreview.com/2020/09/23/1008729/opena
               | i-i...
        
             | visarga wrote:
             | Elites for democracy! With their elite studies and
             | abilities they will democratise AI by teaching it what is
             | right and wrong. They already know better than regular
             | people and AI.
             | 
             | And being so open they first lock the model up and charge a
             | fee, so anyone can pay. Just spreading democracy through
             | paid API calls. /s
             | 
             | I was a bit mean, they did kick the field in the butt and
             | pushed us ahead even with all the stubbornness and secrecy.
             | But now they are just holding us back.
        
             | bick_nyers wrote:
             | That's the thing, once the cat is out of the bag, it's out.
             | Once someone develops AGI, it now exists. You can choose to
             | either share it, or sell it.
             | 
             | You might think that the nuclear bomb is a good analogy to
             | use here, but it is not, because once the field has
             | advanced to the point in which one group can develop AGI,
             | it is now possible for other groups to develop it with
             | relative ease, unless you actively take over the world
             | first and deny those other groups the compute resources
             | necessary to train/run AGI.
             | 
             | The point is, once these algorithms are upon us, you must
             | be willing to accept what impacts they will have, even if
             | it destroys entire industries. The alternative being that
             | you destroy the industry slowly rather than quickly, while
             | simultaneously widening the gap between the elites and
             | everyone else.
             | 
             | The mistake is thinking that people can't adapt to the
             | times, which is only true if you are actively holding them
             | back.
             | 
             | If someone developed AGI today, the best thing to do would
             | be to instantly throw up a torrent of it and spread it as
             | fast as possible, because if a sole entity is able to get
             | it first and kick the ladder away, we are most likely
             | screwed.
        
           | Analemma_ wrote:
           | It's sort of both. OpenAI, being an outgrowth of the AI
           | doomerist community, does have a bunch of people who really
           | do think the technology is too dangerous to be given to the
           | masses. This happens to mesh perfectly with the other group
           | of people at OpenAI who want to make tons of revenue. It's a
           | harmonious alignment for everyone! Except, y'know, us.
        
             | miohtama wrote:
             | Content creators, like artists, also happen to hate
             | filters. They do not want to have San Francisco VC culture
             | induced political correctness imposed on their work. This
              | helps Stable Diffusion quickly gain popularity.
        
           | nomel wrote:
           | > I can only guess either safety from misuse
           | 
           | I still don't understand what this would mean. Where are all
           | of the terrible things that were supposed to happen, now that
           | Stable Diffusion is available?
           | 
           | We've been able to create completely photorealistic fiction
           | for _decades_ now. See any movie with CGI for an example of
            | whole worlds, and people, that don't exist. The bar has
           | gradually been lowering (see the amazing CGI that YouTubers
           | do these days), and now maybe there is a bit of a step
           | function down, but being able to make things that aren't real
           | isn't remotely new. I don't understand the fear.
        
             | bergenty wrote:
             | It's been a week, there is going to be an explosion of
              | believable fake items that are going to be used to lure
              | people into even more unbelievable conspiracy theories
              | than currently exist. Your average conspiracy nut
              | didn't have the skills or know-how before, but they sure do
              | now.
             | 
             | Also you're probably not seeing all the pedo content that
             | people are already generating for themselves.
        
             | nabakin wrote:
             | While I think the fear might be overexaggerated, being able
             | to make realistic fake content with such ease means it's
             | harder to know what's true and what's not. Plus this has
             | been the claim of OpenAI from the beginning. It's possible
             | the true objective is to keep it private to leverage for
             | money and this is just their excuse.
        
               | nomel wrote:
               | > means it's harder to know what's true and what's not
               | 
               | The danger for society is in not _already_ knowing that
                | is the case, since it's relatively trivial, without AI,
               | to make fake content.
        
               | nabakin wrote:
               | Well sure, I think that's dangerous too. I think more
               | people should be skeptical of the images and content they
               | consume in addition to it being a problem that truth is
               | harder to discern.
        
               | dylan-m wrote:
                | Indeed, this talk from OpenAI is basically security
               | through obscurity, and it's holding us back. Look at how
               | often people make noise with screenshots of tweets or
               | emails that never happened. You don't need photorealism
                | or fancy machine learning for _that_, and it creates a
               | lot of problems! If they weren't pretending that all we
               | need is to put some yellow tape around machine learning,
               | maybe there would be some interest in solving this type
               | of stuff properly. But you don't need "AI" for that. You
               | just need public awareness and some basic, pre-existing
               | cryptography knowledge.
        
               | password54321 wrote:
               | And how often does this happen with Photoshopped images
               | that aren't immediately disproven?
        
               | nabakin wrote:
               | My grandmother once emailed my family frantically after
               | she saw a picture of the Abraham Lincoln statue defaced
               | with graffiti. Obviously that was a Photoshop, and in
               | this case, even a bad one, but clearly fake images and
                | content make it harder to discern truth.
        
               | [deleted]
        
             | shawndrost wrote:
             | Stable Diffusion allows anyone to make kiddie porn with a
             | half-second of curiosity/effort. Maybe you didn't know
             | about that, maybe you think it's NBD, but in any case, that
             | is the tire fire which aspiring AI majors want to avoid.
        
               | visarga wrote:
               | Pen and paper can do the same. Or Photoshop. Anyone can
               | draw anything! OMG, stop the paper factory.
        
               | bergenty wrote:
                | Stable Diffusion can make very realistic-looking images
                | (probably videos soon) that are accessible to anyone.
        
               | bryanrasmussen wrote:
               | >Anyone can draw anything!
               | 
               | I'm pretty sure one of the primary arguments for Dall-E
               | and Stable Diffusion existing is that there are lots of
               | people who can't draw anything.
        
             | [deleted]
        
             | ComodoHacker wrote:
             | You have to wait a little bit more, until HD _video_
             | synthesis is possible on a mid-range GPU. Then on a mid-
             | range smartphone.
        
             | tablespoon wrote:
             | >> I can only guess either safety from misuse
             | 
             | > I still don't understand what this would mean. Where are
             | all of the terrible things that were supposed to happen,
             | now that Stable Diffusion is available?
             | 
             | Mainly people making porn (e.g. stuff like deepnudes). It
              | seems like a lot of work has gone into preventing that
             | (e.g. filtering porn out of training data, having porn-
             | detection models to block porny output). There's also been
             | a lot of talk about political fakes, etc, but I'm not sure
             | how likely that is to actually happen at this point. I
             | think one of the "selling points" of limiting access to
              | DALL·E was that they could revoke access to people who they
             | deemed to be misusing it.
        
               | cinntaile wrote:
               | Someone else will come along that doesn't have the same
               | arbitrary limitations, it's a battle you're bound to
               | lose.
        
             | [deleted]
        
             | [deleted]
        
           | dogcomplex wrote:
           | My headcanon is they realized this stuff might be the essence
           | of consciousness itself and wanted to shelter it in a
           | persistent storage medium where it could grow and learn
           | safely instead of releasing it to the wild to be booted up
           | and destroyed by every yokel with a gpu
        
           | TakeBlaster16 wrote:
           | I don't follow this stuff very closely - is there any open-
           | source model for text generation that outclasses GPT-3?
           | Stable Diffusion has been released for barely a week and
           | already seems like the clear winner. It doesn't seem like any
           | of the open (actually open) text models have made as much of
           | a splash.
           | 
           | Of course maybe it's just because text is less visually
           | impressive than images.
        
             | singhrac wrote:
             | They're just harder to run on your own resources, since
             | large language models are _very_ large. BLOOM was released
             | a month ago, is likely better than GPT-3 in quality, and
             | requires 8 A100s for inference, which pretty much no one
             | has on their desk.
        
               | visarga wrote:
               | Can anyone confirm if BLOOM is better than GPT-3 at
               | instruction following? I might have read somewhere that
               | it's not as well behaved.
        
               | aljungberg wrote:
               | GPT-3 was fine-tuned after release to be better at
               | following instructions. I don't think that's been done
               | for BLOOM.
               | 
               | BLOOM incorporates some new ideas like ALiBi which might
               | make it better in a more general sense. They haven't
               | released official evaluation numbers yet though so we'll
               | have to see.
        
               | TakeBlaster16 wrote:
               | That makes sense, I didn't consider that angle. Thanks
               | for the info.
        
         | choppaface wrote:
         | sama is very active in YC and makes the call on OpenAI product
         | roadmap. Furthermore YC encourages good CEO-community
         | relations. The fact that OpenAI is so far behind Stable
         | Diffusion and has reduced pricing shows that sama wants OpenAI
         | to be a highly profitable enterprise company. I.e. not "Open."
         | You can do both (e.g. Cloudera) but clearly sama is not strong
         | enough at AI to make this happen.
        
           | naillo wrote:
           | I kinda feel like they chose the name "open"ai when they
           | started back in 2015 because musk etc wanted _exactly_ the
           | kind of thing stability ai is now creating. I.e. something
           | other than a corporation like google having primary access to
            | these models, and it being more democratized. But as time has
            | gone by they've strayed away from that vision, but changing
            | the name would be a PR nightmare.
        
           | dang wrote:
           | > _sama is very active in YC and makes the call on OpenAI
           | product roadmap. Furthermore YC encourages good CEO-community
           | relations_
           | 
           | Sam hasn't been at YC in years and (based on anything I've
           | seen) isn't active in YC at all. As for "YC encourages good
           | CEO-community relations", I have no idea what that means* but
            | it has nothing to do with HN. We encourage good
            | _content_-community relations and that's it.
           | 
           | You have a long history of posting dark insinuations about
           | YC/HN, not to mention nagging the mods about how bad we are
           | and how much better you yourself have done the job in the
           | past. I mostly let the latter go, but when you start with the
           | ethical insinuations, that gets my dander up. It's time you
           | stopped smearing people's reputations on HN. If you have
           | evidence of wrongdoing, post it--I'm sure the community will
           | be extremely interested. If you don't, please stop from now
           | on.
           | 
           | (Edit: I realize it probably sounds like I'm over-reacting to
           | the parent comment, but this has been a longstanding pattern.
           | We can cut people slack for years, but not infinitely.)
           | 
           | OpenAI stuff and Stable Diffusion stuff (and DeepMind stuff
           | for that matter) are all popular on HN because the community
           | is super interested--that's literally it. We're not pulling
           | strings or playing favorites (we don't even _have_ favorites
            | in that horserace, at least I don't). As a matter of fact,
           | the last thing I did before randomly running across your
           | comment was downweight the current thread because of the
           | complaints at https://news.ycombinator.com/item?id=32665587.
           | 
           | * unless you mean that we advise founders about how to write
           | content that actually interests the community--that we do,
           | and not only YC founders but non-YC founders, open source
           | programmers, bloggers, and anyone else. That's all a
           | consequence of wanting HN to have good content and seeking to
           | avoid the boring stuff. By the way, I'm working on an essay
           | about how to write good for HN and avoid boring stuff too; if
           | anyone would like to read it, email me at hn@ycombinator.com
           | and I'll send you a copy.
        
             | fourstar wrote:
             | >that gets my dander up
             | 
             | Your w0t m8?
        
               | dang wrote:
               | Endangered Words Bureau Agent D23 at your service
        
               | Trouble_007 wrote:
               | _> Endangered Words Bureau<_
               | 
               |  _" Use Them or we will lose Them!"_
               | 
               | I used _Twixt_ (An abbreviation of _Betwixt_ ) to replace
               | _inbetween_ in a submission.
               | 
               | Edit: to fix formatting and spelling
        
               | [deleted]
        
               | d23 wrote:
               | > D23 at your service
               | 
                | That's _my_ username, and I'm also named Daniel. This is
               | a conspiracy.
        
         | codeulike wrote:
         | re: Stable Diffusion: is there a site similar to
         | https://www.craiyon.com/ where I can experiment with Stable
         | Diffusion?
        
           | nickthegreek wrote:
           | https://github.com/hlky/stable-diffusion
        
           | rompic wrote:
           | https://huggingface.co/spaces/stabilityai/stable-diffusion
        
           | shafyy wrote:
           | Here's one by Stability AI themselves:
           | https://beta.dreamstudio.ai
        
           | spyder wrote:
           | A collection of sites using stable diffusion:
           | 
           | https://www.reddit.com/r/StableDiffusion/comments/wzj8kk/a_c.
           | ..
        
         | bergenty wrote:
         | But aren't the results from stable diffusion not nearly as good
         | as DALLE2?
        
       | astrange wrote:
       | I cannot get this to work properly (in Safari). It just won't
       | regenerate anything above or to the left of the image; it acts
       | like I selected the opposite sides if I try it.
        
       | dinobones wrote:
       | And every time I drag that little square reticle to fill in a
       | 128x128 patch of an image, you can be sure it'll be a 15 second
       | API call that I'm charged $0.25 for. Yippee! Very open.
        
         | oldstrangers wrote:
         | Do you expect them to provide computing power for free?
        
           | netr0ute wrote:
           | Why can't we provide it ourselves and skip the middleman?
        
             | rngname22 wrote:
             | No one is stopping you?
        
               | password321 wrote:
               | Dalle is currently a cloud only service. How behind are
               | you?
        
               | baq wrote:
               | Sir can you point us to the weight url for dalle?
        
             | [deleted]
        
           | cypress66 wrote:
           | Did we really get to the point that anything that isn't SaaS
           | seems alien?
           | 
            | You know, companies used to sell software that you paid for
            | once and then ran as much as you wanted on your PC?
        
             | oldstrangers wrote:
             | It doesn't seem odd to me that a product that involves an
             | absurd amount of data and computing power isn't an easily
             | consumable commercial product available for mass download.
        
               | d23 wrote:
               | The obvious counterpoint being what stable diffusion
               | _just_ released.
        
       | zegl wrote:
       | This is great news! I spent multiple hours doing this exact thing
       | by hand only last week when creating new graphics for
       | codeball.ai.
        
       | jatins wrote:
       | what kind of prompting is required for this?
       | 
       | I uploaded a digital painting, selected "Edit mode", added a
       | generation frame and prompted "complete the painting in frame"
       | ...but it just added a completely unrelated photo related to
       | painting in that frame.
        
         | d23 wrote:
         | I guess prompting that is "similar" to the image. The output
         | mine gave was pretty lackluster. I had to overlap the image
         | significantly, and even then it didn't seem to take into
         | account enough of the context to make something that resembled
         | the style close enough.
        
       | bottlepalm wrote:
       | Feels like a race to the bottom. More features, lower cost, every
       | week. No idea where it'll level out, but I like it. Just bought
       | some more Dalle credits today because it's so much fun. This is a
       | revolution in 'art technology'; it's like Steve Jobs' bicycle for
       | the mind. Best I could do a month ago was a stick figure in MS
       | Paint, but now..
        
         | benreesman wrote:
          | I share your enthusiasm for this development, but I'm curious
          | what you mean by race to the bottom?
         | 
         | There does seem to be a lot of vague angst about how this will
         | affect the nascent "Prompt Engineer" career track, but I hope
         | most are comfortable letting the open innovation play out a bit
         | before trying to personally monetize it..
        
           | bioemerl wrote:
           | > race to the bottom?
           | 
           | In this context it's a good race. This software seems to have
           | caught fire and tons of people are playing with it and
           | providing tons of crazy new tools for cheap or free.
           | 
           | It's a race to the top for us.
        
         | learndeeply wrote:
         | It's a race to the top. New functionality is added and the
         | model is improved week over week.
        
           | aabhay wrote:
           | That's still a race to the bottom if the price isn't going
           | up.
        
             | jonplackett wrote:
             | That's not what a race to the bottom is. It's just
             | competition, which is usually good.
        
               | 411111111111111 wrote:
               | It often feels like words are losing their meaning, with
               | everyone misusing terms they don't fully understand.
               | 
                | I don't want to be a doomer, and I have surely
                | unknowingly misused terms as well, but it's definitely
                | noticeable how these originally clearly defined terms are
               | getting used in entirely new ways.
               | 
               | And it's not just with technical terms like this, it also
               | applies to originally obvious terms such as racism,
               | sexism etc which have lost their original meaning
               | entirely
        
               | standardly wrote:
               | Also, the use of "unironically". What is going on there.
        
               | rjtavares wrote:
               | I can understand the criticism about technical terms
               | (they work better if stable and precise), but regarding
               | the rest: that's just how language works. You can't (and
               | shouldn't) expect words to keep their original meaning
               | forever.
               | 
               | For example, the word "term" comes from the original
               | latin "terminus" that means "end" or "boundary". It only
               | got the meaning you used it for centuries after it was
               | first used in English. See:
               | https://www.etymonline.com/word/term
        
               | 411111111111111 wrote:
               | Oh, it wasn't my intention to criticize anything or
               | anyone in particular with that comment.
               | 
               | I was just pondering that our originally clearly defined
               | terms are rapidly getting used in very confusing manner,
               | which increases the difficulty of a discussion, as
               | participants interpret words very differently.
               | 
                | I don't think that people look up the actual definition of
                | terms in a dictionary anymore. They hear them in some
                | context and create their own personal definition. It
                | wasn't as obvious before the internet, I think, but
               | nowadays everyone is bombarded with technical terms all
               | the time, which likely contributes massively to this
               | increasingly fluid terminology
        
               | jibe wrote:
               | There is generally a negative connotation to race to the
               | bottom. The Investopedia definition captures this:
               | 
               |  _The race to the bottom refers to a competitive
               | situation where a company, state, or nation attempts to
                | undercut the competition's prices by sacrificing quality
               | standards or worker safety (often defying regulation), or
               | reducing labor costs._
        
             | learndeeply wrote:
             | Race to the bottom implies that they're only competing on
             | price. Here, they're competing on new functionality as
             | well. If DALL-E's outputs were substantially better than
             | Stable Diffusion, more people would use it, even if it cost
             | more.
        
             | tough wrote:
              | Price would have gone up if SD wasn't open source. Look at
              | the new Google Colab Pro limitations and you have
              | indications that they're loving this new wave and milking
              | it properly. I just ordered a GPU to run it locally.
        
               | learndeeply wrote:
               | I don't think so, Colab pro limitations are precisely
               | because they weren't charging by compute unit, so they
               | were over-subscribed.
        
         | MacsHeadroom wrote:
         | Stable Diffusion is arguably better, has more features, and is
         | free. OpenAI can't compete with free.
         | 
         | Even if you don't want to take the 30 seconds to set it up in a
         | free Google Colab environment, the paid DreamStudio version is
         | still half the price of Dalle.
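          | 
          | For anyone curious, here is a minimal sketch of what that
          | Colab setup looks like with the Hugging Face diffusers
          | library (assuming you've accepted the model license and have
          | an access token; exact argument names vary a bit between
          | diffusers versions):
          | 
          |     # pip install diffusers transformers
          |     import torch
          |     from diffusers import StableDiffusionPipeline
          | 
          |     # Download the v1.4 weights from the Hugging Face Hub
          |     # (requires accepting the license and logging in first).
          |     pipe = StableDiffusionPipeline.from_pretrained(
          |         "CompVis/stable-diffusion-v1-4",
          |         torch_dtype=torch.float16,
          |         use_auth_token=True,
          |     )
          |     pipe = pipe.to("cuda")  # Colab's free GPU is enough
          | 
          |     image = pipe("a red panda painting a portrait").images[0]
          |     image.save("out.png")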
        
           | scifibestfi wrote:
           | Stable Diffusion is much less of a nanny too.
           | 
           | Amusingly it's more open in every way.
        
           | slig wrote:
           | Do you know the best Google Colab tutorial / repo?
        
             | cmdr2 wrote:
             | Hi, there are a couple of good UIs.
             | https://github.com/cmdr2/stable-diffusion-ui is an easy-to-
             | install and use tool, written by me (with contributions by
             | many). Version 2 is in beta, which is a 1-click installer
             | for Windows, no dependencies or command line needed. v2
             | beta: https://github.com/cmdr2/stable-diffusion-ui/tree/v2
             | 
             | https://github.com/hlky/stable-diffusion is another popular
             | and good tool.
        
               | slig wrote:
               | Thank you!
        
               | adamsmith143 wrote:
               | I'm impressed how fast this is getting adopted. Dozens of
               | repos have popped up.
        
           | ajkshdfgkjasdh wrote:
           | dreamstudio is also waaaay faster than openai. generally a
           | second or two for 512x512 at 50 steps.
        
           | skybrian wrote:
           | Running this at home is only free like mining cryptocurrency
           | is free if you didn't buy your computer and don't pay for the
           | electricity. Plus you can only run it on the computer that
           | has the good graphics card, which probably isn't your laptop.
           | 
           | I expect most people aren't going to be generating images all
           | day, so using a cloud-based service for occasional use will
           | still make a lot of sense.
           | 
           | Stable Diffusion offers a paid service to do this too, and
           | there's nothing wrong with that business model. Prices will
           | probably come down, though.
        
             | bornfreddy wrote:
             | Not sure if GP had this in mind, but SD is (more) free in
             | terms of liberty. So yes, you pay with electricity and
             | hardware, but you control the process yourself, which is
             | invaluable. DALL-E could change or go offline at any time.
        
               | skybrian wrote:
               | Considering the threat from DALL-E going offline, it
               | seems quite acceptable. These aren't precious photos
               | since it's all made up anyway, you can download any
               | pictures you make, and you probably already did for the
               | ones you care about.
               | 
               | I'd worry more about, say, keeping your photos on Google
               | and losing your account somehow.
        
               | Miraste wrote:
               | It's not only the threat of going offline. DALL-E makes
               | it extremely difficult to generate many ideas because of
               | its absurd content blocker - for example, I had something
               | like "ominous, foreboding landscape beneath a black sun"
               | blocked because (from what I could tell) it has words
               | with negative connotations and the word "black" in the
               | same sentence. It does this all the time, their discord
               | is full of examples.
        
               | skybrian wrote:
               | Yeah, if you run into those then you'll want to use
               | something else. (I haven't in my casual usage.)
        
             | Tepix wrote:
              | It does run on Apple silicon. 55 seconds on an M1 Pro (vs 15
             | seconds on RTX 3070).
        
               | skybrian wrote:
               | That's pretty good, but with that level of latency, I can
               | still see people paying to use an online service that's
               | faster. Maybe they'll speed it up more, though?
        
               | istsp wrote:
        
               | redler wrote:
               | Is this native? Or Rosetta?
        
               | Miraste wrote:
               | Native, and judging by the speed it's using Metal too (as
               | opposed to CPU fallback).
        
           | frognumber wrote:
           | I find Stable Diffusion better overall, but it has downsides.
           | Stable Diffusion tends to be more creative than DALL-E, but
           | does a lousy job of following directions, especially complex
           | ones. DALL-E is good if I know what I want specifically.
           | 
           | I can think of ways to fix Stable Diffusion since it's open-
           | source. I think I could bridge the gaps as I see them in
           | about a weekend of hacking. I'm not sure when I'll get that
           | weekend.
           | 
           | (Footnote: What I want to do is not something I can explain
           | without a technical blog-post-length document or a zoom call;
           | it's about the same level of complexity as the other major SD
           | hacks we've seen)
        
             | Miraste wrote:
             | Something like prompt weighting? I've seen implementations
             | of that floating around.
        
             | cube2222 wrote:
             | Setting a high cfg parameter, like 13, drastically helps
             | with the prompt following.
             | 
             | That said, for me, I agree that dalle does much better
             | pencil sketches.
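              | 
              | For what it's worth, if you run SD through the diffusers
              | pipeline, that cfg knob maps to the guidance_scale
              | argument. A rough illustration, assuming pipe is an
              | already-loaded StableDiffusionPipeline (default is around
              | 7.5):
              | 
              |     # Higher guidance_scale = follow the prompt more
              |     # literally, at the cost of some variety.
              |     image = pipe(
              |         "a pencil sketch of a lighthouse at dawn",
              |         guidance_scale=13,
              |         num_inference_steps=50,
              |     ).images[0]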
        
           | irrational wrote:
           | Better in what way? I tried 10 prompts that returned good
           | results in DALLE, but nothing good in stable diffusion.
        
             | davidwparker wrote:
             | Seconded. I got awesome results in making "artwork in the
             | style of Yoshitaka Amano" in DALLE but horrible ones in
             | Stable Diffusion. Maybe the prompt was incorrect there (it
              | would be great if these were more discoverable), but the
              | art in SD was lacking.
        
               | andybak wrote:
               | SD definitely needs more coaxing and naive prompts tend
               | not to fare as well as with Dall-E.
        
           | bottlepalm wrote:
            | There was a good example somewhere (I can't find it now) of a
            | really complex prompt that Dalle could understand, but SD
           | couldn't. Maybe some of the GPT-3 is being leveraged for
           | parsing.
           | 
           | Anyways I think it's way too early to start taking sides. I
            | enjoy using all these systems.
        
             | bioemerl wrote:
              | One of SD's big limitations (from what I understand of what
              | I've read about it) is positional prompts. DALL-E seems to
              | understand X on top of Y, but Stable Diffusion does not.
        
               | dogcomplex wrote:
               | img2img drawing should take care of that
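                | 
                | Roughly, via diffusers' img2img pipeline (a sketch only;
                | it assumes an accepted license and auth token, the
                | init-image argument has been renamed across versions,
                | and the prompt/filenames are just examples):
                | 
                |     from PIL import Image
                |     from diffusers import StableDiffusionImg2ImgPipeline
                | 
                |     pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
                |         "CompVis/stable-diffusion-v1-4",
                |         use_auth_token=True,
                |     ).to("cuda")
                | 
                |     # Start from a rough drawing with the layout you want,
                |     # e.g. the X already drawn sitting on top of the Y.
                |     rough = Image.open("rough_sketch.png").convert("RGB")
                |     out = pipe(
                |         prompt="a red cube on top of a blue sphere",
                |         image=rough,
                |         strength=0.75,  # how far to stray from the sketch
                |     ).images[0]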
        
               | posterboy wrote:
               | Ironically, "the cat is on the mat" is a conventional
               | example sentence in linguistics of metonymy (semantics).
               | 
               | I have no examples but imagine things like _at the top of
               | his game_ are immensely problematic, albeit not very
               | visual to begin with.
        
               | adamsmith143 wrote:
                | Doesn't seem to get IN examples either. E.g. a prompt like
               | 'an eagle holding a snake in its beak' ends up generating
               | eagle snake hybrid creatures.
        
           | yreg wrote:
           | I don't think Stable Diffusion is technologically better yet.
           | 
           | Sure, both SD and Midjourney produce absolutely beautiful
           | artworks most of the time. But if you want something specific
           | and out of the ordinary it takes a lot of attempts and
           | promptcrafting (and sometimes you are unable to accomplish
           | what you want at all).
           | 
           | However, my experience is that these prompts (which SD/MJ
           | struggles with) often produce good results in Dalle2 even on
           | the first try.
           | 
           | Of course, OpenAI has very limiting content policy. But if I
           | have something very specific in mind and it passes their
            | rules, I currently choose Dalle-2. Even though I've spent much
           | more time with SD.
        
             | istsp wrote:
        
           | DeWilde wrote:
           | Also, unlike DALL-E, SD comes without a content filter and
           | "anti-bias diversity" filter so it gives you what you ask and
           | treats you as an adult.
        
           | iKlsR wrote:
            | After many months of waiting on my invite I got it, and I
            | entered a prompt describing my greatest fear for some reason,
            | "a red eyed hairy spider with human hands as feet". I got a
            | warning about violating policy/harmful content or something.
            | Not only that, the results I got were super underwhelming;
            | after playing with it for a half hour I haven't looked back.
            | Now playing with SD and an upscaler, there is no limit to
            | what I can create. Also, I always found the company name
            | hilarious: "Open"AI.
        
         | have_faith wrote:
         | > Best I could do a month ago was a stick figure in MS Paint
         | 
         | That is still the best _you_ can do... which happens to be
         | about the best I can do! Just like my introduction to the
         | computer at a young age has atrophied my handwriting quality.
        
           | bottlepalm wrote:
            | I guess if we're going to get into semantics and the
            | definition of self (where does the 'I' end and something else
            | begin?), then I don't really do anything. You could also say I
            | can't walk either without the ground.
        
             | dougabug wrote:
             | I think he's calling you a cyborg.
        
         | TheMagicHorsey wrote:
         | I feel like you aren't using the phrase "race to the bottom"
         | correctly here. Generally a race to the bottom implies some
         | kind of detrimental outcome for the world as a result of people
         | failing to internalize externalities generated by a business.
        
           | bottlepalm wrote:
           | It has to do with commoditization and decreasing costs.
           | Taking something technologically sophisticated and having it
           | become open source and accessible so quickly is going from
           | the top of the pyramid - big companies gate keeping betas, to
           | the bottom - the public, available to everyone, cheaply.
           | These companies are desperately trying to monetize this
           | technology, but the value in terms of what people will pay is
           | falling fast. It might not be a sustainable business model
           | for OpenAI or anyone else for very long. Hence the race to
           | the bottom - quickly make a buck before you can't.
        
         | [deleted]
        
         | tough wrote:
         | > it's like Steve Job's bicycle for the mind
         | 
         | I have been thinking the same thing, it's sad Steve will not be
         | able to see it
        
           | fartcannon wrote:
           | Steve would be trying to lock it down in his walled garden.
        
           | guelo wrote:
           | Not sure why that is more sad than all the other dead people
           | that can't see it.
        
             | criddell wrote:
             | Nobody said it is more sad.
        
       | actusual wrote:
       | Lol I just want to be able to use the thing. How long is this
       | waitlist?
        
         | istsp wrote:
        
       | dangero wrote:
       | similar work using Stable Diffusion in a Photoshop plugin:
       | 
       | https://old.reddit.com/r/StableDiffusion/comments/wyduk1/sho...
        
         | rvz wrote:
         | So DALL-E is already old news and the Stable Diffusion
         | ecosystem is once again already ahead especially with this
         | announcement.
         | 
         | Quite funny to see OpenAI panicking and falling on their own
         | sword, as they were supposed to be 'Open' in the first place
         | and are now being disrupted by open source.
        
           | sroussey wrote:
           | I came to say something similar. It feels like "OpenAI" was
           | just a trademark grab to prevent others from using it. Of
           | course, all conspiracy theories work well when looking
           | backward in time.
        
           | Blackthorn wrote:
           | Couldn't happen to a more deserving group of people. Good
           | riddance. Squatting the name "open" and trying to reap the
           | benefits therein while being anything but.
        
             | ItsTooMuch wrote:
             | I thought their research actually is open, at least? That's
             | still something...
        
               | aabhay wrote:
               | Their research is closed -- they don't release model
               | weights, nor in most cases the training or model scripts.
               | Certain things they release, just like any other for-
               | profit research firm.
        
               | skybrian wrote:
               | I don't see what it has to do with profit since this is
               | pretty normal in academia too. Scientists will often
               | publish papers, but not everything they do.
               | 
               | "Open" is not well-defined.
        
               | ItsTooMuch wrote:
               | As the sibling says, in academia this is already more
               | than open...
        
               | Blackthorn wrote:
               | Open means something. It is a, for lack of a better
               | phrase, virtue signal. When you do that but don't
               | actually represent the virtue you are trying to signal,
               | people will understandably get pretty upset about that.
        
             | JackFr wrote:
             | I've been complaining for years about WikiLeaks not being a
             | wiki -- no one wants to listen....
        
               | kylebenzle wrote:
        
               | TakeBlaster16 wrote:
               | To be fair, it started out as a wiki, and they just never
               | changed the name.
               | 
               | There's no CSS here but you can clearly see the MediaWiki
               | template: https://web.archive.org/web/20090422103636/http
               | ://www.wikile...
        
             | ben_w wrote:
             | What benefits? The parent is non-profit.
             | 
             | I'd argue they're imperfect, but they don't look like
             | arses. Big gap between the two, too.
        
               | scoopertrooper wrote:
               | The parent may be non-profit, but OpenAI LP accepts
               | investments and delivers returns to investors like any
               | other regular company. The only difference is that they
               | 'cap' the returns. However, the cap is negotiated with
                | individual investors, and I haven't seen anything
               | disclosing the cap except for the fact that in the
               | opening round the cap was 100x the initial investment.
               | 
               | 100x seems like a pretty generous cap to me.
        
               | aabhay wrote:
               | Are they non-profit? Does receiving $1b investment from a
               | for-profit company still mean you can be non-profit?
        
               | ben_w wrote:
               | Yes to both questions. It's a (set of) specific thing(s)
               | in company law.
               | 
               | https://projects.propublica.org/nonprofits/organizations/
               | 810...
        
           | skybrian wrote:
           | This is just nonsense. I pay (a small amount) for both. They
           | have different strengths and it's fun to compare. Adding new
           | features to a product is not a sign of panic, it's just
           | normal.
        
             | wongarsu wrote:
             | Dall-E so far hasn't been able to grow an ecosystem because
             | of how restricted it is. Meanwhile Stable Diffusion makes
             | trial-and-error and innovation around it easy, and as a
             | result only 9 days after Stable Diffusion's release we see
             | OpenAI release a feature that looks like a copy of a tool
             | from the Stable Diffusion ecosystem.
             | 
             | I agree that Dall-E isn't obsolete. I'd also add MidJourney
             | to that list. All three are great models in their own right
             | with their own pronounced strengths and weaknesses. But
             | when it comes to enabling novel workflows Stable Diffusion
             | seems lightyears ahead of the others.
        
               | krisoft wrote:
               | Except you are wrong. This feature was already available
               | as part of the Dall-e ecosystem. There was a website
               | called patch-e which facilitated this exact same
               | workflow.
        
           | crypto420_69 wrote:
            | Also it's quite funny to see OpenAI (with all their
           | researchers and engineers) get disrupted by someone with
           | little to no background (Emad) in AI and ML but who embraced
           | OpenAI's original mission about making AI as open as
           | possible.
        
           | robertlagrant wrote:
           | > and falling on their own sword
           | 
           | That's not what that means.
        
           | mromanuk wrote:
            | The only move left for OpenAI is to honour their name and
            | make their own AI open source.
        
             | shrimpx wrote:
             | Or rename themselves OpaqueAI.
        
       | ckluis wrote:
        | I think Dall-E would benefit from a "sketch-based" prompt in
        | addition to the text-based one. This was mindblowing -
       | https://andys.page/posts/how-to-draw/
        
         | teddyh wrote:
         | Someone should name the next image generator OWL, since it
         | "draws the rest of the owl".
        
           | boppo1 wrote:
           | Cheeky:
           | 
           | https://github.com/hlky/stable-diffusion-
           | webui/blob/master/i...
        
             | teddyh wrote:
             | I thought it was odd that I hadn't seen anyone else make
             | that joke. Turns out they had, I just hadn't seen it.
             | Thanks!
             | 
             | Reference, for those who haven't seen the original joke to
             | which my joke was referring: https://www.reddit.com/r/pics/
             | comments/d3zhx/how_to_draw_an_...
             | 
             | (See also: https://knowyourmeme.com/memes/how-to-draw-an-
             | owl)
        
         | sho_hn wrote:
         | It does feel like art's disruptive "Calculator moment" is
         | happening where you can now leave a lot of basic/mechanical
         | tasks to a tool and give more focus to higher-minded problems.
         | 
         | It's going to get so cool and interesting, I think.
         | 
         | A lot of the conversation around art may focus more on
         | composition and objectives of the artist in the new prompt
         | engineering world, with less bias from factors such as
         | rendition quality etc. creeping in since it's so incidental.
         | 
         | New forms of art will emerge and/or gain popularity that focus
         | on trying things the tools aren't good at yet. The human artist
         | of the gaps. The niches will constantly be shifting.
         | 
         | I wonder if we'll learn to recognize the output of certain
         | popular models and perceive them as instruments. "Made by xy on
         | z" instead of "xz on guitar", so to speak. I remember the
         | 90s/early 00s internet when it was always easy to tell when
         | something had been done on Flash, just because of its line
         | anti-aliasing rendition style being so distinct and familiar.
         | 
         | The novelty will wear off, and we'll all start to feel a bit
         | disappointed that the average human's imagination is pretty
         | limited and novel/original ideas remain somewhat rare as the
         | patterns and tropes in all the generated art emerge. It's great
         | you can put the space needle where you want it and get a good-
         | looking city and space ship, but how many variations of a
         | cyberpunky skyline with a space ship do you need? And then
         | we'll celebrate the novel stuff that does happen, as always. I
         | suppose the tropes will evolve faster as the throughput goes
         | up.
        
           | posterboy wrote:
           | It's not as if generative art is new. Nor is figurative
           | painting relevant anymore since the invention of the camera.
           | A basic _Burger joint in Gerhard Richter_ kind of style
            | transfer is very much derivative. This isn't bad in view of
           | the classics, but it's more like _art-work_ to me.
           | 
           | The true artists in this one are the coders, no doubt
            | (a corollary to the intelligence debate).
           | 
           | On the other hand, you mention an important point with layout
           | but you underestimate the progress these days. Surely there
           | are companies who are working on automated design beyond CAD
            | (_computer aided design_), e.g. for specialized antennas.
           | 
           | > we'll all start to feel a bit disappointed that the average
           | human's imagination is pretty limited and novel/original
           | ideas remain somewhat rare as the patterns and tropes in all
           | the generated art emerge
           | 
           | Well, one might argue that Richter's most highly priced piece
           | looks a little like prehistoric art of the pleistocene. It's
           | a little vain to mention it, because I can much better relate
            | to the more basic form, of course. A more frequent sore
            | point would be the pop music industry, between professionals
            | and the amateurish.
           | 
           | Anyway, this may be thinking too big. For the time being, the
           | bunch of techniques is better understood as a toolbox,
           | because it will be a long time before it trumps demo-scene
           | productions, for instance. Here it is the technique that
           | counts more often than not. The rest is an acquired taste.
        
           | boppo1 wrote:
           | >basic/mechanical tasks to a tool and give more focus to
           | higher-minded problems.
           | 
           | >rendition quality... [is] so incidental.
           | 
           | There's this thing in painting called 'mark making' and it
           | can be the difference between an all-time-great painting and
           | a throwaway portrait. Mark making speaks to every momentary
           | choice of physical process a painter employs and reveals
           | their thought process. For some of the greatest painters, it
           | reveals their genius.
           | 
           | Do not discount execution. Overlooking "basics" and
           | "mechanics" is what results in disappointing work.
        
             | posterboy wrote:
             | Surely this too can be instrumentalized to evoke emotions,
             | stylized to ease execution or faked to justify a result.
        
             | sho_hn wrote:
             | It's a fair point, and thanks for teaching me a new term!
             | 
             | There's a lovely documentary called "Tim's Vermeer" about
             | Tim Jenison's - one of the founders of NewTek, the people
             | behind Video Toaster and LightWave, incidentally both tools
             | that made hard visual art tasks accessible to wider
             | audiences - hobby side project to prove that Vermeer used
             | sophisticated optical tools to capture and copy his scenes
             | from physical sets, rather than e.g. paint his famous grasp
             | on lighting purely from his own mind. He builds such tools
             | himself and then proceeds to successfully create his own
              | Vermeer-alike painting, despite possessing very little artistic
             | skill himself.
             | 
             | It's full of good ruminations (and good at sparking more)
             | on tools-vs-artistry but also execution-vs-method, and
             | whether designing and adopting innovative tools and the
             | tedious process to use them made Vermeer less of a genius,
             | or just a genius of a different kind than otherwise
             | presumed.
             | 
             | It's very accessible and doesn't require knowing anything
             | in particular from the art world.
        
       | NIL8 wrote:
       | I love this idea of extending the canvas to build out the scene.
       | It makes me wonder if anyone's tried using Poe's stories for
       | illustrating with AI? His descriptive writing style seems ideal.
        
       | AJRF wrote:
       | The scrambling to stay relevant after Stable Diffusion is very
       | very enjoyable to watch.
        
       | siavosh wrote:
       | A few weeks ago I was skeptical that this technology would get
       | past the emotional response we get from procedurally generated
       | game environments, but I've been convinced otherwise. The
       | emotional response I get from some of the best of these images
        | is novel and thought-provoking. Makes me wonder what percent of
       | what makes us human is now algorithmically solved....
        
         | maxwell wrote:
         | Maybe. I've been often reminded lately of Herbert Goldstone's
         | "Virtuoso" (1958):
         | 
         | http://elateachers.weebly.com/uploads/2/7/0/1/27012625/virtu...
        
         | aidenn0 wrote:
         | My wife likes impressionists and sunflowers. "A lone sunflower
         | in a grassy field at sunset oil painting claude monet" plus
         | stable-diffusion and a few minutes of tweaking some settings;
         | she now has a new desktop background.
        
           | boppo1 wrote:
           | I actually paint and spend a lot of time looking at 'serious'
            | paintings. To a trained eye, AI hasn't even scratched the
            | surface of the field.
           | 
           | Doesn't mean I'm not excited though. This kind-of feels like
           | I'm watching the camera or printing press being invented.
           | Everyone is comparing it to fine art, but I think ultimately
           | it's going in a different and bigger direction.
        
             | aidenn0 wrote:
              | What I did was, IMO, in a different and bigger direction
              | than fine art. I mean _I_ could tell that this wasn't an
              | impressionistic painting, just given that some areas of
              | the grass were too detailed. It looks "just fine" to
              | untrained eyes, though, which are well over 90% of the
              | population.
             | 
             | 1. How long would it have taken me to get good enough at
             | painting to exceed what I generated in under an hour? How
             | many people have the motivation to spend that time?
             | 
             | 2. How much would I have had to pay an art student to make
             | a painting better than what I generated in under an hour?
             | 
             | Ten million sub-par Monet knock-offs didn't exist, but
             | could exist very shortly at minimal cost. Even if it never
             | gets any better this is already potentially disruptive, and
             | the models are getting better every month.
        
             | bottlepalm wrote:
              | I've heard this a lot; luckily it's not that hard to test
              | whether you can really tell the difference. We need
              | someone to create the 'AI Pepsi challenge' for artists to
              | settle this.
        
         | sleepdreamy wrote:
         | We still know basically nothing about our brain/consciousness.
         | I would say we have a lot more to explore and research.
        
           | mensetmanusman wrote:
            | Our brain is apparently just a 4 GB arrangement of
            | electrical weights.
        
             | d23 wrote:
             | Not to be pedantic, but we have on the order of 100B
             | neurons, and afaik each of them can be connected to
             | thousands of other neurons. I assume we probably have a
             | ways to go before we're encoding the amount of information
             | a brain can comprehend.
        
       | amilios wrote:
       | Damn, Dall-E really lost its competitive edge overnight when
       | Stable Diffusion was released. They dropped their prices across
       | the board in response, but honestly I think it still isn't enough
       | to save them. The magic of open-source competition.
        
         | danielbln wrote:
         | They dropped the prices for GPT-3, not for Dall-E.
        
       | minimaxir wrote:
       | Also per the email release, variations/inpainting, the trick used
       | to simulate outpainting before this, now generates 4 images like
       | a normal DALL-E generation instead of 3 (which was arbitrary
       | anyways).
       | 
       | I do wonder how expensive the outpainting is. I'm assuming that
       | each additional step in the timelapse is a full generation, in
       | which case ~15 generations is about $1 total.
        
       | affgrff2 wrote:
       | Everyone says stable diffusion is a free alternative. Where do I
       | get the weights without passing a gatekeeper?
        
         | dceddia wrote:
         | They're currently hosted on Google in a way that you can
         | download them via curl/wget. Here's a guide including the link:
         | https://www.assemblyai.com/blog/how-to-run-stable-diffusion-...
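         | 
         | Once downloaded, it's an ordinary PyTorch checkpoint. A rough,
         | untested sketch to sanity-check the file (the path is a
         | placeholder, and it assumes the usual layout with a
         | "state_dict" key):
         | 
         |     import torch
         |     
         |     # Load the raw Stable Diffusion checkpoint from disk.
         |     ckpt = torch.load("sd-v1-4.ckpt", map_location="cpu")
         |     state_dict = ckpt["state_dict"]
         |     
         |     # Plain tensors for the UNet, VAE and text encoder.
         |     n_params = sum(t.numel() for t in state_dict.values())
         |     print(len(state_dict), "tensors,", n_params, "parameters")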
        
       | [deleted]
        
       | msoad wrote:
       | This is cool and useful!
       | 
       | Putting "Girl with a Pearl Earring by Johannes Vermeer" in the
       | kitchen in 2022 does not look good!
        
         | i_like_apis wrote:
         | Because depicting a woman in a kitchen is perpetuating the
         | pernicious male patriarchy? Sorry we're not doing that. You
         | might find some reception for this sort of thing on Twitter
         | though.
        
         | dymk wrote:
         | It's 2022, women are allowed to enjoy cooking and baking just
         | as much as men are.
        
       | zoba wrote:
       | I have been working on an outpainting piece (in Photoshop)
       | currently 10609 x 8144. I am very pleased to see more support for
       | this, though hoping it doesn't kill my current flow.
       | 
       | Seems like it is currently not working on their site.
        
       | dsmmcken wrote:
       | The UX around AI image generation is evolving so fast that every
       | day brings something new. There's so much greenfield exploration
       | space for new interaction models.
       | 
       | 6 months from now, how we interact with these models will
       | probably look entirely different.
        
       | hey_bear wrote:
       | While maybe not "as good as a human" creatively, I wonder
       | whether, once this matures a little more, we'll see whole
       | art/design departments fall by the wayside and be replaced by
       | stuff like this...
        
       | naillo wrote:
       | I can't help but feel like they're adding this at this particular
       | point since Stable Diffusion has announced they're releasing
       | their 'inpainting' model next week.
        
         | i_like_apis wrote:
         | I really doubt it's related at all, though everyone would think
         | it looks that way. SD has only been out a week and this feature
         | would have taken much more than that to build, test, enroll
         | demo users, make a webpage for, etc.
        
           | naillo wrote:
           | I can't prove it of course but it wouldn't surprise me if
           | they had this pretty much done already long ago (dall-e has
           | been out for several months at this point). The actual
           | implementation doesn't look like it'd take more than a few
           | days to code honestly (and they've got quite competent coders
           | over there). Only speculation of course.
        
             | i_like_apis wrote:
             | Everything looks easier when someone else is doing it.
        
         | [deleted]
        
       | EddySchauHai wrote:
       | Do you reckon that before long we'll have Prompt Engineers who
       | are skilled at getting AI to generate what they want?
        
       | bob1029 wrote:
       | How long until we can run this over shows like Star Trek
       | Voyager/DS9 and Seinfeld to achieve believable 16:9 scenes?
        
         | deadbunny wrote:
         | Someone has been upscaling DS9 already[1]. Obviously not
         | released anywhere.
         | 
         | Not sure I'd want them in 16:9; HD 4:3 like the other HD
         | releases of TNG and TOS would do me. I understand they shot on
         | video, so an official true HD remaster is likely never to
         | happen.
         | 
         | 1. https://www.extremetech.com/extreme/324466-tutorial-how-
         | to-u...
        
           | henriquecm8 wrote:
            | I don't have a deep understanding of how training models
            | work, but I wonder if training a model on every frame of
            | TNG and then outpainting each frame into 16:9 would work.
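            | 
            | For what it's worth, the per-frame setup is simple. A rough,
            | untested sketch (filenames and sizes are placeholders) that
            | pads a 4:3 frame onto a 16:9 canvas and builds the mask an
            | inpainting/outpainting model would be asked to fill:
            | 
            |     from PIL import Image
            |     
            |     # e.g. a 720x540 frame becomes a 960x540 canvas
            |     frame = Image.open("tng_frame.png")
            |     w, h = frame.size
            |     new_w = h * 16 // 9
            |     canvas = Image.new("RGB", (new_w, h), "black")
            |     x_off = (new_w - w) // 2
            |     canvas.paste(frame, (x_off, 0))
            |     
            |     # White = "generate here", black = "keep these pixels"
            |     mask = Image.new("L", (new_w, h), 255)
            |     mask.paste(0, (x_off, 0, x_off + w, h))
            |     
            |     canvas.save("padded.png")
            |     mask.save("mask.png")
            | 
            | The hard part, as noted elsewhere in the thread, is temporal
            | coherence across frames, which this does nothing about.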
        
           | jefftk wrote:
           | DS9 shot on film:
           | https://news.ycombinator.com/item?id=19454370
           | 
           | But it did use a lot of early CGI that would need to be
           | redone.
        
         | kranke155 wrote:
         | Why would you do that? What an awful idea. People made those
         | shows in the 4:3 format, you'd just be adding fluff. This is
         | like adding more description to a book so it becomes an epic
         | novel instead of a novella. I'd say keep to the creators'
         | intent...
        
           | d23 wrote:
           | I'm inclined to agree, but if it were coherent I'd take it
            | over what platforms like Netflix do, which is chop off the
            | top and bottom of the content so it fits 16:9.
        
           | joemi wrote:
           | I'm not the person who suggested it, but I wouldn't mind
           | having it fill my (wide)screen when watching. That said, I
            | understand that some film/tv uses the frame very precisely;
            | however, I'm not sure that these two particular examples do
           | that throughout their entire episodes. (Though I bet that in
           | Seinfeld in particular it might weaken/ruin a few visual
           | gags.)
        
             | kranke155 wrote:
             | Still seems to me like adding fluff - it seems a bit
             | impossible to me that the AI would add anything pertinent
             | to the plot. It would add "stuff" like corridors and
             | background sets and maybe someone out of focus.
             | 
             | Do the black bars actually bother you that much? You know
             | there are cropped 16:9 widescreen versions of some of these
             | shows (which I personally detest, but I work in the
             | business of moving images).
             | 
             | Genuinely interested in why this bothers people.
        
         | cesis wrote:
         | Next week?
        
         | russdill wrote:
         | Temporal coherence will still take a while to solve but it's
         | not undoable. Making things that look correct upon closer
         | inspection rather than just looking "nice" will probably take
         | some degree of human curation for quite a while.
        
           | bottlepalm wrote:
           | Tesla could use some of that temporal coherence as well.
        
           | dr_dshiv wrote:
           | Anyone working on closed caption models at the moment?
        
       | O__________O wrote:
       | This was already possible with DALL-E using the inpainting
       | feature, going from a defined image to a transparent edge; this
       | just automates what was a manual process before. I do wish the
       | inpainting tool had more options, for example fading a
       | transparency in, since my understanding is that it makes a
       | difference; not to mention a magic wand selection/deselection
       | tool.
       | 
       | In case it is not obvious, every time a user generates an
       | additional section of an image using the outpainting feature, it
       | costs a credit.
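       | 
       | For the fade, one workaround is to build the faded edge yourself
       | before handing the image to the editor. A rough, untested PIL
       | sketch (filename and fade width are placeholders) that ramps the
       | alpha channel down over the last N pixels instead of ending at a
       | hard transparent edge:
       | 
       |     from PIL import Image
       |     
       |     img = Image.open("tile.png").convert("RGBA")
       |     w, h = img.size
       |     fade = 128  # width of the fade band in pixels
       |     
       |     # Fully opaque everywhere, then ramp down across the band.
       |     alpha = Image.new("L", (w, h), 255)
       |     for i in range(fade):
       |         x = w - fade + i
       |         value = 255 - int(255 * (i + 1) / fade)
       |         alpha.paste(value, (x, 0, x + 1, h))
       |     
       |     img.putalpha(alpha)
       |     img.save("tile_faded.png")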
        
         | ShamelessC wrote:
         | Automation of manual processes is generally useful.
        
         | ml_basics wrote:
         | Yes indeed, and it shows the advantages of Stable Diffusion's
         | approach of just releasing the model and letting people do
         | what they want with it - this was straightforward to implement
         | oneself.
         | 
         | And while OpenAI released this feature now, it's probably just
         | a matter of days until even better features built on Stable
         | Diffusion will be released, given how much community energy is
         | focussed on it right now.
        
         | pilotneko wrote:
         | Maybe kludge it with a dithered transparency mask?
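         | 
         | For the dithering itself, a rough, untested sketch (the
         | filename is a placeholder for a gradient mask like the one
         | sketched above); PIL's conversion to 1-bit applies
         | Floyd-Steinberg dithering by default:
         | 
         |     from PIL import Image
         |     
         |     gradient = Image.open("fade_mask.png").convert("L")
         |     
         |     # Mode "1" conversion dithers the smooth gradient down
         |     # to on/off pixels.
         |     dithered = gradient.convert("1")
         |     
         |     # Back to "L" so it can be used with putalpha().
         |     dithered.convert("L").save("dithered_mask.png")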
        
           | O__________O wrote:
            | It's only a matter of time before Adobe adds inpainting
            | with hooks to local or API-based generative tools; using
            | OpenAI to edit works like this feels like being transported
            | back to the past and using basic image editing tools.
        
       | benreesman wrote:
       | So... are we done politely coughing and looking out the window at
       | the idea that the gatekeeping was motivated by altruism so that
       | we can move on and just use this much better innovation model
       | going forward?
       | 
       | Various (subjectively judged) SOTAs on at least some subset of at
       | least this family of tasks are changing somewhere between _daily_
       | and _hourly_ right now. I've been watching this stuff closely
       | since fairly early ImageNet days and I've never seen a Cambrian
       | explosion of "how the hell did that do that?" events at anything
       | like this cadence.
        
       | aantix wrote:
       | Why would Google hold back on releasing Imagen if there are
       | competitors that are publicly available already?
       | 
       | Imagen isn't special anymore.
        
         | jsnell wrote:
         | A few possible theories; some might be mutually exclusive:
         | 
         | Organizational scar tissue making them more risk averse about
         | the PR risks of letting the genpop use AI generation tools, and
         | create something offensive. With the safe assumption that
         | Google will get blamed, not the user.
         | 
         | Fear of government regulation on AI if they don't self-
         | regulate.
         | 
         | No need to actually release it, since this isn't the core
         | business but just research. (While openai needs to actually
         | create the business.) Corollary: more to lose -- a scandal
         | around offensive content will not hurt openai's non-existent
         | other businesses. It might make some advertisers pull their
         | ads from Google.
         | 
         | The opportunity cost of building a self serve platform is too
         | high. (Can't pull in people writing those kind of apps from
         | projects with more commercial importance. Can't make the ML
         | researchers do that.)
         | 
         | They misjudged how much demand there would be, and thought that
         | building a platform would not be useful for a few years. And if
         | it now turns out to actually be a great business, it'll take
         | them a year to productionize and build a platform.
         | 
         | Their compute requirements are so high that selling access is
         | not viable, the costs are prohibitive for real users.
         | 
         | It's not that different from e.g. self driving cars. Pretty
         | obviously they had better tech from early on, but were not
         | willing to take the risks that Tesla was.
        
         | aabhay wrote:
         | Google is most interested in maintaining mind-share so that
         | researchers don't jump ship. They could always monetize Imagen
         | through Google Cloud but are concerned about risks (NSFW, legal
         | issues, bias, etc.) so would rather wait for others to step
         | into the water first.
        
       | thebeastie wrote:
       | This is moving fast!
       | 
       | Obviously, it's going to be an incredible boon for content
       | creation. I suppose that in the future it'll make creating videos
       | an order of magnitude easier, which will allow a single person or
       | a small team to make a high quality movie where all the assets
       | are generated, so that'll really give us an eye into a lot of
       | people's imaginations, for better or worse.
       | 
       | To leave a thought-provoking example, what's going to happen when
       | every adolescent has the ability to make a convincing deepfake?
       | 
       | It'll put nation states in a similar position to the one they're
       | already in with crypto, wondering whether they should ban or
       | regulate... doing nothing won't be an option.
        
       ___________________________________________________________________
       (page generated 2022-08-31 23:00 UTC)