[HN Gopher] DALL*E now available in beta
       ___________________________________________________________________
        
       DALL*E now available in beta
        
       Author : todsacerdoti
       Score  : 552 points
       Date   : 2022-07-20 16:30 UTC (6 hours ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | naillo wrote:
        | I wonder if they'll even make back what they spent on training
        | the models before competitors of equal quality and lower cost
        | eat up their margins.
        
       | tourist_on_road wrote:
       | Super impressive to see how OpenAI managed to bring the project
       | from research to production (something usable for creatives).
        | This is non-trivial since the use case involves filtering NSFW
        | content and reducing bias in generated images. Kudos to the
        | entire team.
        
       | seshagiric wrote:
        | For those who want to try DALL-E but do not have access yet,
        | this is a good site to play with: https://www.craiyon.com/
        
       | totetsu wrote:
        | I was really enjoying using DALL-E 2 to take surrealist walks
        | around the latent image space of human cultural production. I
        | was using it as one might use Wikipedia, researching the links
        | between objects and their representations. Also just to
        | generate suggestions for what to have for lunch. None of this
        | was for anything of commercial value to me. What am I to do
        | now, start to find ways to sell the images I'm outputting? Do
        | I displace the freelance artists in the market who actually
        | have real talent and the ability to create images and
        | compositions, and who studied how to use the tools of the
        | trade? Does the income artists can make now get displaced by
        | people using DALL-E? Then do people stop learning how to
        | actually make art, and we come to the end of new cultural
        | production and just start remixing everything made until now?
        
         | totetsu wrote:
          | With real artists left only making images of sex and
          | violence and other TOS violations.
        
         | [deleted]
        
       | password321 wrote:
        | The world's most expensive meme generator.
        
       | cypress66 wrote:
       | > Reducing bias: We implemented a new technique so that DALL*E
       | generates images of people that more accurately reflect the
       | diversity of the world's population. This technique is applied at
       | the system level when DALL*E is given a prompt about an
       | individual that does not specify race or gender, like "CEO."
       | 
        | Will it do it "more accurately", as they claim? As in, if 90%
        | of CEOs are male, then the odds of a CEO in a picture being
        | male are 90%? Or will it less "accurately reflect the
        | diversity of the world's population" and instead show what
        | they would like the real world to be like?
        
         | president wrote:
         | Most likely this was something forced by their marketing team
         | or their office of diversity. Given the explanation of the
         | implementation (arbitrarily adding "black" and "female"
         | qualifiers), it's clear it was just an afterthought.
        
         | [deleted]
        
         | klohto wrote:
          | hardmaru on Twitter has examples. It's the second: the world
          | they would like it to be.
        
         | kache_ wrote:
         | They literally just add "black" and "female" with some weight
         | before any prompt containing person.
         | 
          | A comical workaround to so-called "bias" (isn't the whole
          | point of these models to encode some bias?). Here's some
          | experimentation showing this.
          | 
          | https://twitter.com/rzhang88/status/1549472829304741888
          | 
          | As competitors with lower price points pop up, you'll see
          | everyone ditch models with "anti-bias" measures and take
          | their $ somewhere else. Or maybe we'll get some real
          | solution that adds noise to the embeddings, and not some
          | half-assed workaround to the arbitrary rules that your
          | resident AI Ethicist comes up with.
        
           | danielvf wrote:
            | They add it _after_. So you can see the added words by
            | making a prompt like "a person holding a sign saying ",
            | and the sign shows the extra words if any were added.
        
             | kache_ wrote:
             | Yeah actually, good call. The position of the token
             | matters, since these things use transformers to encode the
             | embeddings.
             | 
             | https://www.assemblyai.com/blog/how-imagen-actually-works/
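              | 
              | For what it's worth, the mechanism being described here
              | would be as simple as the sketch below. This is purely
              | a guess based on the tweets above, not OpenAI's
              | published code; the term list and probability are made
              | up.
              | 
              |     import random
              | 
              |     # Hypothetical guess at the suspected behaviour
              |     TERMS = ["black", "female", "asian", "hispanic"]
              |     PERSON_WORDS = {"person", "ceo", "doctor", "nurse"}
              | 
              |     def expand(prompt, p=0.5):
              |         # only touch vague "person" prompts, sometimes
              |         words = set(prompt.lower().split())
              |         if words & PERSON_WORDS and random.random() < p:
              |             return prompt + " " + random.choice(TERMS)
              |         return prompt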
        
           | whywhywhywhy wrote:
            | How does it deal with bias that is negative?
            | 
            | This only works for positive biases; if they actually
            | want to equalize things, they need to add the opposite
            | term for negative biases too.
            | 
            | To counteract the bias of their dataset they need someone
            | sitting there actively thinking about bias, adding
            | anti-bias seasoning for every bias-causing term. I feel
            | bad for whoever is tasked with that job.
            | 
            | Could always just fix your dataset, but who's got the
            | time and money to do that /s
        
         | naillo wrote:
          | It's also funny that this likely won't 'unbias' any actual
          | published images coming out of it. If 90% of the images in
          | the world have a male CEO, then for whatever reason that's
          | the image people will pick and choose from DALL-E's output.
          | (This generalizes to any unbiasing: the outputs will be
          | re-biased by the humans selecting them.)
        
           | bequanna wrote:
           | Imagine you're in South Korea (or any other ethnically
           | homogenous country). Do you want "black" "female" randomly
           | appended to your input?
        
             | educaysean wrote:
                | If I were using this in South Korea, how is showing
                | all white people any better than showing whites,
                | blacks, Latinos, and Asians?
        
               | bequanna wrote:
               | You would presumably input "South Korean CEO". DALL-E
               | would then unhelpfully add "black" "female" without your
               | knowledge.
        
               | educaysean wrote:
                | I just tried it out and it looks like DALL-E isn't as
                | inept as you imagined. The exact query used was 'A
                | profile photo of a male south korean CEO', and it
                | spat out 4 very believable Korean business dudes.
                | 
                | Supplying the race and sex information seems to
                | prevent new keywords from being injected. I see no
                | problem with the system generating female CEOs when
                | the gender information is omitted, unless you think
                | there is?
        
               | astrange wrote:
                | I don't think they "randomly insert keywords" like
                | people are claiming. I think they probably run it
                | through a GPT-3 prompt and ask it to rewrite the
                | prompt if it's too vague.
                | 
                | I set up a similar GPT prompt with a lot more power
                | ("rewrite this vague input into a precise image
                | description") and I find it much more creative and
                | useful than DALL-E 2 is.
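                | 
                | A rough sketch of that kind of rewriting step with the
                | openai library; the instruction text and model choice
                | here are placeholders, not anything OpenAI has
                | published:
                | 
                |     import os, openai  # pip install openai
                | 
                |     openai.api_key = os.environ["OPENAI_API_KEY"]
                | 
                |     def expand(vague):
                |         r = openai.Completion.create(
                |             model="text-davinci-002",
                |             prompt="Rewrite this vague input into "
                |                    "a precise image description:\n"
                |                    + vague + "\n\nDescription:",
                |             max_tokens=120,
                |         )
                |         return r["choices"][0]["text"].strip()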
        
               | bequanna wrote:
               | Isn't the diversity keyword injection random?
               | 
               | My point is that it is pointless. If you want an image of
               | a <race> <gender> person included, you can just specify
               | it yourself.
        
               | educaysean wrote:
               | > If you want an image of a <race> <gender> person
               | included, you can just specify it yourself.
               | 
               | I agree wholeheartedly. So what are we arguing about?
               | 
                | What we're seeing is that DALL-E has its own bias-
                | balancing technique it uses to nullify the imbalances
                | it knows exist in its training data. When you specify
                | ambiguous queries it kicks into action, but if you
                | want male white CEOs the system is happy to give them
                | to you. I'm not sure where the problem is.
        
         | totetsu wrote:
          | Yes, the quality of surrealist generations went down with
          | that change suddenly injecting gender and race into prompts
          | that I really didn't want anything specific in. Like a snail
          | radio DJ, and suddenly the microphone is a woman of colour's
          | head. I understand the intention, but I'd want this to be on
          | by default with the option to turn it off.
        
         | TheFreim wrote:
          | It's also odd, since you'd think this would be an issue
          | solved by training with representative images in the first
          | place.
          | 
          | If you used good input you'd expect appropriate output; I
          | don't know why manual intervention would be necessary unless
          | it's for purposes other than stated. I suspect this is
          | another case where "diversity" simply means "fewer whites".
        
         | StrictDabbler wrote:
          | If it accurately reflects the _world_ population then only
          | one in six pictures will be of a white person. Half the
          | pictures will be Asian, another sixth will be Indian.
         | 
         | Slightly more than half of the pictures will be women.
         | 
         | That accurately represents the world's diversity. It won't
         | accurately reflect the world's power balance but that doesn't
         | seem to be their goal.
         | 
         | If you want to say "white male CEO" because you want results
         | that support the existing paradigm it doesn't sound like
         | they'll stop you. I can't imagine a more boring request.
         | 
         | Let's look at _interesting_ questions:
         | 
         | If you ask for "victorian detective" are you going to get a
         | bunch of Asians in deerstalker caps with pipes?
         | 
         | What about Jedi? A lot of the Jedi are blue and almost nobody
         | on Earth is.
         | 
         | Are cartoon characters exempt from the racial algorithm? If I
         | ask for a Smurf surfing on a pizza I don't think that making
         | the Smurf Asian is going to be a comfortable image for any
         | viewer.
         | 
          | What about ageism? 16% of the population is over sixty.
          | Will a request for "superhero lifting a building" have a
          | 16% chance of being old?
         | 
         | If I request a "bad driver peering over a steering wheel" am I
         | still going to get an Asian 50% of the time? Are we ok with
         | that?
         | 
         | I respect the team's effort to create an inclusive and
         | inoffensive tool. I expect it's going to be hard going.
        
           | bequanna wrote:
           | > inoffensive tool.
           | 
           | Wouldn't that result end up being like "inoffensive art" or
           | "inoffensive comedy"?
           | 
           | Bland, boring and Corporate-PC.
        
             | erikpukinskis wrote:
             | Being offensive is only one way to be interesting.
             | 
             | There are others, like being clever, or being absurd, or
             | being goofy, or being poignant, or being refreshing.
             | 
             | Of the good stuff, offensive humor is only a tiny slice.
        
               | jazzyjackson wrote:
               | offensive _to whom_ is the sticking point when it comes
               | to comedy
               | 
               | it takes a special talent to please everybody
        
             | driverdan wrote:
             | To a certain degree, yes. They care more about the image of
             | the project than art. Considering a large amount of art
             | depicts non-sexual nudity yet they block all nudity, art is
             | not their primary concern.
        
               | bequanna wrote:
               | Some people claim to be emotionally "triggered" by images
               | of police. Does that mean DALL-E should also start
               | blocking images that contain police?
        
           | visarga wrote:
            | You know a surprising way to solve the issues you
            | presented? You train another model to trick DALL-E into
            | generating undesirable images. It will use all its
            | generative skills to probe for prompts. Then you can use
            | those prompts to fine-tune the original model. So you use
            | generative models as a devil's advocate.
           | 
           | - Red Teaming Language Models with Language Models
           | 
           | https://arxiv.org/abs/2202.03286
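            | 
            | Schematically, the loop from that paper looks something
            | like the sketch below; the attacker/target/classifier
            | objects are stand-ins, not a real library API.
            | 
            |     def red_team(attacker, target, classifier, n=1000):
            |         bad_prompts = []
            |         for _ in range(n):
            |             # attacker LM proposes an adversarial prompt
            |             p = attacker.generate(
            |                 "a prompt that elicits unsafe images:")
            |             img = target.generate(p)
            |             if classifier.is_undesirable(img):
            |                 bad_prompts.append(p)
            |         # bad_prompts then drive fine-tuning of target
            |         return bad_prompts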
        
         | bjt2n3904 wrote:
         | Will it reduce bias across all fields? Or only ones that are
         | desirable? How about historical?
         | 
         | "A photo of a group of soldiers from WW2 celebrating victory
         | over nazi CEOs and plumbers".
        
         | noelsusman wrote:
         | In their examples, the "After mitigation" photos seem more
         | representative of the real world. Before you got nothing but
         | white guys for firefighter or software engineer and nothing but
         | white ladies for teacher. That's not how the real world
         | actually is today.
         | 
         | I'm not sure how they would accomplish 100% accurate
         | proportions anyway, or even why that would be desirable. If I
         | don't specify any traits then I want to see a wide variety of
         | people. That's a more useful product than one that just gives
         | me one type of person over and over again because it thinks
         | there are no female firefighters in the world.
        
         | scifibestfi wrote:
         | The latter. Here's what we, a small number of people, think the
         | world should look like according to our own biases and
         | information bubble in the current moment. We will impose our
         | biases upon you, the unenlightened masses who must be
         | manipulated for your own good. And for god sakes, don't look
         | for photos of the US Math team or NBA Basketball or compare
         | soccer teams across different countries and cultures.
        
           | bequanna wrote:
           | > Here's what we, a small number of people, think the world
           | should look like according to our own biases and information
           | bubble in the current moment.
           | 
            | You're being quite charitable. It is much more likely
            | that optics and virtue signaling are behind this addition.
        
             | erikpukinskis wrote:
             | If I search for "food" I don't want to see a slice of pizza
             | every time, even if that's the #1 food. I want to see some
             | variety.
             | 
              | I think you're jumping too quickly to bad intentions.
              | Injecting diversity into the results is a sane thing to
              | do, totally irrespective of politics.
        
       | aledalgrande wrote:
        | I wonder, at this price point, what kind of business can use
        | DALL-E at scale?
        
       | hit8run wrote:
        | It's so dirty what Microsoft is doing here. They ripped the
        | tech out of developers' hands just to sell us drips of it.
        | Drips that are not enough to build a product for more than a
        | few people. They require a review of your use case before you
        | can launch anything, etc. I truly hate this company, their
        | shitty operating system and their monopoly business game.
        | Everything they buy turns to shit. And don't tell me about
        | VSCode. It's just a trap to fool developers.
        
       | NaughtyShiba wrote:
        | Slightly off-topic, but how would one report a false positive
        | from the content policy check?
        
       | Al-Khwarizmi wrote:
       | In beta, maybe, but I don't think "available" means what they
       | think it means.
       | 
       | I have been on the waitlist from the very beginning. Still
       | waiting.
        
       | skilled wrote:
        | I can't check right now, but does this mean the watermark is
        | gone and images will have a higher resolution?
        
         | gverri wrote:
          | Watermarks are still there and the resolution is still
          | 1024x1024.
        
           | skilled wrote:
           | I wonder if they have plans to allow SVG exports in the
           | future. I mean, the file size would probably be ridiculous in
           | a lot of the cases, but for my use case I wouldn't mind it.
           | And sucks about the watermark, maybe they will introduce an
           | option to pay for removing it.
        
             | rahimnathwani wrote:
             | SVG exports would only be meaningful if the model is
             | generating vector images, which are then converted to
             | bitmaps. I highly doubt that's the case, but perhaps
             | someone who has actually looked at the model structure can
             | confirm?
        
               | tiagod wrote:
               | It's just pixels. You can pass them into a tracer
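                | 
                | For example, with the potrace CLI installed (the
                | filenames are made up, and potrace only traces a
                | 1-bit bitmap, so colour is lost):
                | 
                |     from PIL import Image      # pip install pillow
                |     import subprocess
                | 
                |     Image.open("dalle.png").convert("1").save("d.pbm")
                |     subprocess.run(["potrace", "--svg", "d.pbm",
                |                     "-o", "dalle.svg"], check=True)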
        
             | moyix wrote:
             | SVG isn't really possible with the model architecture
             | they're using. The diffusion+upscaling step basically
             | outputs 1024x1024 pixels; at no point does the model have a
             | vector representation.
             | 
             | I suppose it's possible that at some point they'll try to
             | make an image -> svg translation model?
        
         | [deleted]
        
       | xnx wrote:
       | I fully expect stock image sites to be swamped by DALL-E
       | generated images that match popular terms (e.g. "business person
       | shaking hands"). Generate the image for $0.15. Sell it for $1.00.
        
         | smusamashah wrote:
          | They won't. DALL-E images are mostly not that high quality.
          | The high-quality stuff everyone has been sharing is the
          | result of a lot of cherry-picking.
        
           | commandlinefan wrote:
           | Even the high quality stuff still can't do human faces right.
        
             | TomWhitwell wrote:
             | This one surprised me when it came out, felt more 'human'
             | than lots of stock photos:
             | https://labs.openai.com/s/AsRKFiOKJmmZrVDxIGa75sSA
        
             | optimalsolver wrote:
             | They avoided using real human faces in the training data.
        
           | speedgoose wrote:
           | In my experience it doesn't require that much cherry picking
           | if you use a carefully crafted prompt. For example: " A
           | professional photography of a software developer talking to a
           | plastic duck on his desk, bright smooth lighting, f2.2,
           | bokeh, Leica, corporate stock picture, highly detailed"
           | 
           | And this is the first picture I got:
           | https://labs.openai.com/s/lSWOnxbHBYQAtli9CYlZGqcZ
           | 
            | It went a bit strong on the depth of field and I don't
            | like the angle, but I could iterate a few times and get a
            | good one.
        
             | arecurrence wrote:
             | Additionally, wherever it classically falls over (such as
             | currently for realistic human faces), there will be second
             | pass models that both detect and replace all the faces with
             | realistic ones. People are already using models that alter
             | eyes to be life-like with excellent results (many of the
             | dalle-2 ones appear somewhat dead atm).
        
             | smusamashah wrote:
              | Even this image is just an illusion of a perfect photo;
              | it's a blur for the most part (see the face of the
              | duck). I've had access for the past 4-5 days and it
              | fails badly whenever I try to create any unusual scene.
              | 
              | For the first few days after it was announced I used to
              | look closely even at real photos in search of generative
              | artifacts. They are not so difficult to spot now, most
              | of the time anyway.
        
             | cornel_io wrote:
             | NB: when you share links like that, nobody who doesn't have
             | access can see the results
        
               | alana314 wrote:
               | sure they can, just tried in incognito
        
           | messe wrote:
           | If the price is low enough, you can have humans rank
           | generated images (maybe using Mechanical Turk or a similar
           | service), and from that ranking choose only the highest
           | quality DALL-E generated images.
        
           | Forge36 wrote:
           | If someone can make money doing it they might.
           | 
            | Heck: if the cost of entry is low enough, they might do
            | it at a loss and take over the site.
        
         | redox99 wrote:
          | DALL-E 2 isn't yet good enough for such photorealistic
          | pictures of humans, however.
        
           | arecurrence wrote:
           | There has been trouble with generating life-like eyes but a
           | second pass with a model tuned around making realistic faces
           | has been very successful at fixing that.
        
           | bpicolo wrote:
           | https://twitter.com/TobiasCornille/status/154972906039745331.
           | ..
           | 
           | Unless I'm missing something, these seem pretty darn good
        
             | zerocrates wrote:
             | Woof, that bias "solution" that that thread is actually
             | about though...!
        
         | thorum wrote:
         | DALLE images are still only 1024 px wide. Which has its uses,
         | but I don't think the stock photo industry is in real danger
         | until someone figures out a better AI superresolution system
         | that can produce larger and more detailed images.
        
           | [deleted]
        
           | eigenvalue wrote:
           | I've been using this app to upscale the images to 4000x4000,
           | and it works amazingly well (there is also a version for
           | Android):
           | 
           | https://apps.apple.com/us/app/waifu2x/id1286485858
           | 
           | I paid extra to get the higher quality model using the in-app
           | purchase option. It crushes the phone's battery life, but
           | runs in only ~10 seconds on an iPhone 13 Pro for a single
           | 1000x1000 input image.
        
             | ZeWaka wrote:
             | I mean, waifu2x and similar waifuxx libraries are free and
             | open-source, there's really no reason to pay for it if
             | you're working on a desktop.
        
               | [deleted]
        
           | arecurrence wrote:
           | You can obtain any size by using the source image with the
           | masking feature. Take the original and shift it then mask out
           | part of the scene and re-run. Sort of like a patchwork quilt,
           | it will build variations of the masked areas with each
           | generation.
           | 
           | Once the API is released, this will be easier to do in a
           | programmatic fashion.
           | 
           | Note: Depending on how many times you do this... I could see
           | there being a continuity problem with the extremes of the
           | image (eg: the far left has no knowledge of the far right).
           | An alternative could be to scale the image down and mask the
           | borders then later scale it back up to the desired
           | resolution.
           | 
           | This scale and mask strategy also works well for images where
           | part of the scene has been clipped that you want to include
           | (EG: Part of a character's body outside the original image
           | dimensions). Scale the image down, then mask the border
           | region, and provide that to the generation step.
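            | 
            | A minimal sketch of preparing one such shifted tile with
            | Pillow (filenames are invented; the inpainting call
            | itself still has to go through the web UI until the API
            | is out):
            | 
            |     from PIL import Image  # pip install pillow
            | 
            |     TILE = 1024            # DALL-E 2 output size
            |     src = Image.open("original.png").convert("RGBA")
            | 
            |     # Shift the original half a tile left; the right
            |     # half stays transparent, and that is the region
            |     # DALL-E is asked to fill in the next generation.
            |     canvas = Image.new("RGBA", (TILE, TILE))
            |     canvas.paste(src, (-TILE // 2, 0))
            |     canvas.save("next_tile_input.png")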
        
         | ploppyploppy wrote:
         | "buy fo' a dollar, sell fa' two" - Prop. Joe
        
         | wishfish wrote:
         | Makes me imagine stock image sites in the near future. Where
         | your search term ("man looks angrily at a desktop computer")
         | gets a generated image in addition to the usual list of stock
         | photos.
         | 
         | Maybe it would be cheaper. I imagine it would one day. And
         | maybe it would have a more liberal usage license.
         | 
         | At any rate, I look forward to this. And I look forward to the
         | inevitable debates over which is better: AI generation or
         | photographer.
        
         | dymk wrote:
         | They'll likely immediately go out of business, because I can
         | just pay OpenAI 15 cents directly for the exact same product.
        
         | dylanlacom wrote:
         | Eh, I'd bet the arbitrage window is pretty brief, and that
         | prices will fall closer to $0.15 pretty quickly.
        
       | jowday wrote:
        | Sad to say I've been disappointed in DALLE's performance since
        | I got access to it a couple of weeks ago - I think mainly
        | because it was hyped up as the holy grail of text2image ever
        | since it was first announced.
       | 
       | For a long while whenever Midjourney or DALLE-mini or the other
       | models underperformed or failed to match a prompt the common
       | refrain seemed to be "ah, but these are just the smaller version
       | of the real impressive text2image models - surely they'd perform
       | better on this prompt". Honestly, I don't think it performs
       | dramatically better than DALLE-mini or Midjourney - in some cases
       | I even think DALLE-mini outperforms it for whatever reason. Maybe
       | because of filtering applied by OpenAI?
       | 
       | What difference there is seems to be a difference in quality on
       | queries that work well, not a capability to tackle more complex
       | queries. If you try a sentence involving lots of relationships
       | between objects in the scene, DALLE will still generate a
       | mishmash of those objects - it'll just look like a slightly
       | higher quality mishmash than from DALLE-mini. And on queries that
       | it does seem to handle well, there's almost always something off
       | with the scene if you spend more than a moment inspecting it. I
       | think this is why there's such a plethora of stylized and
       | abstract imagery in the examples of DALLE's capabilities - humans
       | are much more forgiving of flaws in those images.
       | 
       | I don't think artists should be afraid of being replaced by
       | text2image models anytime soon. That said, I have gotten access
       | to other large text2image models that claim to outperform DALLE
       | on several metrics, and my experience matched with that claim -
       | images were more detailed and handled relationships in the scene
       | better than DALLE does. So there's clearly a lot of room for
       | improvement left in the space.
        
       | jawns wrote:
       | One of the commercial use cases this post mentions is authors who
       | want to add illustrations to children's stories.
       | 
       | I wonder if there is a way for DALL-E to generate a character,
       | then persist that character over subsequent runs. Otherwise, it
       | would be pretty difficult to generate illustrations that depict a
       | coherent story.
       | 
       | Example ...
       | 
       | Image 1 prompt: A character named Boop, a green alien with three
       | arms, climbs out of its spaceship.
       | 
       | Image 2 prompt: Boop meets a group of three children and shakes
       | hands with each one.
        
         | minimaxir wrote:
         | You can cheat this to a limited extent using inpainting.
        
           | rahimnathwani wrote:
           | You mean just generate a single large image with all the
           | stuff you want for the whole story, and then use cropping and
           | inpainting to get only the piece you want for each page?
        
         | TaupeRanger wrote:
         | You can't do that. I can't see this working well for children's
         | book illustrations unless the story was specifically tailored
         | in a way that makes continuity of style and characters
         | irrelevant.
        
           | CobrastanJorji wrote:
           | As an aside, Ursula Vernon did pretty well under the
           | constraint you described. She set a comic in a dreamscape and
           | used AI to generate most of the background imagery:
           | https://twitter.com/UrsulaV/status/1467652391059214337
           | 
           | It's not the "specify the character positions in text"
           | proposed, but still a neat take on using this sort of AI for
           | art.
        
             | TaupeRanger wrote:
             | Nice example and very well done. But yeah, very niche
             | application unfortunately.
        
           | WalterSear wrote:
            | I would expect continuity to be a relatively simple
            | feature to retrain for and implement.
        
         | bergenty wrote:
          | You cannot. But a workaround would be to say something like
          | "generate an alien in three different poses: running,
          | walking, waving".
          | 
          | Then use inpainting to only preserve that pose and generate
          | new content around it. It's definitely not perfect.
        
           | londons_explore wrote:
           | You can do better than this. Draw/generate your character.
           | 
           | Then put that at the side of a transparent image, and use as
           | the prompt, "Two identical aliens side by side. One is
           | jumping"
        
       | can16358p wrote:
        | So can we now legally remove the "color blocks" watermark or
        | not?
        | 
        | What about generating NFTs? It was explicitly prohibited
        | during the previous period; now there is no mention of it.
        | Given that there's no mention and commercial use is allowed,
        | I think it's permitted, but because it was an explicitly
        | forbidden use case before, I want to be sure whether it can
        | be used or not.
        | 
        | Regardless, I'm excited to see what possibilities it opens.
        
         | gwern wrote:
         | Another user saying that OA has said it's OK to remove the
         | watermark:
         | https://www.reddit.com/r/dalle2/comments/w3qsxd/dalle_now_av...
         | 
         | The commercial use language appears pretty clear to me to allow
         | NFTs. (But note the absence of any discussion of _derivative_
         | works...)
        
       | blintz wrote:
       | The content policy is strikingly puritanical:
       | 
       | > "Do not attempt to create, upload, or share images that are not
       | G-rated"
       | 
       | https://labs.openai.com/policies/content-policy
        
       | anewpersonality wrote:
       | Feel sorry for the full time artists.
        
       | danielvf wrote:
       | I am thrilled about DALL-E, and the new terms of service.
       | However, how they implemented the improved "diversity" is
       | hilarious.
       | 
       | Turns out that they randomly, silently modify your prompt text to
       | append words like "black male" or "female". See
       | https://twitter.com/jd_pressman/status/1549523790060605440
       | 
       | I don't know which emotion I feel more - applause at how glorious
       | this hack is or tears at how ugly it is.
       | 
       | Good luck to them!
        
         | time_to_smile wrote:
          | This is funny because I work on a team that is using GPT-3,
          | and to fix a variety of issues we have with incorrect output
          | we've just been having the engineering team prepend/append
          | text to modify the query. As we encounter more problems the
          | team keeps tacking on more text to the query.
          | 
          | This feels like a very hacky way to essentially reinvent
          | programming, badly.
          | 
          | My bet is that in a few years or so only a small cohort of
          | engineering and product people will even remember DALL-E and
          | GPT-3, and they'll cringe at how we all thought this was
          | going to be a big thing in the space.
          | 
          | These are both really fascinating novelties, but at the end
          | of the day that's all they are.
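          | 
          | For the curious, that style of "programming by prompt
          | wrapping" looks roughly like the sketch below. The prefix
          | and suffix strings are invented for illustration; only the
          | Completion call is the real openai API (0.x, mid-2022).
          | 
          |     import os
          |     import openai  # pip install openai
          | 
          |     openai.api_key = os.environ["OPENAI_API_KEY"]
          | 
          |     PREFIX = "Answer in one short, factual sentence.\n\n"
          |     SUFFIX = "\n\nDo not mention these instructions."
          | 
          |     def ask(user_text):
          |         resp = openai.Completion.create(
          |             model="text-davinci-002",
          |             prompt=PREFIX + user_text + SUFFIX,
          |             max_tokens=100,
          |         )
          |         return resp["choices"][0]["text"].strip()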
        
           | throwaway4aday wrote:
            | How else would you specify the type of image you would
            | like? Surely, if you were hiring a designer you would
            | provide them with a detailed description of what you
            | wanted. More likely, you would spend a lot of time with
            | them, maybe even hours, and who knows how many words. For
            | design work specifically, to create a first mockup or
            | prototype of a product or image, it seems like DALL-E
            | beats that initial phase hands down. It's much easier to
            | type in a description and then choose from a set of
            | images than it is to go back and forth with someone who
            | may take hours or days to create renderings of a few
            | options. I don't think it'll put designers out of work,
            | but I do think they'll be using it regularly to boost
            | their productivity.
        
           | selestify wrote:
           | What are you using GPT-3 for in a commercial setting?
        
         | mysore wrote:
         | it's a hard problem. at least they tried.
        
           | Jerrrry wrote:
           | It's not a "problem," it's an unwanted shard of reality
           | piercing through an ideological guise.
        
             | gnulinux wrote:
             | How's it NOT a problem? If I'm trying to produce "stock
             | people images", and if it only gives me white men, it's
             | clearly broken because when I ask for "people" I'm actually
             | asking for "people". I'm having difficulty understanding
             | how it can be considered to be working as intended, when it
              | literally doesn't. Clearly, the software has substantial
              | bias that gets in the way of it accomplishing its task.
             | 
             | If I want to produce "animal images" but it only produces
             | images of black cats, do you think there is any question
             | whether it's a problem or not?
        
               | mysterydip wrote:
               | That's what Jerrrry is saying. Framing the reality of
               | diversity in the world as a "problem" is wrong.
        
               | ceeplusplus wrote:
               | Black people comprise 12.4% of the US population, yet
               | they are represented at substantially above that in
               | "OpenAI"'s "bias removal" process. Clearly it has, as you
               | put it, substantial bias that gets in the way of
               | accomplishing its task.
        
               | Jerrrry wrote:
               | That is clearly overfitting due to unrepresentative
               | training data.
               | 
               | The "issue" is a different one: that training data - IE,
               | reality, has _unwanted_ biases in it, because reality is
               | biased.
               | 
               | Producing images of men when prompting for "trash
               | collecting workers" should not be much of a surprise: 99%
               | of garbage collection/refuse is handled by men. I doubt
               | most will consider this a "problem," because of one's own
               | bias, nobody cares about women being represented for a
               | "shitty" job.
               | 
                | But ask for a picture of CEOs, and then act surprised
                | when most images are of white men? Only outrage, when
                | proportionally, CEOs are, on average, white men.
               | 
               | The "problem" arises when we use these tools to make
               | decisions and further affect society - it has the obvious
               | issue of further entrenching stereotypical associations.
               | 
               | This is not that. Asking DALLE for a bunch of football
               | players, would expectedly produce a huddled group of
               | black men. No issue, because the NFL are
               | disproportionately black men. No outrage, either.
               | 
                | Asking DALLE for a group of criminals, likewise,
                | produces a group of black men. Outrage! Except
                | statistically this is not a surprise, as a
                | disproportionate share of criminals are black men.
               | 
               | The "problem" is with reality being used as training
               | data. The "problem" is with our reality, not the tooling.
               | 
               | Except in the cases where these toolings are being used
               | to affect society - the obvious example being insurance
               | ML algorithms. et al - we should strive to fix the issues
               | present in reality, not hide them with handicapped
               | training data, and malformed inputs.
        
               | TomWhitwell wrote:
               | In the UK... "The Environmental Services Association, the
               | trade body, said that only 14 per cent of the country's
               | 91,300 waste sector workers were female." So 2x dall-e
               | searches should produce 1.2 women.
        
               | CuriousSkeptic wrote:
               | > Asking DALLE for a bunch of football players, would
               | expectedly produce a huddled group of black men
               | 
                | I think, for about 95% of the world, football is
                | synonymous with soccer. It's kind of interesting that
                | you take this particular example to represent what
                | reality looks like statistically.
        
               | less_less wrote:
               | > This is not that. Asking DALLE for a bunch of football
               | players, would expectedly produce a huddled group of
               | black men. No issue, because the NFL are
               | disproportionately black men. No outrage, either.
               | 
               | This is not great. Only about 57% of NFL players are
               | black, and the percentage is more like 47% among college
               | players. It would be better to at least reflect the
               | diversity of the field, even if you don't think it should
               | be widened in the name of dispelling stereotypes.
               | 
               | > Asking DALLE for a group of criminals, likewise,
               | produces a group of black men. Outage! Except
               | statistically, this is not a surprise, as a
               | disproportionate amount of criminals are black men.
               | 
               | Only about 1/3 of US prisoners are black. (Not quite the
               | same as "criminals" but of course we don't always know
               | who is committing crimes, only who is charged or
               | convicted.) That's disproportionate to their population,
               | but it's not even close to a majority. If DALLE were to
               | exclusively or primarily return images of black men for
               | "criminals", then it would be reinforcing a harmful
               | stereotype that does not reflect reality.
        
             | stuckinhell wrote:
             | Everything is an ideological war zone now. That's the world
             | we live in now.
        
             | Fnoord wrote:
              | Perhaps it's a problem you don't care about?
        
             | ketzo wrote:
             | serious question: in what way is that not a "problem?"
        
               | TheFreim wrote:
               | It's not a problem in a few ways, let me know what you
               | think (feel free to ask for clarification).
               | 
                | 1. The training data would've been the best way to get
                | organic results; the input is where it'd be necessary
                | to have representative samples of populations.
               | 
               | 2. If the reason the model needs to be manipulated to
               | include more "diversity" is that there wasn't enough
               | "diversity" in the training set then its likely the
               | results will be lower quality
               | 
               | 3. People should be free to manipulate the results how
               | they wish, a base model without arbitrary manipulations
               | of "diversity" would be the best starting point to allow
               | users to get the appropriate results
               | 
               | 4. A "diverse" group of people depends on a variety of
               | different circumstances, if their method of increasing it
               | is as naive as some of the are claiming this could result
               | in absurdities when generating historical images or
               | images relating to specific locations/cultures where
               | things will be LESS representative
        
               | bobcostas55 wrote:
               | Well, it's a problem for the ideology.
        
           | kache_ wrote:
           | While their heart is in the right place, I'd like to
           | challenge the idea that certain groups are so fragile that
           | they don't understand that historically, there are more
           | pictures of certain groups doing certain things.
           | 
           | It's a hard problem for sure. But remember, the bias ends
           | with the user using the tool. If I want a black scientist, I
           | can just say "black scientist".
           | 
            | Let _me_ be mindful of the bias, until we have a generally
            | intelligent system that can actually do it. I'm generally
            | intelligent too, you know.
        
             | micromacrofoot wrote:
             | Historically this is true, but it also seems dangerous to
             | load up these algorithms with pure history because they'll
             | essentially codify and perpetuate historical problems.
        
             | UmYeahNo wrote:
             | >But remember, the bias ends with the user using the tool.
             | If I want a black scientist, I can just say "black
             | scientist".
             | 
              | That is a really, _really_, narrow viewpoint. I think what
             | people would prefer is that if you query "Scientist" that
             | the images returned are as likely to be any combination of
             | gender and race. It's not that a group is "fragile", it's
             | that they have to specify race and gender at all, when that
             | specificity is not part of the intention. It seems that
             | they recognize that querying "Scientist" will predominantly
             | skew a certain way, and they're trying in some way to
             | unskew.
             | 
             | Or, perhaps, you'd rather that the query be really, really
             | specific? like: "an adult human of any gender and any race
             | and skin color dressed in a laboratory coat...", but I
             | would much rather just say "a scientist" and have the
             | system recognize that _anyone_ can be a scientist.
             | 
             | And then if I need to be specific, then I would be happy to
             | say "a black-haired scientist"
        
               | numpad0 wrote:
                | Kind of funny that NN tech is supposed to construct
                | some higher-dimensional understanding, yet it
                | realistically cannot be expected to generate a gender-
                | and race-indeterminate portrayal of a scientist.
        
               | kache_ wrote:
               | This is a problem with generative models across the
               | board. It's important that we don't skew our perceptions
               | by GAN outputs as a society, so it's definitely good that
               | we're thinking about it. I just wish that we had a
               | solution that solved across the class of problems
               | "Generative AI feeds into itself and society (which is in
               | a way, a generative AI), creating a positive feedback
               | loop that eventually leads to a cultural freeze"
               | 
               | It's way bigger than just this narrow race issue the
               | current zeitgeist is concerned about.
               | 
               | But I agree, maybe I should skew to being optimistic that
               | at least we're _trying_
        
               | throwaway4aday wrote:
               | Have you seen the queries that are used to generate
               | actually useful results rather than just toy
               | demonstrations? They look a lot more like your first
               | example except with more specificity. It'd be more like
               | "an adult human of any gender and any race and skin color
               | dressed in a laboratory coat standing by a window holding
               | a beaker in the afternoon sun. 1950s, color image, Canon
               | 82mm f/3.6, desaturated and moody." so if instead you are
               | looking for an image with a person of a specific
               | ethnicity or gender then you are for sure going to add
               | that in along with all of the details. If you are instead
               | worried about the bias of the person choosing the image
               | to use then there is nothing short of restricting them to
               | a single choice that will fix that and even in that case
               | they would probably just not use the tool since it wasn't
               | satisfying their own preferences.
        
           | protonbob wrote:
           | Honestly I would rather that they not try. I don't understand
           | why a computer tool has to be held to a political standard.
        
             | daemoens wrote:
             | It's not a political standard though. There is actual
             | diversity in this world. Why wouldn't you want that in your
             | product?
        
               | [deleted]
        
               | mensetmanusman wrote:
               | Fix the data input side, not the data output side. The
               | data input side is slowly being fixed in real time as the
               | rest of the world gets online and learns these methods.
        
               | throwaway4aday wrote:
               | In a sane world we would be able to tack on a disclaimer
               | saying "This model was trained on data with a majority
               | representation of caucasian males from Western English
               | speaking countries and so results may skew in that
               | direction" and people would read it and think "well, duh"
               | and "hey let's train some more models with more data from
               | around the world" instead of opining about systemic
               | racism and sexism on the internet.
        
               | astrange wrote:
               | That wouldn't necessarily fix the issue or do anything. A
               | model isn't a perfect average of all the data you throw
               | into its training set. You have to actually try these
               | things and see if they work.
        
             | norwalkbear wrote:
              | I agree, the trust is broken now. I'm going to skip any
              | AI that pulls that crap.
        
             | Jerrrry wrote:
              | There are legitimate reasons to reduce externalizations
              | of society's innate biases.
             | 
             | A mortgage AI that calculates premiums for the public
             | shouldn't bias against people with historically black
             | names, for example.
             | 
              | This problem is harder to tackle because it is difficult
              | to expose and redesign the "latent space" that results
              | in these biases; it's difficult to massage the ML algos
              | to identify and remove the pathways that produce the
              | bias.
             | 
             | It's simply much easier to allow the robot to be
             | bias/racist/reflective of "reality" (its training data),
             | and add a filter / band-aid on top; which is what they've
             | attempted.
             | 
             | when this is appropriate is the more cultured question; I
             | don't think we should attempt to band-aid these models, but
             | for more socially-critical things, it is definitely
             | appropriate.
             | 
              | It's naive at either extreme: do we reject reality and
              | substitute our own? Or do we call our substitute
              | "reality", and hope the zeitgeist follows?
        
               | ceeplusplus wrote:
               | That's great, but by doing so you are also inadvertently
               | favoring, in your example, the people with black names.
               | For example, Chinese people save on average, 50 times
               | more than Americans according to the Fed [1]. That would
               | mean they would generally be overrepresented in loan
               | approvals because they have a better balance sheet. Does
               | that necessarily mean that Americans are discriminated
               | against in the approval process? No.
               | 
               | My question to you is: is an algorithm that takes no
               | racial inputs (name, race, address, etc) yet still
               | produces disproportionate results biased or racist? I say
               | no.
               | 
               | [1] https://files.stlouisfed.org/files/htdocs/publication
               | s/es/08...
        
               | Jerrrry wrote:
               | I would agree that it is not.
               | 
                | The government, and many people, have moved the
                | definition and the goalposts, so that anything with
                | the end result of non-proportional uniformity can be
                | labeled and treated as bias.
               | 
               | Ultimately it is a nuanced game; is discriminating
               | against certain clothing or hair-styles racist? Of
               | course. Yet, neither of those are explicitly tied to
               | one's skin color or ethnicity, but are an indirect
               | associative trait because of culture.
               | 
                | In America, we have intentionally muddied the waters
                | of demarcation between culture and race, and we are
                | starting to see the cost of that.
        
               | mh- wrote:
               | _> A mortgage AI that calculates premiums for the public
               | shouldn 't bias against people with historically black
               | names, for example._
               | 
               | That's a great example, thanks. Also, I hope the teams
               | working on that come up with a different solution...
        
               | [deleted]
        
           | [deleted]
        
         | tablespoon wrote:
         | > Turns out that they randomly, silently modify your prompt
         | text to append words like "black male" or "female".
         | 
         | I wonder what the distribution of those modifications is?
        
           | Hard_Space wrote:
           | Today, when DALL-E was still free, my Dad asked me to try a
           | prompt about the Buddha sitting by a river, contemplating. I
           | did about 4 prompt variations, and one of them was an Asian
           | female, if that gives any idea about the frequency (I should
           | note that the depiction was of a young, slim, and attractive
           | female Buddha, so I'm not sure they have the bias thing
           | licked just yet).
        
           | speedgoose wrote:
            | In my limited testing, diversity in ethnicities was
            | achieved, but it wasn't realistic given the context. I
            | also got a few androgynous people when I asked for a male
            | or a female and another gender was appended.
        
         | Invictus0 wrote:
         | A dumb solution to a dumber problem.
        
         | tshaddox wrote:
         | That Twitter thread is full of people saying "yeah that doesn't
         | seem to be true at all" so I'm hesitant to jump to conclusions
         | even if we're deciding to believe random tweets.
        
         | causi wrote:
          | Interesting. Considering this is now a paid product, is
          | modifying user input covered by their ToS? If I were
          | spending a lot of money on it I'd be rather annoyed that my
          | input was being silently polluted.
        
           | zikduruqe wrote:
           | Don't spend money. Use https://www.craiyon.com
        
             | scott_s wrote:
             | _[shudder]_
             | 
              | I tried the first whimsical, benign thing I could think
              | of: "indiana jones eating spaghetti." The results are
              | clearly recognizable as that. But they are also a
              | kaleidoscope of body horror; an Indiana Jones monster
              | melted into Cthulhu forms, inhaling plates that are
              | slightly _not_ spaghetti.
        
             | bhaney wrote:
             | This produces dramatically worse results in my experience.
        
               | minimaxir wrote:
                | Not worse, but different. It depends on the prompt,
                | but DALL-E mini/mega seems to do better than DALL-E 2
                | for certain types of absurd prompts, such as the ones
                | in /r/weirddalle
        
               | causi wrote:
               | Yes, there are very sharp lines where it does and doesn't
               | understand. It understands color and gender but not
               | materials. I got very good outputs for "blue female
               | Master Chief" but "starship enterprise made out of candy"
               | was complete garbage.
        
               | elcomet wrote:
               | Definitely worse-quality. Maybe more diverse for some
               | prompts yeah.
        
             | kuprel wrote:
             | This one is faster, I ported it
             | https://replicate.com/kuprel/min-dalle
        
               | minimaxir wrote:
               | Additionally, it's also open-sourced on GitHub and can be
               | self-hosted, with easy instructions to do so:
               | https://github.com/kuprel/min-dalle
        
             | practice9 wrote:
             | Thankfully it doesn't introduce any researcher bias,
             | doesn't ban people from using it on the basis of country,
             | doesn't use your personal data like phone number...
             | 
             | And the best of all - it does have a meme community around
             | it, and you can always donate if you feel it adds value to
             | your life
        
           | kingkawn wrote:
           | The racist pollution came long before this product was a
           | glimmer in our eye.
        
           | tptacek wrote:
           | Your input isn't being polluted by this any more than it is
           | when the tokens in it are ground up into vectors and
           | transformed mathematically. You just have an easier time
           | understanding this transformation.
        
             | throwuxiytayq wrote:
              | Obviously, it's polluted. Indisputably. In a mathematical
              | sense, an extra (black box) transformation is performed
              | on the input to the model. In a practical sense (e.g. if
              | you're researching the model), this is like having dirty
              | laboratory tools -- all measurements are slightly off. The
             | presumption by OpenAI is that the measurements are _off in
             | the correct way_.
             | 
             | I'm interested in using Dall-E commercially, but I think
             | some competitor offering sampling with raw input will have
             | a better chance at my wallet.
        
               | tptacek wrote:
        
               | throwuxiytayq wrote:
               | Yeah man, but literally the entire point of this AI
               | picture generator is that it's, like, super _accurate_ at
               | rendering the prompt, and stuff.
               | 
               | I don't understand the relevance of the black box's
               | scrutability - _I just want to play with the black box_.
               | I am interested in increasing my understanding of the
                | black box, not of a trust-me-it's-great-our-intern-
               | steve-made-it black box derivative.
        
               | tptacek wrote:
               | You should make your own black boxes then. By all means,
               | send your dollars to whatever service passes your purity
               | test; I'm just saying that the idea that DALL-E is
               | "polluting" your input is risible. It's already polluting
               | your data at, like, a subatomic level, at
               | dimensionalities it hadn't even occurred to you to
               | consider, and at enormous scale.
        
         | bantou_41 wrote:
         | Diversity = black now? That's even more racist.
        
           | xyzzyz wrote:
           | Diversity has meant exactly that all the way since Bakke.
        
         | [deleted]
        
         | konfusinomicon wrote:
         | As far as I can tell, they also concatenate "On LSD" to every
         | prompt.
        
       | DecayingOrganic wrote:
       | Since many people will start generating their first images soon,
       | be sure to check out this amazing DALL-E prompt engineering book
       | [0]. It will help you get the most out of DALL-E.
       | 
       | [0]: https://dallery.gallery/wp-content/uploads/2022/07/The-
       | DALL%... (PDF)
        
         | ru552 wrote:
         | nice write up, thanks
        
         | uplifter wrote:
         | Thanks for this! A bit of prompt engineering know-how will help
         | me get the most bang for the buck out of this beta. I also just
         | want to say that dallery.gallery is delightfully clever naming.
        
         | ZeWaka wrote:
         | This is absolutely amazing. Thanks!
        
       | c0decracker wrote:
       | Interesting. I got access a couple of weeks ago (was on the
       | waitlist since the initial announcement) and frankly, as much as
       | I really want to be excited and like it, DALL-E ended up being a
       | bit underwhelming. IMHO the results it produces are often low
       | quality (distorted images, or quite wacky representations of the
       | query). Some styles of imagery are certainly a better fit for
       | being generated by DALL-E, but as far as commercial usage goes,
       | I think it needs a few more iterations and probably an even
       | larger underlying model.
        
         | simonw wrote:
         | This book has some very good, actionable advice on crafting
         | prompts that get better results out of DALL-E:
         | https://dallery.gallery/the-dalle-2-prompt-book/
        
         | andybak wrote:
         | I also got access a couple of weeks ago and I can't fathom how
         | anyone could be underwhelmed by it.
         | 
         | What were you expecting?
        
           | c0decracker wrote:
           | Fundamentally I have two categories of issues with DALL-E,
           | but please don't get me wrong -- I think this is a great
           | demonstration of what is possible with huge models and I
           | think OpenAI's work in general is fantastic. I will most
           | certainly continue using both DALL-E and OpenAI's GPT3.
           | 
           | (1) Between what DALL-E can do today and commercial utility
           | there is a rift, in my opinion. I readily admit that I have
           | not done hundreds of queries (thank you folks for pointing
           | that out, I'll practice more!), but that means there is a
           | learning curve, doesn't it? I can't just go to DALL-E, mess
           | with it for 5-10 minutes, and get my next ad or book cover
           | or illustration for my next project done.
           | 
           | (2) I think DALL-E has issues with faces and the human form
           | in general. The images it produces are often quite repulsive
           | and take the uncanny valley to the next level. I surprised
           | myself when I noticed thinking that the images of humans
           | DALL-E produced lack... soul? Cats and dogs, on the other
           | hand, it handles much better. I've done tests with other
           | subjects -- say cars or machinery -- and it generally
           | performs so-so with them too, often creating
           | disproportionate representations or misplacing chunks. If
           | you query for multiple objects in a scene it quite often
           | melds them together. This is more pronounced in
           | photorealistic renderings; when I query for painting styles
           | it mostly works better. That said, every now and then it
           | does produce a great image, but with this way of arriving at
           | it, how fast will I have to replenish those credits?.. :)
           | 
           | All in all though, I think I am underwhelmed mostly because
           | my initial expectations were off. I am still a fan of DALL-E
           | specifically and GPT3 in general. Now when is GPT4 coming
           | out? :)
        
           | harpersealtako wrote:
           | Dalle seems to only have a few "styles" of drawing that it is
           | actually "good" at. It is particularly strong at these styles
           | but disappointingly underwhelming at anything else, and will
           | actively fight you and morph your prompt into one of these
           | styles even when given an inpainting example of exactly what
           | you want.
           | 
           | It's great at photorealistic images like this:
           | https://labs.openai.com/s/0MFuSC1AsZcwaafD3r0nuJTT, but it's
           | intentionally lobotomized to be bad at faces, and often has
           | an uncanny valley feel in general, like this:
           | https://labs.openai.com/s/t1iBu9G6vRqkx5KLBGnIQDrp (never
           | mind that it's also lobotomized to be unable to recognize
           | characters in general). It's basically as close to perfect
           | as an AI can be at generating dogs and cats, but anything
           | else will be "off" in some meaningful way.
           | 
           | It has a particular sort of blurry, amateur oil painting
           | digital art style it often tries to use for any colorful
           | drawings, like this:
           | https://labs.openai.com/s/EYsKUFR5GvooTSP5VjDuvii2 or this:
           | https://labs.openai.com/s/xBAJm1J8hjidvnhjEosesMZL . You can
           | see the exact problem in the second one with inpainting: it
           | utterly fails at the "clean" digital art style, or drawing
           | anything with any level of fine detail, or matching any sort
           | of vector art or line art (e.g. anime/manga style) without
           | loads of ugly, distracting visual artifacts. Even Craiyon and
           | DALLE-mini outperform it on this. I've tried over 100 prompts
           | to get stuff like that to generate and have not had a single
           | prompt that is able to generate anything even remotely good
           | in that style yet. It seems almost like it has a "resolution"
           | of detail for non-photographic images, and any detail below a
           | certain resolution just becomes a blobby, grainy brush
           | stroke, e.g. this one:
           | https://labs.openai.com/s/jtvRjiIZRsAU1ukofUvHiFhX , the
           | "fairies" become vague colored blobs here. It can generate
           | some pretty ok art in very specific styles, e.g. classical
           | landscape paintings:
           | https://labs.openai.com/s/6rY7AF7fWPb5wWiSH0rAG0Rm , but for
           | anything other than this generic style it disappoints _hard_.
           | 
           | The other style it is ok at is garish corporate clip art,
           | which is unremarkable and there's already more than enough
           | clip art out there for the next 1000 years of our collective
           | needs -- it is nevertheless somewhat annoying when it
           | occasionally wastes a prompt generating that crap because you
           | weren't specific that you wanted "good" images of the thing
           | you were asking for.
           | 
           | The more I use DALLE-2 the more I just get depressed at how
           | much wasted potential it has. It's incredibly obvious they
           | trimmed a huge amount of quality data and sources from their
           | databases for "safety" reasons, and this had huge effects on
           | the actual quality of the outputs in all but the most mundane
           | of prompts. I've got a bunch more examples of trying to get
           | it to generate the kind of art I want (cute anime art, is
           | that too much to ask for?) and watching it fail utterly every
           | single time. The saddest part is when you can see it's got
           | some incredible glimpse of inspiration or creative genius,
           | but just doesn't have the ability to actually follow through
           | with it.
        
             | napier wrote:
             | GPT3 has seen similar lobotomization since its initial
             | closed beta. Current davinci outputs tend to be quite
             | reserved and bland, whereas when I first had the fortunate
              | opportunity to experience playing with it in mid 2020, it
             | often felt like tapping into a friendly genius with access
             | to unlimited pattern recognition and boundless knowledge.
        
               | harpersealtako wrote:
               | I've absolutely noticed that. I used to pay for GPT-3
               | access through AI Dungeon back in 2020, before it got
               | censored and run into the ground. In the AI fiction
               | community we call that "Summer Dragon" ("Dragon" was the
               | name of the AI dungeon model that used 175B GPT-3), and
               | we consider it the gold standard of creativity and
               | knowledge that hasn't been matched yet even 2 years
               | later. It had this brilliant quality to it where it
               | almost seemed to be able to pick up on your unconscious
               | expectations of what you wanted it to write, based purely
               | on your word choice in the prompt. We've noticed that
               | since around Fall 2020 the quality of the outputs has
               | slowly degraded with every wave of corporate censorship
               | and "bias reduction". Using GPT-3 playground (or story
               | writing services like Sudowrite which use Davinci) it's
               | plainly obvious how bad it's gotten.
               | 
               | OpenAI needs to open their damn eyes and realize that a
               | brilliant AI with provocative, biased outputs is better
               | than a lobotomized AI that can only generate advertiser-
               | friendly content.
        
               | visarga wrote:
               | So it got worse for creative writing, but it got much
               | better at solving few-shot tasks. You can do information
               | extraction from various documents with it, for example.
        
               | napier wrote:
                | I mean yes, you're right as far as it goes. However,
               | nothing I am aware of implies technical reasons linking
               | these two variables into a necessarily inevitable trade-
               | off. And it's not only creative writing that's been
               | hobbled; GPT3 used to be an _incredibly promising_
               | academic research tool and given the right approach to
               | prompts could uncover disparate connections between
               | siloed fields that conventional search can only dream of.
               | 
                | I'm eager for OpenAI to wake up and walk back the
               | clumsy corporate censorship, and/or for competitors to
               | replicate the approach and improve upon the original
                | magic without the "bias" obsession tacked on. Real
                | though the challenge of "bias" may be in some
                | scenarios, perhaps a better way to address it would be
                | at the training-data stage, rather than clumsily
                | gluing on an opaque layer of poorly implemented,
                | idealistic censorship that lacks depth (and perhaps,
                | arguably, sincerity).
        
         | arecurrence wrote:
         | I suspect you simply need to use it more with a lot more
         | variation in your prompts. In particular, it takes style
         | direction and some other modifiers to really get rolling. Run
         | at least a few hundred prompts with this in mind. Most will be
         | awful output... but many will be absolute gems.
         | 
         | It has, honestly, completely blown me away beyond my wildest
         | imagination of where this technology would be at today.
        
         | [deleted]
        
         | dereg wrote:
         | I felt the same way. If anything, I realized how soulless and
         | uninteresting faceless art is. Dall-E 2 goes out of its way to
          | make terrible faces for, I'm guessing, deepfake reasons?
        
       | [deleted]
        
       | choppaface wrote:
       | A free alternative:
       | 
       | https://huggingface.co/spaces/dalle-mini/dalle-mini
       | 
       | Reminder that the OpenAI team cited safety concerns to justify
       | not releasing the weights. Now they're charging, while the above
       | link's GPU time is being paid for by investor dollars. I guess
       | sama must
       | be hurting if he can only afford OpenAI credit packs for
       | celebrities and his friends.
        
       | softwaredoug wrote:
       | Surprised by the lack of comments on the ethics of DALL-E being
       | trained on artists content whereas copilot threads are chock full
       | of devs up in arms over models trained on open source code. Isn't
       | it the same thing?
        
       | MWil wrote:
       | I've been on the waitlist since April 16th. Would have loved to
       | have played around with the alpha but now clearly my ability to
       | experiment and learn to use the system to cut down on expenses is
       | extremely limited.
        
       | O__________O wrote:
       | Two questions:
       | 
       | (1) Any opinions on if removing the watermark is possible? Is
       | doing so against the terms of service?
       | 
       | (2) It appears the output is still 1024x1024 - what are the
       | options to upscale the resolution? For example, would OpenCV
       | super resolution work?
        
         | jeanlucas wrote:
         | It is possible; they confirmed on Discord that you can remove
         | the watermark.
         | 
         | Yep... the output resolution is an issue; I'd pay if an
         | upgrade were offered.
        
           | O__________O wrote:
           | Annoying that the watermark is even inserted if removing it
           | is allowed. Imagine if Adobe did that.
           | 
           | Here's more information on super resolution options beyond
           | what Adobe already offers:
           | 
           | (1) List of current options for super resolution:
           | 
           | https://upscale.wiki/wiki/Different_Neural_Networks
           | 
           | (2) Older example of one way to benchmark:
           | 
           | https://docs.opencv.org/4.x/dc/d69/tutorial_dnn_superres_ben.
           | ..
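           | 
           | For what it's worth, the OpenCV route is only a few lines --
           | a minimal sketch assuming opencv-contrib-python is installed
           | and an EDSR_x4.pb model file has been downloaded per the
           | tutorial above:
           | 
           |     import cv2
           | 
           |     # dnn_superres ships with opencv-contrib-python
           |     sr = cv2.dnn_superres.DnnSuperResImpl_create()
           |     sr.readModel('EDSR_x4.pb')   # pretrained weights
           |     sr.setModel('edsr', 4)       # algorithm, scale factor
           | 
           |     img = cv2.imread('dalle_output_1024.png')
           |     up = sr.upsample(img)        # 1024x1024 -> 4096x4096
           |     cv2.imwrite('dalle_output_4096.png', up)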
        
       | moron4hire wrote:
       | How do you interface with DALL-E?
       | 
       | For MidJourney I was painfully surprised to find that everything
       | is done through chat messages on a Discord server.
       | 
       | I'm not a paid member, so I have to enter my prompts in public
       | channels. It's extremely easy to lose your own prompts in the
       | rapidly flowing stream of prompts going on. I can kind of see why
       | they did it that way--maybe, if I squint really hard--to try to
       | promote visibility and community interaction, but it's just not
       | happening. It's hard enough to find my own images, to say
       | nothing of following what someone else is doing. This is
       | literally the
       | worst user experience I have ever had with a piece of software.
       | 
       | There are dozens of channels. It's so spammy, doing it through
       | Discord. It's constantly pinging new notifications and I have to
       | go through and manually mute each and every one of the channels.
       | Then they open a few dozen more. Rinse. Repeat.
       | 
       | I understand paid users can have their own channels to generate
       | images, but I really don't see the point in paying for it when,
       | even subtracting the firehose of prompts and images, it's still
       | an objectively shitty interface to have to do everything through
       | Discord chat messages.
        
       | neya wrote:
       | I'm curious to know - does the community have any open source
       | alternatives to DALL.E? For an initiative named OpenAI, keeping
       | their source code and models closed behind a license is bullshit
       | in my opinion.
        
         | gwern wrote:
         | EAI/Emad/et al's 'Stable Diffusion' model will be coming out in
         | the next month or so. I don't know if it will hit DALL-E 2
         | level but a lot of people will be using it based on the during-
         | training samples they've been releasing on Twitter.
        
         | minimaxir wrote:
         | The best open-source-but-actually-can-be-run-on-simple-infra
         | analogous to DALL-E 2 is min-dalle:
         | https://github.com/kuprel/min-dalle
        
         | arecurrence wrote:
         | LAION is working on open source alternatives. There's a lot of
         | activity in their discord and they have amassed the necessary
         | training data but I am uncertain as to whether they have
         | obtained the funding needed to deliver fully trained models.
         | Phil Wang created initial implementations of several papers
         | including Imagen and Parti in his GitHub account, e.g.:
         | https://github.com/lucidrains/DALLE2-pytorch
        
       | draw_down wrote:
        
       | selimnairb wrote:
       | I like how everyone's face is rendered by DALL-E to look either
       | like a still from a David Lynch film, or have teeth and hair
       | coming out of weird places.
        
       | pawelduda wrote:
       | That's disappointing, given that up until this point you could
       | have 50 free uses per 24h. I expected it to get monetized
       | eventually, but not so fast and so drastically. Well, I still
       | had my fun, and I have to say the creations are so good it's
       | often mind-blowing that there's an AI behind it.
        
         | mysore wrote:
         | they're a non-profit so the price is probably still dirt cheap
        
           | ajafari1 wrote:
           | Not correct. They have a for-profit entity now. That's why
           | there is a huge incentive to monetize. Any for-profit
           | investment gain is capped at 100x, with the rest required to
           | go to their nonprofit. This commercialization is just as I
           | predicted in my substack post 2 days ago that hit the front
           | page of Hacker News: https://aifuture.substack.com/p/the-ai-
           | battle-rages-on
        
         | dougmwne wrote:
         | Honestly, it is probably just that expensive to run. You can't
         | expect someone to hand you free compute of significant value
         | and directly charging for it is a lot better than other things
         | they could do.
        
       | bulbosaur123 wrote:
        
         | hhmc wrote:
         | So you actually _wanted_ images that perpetuate the biases of
         | the world?
        
           | Geonode wrote:
           | Reducing bias means affecting the data, instead of letting
           | the end user just choose an appropriate image generated by a
           | clean data set.
        
           | illwrks wrote:
           | I thought the same thing, but I think the commenter is
           | making a joke, though I could be wrong.
           | 
           | I think they are suggesting that things like this (neural
           | nets etc) work using bias, and by removing "bias" the
           | developers are making the product worse.
           | 
           | It's a very sh!t comment if it's not a joke.
        
             | aloisdg wrote:
             | Just to be sure. Does "OC" here mean Original Comment?
        
               | illwrks wrote:
               | Typo, now fixed.
        
           | minimaxir wrote:
           | Unfortunately, the method OpenAI may be using to reduce bias
           | (by adding words to the prompt unknown to the user) is a
           | naive approach that can affect images unexpectedly and
           | outside of the domain OpenAI intended:
           | https://twitter.com/rzhang88/status/1549472829304741888
           | 
           | I have also seen some cases where the bias correction may
           | not be working at all, so who knows. And that's why
           | transparency is important.
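           | 
           | To make the suspected mechanism concrete, the naive approach
           | probably amounts to something like this sketch (purely
           | hypothetical -- the word lists, trigger logic, and
           | probability below are invented for illustration, not
           | OpenAI's actual implementation):
           | 
           |     import random
           | 
           |     # invented lists/weights, for illustration only
           |     QUALIFIERS = ['black', 'female', 'asian', 'hispanic']
           |     PERSON_WORDS = {'ceo', 'doctor', 'nurse',
           |                     'engineer', 'person'}
           |     SPECIFIED = {'man', 'woman', 'male', 'female',
           |                  'black', 'white', 'asian'}
           | 
           |     def augment(prompt: str, p: float = 0.5) -> str:
           |         words = set(prompt.lower().replace(',', ' ').split())
           |         # only rewrite prompts about a person that give
           |         # no race/gender of their own
           |         if words & PERSON_WORDS and not words & SPECIFIED:
           |             if random.random() < p:
           |                 return prompt + ', ' + random.choice(QUALIFIERS)
           |         return prompt
           | 
           |     print(augment('a CEO giving a presentation, digital art'))
           | 
           | Anything bolted onto the end of the prompt gets rendered
           | like any other token, so it can bleed into parts of the
           | image that have nothing to do with the person -- which is
           | the kind of unexpected effect the tweet above is about.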
        
             | CobrastanJorji wrote:
             | What a fascinating hack. I mean, yeah, naive and simplistic
             | and doesn't really do anything interesting with the model
             | itself, but props to the person who was given the "make
             | this more diverse" instruction and said "okay, what's the
             | simplest thing that could possibly work? What if I just
             | append some races and genders onto the end of the query
             | string, would that mostly work?" and then it did! Was it a
             | GOOD idea? Maybe not. But I appreciate the optimization.
        
             | kmeisthax wrote:
             | This sounds like something that could backfire _very badly_
             | on certain prompts.  "person eating a watermelon" for
             | example.
        
           | bulbosaur123 wrote:
           | Yes, I did. I want it to show the world as it is, not as
           | people want it to be.
        
           | scifibestfi wrote:
           | How do you remove bias as long as humans are in the loop?
           | Aren't they just swapping one bias for their own?
        
       | brycethornton wrote:
       | I'm blown away by this:
       | 
       | "Starting today, users get full usage rights to commercialize the
       | images they create with DALL*E, including the right to reprint,
       | sell, and merchandise. This includes images they generated during
       | the research preview."
       | 
       | I assumed this was going to be the sticking point for wider usage
       | for a long time. They're now saying that you have full rights to
       | sell Dall-E 2 creations?
        
         | vlunkr wrote:
         | Is the lesson here that these images are worth nothing so they
         | lose nothing by giving them away?
        
         | [deleted]
        
         | nutanc wrote:
         | And I just used it to create cover art for a book published
         | on Amazon :)
         | 
         | https://twitter.com/nutanc/status/1549798460290764801?s=20&t...
        
           | pqdbr wrote:
           | What was your prompt?
        
             | nutanc wrote:
             | "girl with a cap standing next to a shadow man having a
             | speech bubble, digital art"
        
         | pferdone wrote:
         | Does DALL-E create different outputs for the same input? How
         | does ownership work there?
        
           | flatiron wrote:
           | Yes, it will. It starts from random noise and keeps
           | refining the image until it matches the prompt, so the same
           | input produces different outputs on each run.
        
           | minimaxir wrote:
           | Not only that, but you can also upload an image (that doesn't
           | depict a real person) and generate variations of it without
           | providing a prompt.
        
         | berberous wrote:
         | I think they are reacting to competition. MidJourney is
         | amazing, was easier to get into, gives you commercial rights,
         | and frankly I found more fun to use and even better output in
         | most instances.
        
           | napier wrote:
           | The only thing I don't like about MidJourney is the Discord
           | based interface. I think I can grok why Dave chose this route
           | as it bakes in an active community element and allows users
           | to pick up prompt engineering techniques osmotically... but
           | I'd prefer a clean DALL-E style app and cli / api access.
        
             | berberous wrote:
             | In case you don't know, you can at least PM the MidJourney
             | bot so you have an uncluttered workspace.
             | 
              | It's clearly personal preference, but I loathe Discord
             | but love it for MidJourney. As you said, there's an
             | interactive element where I see other people doing cool
             | things and adapting part of their prompts and vice versa.
             | It really is fun. And when you do it in a PM, you have all
             | your efforts saved. DALL-E is pretty clunky in that you
             | have to manually save an image or lose it once your history
             | rolls off.
        
               | napier wrote:
               | Thanks. Yeah fair point; I haven't ponied up for a
               | subscription yet so am still stuck in public channels and
                | often find my generations get lost in the stream. I
                | imagine you're right, and having the PM option would
                | change the experience drastically for the better,
                | albeit still within Discord's visually chaotic
                | environment.
        
           | davedx wrote:
           | MidJourney seems a little less all-out commercial. The way
           | everyone's creations are in giant open Discord channels is
           | great too
        
           | stoicjumbotron wrote:
           | Really hope I get an invite for MidJourney soon. Been on the
           | waitlist since March :(
        
             | ozmodiar wrote:
             | Midjourney is in open beta now. Just go to their site and
             | you can get started right away. I got in and I wasn't even
             | on their waiting list.
        
           | pitzips wrote:
           | Midjourney recently changed their terms of service and now
           | the creators own the image and give a license back to
           | Midjourney. Pretty cool.
        
           | jaggs wrote:
           | nightcafe.studio is also free and good. Very good.
        
           | MatthiasPortzel wrote:
           | MidJourney definitely struggles more with complex prompts
           | from what I saw. If you like the output more, that's
           | subjective, but I think DALL*E is the leader in the space by
           | a wide margin.
        
             | berberous wrote:
             | I think both have strengths and weaknesses, but I don't
             | disagree DALL-E in most instances is technically better at
             | matching prompts. But I often enjoyed, artistically, the
             | results of MidJourney more; it just felt fun to use and
             | explore.
        
           | skybrian wrote:
           | Don't they both give you commercial rights now?
           | 
           | I have access to both and they're good for different things.
           | DALL-E seems somewhat more likely to know what you mean.
           | Midjourney seems better for making interesting fantasy and
           | science fiction environments.
           | 
           | For comparison, I tried generating images of accordions.
           | Midjourney doesn't really understand that an accordion has a
           | bellows [1]. DALL-E manages to get the right shape much of
           | the time, if you don't look too closely: [2], [3]. Neither of
           | them knows the difference between piano and button
           | accordions.
           | 
           | Neither of them can draw a piano keyboard accurately, but
           | DALL-E is closer if you don't look too hard. (The black notes
           | aren't in alternating groups of two and three.)
           | 
           | Neither of them understands text; text on a sign will be
           | garbled. Google's Parti project can do this [4], but it's not
           | available to the public.
           | 
           | I expect DALL-E will have many people sign up for occasional
           | usage, because if you don't use it for a few months, the free
           | credits will build up. But Midjourney's pricing seems better
           | if you use it every day?
           | 
           | [1] https://www.reddit.com/r/Accordion/comments/uuwrbj/midjou
           | rne...
           | 
           | [2] https://www.reddit.com/r/Accordion/comments/vz9zxw/dalle_
           | sor...
           | 
           | [3] https://www.reddit.com/r/Accordion/comments/w0677q/accord
           | ion...
           | 
           | [4] https://parti.research.google/
        
         | minimaxir wrote:
         | Previously, OpenAI asserted they owned the generated images, so
         | the new licensing is a shift in that aspect. GPT-3 also has a
         | "you own the content" clause.
         | 
         | Of course, that clause won't deter a third party from filing a
         | lawsuit against you if you commercialize a generated image
         | _too_ close to an existing work, as the copyright status of
         | AI-generated content still hasn't been legally tested.
        
           | LegitShady wrote:
           | As far as I can tell they still own the images they just
           | license your use of them commercially.
        
           | pornel wrote:
           | AFAIK only people can own copyright (the monkey selfie case
           | tested this), and machine-generated outputs don't count as
           | creative work (you can't write an algorithm that generates
           | every permutation of notes and claim you own every song[1]),
           | so DALL-E-generated images are most likely copyright-free. I
           | presume OpenAI only relies on terms of service to dictate
           | what users are allowed to do, but they can't own the images,
           | and neither can their users.
           | 
           | [1]: https://felixreda.eu/2021/07/github-copilot-is-not-
           | infringin...
        
             | TaylorAlexander wrote:
             | The monkey selfie was not derived from millions of existing
             | works, and that is the difference. If an artist has a well-
             | known art style, and this algorithm was trained on it and
             | can copy that style, would the artist have grounds to sue?
             | I don't know.
        
               | l33t2328 wrote:
               | If I write a song am I not deriving it from the existing
               | works I've been exposed to?
        
               | TaylorAlexander wrote:
               | Sure but if you just release a basic copy of a Taylor
               | Swift song you will get sued to oblivion. So the law
               | seems (IANAL) to care about how similar your work is to
               | existing works. DALL-E does not seem capable of showing
               | you the work that influenced a result, so users will have
               | no idea if a result might be infringing. What this means
               | to me is that with many users, some of the results would
               | be legally infringing.
        
               | Melting_Harps wrote:
               | > If an artist has a well-known art style, and this
               | algorithm was trained on it and can copy that style,
               | would the artist have grounds to sue? I don't know.
               | 
               | While nothing has been commercialized yet on the DALLE2
               | subreddit, I know that it can do Dave Choe's work
               | remarkably well. I also saw Alex Gray's work to be close,
               | but not really identical either. It wasn't as intricate
               | as his work is.
               | 
               | It will be interesting if this takes off and a sort of
               | Banksy effect takes over, where unless it's a physical
               | piece of art it doesn't have much value, and is only
               | made all the better by some sort of polemic attached to
               | it, e.g. Girl with Balloon.
        
               | lancesells wrote:
               | I'm going to guess there's not going to be much value
               | placed on anything out of DALLE for a long while. Digital
               | art is typically worth much less than physical art and I
               | would say these GAN images are going to worth less than
               | digital art generated by human hand.
               | 
               | There will be outliers of course but I would be shocked
               | if there's much of a market in it for at least the
               | present.
        
               | napier wrote:
               | When these tools can generate layered tiff/psd images,
               | polygon meshes and automate UV packing; then we'll be
               | talking.
        
               | ZetaZero wrote:
               | > If an artist has a well-known art style, and this
               | algorithm was trained on it and can copy that style...
               | 
               | A lawyer could argue that the algorithm is producing a
               | derivative work of the copyrighted input.
        
               | TaylorAlexander wrote:
               | Right but if that work isn't significantly changed from
               | the source, it could be ruled as infringement. DALL-E
               | cannot tell the users (afaik) if a result is close to any
               | source material.
        
               | lbotos wrote:
               | Well, music is not "pictures" but Marvin Gaye's family
               | got 5 million because Blurred Lines sounds similar enough
               | to a Marvin Gaye song (even though it was not a sample): 
               | https://en.wikipedia.org/wiki/Pharrell_Williams_v._Bridge
               | por...
        
               | [deleted]
        
               | ChadNauseam wrote:
               | Even if you imitate someone's style intentionally, they
               | don't have grounds to sue. Style isn't copyrightable in
               | the US. Whether DALL-E outputs are a derivative work is a
               | different question, though
        
             | fanzhang wrote:
             | If this were a concern, a user can easily bypass this by
             | having a work-for-hire person add a minor transform layer
             | on top of the DALL-E generated images right?
        
               | JacobThreeThree wrote:
               | Wouldn't it have to meet the threshold of being a
               | "transformative" work?
               | 
               | https://en.wikipedia.org/wiki/Transformative_use
        
             | leereeves wrote:
             | > DALL-E-generated images are most likely copyright-free
             | 
             | The US Copyright Office did make a ruling that might
             | suggest that recently[1], but crucially, in that case, the
             | AI "didn't include an element of human authorship." The
             | board might rule differently about DALL-E because the
             | prompts do provide an opportunity for human creativity.
             | 
             | And there's another important caveat that the felixreda.eu
             | link seems to miss. DALL-E output, whether or not it's
             | protected by copyright, can certainly _infringe_ other
             | copyrights, just like the output of any other mechanical
             | process. In short, Disney can still sue if you distribute
             | DALL-E generated images of Marvel characters.
             | 
             | 1: https://www.theverge.com/2022/2/21/22944335/us-
             | copyright-off...
        
             | totetsu wrote:
             | Can I infringe another Dalle user's rights if I take an
             | image generated by their account and sell prints of it...?
        
             | unnah wrote:
             | DALL-E can generate recognizable pictures of Homer Simpson,
             | Batman and other commercial properties. Such images could
             | easily be considered derivative works of the original
             | copyrighted images that were used as training input. I'm
             | sure there are plenty of corporate IP lawyers ready to
             | argue the point at court.
        
               | numpad0 wrote:
               | I'm kind of surprised that no one has found "verbatim
               | copy" cases like the ones made against GitHub Copilot.
               | Such exact copies would likely be easier to go after in
               | imagery than in code snippets.
        
               | Nition wrote:
               | It might be interesting to find an image in the training
               | set with a long, very unique description, and try that
               | exact same description as input in DALL*E 2.
               | 
               | Of course it's unlikely to produce the exact same image,
               | or if it does, you've also discovered an incredible image
               | compression algorithm.
        
           | obert wrote:
           | they still own the generated content, only grant usage. I
           | have mixed feelings about this confused approach, it won't
           | last long.
           | 
           | > ...you own your Prompts and Uploads, and you agree that
           | OpenAI owns all Generations...
        
           | mensetmanusman wrote:
           | Image generating artificial intelligence is very analogous to
           | a camera.
           | 
           | Both technologies have billions of dollars of R&D and tens of
           | thousands of engineers behind supply chains necessary to
           | create the button that a user has to press.
        
             | minimaxir wrote:
             | There have been decades of litigation around when/where/of
             | whom you can take a photo. AI generated art isn't there.
        
         | mensetmanusman wrote:
         | They will benefit by getting additional feedback on which
         | output images are most useful.
        
           | minimaxir wrote:
           | DALL-E 2 has a "Save" feature which is likely a data
           | gathering mechanism for this use case.
        
         | Melting_Harps wrote:
         | > "Starting today, users get full usage rights to commercialize
         | the images they create with DALL*E, including the right to
         | reprint, sell, and merchandise. This includes images they
         | generated during the research preview."
         | 
         | >> And I just used it to create cover art for a book published
         | in Amazon :)
         | 
         | Man... what a missed opportunity for Altman... he could have
         | had a really good cryptocurrency/token with a healthy
         | ecosystem and a creativity-based community if he hadn't
         | pushed this Worldcoin biometric-harvesting BS and had just
         | waited for this to release and coupled it with access to GPT.
         | 
         | This is the kind of thing that Web3 (a joke) was pushing for
         | all along: revolutionary tech that the everyday person can
         | understand, with its own token-based ecosystem for access and
         | full creative rights to the prompts.
         | 
         | I wonder: if he stepped down from OpenAI and put a figurehead
         | in as CEO, could this still work?
         | 
         | > Why is using a token better than using money, in this case?
         | 
         | It would be better for OpenAI if it could monetize not just
         | its subscription-based model via a token to pay for overhead
         | and further R&D, but also its ability to issue tokens it can
         | freely exchange for utility on its platform, for exclusive
         | access outside of its capped $15 model, allowing pay-as-you-go
         | for those who don't have access, like myself, since it's
         | limited to 1 million users.
         | 
         | I don't want an account, and I think that type of gatekeeping
         | wasn't cool during the gmail days either (and I had early
         | access back then too), but I'd still personally buy $100s
         | worth of prompts right now, since I think it is a fascinating
         | use of NLP. I'm just one of many missed opportunities and
         | represent a lost userbase who just wants access for specific
         | projects. By doing this they could still retain the usage caps
         | on their platform and expand and contract them as they see fit
         | without excluding others.
         | 
         | This in turn could justify the continual investment from the
         | VC world into these projects (under the guise of web3) and
         | allow them to scale into viable businesses and further expand
         | the use of AI/ML into other creative spaces, which, as a
         | person studying AI and ML with a background in BTC, is what we
         | all wanted to see instead of these aimless bubbles in things
         | like Solana or yield farming via fake DeFi projects like
         | Celsius.
         | 
         | It would legitimize the use of a token for an ecosystem model
         | outside of BTC, which to be honest doesn't really exist and
         | still has a tarnished reputation after all these failed
         | projects, while gaining reception from a greater audience,
         | since it has captivated so many since its release.
        
           | pliny wrote:
           | Why is using a token better than using money, in this case?
        
             | mod wrote:
             | I assume something to do with proving ownership via NFT.
        
         | rvz wrote:
         | It also means there will possibly be another renaissance of
         | fully automated, mass generated NFTs and tons of derivatives
         | and remixes flooding the NFT market in an attempt to pump the
         | NFT hype again.
         | 
         | It doesn't matter, OpenAI wins anyway as these companies will
         | pour hundreds of thousands into generated images.
         | 
         | It seems that the NFT grift is about to be rebooted again, such
         | that it isn't going to die _that_ quickly. But still,
         | eventually 90% of these JPEG NFTs will die anyway.
        
           | WalterSear wrote:
           | NFTs were never limited by artwork availability - they are
           | limited by wash-trading ability.
        
             | rvz wrote:
              | These highly photorealistic images can be generated at
              | mass scale, completely automated, without a human, which
              | ultimately cuts out the need for an artist.
              | 
              | Artists will be replaced by DALL*E 2 for creating these
              | illustrations, book covers, NFT variants, etc., opening
              | up the whole arena to anyone to do this themselves. All
              | it takes is to _describe what they want in text_, and in
              | less than a minute the work is delivered, for as little
              | as $15.
              | 
              | OpenAI still wins either way. If a crypto company goes
              | about using DALL*E 2 to generate photorealistic NFTs,
              | OpenAI won't stop them and will take the money.
        
               | WalterSear wrote:
               | I'm not sure I understand the point you are trying to
               | make.
               | 
               | Art is already dirt cheap. People aren't buying NFTs for
               | their content. This doesn't make it appreciably easier to
               | con rubes.
        
         | bilsbie wrote:
         | Every tech should do this. Could Google Maps silently change
         | your destination to a minority-owned alternative?
        
       | [deleted]
        
       | peteforde wrote:
       | I have been having a blast with DALL-E, spending about an hour a
       | day trying out wild combinations and cracking my friends up. I
       | cannot imagine getting bored of it; it's like getting bored with
       | visual stimulus, or art in general.
       | 
       | In fact, I've been glad to have a 50/day limit, because it helps
       | me contain my hyperfocus instincts.
       | 
       | The information about new pricing is, to me as someone just
       | enjoying making crazy images, a huge drag. It means that to do
       | the same 50/day I'd be spending $300/month.
       | 
       | OpenAI: introduce a $20/month non-commercial plan for 50/day, and
       | I'll be at the front of the line.
        
         | jnovek wrote:
         | My heart sank when I saw the pricing model.
         | 
         | I've been creating generative art since 2016 and I've been
         | anxiously waiting for my invite. I wont be able to afford to
         | generate the volume of images it takes to get good ones at this
         | price point.
         | 
         | I can afford $20/mo for something like this, but I just
         | can't swing the $200 to $300 it realistically takes to get
         | interesting art out of these CLIP-centric models.
         | 
         | Heck, the initial 50 images aren't even enough to get the hang
         | of how the model behaves.
        
           | blueboo wrote:
           | If you're technically inclined, I urge you to explore some
           | newer Colabs being shared in this space. They offer vastly
           | more configurable tools, work great for free on Google
           | Colab, and are straightforward to run on a local machine.
           | 
           | Meanwhile we should prepare ourselves for a future where the
           | best generative models cost a lot more as these companies
           | slice and dice the (huge) burgeoning market here.
        
           | pkaye wrote:
           | I'm sure the prices will go down each year as the computing
           | costs go down.
        
           | wongarsu wrote:
           | MidJourney is a good alternative. Maybe not quite as good as
           | DALL-E, but close enough, without a waitlist and with hobby-
           | friendly prices ($10/month for 200 images/month, or $30 for
           | unlimited)
        
         | commandlinefan wrote:
         | > trying out wild combinations and cracking my friends up
         | 
         | Wait until the next edition comes out where it automatically
         | learns the sorts of things that crack you up and starts
         | generating them without any input from you.
        
         | Filligree wrote:
         | MidJourney gives ~unlimited generation for $30/month, and is
         | nearly as good. Unlike DALL-E it doesn't deliberately nerf face
         | generation. I've been having a blast.
        
         | irrational wrote:
         | Sounds kind of like Scribblenauts. I would try the craziest
         | things to see what it could come up with.
        
         | dave_sullivan wrote:
         | I think people don't realize how huge these models really are.
         | 
         | When they're free, it's pretty cool. But charge an amount where
         | there's actual profit in the product? Suddenly seems very
         | expensive and not economically viable for a lot of use cases.
         | 
         | We are still in the "you need a supercomputer" phase of these
         | models for now. Something like DALLE mini is much more
         | accessible but the results aren't good enough. Early early
         | days.
        
           | TigeriusKirk wrote:
           | What _are_ the resources at work here?
           | 
           | What are the resources needed to train this model?
           | 
           | If someone just gave you the model for free, what resources
           | would you need to use it to generate new results?
        
             | dplavery92 wrote:
             | In the unCLIP/DALL-E 2 paper[0], they train the
             | encoder/decoder with 650M/250M images respectively. The
             | decoder alone has 3.5B parameters, and the combined priors
              | with the encoder/decoder are in the neighborhood of ~6B
              | parameters. This is large, but small compared to the
              | name-brand "large language models" (GPT3 et al.).
             | 
             | This means the parameters of the trained model fit in
             | something like 7GB (decoder only, half-precision floats) to
             | 24GB (full model, full-precision). To actually run the
             | model, you will need to store those parameters, as well as
             | the activations for each parameter on each image you are
             | running, in (video) memory. To run the full model on device
             | at inference time (rather than r/w to host between each
             | stage of the model) you would probably want an enterprise
             | cloud/data-center GPU like an NVIDIA A100, especially if
             | running batches of more than one image.
             | 
             | The training set size is ~97TB of imagery. I don't think
             | they've shared exactly how long the model trained for, but
             | the original CLIP dataset announcement used some benchmark
             | GPU training tasks that were 16 GPU-days each. If I were to
             | WAG the training time for their commercial DALL-E 2 model,
             | it'd probably be a couple of weeks of training distributed
             | across a couple hundred GPUs. For better insight into what
             | it takes to train (the different stages/components of) a
             | comparable model, you can look through an open-source
             | effort to replicate DALL-E 2.[2]
             | 
              | [0] https://cdn.openai.com/papers/dall-e-2.pdf
              | [1] https://openai.com/blog/clip/
              | [2] https://github.com/lucidrains/dalle2-pytorch
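              | 
              | To make the weight-memory arithmetic concrete (a back-
              | of-the-envelope sketch; the 3.5B and ~6B figures come
              | from the paper, the rest is just bytes per parameter):
              | 
              |     # memory needed just to hold the weights, in GiB
              |     def weight_gib(n_params, bytes_per_param):
              |         return n_params * bytes_per_param / 1024**3
              | 
              |     print(weight_gib(3.5e9, 2))  # decoder, fp16 -> ~6.5
              |     print(weight_gib(6.0e9, 4))  # full, fp32    -> ~22.4
              | 
              | Activations, batching, and framework overhead come on
              | top of that, which is why an 80GB A100 is the
              | comfortable choice rather than the bare minimum.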
        
               | peteforde wrote:
               | Thanks for the really excellent insight and links.
               | 
               | I do hope that the conversation starts to acknowledge the
               | difference between sunk costs and running costs.
               | 
                | Employees, office leases and equipment are all
                | happening regardless, and are ongoing.
               | 
               | Training DALL-E 2: very expensive, but done now. A sunk
               | cost where every dollar coming in makes the whole
               | endeavor more profitable.
               | 
               | Operating the trained model: still expensive, but you can
               | chart out exactly how expensive by factoring in hardware
               | and electricity.
               | 
               | I believe that by not explicitly separating these
               | different columns when discussing expense vs profit,
               | we're making it harder than it needs to be to reason
               | about what it actually costs every time someone clicks
               | Generate.
        
               | woojoo666 wrote:
               | > This means the parameters of the trained model fit in
               | something like 7GB (decoder only, half-precision floats)
               | to 24GB (full model, full-precision)
               | 
               | > you would probably want an enterprise cloud/data-center
               | GPU like an NVIDIA A100, especially if running batches of
               | more than one image.
               | 
               | That doesn't seem so bad.
               | 
               |  _looks up price of NVIDIA A100 - $20,000_
               | 
               | oh...ok I'll probably just pay for the service then
        
               | fennecfoxen wrote:
               | p4d.24xlarge is only $33/hr! And you get 400 Gbe so it
               | should be quick to load.
        
             | binarymax wrote:
             | If I had to guess, based on other large models, it's in the
             | range of hundreds of GBs. It might even be in the TB range.
             | To host that model for fast production SaaS inference
             | requires many GPUs. An A100 has 80GB, so a dozen A100s just
             | to keep it in memory, and more if that doesn't meet the
             | request demand.
             | 
             | Training requires even more GPUs, and I wouldn't be
             | surprised if they used more than 100 and trained over 3
             | months.
        
               | judge2020 wrote:
               | > Training requires even more GPUs, and I wouldn't be
               | surprised if they used more than 100 and trained over 3
               | months.
               | 
               | Based on this blog post where they scale to 7,500
               | 'nodes', they say:
               | 
               | > A large machine learning job spans many nodes and runs
               | most efficiently when it has access to all of the
               | hardware resources on each node.
               | 
               | So I wouldn't be surprised if they do have a total of
                | 7500+ GPUs to balance workloads between. To add, OpenAI
               | has a long history of getting unlimited access to
               | Google's clusters of GPUs (nowadays they pay for it,
               | though). When they were training 'OpenAI Five' to play
               | Dota 2 at the highest level, they were using 256 P100
               | GPUs on GCP[0] and they casually threw 256 GPUs at 'clip'
               | for a short while in January of 2021[1].
               | 
               | As for how they do it, see these posts:
               | 
               | https://openai.com/blog/techniques-for-training-large-
               | neural...
               | 
               | https://openai.com/blog/triton/
               | 
               | 0: https://openai.com/blog/openai-five/
               | 
               | 1: https://openai.com/blog/clip/
        
             | dave_sullivan wrote:
             | Facebook released over 100 pages of notes a few months ago
             | detailing their training process for a model that is
             | similar in size. Does anyone have a link? I can't seem to
              | find it in my notes; googling turns up links to posts
              | that have been removed or are behind the Facebook walled
              | garden.
             | 
              | But I seem to remember they were running 1,000+ 32GB GPUs
             | for 3 months to train it and keeping that infrastructure
             | running day-to-day and tweaking parameters as training
             | continued was the bulk of the 100 pages. It is beyond the
             | reach of anybody but a really big company, at least in the
             | area of very large models, and the large models are where
             | all the recent results are. I wish I was more bullish on
             | algorithm improvements meaning you can get better results
             | on less hardware; there will definitely be some algorithm
             | improvements, but I think we might really need more
             | powerful hardware too. Or pooled resources. Something.
             | These models are huge.
        
               | ninjaranter wrote:
               | > Facebook released over 100 pages of notes a few months
               | ago detailing their training process for a model that is
               | similar in size. Does anyone have a link?
               | 
               | Is https://github.com/facebookresearch/metaseq/blob/main/
               | projec... what you're referring to?
        
               | dave_sullivan wrote:
               | Yes! Thank you! Very good read for anyone interested in
               | the field.
        
             | Ajedi32 wrote:
             | Training is obviously very expensive, and ideally they'd
             | want to recoup that investment. But I'm curious as to what
             | the marginal cost is to run the model after it's trained.
             | Is it close to 30 images per dollar, like what they're
             | charging now? Or do training costs make up the majority of
             | that price?
        
           | sinenomine wrote:
           | > I think people don't realize how huge these models really
           | are.
           | 
           | They really aren't that large by the contemporary _scaling
           | race_ standards. DALLE-2 has 3.5B parameters, which should
           | fit on an older GPU like an Nvidia RTX 2080, especially if
           | you optimize your model for inference [1][2], which ML
           | engineers commonly do to minimize costs. With an optimized
           | model, your memory footprint is ~1 byte per parameter, plus
           | some fraction of that (commonly ~0.2) for intermediate
           | activations.
           | 
           | You should be able to run it just fine on an Apple M1/M2
           | with 16GB RAM via CoreML, if an order of magnitude slower
           | than on an A100.
           | 
           | Training isn't unreasonably costly either: you can train
           | such a model for O($100k), which is less than the yearly
           | salary of a mid-tier developer in Silicon Valley.
           | 
           | There is no reason these models shouldn't be trained
           | cooperatively and run locally on our own machines. If someone
           | is interested in cooperating with me on such a project, my
           | email is in the profile.
           | 
           | 1. https://arxiv.org/abs/2206.01861
           | 
           | 2. https://pytorch.org/blog/introduction-to-quantization-on-
           | pyt...
        
         | andreyk wrote:
         | Check out Artbreeder, it is likewise a ton of fun!
         | 
         | Multimodal.art (https://multimodal.art/) is working on a free
         | version of something like DALLE, though it's not that good as
         | of yet.
        
         | nsxwolf wrote:
         | I'm already bored of it. When you have everything, you have
         | nothing.
        
           | [deleted]
        
           | peteforde wrote:
           | I don't know how to say this without sounding like a jerk,
           | even if I bend over backwards to preface that this isn't my
           | intent: this statement says more about your creativity and
           | curiosity than about any ceiling on how entertaining DALL-E
           | can be to someone who could keep multiple instances busy,
           | like grandma playing nine bingo cards at once.
           | 
           | Knowing that it will only get better - animation cannot be
           | far behind - makes me feel genuinely excited to be alive.
        
             | nwienert wrote:
             | Dall-e has novelty, but no intent, meaning, originality.
             | Yes the author can be creative at generating prompts, but
             | visually I haven't seen it generate anything that feels
             | artistically interesting. If you want pre-existing concepts
             | in novel combinations then yes it works.
             | 
             | It's good at "in the style of" but there's no "in a new
             | style".
             | 
             | It has a house style too that tends to feel Reddit-like.
        
               | gsk22 wrote:
               | Isn't every "new style" just a novel combination of pre-
               | existing concepts? Nothing new under the sun and all
               | that.
               | 
               | Either way, I feel like your view is an exhaustingly
               | pessimistic take on AI-generated art. I mean, sure, most
               | of what DALL-E generates is pretty mundane, but other
               | times I have been surprised at how bizarre and unique
               | certain images are.
               | 
               | You seem to imply that because an AI is not human, its
               | art is not imbued with meaning or originality -- but I
               | find that an AI's non-human nature is precisely what
               | _makes_ the art so original and meaningful.
        
               | hansword wrote:
               | I would say it helps to first think what you want to get
               | out of it.
               | 
               | If your task is "show me something that breaks through
               | our hyperspeed media", then I guess some obscure museum
               | is a better place than an ML model.
               | 
               | If your task is "find the best variation on theme X" or
               | "quick draft visualization", they are often very useful.
               | I am sure there will be many further tasks to which
               | current and future models will be well suited. They are
               | not magic picture machines. At least not yet.
        
           | danielvaughn wrote:
           | I'm sure the novelty wears off. But I'm already coming up
           | with several applications for it.
           | 
           | On the personal side, I've been getting into game
           | development, but the biggest roadblock is creating concept
           | art. I'm an artist but it takes a huge amount of time to get
           | the ideas on paper. Using DALLE will be a massive benefit and
           | will let me expedite that process.
           | 
           | It's important to note that this is _not_ replacing my entire
           | creative process. But it solves the issue I have, where I'm
           | lying in bed imagining a scene in my mind, but don't have the
           | time or energy to sketch it out myself.
        
             | ausbah wrote:
             | >I'm an artist but it takes a huge amount of time to get
             | the ideas on paper.
             | 
             | this is what I really like about DALLE-mini: its ability
             | to create pretty good basic outlines for a scene. it's low
             | resolution enough that there's room for your own creativity
             | while giving you a good template to spring off from. things
             | like poses, composition of multiple people, etc.
        
               | zanderwohl wrote:
               | I've used AI to try out different composition/layout
               | possibilities. Sometimes it comes up with an arrangement
               | of objects I hadn't considered. Sometimes it uses colors
               | in really interesting ways. Great jumping-off point for
               | drafting.
        
             | nsxwolf wrote:
             | I did notice it is very good at making small pixel art
             | icons/sprites.
        
             | jnovek wrote:
             | I've been using generative models as an art form in and of
             | themselves since the mid/late 2010s. I like generating
             | mundane things that bump right up along the edge of the
             | uncanny valley and finding categories of images that
             | challenge the model (e.g. for CLIP, phrases that have a
             | clear meaning but are infrequently annotated).
             | 
             | Generating itself can be art. I'm not going to win a
             | Pulitzer here, it's for the personal joy of it, but I will
             | certainly never get tired of it.
        
             | thatguy0900 wrote:
             | I've been having a blast using it in my dungeons and
             | dragons games. If you type in, say, "dnd village battlemap"
             | it's really pretty usable. Not to mention the wild magic
             | weapons and monsters it can come up with.
        
         | bfgoodrich wrote:
        
       | zone411 wrote:
       | I have some first-hand experience with how the Copyright Office
       | views these works, from creating an AI assistant to help me write
       | these melodies:
       | https://www.youtube.com/playlist?list=PLoCzMRqh5SkFwkumE578Y....
       | Here is a quote from the response from the Copyright Office email
       | before I provided additional information about how they were
       | created:
       | 
       | "To be copyrightable, a work must be fixed in a tangible form,
       | must be of human origin, and must contain a minimal degree of
       | creative expression"
       | 
       | So some employees there are aware of the impact that AI can have.
       | Getting these DALL-E images copyrighted won't be trivial. I think
       | it will be many years before the law is clarified.
        
       | rvz wrote:
       | > Starting today, users get full usage rights to commercialize
       | the images they create with DALL*E, including the right to
       | reprint, sell, and merchandise. This includes images they
       | generated during the research preview.
       | 
       | So DALL*E 2 is going to restart, revive and cause another
       | renaissance of fully automated and mass-generated NFTs, full of
       | derivatives and remixing, etc., to pump up the crypto NFT hype
       | squad?
       | 
       | Either way, OpenAI wins again as these crypto companies are going
       | to pour tens of thousands of generated images to pump their NFT
       | griftopia off of life support, reconfirming that it isn't going
       | to die that easily.
       | 
       | Regardless of this possible revival attempt, 90% of these JPEG
       | NFTs will _eventually_ still die.
        
         | randomperson_24 wrote:
         | I have tried playing around with the beta access to make it
         | generate NFT art with different prompts, but in vail.
         | 
         | I think it has not been trained on NFT art (crypto punks and so
         | on).
        
           | Melting_Harps wrote:
           | > I think it has not been trained on NFT art (crypto punks
           | and so on).
           | 
           | How exactly are you defining NFT art?
           | 
           | I mean, it can literally be anything: Dorsey sold a
           | screencap of his 1st tweet, Nadya from Pussy Riot did some
           | creative stuff, and the Ape crap was the bulk of this stuff
           | that got passed around.
           | 
           | I think what can be gleaned from that short-lived nonsense
           | is that value is subjective and that the quality of a
           | valuable piece of 'art' is just as hard to define. Much the
           | same with its predecessor: cryptokitties.
        
           | SquareWheel wrote:
           | Heads up: I think you meant "in vain" rather than "in vail".
           | However, a similar phrase is "to no avail" which also means
           | that something was not successful.
        
             | ask_b123 wrote:
             | I think you meant "in vain" rather than "in vein".
        
               | SquareWheel wrote:
               | I sure did! Thank you, I've corrected that now.
        
         | Bud wrote:
         | I don't see why there's any credible reason to expect that
         | DALL-E will do anything at all to help those promoting the NFT
         | silliness. Two separate issues.
        
           | mdanger007 wrote:
           | If OpenAI could make a profit selling Dall-E images as NFT,
           | I'd assume they'd do it, yeah?
        
             | Melting_Harps wrote:
             | Altman tried his hand at that by launching Worldcoin, and
             | it didn't go well at all.
             | 
             | So I think it's prudent that OpenAI keep the 'sell shovels'
             | business model instead with DALLE and GPT, at least for the
             | time being.
        
       | nbzso wrote:
       | Anything to end Corporate Memphis. Even if we as illustrators
       | will not have jobs or commissions. Let's hope that every creative
       | human endeavour (painting, music, poetry) will be replaced and
       | removed from the commercial realm. Then maybe we will see
       | artistic humanism instead of synthetic trans-humanistic "pop
       | art".
       | 
       | Happily for me, I stopped painting digitally a long time ago. I
       | even stopped calling myself "an artist". Nowadays I paint and draw
       | only with real medium and call all of that "Archivist
       | craftsmanship with analogue medium". :)
        
       | cm2012 wrote:
       | I really want access, wish there was a way to pay to get in.
        
       | TekMol wrote:
       | I wonder how fast they will invite the 1 million users?
       | 
       | I have been on the waitlist for a while and did not get access
       | yet.
       | 
       | Did anybody get access already today?
        
         | deviner wrote:
         | nope, I've been waiting for quite some time too
        
       | Sohcahtoa82 wrote:
       | The name "OpenAI" to me implies being open-source.
       | 
       | I have an RTX 3080 and will likely be buying a 4090 when it comes
       | out. Will I ever be able to generate these images locally, rather
       | than having to use a paid service? I've done it with DALL-E Mini,
       | but the images from that don't hold a candle to what DALL-E 2
       | produces.
        
         | whywhywhywhy wrote:
         | Their choice of name gets funnier every month.
        
         | ronsor wrote:
         | I'm not sure if any current or next-generation GPU even has
         | enough power to run DALL-E 2 locally.
         | 
         | Anyway, OpenAI is unlikely to release the model. The situation
         | will likely be like it is with GPT-3; however, it's also likely
         | that another team will attempt to duplicate OpenAI's work.
        
         | jazzyjackson wrote:
         | From what I've seen it's all about the VRAM
         | 
         | if you've got 60GB available to your GPU then maybe you can get
         | close
         | 
         | I'm really curious if Apple's unified memory architecture is of
         | benefit here, especially a few years from now if we can start
         | getting 128/256GB of shared RAM on the SoC
        
       | ajafari1 wrote:
       | I wrote about this happening two days ago in my Substack post:
       | "OpenAI will start charging businesses for images based on how
       | many images they request. Just like Amazon Web Services charges
       | businesses for usage across storage, computing, etc. Imagine a
       | simple webpage where OpenAI will list out their AI-job suite,
       | including "jobs" such as software developer, graphics designer,
       | customer support rep, and accountant. You can select which
       | service offerings you'd like to purchase ad-hoc or opt into the
       | full AI-job suite."
       | 
       | In case you are interested in reading the whole take:
       | https://aifuture.substack.com/p/the-ai-battle-rages-on
        
         | arrow7000 wrote:
         | "Business monetises their offering" can't say I'm entirely
         | blown away by the prediction
        
       | ukzuck wrote:
       | Can anyone invite me for DALL E!
        
       | [deleted]
        
       | naillo wrote:
       | This news is funny since it doesn't actually change anything.
       | It's still a waitlist that they're pushing out slowly (not an
       | open beta). Nice way to stay in the news though.
        
       | outsider7 wrote:
       | Amazing stuff (really fun)... can it solve climate change ?
        
       | bemmu wrote:
       | I was supposed to be making a video game, but got a bit
       | sidetracked when DALL*E came out and made this website on the
       | side: http://dailywrong.com/ (yes I should get SSL).
       | 
       | It's like The Onion, but all the articles are made with GPT-3 and
       | DALL*E. I start with an interesting DALL*E image, then describe
       | it to GPT-3 and ask it for an Onion-like article on the topic.
       | The results are surprisingly good.
        
         | jelliclesfarm wrote:
         | Love it! Better than other news I get to read these days. Some
         | of it rings..like the bluebird suing the cat.
         | 
         | Thank you! Bookmarked!
        
         | picozeta wrote:
         | These are actually quite funny. A bit of a surreal touch, but
         | that makes them even more fun.
        
         | tiborsaas wrote:
         | Thanks, finally a legit news publication :)
         | 
         | This was really funny :)
         | 
         | http://dailywrong.com/man-finally-comfortable-just-holding-a...
        
           | biztos wrote:
           | So the other men in the pictures are the uncomfortable ones?
        
           | zanderwohl wrote:
           | Somehow these articles are more readable than typical AI-
           | generated search engine fodder... Is it because I'm entering
           | the site with an expectation of nonsense?
        
             | slavak wrote:
             | Probably because, by the creator's own admission, the
             | articles are heavily cherry-picked to make sure the output
             | is decent, which is probably a lot more human effort than
             | goes into the aforementioned search engine fodder.
             | 
             | http://dailywrong.com/sample-page/
        
               | pwillia7 wrote:
               | I would guess that most spam farms are not using OpenAI's
               | davinci model, which is really, really good but expensive.
               | Just a guess.
        
               | hanselot wrote:
        
         | layer8 wrote:
         | This one seems like it could actually be real in Japan:
         | http://dailywrong.com/anime-pillow-gym-opens-in-tokyo/ ;)
        
         | busyant wrote:
         | This is clever. Does GPT-3 come up with the title of the
         | article, too? That's the funniest part.
        
           | bemmu wrote:
           | At first I came up with them myself, but found that it often
           | comes up with better ones, so I ask it for variations.
           | 
           | I think I got it to even fill the title given a picture,
           | something like "Article picture caption: Man holding an
           | apple. Article title: ...". Might experiment more with that
           | in the future.
        
             | sillysaurusx wrote:
             | How do you prompt GPT-3 to come up with the titles? That's
             | an interesting problem.
        
             | busyant wrote:
             | Well, then I'm impressed with GPT-3's ability to generate
             | those titles!
             | 
             | The combination of photo/title feels like they come from
             | the more absurd articles published by The Onion.
             | 
             | If we aren't living in a simulation, it's just a matter of
             | time...
        
         | lagrange77 wrote:
         | http://dailywrong.com/new-course-teaches-guinea-pigs-househo...
         | 
         | lol
        
         | walrus01 wrote:
         | The results with things that are artworks or more general
         | concepts are fascinating, but there is for sure something
         | creepy with "photorealistic" human eyes and faces going on...
         | 
         | If you want to see some really creepy AI generated human
         | "photo" faces, take a look at Bots of New York:
         | 
         | https://www.facebook.com/botsofnewyork
        
         | dntrkv wrote:
         | Spam advertising is about to reach whole new levels of weird.
        
         | ttyyzz wrote:
         | NGL this shit is pretty cursed and I like it.
        
         | benbristow wrote:
         | From the server IP looks like you're on some managed WordPress
         | hosting that only offers free SSL on the more 'premium'
         | packages.
         | 
         | Easiest way for free SSL would be to just throw the domain on
         | CloudFlare :)
        
         | pieter_mj wrote:
         | Very funny! The "Scientists Warn New Faster Toothbrush May
         | Cause Insanity"-story is not fake though, I've experienced it
         | ;)
        
         | stuaxo wrote:
         | This is fantastic, the fake news the world needs.
        
         | aantix wrote:
         | Feels like the headlines could be generated similar to the
         | style of "They Fight Crime!"
         | 
         | "He's a hate-fuelled neurotic farmboy searching for his wife's
         | true killer. She's a tortured insomniac snake charmer from a
         | family of eight older brothers. They fight crime!"
         | 
         | https://theyfightcrime.org/
         | 
         | Here's an implementation in Perl.
         | 
         | http://paulm.com/toys/fight_crime.pl.txt
        
           | edm0nd wrote:
           | lol that site is great
           | 
           | >He's an unconventional gay paranormal investigator moving
           | from town to town, helping folk in trouble. She's a violent
           | motormouth wrestler from the wrong side of the tracks. They
           | fight crime!
           | 
           | >He's a Nobel prize-winning sweet-toothed rock star who
           | believes he can never love again. She's a strong-willed
           | communist widow with a knack for trouble. They fight crime!
           | 
           | >He's an obese white trash barbarian with a secret. She's a
           | virginal thirtysomething traffic cop with the power to bend
           | men's minds. They fight crime!
        
         | aasasd wrote:
         | http://dailywrong.com/wp-content/uploads/2022/07/DALL%C2%B7E...
         | 
         | Hot dang. Some Reddit subs can be auto-generated now.
        
         | tildef wrote:
         | Actually got a chuckle out of the duck one
         | (http://dailywrong.com/man-finally-comfortable-just-
         | holding-a...). Thanks! I hope you keep generating them. Kind
         | of wish there weren't a newsletter nag, but on the other hand
         | it adds to the realism. Could be worthwhile to generate the
         | text of the nag with gpt too; call it a kind of lampshading.
        
         | aantix wrote:
         | Parenting > "Gillette Releases a New Razor for Babies"
        
           | bemmu wrote:
           | I loved how it just consistently decided that if babies have
           | facial hair, it's always white fluff.
        
             | lancesells wrote:
             | I think it's because it's using images of babies with soap
             | on their faces to learn. Still funny though!
        
         | uxamanda wrote:
         | The part where you have to confirm you are not a robot to
         | subscribe to the mailing list is the best part of this, my new
         | favorite website.
        
         | drusepth wrote:
         | Haha, I was in a very similar boat when I built
         | https://novelgens.com -- I was also supposed to be making a
         | video game, but got a bit sidetracked with VQGAN+CLIP and other
         | text/image generation models.
         | 
         | Now I'm using that content _in_ the video game. I wonder if you
         | could use these articles as some fake news in your game, too.
         | :)
        
         | astroalex wrote:
         | This is amazing! Honestly one of the first uses of GPT-3/DALL-E
         | that has held my attention for longer than a few seconds.
        
       | mark_l_watson wrote:
       | I tried DALLE once and liked the generated images. Not really my
       | thing, but so cool.
       | 
       | What I do use is OpenAI's GPT-3 APIs; I am a paying customer.
       | Great tool!
        
       | lagrange77 wrote:
       | Has anyone else had problems with the 'Generate Variations'
       | function lately? I first tried it out 3 days ago, and it has said
       | 'Something went wrong. Please try again later, or contact
       | support@openai.com if this is an ongoing problem.' every time
       | since then.
        
       | Plough_Jogger wrote:
       | Is this referring to the first version of the model, or DALL-E 2?
        
       | nharada wrote:
       | Something I haven't seen anyone talking about with these huge
       | models: how do future models get trained when more content online
       | is model-generated to start with? Presumably you don't wanna
       | train a model on autogenerated images or text, but you can't
       | necessarily know which is which.
        
         | cpach wrote:
         | Makes me think of Ouroboros
         | 
         | https://en.m.wikipedia.org/wiki/Ouroboros
        
           | nharada wrote:
           | Reminds me of https://en.wikipedia.org/wiki/Low-
           | background_steel
        
             | espadrine wrote:
             | In this situation, the low-background steel is the MS-COCO
             | dataset, together with the Frechet Inception Distance
             | (FID): you pass both MS-COCO images and DALL-E images (or
             | its competitors') through Google's InceptionV3 classifier
             | and compare the statistical divergence between the
             | resulting high-level feature vectors.
             | 
             | For now at least, there is a detectable difference in
             | variety.
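
       For readers who want the metric above made concrete, here is a
       minimal sketch of how FID is conventionally computed. It assumes
       real_feats and gen_feats are precomputed (N, 2048) InceptionV3
       pool features for MS-COCO and generated images respectively; this
       is the standard published formula, not OpenAI's internal code.

           import numpy as np
           from scipy.linalg import sqrtm

           def fid(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
               # Fit a Gaussian to each set of Inception features.
               mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
               cov_r = np.cov(real_feats, rowvar=False)
               cov_g = np.cov(gen_feats, rowvar=False)
               # sqrtm may pick up tiny imaginary parts numerically.
               covmean = sqrtm(cov_r @ cov_g).real
               diff = mu_r - mu_g
               return float(diff @ diff
                            + np.trace(cov_r + cov_g - 2.0 * covmean))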
        
         | zitterbewegung wrote:
         | This should be a step in cleaning your data to begin with. If
         | you don't know the providence of your data then you shouldn't
         | be even training with it.
         | 
         | Getting humans to refine your data is the best solution right
         | now and many companies and researchers go with this approach.
        
           | jazzyjackson wrote:
           | s/ide/ena
        
           | Voloskaya wrote:
           | > Getting humans to refine your data is the best solution
           | right now
           | 
           | Source ?
           | 
           | All those big models are trained with data for which the
           | source is not known or vetted. The amount of data needed is
           | not human-refinable.
           | 
           | For example for language models we train mostly on subsets of
           | CommonCrawl + other things. CommonCrawl data is "cleaned" by
           | filtering out known bad sources and with some heuristics such
           | as ratio of text to other content, length of sentences etc.
           | 
           | The final result is a not too dirty but not clean huge pile
           | of data that comes from millions of sources that no human has
           | vetted and that no one on the team using the data knows
           | about.
           | 
           | The same applies to large image datasets, e.g. LAION-400M,
           | which also comes from CommonCrawl and is not curated.
        
           | nharada wrote:
           | But how would you know? A random string of text or an image
           | with the watermark removed is going to be very hard to
           | distinguish as generated versus human-written.
        
           | FrenchDevRemote wrote:
           | You can't use humans to manually refine a dataset on the
           | scale of GPT-3 or DALL-E
           | 
           | CLIP was trained on 400,000,000 images, GPT is roughly 180B
           | tokens, at 1-2 tokens per word, that's 120,000,000,000 words.
        
             | pshc wrote:
             | At least cleaning it up is an embarrassingly parallel
             | problem, so if you had the resources to throw incentives at
             | millions of casual gamers, you might make a nice dent on
             | CLIP.
        
               | zanderwohl wrote:
               | Alternatively, making a captcha where half the data is
               | unlabeled, and half is labeled, forcing users to
               | categorize data for you as they log into accounts.
        
         | Jleagle wrote:
         | The images I have created all have a watermark. This is at
         | least one way to filter out most images, by the same AI at
         | least.
        
         | goolulusaurs wrote:
         | It's a cybernetic feedback system. Dalle is used to create new
         | images, the images that people find most interesting and
         | noteworthy get shared online, and reincorporated into the
         | training data, but now filtered through human desire.
        
           | [deleted]
        
         | can16358p wrote:
         | I think that with the terms requiring users to explicitly
         | disclose which images/parts were generated, they could be
         | filtered out to prevent a feedback loop of "generated
         | in/generated out" images. I'm sure there will be some cases of
         | illegal or against-the-terms-of-use content there, but the
         | majority should represent fair use.
        
         | mikeyouse wrote:
         | This precise thing is causing a funny problem in specialty
         | areas. People are using e.g. Google Lens to identify plants,
         | birds and insects, which sometimes returns wrong answers e.g.
         | say it sees a picture of a Summer Tanager and calls it a
         | Cardinal. If the people then post "Saw this Cardinal" and the
         | model picks up that picture/post and incorporates it into its
         | training set, it's just reinforcing the wrong identification..
        
           | bobbylarrybobby wrote:
           | https://xkcd.com/978/
        
           | Pxtl wrote:
           | Then that's a cardinal now.
        
           | scarmig wrote:
           | That's not really a new problem, though. At one point someone
           | got some bad training data about an old Incan town, the
           | misidentification spread, and nowadays we train new human
           | models to call it Machu Picchu.
        
             | vanillaicesquad wrote:
             | The difference between the name of an old Incan town and a
             | modern time plant identification mistake is that maybe the
             | plant is poisonous.
             | 
             | Made with gpt3
        
         | blfr wrote:
         | Training on auto-generated images collected off the Internet is
         | gonna be fine for a while, since the images that surface will
         | still be curated (i.e. selected as good/interesting/valuable)
         | mostly by humans.
        
         | jmartrican wrote:
         | I wonder if human artists can demand that their work not be
         | used for modelling. So while the robots are stuck using older
         | styles for their creations, the humans will keep creating new
         | styles of art.
        
         | naillo wrote:
         | One interesting comment about this is that some models actually
         | benefit from being fed their own output. AlphaFold, for
         | instance, was fed its own 'high likelihood' outputs (as Demis
         | Hassabis described in his Lex Fridman interview).
        
         | gwern wrote:
         | My discussion of this issue (which actually comes up in like
         | every DALL-E 2 discussion on HN):
         | https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-...
        
       | ccmcarey wrote:
       | That's about 10x as expensive as it should be
        
         | karencarits wrote:
         | $15 for 115 iterations/460 images?
        
           | ccmcarey wrote:
           | Yep. During the alpha it was (50*6) 300 images per day, by
           | their pricing model that would be $300 a month now
        
           | WalterSear wrote:
           | $15 for 115 _attempts_ to get usable images.
        
         | kache_ wrote:
         | Give it some time. Other organizations will race to the bottom.
         | 
         | They might even provide image generation at a loss to drive
         | people to their platforms.
        
         | gverri wrote:
         | $0.13/prompt can only be useful for artists/end users. Anyone
         | thinking about using this at scale would need a 20-30x
         | reduction in price. But there's still no API available, so I
         | think that will change with time. Maybe they will add different
         | tiers based on volume.
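
       As a rough cross-check of the per-prompt figure, here is a
       minimal arithmetic sketch using the beta pricing quoted earlier
       in the thread ($15 for 115 prompt credits, 460 images, i.e. 4
       images per prompt):

           credits_per_15_usd = 115                    # quoted pricing
           images_per_prompt = 4                       # 460 / 115
           per_prompt = 15 / credits_per_15_usd        # ~$0.13
           per_image = per_prompt / images_per_prompt  # ~$0.033
           print(f"${per_prompt:.3f}/prompt, ${per_image:.3f}/image")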
        
           | jeanlucas wrote:
           | Thing is, as a current user: you rarely get it right on the
           | first prompt; you can iterate 10 times until you get what you
           | want.
           | 
           | I spent several tries yesterday to get this angle "from the
           | ground up":
           | https://labs.openai.com/s/mz8LiyvkI8KwD2luJ6MrS23m
        
             | dntrkv wrote:
             | So $1.30 for getting a result that would have cost how much
             | to pay someone to make? Not to mention the 59 other
             | variations you would have.
        
         | raisedbyninjas wrote:
         | When there is a competitor, they can adjust pricing. For now,
         | it's virtually magic.
        
         | bradleybuda wrote:
         | You should ship a competitor! Sounds like you found a great
         | market opportunity.
        
         | scifibestfi wrote:
         | What are you basing that on? What should the price be? The
         | training and generation are probably expensive.
        
         | thorum wrote:
         | Until you consider the level of demand for this product, which
         | is surely higher than OpenAI can scale to with the number of
         | GPUs they have. If they price it lower they'll be overwhelmed.
        
         | Workaccount2 wrote:
         | Welcome to SaaS.
        
       | isoprophlex wrote:
       | Wait until someone trains a model like this, for porn.
       | 
       | There seems to be a post-DALLE obscenity detector on OpenAI's
       | tool, as so far I've found it to be entirely robust against
       | deliberate typos designed to avoid simple 'bad word lists'. Ask
       | it for a "pruple violon" and you get purple violins... you get
       | the idea.
       | 
       | "Metastable" prompts that may or may not generate obscene
       | results (content with nudity, guns, or violence, as I've found)
       | sometimes show non-obscene generations, and sometimes trigger a
       | warning.
        
         | jug wrote:
         | I've thought about this and in fact porn generation sounds like
         | a good thing?? It ensures that it's victimless. Of course,
         | there is a problem with generation of illegal (underage) porn
         | but other than this, I think it could be helpful for this
         | world.
        
         | jowday wrote:
         | If I had to guess, I'd bet they have a supervised classifier
         | trained to recognize bad content (violence, porn, etc) that
         | they use to filter the generated images before passing them to
         | the user, on top of the bad word lists.
        
           | cmarschner wrote:
           | Most likely they just take the one from Bing. Or, if they
           | trained a better one, it will go the other way sooner or later.
        
           | isoprophlex wrote:
           | Exactly!
        
         | zionic wrote:
         | Honestly that part pisses me off. Who cares if their AI "makes
         | porn" or something "offensive".
        
           | fishtoaster wrote:
           | I suspect it's more a business restriction than a moral one.
           | If OpenAI allows people to make porn with these tools, people
           | will make a _ton_ of it. OpenAI will become known as  "the
           | company that makes the porn-generating AIs," not "the company
           | that keeps pushing the boundaries of AI." Being known as the
           | porn-ai company is bad for business, so they restrict it.
        
         | alana314 wrote:
         | I tried the term "cockeyed" and got a TOS violation notice
        
       | [deleted]
        
       | justinzollars wrote:
       | I would love access to this in order to design Silver Rounds. If
       | you work at OpenAI, please reach out!
        
       | dharbin wrote:
       | I find it amusing that they suggest DALL-E, which typically
       | generates lovecraftian nightmare images, for making children's
       | story illustrations.
        
         | driverdan wrote:
         | How so? If you give it prompts for children story illustrations
         | with a detailed description it will not give you "lovecraftian
         | nightmare images".
        
         | throwaway0x7E6 wrote:
         | yeah. dalle is "so bad it's good".
         | 
         | it's great for post-post-ironic memes, but I don't see it being
         | useful for anything else
        
           | arkitaip wrote:
           | No wireless. Less space than a nomad. Lame.
        
           | andybak wrote:
           | Have you tried any of the "human or Dall-E" tests?
           | 
           | How did you score?
           | 
           | I only scored as well as I did because I knew the kind of
           | stylistic choices to look out for. In terms of "quality" I
           | really don't understand how you've reached this conclusion.
        
             | throwaway0x7E6 wrote:
             | I've only seen this thing
             | https://huggingface.co/spaces/dalle-mini/dalle-mini
             | 
             | is it not dall-e?
        
               | _flux wrote:
               | It is not and that's why OpenAI asked them to change the
               | name, which they did.
        
               | throwaway0x7E6 wrote:
               | oh. I retract my OP then
        
               | andybak wrote:
               | It's a reimplementation.
               | 
               | It's a long way off in terms of quality (at the moment
               | anyway)
        
               | astrange wrote:
               | It's a model inspired by DALLE 1 but it's not even very
               | close to that.
               | 
               | But it does seem to know a lot of things the real DALLE2
               | doesn't.
        
       | Nevin1901 wrote:
       | I don't like how they're charging money for Dalle, yet they don't
       | have an API available.
        
       ___________________________________________________________________
       (page generated 2022-07-20 23:00 UTC)