[HN Gopher] Spent $15 in DALL*E 2 credits creating this AI image ___________________________________________________________________ Spent $15 in DALL*E 2 credits creating this AI image Author : pat-jay Score : 283 points Date : 2022-08-11 16:53 UTC (6 hours ago) (HTM) web link (pub.towardsai.net) (TXT) w3m dump (pub.towardsai.net) | fnordpiglet wrote: | If you think it's hard to get an AI to render what's in your | mind, try another human artist. Specifying something visually | complex with an assumption that it'll be precisely what you're | imagining is shockingly hard. I'm not surprised prompt creation | is so complex. At least with the AI bots the turnaround time for | iteration is tight. That said, humans likely iterate fewer times, | but each iteration takes a long time. | anigbrowl wrote: | Can't wait for 'Tell HN: how I make mid six figures as a prompt | engineer'. | Nition wrote: | Absolutely. See also: https://promptbase.com | | And we're still in the early days. | anigbrowl wrote: | WTAF | | Unwillingly considering whether the easy bucks are worth the | greasy feeling. | Workaccount2 wrote: | "We let our graphic designer go so we could onboard an AI Prompt | Engineer" | | "How much are we paying him?" | | "About $225k plus bonus and equity" | | "And how much was the graphic designer paid?" | | "$55k" | | "..." | rfrey wrote: | It's the graphic design industry's own fault for not | gradually renaming themselves as Pixel Intensity Engineers. | _pastel wrote: | If you're interested in browsing creative prompts, I highly | recommend the reddit community at r/dalle2.
| | Some are impressive: - www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_mona_lisa - www.reddit.com/r/dalle2/comments/vstuns/super_mario_getting_his_citizenship_at_ellis | | And others are hilarious: - www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_of_a_street_sign_that_warns_drivers - www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_at_mcdonalds - www.reddit.com/r/dalle2/comments/wlfpax/the_elements_of_fire_water_earth_and_air_digital | Nition wrote: | Clickable links for the lazy (it seems that the http:// is | required to make it work): | | http://www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_m... | | http://www.reddit.com/r/dalle2/comments/vstuns/super_mario_g... | | http://www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_... | | http://www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_... | | http://www.reddit.com/r/dalle2/comments/wlfpax/the_elements_... | jeffchien wrote: | /r/weirddalle is also great for some inspiration, though most | of the entries are memes generated by Dall-e Mini/Craiyon. I | often find art styles and modifiers that I never considered, | like "Byzantine mosaic" or "Kurzgesagt video thumbnail". | | https://www.reddit.com/r/weirddalle/top/?sort=top&t=all | mFixman wrote: | My favourite one is Kermit the Frog in the style of different | movies. | | https://www.reddit.com/r/dalle2/comments/v1sc2z/kermit_the_f... | hombre_fatal wrote: | Love the stylistic ones. Amazing how it generates such good anime | and vaporwave variants, like the neon vaporwave backboard. | | I ran out of credits way too fast, so I like to see other people | playing with it and their iterative process.
| pigtailgirl wrote: | -- spent a day with DALL-E - here are some of my favorites: | https://imgur.com/a/uD5yjV3 -- | planetsprite wrote: | Were a lot of your prompts just "attractive girl hat and | sunglasses high quality photography" | pigtailgirl wrote: | -- hat pics are playing with "variations" mode - the prompt | was: "portrait photo, california beach with female model | wearing hat and sunglasses, studio, lens flare, colourful, | 4k, high definition, 35mm, HD" -- | prashp wrote: | You like your lobsters | pigtailgirl wrote: | -- they're the little lobsters we have over here (akazae)! - | quite expensive - _very_ good =) - | https://en.wikipedia.org/wiki/Metanephrops_japonicus -- | tough wrote: | They reminded me of these little guys we have in the Med | https://en.wikipedia.org/wiki/Nephrops_norvegicus | krisoft wrote: | > it was difficult to find images where the entire llama fit | within the frame | | I had the same trouble. In my experiment I wanted to generate a | Porco Rosso style seaplane illustration. Sadly none of the | generated pictures had the whole of the airplane in them. The | wingtips or the tail always got left off. | | I found this method to be a reliable workaround: I downloaded | the image I liked the most. Used image editing | software to extend the image in the direction I wanted it to be | extended and filled the new area with a solid colour. Cropped a | 1024x1024 size rectangle such that it had about 40% generated | image, and 60% solid colour. Uploaded the new image and asked | DALL-E to infill the solid area while leaving the previously | generated area unchanged. Selected from the generated extensions | the one I liked the best, downloaded it and merged it with the | rest of the picture. Repeated the process as required. | | You need a generous amount of overlap so the network can figure | out which parts are already there and how best to fit the rest. | It's a good idea to look at the image segment that needs to be | infilled.
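For the curious, the crop arithmetic in this workaround (each 1024px tile keeping roughly 40% already-generated pixels as context) can be sketched as a small helper. The numbers come from the comment above; the function itself is only an illustration, not part of any DALL-E tooling:

```python
TILE = 1024      # DALL-E's fixed canvas size
OVERLAP = 0.4    # fraction of each crop that is already-generated image

def extension_crops(image_width, target_width):
    """Plan the 1024-wide crop windows needed to extend an image to the
    right, each window straddling the current edge so ~40% of it is
    existing pixels and ~60% is solid colour for the model to infill."""
    overlap_px = int(TILE * OVERLAP)   # ~409 px of context per crop
    crops, extent = [], image_width
    while extent < target_width:
        left = extent - overlap_px     # window starts inside the image
        crops.append((left, left + TILE))
        extent = left + TILE           # new right edge once infilled
    return crops
```

Extending a 1024px-wide image to 2048px this way takes two infill rounds, each adding 615 fresh pixels of canvas.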
If you as a human can't figure out what it is you are | seeing, then the machine won't be able to figure it out either. | It will generate something, but it will look out of context once | merged. | | The other trick I found: I wanted to make my picture a canvas | print, and thus I needed a higher resolution image. Higher even | than what I could reasonably hope for with the above extension trick. | What I did is upscale the image (used bigjpg.com, | but there might be better solutions out there). After that I had | a big image, but of course it didn't have many small-scale details. | So I sliced it up into 1024x1024 rectangles, | uploaded the rectangles to DALL-E and asked it to keep the | borders intact but redraw their interiors. This second trick | worked particularly well on an area of the picture which showed a | city under the airplane. It added nice small details like | windows and doors and roofs with texture without disturbing the | overall composition. | devin wrote: | MidJourney allows you to specify other aspect ratios. DALL-E's | square constraint makes a lot of things more difficult than | they need to be IMO. | GaggiX wrote: | Also with Stable Diffusion. It's a really cool feature to | have and play around with. | bredren wrote: | I had similar problems trying to get the whole of a police car | overgrown with weeds. | | https://imgur.com/a/U5Hl2gO | | I was testing to see how close I could get to replicating a | t-shirt graphic concept I saw. | | I had been using ~"A telephoto shot of A neglected police car | from the 1980s Viewed from a 3/4 angle sits in the distance. | The entire vehicle is visible but it is overgrown with grass | and flowery vines" | | This process sounds great, though it seems like DALLE needs to | offer tools to do this automagically. | Miraste wrote: | What prompts did you use for the infill and detail generation? | krisoft wrote: | Good question!
All of them had the same postfix ", studio | ghibli, Hayao Miyazaki, in the style of Porco Rosso, | steampunk". I used this for all the generations in the hopes | of anchoring the style. | | With the prefix of the prompt I described the image. I | started the extension operations with "red seaplane over | fantasy mediterranean city" but then I quickly realised that | this was making the network generate floating cities in the | sky for me. :D So then I varied the prompt. "red seaplane on | blue sky" in the upper regions and "fantasy mediterranean | city" in the lower ones. | | I went even more specific and used "mediterranean sea port, | stone bridge with arches" prefix for a particular detail | where I wanted to retain the bridge (which I liked) but | improve on the arches (which looked quite dingy). | | (I have just counted and it seems I have used 27 generations | for this one project.) | fragmede wrote: | > I quickly realised that this was making the network | generate floating cities in the sky for me | | Maybe Dalle-2 is just secretly a Studio Ghibli/Miyazaki | movie fan. | andreyk wrote: | Wow, I've had the same trouble and these are some great tips! | Thanks for sharing | krisoft wrote: | Anytime! I have uploaded the image in question: the initial | prompt with first generated images, the extended raw image, | and then the one with the added details on the city. | | https://imgur.com/a/QEU7EJ2 | mdorazio wrote: | This is a fantastic end result. Thanks for sharing your | process to get there. | [deleted] | keepquestioning wrote: | DALL-E is truly magic. It got me believing we are close to AGI. | | I wonder what Gary Marcus or Filip Pieknewski think about it. | Surely they must be eating crow. | outworlder wrote: | > It got me believing we are close to AGI. | | We are not. But maybe we are closer to replicating some of our | internal brain workings. | dougmwne wrote: | Yesterday I saw one of Gandalf eating samples at Costco.
| I was | laughing hysterically for a minute. AI is not supposed to have | a sense of humor. That was supposed to be the last province of | the human, but it has been quite a while since a human made me laugh | like that. | outworlder wrote: | I don't think intelligence requires humor. It could be just a | quirk of our brains. | WoodenChair wrote: | > AI is not supposed to have a sense of humor. | | And this AI doesn't. Your anecdote is totally unrelated to | the idea of AGI in the gp post. The fact that it made you | laugh is a happenstance. It was not "trying" to make you | laugh. | dougmwne wrote: | It's only unrelated if there's no proto-AGI going on. Many | images give me a moment of doubt, even though I absolutely | know that I'm looking at nothing more than the output of a | pile of model weights, says I the pile of neurons. | Comevius wrote: | If I write a Python script that cuts together a bunch of | pictures and the output makes you laugh, the script hardly | deserves all the credit. It's us humans that create meaning. | kube-system wrote: | It's funny in the way that mad libs are funny. It's | unexpected. The _reason_ it is unexpected is because the | computer is dumb, not because it is smart. | dougmwne wrote: | I think the humor came from the vibe, humiliation, | dejection. Like seeing a beloved math teacher caught in an | adult video store. | | I also saw this one recently from Midjourney. Would not | call the humor random. | | https://www.reddit.com/r/midjourney/comments/w73rhv/prompt_t... | NateEag wrote: | What was the prompt for that image? | | What wrote the prompt? | dougmwne wrote: | But the prompt was not funny, only the image. | LegitShady wrote: | I saw that on reddit. The face was horrific and not at all | human-like. It didn't have a sense of humour - it just took a | prompt and mashed some things together, but the prompt was | funny and the image was horrifying.
Not even uncanny valley | shit, but "Gandalf was in a bad motorcycle accident and will never | look like a human again" bad. | | It's still up on the dalle2 subreddit. | jmfldn wrote: | This tells us little about AGI. It might seem like it does but | this is an incredibly narrow specific set of technologies. They | work together to produce some startling results (with many | limitations) but this is just another narrow application. | | I suspect AGI, depending on how it's defined, will be with us in | some form in the next few decades at most. Just a hunch. This | has nothing to do with that mission though imho. Maybe you can | read into it something like, "we are solving lots of discrete | problems like this, maybe we can somehow glue them together | into a higher level program"? That might give you something | AI-esque? My guess is that 'true' AGI will have an elegant | solution rather than a big bag of stuff glued together. | thfuran wrote: | We're pretty much just a big bag of stuff glued together. | croes wrote: | When I see some of the bad pictures it produces I think we are | nowhere near AGI | outworlder wrote: | Most people would draw even worse pictures given the same | prompts. | donkarma wrote: | most neural networks would draw even worse pictures given | the same prompts | Comevius wrote: | Machine learning just glues together existing things, which is | how art is created. As amusing as these pictures are, it's us | humans who bring meaning to them, both when producing what | these algorithms use as input and when consuming their output. | We are the actual magic behind DALL-E. | | An AGI wouldn't need us to this extent, or at all. An AGI would | also be able to come up with new ways to represent ideas, even | ways that are foreign to us. | sebringj wrote: | The images remind me of one of my dreams where logic and | reasoning are thrown out and the pure gist of the thing is taken.
| I wonder if, because it is built with vector operations and | calculus to determine the closest match or fuzzy matches for | essentially everything it eventually determines, sans cognition, | things would tend to be more fuzzy or quasi-close but not quite | there. Very entertaining post. | | I have my own api key as well, but not with DALL-E 2 access just | yet; it seems similar in terms of prompting text in stages to get | what you want. It feels kind of like negotiating with it in some | way. | outworlder wrote: | > The images remind me of one of my dreams (...) | | A lot of dream scenery seems to throw logic and reasoning out | of the window. Even small sensory inputs can make a huge | difference to a dream sequence. And in many cases they don't | make sense even in the context of the dream. | | I haven't personally experienced any hallucinations myself, but | some DALL-E images seem awfully similar to what some people | describe. | | I know that comparisons between brains and machine learning | (including neural networks) are superficial at best, but I | still wonder if DALL-E is mimicking, in its own way, a portion | of our larger brain processing 'pipeline'. | sebringj wrote: | Spot on, like the more basic part of a raw dream feed without | rhyme or reason. Maybe even laying the groundwork for an | experience architecture's input when that day finally comes, | who knows. | antoniuschan99 wrote: | The first thing I noticed was that it had no distinct features of a | basketball. Looks more like a bowling ball with the swirly | things on it. Kind of adds to your dream thought. | outworlder wrote: | Human dream sequences often have problems with faces, text | and mirrors. You can train yourself to try to focus on these | features when dreaming. | | Most people in our dreams don't even have faces that we would | recognize. When they do have faces, sometimes it is not even | the right face.
| humbleferret wrote: | "In working with DALL*E 2, it's important to be specific about | what you want without over-stuffing or adding redundant words." | | I found this to be the most important point from this piece. | Often people don't really know what they want when it | comes to creative work, let alone to some omniscient algorithm. | In spite of that, it's a delight to see something you love from | an unspecific prompt that you won't find with anything you | receive from a human. | | Dall.E 2 never ceases to amaze me. | | For anyone interested in learning about what Dall.E 2 can do, the | author also links to the Dall.E 2 prompt book (discussed in this | post https://news.ycombinator.com/item?id=32322329). | JadoJodo wrote: | I tried a number of these generators a week ago (or so), all with | the same prompt: "A child looking longingly at a lollipop on the | top shelf" with pretty abysmal (and sometimes horrifying) | results. I'm not sure if my expectations are too high, but maybe | I was doing it wrong? | Marazan wrote: | Dalle (and others) are great, almost magical, at specific types | of images and abysmal at others. | foobarbecue wrote: | It's fascinating to me that in the first image, the llama's | jersey has a drawing of a llama on it. I wonder if that was in | the prompt? | conception wrote: | https://pitch.com/v/DALL-E-prompt-book-v1-tmd33y | | The DALL-E 2 prompt book. If anything, it's a pretty neat look at how | the various prompts come out and some of the art created by it. | Vox_Leone wrote: | Can I use NLP to generate input for DALL-E 2? That would be cool. | MonkeyMalarky wrote: | I want to see a few iterations of describing an image with AI, | generating it, describing it again, generating it... Like when | passing a piece of text through Google translate back and | forth. | pamelafox wrote: | I tried that! Results were mixed: | https://twitter.com/pamelafox/status/1542593090472386561 | | It needs a better text-to-image model, I think.
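The describe-generate-describe loop proposed a couple of comments up can be sketched like this. `generate` and `caption` here are toy stand-ins, not real APIs; an actual pipeline would call a text-to-image model and an image-captioning model:

```python
# Toy stand-ins: the "model" just wraps the prompt, and the "captioner"
# loses one word per round, mimicking information decay in the loop.
def generate(prompt):
    return f"<image of: {prompt}>"

def caption(image):
    words = image[len("<image of: "):-1].split()
    return " ".join(words[:-1]) if len(words) > 1 else words[0]

def telephone(seed_prompt, rounds=10):
    """Iterate text -> image -> text until a fixed point, like
    Translation Party but between an image generator and a captioner."""
    prompt, prompts = seed_prompt, []
    for _ in range(rounds):
        image = generate(prompt)
        prompt = caption(image)
        prompts.append(prompt)
        # stop at "equilibrium": the caption no longer changes
        if len(prompts) >= 2 and prompts[-1] == prompts[-2]:
            break
    return prompts
```

With these stubs, `telephone("a red llama dunking")` decays to `["a red llama", "a red", "a", "a"]` - the same kind of drift (and occasional fixed point) the Translation Party comparison describes.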
Maybe you can | fork it and improve? | MonkeyMalarky wrote: | Interesting! I really like the flute > cup > bathtub | sequence. It has a real dreamlike disjointedness to it. | turdnagel wrote: | There was a tool that could find the "equilibrium" called | Translation Party. I don't think it works anymore. I'd love | to see one that goes back and forth between DALL-E and an | image description algorithm. | rmbyrro wrote: | According to popular internet belief, you'd end up with a | picture of a certain ignominious dictator that unfortunately | destroyed Europe in the 1940's. [1] | | [1] https://en.wikipedia.org/wiki/Godwin%27s_law | minimaxir wrote: | You can, in fact, use GPT-3 to engineer prompts for DALL-E 2 in | a sense. | | https://twitter.com/simonw/status/1555626060384911360 | jcims wrote: | I used GPT-3 to 'write' a children's book and asked it to | include descriptions of the illustrations. | | https://docs.google.com/presentation/d/1y8EE_p8bw9dIEDguT1bT... | | The fact that it's a derivative of an existing work is | noteworthy, but I gave it absolutely no guidance on the topic. | If I suggest something it will give it a go with similar | fervor. e.g. https://imgur.com/a/N1qWaSV | jfk13 wrote: | Your link doesn't seem to be publicly accessible. | falcor84 wrote: | >the ball is positioned in such a way that the llama has no real | hope of making the shot | | I love that we're at the level where the physical "realism" of | correctly representing quadrupeds playing basketball is a thing | now. I suppose the next level AI will be expected to model a full | 3d environment with physical assumptions based on the prompt and | then run the simulation | TheOtherHobbes wrote: | That's the only way to get reliably usable output. | | There's a lot of "80% there but not quite" in the current | version, which makes it more of a novelty than a useful content | generator.
| | The problem with moving to 3D is there are almost no 3D data | sources that combine textures, poses (where relevant), | lighting, 3D geometry and (ideally) physics. | | They can be inferred to some extent from 2D sources. But not | reliably. | | Humans operate effortlessly in 3D and creative humans have no | issues with using 3D perceptions creatively. | | But as far as most content is concerned it's a 2D world. Which | is why AI art bots know the texture of everything and the | geometry of nothing. | | AI generation is going to be stuck at nearly-but-not-quite | until that changes. | namrog84 wrote: | While not fully, there are a lot of freely available 3D models | that could be used as a starting point. I'd love a DALL-E 2 for 3D | model generation, even if no texture, lighting, or physics were | there. | Karawebnetwork wrote: | I was curious to compare results with Craiyon.ai | | Here is "llama in a jersey dunking a basketball like Michael | Jordan, shot from below, tilted frame, 35deg, Dutch angle, | extreme long shot, high detail, dramatic backlighting, epic, | digital art": https://imgur.com/a/7LoAtRx | | Here is "Llama in a jersey dunking a basketball like Michael | Jordan, screenshots from the Miyazaki anime movie", much worse: | https://imgur.com/a/g99G7Bn | speedgoose wrote: | Craiyon did step up a lot in its understanding recently. The | image quality is still not the best, but if you ignore the | blurriness, the scary faces, and the weird shapes, it can | sometimes be better than dall.e. | samspenc wrote: | Fascinating, are there any other similar products in this same | category as DALL.E and Craiyon? | peab wrote: | wombo.ai and midjourney | jiggywiggy wrote: | Wow, the blogs posted here are awesome - the octopus and this llama | especially. | | I myself can't seem to get it to work. I think it's not very good at | real things. Tried fitness-related images; they all come out weird. Probably | with fantasy kinda stuff it's better since it has to be less | accurate.
| EMIRELADERO wrote: | I wonder how this would play out with the new Stable Diffusion | vanadium1st wrote: | I've tried out a couple of prompts from the post in Stable | Diffusion and as expected the results were much weaker. It has | drawn some alpacas and basketballs with little relation between | the objects. | | I've been playing with Stable Diffusion a lot, and in my | experience its results are much weaker than what's shown in | this post. The artistic pictures that it generates are | beautiful, often more beautiful than Dalle-2 ones. But it has a | real problem understanding the basic concepts of anything that | is not the simplest task like "draw a character in this or that | style". And explaining the situations in detail doesn't help - | the AI just stumbles over basic requests. | | Seems like Stable Diffusion has a much more shallow | understanding of what it draws and can only produce good results | for things very similar to the images it learned from. For | example, it could generate really good Dutch still life | paintings for me - with fruits, bottles and all the regular | expected objects for this genre of painting. But when I asked | it to add some unusual objects to the painting (like a | Nintendo switch, or a laptop) - it couldn't grasp this concept | and just added more garbled fruit. Even though the system | definitely knows what a Switch looks like. | | The results in the post are much more impressive. I doubt that | Dalle-2 saw a lot of similar images in training, but in all of | the styles and examples it definitely understood how a llama | would interact with a basketball, what their relative sizes are, | and stuff like that. On the surface, results from different engines | might look similar, but to me this is an enormous difference in | quality and sophistication. | GaggiX wrote: | Stable Diffusion has a smaller text encoder than Dalle 2 and | other models (Imagen, Parti, Craiyon) so that it can fit into | consumer GPUs.
I believe StabilityAI will train models based | on a larger text encoder; the text encoder is frozen and does | not require training, so scaling it up is essentially | free. For now this is the biggest bottleneck with Stable | Diffusion; the generator is really good and the image quality | alone is incredible (managing to outperform Dalle 2 most of | the time). | netfortius wrote: | How could all this play into "flooding" the NFT markets? | dymk wrote: | It's hard to flood the NFT market any further. It was almost | all autogenerated art before DALL-E was publicly available. | pwython wrote: | They're already using DALL-E for that 2021 fad. | | I'm more curious about how this will affect stock photography. | Soon anyone can generate the exact image they're looking for, | no matter how obscure. | LegitShady wrote: | NFTs are just numbers on a blockchain. The picture is a canard. | In the US I don't think you can copyright DALL-E images as they | aren't created by a human, so you spend money to make them and | anyone else can use them. | renewiltord wrote: | This is really good fun, actually. Spent some time fucking around | with it and it can make some impressive photorealistic stuff like | "hoverbus in san francisco by the ferry building, digital photo". | | I mostly use it and Midjourney for material for my DnD campaign, | but I'm going to need to do a little more work to make the whole | thing coherent. Only tried it once and it was okay. | | The interesting part is that it can do things like "female ice | giant" reasonably whereas google will just give you sexy bikini | ice giant for stuff like that, which is not the vibe of my | campaign! | BashiBazouk wrote: | Is there randomization or will the same prompts produce the same | image sets? | minimaxir wrote: | Always random.
(in theory a seed is possible but not offered) | croes wrote: | So the services that sell Dall-E 2 prompts are useless | minimaxir wrote: | There's _some_ stability offered by specific prompts | though. | Taylor_OD wrote: | I love this. | f0e4c2f7 wrote: | I recently made PromptWiki[0] to try to document useful prompts | and examples. | | I think we're at the beginning of exploring what these image | models can do and what the best ways to work with them are. | | [0] https://promptwiki.com | aj7 wrote: | I tried "machining a Siamese cat on the lathe" but with | disappointing results. | kayfhf wrote: | simias wrote: | I'm usually very much a skeptic when it comes to "revolutionary" | tech. I think the blockchain is crap. I think fully self-driving | cars are still a long way away. I think that VR and the metaverse | are going to remain gimmicks in the foreseeable future. | | But this DALL-E thing, it's really blowing my mind. That and deep | fakes, now that's sci-fi tech. It's both exciting and a bit | scary. | | The idea that in the not so far future one will be able to create | images (and I presume later, audio and video) of basically | anything with just a simple text prompt is rife with potential | (both good and bad). It's going to change the way we look at art, | it's also going to give incredibly powerful creative tools to the | masses. | | For me the endgame would be an AI sufficiently advanced that one | could prompt "make an episode of Seinfeld that centers around | deep fakes" and you'd get an episode virtually indistinguishable | from a real one. Home-made, tailor-made entertainment. | Terrifyingly amazing. See you in a few decades... | obloid wrote: | "Image intentionally modified to blur and hide faces" | | I thought this was strange. Why hide an AI generated face? | ticviking wrote: | They're being used to create fake profile pictures. | kube-system wrote: | I'm not sure why anyone bothers. 
StyleGAN2 profile photos are | literally all over social media and they're good enough to | fool the human reviewers every time I report them. | vbezhenar wrote: | Is it hard to reimplement that algorithm? I want to see what | people would do with a porn-enabled image generator. Hopefully | pornhub is already hiring data scientists. | kristiandupont wrote: | I picture in a few years we will be playing around with a code | generation tool, and people will be drawing similar conclusions. | "You have to be really specific about what you like. If you just | say 'chat tool', it will allow you to chat to one other person | only." | tambourine_man wrote: | > It's important to tell DALL*E 2 exactly what you want | | That's not as easy as it sounds. Especially in the surreal cases | that DALL-E is usually asked for. | | Sometimes you don't know what you want until you see it. Other | times you do, but are not able to express it in ways that the | computer can understand. | | I see being able to communicate efficiently with the machine as a | future in-demand skill | upupandup wrote: | I asked DALL-E for 'bottomless naked women' and I was banned. | bpye wrote: | I suspect this is a joke, but I did find that it was a little | overzealous with the filtering. I was trying to get someone | (not a specific person) shouting or with an angry expression, | and a few prompts I came up with were blocked. Not banned | though. | astrange wrote: | I kept getting a scene with "two people holding hands" | blocked, it allowed "two people kissing", and then when I | tried "and wife" instead of "two people" it banned me. | (They unbanned me when I emailed them though.) | | Oddly, the ones it blocked were more sfw than several | others it allowed, but of course I don't know what the | outputs would've been... | mattwad wrote: | At least 10% of web dev today is being good at search prompts
(And that's not necessarily a bad thing, it's just | about finding the right tool or pattern for your specific | problem) | tambourine_man wrote: | Oh yeah. Knowing the keywords is what makes you an expert | neonate wrote: | https://archive.ph/RwY42 | sgtFloyd wrote: | My two cents: the techniques OP uses are absolutely valid, but | I've found much more success "sampling" styles and poses from | existing works. | | Rather than trying to perfectly describe my image, I like to use | references where the source material has what you want. With | minimal direction these prompts get impressively close: | | "larry bird as a llama, dramatic basketball dunk in a bright | arena, low angle action shot, from the movie Madagascar (2005)" | https://labs.openai.com/s/wxbIbXa0HRwwGUqQaKSLtzmR | | "Michael Jordan as a llama dunking a basketball, Space Jam | (1996)" https://labs.openai.com/s/mX4T5Iak8CMO1rPAmjRb7oyH | | At this point I'd experiment with more stylized/recognizable | references or add a couple "effects" to polish up the results. | turdnagel wrote: | My current move is creating initial versions of images with | Midjourney, which seems to be a bit more "free-spirited" (read: | less _literal_, more flexible) and then using DALL-E's replace | tool to fill in the weird-looking bits. It works pretty well, but | it's a multi-step process and requires you to pay for both | Midjourney and DALL-E. | karaterobot wrote: | I ran into this too. When I got my invite, I told a friend I | would learn how to talk to DALL-E by having it make some concept | art for the game he was designing. I ran through all of my free | credits, and most of the first $15 bucket, and never really got | anything usable. | | Even when I re-used the _exact prompts_ from the DALL-E Prompt | Book, I didn't get anything near the level of quality and | fidelity to the prompt that their examples did.
| | I know it's not a scam, because it's clearly doing amazing stuff | under the hood, but I went away thinking that it wasn't as | miraculous as it was claimed to be. | jfk13 wrote: | I suspect that many of the "impressive" examples that we see | from tools like this have been carefully selected by human | curators. I'm sure it's not at the level of "monkeys + | typewriters = Shakespeare [if you're sufficiently selective]", | but the general idea is still applicable. | grumbel wrote: | Most of DALL-E2's output is great out of the box; the selection | process is just fine-tuning the results to create something | the human in front of the computer likes. DALL-E2 can't | read minds, so the image produced might not match what the | human had in mind. | | There is, however, one thing to be aware of: the titles posted | on /r/dalle2/ and other places are often not the prompts that | DALL-E2 got. Instead they are a fun description of the image | done by a human after the fact. Random example: | | "Chased by an amongus segway" | | * https://www.reddit.com/r/dalle2/comments/wkv7za/chased_by_an... | | But the actual prompt was: | | "Award winning photo of a mole driving a red off road car | through a field" | | * https://labs.openai.com/s/xnaoxiWeSjiQX1QyVUCHGkl1 | | Which is quite a bit less impressive, as the actual prompt | doesn't really match the image very well. And if you put | "Chased by an amongus segway" into DALL-E2, you won't get an | image of that quality either. | coldcode wrote: | It's fun to play around with it, but like the author found, what | you get is often strange or useless. I also find 1k images too | small to do much with, but I realize making 4k images would be | cost prohibitive. I also wish it could generate vector images as | well as pixel images. That would be fun to use. | jordanmorgan10 wrote: | A lot of these posts are showing up on HN.
I wonder - is it because | it is so new, or is it because the ways in which we can use | this technology are so nascent that we are discovering daily how to use | it more precisely? | dougmwne wrote: | I believe it's for a few reasons. First, it is jaw-droppingly | incredible for most people in tech who have at least a hint of | how most ML works. Second, the AI image generation field is | racing ahead, in academia and newly trained models, so there's | lots of new news. Thirdly, some really great models like Dall-e | have been opened for wider access, and lots of everyday users | are discovering their capabilities and doing blog write-ups, | which are not news, but are surely interesting to most. | pleasantpeasant wrote: | There was a thread on r/DigitalArt about people debating if | you're really an artist if you're using these AI creator | websites. | | Some guy spent hours feeding the AI pictures he liked to get an | end result he was happy with. ___________________________________________________________________ (page generated 2022-08-11 23:00 UTC)