[HN Gopher] Spent $15 in DALL*E 2 credits creating this AI image ___________________________________________________________________ Spent $15 in DALL*E 2 credits creating this AI image Author : pat-jay Score : 283 points Date : 2022-08-11 16:53 UTC (6 hours ago) (HTM) web link (pub.towardsai.net) (TXT) w3m dump (pub.towardsai.net) | fnordpiglet wrote: | If you think it's hard to get an AI to render what's in your | mind, try another human artist. Specifying something visually | complex with an assumption that it'll be precisely what you're | imagining is shockingly hard. I'm not surprised prompt creation | is so complex. At least with the AI bots the turnaround time for | iteration is tight. That said, humans likely iterate fewer times, | but each iteration takes a long time. | anigbrowl wrote: | Can't wait for 'Tell HN: how I make mid six figures as a prompt | engineer'. | Nition wrote: | Absolutely. See also: https://promptbase.com | | And we're still in the early days. | anigbrowl wrote: | WTAF | | Unwillingly considering whether the easy bucks are worth the | greasy feeling. | Workaccount2 wrote: | "We let our graphic designer go so we could onboard an AI Prompt | Engineer" | | "How much are we paying him?" | | "About $225k plus bonus and equity" | | "And how much was the graphic designer paid?" | | "$55k" | | "..." | rfrey wrote: | It's the graphic design industry's own fault for not | gradually renaming themselves as Pixel Intensity Engineers. | _pastel wrote: | If you're interested in browsing creative prompts, I highly | recommend the reddit community at r/dalle2.
| | Some are impressive: - www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_mona_lisa - www.reddit.com/r/dalle2/comments/vstuns/super_mario_getting_his_citizenship_at_ellis | | And others are hilarious: - www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_of_a_street_sign_that_warns_drivers - www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_at_mcdonalds - www.reddit.com/r/dalle2/comments/wlfpax/the_elements_of_fire_water_earth_and_air_digital | Nition wrote: | Clickable links for the lazy (it seems that the http:// is | required to make it work): | | http://www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_m... | | http://www.reddit.com/r/dalle2/comments/vstuns/super_mario_g... | | http://www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_... | | http://www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_... | | http://www.reddit.com/r/dalle2/comments/wlfpax/the_elements_... | jeffchien wrote: | /r/weirddalle is also great for some inspiration, though most | of the entries are memes generated by Dall-e Mini/Craiyon. I | often find art styles and modifiers that I never considered, | like "Byzantine mosaic" or "Kurzgesagt video thumbnail". | | https://www.reddit.com/r/weirddalle/top/?sort=top&t=all | mFixman wrote: | My favourite one is Kermit the Frog in the style of different | movies. | | https://www.reddit.com/r/dalle2/comments/v1sc2z/kermit_the_f... | hombre_fatal wrote: | Love the stylistic ones. Amazing how it generates such good anime | and vaporwave variants, like the neon vaporwave backboard. | | I ran out of credits way too fast, so I like to see other people | playing with it and their iterative process.
| pigtailgirl wrote: | -- spent a day with DALL-E - here are some of my favorites: | https://imgur.com/a/uD5yjV3 -- | planetsprite wrote: | Were a lot of your prompts just "attractive girl hat and | sunglasses high quality photography" | pigtailgirl wrote: | -- hat pics are playing with "variations" mode - the prompt | was: "portrait photo, california beach with female model | wearing hat and sunglasses, studio, lens flare, colourful, | 4k, high definition, 35mm, HD" -- | prashp wrote: | You like your lobsters | pigtailgirl wrote: | -- they're the little lobsters we have over here (akazae)! - | quite expensive - _very_ good =) - | https://en.wikipedia.org/wiki/Metanephrops_japonicus -- | tough wrote: | They reminded me of these little guys we have in the Med | https://en.wikipedia.org/wiki/Nephrops_norvegicus | krisoft wrote: | > it was difficult to find images where the entire llama fit | within the frame | | I had the same trouble. In my experiment I wanted to generate a | Porco Rosso style seaplane illustration. Sadly none of the | generated pictures had the whole of the airplane in them. The | wingtips or the tail always got left off. | | I found this method to be a reliable workaround: I downloaded | the image I liked the most. Used image editing | software to extend the image in the direction I wanted it to be | extended and filled the new area with a solid colour. Cropped a | 1024x1024 size rectangle such that it had about 40% generated | image, and 60% solid colour. Uploaded the new image and asked | DALL-E to infill the solid area while leaving the previously | generated area unchanged. Selected from the generated extensions | the one I liked the best, downloaded it and merged it with the | rest of the picture. Repeated the process as required. | | You need a generous amount of overlap so the network can figure | out which parts are already there and how best to fit the rest. | It's a good idea to look at the image segment that needs to be | infilled.
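For the curious, the crop arithmetic in this workaround (each 1024px tile keeping roughly 40% already-generated pixels as context) can be sketched as a small helper. The numbers come from the comment above; the function itself is only an illustration, not part of any DALL-E tooling:

```python
TILE = 1024      # DALL-E's fixed canvas size
OVERLAP = 0.4    # fraction of each crop that is already-generated image

def extension_crops(image_width, target_width):
    """Plan the 1024-wide crop windows needed to extend an image to the
    right, each window straddling the current edge so ~40% of it is
    existing pixels and ~60% is solid colour for the model to infill."""
    overlap_px = int(TILE * OVERLAP)   # ~409 px of context per crop
    crops, extent = [], image_width
    while extent < target_width:
        left = extent - overlap_px     # window starts inside the image
        crops.append((left, left + TILE))
        extent = left + TILE           # new right edge once infilled
    return crops
```

Extending a 1024px-wide image to 2048px this way takes two infill rounds, each adding 615 fresh pixels of canvas.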
If you as a human can't figure out what it is you are | seeing, then the machine won't be able to figure it out either. | It will generate something, but it will look out of context once | merged. | | The other trick I found: I wanted to make my picture a canvas | print, and thus I needed a higher resolution image. Higher even | than what I could reasonably hope for with the above extension trick. | What I did is upscale the image (used bigjpg.com, | but there might be better solutions out there). After that I had | a big image, but of course it didn't have many small-scale details. | So I sliced it up into 1024x1024 rectangles, | uploaded the rectangles to DALL-E and asked it to keep the | borders intact but redraw their interiors. This second trick | worked particularly well on an area of the picture which showed a | city under the airplane. It added nice small details like | windows and doors and roofs with texture without disturbing the | overall composition. | devin wrote: | MidJourney allows you to specify other aspect ratios. DALL-E's | square constraint makes a lot of things more difficult than | they need to be IMO. | GaggiX wrote: | Also with Stable Diffusion. It's a really cool feature to | have and play around with. | bredren wrote: | I had similar problems trying to get the whole of a police car | overgrown with weeds. | | https://imgur.com/a/U5Hl2gO | | I was testing to see how close I could get to replicating a | t-shirt graphic concept I saw. | | I had been using ~"A telephoto shot of A neglected police car | from the 1980s Viewed from a 3/4 angle sits in the distance. | The entire vehicle is visible but it is overgrown with grass | and flowery vines" | | This process sounds great, though it seems like DALLE needs to | offer tools to do this automagically. | Miraste wrote: | What prompts did you use for the infill and detail generation? | krisoft wrote: | Good question!
All of them had the same postfix ", studio | ghibli, Hayao Miyazaki, in the style of Porco Rosso, | steampunk". I used this for all the generations in the hopes | of anchoring the style. | | With the prefix of the prompt I described the image. I | started the extension operations with "red seaplane over | fantasy mediterranean city" but then I quickly realised that | this was making the network generate floating cities in the | sky for me. :D So then I varied the prompt. "red seaplane on | blue sky" in the upper regions and "fantasy mediterranean | city" in the lower ones. | | I went even more specific and used "mediterranean sea port, | stone bridge with arches" prefix for a particular detail | where I wanted to retain the bridge (which I liked) but | improve on the arches (which looked quite dingy). | | (I have just counted and it seems I have used 27 generations | for this one project.) | fragmede wrote: | > I quickly realised that this was making the network | generate floating cities in the sky for me | | Maybe Dalle-2 is just secretly a Studio Ghibli/Miyazaki | movie fan. | andreyk wrote: | Wow, I've had the same trouble and these are some great tips! | Thanks for sharing | krisoft wrote: | Anytime! I have uploaded the image in question: the initial | prompt with first generated images, the extended raw image, | and then the one with the added details on the city. | | https://imgur.com/a/QEU7EJ2 | mdorazio wrote: | This is a fantastic end result. Thanks for sharing your | process to get there. | [deleted] | keepquestioning wrote: | DALL-E is truly magic. It got me believing we are close to AGI. | | I wonder what Gary Marcus or Filip Pieknewski think about it. | Surely they must be eating crow. | outworlder wrote: | > It got me believing we are close to AGI. | | We are not. But maybe we are closer to replicating some of our | internal brain workings. | dougmwne wrote: | Yesterday I saw one of Gandalf eating samples at Costco.
| I was | laughing hysterically for a minute. AI is not supposed to have | a sense of humor. That was supposed to be the last province of | the human, but it has been quite a while since a human made me laugh | like that. | outworlder wrote: | I don't think intelligence requires humor. It could be just a | quirk of our brains. | WoodenChair wrote: | > AI is not supposed to have a sense of humor. | | And this AI doesn't. Your anecdote is totally unrelated to | the idea of AGI in the gp post. The fact that it made you | laugh is a happenstance. It was not "trying" to make you | laugh. | dougmwne wrote: | It's only unrelated if there's no proto-AGI going on. Many | images give me a moment of doubt, even though I absolutely | know that I'm looking at nothing more than the output of a | pile of model weights, says I the pile of neurons. | Comevius wrote: | If I write a Python script that cuts together a bunch of | pictures and the output makes you laugh, the script hardly | deserves all the credit. It's us humans that create meaning. | kube-system wrote: | It's funny in the way that mad libs are funny. It's | unexpected. The _reason_ it is unexpected is because the | computer is dumb, not because it is smart. | dougmwne wrote: | I think the humor came from the vibe, humiliation, | dejection. Like seeing a beloved math teacher caught in an | adult video store. | | I also saw this one recently from Midjourney. Would not | call the humor random. | | https://www.reddit.com/r/midjourney/comments/w73rhv/prompt_t... | NateEag wrote: | What was the prompt for that image? | | What wrote the prompt? | dougmwne wrote: | But the prompt was not funny, only the image. | LegitShady wrote: | I saw that on reddit. The face was horrific and not at all | human-like. It didn't have a sense of humour - it just took a | prompt and mashed some things together, but the prompt was | funny and the image was horrifying.
Not even uncanny valley | shit, but "Gandalf was in a bad motorcycle accident and will never | look like a human again" bad. | | It's still up on the dalle2 subreddit. | jmfldn wrote: | This tells us little about AGI. It might seem like it does but | this is an incredibly narrow specific set of technologies. They | work together to produce some startling results (with many | limitations) but this is just another narrow application. | | I suspect AGI, depending on how it's defined, will be with us in | some form in the next few decades at most. Just a hunch. This | has nothing to do with that mission though imho. Maybe you can | read into it something like, "we are solving lots of discrete | problems like this, maybe we can somehow glue them together | into a higher level program"? That might give you something | AI-esque? My guess is that 'true' AGI will have an elegant | solution rather than a big bag of stuff glued together. | thfuran wrote: | We're pretty much just a big bag of stuff glued together. | croes wrote: | When I see some of the bad pictures it produces I think we are | nowhere near AGI | outworlder wrote: | Most people would draw even worse pictures given the same | prompts. | donkarma wrote: | most neural networks would draw even worse pictures given | the same prompts | Comevius wrote: | Machine learning just glues together existing things, which is | how art is created. As amusing as these pictures are, it's us | humans who bring meaning to them, both when producing what | these algorithms use as input and when consuming their output. | We are the actual magic behind DALL-E. | | An AGI wouldn't need us to this extent, or at all. An AGI would | also be able to come up with new ways to represent ideas, even | ways that are foreign to us. | sebringj wrote: | The images remind me of one of my dreams where logic and | reasoning are thrown out and the pure gist of the thing is taken.
| I wonder if, because it is built with vector operations and | calculus to determine the closest match or fuzzy matches for | essentially everything it eventually determines, sans cognition, | things would tend to be more fuzzy or quasi-close but not quite | there. Very entertaining post. | | I have my own api key as well, but not with DALL-E 2 access just | yet; it seems similar in terms of prompting text in stages to get | what you want. It feels kind of like negotiating with it in some | way. | outworlder wrote: | > The images remind me of one of my dreams (...) | | A lot of dream scenery seems to throw logic and reasoning out | of the window. Even small sensory inputs can make a huge | difference to a dream sequence. And in many cases they don't | make sense even in the context of the dream. | | I haven't personally experienced any hallucinations myself, but | some DALL-E images seem awfully similar to what some people | describe. | | I know that comparisons between brains and machine learning | (including neural networks) are superficial at best, but I | still wonder if DALL-E is mimicking, in its own way, a portion | of our larger brain processing 'pipeline'. | sebringj wrote: | Spot on, like the more basic part of a raw dream feed without | rhyme or reason. Maybe even laying the groundwork for an | experience architecture's input when that day finally comes, | who knows. | antoniuschan99 wrote: | The first thing I noticed was that it had no distinct features of a | basketball. Looks more like a bowling ball with the swirly | things on it. Kind of adds to your dream thought. | outworlder wrote: | Human dream sequences often have problems with faces, text | and mirrors. You can train yourself to try to focus on these | features when dreaming. | | Most people in our dreams don't even have faces that we would | recognize. When they do have faces, sometimes it is not even | the right face.
| humbleferret wrote: | "In working with DALL*E 2, it's important to be specific about | what you want without over-stuffing or adding redundant words." | | I found this to be the most important point from this piece. | Often people don't really know what they want when it | comes to creative work, let alone to some omniscient algorithm. | In spite of that, it's a delight to see something you love from | an unspecific prompt that you won't find with anything you | receive from a human. | | Dall.E 2 never ceases to amaze me. | | For anyone interested in learning about what Dall.E 2 can do, the | author also links to the Dall.E 2 prompt book (discussed in this | post https://news.ycombinator.com/item?id=32322329). | JadoJodo wrote: | I tried a number of these generators a week ago (or so), all with | the same prompt: "A child looking longingly at a lollipop on the | top shelf" with pretty abysmal (and sometimes horrifying) | results. I'm not sure if my expectations are too high, but maybe | I was doing it wrong? | Marazan wrote: | Dalle (and others) are great, almost magical, at specific types | of images and abysmal at others. | foobarbecue wrote: | It's fascinating to me that in the first image, the llama's | jersey has a drawing of a llama on it. I wonder if that was in | the prompt? | conception wrote: | https://pitch.com/v/DALL-E-prompt-book-v1-tmd33y | | The DALL-E 2 prompt book. If anything, it's a pretty neat look at how | the various prompts come out and some of the art created by it. | Vox_Leone wrote: | Can I use NLP to generate input for DALL-E 2? That would be cool. | MonkeyMalarky wrote: | I want to see a few iterations of describing an image with AI, | generating it, describing it again, generating it... Like when | passing a piece of text through Google translate back and | forth. | pamelafox wrote: | I tried that! Results were mixed: | https://twitter.com/pamelafox/status/1542593090472386561 | | It needs a better text-to-image model, I think.
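The describe-generate-describe loop proposed a couple of comments up can be sketched like this. `generate` and `caption` here are toy stand-ins, not real APIs; an actual pipeline would call a text-to-image model and an image-captioning model:

```python
# Toy stand-ins: the "model" just wraps the prompt, and the "captioner"
# loses one word per round, mimicking information decay in the loop.
def generate(prompt):
    return f"<image of: {prompt}>"

def caption(image):
    words = image[len("<image of: "):-1].split()
    return " ".join(words[:-1]) if len(words) > 1 else words[0]

def telephone(seed_prompt, rounds=10):
    """Iterate text -> image -> text until a fixed point, like
    Translation Party but between an image generator and a captioner."""
    prompt, prompts = seed_prompt, []
    for _ in range(rounds):
        image = generate(prompt)
        prompt = caption(image)
        prompts.append(prompt)
        # stop at "equilibrium": the caption no longer changes
        if len(prompts) >= 2 and prompts[-1] == prompts[-2]:
            break
    return prompts
```

With these stubs, `telephone("a red llama dunking")` decays to `["a red llama", "a red", "a", "a"]` - the same kind of drift (and occasional fixed point) the Translation Party comparison describes.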
Maybe you can | fork it and improve? | MonkeyMalarky wrote: | Interesting! I really like the flute > cup > bathtub | sequence. It has a real dreamlike disjointedness to it. | turdnagel wrote: | There was a tool that could find the "equilibrium" called | Translation Party. I don't think it works anymore. I'd love | to see one that goes back and forth between DALL-E and an | image description algorithm. | rmbyrro wrote: | According to popular internet belief, you'd end up with a | picture of a certain ignominious dictator that unfortunately | destroyed Europe in the 1940's. [1] | | [1] https://en.wikipedia.org/wiki/Godwin%27s_law | minimaxir wrote: | You can, in fact, use GPT-3 to engineer prompts for DALL-E 2 in | a sense. | | https://twitter.com/simonw/status/1555626060384911360 | jcims wrote: | I used GPT-3 to 'write' a children's book and asked it to | include descriptions of the illustrations. | | https://docs.google.com/presentation/d/1y8EE_p8bw9dIEDguT1bT... | | The fact that it's a derivative of an existing work is | noteworthy, but I gave it absolutely no guidance on the topic. | If I suggest something it will give it a go with similar | fervor. e.g. https://imgur.com/a/N1qWaSV | jfk13 wrote: | Your link doesn't seem to be publicly accessible. | falcor84 wrote: | >the ball is positioned in such a way that the llama has no real | hope of making the shot | | I love that we're at the level where the physical "realism" of | correctly representing quadrupeds playing basketball is a thing | now. I suppose the next level AI will be expected to model a full | 3d environment with physical assumptions based on the prompt and | then run the simulation | TheOtherHobbes wrote: | That's the only way to get reliably usable output. | | There's a lot of "80% there but not quite" in the current | version, which makes it more of a novelty than a useful content | generator.
| | The problem with moving to 3D is there are almost no 3D data | sources that combine textures, poses (where relevant), | lighting, 3D geometry and (ideally) physics. | | They can be inferred to some extent from 2D sources. But not | reliably. | | Humans operate effortlessly in 3D and creative humans have no | issues with using 3D perceptions creatively. | | But as far as most content is concerned it's a 2D world. Which | is why AI art bots know the texture of everything and the | geometry of nothing. | | AI generation is going to be stuck at nearly-but-not-quite | until that changes. | namrog84 wrote: | While not fully, there are a lot of freely available 3D models | that could be used as a starting point. I'd love a DALL-E 2 for 3D | model generation, even if no texture, lighting, or physics were | there. | Karawebnetwork wrote: | I was curious to compare results with Craiyon.ai | | Here is "llama in a jersey dunking a basketball like Michael | Jordan, shot from below, tilted frame, 35deg, Dutch angle, | extreme long shot, high detail, dramatic backlighting, epic, | digital art": https://imgur.com/a/7LoAtRx | | Here is "Llama in a jersey dunking a basketball like Michael | Jordan, screenshots from the Miyazaki anime movie", much worse: | https://imgur.com/a/g99G7Bn | speedgoose wrote: | Craiyon did step up a lot in its understanding recently. The | image quality is still not the best, but if you ignore the | blurriness, the scary faces, and the weird shapes, it can | sometimes be better than dall.e. | samspenc wrote: | Fascinating, are there any other similar products in this same | category as DALL.E and Craiyon? | peab wrote: | wombo.ai and midjourney | jiggywiggy wrote: | Wow, the blogs posted here are awesome - the octopus and this llama | especially. | | I myself can't seem to get it to work. I think it's not very good at | real things. Tried fitness-related images; they all come out weird. Probably | with fantasy kinda stuff it's better since it has to be less | accurate.
| EMIRELADERO wrote: | I wonder how this would play out with the new Stable Diffusion | vanadium1st wrote: | I've tried out a couple of prompts from the post in Stable | Diffusion and as expected the results were much weaker. It has | drawn some alpacas and basketballs with little relation between | the objects. | | I've been playing with Stable Diffusion a lot, and in my | experience its results are much weaker than what's shown in | this post. The artistic pictures that it generates are | beautiful, often more beautiful than Dalle-2 ones. But it has a | real problem understanding the basic concepts of anything that | is not the simplest task like "draw a character in this or that | style". And explaining the situations in detail doesn't help - | the AI just stumbles over basic requests. | | Seems like Stable Diffusion has a much more shallow | understanding of what it draws and can only produce good results | for things very similar to the images it learned from. For | example, it could generate really good Dutch still life | paintings for me - with fruits, bottles and all the regular | expected objects for this genre of painting. But when I asked | it to add some unusual objects to the painting (like a | Nintendo switch, or a laptop) - it couldn't grasp this concept | and just added more garbled fruit. Even though the system | definitely knows what a Switch looks like. | | The results in the post are much more impressive. I doubt that | Dalle-2 saw a lot of similar images in training, but in all of | the styles and examples it definitely understood how a llama | would interact with a basketball, what their relative sizes are, | and stuff like that. On the surface, results from different engines | might look similar, but to me this is an enormous difference in | quality and sophistication. | GaggiX wrote: | Stable Diffusion has a smaller text encoder than Dalle 2 and | other models (Imagen, Parti, Craiyon) so that it can fit into | consumer GPUs.
I believe StabilityAI will train models based | on a larger text encoder; the text encoder is frozen and does | not require training, so scaling it up is essentially | free. For now this is the biggest bottleneck with Stable | Diffusion; the generator is really good and the image quality | alone is incredible (managing to outperform Dalle 2 most of | the time). | netfortius wrote: | How could all this play into "flooding" the NFT markets? | dymk wrote: | It's hard to flood the NFT market any further. It was almost | all autogenerated art before DALL-E was publicly available. | pwython wrote: | They're already using DALL-E for that 2021 fad. | | I'm more curious about how this will affect stock photography. | Soon anyone can generate the exact image they're looking for, | no matter how obscure. | LegitShady wrote: | NFTs are just numbers on a blockchain. The picture is a canard. | In the US I don't think you can copyright DALL-E images as they | aren't created by a human, so you spend money to make them and | anyone else can use them. | renewiltord wrote: | This is really good fun, actually. Spent some time fucking around | with it and it can make some impressive photorealistic stuff like | "hoverbus in san francisco by the ferry building, digital photo". | | I mostly use it and Midjourney for material for my DnD campaign, | but I'm going to need to do a little more work to make the whole | thing coherent. Only tried it once and it was okay. | | The interesting part is that it can do things like "female ice | giant" reasonably whereas google will just give you sexy bikini | ice giant for stuff like that, which is not the vibe of my | campaign! | BashiBazouk wrote: | Is there randomization or will the same prompts produce the same | image sets? | minimaxir wrote: | Always random.
(in theory a seed is possible but not offered) | croes wrote: | So the services that sell Dall-E 2 prompts are useless | minimaxir wrote: | There's _some_ stability offered by specific prompts | though. | Taylor_OD wrote: | I love this. | f0e4c2f7 wrote: | I recently made PromptWiki[0] to try to document useful prompts | and examples. | | I think we're at the beginning of exploring what these image | models can do and what the best ways to work with them are. | | [0] https://promptwiki.com | aj7 wrote: | I tried "machining a Siamese cat on the lathe" but with | disappointing results. | kayfhf wrote: | simias wrote: | I'm usually very much a skeptic when it comes to "revolutionary" | tech. I think the blockchain is crap. I think fully self-driving | cars are still a long way away. I think that VR and the metaverse | are going to remain gimmicks in the foreseeable future. | | But this DALL-E thing, it's really blowing my mind. That and deep | fakes, now that's sci-fi tech. It's both exciting and a bit | scary. | | The idea that in the not so far future one will be able to create | images (and I presume later, audio and video) of basically | anything with just a simple text prompt is rife with potential | (both good and bad). It's going to change the way we look at art, | it's also going to give incredibly powerful creative tools to the | masses. | | For me the endgame would be an AI sufficiently advanced that one | could prompt "make an episode of Seinfeld that centers around | deep fakes" and you'd get an episode virtually indistinguishable | from a real one. Home-made, tailor-made entertainment. | Terrifyingly amazing. See you in a few decades... | obloid wrote: | "Image intentionally modified to blur and hide faces" | | I thought this was strange. Why hide an AI generated face? | ticviking wrote: | They're being used to create fake profile pictures. | kube-system wrote: | I'm not sure why anyone bothers. 
StyleGAN2 profile photos are | literally all over social media and they're good enough to | fool the human reviewers every time I report them. | vbezhenar wrote: | Is it hard to reimplement that algorithm? I want to see what | people would do with a porn-enabled image generator. Hopefully | pornhub is already hiring data scientists. | kristiandupont wrote: | I picture in a few years we will be playing around with a code | generation tool, and people will be drawing similar conclusions. | "You have to be really specific about what you like. If you just | say 'chat tool', it will allow you to chat to one other person | only." | tambourine_man wrote: | > It's important to tell DALL*E 2 exactly what you want | | That's not as easy as it sounds. Especially in the surreal cases | that DALL-E is usually asked for. | | Sometimes you don't know what you want until you see it. Other | times you do, but are not able to express it in ways that the | computer can understand. | | I see being able to communicate efficiently with the machine as a | future in-demand skill | upupandup wrote: | I asked DALL-E for 'bottomless naked women' and I was banned. | bpye wrote: | I suspect this is a joke, but I did find that it was a little | overzealous with the filtering. I was trying to get someone | (not a specific person) shouting or with an angry expression, | and a few prompts I came up with were blocked. Not banned | though. | astrange wrote: | I kept getting a scene with "two people holding hands" | blocked, it allowed "two people kissing", and then when I | tried "and wife" instead of "two people" it banned me. | (They unbanned me when I emailed them though.) | | Oddly, the ones it blocked were more sfw than several | others it allowed, but of course I don't know what the | outputs would've been... | mattwad wrote: | At least 10% of web dev today is being good at search prompts
(And that's not necessarily a bad thing, it's just | about finding the right tool or pattern for your specific | problem) | tambourine_man wrote: | Oh yeah. Knowing the keywords is what makes you an expert | neonate wrote: | https://archive.ph/RwY42 | sgtFloyd wrote: | My two cents: the techniques OP uses are absolutely valid, but | I've found much more success "sampling" styles and poses from | existing works. | | Rather than trying to perfectly describe my image, I like to use | references where the source material has what you want. With | minimal direction these prompts get impressively close: | | "larry bird as a llama, dramatic basketball dunk in a bright | arena, low angle action shot, from the movie Madagascar (2005)" | https://labs.openai.com/s/wxbIbXa0HRwwGUqQaKSLtzmR | | "Michael Jordan as a llama dunking a basketball, Space Jam | (1996)" https://labs.openai.com/s/mX4T5Iak8CMO1rPAmjRb7oyH | | At this point I'd experiment with more stylized/recognizable | references or add a couple "effects" to polish up the results. | turdnagel wrote: | My current move is creating initial versions of images with | Midjourney, which seems to be a bit more "free-spirited" (read: | less _literal_, more flexible) and then using DALL-E's replace | tool to fill in the weird-looking bits. It works pretty well, but | it's a multi-step process and requires you to pay for both | Midjourney and DALL-E. | karaterobot wrote: | I ran into this too. When I got my invite, I told a friend I | would learn how to talk to DALL-E by having it make some concept | art for the game he was designing. I ran through all of my free | credits, and most of the first $15 bucket, and never really got | anything usable. | | Even when I re-used the _exact prompts_ from the DALL-E Prompt | Book, I didn't get anything near the level of quality and | fidelity to the prompt that their examples did.
| | I know it's not a scam, because it's clearly doing amazing stuff | under the hood, but I went away thinking that it wasn't as | miraculous as it was claimed to be. | jfk13 wrote: | I suspect that many of the "impressive" examples that we see | from tools like this have been carefully selected by human | curators. I'm sure it's not at the level of "monkeys + | typewriters = Shakespeare [if you're sufficiently selective]", | but the general idea is still applicable. | grumbel wrote: | Most of DALL-E2's output is great out of the box; the selection | process is just fine-tuning the results to create something | the human in front of the computer likes. DALL-E2 can't | read minds, so the image produced might not match what the | human had in mind. | | There is, however, one thing to be aware of: the titles posted | on /r/dalle2/ and other places are often not the prompts that | DALL-E2 got. Instead they are a fun description of the image | done by a human after the fact. Random example: | | "Chased by an amongus segway" | | * https://www.reddit.com/r/dalle2/comments/wkv7za/chased_by_an... | | But the actual prompt was: | | "Award winning photo of a mole driving a red off road car | through a field" | | * https://labs.openai.com/s/xnaoxiWeSjiQX1QyVUCHGkl1 | | Which is quite a bit less impressive, as the actual prompt | doesn't really match the image very well. And if you put | "Chased by an amongus segway" into DALL-E2, you won't get an | image of that quality either. | coldcode wrote: | It's fun to play around with it, but like the author found, what | you get is often strange or useless. I also find 1k images too | small to do much with, but I realize making 4k images would be | cost prohibitive. I also wish it could generate vector images as | well as pixel images. That would be fun to use. | jordanmorgan10 wrote: | A lot of these posts are showing up on HN.
I wonder - is it because | it is so new, or is it because the ways in which we can use | this technology are so nascent that we are discovering daily how to use | it more precisely? | dougmwne wrote: | I believe it's for a few reasons. First, it is jaw-droppingly | incredible for most people in tech who have at least a hint of | how most ML works. Second, the AI image generation field is | racing ahead, in academia and newly trained models, so there's | lots of new news. Thirdly, some really great models like Dall-e | have been opened for wider access, and lots of everyday users | are discovering their capabilities and doing blog write-ups, | which are not news, but are surely interesting to most. | pleasantpeasant wrote: | There was a thread on r/DigitalArt about people debating if | you're really an artist if you're using these AI creator | websites. | | Some guy spent hours feeding the AI pictures he liked to get an | end result he was happy with. ___________________________________________________________________ (page generated 2022-08-11 23:00 UTC)