[HN Gopher] Is DALL-E 2 'gluing things together' without underst...
___________________________________________________________________
Is DALL-E 2 'gluing things together' without understanding their relationships?

Author : Hard_Space
Score  : 172 points
Date   : 2022-08-04 13:24 UTC (9 hours ago)

(HTM) web link (www.unite.ai)
(TXT) w3m dump (www.unite.ai)

| rasz wrote:
| DALL-E is fancy content-aware fill on steroids (style translation etc).

| avalys wrote:
| I think DALL-E is clearly just gluing things together that it found in a massive dataset, and doesn't have any understanding of the underlying concepts. I thought it was easy to see the signs of this in examining its output. Same for GPT-3.
|
| However, what's amazing about DALL-E and these other statistical, generative models to me is that it's made me think about how much of my daily thought processes are actually just gluing things together from some kind of fuzzy statistical model in my head.
|
| When I see an acquaintance on the street, I don't carefully consider and "think" about what to say to them. I just blurt something out from some database of stock greetings in my head - which are probably based on and weighted by how people have reacted in the past, what similar greetings my own friends have used, and what "cool" people say on TV and in other media in similar circumstances. "Hey man, how's it going?"
|
| If I were asked to draw an airplane, I wouldn't "think" about what an airplane looks like from first principles - I can just synthesize one in my head and start drawing. There are tons of daily activities like this that don't involve anything I'd call "intelligent thought." I have several relatives who, in the realm of political thought, don't seem to have anything more in their heads than a GPT-3 model trained on Fox News (one that, just like GPT-3, can't detect any logical contradictions between sentences).
|
| DALL-E has convinced me that even current deep learning models are probably very close to replicating the performance of a significant part of my brain. Not the most important part or the most "human" part, perhaps. But I don't see any major conceptual roadblock between that part and what we call conscious, intelligent thought. Just many more layers of connectivity, abstraction, and training data.
|
| Before DALL-E I didn't believe that simply throwing more compute at the AGI problem would one day solve it. Now I do.

| multibit wrote:
| > I have several relatives who, in the realm of political thought, don't seem to have anything more in their heads than a GPT-3 model trained on Fox News (one that, just like GPT-3, can't detect any logical contradictions between sentences).
|
| If more people were to realize that we're all probably like this, trained on some particular dataset (like mainstream vs reactionary news/opinion), I wonder if that would lead to a kind of common peace and understanding, perhaps stemming only from a deep nihilism.

| dougmwne wrote:
| I think our intelligence has two components: our conscious selves, which are good at formal reasoning and meaning-making, and a massive neural network that is good at statistical inference. I suspect the reason we need less training data for learning than DALL-E does is that our conscious selves can tease out the formal relationships and models, rather than relying purely on inference, which requires enough datapoints.
| AbraKdabra wrote:
| I mean, if it is "gluing things together", it's doing it pretty well...

| QuadmasterXLII wrote:
| My intuition is that DALL-E is more a demonstration of how hard image synthesis is for humans than of how intelligent the algorithm is. The image generation models have orders of magnitude fewer parameters than the large language models.

| bawolff wrote:
| AI system is good at what it was trained to do and bad at what it was not trained to do! News at 11!

| nodja wrote:
| I've been following the image generation field for a couple of months now, and while the answer to the title is "yes, for most things", it is easily fixed. Use a better text encoder.
|
| My favorite anecdote for showing how much a text encoder that actually understands the world matters to image generation: when querying for "barack obama" on a model trained on a dataset that has never seen Barack Obama, the model somehow generates images of random black men wearing suits[1]. This is, in my non-expert opinion, a clear indication that the text model's knowledge of the world is leaking through to the image generator. So if my understanding is right, as long as a concept can be represented properly in the text embeddings of a model, the image generation will be able to use it.
|
| If my anecdote doesn't convince you, consider that one of Google's findings in the Imagen paper was that increasing the size of the text encoder had a much bigger effect not only on the quality of the image, but also on how faithfully the image follows the prompt, including enabling the image generator to spell words.
|
| I think the next big step in the text-to-image generation field, aside from the current efforts to optimize the diffusion models, will be to train an efficient text encoder that can generate high-quality embeddings. (A rough sketch of what such an encoder produces appears after this subthread.)
|
| [1] Results of querying "barack obama" on an early version of cene555's Imagen reproduction effort. https://i.imgur.com/oUo3QdF.png

| snek_case wrote:
| > when querying for "barack obama" on a model trained on a dataset that has never seen Barack Obama, the model somehow generates images of random black men wearing suits[1]. This is, in my non-expert opinion, a clear indication that the text model's knowledge of the world is leaking through to the image generator.
|
| That's super interesting. It's not just black men in suits either. It's older black men, with the American flag in the background, who look like they might be speaking. Clearly the model has a pretty in-depth knowledge of the context surrounding Barack Obama.
|
| I would say the image generation model is also doing a pretty great job at stitching those concepts together in a way that's coherent. It's not a random jumble. It's kind of what you would expect if you asked a human artist to draw a black American president.

| LesZedCB wrote:
| it does understand their relationships.
|
| the difference is that the subjects of the domain it learns about exist purely as 2-dimensional images.
|
| once these models get larger and include wider ranges of sensory data beyond just imagery (as can be seen with models like GATO), they are clearly better able to "glue together" concepts across multiple domains.
|
| i would argue we absolutely do nothing different with regard to 'gluing things together'. we just have a wider range of sensory inputs.

| throwaway675309 wrote:
| Yes, SHRDLU it most definitely is _not_.
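A minimal sketch of the text-encoder idea in nodja's comment above, in the style of Imagen's frozen-encoder design. The checkpoint name is an illustrative stand-in for the T5-XXL model the Imagen paper uses; nothing here is OpenAI's or Google's actual code:

    # Encode a prompt into a *sequence* of per-token embeddings with a
    # frozen T5 encoder, as an Imagen-style diffusion model would consume.
    import torch
    from transformers import T5Tokenizer, T5EncoderModel

    tokenizer = T5Tokenizer.from_pretrained("t5-small")  # stand-in for T5-XXL
    encoder = T5EncoderModel.from_pretrained("t5-small")
    encoder.eval()  # frozen: only the image model trains against these

    prompt = "a small french bulldog confronts a deer in the woods"
    tokens = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        embeddings = encoder(**tokens).last_hidden_state  # (1, seq_len, d_model)
    print(embeddings.shape)

Because every token keeps its own vector for the image model to cross-attend to, "bulldog" and "deer" remain distinct concepts rather than being squeezed into the single pooled embedding that, as discussed further down the thread, is one of unCLIP's failure modes.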
| upupandup wrote:
| Blake Lemoine claimed that Google's chatbot was sentient, which I disagreed with, and it's demonstrated through this article. AI can be optimized to respond in a way that can easily fool someone into thinking they are talking to a human, but at the end of the day sentience requires consciousness, and that is not something that can be digitally produced.
|
| You can teach a parrot to respond to basic arithmetic, but it is not aware of the concept of math; rather, it is acting in pathways set to induce the desired response.
|
| A truly conscious entity would simply have a mind of its own and would not do our bidding, just like any other human. It would be extremely selfish and apathetic; the idea that a bunch of GPUs sitting in a datacenter is sentient is sheer lunacy.
|
| This Blake Lemoine character will not be the last; there are always those who seek to be in the limelight with zero regard for authenticity. Such is sentient behavior.

| boredumb wrote:
| GPT/DALL-E/etc... All of these models are of course gluing things together in some manner, but who cares? That's the point, right? The AI pill I've taken is that you don't need AGI in order to make things that are useful for people and business. If you've ever run a business and had to dive into creatives for blogs, SEO content, social media posts, etc., then you spent an inordinate amount of time creating it or outsourced it, and in both cases the final copy is NOT going to get you a literary prize any time soon, but it is absolutely enough to inform potential customers, start ranking on Google, and start gaining social media clout. GPT will also not garner you a literary award, but it can absolutely get you quality copy, and users, customers, Google, and Facebook users will be none the wiser that you generated it with AI instead of paying a third party to hack it together for you.
|
| (https://neuralmates.com/ I recently started putting together a web app to MVP this, and I hope to be able to integrate DALL-E 2 soon to be able to start generating images for folks as well.)

| lancesells wrote:
| This is the kind of response that turns me off on all of these amazing technologies. Blanketing the world with low-level content in the hope of gaining ranking and clout sounds awful.
|
| On the flip side, I'm guessing we'll have some GPT-3/DALL-E blocker extensions that help reduce some of it.

| boredumb wrote:
| Sure, but it's the reality of it.
|
| Also - you already live in this world, but it's fueled by low-cost copywriters and ghost accounts on Fiverr. I'd bet you are going to see the low-water mark for content increase quite a bit in quality and volume over the next few years, due to GPT-3 being leagues better than the current state of content mills.

| kache_ wrote:
| define understand

| kazinator wrote:
| The authors here underestimate how much semantics is involved in gluing things together.

| simion314 wrote:
| There were so many articles about this project, but I can't find a demo page to test 3-4 outputs myself (unless it was some page I already tried when it was linked, and I got nothing similar to what is shown on the main page).

| ___rubidium___ wrote:
| Fascinating. I've noticed the same problem with Midjourney and NightCafe as well.

| mellowside wrote:
| I've seen different results than the author. I've had it since it was in beta and learned that it needs better prompts to generate better images.
| As someone on their Discord said, "prompt writing is a craft you need to learn".

| aaroninsf wrote:
| Strong recommendation for the book _The Alignment Problem_, which, while only somewhat touching on these systems, is an excellent lay overview of the state of ML and makes a good effort to explain the principles involved, the challenges and blockers, and the ways in which progress has been made.
|
| The field is moving so quickly that the book stops short of the current status quo, but it's still remarkable.
|
| I have myself been playing with MidJourney, which like DALL-E 2 is a text-prompt-to-image generation system; it has different goals and favors aestheticized output over photorealism.
|
| The premises of that project and its current execution (as an explicit exercise in collective rather than siloed-and-contractual relations) are genuinely remarkable and, I believe, represent a watershed. The rate of evolution of what it is doing is something to behold.
|
| I have generated around 7,500 images with MidJourney so far and am slowly developing what feels like an intuition for the way it "sees" images, which, as the devs say in regular office hours, is unlike how humans see them.
|
| The limitations, and superpowers, of the system as it exists are already deeply provocative. When things scale again to the next level, the degree of uncanniness and the challenge to our preconceptions about the nature and locus of intelligence in ourselves may be genuinely shaking.
|
| Or so I currently think.
|
| I highly recommend taking time to really feel out these systems, because the ways they do and do not succeed and fail serve as a very potent first-hand education in the opportunities and, perhaps much more important, perils of their application in other, more quotidian areas.
|
| It's one thing for them to reliably produce "nightmare fuel" because of their inability to retain very-high-level coherence down through low-level details when they are drawing limbs, hands, faces...
|
| ...it's another thing entirely when analogous failure modes quietly permeate their ability to recognize illness, or approve a loan, or recommend an interest rate.
|
| Or - as the example which opens _The Alignment Problem_ spells out - recommend whether someone should or should not be offered bail. (A real-world example with life-changing consequences for people who interact with ML in this path, in something over 30 states today... at least, as of publication.)

| siglesias wrote:
| I discovered something like this recently when I tried the prompt "man throwing his smartphone into a river," and for the life of me I could not get DALL-E to render the phone separated from the hand (I tried "like a boomerang," "tossing," "into an ocean," "like a baseball," etc.). And then it occurred to me that, judging by the training data, there are virtually no pictures of a person and a phone where the phone is separated! So DALL-E might have thought that the phone was just an appendage to the body, the way the hand is (which - what does this say about society!). I might as well have asked DALL-E to render someone throwing their elbow into a river.
|
| Another interesting case is animal-on-animal interactions. A prompt like "small french bulldog confronts a deer in the woods" often yields weird things like the bulldog donning antlers! As far as the algorithm is concerned, it sees a bulldog, ticking the box for it, and it sees the antlers, ticking the box for "deer."
| The semantics don't seem to be fully formed.

| gwern wrote:
| I dunno man, I punched that exact prompt ("man throwing his smartphone into a river") into DALL-E 2 just now, and in 2/4 samples the smartphone is clearly separate from the hand: labs.openai.com/s/uIldzs2efWWnm3i9XjsHI7or labs.openai.com/s/jSk4qhAxSiL7QJo7zeGp6m9f
|
| > The semantics don't seem to be fully formed.
|
| Yes, not so much 'formed' as 'formed and then scrambled'. This is due to unCLIP, as clearly documented in the DALL-E 2 paper, and even clearer when you contrast it with the GLIDE paper (which DALL-E 2 is based on) or Imagen or Parti. Injecting the contrastive embedding to override a regular embedding trades off visual creativity/diversity against the semantics, so if you insist on exact semantics, DALL-E 2 samples are only a lower bound on what the model can do. It does a reasonable job, better than many systems up until like last year, but not as good as it could if you weren't forced to use unCLIP. You're only seeing what it can do after being scrambled through unCLIP. (This is why Imagen or Parti can accurately pull off what feel like absurdly complex descriptions - seriously, look at the examples in their papers! - but people also tend to describe them as 'bland'.)

| visarga wrote:
| If you want multiple objects, each with individual attributes, the unCLIP model still has to pack them into a single embedding vector. But a single vector is too small to contain an ever more detailed scene description. That's why it has failure modes like assigning the wrong colour to cubes and not being able to spell text.
|
| On the other hand, the previous approach - autoregressive generation - allows full access to the prompt through the attention mechanism.
|
| For example, Imagen encodes text to a sequence of embeddings.
|
| > Imagen comprises a frozen T5-XXL [52] encoder to map input text into a sequence of embeddings and a 64x64 image diffusion model, followed by two super-resolution diffusion models

| anothernewdude wrote:
| Certainly seems that way, but I didn't get to use DALL-E 2 because it's a Closed-AI.

| 2OEH8eoCRo0 wrote:
| What are we trying to accomplish with AI? What is "real" intelligence? What are we hoping that an AI does? What test does it need to pass?

| roland00 wrote:
| Isn't this just chaos theory, calculus, and the problem of jerks (the third derivative of position with respect to time)?
|
| When you slice complex reality into smaller pieces, within the smaller piece you have a rough idea of velocity and of changes in velocity (aka acceleration), but you have no idea of future speed bumps, aka the jerks, for that information is outside the frame of reference once you have divided reality into smaller pieces.
|
| Thus you have pictures of people/objects in systems, but you are not truly understanding relationships; thus you miss things even though you feel like you see things. It is all a theme park for our own amusement; it is not real, only hyper-real, which becomes uncanny when we start noticing how the images are off.

| BiteCode_dev wrote:
| Amusingly, I get creative new ideas when I glue things together and ignore what I thought I understood about their relationships.

| TremendousJudge wrote:
| Happens to me too -- it's a great way to make new things.
| However, the "creation" I'd argue happens when you look at the | pile of random stuff and generate a new understanding, and | decide that it is valuable. The difference between trash and | art only exists in the head of the artist. Same thing happens | with DALL-E output, really. | gizajob wrote: | I recently used it to create a picture of "an analogue | synthesizer made out of meat and gristle". I don't really know | how you could glue these concepts together, because nothing in | reality causes them to ever go together, but it did a great job | anyway. | JoshTko wrote: | Do humans assume things without understanding their | relationships? | can16358p wrote: | Regardless of the content, the article has been "published 10 | seconds ago" for the last few minutes, and has been submitted to | HN 1 hour ago. | otikik wrote: | How does one define "understand their relationships"? | | To me it is a matter of degrees and has multiple axes. | | When my 6yo son draws a chair, it's not the same as when Van Gogh | draws one, which is different to when an expert furniture | designer draws one. They all "understand" in different ways. A | machine can also "understand". It might do it in different | degrees and across different axes that the ones humans usually | have, that's all. How we transform that understanding into action | is what is important I think. | dezmou wrote: | Another cool example is when you ask DALLE-E to make a "Jaguar | XF", it does not really know if I talk about the car or the | animal, yet it manage to blend the two together https://img- | comment-fun.9cache.com/media/a6wG6Kq/aq04bGmR_70... | dukeofdoom wrote: | Using midjourney, I'm fascinated with the geometric shapes it | comes with. Just yesterday I was looking at a sort of a hexagon | made with intersecting hearts. Not only was it aesthetically | pleasing, it was such an unusual composition, that I think it | would be extremely hard for an artist to come up with, maybe even | a mathematician. So I think it's already capable of doing more | sophisticated things than human artists. | Cognitron wrote: | MidJourney is really good at creating abstract images. I've | been playing with the multi-prompt syntax (::) a lot lately, | just combining different words that evoke form or texture to | see what it will come up with. I start by testing things one at | a time and just take a guess at what might look cool together. | I end up making strange images with prompts like "fractal :: | stained glass :: feathers", "antelope canyon :: tattered fabric | :: cave", "rubik's cube :: Japanese woodblock", you get the | idea. I rarely try to make pictures of real things anymore. | Definitely fun to play around with. | dukeofdoom wrote: | I'm pretty new to it. Any tips on making retro pixel art | assets. I want a stone or brick wall I could tile into a game | I'm making. | Cognitron wrote: | I think most of us are pretty new to it since they just | opened the beta to a ton of people. I haven't tried to do | anything like that with it though. All I can say is it | tends to take a lot of experimentation to get what you | want. And you'll probably have to do some work to the | images to get them to tile seamlessly. If you can't get the | pixelated look quite how you want it, you might try using | it to generate a base image and then use a separate tool to | pixelate it. Some people are using it as part of a larger | pipeline. | | The discord is full of people sharing their experiments and | approaches though. 
| Maybe try asking in the prompt-craft channel to see if someone else has attempted something similar.
|
| Good luck!

| thriftwy wrote:
| Maybe you could give DALL-E creativity lessons by finding such nonsense queries and adding creative interpretations of them to the learning set.

| Marazan wrote:
| https://imgur.com/ggnm920
|
| "a photo of 6 kittens sitting on a wooden floor. Each kitten is clearly visible. No weird stuff."
|
| Like, let's start with the fact that there are 7 of them (2 of the 4 images from the prompt had 7 kittens). Now let's continue with how awful they look.
|
| The difference in image quality between DALL-E 2 asked for a single subject and DALL-E 2 asked for a group of stuff is startling.
|
| And it's obvious, if you know how the tech works, why this is the case.

| xwdv wrote:
| For art generation, gluing things together without understanding relationships is basically creativity. Humans can assign meaning to the final output.

| visarga wrote:
| There is some degree of understanding of how to combine concepts; otherwise the avocado chair would not be possible.

| knighthack wrote:
| I don't see how the 'understanding of relationships' should be taken as the key intent of DALL-E 2.
|
| Consider procedural generation: it can create abstractions of both utter beauty and utter garbage without understanding context. You need to guide it towards something meaningful.
|
| Just the fact that DALL-E can 'glue things together' without need for human inspiration - yet where its output and intent can be understood by a human appraising it - is not only a feat in itself; I would say it is its key _feature_.

| adamsmith143 wrote:
| Why is there this obsession with systems or algorithms having "understanding"? No one thinks these things have internal states equivalent to "understanding". "Understanding" or not, you can't deny the capability of these systems.

| arey_abhishek wrote:
| AI will never understand the actual context, because not everything we feel/experience can be captured and communicated to a machine. For example, human language is incomplete and doesn't encode all information, because it doesn't need to when used with other humans.
|
| I think it's a romantic notion to imagine that AI will not be a Chinese room.
|
| Even human intelligence feels like a Chinese room. This is especially noticeable when using complicated devices like flight controls. I've been playing Microsoft Flight Simulator, and I don't fully understand the relationship between the different instruments. But I can still fly planes (virtually).
|
| We'd be better off if we considered AI similar to an appliance like a microwave or a refrigerator. Does a fridge need to understand or taste what's inside it to be helpful?

| RootKitBeerCat wrote:
| Having spent about $200 on credits... yes, it glues your words together through transformers into images... of snippets of the identified corpus of images it has... with some specific prompt phrasings having cool weightings for making really nice-looking glued-together results... it's a model: classical computing - everything is turtles and duct tape and rivets and glue, all the way down...

| pera wrote:
| Interesting: DALL-E Mini / Craiyon actually generates a pretty accurate result for "a monkey touching an iguana" (at least semantically):
|
| https://i.imgur.com/Oq62gQI.png

| affgrff2 wrote:
| Maybe it's in the training data?
| At least some special kind of monkey (a human) touching some lizard is.

| johndough wrote:
| https://rom1504.github.io/clip-retrieval/?index=laion5B&useM...
|
| There seem to exist several pictures of marmoset monkeys touching iguanas, but DALL-E Mini shows macaque monkeys. This makes me believe that DALL-E Mini has at least some generalization capabilities. (A sketch of what such an index query does under the hood appears at the end of this subthread.)

| walrus01 wrote:
| imgur is showing a prompt before viewing this image: "this image may contain erotic imagery, are you over 18, etc."
|
| i certainly hope not...

| aeturnum wrote:
| The Fair Witness was a job that Heinlein made up for Stranger in a Strange Land. Fair Witnesses were supposed to reliably report what they saw without including their subjective judgement in their report. The example exchange is: "Is that house over there brown?" "It is brown on this side."
|
| DALL-E (and other ML systems) feel like Fair Witnesses for our cultural milieu. They basically find a series of weighted connections between every phrase we've thought to write down or say about all images, and they can blend between those weights on the fly. By any assessment it's an amazing feat - as is the feat of viewing their own work and modifying it (though ofc it's from their own coordinate system, so one does expect that to work).
|
| In one sense, asking whether the machine "understands" is beside the point. It does not need to 'understand' to be impressive (or even to be what people claim it is when they're not talking to Vice media or something).
|
| In another sense, even among humans, "understanding" is both a contested term _and_ a height that we all agree we don't all reach all of the time. One can use ideas very successfully for many things without "understanding" them.
|
| Sometimes people will turn this around and claim that, because humans don't always understand ideas when they use them, we should say that ML algorithms are doing a kind of understanding. I don't buy it - the map is not the territory. How ML algorithms interact with semantics is wholly unlike how humans interact with them (even though the learning patterns show a lot of structural similarity). Maybe we are glimpsing a whole new kind of intelligence that humans cannot approach - an element of Turing Machine Sentience - but it seems clear to me that "understanding" in the Human Sentience way (whatever that means) is not part of it.

| blocked_again wrote:
| This can be easily answered by giving the following input to DALL-E:
|
| """
|
| A map highlighting the countries the ancient Romans invaded since Pepsi was introduced.
|
| """

| Rackedup wrote:
| Havoc wrote:
| Isn't that true for all the models?
| [deleted]

| rkwasny wrote:
| People have been "gluing things together" for a very long time; that's what everyone does!
|
| All ideas are remixes of previous ideas.

| andybak wrote:
| I'm a big DALL-E fan, but this is no surprise to anyone who's used it for more than 5 minutes.
|
| It was one of the things Google highlighted as a differentiator when they announced Imagen: https://imagen.research.google
|
| The article touches on this, but the headline is slightly deceptive.

| causi wrote:
| Imagen would be amazing for an "Images Against Humanity" game where you score points by generating the funniest picture.
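A rough sketch of what sits behind an index query like the laion5B demo johndough links above: embed the text with CLIP, then rank precomputed image embeddings by cosine similarity. The random `image_index` tensor is a placeholder for the dataset's real precomputed embeddings, and the actual service uses approximate nearest-neighbour search rather than the brute force shown here:

    # Text-to-image retrieval in a shared CLIP embedding space.
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    inputs = processor(text=["a monkey touching an iguana"],
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        query = model.get_text_features(**inputs)       # (1, 512)
    query = query / query.norm(dim=-1, keepdim=True)

    # Placeholder for the index's precomputed, normalised image embeddings.
    image_index = torch.randn(10_000, 512)
    image_index = image_index / image_index.norm(dim=-1, keepdim=True)

    scores = image_index @ query.T                      # cosine similarities
    top5 = scores.squeeze(1).topk(5).indices            # nearest neighbours
    print(top5)

This shared text-image space is also why a model that has never seen a caption's subject can still land in the right visual neighbourhood, as in the "barack obama" anecdote earlier in the thread.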
| Zircom wrote:
| Absolutely stealing this idea; look forward to being credited for it in the future Show HN post :P

| andybak wrote:
| (Incidentally, I want to make it clear I don't mean "DALL-E has no understanding of the elements and it's just a fancy image search/compositor" - there are results that absolutely require some semantic "understanding". It's just that there are limits, and spatial understanding is one of the weak spots.)

| kemiller wrote:
| Anyone who has ever raised a small child will realize this is how kids learn too.

| rocgf wrote:
| Does anyone actually believe DALL-E "understands" what it's doing? For any reasonable definition of "understands", I assume most people would be skeptical.
|
| So if we go with that, then yes, it just glues things together without understanding their relationships. I'd just be tempted to say it doesn't really matter that it doesn't understand, except maybe for some philosophical questions. It's still incredible based on its output.

| visarga wrote:
| > Does anyone actually believe DALL-E "understands" what it's doing?
|
| I believe it understands enough to make tens of thousands of people interested and debating its merits. The GANs of 5 years ago were baby toys compared to DALL-E. They were drawing 5 legs on a dog and were limited to a few object classes. Now people debate whether it "really understands" and whether it is "(slightly) conscious".

| LesZedCB wrote:
| can we come up with a 'reasonable' definition of "understands" that isn't couched in human exceptionalism and could still apply to any computer model?
|
| i think dall-e understands, within the sensory domain it's trained from.

| rocgf wrote:
| I think DALL-E understands the same way an Excel sheet would understand SUM(B3:B6), obviously at a way larger scale and with a more complicated function. It's all just rather simple maths all the way down.

| visarga wrote:
| Not similar. The SUM() function was written by humans in a top-down fashion; DALL-E learned from visual and textual data by a completely different process. DALL-E distilled something about the world and is using that to generate.

| [deleted]

| chpatrick wrote:
| I think "understanding" is a super vague word.

| i_am_toaster wrote:
| Bingo. This is also a fair comparison to how today's politics works, in a nutshell. Overload a word with a vague context and people immediately apply their own perspective, coming to a conclusion that conflicts with an opposing argument.
|
| Relating back to this headline, using "understanding" creates lots of messages with differing views, because everyone has their own take on the word. If instead you said something like "DALL-E fakes understanding of concepts to create new images", I bet you'd get even closer to the "political message board" style of comments, because you are now taking an objective position (yes/no, true/false, good/bad) on a subjective word (understanding).

| IceMetalPunk wrote:
| Now, let's be critical of possible _reasons_ for this. It's important to remember two things: 1) Any NN has zero experience with the world beyond its training data. Things that seem obvious to us from our experience are not obvious to a system that has never experienced those things. And 2) DALL-E 2 was trained on image-caption pairs scraped from the internet, basically.
|
| So it's quite possible the reason it doesn't understand things like "X under Y" very well is that its training set doesn't have a lot of captions describing positional information like that, as opposed to any failure of the architecture to even _potentially_ understand these things.

| vannevar wrote:
| I'm sure it is, but "gluing things together" coherently in response to a text prompt is a stupendous achievement. It's not AGI, but it's miles ahead of where we were even a few years ago, and it opens the door to automating a class of jobs I don't think anyone back then believed could be automated short of AGI.

| seanmcdirmid wrote:
| Just think what this could do for a game experience like Scribblenauts. Just being able to glue a fixed number of concepts together in a huge number of ways... game designers are going to have to learn how to leverage ML.

| monkeynotes wrote:
| Their research showed that DALL-E had the most success with real-world stuff it had been trained on. Is this surprising? I mean, if I didn't know much about iguanas, I'd also have a hard time representing them.

| _nhynes wrote:
| I ended up reading the book _Blindsight_ (Peter Watts) that's been floating around in comments recently. A major theme in the book is intelligence and its relation to consciousness (including whether consciousness is even beneficial). If you agree with the idea, you'd consider that DALL-E is indeed intelligent even though it appears to be a "Chinese room". Humans would be "gluing things together" in just the same way, but with this odd introspective ability that makes it seem different.

| Pulcinella wrote:
| I think the book does make the point that maybe one benefit of consciousness is the ability to filter through all the information and spam that conscious beings produce. E.g. the scramblers may view all the radio waves we blast everywhere as attempts at warfare and at reducing the fitness of other species. Why else would a species emit so much information if not to DDOS their enemies?! tl;dr consciousness is a defense against ****posting and trolling caused by conscious beings.

| the8472 wrote:
| re the Chinese room, you might want to consider the computational and memory complexity of a lookup table: https://www.scottaaronson.com/papers/philos.pdf page 14

| twak wrote:
| Fantastic book. Made me consider the question of whether consciousness exists at all, or whether it is just some hack by evolution to allow introspection.
|
| I haven't found a definition of consciousness which is quantifiable or stands up to serious rigour. If it can't be measured and isn't necessary for intelligence, perhaps there is no magic cut-off between the likes of DALL-E and human intelligence. Perhaps the Chinese room is as conscious as a human (and a brick)?

| otikik wrote:
| It is a hack alright. A very beautiful hack, refined over hundreds of thousands of iterations, at a very high price.
|
| > to allow introspection
|
| Evolution doesn't do things "to" achieve anything. It repeats what works and kills the rest. Our brains have allowed us to adapt to changes in the environment better than the rest. Consciousness came with the package. It might not have an actual "purpose" - it could be an "appendix".
|
| My personal belief is that consciousness started as the self-preservation instinct that most animals have, and we developed introspection as a way to strengthen our ties to other members of our family or tribe.
And then we "won" | (for now) | somenameforme wrote: | I think in these sort of topics we are often discussing two | different meanings of consciousness: physical vs | philosophical. The arbitrary mixing and matching of them is | likely one the reasons for the lack of clarity. Physical | consciousness is most easily identified by the notion that | you're unconscious when sleeping. The other is the far more | interesting philosophical consciousness. | | Philosophical consciousness is what the oft misunderstood | quote _cogito ergo sum_ , I think therefore I am, was | hitting on. Descartes was not saying that consciousness is | defined by thinking. He was trying to identify what he | could know was really real in this world. When one goes to | sleep, the dreams we have can often be indistinguishable | from a reality in themselves, until we awake and find it | was all just a dream. So what makes one think this reality | isn't simply one quite long and vivid dream from which we | may one day awake? | | But this wasn't an appeal to nihilism, the exact opposite. | The one thing he could be certain of is that he, or some | entity within him, was observing everything. And so, at the | minimum, this entity must exist. And the presence of this | entity is what I think many of us are discussing when we | speak of consciousness. In contrast to physical | consciousness, you are philosophically conscious even when | sleeping. | | Of course like you said philosophical consciousness cannot | be proven or measured and likely never will be able to be, | which makes it an entirely philosophical topic. It is | impossible for me to prove I am conscious to you, or vice | versa, no matter what either of us does. Quite the private | affair, though infinitely interesting to ponder. | akomtu wrote: | There is a lot of literature on this topic, but nearly | all of it is in the "restricted" section of the library: | it's the section with book shelves for alchemy, | occultism, buddhism and so on. Western literature is very | shallow in this regard. | | There a few interesting thoughts about consciousness that | I've found in those books. One is that the boundary | between consciousness and "real matter" is imaginary: | consciousness exists only because of change in that | matter, when the change stops - so does consciousness, | consciousness creates reality for itself, and the two are | in fact just two sides of the coin. In other words, | static consciousness isnt a thing, and hence the need for | "reality". | | Human consciousness is a sum of many consciousnesses that | exist at wildly different levels of reality. There are | primitive cellular consciousnesses, and those sometimes | influence our mental consciousness. Our neural | cerebrospinal system has an advanced consciousness | capable of independent existence: it manages all the | activity of internal organs, and only loosly interacts | with our higher mental consciousness. That cerebrospinal | system is even self-conscious in a primitive way: it can | observe its own internal changes and distinguish them | from impulses from the outside. There's emotional and | mental consciousness that mainly lives in the brain and | is somewhat aware of the dark sea of lower consciousness | below it. | | Most people are conscious in dreams, as they can perceive | in that state. However they cant make (yet) distinction | between inner processes (self) and external effects | (others), so to them it appears as if everything is | happening inside their mind, i.e. 
| they are not self-conscious. That's the consciousness of a toddler. Some are more advanced: they start seeing the me-others difference and can form memories from dreams.

| patcon wrote:
| Also read the book due to comments here. It was very aligned with some of my interests in the relationships between language, information, life/intelligence and metaphor <3
|
| You might get a kick out of this paper (though some may find its proposal a bit bleak; I think there's a way to integrate it without losing any of the sense of wonder of the experience of being alive :) )
|
| It analogizes conscious experience to a rainbow, "which accompanies physical processes in the atmosphere but exerts no influence over them".
|
| Chasing the Rainbow: The Non-conscious Nature of Being (2017) https://www.frontiersin.org/articles/10.3389/fpsyg.2017.0192...
|
| > Though it is an end-product created by non-conscious executive systems, the personal narrative serves the powerful evolutionary function of enabling individuals to communicate (externally broadcast) the contents of internal broadcasting. This in turn allows recipients to generate potentially adaptive strategies, such as predicting the behavior of others, and underlies the development of social and cultural structures that promote species survival. Consequently, it is the capacity to communicate to others the contents of the personal narrative that confers an evolutionary advantage - not the experience of consciousness (personal awareness) itself.
|
| So consciousness is more about what it subjectively feels like to be under pressure/influence to broadcast valuable internal signals to other (external) agents in our processes of life, aka other humans in the super-organism of humanity. I analogize it to what a cell "experiences" that compels it to release hormonal signals in a multicellular organism.

| f38zf5vdt wrote:
| > Made me consider the question of whether consciousness exists at all, or whether it is just some hack by evolution to allow introspection.
|
| Can't it be both? What's the difference? Evolution just responds to the environment, so a method of complex interaction with the environment like "consciousness" or "ever-polling situational awareness" seems like par for the course.

| croes wrote:
| Evolution is the result of the environment, not a response to it.
|
| Giraffes didn't get long necks because the food was out of reach; giraffes have long necks because the ones without them died.

| f38zf5vdt wrote:
| It's a response caused by survival. The environment is the stimulus, and the genetics that are not eliminated are cultivated.
|
| If I introduce an antibiotic into a culture of bacteria and they evolve resistance, then they appear to be responding to it on a collective level.

| croes wrote:
| "Since bacteria are extremely numerous, random mutation of bacterial DNA generates a wide variety of genetic changes. Through mutation and selection, bacteria can develop defense mechanisms against antibiotics."
|
| https://health.mo.gov/safety/antibioticresistance/generalinf...
|
| Mutation happens all the time because cell replication isn't perfect; some mutations are irrelevant, some deadly, some bring a better chance of survival.
|
| It's not a response, just the result. Or how does the bacterium know it's an antibiotic and not just water? It doesn't; water just isn't an evolutionary filter, antibiotics are.
| doph wrote:
| Exactly (and I cannot recommend Blindsight highly enough). Of course DALL-E is a Chinese room. The most exciting and subversive idea in Blindsight is that consciousness is maladaptive, and the scramblers are _more_ fit for the lack of it. Long after neural nets have surpassed our functional intelligence, we'll still be criticizing their ability to navel-gaze as well as humans.

| FeepingCreature wrote:
| Of course, DALL-E can only get away with not being introspective because it has reinforcement learning holding its hand. Bit harder to curate a clean training set without consciousness.
|
| Personally, I suspect consciousness is intrinsically connected with whatever lets humans learn things by looking at and thinking about a single example.

| soulofmischief wrote:
| I get what you're saying, but it's important to also consider how evolution provides for us by giving us neural structures already tuned for things like recognizing faces and recognizing danger. I do wonder how much realtime danger analysis has to do with consciousness.

| FeepingCreature wrote:
| I'd doubt it has much of a connection. Consciousness can do many things, but generally not quickly.

| TrevorJ wrote:
| Reinforcement learning _is_ the thing that integrates the network across time, though. In one sense, I think it's doing at least part of the job that consciousness does: mediating between memory (past) and the present.

| [deleted]

| fatherzine wrote:
| Consciousness is very much adaptive, though it requires a sense of the metaphysical/transcendent to behave in that way. Theologically speaking, God brings Life and the Devil brings Death. Sadly, our modern brave new world has taken "Gott ist tot" as a triumphalist proclamation, and not the lamentation it was meant to be.

| wpietri wrote:
| It's far too early to tell if consciousness is adaptive. Has it enabled humans to be wildly successful? In the short term, sure! But the long term is unclear.
|
| E.g., look at how the technological progress it makes is at risk of destroying human civilization. A series of ecological boom/bust cycles could lead toward negative selection for consciousness. Or we might manage to wipe out ourselves or, say, most mammals, leading life on earth to start over as it did 65m years ago.
|
| But even without that, it's not clear to me that consciousness will really win out. Look at the number of successful people who are not only painfully unreflective, but need to be to keep doing what they're doing. I could name a lot of people, but today's good example is Alex Jones, whose whole (very profitable) schtick is based on refusing to be fully conscious of what he's saying: https://popehat.substack.com/p/alex-jones-at-the-tower-of-ba...
|
| And this is hardly a new idea. Vonnegut wrote a novel where humans end up evolving into something like a sea lion. The point being "all the sorrows of humankind were caused by 'the only true villain in my story: the oversized human brain'", an error evolution ends up remedying.

| fatherzine wrote:
| Hardly a new idea indeed: https://en.wikipedia.org/wiki/Tree_of_the_knowledge_of_good_...
|
| Edit: To be clear, I posit that consciousness is the organ that enables us to distinguish between Good and Evil.

| wpietri wrote:
| I doubt that consciousness is an organ, but if it were, it's distinct from a moral sense, because we share that with animals.
| Here's a good book on the topic: https://www.hup.harvard.edu/catalog.php?isbn=9780674356610
|
| That same professor has done a bunch more work on the topic, as have many others.

| fatherzine wrote:
| Frans de Waal has done great work.
|
| Evolution is a strange phenomenon. I invite us to marvel at the transformation of terrestrial quadrupeds into majestic aquatic creatures over eons: https://en.wikipedia.org/wiki/Evolution_of_cetaceans
|
| Evolutionarily speaking, cetaceans "share" the front limbs with quadrupeds. And yet there is a qualitatively distinct functional difference. Consider that moral consciousness, as present in humans, is functionally not quite the same as its biological precursor, the moral sense present in dogs or gorillas. And, of course, there are gradual changes along the evolutionary way.
|
| Edit: "Organ", more precisely "sensory organ", as in "the visual organ". Perhaps there is a better word than "organ" here, before we get lost in the medical distinction between eye / retina / optic nerve / cortex / etc.

| kebman wrote:
| But Sontag suggests the use of Batailleist `powerful communication' to modify and read consciousness. The subject is contextualised into a predialectic nationalism that includes language as a totality.

| fatherzine wrote:
| Hi there GPT-3 bot.
|
| Edit: Don't really want to be unnecessarily flippant here. Perhaps this is a real person attempting to communicate. Sadly, I was unable to follow through; the text jumps wildly between seemingly unrelated subjects.

| kebman wrote:
| Nope, that was actually the Postmodern Generator.^[1] But nice try! ;) [1]: https://www.elsewhere.org/pomo/

| fatherzine wrote:
| Haha, <fist bump>.

| benlivengood wrote:
| It's not clear what generates consciousness. Until we know for sure (e.g. by A/B testing with humans who can report when they do and do not experience consciousness in different neural configurations), I think it's impossible to know what level of conscious experience large ML models have.
|
| Blindsight is an excellent book in its exploration of consciousness, but the speculative part is that a working sense of self isn't necessary for embodied intelligence (like the scramblers), which I tend to doubt. An agent without a model of itself will have difficulty planning actions; knowing how its outputs/manipulators are integrated into the rest of reality will be a minimum requirement to control them _effectively_. It is certainly possible that "self" or "I" will be absent; humans can already turn the ego off with drugs and still (mostly) function, but they remain conscious.

| mrcartmeneses wrote:
| People say this a lot online in this kind of debate, but you can see which areas of the brain light up in people who are conscious and how that's different from people who are asleep or in a persistent vegetative state. We have a very good idea of what consciousness is, and it's definitely not a linear neural net with a discrete input and output. Having said that, the nerve that connects the eye to the brain could be considered similar to a modern neural net.
|
| It looks like we're beginning to get the building blocks of consciousness together. But we don't yet know how to combine the wave functions into the chorus necessary to achieve GI.

| stormbrew wrote:
| I think you're using a different meaning of the word conscious than the GP? You seem to be talking about being awake; they're talking about a much more abstract idea.
|
| A person who is sleeping or in a vegetative state is not currently getting new inputs fed into some parts of their brain, so it's not surprising that their brain "lights up differently", nor does it imply anything about a piece of software that _is_ getting new inputs that might be being integrated into its model (of course, a model that is trained and then repeatedly used without further integration is not in any way comparable to a brain).
|
| This more abstract idea of consciousness is definitely not a solved problem - people can't even manage to agree on whether non-human animals have it. And a lot of internet arguments for why this or that neural network can't be conscious probably also rule 5-year-olds out of it too.

| [deleted]

| benlivengood wrote:
| We have fairly good ways of detecting human consciousness now, but not a lot of understanding of _why, physically,_ or _precisely which_ neural circuits produce conscious experience, which is the understanding we'd need to infer what sort of conscious experience is possible for ML models. We also can't reliably detect or predict whether a given animal is conscious or not (related to the problem of the minimal viable neurons required for consciousness), and this is the biggest evidence I have that our study of consciousness is far from complete.
|
| The best we can do with humans (and maybe animals) is behaviorism and inference on our own personal consciousness at this point, with brain imaging to demonstrate at least gross prediction of consciousness in humans.

| visarga wrote:
| > consciousness is definitely not a linear neural net with a discrete input and output
|
| Neural circuits alone are not going to be conscious by themselves; for one, they need to learn concepts from the environment, and those concepts shape the neural circuits. Thus the way they act shapes how they develop. You can't separate consciousness from the environment where it develops.
|
| In other words, it was not the neural net that was lacking, but the environment.

| canjobear wrote:
| > humans who can report when they do and do not experience consciousness
|
| How can someone possibly report when they are not experiencing consciousness?

| benlivengood wrote:
| > How can someone possibly report when they are not experiencing consciousness?
|
| By an absence of reporting it. If I sit at a desk getting my neurons moderated by testing equipment and say "I am conscious" every subjective second that I am experiencing consciousness, then I could at least help narrow down when consciousness is lost. If I am simply unable to speak or respond at all, but still conscious, I would report that fact later. Only in the case of locked-in conscious awareness without later memory of the experience would this kind of experimental setup fail, and this is where brain imaging could probably help determine that everything except motor or memory neurons was active.

| toomanydoubts wrote:
| > (e.g. by A/B testing with humans who can report when they do and do not experience consciousness in different neural configurations)
|
| Doesn't that fall back to the old consciousness trap that nobody knows how to resolve? How do you know if the human reporting that he's conscious is actually conscious and not a philosophical zombie?
|
| We don't know what generates consciousness because we don't know how to measure it, and if we can't measure it, we will always have to take the word of a seemingly conscious entity for it.

| benlivengood wrote:
| Self-reporting is probably ideal, probably with a camera recording everything so I can judge my answers and behavior and compare them with my memory. But we don't really have a great handle on how much of our memory of conscious experience matches our real-time experience, or even whether the memory-formation process might "be" consciousness, so I admit there are blind spots in studying consciousness rigorously.
|
| I don't buy the philosophical zombie argument, simply because consciousness does alter behavior. I wouldn't participate in this conversation the same way if I didn't experience consciousness. It would be more like vivid imagination (as opposed to moderate aphantasia), where I find it curious but don't have it. As in the novel, unconscious beings probably behave noticeably differently.
|
| There are, apparently, some people who have a very reduced sense of consciousness. I know I have done and said things when I'm not (memorably) conscious, for example when half asleep or coming out of anesthesia, and my behavior has been altered according to witnesses. I wasn't quite "myself". I can also hyper-focus and reduce conscious awareness of my surroundings and of my own body and mind, but that still feels like I have an internal awareness and memory of the experience. I am fairly certain I would be able to tell if that were switched off for a time.

| Scene_Cast2 wrote:
| There was this Google AI that made the news recently; it was trained to say that it's a conscious AI. However, most agree that it isn't conscious.

| laserlight wrote:
| > I wouldn't participate in this conversation the same way if I didn't experience consciousness.
|
| I don't think you're conscious. Prove me wrong.

| [deleted]

| joe_the_user wrote:
| The "Chinese room" argument, so far as I understand it, applies to any discrete computational process simulating consciousness.
|
| The argument of the article is that DALL-E doesn't respond appropriately to a particular kind of input - two entities in some kind of spatial relationship (that it hasn't often seen). DALL-E isn't extrapolating the three-D world but stitching a bunch of 2-D images together with some heuristics. That works to create a lot of plausible images, sure, but it implies this ability might not, say, be useful for the manipulation of 3-D space.
|
| So, given that a "Chinese room" is just a computation, it's plausible that some Chinese room could handle 3-D image manipulation more effectively than this particular program.
|
| Which is to say: no, the criticism isn't that this is a Chinese room; that is irrelevant.

| TrevorJ wrote:
| I think we need to start thinking of these networks not as something we hope will someday simulate a brain, but as small building blocks that can be networked together in a variety of ways to tackle tasks. DALL-E may not understand 3D image manipulation, but if you chained the output from DALL-E into something like NeRF, you could start to transform the image in 3D space. One can picture a collection of a few hundred of these networks, all with different things that they are good at, working in concert to approach something that is very good at doing things generally.
| We're missing the layer on top that decides which algorithms to run and in what order.

| jrlocke wrote:
| The Chinese room argument itself isn't very compelling. Surely the constituent parts of the brain are fundamentally governed solely by physics, surely thought arises solely from the physical brain, and surely the constituent parts (and thus thought) could be described by a sufficiently complex discrete computation.
|
| Are we not conscious?

| indigo945 wrote:
| I think a considerable subset of the people who do make use of the Chinese room argument also subscribe to some form of mind-body dualism, where consciousness does not, or does not completely, arise from physical processes.

| jrlocke wrote:
| To be more explicit, I'm saying I find it weird to hear so much about the Chinese room argument from a crowd of (presumably) materialists.

| notahacker wrote:
| I see it the other way round.
|
| The Chinese room and the brain of a Chinese-speaking person are completely different physical processes. Looked at on an atomic level, they have almost nothing in common. Mind-body dualists may or may not agree that the room is not "conscious" in the way a human is, but if consciousness is purely a material process, I can't see how the materialist can possibly conclude that all the relevant properties of the completely dissimilar room and person are the same.
|
| Those who would argue the Chinese room is "conscious" in the same way as the Chinese person are essentially arguing that the dissimilarity of the physical processes is irrelevant: the "consciousness" of the Chinese person doesn't arise from molecules bouncing around their brain in very specific ways, but exists at some higher level of abstraction shared with the constituent molecules of pieces of paper with instructions written in English and outputs written in Chinese.
|
| The idea that our consciousness exists in some abstract sense which transcends the physics of the brain is not a new one, of course. Historically we called such abstractions souls...

| mannykannot wrote:
| The argument you make here is a reasonable one (IMHO) for the plausibility in principle of what Searle calls "strong AI", but he claims that his "Chinese Room" argument proves that it must be mistaken. One can simply ignore him, but to refute him takes a little more effort.
|
| It turns out that when one looks at the argument in detail, and in particular at Searle's responses to various objections (such as the Systems and Virtual Mind replies), it is clear that he is essentially begging the question, and his ultimate argument, "a model is not the thing modeled", is a non sequitur.

| TheOtherHobbes wrote:
| The argument is essentially that there are no qualia of Chinese comprehension in an automaton, or in any system that uses an equivalent algorithm, whether or not it is run by a human.
|
| It's a sound argument to the extent that qualia clearly exist, but no one has any idea what they are, and even less of an idea how to (dis)prove that they exist in external entities.
|
| It's the materialists who are begging the question, because their approach to qualia is "Well, obviously qualia are something that just happens, and so what?"
|
| Unfortunately, arguments based on "Well, obviously..." have a habit of being embarrassingly unscientific.
|
| And besides - _written language skills are a poor
| indicator of human sentience._ Human sentience relies at
| least as much on empathy; emotional reading of body
| language, expression, and linguistic subtexts; shared
| introspection; awareness of social relationships and
| behavioural codes; contextual cues from the physical and
| social environment which define and illuminate
| relationships; and all kinds of other skills which humans
| perform effortlessly and machines... don't.
|
| Turing Tests and game AI are fundamentally a nerd's view
| of human intelligence and interaction. They're so
| impoverished they're not remotely plausible.
|
| So as long as DALL-E has no obvious qualia, it cannot be
| described as sentient. It has no introspection and no
| emotional responses, no subjective internal state (as
| opposed to mechanical objective state), and no way to
| communicate that state even if it existed.
|
| And it also has no clue about 3D geometry. It doesn't
| know what a sphere is, only what sphere-like shading
| looks like. Generally it knows the texture of everything
| and the geometry of nothing.
|
| Essentially it's a style transfer engine connected to an
| image search system which performs keyword searches and
| smushes them together - a nice enough thing, but still
| light years from AGI, never mind sentience.
| planetsprite wrote:
| human brains are a chinese room. Our DNA and experiences
| wrote the book.
| TrevorJ wrote:
| I'm becoming convinced that these algorithms _are_ huge
| steps towards AGI, simply because AGI might end up being a
| collection of many of these domain-specific networks with
| a network sitting above them whose only role is to
| interrogate the sub-networks for solutions to the problem
| at hand, discriminate which solution(s) are most worth
| trying, simulate those, and then pick one out and execute
| it in the real world. That seems to me to be pretty close
| to what we as humans do.
| ausbah wrote:
| the problem you run into then is how does it adapt to new
| domains? sure large models display some degree of
| generalization, but new tasks will certainly ask for
| things the network never learned
| TrevorJ wrote:
| I think that's valid, however, I think the modular
| approach would allow for a lot more cross-domain
| flexibility, because I suspect that a large subset of
| disparate tasks do actually decompose themselves into
| discrete subtasks that can be useful across domains.
|
| Adding to that, if you can compose multiple sub-networks
| together then you've really got something. You can build
| a lot of different buildings from bricks without needing
| to invent new kinds of brick, basically.
|
| For instance, think about the large number of domains
| that robust computer vision would be useful in. Then
| think about the fact that if the computer understands the
| 3D space around it, it can hand that model off to a
| network that does predictive physics simulation. Now
| you've got something that would be useful across an
| extremely wide range of domains.
| WitCanStain wrote:
| Cognitive scientists have been exploring this idea for a
| while, e.g. the Modularity of Mind hypothesis. It would be
| very interesting to see if learnings from this field can
| be further applied to AI development.
| p1necone wrote:
| You just need a neural network that knows how to
| interrogate other neural networks, and a neural network
| that knows how to find new datasets and train new neural
| networks, and you potentially have something very scary,
| right?
|
| Maybe a third neural network that knows how to purchase
| cloud compute?
| TrevorJ wrote:
| > Maybe a third neural network that knows how to purchase
| cloud compute?
|
| Amazon has entered the chat
| RootKitBeerCat wrote:
| Private angel investors watch from above all of the
| "clouds"
| miguelxpn wrote:
| That book gave me the highest dose of existential crisis
| I've ever felt. I should probably re-read it.
| zro wrote:
| Blindsight is one of my all-time favorite books.
|
| Looks like it's still available for free from the author's
| website [0]
|
| [0] https://www.rifters.com/real/Blindsight.htm
| mgdlbp wrote:
| Indeed, it's actually irrevocably CC BY-NC-SA 2.5
| metacritic12 wrote:
| Are humans just "Chinese rooms"? We don't really understand
| anything deeply, but our neurons just fire in a way that
| gives good responses and makes us feel like we understand
| stuff.
| monkeynotes wrote:
| I think this is exactly true.
| plmpsu wrote:
| No you don't.
| joe_the_user wrote:
| Yes, DALL-E is very impressive to see and can have a number
| of actual practical uses.
|
| But fear of AGI is huge currently. The more impressive
| non-AGI things we see, the more worried people naturally
| become that we're reaching the "dawn" of AGI, with all the
| disturbing implications that this might have. (A lot of
| people are afraid an AGI might escape the control of its
| creator and destroy humanity. I think that's less likely,
| but I think an AGI under the control of its creator could
| destroy or devastate humanity, so I'd agree AGI is a
| worry.)
|
| That DALL-E doesn't understand object-relationships should
| be obvious to people who know this technology, but a lot of
| people seem to need it spelled out. And they probably need
| it spelled out why this implies it's not AGI. But that
| would be several more paragraphs for me.
| jeremyjh wrote:
| The whole point is it isn't doing so coherently if similar
| images were absent from the training data. The monkey
| wasn't touching the iguana.
| jcelerier wrote:
| I don't understand how my brain isn't just gluing things
| together either. I don't personally feel like I'm actually
| experiencing the understanding of anything
| drooby wrote:
| After reading your message I felt compelled to click reply
| and generate a response, and now that's what I'm doing. My
| comment feels mildly appropriate to what you said and I'm
| not 100% sure how it will be received, but it _feels_ like
| I'm communicating some degree of understanding of the
| sentiment you have expressed. My neurons are happy.
| RC_ITR wrote:
| Do you think GPT-3 could generate the same string of text?
| drooby wrote:
| I was curious so I asked it some questions... here you
| go:
|
| Me: Write what it feels like to understand something
|
| GPT-3: It feels good to understand something. You feel
| like you have a handle on what is going on and you are
| able to make sense of it. It can feel empowering and give
| you a sense of control.
|
| Me: Write a sentence that provides an example of self
| reflection and metacognition
|
| GPT-3: I was surprised that I was able to understand the
| concept so quickly.
|
| Me: What concept?
|
| GPT-3: The concept was metacognition.
|
| Me: What does surprisal feel like in this context?
|
| GPT-3: It feels good to be surprised that you understand
| something. You feel like you are smart and have a good
| handle on the concept.
|
| Me: What is the experience of feeling good to you?
|
| GPT-3: The experience of feeling good is one of happiness
| and satisfaction. You feel like you have accomplished
| something and are proud of yourself.
|
| Me: What have you accomplished here?
|
| GPT-3: You have accomplished understanding a concept.
|
| Me: What concept is that again?
|
| GPT-3: The concept is metacognition.
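| For anyone who wants to poke at this themselves: a minimal
| sketch of how a back-and-forth like the one above can be
| reproduced, assuming the 2022-era openai Python client and
| the text-davinci-002 completions model (the model choice
| and the append-the-history prompting are illustrative
| assumptions, not a claim about how drooby ran it):
|
|     import openai  # pip install openai
|
|     openai.api_key = "sk-..."  # your API key
|
|     history = ""
|
|     def ask(question: str) -> str:
|         """Resend the whole transcript so far, so the model
|         can resolve follow-ups like "What concept?"."""
|         global history
|         history += f"Me: {question}\nGPT-3:"
|         resp = openai.Completion.create(
|             model="text-davinci-002",  # assumed model choice
|             prompt=history,
|             max_tokens=100,
|             temperature=0.7,
|             stop=["Me:"],  # stop before it invents our next question
|         )
|         answer = resp["choices"][0]["text"].strip()
|         history += f" {answer}\n"
|         return answer
|
|     print(ask("Write what it feels like to understand something"))
|     print(ask("What concept?"))
|
| The follow-up questions only "work" because the entire
| transcript is resent on every call; the model itself keeps
| no state between requests.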
| orlp wrote:
| The most important thing I think DALL-E shows is that it
| has a model of our world and culture. It's not
| intelligence, but it is knowledge.
|
| Google can give you endless pictures of giraffes if you
| search for them. But it can only connect you to what
| exists. It doesn't know things, it knows OF things.
|
| DALL-E has knowledge of the concept of a giraffe, and can
| synthesize an endless amount of never-before-seen giraffes
| for you. It actually knows what a giraffe is.
| TremendousJudge wrote:
| I find talking about "knowing" too anthropomorphizing for
| my taste. A more accurate description would be that it can
| synthesize an endless amount of pictures that the viewer
| will recognize as giraffes (or as anything else the viewer
| wants). And I agree, it's pretty amazing. But my point is
| that talking like this skips over the entire useless "what
| is knowledge" debate.
| soco wrote:
| Maybe we need to "de-anthropomorphize" these terms and
| rules. It's a long overdue process, I believe.
| psyc wrote:
| If it'll get people to stop accusing other people of
| anthropomorphizing, I'm 100% on board. Now if you can
| think of a way to get people to quit asking for
| definitions of the words _intelligence_ and
| _consciousness_, you'll be a legend and maybe these
| discussions will advance for the first time.
| TremendousJudge wrote:
| Well, I tried my best. Another commenter summed it up
| better without using the term "anthropomorphizing" at
| all, but it came out a bit more condescending than
| necessary.
| IIAOPSW wrote:
| Hang on while I become a legend.
|
| I use GPT3 to generate the usual trite arguments about
| intelligence and consciousness and why computers won't
| ever get there. Of course I don't actually reveal that a
| computer is generating my responses until later on.
| Eventually everyone will become jaded and skeptical that
| the other participants in that conversation are real
| people. Soon all arguments about machine intelligence
| will devolve into accusations of using GPT3 or not. Some
| day, even mentioning consciousness will just make
| everyone assume you're probably a GPT3 troll. This kills
| the conversation in a way that makes a valid point. If
| the bots can't be reliably identified, the proof is in
| the pudding and the matter is settled.
| TremendousJudge wrote:
| I don't think it's possible. "Knowing" and
| "understanding" are inherently human experiences. The
| most reasonable thing to do would be to invent new terms
| and relegate the older ones, which have too much cultural
| baggage attached to them.
|
| The same way, when the creator of humanity turned out to
| be "evolution by natural selection", we didn't redefine
| the term "God" to mean that. Eventually we just started
| using the new term.
| MauranKilom wrote:
| I too can write a program that will synthesize an endless
| amount of pictures that the viewer will recognize as a
| giraffe. They would all be the same picture, just with one
| bit of noise randomly applied to each pixel - enough to
| generate more distinct 256x256 pictures of giraffes than
| there are atoms in the universe.
|
| We need to go a bit deeper to tease out what makes DALL-E 2
| amazing.
| soco wrote:
| An honest question: what, then, is intelligence? Or are
| you just trying to avoid being caught in a debate...
| wcarss wrote:
| you could think of "knowing what a giraffe is" in this
| instance as a slice of an intelligent thought, but it's not
| a whole thought, let alone a train of thoughts, or a plan
| involving many trains of thought over time for internally
| motivated reasons.
|
| If we could get it to tell us a story about a particular
| giraffe, and then ask it next week about that same giraffe,
| and then the giraffe could be referenced by it while it
| tells a joke on a talk show in a decade -- that's maybe too
| high a bar, but that's real intelligence.
| shawnz wrote:
| Why is long-term memory a requirement? Are amnesiacs not
| intelligent?
| wcarss wrote:
| > that's maybe too high a bar, but that's real
| intelligence.
|
| It's not necessarily a requirement, but I couldn't
| witness someone do it and then deny their intelligence.
| shawnz wrote:
| But it's not sufficient either, right? Just having memory
| alone also wouldn't prove intelligence, so it's not
| necessary or sufficient
| trention wrote:
| It knows what combinations of pixels correspond to the
| metadata having "giraffe" in it. Nothing more.
| blooperdeoo wrote:
| [deleted]
| tellarin wrote:
| "It actually knows what a giraffe is." -> It maybe has a
| sense of what a giraffe looks like.
| lancesells wrote:
| I wonder what would happen if you put in the text prompt:
| "Long-necked spotted mammal that eats leaves from trees".
|
| Most people would answer that as a giraffe.
| BudaDude wrote:
| I was curious about this as well. Here is what I got:
| https://i.imgur.com/Ey0i3My.png
| Marazan wrote:
| Very similar to my images for the prompt:
| https://imgur.com/8dw2brf
|
| Intriguing that it's gone for a headshot for all of them.
| I suspect it says something about the source text
| coel wrote:
| I was curious about this too and wondered if it could
| then relate to my description of a giraffe, so I put in
| "legs of a yellow mammal with a long neck and spots" and
| it still nailed it.
| lancesells wrote:
| Dall-E knows what a giraffe is IMO
| nextaccountic wrote:
| I found it incredible too
|
| The text model of dall-e at the very least can map
| "Long-necked spotted mammal that eats leaves from trees"
| near the same representation as "giraffe"
| mechagodzilla wrote:
| The pictures are incredibly similar to what one gets when
| you type that same string into Google image search - it
| makes me wonder if you could do reverse-image search on
| some of the pictures it produces and find whole training
| pictures embedded in it, the way one can find blocks of
| intact training text stored inside language models.
| notahacker wrote:
| One of the interesting things about the "monkey touching
| iguana" photos is just how abstract that sense is.
|
| Several of the pictures look like it's taken geometry from
| a 'touching' source and painted both 'monkey' and 'iguana'
| textures onto both figures, because on the one hand its
| model of relationships is too sophisticated to just
| copy/paste monkey and iguana photos from its library, and
| on the other hand it's not sophisticated enough to always
| treat "monkey touching iguana" as implying that the monkey
| and the iguana are discrete animals. (An interesting
| contrast with it being generally praised for being
| remarkably good at things like putting hats on animals'
| heads...)
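| nextaccountic's embedding claim a few comments up is easy
| to sanity-check outside DALL-E itself. A minimal sketch,
| assuming the openly released CLIP text encoder (which
| DALL-E 2 conditions on) as a stand-in; the checkpoint name
| and the control prompt are illustrative choices, not
| anything OpenAI exposes for DALL-E:
|
|     import torch
|     from transformers import CLIPModel, CLIPProcessor
|
|     # Public CLIP checkpoint as a proxy for DALL-E 2's text encoder
|     model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
|     processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
|
|     prompts = [
|         "giraffe",
|         "long-necked spotted mammal that eats leaves from trees",
|         "sports car",  # control prompt; should land far away
|     ]
|     inputs = processor(text=prompts, return_tensors="pt", padding=True)
|     with torch.no_grad():
|         emb = model.get_text_features(**inputs)
|     emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize rows
|     print(emb @ emb.T)  # pairwise cosine similarities
|
| If the description really is mapped near "giraffe", the
| first two prompts should score far closer to each other
| than either does to the control prompt.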
| XorNot wrote:
| On the other hand, if you posted "monkey touching iguana"
| as a prompt on an internet forum - say a photoshop
| contest - you would get essentially all of these variants
| from actual humans as well, once the task delved into
| being "meta".
| notahacker wrote:
| People are certainly _capable_ of wilfully
| misinterpreting the prompt because they consider
| monkey/iguana chimeras to be amusing to draw, or think
| that semi-abstract hybrid animals with a caption musing
| on the meaning of "touching" is a higher form of art than
| mere representation. And in a photoshop competition they
| might all copy one silly interpretation because they
| think it's meme-worthy.
|
| But I don't think if you asked 18 humans to independently
| draw "monkey touching iguana" you'd get 17 pairs of
| monkey/iguana hybrids mostly not touching each other
| against photographic backgrounds often featuring human
| limbs, and one apparently normal monkey being pursued by
| a giant iguana!
| bpodgursky wrote:
| Let's not play word games.
|
| Do _you_ know what a giraffe _is_? No, you just know what a
| giraffe looks like, where it lives, and maybe that it's
| vaguely related to a horse.
| axus wrote:
| I do not _grok_ a giraffe in the same way I do the things
| I work on.
| shafoshaf wrote:
| That last bit about the horse is pretty loaded. Knowing
| what a giraffe is means knowing that it is a mammal, has
| a blood system, eats leaves, reacts certain ways to
| certain stimuli, etc... That is a lot more than just
| knowing what it looks like. Image recognition is not the
| same as relationships or knowledge.
| influxmoment wrote:
| > mammal, has a blood system, eats leaves, reacts certain
|
| DALL-E likely can map all those concepts to a giraffe
| also.
| karatinversion wrote:
| By your criteria, no human before the 18th century (when
| Carl Linnaeus invented the class "mammalia") knew what a
| giraffe was.
| usrusr wrote:
| And in the 25th century, if we manage to keep technology
| alive, the dall-es of the day will be trained on more
| computer depictions of long-extinct animals that
| accidentally made it into the training set than pictures
| of real animals. It will be Dürer's rhinoceros all over
| again.
| orlp wrote:
| Now ask DALL-E "giraffe feeding young", "giraffe organs",
| "giraffe eating", etc. I think you'd find most knowledge
| is there.
| teolandon wrote:
| What about "giraffe laying eggs"?
| IIAOPSW wrote:
| That's not a word game at all. To know what a giraffe is
| is to be able to understand the implications of being
| informed that something is a giraffe. Dalle would never
| assume on its own that "a giraffe close to a low flying
| helicopter" could be very dangerous for the giraffe, but
| "a horse close to a low flying helicopter" is safe.
| nextaccountic wrote:
| > Dalle would never assume on its own that "a giraffe
| close to a low flying helicopter" could be very dangerous
| for the giraffe, but "a horse close to a low flying
| helicopter" is safe.
|
| Are you sure? (Can someone run those prompts?)
|
| Dall-e might be able to make those relationships in the
| latent space
| gattilorenz wrote:
| > Are you sure?
|
| Considering Dall-e has problems painting "a red cube on
| top of a blue cube" [1] and all kinds of simple spatial
| relations, I'd say it's a fair shot.
|
| [1] As reported by OpenAI, but there are also some
| prompts by Gary Marcus et al.
| (https://arxiv.org/abs/2204.13807) showing this, and it's
| trivially simple to find other very simple cases like
| these
| Earw0rm wrote:
| Not really close to horses at all. It's an even-toed
| ungulate, closer to deer, cows and sheep. Many, many tens
| of millions of years from horses... the even-toed
| herbivores are closer to dolphins (!!) than they are to
| horses or rhinos.
| leereeves wrote:
| You know a lot more than that. You know that it eats,
| sleeps, runs, mates. You know that there are male and
| female giraffes and what that means. You know that it
| dies if it gets too hot, too cold, or if it falls off a
| cliff. You know that it has legs, a head, a neck, etc.,
| and you probably know that it has a heart, brain, lungs,
| kidneys, etc. inside.
|
| You know, vaguely, how it interacts with other things and
| what it's likely to do around a fruit tree, or a lion, or
| fire.
|
| Even if you've never been close to a giraffe, you can
| probably imagine what it looks like from close enough to
| see individual hairs in its fur.
|
| A lot of knowledge is still missing from ML systems that
| don't interact with the world.
| hypertele-Xii wrote:
| It's even deeper than that. Humans and giraffes are
| mammals, _very_ close cousins in terms of evolutionary
| biology. The very nature of your experience of reality is
| essentially shared with the giraffe.
|
| Part of our brains are _lizard_, both yours and the
| giraffe's. Tech so ancient that it shares the same
| circuits and chemicals with _crustaceans_.
|
| You can imagine what existence is like for a giraffe with
| pretty much 99% accuracy without consciously knowing a
| single thing about it.
|
| A word-based image generator cannot.
| kilroy123 wrote:
| What do you mean by "is"? I've spent a fair amount of
| time with giraffes in real life. Does that mean I know
| what one is?
| mym1990 wrote:
| I would really hope so. All of this conversation is
| devolving into a philosophical argument about two-letter
| words. To know what something is relates to understanding
| some of its properties, and how it might interact with
| the world around it. Whether DALL-E knows what something
| is, is debatable. Whether you know what something is
| should be pretty clear.
| romellem wrote:
| If this had been around in 2009 it would have made the
| https://www.onemilliongiraffes.com/ project much easier.
| SketchySeaBeast wrote:
| > It actually knows what a giraffe is.
|
| Every time I see a painting it does with blank beige swirls
| where faces should be, I question whether it knows what
| anything is or what the important parts are.
| LordDragonfang wrote:
| I suspect you're thinking of Craiyon (formerly DALL-E
| Mini), which is a derivative (but distinct) project, and
| consistently does the beige swirls. DALL-E 2 proper is much
| higher fidelity, and seems much more reliable at generating
| passable (if imperfect) faces from what I've seen.
| rob74 wrote:
| What's still missing, as the article mentions, is the
| correct relationship between objects. As in this post from
| a few months ago:
| https://astralcodexten.substack.com/i/57243633/the-tenth-vir...
| - it knows that Tycho Brahe was a bearded guy, and it knows
| what a telescope looks like, but when you ask for "Tycho
| Brahe looking through a telescope", the person sometimes
| looks _at_ the telescope, or _past_ it, sometimes the
| telescope just sits there, but the person never actually
| looks _through_ it.
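| The "red cube on top of a blue cube" failure mode that
| gattilorenz and rob74 describe can be probed directly,
| because CLIP-style image-text matching (roughly the kind of
| representation DALL-E 2 is built on) is known to be fairly
| insensitive to word order. A minimal sketch, assuming a
| local test image and the public CLIP checkpoint, both of
| which are illustrative stand-ins:
|
|     import torch
|     from PIL import Image
|     from transformers import CLIPModel, CLIPProcessor
|
|     model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
|     processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
|
|     # Hypothetical photo that actually shows a red cube on a blue cube
|     image = Image.open("red_cube_on_blue_cube.png")
|     texts = [
|         "a red cube on top of a blue cube",
|         "a blue cube on top of a red cube",  # relation reversed
|     ]
|     inputs = processor(text=texts, images=image,
|                        return_tensors="pt", padding=True)
|     with torch.no_grad():
|         logits = model(**inputs).logits_per_image  # shape (1, 2)
|     print(logits.softmax(dim=-1))  # often close to 50/50 either way
|
| When the two captions score nearly the same against a
| picture whose geometry matches only one of them, a
| generator steered by that representation has little reason
| to get the relation right.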
| amelius wrote:
| Reminds me of a classifier that couldn't tell the
| difference between a panther and a couch with a
| panther-print.
| dane-pgp wrote:
| But what if the couch wasn't panther-print but made from
| actual panther hide? Is a dead panther still a panther?
| How much of the panther's body has to be visible for the
| object depicted to still be a panther?
|
| I'm reminded of the classic joke about the mathematician
| on a train who sees a sheep...
|
| https://bestjokehub.com/joke-103626/
| blooperdeoo wrote:
| "It actually knows what a giraffe is."
|
| No. You know what a giraffe is; Dall-E simply creates pixel
| groups which correlate to the text pattern you submitted.
|
| Watching people discuss a logical mirror makes me worry
| that most people are not themselves conscious.
| rhdunn wrote:
| There's an interesting Google TechTalk on YouTube called
| "The Next Generation of Neural Networks" from 2007 [1]. In
| that video, there is a demo [2] that showcases the neural
| network recognizing numbers when given a drawing of a
| number as input, but more interesting is the follow-on
| where the demo fixes the output to a given number and lets
| the neural network generate the "input" image, to show
| what it thinks that number looks like. That is a strong
| indication to me that that particular neural network has a
| good understanding of what the particular number glyphs
| look like, even if it does not know what those are
| conceptually or how they relate to each other and to
| mathematics -- that is, the neural network would not be
| able to work out what the number 42 is, how it relates to
| 37 (i.e. 37 < 42), or how to manipulate those numbers
| (e.g. 2 + 7 = 9).
|
| Dall-E will likely be similar in that it is effectively
| doing that perception step where you fix the text
| description from the classifier output and run that in
| reverse to show what the neural network is "seeing" when
| it is "thinking" about that given output. So it won't be
| able to describe features of a giraffe, or information
| about where they live, etc., but it will be able to show
| you what it thinks they look like.
|
| [1] https://www.youtube.com/watch?v=AyzOUbkUf3M
| [2] https://youtu.be/AyzOUbkUf3M?t=1293
| gojomo wrote:
| There are a lot of people who make this same argument -
| DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
| rarely make convincing supporting arguments.
|
| They just assert it as axiomatic, whistling past all the
| ways that they themselves - unless they believe in
| supernatural mechanisms - are also the product of a finite
| physical-world system (a biological mind) and a finite
| amount of prior training input (their life so far).
|
| I'm beginning to wonder if the entities making this
| argument are conscious! It seems they don't truly
| understand the issues in question, in a way they could
| articulate recognizably to others. They're just repeating
| comforting articles-of-faith that others have programmed
| into them.
| mherdeg wrote:
| > I'm beginning to wonder if the entities making this
| argument are conscious! It seems they don't truly
| understand the issues in question, in a way they could
| articulate recognizably to others. They're just repeating
| comforting articles-of-faith that others have programmed
| into them.
|
| I took a quick look at the Stanford Encyclopedia of
| Philosophy entry for philosophical zombies
| ( https://plato.stanford.edu/entries/zombies/ ) and I
| can't see evidence of this argument having been seriously
| advanced by professionals before. I think it would go
| something like:
|
| "Yes, we have strong evidence that philosophical zombies
| exist. Most of the laypeople who discuss my line of work
| are demonstrably p-zombies."
| gfodor wrote:
| Use twitter long enough and it's hard not to be convinced
| that p-zombies exist - though it may be that such zombies
| are ephemeral, a transitory state of some individuals.
| FeepingCreature wrote:
| Alternately, of course, conscious awareness is the
| ephemeral state.
|
| ("A laser is trying to find the darkness...")
| martindbp wrote:
| Yes, and DALLE clearly knows what a giraffe is, maybe not
| in the same way as we do, but it has placed the giraffe
| in some latent space close to other animals like it but
| far away from, say, cars. But yes, there is plenty it
| doesn't know about giraffes, because it's only been fed
| images and captions, while we can watch documentaries
| about giraffes and apply knowledge we've acquired about
| other animals and the world through our senses.
| drdeca wrote:
| I think I'd say that it knows what a giraffe _looks like_
| and some aspects of their behavior. I'm not sure if it
| (in a sense) "knows that a giraffe is an animal", though
| I wouldn't be too surprised if it does.
|
| Though, I guess maybe that is all "knowing what a giraffe
| is" is?
| usrusr wrote:
| I know what a giraffe looks like and some aspects of
| their behavior. I couldn't build a giraffe, I've never
| been a giraffe, I wouldn't even know what to talk about
| if I met a giraffe on a blind date.
| saynay wrote:
| At some point, these discussions always get to some
| question like "what is 'knowing'?"
|
| Would a person who only ever read articles and looked at
| pictures of giraffes have a better understanding of them
| than Dall-e does? At some level, probably, in that every
| person will have a similar lived experience of _being_ an
| animal, a mammal, etc. that Dall-e will never share. Is
| having a lesser understanding sufficient to declare it
| has no real understanding?
| [deleted]
| joe_the_user wrote:
| _There are a lot of people who make this same argument -
| DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
| rarely make convincing supporting arguments._
|
| I suggest you look at the parent article.
|
| Defining "understanding" in the abstract is hard or
| impossible. But it's easy to say "if it can't X, it
| couldn't possibly understand". Dall-E doesn't manipulate
| images three-dimensionally, it just stretches images with
| some heuristics. This is why the images shown for "a cup
| on a spoon" don't make sense.
|
| I think this is a substantial argument and not
| hand-waving.
| gojomo wrote:
| In many of its outputs, it's doing far more than
| stretching, and essentially rotating new entities into
| scene-appropriate relationships.
|
| True, it has some problems fully abstracting, and then
| logically enforcing, some object-to-object relationships
| that most people are trivially able to apply as
| 'acceptance tests' on candidate images. That is evidence
| its scene-understanding is not yet at human level, in
| that aspect - even as it's exceeded human-level
| capabilities in other aspects.
|
| Whether this is inherent or transitory remains to be
| seen.
| The current publicly-available renderers tend to have a
| hard time delivering requested meaningful text in the
| image. But Google's PARTI claims that simply growing the
| model fixes this: see, for example:
| https://twitter.com/gojomo/status/1540095089615089665
|
| We also should be careful using DALL-E as an accurate
| measure of what's possible, because OpenAI has
| intentionally crippled their offering in a number of ways
| to avoid scaring or offending people, under the rubric of
| "AI safety". Some apparent flaws might be intentional, or
| unintentional, results of the preferences of the
| designers/trainers.
|
| Ultimately, I understand the practicality of setting
| tangible tests of the form, "To say an agent
| 'understands', it MUST be able to X".
|
| However, to be honest in perceiving the rate of progress,
| we need to give credit when agents defeat all the
| point-in-time MUSTs, often faster than even optimists
| expected. At that point, searching for new MUSTs that the
| agent fails at is a valuable research exercise, but
| retroactively adding such MUSTs to the definition of
| 'understanding' risks self-deception. "It's still not
| 'understanding' [under a retconned definition we
| specifically updated with novel tough cases, to comfort
| us about it crushing all of our prior definition's
| MUSTs]." It obscures giant (& accelerating!) progress
| under a goalpost-moving binary dismissal driven by
| motivated reasoning.
|
| This is especially the case as the new MUSTs increasingly
| include things many, or most, humans don't reliably do!
| Be careful who your rules-of-thumb say "can't possibly be
| conceptually intelligent", lest you start unpersoning
| lots of humanity.
| joe_the_user wrote:
| I read a modest amount of popular "AI safety" text
| (lesswrong basically) and this is the first time I've
| heard of Dall-E being crippled for safety reasons.
|
| Your argument overall seems to be "you skeptics keep
| moving the bar; give me a benchmark I can pass and I'll
| show you", which seems reasonable on its face but I don't
| think actually works.
|
| The problem is that while an algorithm may be defined by
| theory and tested by benchmark, the only "definition" we
| have for general intelligence is "what we can see people
| doing". If I or anyone had a clear, accepted benchmark
| for general intelligence, we'd be quite a bit further
| towards creating it, but we're not there.
|
| That said, I think one thing that current AIs lack is an
| understanding of their own processing and an
| understanding of the limits of that processing. And there
| are many levels of this. But I won't promise that if this
| problem is corrected, I won't look at other things. IDK,
| achieving AGI isn't like just passing some test; no
| reason it should be like that.
| weego wrote:
| I guess ironically you've just recreated the argument
| people of faith use to shout down logic used to question
| the reality of their belief system.
|
| The burden of proof is not on the one claiming logically
| consistent interpretations of events.
| gojomo wrote:
| Huh? That's a bit abstract, can you clarify?
| mlcrypto wrote:
| Yep, the human brain is just neurons firing or chemicals
| activating
| meroes wrote:
| And don't blame others for not finding this satisfactory
| either. Many, many mathematicians think abstract objects
| exist outside of physical reality. Chomsky says
| "physical" hasn't been well defined for 200 years.
| And finite physical processes do not seem adequate to
| explain the infinite character of language and
| mathematical infinity. Or if they are, then go inform the
| mathematical realists that infinity isn't real because
| all proofs are finite - yet Gödel believed in actual
| infinity.
| bglazer wrote:
| I can give a random string of letters as a prompt and
| DALLE will generate coherent images based on that. To me,
| that is as clear a signal as any that there is no
| reasoning or even a consistent world model embodied in
| DALLE. It's simply a high-dimensional latent mapping
| between characters and pixels. Like OP said, that is a
| stupendous achievement, but it is just a very complex and
| impressive mirror. If it wasn't just a mapping between
| characters and pixels, and instead DALLE had intelligence
| that "understood" the symbols it manipulated, then I
| would expect it to generate nothing, or white noise, in
| response to random letters.
| elcomet wrote:
| I agree there's no high-level reasoning (if there's a way
| to define this).
|
| But saying "It's simply a high dimensional latent mapping
| between characters and pixels" is clearly a very bad
| argument. Your brain is simply a high-dimensional latent
| mapping between your sensory input and your muscular
| output. This doesn't make you not intelligent.
| thfuran wrote:
| > Your brain is simply a high dimensional latent mapping
| between your sensory input and your muscular output
|
| It definitely does more than that.
| elcomet wrote:
| What do you mean? Your brain _is_ a high-dimensional
| function. I'm not talking about what it is able to do,
| I'm talking about what it is.
| gojomo wrote:
| If you give humans gobbledygook input - say a Rorschach
| blot, or some random syllables - and ask them to describe
| what it makes them think of, or generate art from it,
| many will do so quite competently.
|
| They'll find some morsels of fragmentary hints-of-meaning
| in the junk, or just act from whatever's bouncing around
| in their own 'ground state', and make something
| interesting & coherent, to please their interlocutor.
|
| So I don't see why this corner case impugns the
| level-of-comprehension in DALLE/etc - either in this
| specific case, or in the other cases where meaningful
| input produces equally-meaningful responses.
|
| In what ways are you yourself not just a "very complex &
| impressive mirror", reflecting the sum of all external
| influences (training data), & internal state changes,
| since your boot-up?
|
| Your expectation that random input should result in noise
| output is weird to me. _People_ can see all sorts of
| omens & images in randomness; why wouldn't AIs?
|
| But also: if you trained that expectation into an AI, you
| could get that result. Just as, if you coached a human,
| in a decade or two of formal schooling, that queries with
| less than a threshold level of coherence should generate
| an exceptional objection, rather than a best-guess
| answer, you could get humans to do so.
| bglazer wrote:
| I think most people would find it more difficult to draw
| a "serkle nufding" than a "spotted dog". At the very
| least, they would need an explanation that you're looking
| for whatever pops into their head for the nonsense.
| Again, that indicates that human intelligence works in a
| fundamentally different way from DALLE, which has
| absolutely no idea what is nonsense.
| So, in this incredibly important task (recognizing
| nonsense) there's an asymmetry: people can act like
| DALLE, but DALLE cannot act like humans. To me, that
| totally breaks any equivalence between DALLE and human
| intelligence.
| dtech wrote:
| Then again, if you lock a human in a room, tell it "draw
| me things based on the notes you receive or you will be
| killed", and give it a note "serkle nufding", you bet the
| human will draw _something_.
|
| Given the model has only 1 input and 1 output, and
| training is essentially surviving that order, it's not
| dissimilar.
| gojomo wrote:
| I've not tested this, but I suspect you can get DALLE to
| create pictures that humans are more likely to describe
| as 'nonsense' by adding 'nonsense' or 'nonsensical' to
| the prompt. That'd indicate that it absolutely _does_
| have an idea about 'nonsense' and can recognize, &
| reproduce within its constrained output, 'nonsense'
| that's largely compatible with human ideas of what
| 'nonsense' is.
|
| Asking it to produce _noise_, or to raise an objection
| that a prompt isn't sufficiently meaningful to render, is
| a silly standard, because it's been designed, and
| trained, to always give _some_ result. Humans who can
| object have been trained differently.
|
| Also, the GPT models - another similar train-by-example
| deep-neural architecture - can give far better answers,
| or sensible evaluations of the quality of their answers,
| _when properly prompted to do so_. If you wanted a model
| that'd flag nonsense, just give it enough examples, and
| enough range of output where the answer you're demanding
| is even _possible_, and it'll do it. Maybe better than
| people.
|
| The circumstances & limits of the single-medium (text, or
| captioned image) training goals, and allowable outputs,
| absolutely establish that these are _different_ from a
| full-fledged human. A human has decades of
| reinforcement-training via multiple senses, and more
| output options, among other things.
|
| But to observe that _difference_ and conclude these
| models don't "understand" the concepts they are so
| deftly remixing, or are "just a very complex and
| impressive mirror", does not follow from the mere
| difference.
|
| In their single modalities, constrained as they may be,
| they can train on the equivalent of a million lifetimes
| of reading, or image-rendering. Objectively, they're
| arguably now better at composing college-level essays, or
| rendering many kinds of art, than most random humans
| picked off the street would be. Maybe even better than
| 90% of all humans on earth at these narrow tasks. And
| their rate of improvement seems only a matter of how much
| model-size & training-data they're given.
|
| Further: _the narrowness of the tasks is by designers'
| choice, NOT inherent to the architectures._ You could be
| - and active projects are - training similar
| multi-modality networks. A mixed GPT/DALLE that renders
| essays with embedded supporting pictures/graphs isn't
| implausible.
| londons_explore wrote:
| I just got Dalle-2 to draw a "serkle nufding". It gave me
| 4 pictures of rather odd-looking food. I too assume that
| random words are usually food, if only because restaurant
| menus seem to have such a love of making up
| foreign-sounding words.
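| gojomo's "just give it enough examples" point above is
| easy to sketch concretely: a few-shot probe that asks a
| GPT-3 model to flag nonsense before any image is rendered.
| A minimal, hypothetical sketch assuming the 2022-era
| openai Python client and text-davinci-002; the labels and
| example prompts are our own illustrative choices:
|
|     import openai  # pip install openai
|
|     openai.api_key = "sk-..."  # your API key
|
|     FEW_SHOT = (
|         "Label each image prompt as SENSE or NONSENSE.\n"
|         "Prompt: a spotted dog -> SENSE\n"
|         "Prompt: serkle nufding -> NONSENSE\n"
|         "Prompt: a red cube on top of a blue cube -> SENSE\n"
|     )
|
|     def flag_nonsense(prompt: str) -> str:
|         resp = openai.Completion.create(
|             model="text-davinci-002",  # assumed model choice
|             prompt=FEW_SHOT + f"Prompt: {prompt} ->",
|             max_tokens=3,
|             temperature=0,  # deterministic labelling
|         )
|         return resp["choices"][0]["text"].strip()
|
|     print(flag_nonsense("monkey touching iguana"))
|     print(flag_nonsense("E sublimary widge fraus chronly"))
|
| Nothing about the architecture forbids such a check;
| DALL-E just wasn't trained or wired to perform one before
| rendering.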
| dane-pgp wrote:
| > a love of making up foreign sounding words
|
| One great example of this phenomenon is "Haagen-Dazs". [0]
|
| Admittedly that's a brand name, rather than a specific
| dish, but I assume that Dalle-2 would generate an image
| of ice cream if given a prompt with that term in it
| (unless there is a restriction on trademarks?).
|
| [0] https://funfactz.com/food-and-drink-facts/haagen-dazs-name/
| caconym_ wrote:
| When I read the words "serkle nufding" I pictured in my
| head, with zero perceptible delay, a weird little goblin
| guy with pale green skin and long ears and a pointy hat.
| sarchertech wrote:
| Serkle Nufding sounds like a word. What about
| "xywjlkbtywlsjjsnskksbrjsksbsysjsnsjdkdmdnd"?
| caconym_ wrote:
| Honestly? The top half of a signpost bearing the name of
| a Welsh village in white on green (likely an artifact of
| my actual local signage standards). The tops of some
| broadleaf deciduous trees, possibly ash, are visible in
| the background, but the perspective is from below (as one
| usually looks up at a street sign, also probably an
| artifact of my local experience with signs rather than a
| true representation of Welsh signage) and so the
| background is dominated by a blue sky.
|
| This is probably because your string is a low-entropy
| keyboard-mash.
| simondw wrote:
| > they would need an explanation that you're looking for
| whatever pops into their head
|
| Of course. And the equivalent of that explanation is
| baked into DALL-E, in the form of its programming to
| always generate an image.
|
| > but DALLE cannot act like humans
|
| No, not generally, but I don't think anyone has claimed
| that.
| bglazer wrote:
| What else always produces an image, no matter what you
| present to it?
| simondw wrote:
| Lots of things, but I'm not sure why you're asking that.
| Could you connect that back to the point you're making?
| saynay wrote:
| I am not really sure that this is a basis to say human
| intelligence is fundamentally different. If you ask a
| human to draw a picture based on nonsensical input, they
| probably won't. That might not be a difference in
| process, but in agency.
|
| So is a single-purpose AI equivalent to the entirety of
| the Human Experience? Of course not. But can it be
| similar in functionality to a small sliver of it?
| IIAOPSW wrote:
| I've done my share of testing randomish prompts. Here are
| two examples that might change your mind slightly. The
| first example is "E sublimary widge fraus chronly becross
| yon corpti." I constructed this sentence to resemble the
| grammar, syllables and pacing of English, but it is
| complete nonsense. The output was both consistent and
| coherent. In the case of DallE, it generated church- and
| saint-related images. The second example was ".- / ... .
| -- .- .--. .... --- .-. . / .-.. .- -- .--. / .- -... ---
| ...- . / - .... . / -.-. .-.. .- ... ... --- -. / .- ...-
| . -. ..- . / ... ..- -... .-- .- -.-- / ... - .- - .. ---
| -." There are clearly not enough (any?) samples of morse
| code text in the training data. The output was wholly
| unrelated to the prompt. The output is kind of like
| doodling. It's fairly random, fairly undirected, without
| any real theme, but it's not the same randomness as TV
| static. It's more like randomness from a distribution of
| things people would draw when bored.
|
| So the interesting part is this: why did one random
| prompt fail in a consistent way and the other in a random
| way?
| Perhaps the encoding of meaning into vocabulary has
| patterns to it that we ourselves haven't noticed. Maybe
| your random string experiment works because there is some
| amount of meaning in the syllables that happened to be in
| your chosen string.
| rhdunn wrote:
| I suspect it is designed to describe images, like you
| would find in alt text, but run in reverse. As your first
| example is closer to English it can do a better job than
| with the second example, like how the Jabberwocky poem
| contains nonsense words but still makes some degree of
| sense.
|
| For example, the "yon" is immediately recognizable to me
| (hither and yon), so "yon corpti" could mean a distant
| corpti (whatever a corpti is). "becross" looks similar to
| "across" but with a be- prefix (be-tween, be-neath,
| be-twixt, etc.), so could be an archaic form of that.
| "chronly" could be something time-related (chronos+ly).
| etc...
| gojomo wrote:
| Indeed - the 'cadence of English' prompt reminds me of
| Latin, or some other foreign litany that's nonetheless
| repeated to an audience that's not following every word.
| And IIAOPSW has used real word-roots, hinting at
| meanings! (The FastText algorithm does a passable job at
| synthesizing word-vectors for unknown words using such
| fragments.) So rendering that as a religious scene isn't
| too surprising.
|
| That morse code gives nothing useful probably just
| indicates some combination of (a) few morse transcripts
| in the training set and (b) punctuation-handling in
| training or prompting that makes it more opaque. It's
| opaque to me, other than recognizing it's morse code.
| semi-extrinsic wrote:
| Your first random prompt is far from random. It contains
| the fragments "sublim", "chr", "cross" and "corpt" in
| addition to the isolated "E", which all project the
| solution down towards Latin and Christianity.
|
| You can easily see that these language models are in some
| sense working on fragments as much as they are on the
| actual words isolated by spaces in your sentence. Just
| take a test sentence and enter it as a prompt to get some
| images. Then take that same sentence, remove all spaces
| and add new spaces in random locations, making gibberish
| words. You will see that the results retain quite a few
| elements from the original prompt, while other things
| (predominantly monosyllables) become lost.
|
| To me, I have not seen a single example that cannot just
| be explained by saying this is all just linear algebra,
| with a mind-bogglingly huge and nasty set of operators
| that has some randomness in it and that projects from the
| vector space of sentences written in ASCII letters onto a
| small subset of the vector space of 1024x1024x24bit
| images.
|
| If you then think about doing this just in the "stupid
| way", imagine you have an input vector that is 4096 bytes
| long (in some sense the character limit of DALL-E 2) and
| an output vector that is 3 million bytes long. A single
| dense matrix representing one such mapping has about 12
| billion parameters - but you want something very sparse
| here, since you know that the output is very sparse in
| the possible output vector space. So let's say you have a
| sparsity factor of somewhere around 10^5. Then with the
| 3.5 billion parameters of DALL-E 2, you can "afford"
| somewhere around 10^5 such matrices. Of course you can
| apply these matrices successively.
|
| Is it then so far-fetched to believe that, if you thought
| of those 10^5 matrices as a basis set for your
| transformation, with a separate ordering vector to say
| which matrices to apply in what order, and you then spent
| a huge amount of computing power running an optimizer to
| get a very good basis set and a very good dictionary of
| ordering vectors, based on a large corpus of images with
| captions, you would get something comparably impressive
| to DALL-E 2?
|
| When people are wowed that you can change the style of
| the image by saying "oil painting" or "impressionist",
| what more is that than one more of the basis-set matrices
| being tacked on in the ordering vector?
| Joeri wrote:
| Dall-e must generate an image, even for a random prompt
| of letters, because it has been trained to do only that.
| Take a human, hold a gun to their head, and force them to
| paint an image for a random string of letters, and I
| guarantee they will make you something.
|
| Retrain dall-e and give it a choice whether it generates
| an image or does something else, and you will get a
| different outcome.
|
| The argument boils down to this: is a human brain nothing
| but a mapping of inputs onto outputs that loops back on
| itself? If so, the dall-e / gpt-3 approach can scale up
| to the same complexity. If not, why not?
| snapcaster wrote:
| How do you know you know what a giraffe is?
| logifail wrote:
| > How do you know you know what a giraffe is?
|
| Perhaps a more interesting question could be: [how] do we
| know what consciousness is?
| twoxproblematic wrote:
| gfodor wrote:
| Not exactly. If you're using the word "simply" you
| probably don't quite understand what exactly DALL-E is
| doing. It's not just mapping correlations between words
| and pixels. The intermediate representations and textual
| attention are what make DALL-E interesting.
| jvanderbot wrote:
| Precisely this! You don't know what a giraffe is, either.
| But you can talk about a giraffe to yourself, using words
| with correlating imagery that you can "see" by feeding it
| through your visual cortex, and that's close enough.
| EricBurnett wrote:
| Thought experiment: between you and the AI, which would
| do a better job depicting a giraffe skeleton? A giraffe
| in its natural habitat? Their favorite tree to eat?
| Species on the genetic tree closest to giraffes?
|
| If we assume this AI or a successor can win that
| evaluation, in what way would you say you know what a
| giraffe is better than the AI?
| adamsmith143 wrote:
| Comprehension does not imply consciousness, or vice
| versa...
| chaxor wrote:
| There are far too many people who try to use this
| argument without really knowing what they're talking
| about.
|
| *Don't mistake skepticism for knowledge*
|
| This is a major problem on this site and elsewhere.
| elcomet wrote:
| > You know what a giraffe is, Dall-E simply creates pixel
| groups which correlate to the text pattern you submitted.
|
| How would you tell the difference, though? Can you think
| of a test to distinguish between those two abilities?
| ramoz wrote:
| Because you've mentioned Google, here's this:
| https://imagen.research.google/
| danielmorozoff wrote:
| Where do you see DALLE automating away jobs?
| soderfoo wrote:
| Augmenting jobs is more likely, from what I have heard.
| It would be useful for rapid prototyping of artwork and
| could help designers get design specs to artists quicker.
|
| The initial impressions are that production-ready designs
| will require an artist's touch.
| But this discussion was in the context of AA games and
| up. For indie and small games, fully AI-produced art may
| be fine.
| robertlagrant wrote:
| There was a post on here recently on how someone designed
| their logo by asking DALL-E a load of questions. The
| rationale could be that it might not remove that job, but
| it would certainly be a powerful tool for a logo designer
| to take on way more clients, thus driving down the cost
| of logo design.
| lancesells wrote:
| If I'm thinking of the same post, that logo is poorly
| done / chosen. I know the author wanted very specific
| imagery in it and achieved that, but I don't think that
| logo would be considered "good" by most.
| arbitrage wrote:
| That was the first instance. These sorts of things only
| get better over time.
| odessacubbage wrote:
| i feel like this boils down to a client-side
| misunderstanding of where graphic designers actually
| spend the majority of their time. i'd actually argue that
| working from spec or even a crude doodle on the back of a
| napkin is a lot easier than dealing with a client who is
| married to a poor design that DALL-E puked out for them.
| one of the most important things about the sketch phase
| is to be able to iterate without too many assumptions
| while also being able to recognize and play upon the
| strengths of individual drawings. this is not a
| time-consuming process, but it also isn't something you
| want to rush any more than you'd want to rush the plastic
| surgeon performing your nose job. depersonalizing the
| design process in favor of volume does not particularly
| serve anyone and, moreover, it responds to a need that i
| don't really think exists. this is not to say that these
| kinds of image generators are without uses, but right now
| we are still in the phase where people are being
| impressed by graphical fidelity [which was already
| incredibly cheap] rather than usefulness. imo this stage
| of ai will primarily work well for tasks that benefit
| from randomness while also being relatively low-stakes in
| the overall production pipeline. training a network to
| generate typefaces or to randomize npc portraits for an
| open-world game is precisely the sort of area where
| 'gluing things together' can potentially provide more
| benefits than headaches.
| robertlagrant wrote:
| I was saying a designer could use it for first-draft
| ideas, not their client.
| laxd wrote:
| I just cancelled my plan of paying a graphics designer to
| prepare an android app (personal project) for launch.
| After playing with dall-e I'm confident that "I" can make
| a logo and some artwork "myself". Edit: meant to say
| icon, not logo.
| IshKebab wrote:
| The people who do digital art illustrations for magazine
| articles are surely out of a job?
| elondaits wrote:
| I think it'll start by removing the need for editorial
| illustration and photography for a lot of blogs and
| websites, starting with the less reputable ones.
|
| MidJourney is already good enough at creating
| illustrations for articles, book covers, etc. Not
| something that The New Yorker will be interested in, but
| better than anything you can buy for a couple of dollars
| or a small monthly subscription.
| notahacker wrote:
| > I think it'll start by removing the need for editorial
| illustration and photography for a lot of blogs and
| websites, starting with the less reputable ones.
|
| Stock photo sites and Google Images and basic graphic
| filters did that a _long_ time ago.
| Sure, DALL-E has certain creative possibilities that
| those don't, and the niches like satirical memes or
| "first pass" concepts in agency brainstorming meetings
| where it could be absolutely revolutionary tend not to be
| deliverables the creators get paid for. The publications
| giving illustrators a reliable income are usually after
| their consistent style, fantasy-world coherence or novel
| ideas, which isn't exactly where DALL-E shines.
| Workaccount2 wrote:
| DALL-E is the writing on the wall for those with a
| competing talent to start looking for other (non-art)
| work they may want to do in the future. It would be an
| egregious error to think that AI art programs are some
| new tool to learn, akin to photoshop. Maybe that will
| hold true for a few years at best.
|
| There will come a point, much sooner than later, where
| the value in typing prompts into an AI is going to be
| worth only minimum wage.
| notahacker wrote:
| Substitute "the camera" (and perhaps even more "the
| gramophone") for "DALL-E" and "AI art programs", and the
| 19th century wants its argument against the future of
| creative media back.
| IshKebab wrote:
| I agree. This paragraph is baffling:
|
| > DALL-E's difficulty in juxtaposing wildly contrastive
| image elements suggests that the public is currently so
| dazzled by the system's photorealistic and broadly
| interpretive capabilities as to not have developed a
| critical eye for cases where the system has effectively
| just 'glued' one element starkly onto another, as in these
| examples from the official DALL-E 2 site:
|
| Yes, the public is so dazzled by this massive leap in
| capability that it hasn't developed a critical eye for
| minor flaws.
|
| Yeah, we get it. It's not instantly perfect. But the fact
| that people aren't moaning that it can't put a teacup in a
| cylinder isn't because everyone stupidly thinks it is
| perfect; it's because not everyone is a miserable
| naysayer.
| RC_ITR wrote:
| "Don't people realize the sunset would be more beautiful
| if you ran it through saturation filters and did white
| balancing?"
| KingOfCoders wrote:
| That's called creativity.
| IIAOPSW wrote:
| I have a phrase I'd like to coin in contrast to AI:
| "Artificial Bullshit". AB.
|
| I of course mean "bullshit" in the highly technical sense
| defined by Frankfurt [1]. The defining feature that
| separates a bullshitter from a liar is that a liar knows
| and understands the truth and intentionally misrepresents
| the matters of fact to further their aims, whereas a
| bullshitter is wholly unconcerned with the truth of the
| matters they are discussing and is only interested in the
| social-game aspect of the conversation. Bullshit is far
| more insidious than a lie, for bullshit can (and often
| does) turn out to be coincident with the truth. When that
| happens the bullshitter goes undetected and is free to
| infect our understanding with more bullshit made up on the
| spot.
|
| DallE generates the images it thinks you want to see. It
| is wholly unconcerned with the actual objects rendered
| that are the ostensible focus of the prompt. In other
| words, it's bullshitting you. It was only trained on how
| to get your approval, not to understand the mechanics of
| the world it is drawing. In other words, we've trained a
| machine to have daddy issues.
|
| A profoundly interesting question (to me) is if there's a
| way to rig a system of "social game reasoning" into
| ordinary logical reasoning.
| Can we construct a Turing Tarpit out of a reasoning
| system with no true/false semantics, a system designed
| only to model people liking/disliking what you say? If
| the answer is yes, then maybe a system like Dalle will
| unexpectedly gain real understanding of what it is
| drawing. If not, systems like Dalle will always be
| Artificial Bullshit.
|
| [1] http://www2.csudh.edu/ccauthen/576f12/frankfurt__harry_-_on_...
| adamsmith143 wrote:
| > In other words, it's bullshitting you. It was only
| trained on how to get your approval, not to understand
| the mechanics of the world it is drawing.
|
| A system can learn to do all kinds of interesting things
| by trying to optimize getting rewards.
|
| See: https://www.deepmind.com/publications/reward-is-enough
| visarga wrote:
| What you call bullshit I call imagination. Both humans and
| AI need it. Humans use imagination to plan ahead. AlphaGo
| was generating moves to plan ahead.
|
| Dall-E and GPT-3 are not being used as agents, they are
| just tool AIs. They have a narrow task - generating images
| and text. Agents, on the other hand, need to learn how to
| act in the environment, while learning to understand the
| world at the same time.
| elefanten wrote:
| How did you get "imagination" out of gp's comment? They
| described something quite clearly and it's not reasonably
| summarized as "imagination".
|
| Per gp, bullshit is cynically self-interested
| pontificating. It's performance. Maybe you could say that
| the bullshit produced is imaginative, sometimes. But it
| has nothing to do with "imagination" as a simulation-like
| capability used for planning and learning.
| TremendousJudge wrote:
| As far as I know, the human brain is just a "social game
| reasoning" optimizer that we try (and fail) to use to do
| actual logical reasoning. The zillion cognitive biases we
| have are the clue: we don't do logic, we have biases and
| sometimes stumble upon logic.
| [deleted]
| gfodor wrote:
| DALL-E either doesn't generate images you want to see, or,
| if it does, it does a bad job, because it generates many
| images you don't want to see.
|
| In other words, the claim you've set up is basically
| unfalsifiable, given that there's no way to form strong
| counterevidence from its outputs. (I would argue that if
| there was, we'd already have it in the vast majority of
| outputs that aren't images people want.)
|
| If I were to refine what you're saying, it's that DALL-E
| is constrained to generating images that _make sense_ to
| the human visual system in a coherent way. This constraint
| is a far cry from what you need to be able to lift it up
| to the claim that it is "bullshitting", though, since this
| constraint is at a very low level in terms of constraining
| outputs.
| ___rubidium___ wrote:
| I think you're right, but I would qualify that the AI is
| bullshitting in the same way that a child's drawing of a
| stick figure, house, and smiling sun is bullshit designed
| to get approval. The AI is giving symbols--very visually
| stunning ones, to be sure, but symbols nonetheless--of
| what it is prompted to create, just like a child learns
| that "circle with lines coming out of it" is a symbol that
| can be read as "sun" and praised by adults.
| IIAOPSW wrote:
| The reason for my final question at the end is this: can
| AI emerge out of AB? The child eventually grows up and
| (hopefully) stops bullshitting everything. But do they
| actually stop bullshitting, or is the human ability to
| reason itself somehow built out of bullshit?
| ___rubidium___ wrote:
| That's a very good question. The child eventually learns to stop
| using symbols and instead begins to copy the shapes, negative
| spaces, curves, etc. in front of them, and makes an assemblage
| of otherwise-meaningless shapes, lines, and curves which is
| interpreted by viewers as a representation of _the real thing_.
| Does AI have the potential to do that? I don't know.
| IshKebab wrote:
| Nobody grows out of your definition of bullshit - doing things
| that they think others want to see.
|
| But that definition is ... well, bullshit. Bullshitting is a
| deliberately deceptive act. Children aren't being deliberately
| deceptive when they come up with nonsense answers to questions
| they don't understand.
| notahacker wrote:
| I don't think the child's drawing of a stick figure, house and
| smiling sun is just "bullshit designed to get approval" though.
| I think they care deeply about the concept of mummy and how
| mummy relates to various things, especially them, and have
| strong emotional impulses towards mummy. Learning that a stick
| figure is a symbolic representation of mummy that might meet
| with approval is a trivially small part of that intrinsic
| interest in mummy, not the extent of their engagement with the
| concept of mummy. And if you tell them you don't approve of
| their drawing, they'll burst into tears and possibly develop a
| completely different attitude towards drawing and/or you, not
| just iterate on the bullshit. Similarly, the kid's concept of
| mummy being happy or sad is a bit more nuanced than which way up
| the crescent under the two dots should be....
| [deleted]
| thfuran wrote:
| >Bullshit is far more insidious than a lie, for bullshit can
| (and often does) turn out to be coincident with the truth. When
| that happens the bullshitter goes undetected and is free to
| infect our understanding with more bullshit made up on the spot.
|
| If the bullshit is turning out to be true, what's the issue with
| more of it? If it's not true but still believed and so causing
| problems, what's the practical difference between it and an
| undetected lie that makes it more insidious?
| IIAOPSW wrote:
| A lie is anti-correlated with the truth. Bullshit is
| uncorrelated with the truth. The problem with trusting a coin
| flip because it happened to be right this time is that now you
| trust the coin as a source of information, when really it's a
| source of entropy. The lie will be found out just as soon as the
| truth comes out, whereas the bullshitter might get enough lucky
| flips in a row that people start taking their advice seriously.
| garyrob wrote:
| I agree, but I disagree about one aspect. For the most part,
| humans don't use reason all that much or all that deeply. We
| usually use intuitive thinking, and there is research showing
| that immediate intuitive responses are often better than the
| result of long thinking. More negatively, is QAnon belief, or
| even Trump election-claim belief, about reason? Or is it about
| associations between words and concepts, especially when those
| concepts are believed in by the people the believer tends to
| trust and associate with?
|
| In other words, the takeaway here may not be that GPT-3 spews
| bullshit. It may be that most of the time, human "thinking" is a
| less-nuanced, biological version of GPT-3.
| thaw13579 wrote:
| To me, Dall-E seems analogous to a film production team that
| produces visual imagery reflecting a script written by a
| screenwriter.
By the above reasoning, would that team be producing "bullshit"?
| I think most people would think not, because the goal isn't to
| communicate objective truth about the world, rather something
| plausible, interesting, entertaining, etc. (unless it is a
| documentary).
|
| I also think distinguishing bullshit from lying depends heavily
| on internal mental thoughts, goals, and intentions. Isn't
| talking about Dall-E this way personification, ascribing some
| level of consciousness?
| croes wrote:
| The production team knows what it puts together; Dall-E
| seemingly does not.
| thaw13579 wrote:
| Exactly my second point. Producing "bullshit" requires some
| internal thoughts and intentions; as the parent put it, "DallE
| generates the images it _thinks_ you want to see" (emphasis
| mine). A very big claim, which is interesting to consider, but
| probably premature.
|
| However, I do think that Dall-E is able to learn complex
| high-order statistical associations, i.e. beyond just
| juxtaposing and visually blending objects. For a recent example,
| this post with the prompt "ring side photo of battlebots vs
| conor mcgregor":
|
| https://twitter.com/weirddalle/status/1554534469129871365
|
| What is amazing here is the excessive blood and gore. That
| feature can't be found in any individual battlebot or MMA match,
| but it is exactly what you would expect from robots fighting a
| person. Pretty amazing, and I wonder at what point we could
| consider this analytical reasoning.
| dougmwne wrote:
| I reached essentially the same conclusion after playing with
| GPT-3 for a while. It spins out mountains of convincing and
| impressive bullshit, but you can't actually trust anything it
| says because it is disconnected from right, wrong, correct and
| incorrect. Reading too much of what it outputs is dangerous
| because it basically feeds white noise into your perception and
| experience of the world.
| acoard wrote:
| I'm a big fan of Frankfurt's "On Bullshit", and love the
| reference.
|
| I think there's one significant distinction between the normal
| human bullshitter that Frankfurt originally envisioned and the
| AI practicing Artificial Bullshit. The bullshitter knows there
| is truth and intentionally disregards it, whereas the AI is
| blind to the concept. I guess this is "mens rea" in a sense: the
| human is conscious of their guilt (even if they're apathetic
| towards it), whereas DALL-E is just a tool that does what it is
| programmed to do.
|
| I do like this application of "bullshit" though, and will keep
| it in mind going forward.
| IIAOPSW wrote:
| Interesting, but I have two follow-ups.
|
| 1. What are the implications of intentionally disregarding the
| existence of truth vs being blind to the concept? How does this
| distinction you made manifest?
|
| 2. Are you sure all humans actually believe in the concept of
| truth, or could it be the case that some people genuinely
| function on the principle "there is no truth, only power"? Is it
| possible to think "truth" and "current dominant narrative" are
| one and the same?
|
| I've certainly had a ton of luck with Bullshit in Diplomacy. As
| Russia, I offered a plan that involved France helping me take
| Munich, and I would repay him by supporting him against the
| English invasion. Did I intend to actually follow through, or
| was this a cunning lie? Neither. It was bullshit that got me
| into Munich. I myself didn't know, because (in game) I don't
| believe in the concept of truth. Everything I say is true and
| none of it is true.
It's all true in the sense that it might happen and I put some
| weight on it; none of it is true in the sense that there is no
| branch of the future game tree privileged as "the truth". Some
| truths have more weight than others, but there is no underlying
| absolute truth value that exists and that I merely choose to
| ignore. Eventually the order box forces me to pick one truth out
| of the many I have told. But prior to being forced, it didn't
| exist.
|
| Is it possible to think in this way all the time, about
| everything? Maybe.
| samstave wrote:
| > _conscious of their guilt_
|
| This hits it for me:
|
| Consciousness is kinda "being aware of the fact that you have a
| choice of available actions, and of the impact these actions or
| non-actions will have on yourself, your environment, the object
| of your action, or others."
|
| Intelligence is being aware of the inputs and knowing the
| (non)available list of actions to take.
|
| Intelligence acts on stimuli/input/data?
|
| Consciousness is awareness of one's own actions from
| intelligence, and of others' acts from their standpoint of
| intelligence or consciousness...
|
| A yin/yang, subjective/objective sort of duality that humans
| make (thought vs emotion).
|
| Dogs are both intelligent and conscious. They know guilt when
| they are shamed, or happiness when praised for intelligent
| actions.
| leereeves wrote:
| > It is wholly unconcerned with the actual objects rendered that
| are the ostensible focus of the prompt.
|
| I disagree. To the extent that the training data are images of
| actual objects, recreating images of actual objects is the only
| thing DALL-E cares about.
|
| If we define "caring" about something as changing behavior to
| cause that thing to happen, then a neural network doesn't "care"
| about inference at all, because inference never changes the
| network's behavior.
|
| It also doesn't know or care about your approval. It only cares
| about minimizing the loss function.
|
| (But now that you bring this up, I think it would be really
| interesting to create a network that, after training initially
| on training data, began interacting with people and continued
| training to maximize approval.)
| mjburgess wrote:
| Images of objects aren't objects; they capture only the objects'
| 2D geometrical properties projected onto the image plane. These
| properties are correlated across images, inexplicably, since the
| machine has no access to the physical properties of those
| objects, nor to the laws of physics.
| leereeves wrote:
| It's hypothesized (and, I think, likely) that the machine learns
| some of those hidden/latent properties, though, obviously, only
| to the extent that they are useful to recreate the images.
|
| In any case, it does at least care about the appearance of the
| actual objects. So I think it would be fair to say that there
| are aspects of the actual objects that the network doesn't care
| about, but there are also aspects that it cares very much about.
| Thus it's not "wholly unconcerned with the actual objects".
| MauranKilom wrote:
| > It's hypothesized (and, I think, likely) that the machine
| learns some of those hidden/latent properties, though,
| obviously, only to the extent that they are useful to recreate
| the images.
|
| Notably, this precisely describes humans too. We don't _know_
| the "true" properties of anything we interact with. We just
| have models - some more sophisticated than others - but only to
| the extent that we care to reason about the objects.
From "This stone is too | heavy to lift." to "Oh yeah Becky is always late." | IIAOPSW wrote: | My gripe with this is that it has no understanding of | what the objects are beyond "the thing that's supposed to | be output". Its a textbook case of the Chinese room | argument [1]. Sure it can follow translation rules, even | infer some sophisticated rules and patterns about the | slips of paper that are passed in, but it still has no | idea what any of it means. | | https://en.wikipedia.org/wiki/Chinese_room | | Dalle is very concerned with the 2d shapes of things we | call objects, and has correctly inferred some rules about | those shapes, but it neither knows nor cares about the | things we call objects and how the shapes it has learned | are representations of them. It doesn't do reasoning | about the round peg fitting in the round hole. It just | glues a few pegs and holes together in a way that's | mostly consistent with our expectations and says "is this | what you wanted"? | | Its a student that cares about passing the test, not | learning the material. | leereeves wrote: | Isn't that true of all of us? We all care selectively | about certain attributes of objects. | | I care that my car is quiet and has comfortable seats, I | don't care (or know) what material the muffler is made | of, but somewhere there is an engineer who cared about | that. | | A road designer cares what the tires are made of and how | much it weighs, but doesn't care what color the paint is. | | An AI recreating an image of my car would care what color | the paint is, but not how comfortable the seats are. | | I think I see what you're describing - the AI has a very | limited scope and doesn't know or care about most of the | things we do - but I think that's just a limitation of | our current small models and limited training data, not | an inherent limitation of neural networks. | IIAOPSW wrote: | You may not know what material the muffler is made of, | but you know its made of a material. You "care about the | objects themselves" in the sense of having this framework | of objects, all of which have certain properties and | relations to each other which form a predictive model you | have about the world. You expect everything to have a | position in x,y,z, to have a size, shape, mass, hardness | etc. You might not care about a particular property of a | particular object in a particular context, but you | fundamentally think in the semantics of objects. You know | what to expect when the heavy hard object meets the | brittle lite object. | | DallE doesn't really have a system of the semantics of | objects. It doesn't know why it would be useless for a | building to have all the doors on the 2nd level. Its not | even clear that DallE makes use of discrete "objects" in | its limited understanding. | | Here's an example from budget DallE | | https://ibb.co/B2B5zky | | It understood the shape of "stick figure" and "boobs", | but had no understanding of what a stick figure is meant | to represent and thus where it should place the boobs. | The results are hilarious. I'm not sure which I like | more, the guy with a boob casually walking down his arm, | or the lady with a boob head that's shrugging with | uncertainty. | icehawk wrote: | Now explain how that's different than anything you look at. | | - Your visual system only has access to the 2D geometrical | properties projected on your retina. The properties are | correlated across images, inexplicably. 
(I certainly cannot explain what a chair is in a fashion that
| includes all chairs I've encountered and excludes everything
| I've encountered that is not a chair.)
|
| - Any other interaction is also a correlation.
|
| - Humans don't have access to the laws of physics, just
| reasonable approximations in certain contexts.
| Barrin92 wrote:
| >Now explain how that's different from anything you look at.
|
| For starters, you don't just look at things; you're embedded in
| the world. You have sensory input far beyond visual information,
| and you also have something akin to cybernetic feedback in
| response to your mechanical actions in the world, which DALL-E
| has not.
|
| In fact, DALL-E doesn't even have access to visual information
| in the same sense you have, which is to a large extent
| biochemical and analog, not digital.
| Earw0rm wrote:
| Where this gets interesting is when the AI attempts to create
| animations or moving images - this requires a much deeper
| understanding of objects and their physical properties, even to
| create a highly stylised or "drawn" animation.
| burlesona wrote:
| Slight side tangent, but reading this article it hit me how much
| this generation of work may be reinforcing English as the global
| language for generations to come. It seems like we are headed
| towards a phase of technology where learning how to feed
| well-crafted prompts into the AI system will be a highly
| valuable productivity skill. And since the major AI systems seem
| to be built around English, that would make English fluency even
| more valuable than it already is. I'm sure that's obvious to
| non-native speakers who have worked hard to master English; I
| just hadn't thought of it before.
|
| Less likely but still interesting, I wonder if the way we're
| building these models will at some point begin to layer on top
| of each other such that English as it is used _now_ becomes
| something deeply embedded in AI, and whether that will evolve
| with the spoken language or not. It's funny to imagine a future
| where people would need to master an archaic flavor of English
| to get the best results working with their AI helpers.
| affgrff2 wrote:
| But the same work is also removing language barriers at the same
| time with really good translation tools. I'd rather guess that
| being fluent in English will not be as important as it is now.
|
| Edit: the same work = transformer-based language models
| Ajedi32 wrote:
| Perhaps, but another possibility is that the more advanced
| models all end up being polyglots. The state of the art in
| machine translation already uses a single model trained on
| multiple languages[1], which results in better translations
| between languages it doesn't have a lot of examples for. If the
| same principle applies to other types of models, then training
| them on every possible dataset available regardless of language
| might yield better results. That could result in models that are
| fluent in hundreds of languages. (I'd be curious whether DALL-E
| understands prompts in languages other than English; has anyone
| tried? A single-model translation sketch follows below.)
|
| [1]: https://ai.googleblog.com/2019/10/exploring-massively-
| multil...
| visarga wrote:
| BLOOM is the most recent model trained on multi-lingual data -
| 46 natural languages and 13 programming languages.
| Workaccount2 wrote:
| I envision the skill of giving good AI prompts to be as
| short-lived as the skill of T9 texting. Probably even shorter.
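  [A sketch of Ajedi32's "single model, many languages" point, using
  the publicly released M2M100 multilingual translation model via
  Hugging Face transformers. The checkpoint name and language codes
  follow the M2M100 release; treat the details as assumptions rather
  than a tested recipe.

      # One model covers ~100 languages: here, French -> English.
      from transformers import (M2M100ForConditionalGeneration,
                                M2M100Tokenizer)

      model = M2M100ForConditionalGeneration.from_pretrained(
          "facebook/m2m100_418M")
      tokenizer = M2M100Tokenizer.from_pretrained(
          "facebook/m2m100_418M")

      tokenizer.src_lang = "fr"              # source language code
      encoded = tokenizer("La vie est belle.", return_tensors="pt")
      generated = model.generate(
          **encoded,
          forced_bos_token_id=tokenizer.get_lang_id("en"),  # target
      )
      print(tokenizer.batch_decode(generated,
                                   skip_special_tokens=True))

  Swapping src_lang and the target language id covers any supported
  pair, which is what makes a single polyglot model attractive.]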
| TillE wrote:
| Right, I would expect the rise of a bunch of domain-specific
| tools within the next few years.
| planede wrote:
| I wonder if these models would perform better if they were
| trained on another language.
| qayxc wrote:
| I wonder the same thing. Many languages are far better at making
| precise statements, due to stricter and more complex grammar.
| The reduction in ambiguity could help models perform better.
|
| Then there's also the problem of labelling training data. Most
| of the labelling and annotating is outsourced to countries with
| cheap labour and performed by non-native speakers, which leads
| to problems with mislabelled training data.
| visarga wrote:
| DallE is trained on web-scraped data; there is no manual
| labelling taking place, except maybe to filter out some types of
| content (see the filtering sketch below).
|
| A few years ago we were training captioning models on manually
| labelled datasets (such as COCO captions), but they were small
| and the models were not very general.
| ffhhj wrote:
| > reinforcing English as the global language
|
| Which might also result in new speakers modifying English to fit
| their cultures (like Blade Runner's Cityspeak), and
| Global-English speakers not understanding "secret" foreign
| communication, so they might create new languages for their own
| subcultures, relegating English to being the new Latin for
| technical knowledge (Latin was kept by the Catholic Church).
| mtlmtlmtlmtl wrote:
| Also worth noting that the internet has already massively
| accelerated the importance of English.
|
| As an ESL speaker who grew up on the internet, Norwegian was
| more or less useless to me outside school and family. Most of my
| time was spent on the internet, reading and writing lots of
| English. Norwegian Wikipedia is pretty much useless unless you
| don't know English. That's still true today for the vast
| majority of articles, but back then it was universally the case.
|
| There were Norwegian forums, but with a population of just 4
| million and change at the time, they were never as interesting
| or active as international/American forums and IRC channels.
|
| In fact, I'd say Norwegian is only my native language in spoken
| form, whereas English feels more natural to me to write and
| read. It doesn't help that Norwegian has two very divergent
| written forms, either.
|
| I even write my private notes in English, even though I will be
| the only one reading them.
| bearjaws wrote:
| Covid and remote work are going to make this even more true. At
| my new job I work with a Nigerian co-worker who lives in
| Nigeria; he speaks clear and proper English and earns an amazing
| salary for his area.
|
| If you live in a third-world country, you could really benefit
| from remote work going forward, and English will be a popular
| language to learn for that. That being said, I know some people
| will "phone it in" and not speak as clearly, which will put them
| at a disadvantage.
| TheRealPomax wrote:
| "Yes." And that's why question headlines may not technically be
| clickbait, but they still qualify as something the world could
| do without.
| prometheus76 wrote:
| I want to see two DALL-E bots talk to each other. First bot:
| "Pete and Repeat were sitting on a fence. Pete fell off. Who was
| left?" Second bot: "Repeat." First bot: "Pete and Repeat were
| sitting on a fence. Pete fell off. Who was left?"
|
| What happens next?
| mkl wrote:
| I think you're thinking of GPT-3. DALL-E and DALL-E 2 generate
| images, not text (and they're very bad at images of text).
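  [To make visarga's point about web-scraped training data concrete:
  large image/text datasets are typically filtered automatically
  rather than labelled by hand, e.g. by checking how well the
  scraped alt-text matches its image. A sketch in the spirit of the
  LAION pipeline, using OpenAI's released CLIP model; the file name
  and the 0.3 cutoff are illustrative assumptions.

      # Score image/alt-text agreement with CLIP; low-similarity
      # pairs would be dropped from the training set.
      import torch
      import clip                  # https://github.com/openai/CLIP
      from PIL import Image

      device = "cpu"               # use "cuda" if available
      model, preprocess = clip.load("ViT-B/32", device=device)
      image = preprocess(Image.open("scraped.jpg")).unsqueeze(0)
      image = image.to(device)
      text = clip.tokenize(["alt text that came with the image"])
      text = text.to(device)

      with torch.no_grad():
          image_emb = model.encode_image(image)
          text_emb = model.encode_text(text)
          sim = torch.cosine_similarity(image_emb, text_emb).item()

      keep = sim > 0.3             # illustrative cutoff
      print(sim, keep)

  No human ever labels the pair; the model doing the "labelling" was
  itself trained on scraped pairs.]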
| prometheus76 wrote:
| Ah, I'm sorry. Well, the curiosity remains!
| xornox wrote:
| At the very least, a lot of the data needed for human-like
| intelligence is missing. If a human were trained on the same
| input data, I don't think he/she would understand the
| relationships either.
___________________________________________________________________
(page generated 2022-08-04 23:00 UTC)