[HN Gopher] Is DALL-E 2 'gluing things together' without underst...
       ___________________________________________________________________
        
       Is DALL-E 2 'gluing things together' without understanding their
       relationships?
        
       Author : Hard_Space
       Score  : 172 points
       Date   : 2022-08-04 13:24 UTC (9 hours ago)
        
 (HTM) web link (www.unite.ai)
 (TXT) w3m dump (www.unite.ai)
        
       | rasz wrote:
        | DALL-E is fancy content-aware fill on steroids (style
        | translation etc.).
        
       | avalys wrote:
       | I think DALL-E is clearly just gluing things together that it
       | found in a massive dataset, and doesn't have any understanding of
       | the underlying concepts. I thought it was easy to see the signs
       | of this in examining its output. Same for GPT-3.
       | 
        | However, what's amazing to me about DALL-E and these other
        | statistical, generative models is that they've made me think
        | about how much of my daily thought process is actually just
        | gluing things together from some kind of fuzzy statistical
        | model in my head.
       | 
       | When I see an acquaintance on the street, I don't carefully
       | consider and "think" about what to say to them. I just blurt
       | something out from some database of stock greetings in my head -
        | which are probably based on and weighted by how people have
        | reacted in the past, which similar greetings my own friends
        | have used, and what "cool" people say on TV and in other media
        | in similar circumstances. "Hey man how's it going?"
       | 
        | If I were asked to draw an airplane, I wouldn't "think" about
        | what an airplane looks like from first principles - I could
        | just synthesize one in my head and start drawing. There are
        | tons of daily activities like this that don't involve anything
        | I'd call "intelligent thought." I have several relatives who,
        | in the realm of political thought, don't seem to have anything
        | more in their head than a GPT-3 model trained on Fox News
        | (which, just like GPT-3, can't detect any logical
        | contradictions between sentences).
       | 
       | DALL-E has convinced me that even current deep learning models
       | are probably very close to replicating the performance of a
       | significant part of my brain. Not the most important part or the
       | most "human" part, perhaps. But I don't see any major conceptual
       | roadblock between that part and what we call conscious,
       | intelligent thought. Just many more layers of connectivity,
       | abstraction, and training data.
       | 
       | Before DALL-E I didn't believe that simply throwing more compute
       | at the AGI problem would one day solve it. Now I do.
        
         | multibit wrote:
          | >I have several relatives who, in the realm of political
          | thought, don't seem to have anything more in their head than
          | a GPT-3 model trained on Fox News (which, just like GPT-3,
          | can't detect any logical contradictions between sentences).
         | 
         | If more people were to realize that we're all probably like
         | this, trained on some particular dataset (like mainstream vs
         | reactionary news/opinion), I wonder if that would lead to a
         | kind of common peace and understanding, perhaps stemming only
         | from a deep nihilism.
        
         | dougmwne wrote:
          | I think our intelligence has two components: our conscious
          | selves, which are good at formal reasoning and meaning-
          | making, and a massive neural network that is good at
          | statistical inference. I suspect the reason we need less
          | training data than DALL-E is that our conscious selves can
          | tease out the formal relationships and models, rather than
          | relying purely on inference, which requires enough
          | datapoints.
        
       | AbraKdabra wrote:
        | I mean, if it is "gluing things together", it's doing it
        | pretty well...
        
       | QuadmasterXLII wrote:
       | My intuition is that DALL-E is more a demonstration of how hard
       | image synthesis is for humans, than how intelligent the algorithm
       | is. The image generation models have orders of magnitude fewer
       | parameters than the large language models.
        
       | bawolff wrote:
       | AI system is good at what it was trained to do and bad at what it
       | was not trained to do! News at 11!
        
       | nodja wrote:
        | I've been following the image generation field for a couple of
        | months now, and while the answer to the title is "yes, for
        | most things", it is easily fixed: use a better text encoder.
       | 
        | My favorite anecdote showing why a text encoder that actually
        | understands the world matters for image generation: when
        | querying for "barack obama" on a model whose training set has
        | never seen Barack Obama, the model somehow generates images of
        | random black men wearing suits[1]. This is, in my non-expert
        | opinion, a clear indication that the text model's knowledge of
        | the world is leaking through to the image generator. So if my
        | understanding is right, as long as a concept can be
        | represented properly in the text embeddings of a model, the
        | image generation will be able to use it.
       | 
        | If my anecdote doesn't convince you, consider that one of
        | Google's findings in the Imagen paper was that increasing the
        | size of the text encoder had a much bigger effect not only on
        | the quality of the image, but also on how faithfully the image
        | follows the prompt, including the image generator being able
        | to spell words.
       | 
       | I think the next big step in the text to image generation field,
       | aside from the current efforts to optimize the diffusion models,
       | will be to train an efficient text encoder that can generate high
       | quality embeddings.
       | 
       | [1] Results of querying "barack obama" to an early version of
       | cene555's imagen reproduction effort.
       | https://i.imgur.com/oUo3QdF.png
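        | 
        | A minimal sketch of what I mean (hedged: this probes a public
        | CLIP text encoder via the HuggingFace transformers wrappers,
        | not DALL-E 2's internal encoder; the checkpoint name is just
        | one example):
        | 
        |   # pip install torch transformers
        |   import torch
        |   from transformers import CLIPModel, CLIPProcessor
        | 
        |   name = "openai/clip-vit-base-patch32"
        |   model = CLIPModel.from_pretrained(name)
        |   processor = CLIPProcessor.from_pretrained(name)
        | 
        |   texts = ["barack obama",
        |            "an older black man in a suit giving a speech",
        |            "a bowl of fruit"]
        |   inputs = processor(text=texts, return_tensors="pt",
        |                      padding=True)
        |   with torch.no_grad():
        |       emb = model.get_text_features(**inputs)
        |   emb = emb / emb.norm(dim=-1, keepdim=True)  # unit vectors
        | 
        |   # cosine similarity: the text encoder carries the context
        |   # around the name even though it never generates pixels
        |   print((emb[0] @ emb[1]).item())  # relatively high
        |   print((emb[0] @ emb[2]).item())  # much lower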
        
         | snek_case wrote:
          | > when querying for "barack obama" on a model whose training
          | set has never seen Barack Obama, the model somehow generates
          | images of random black men wearing suits[1]. This is, in my
          | non-expert opinion, a clear indication that the text model's
          | knowledge of the world is leaking through to the image
          | generator.
         | 
         | That's super interesting. It's not just black men in suits
         | either. It's older black men, with the American flag in the
         | background who look like they might be speaking. Clearly the
         | model has a pretty in-depth knowledge of the context
         | surrounding Barack Obama.
         | 
         | I would say the image generation model is also doing a pretty
         | great job at stitching those concepts together in a way that's
         | coherent. It's not a random jumble. It's kind of what you would
         | expect if you asked a human artist to draw a black American
         | president.
        
       | LesZedCB wrote:
       | it does understand their relationships.
       | 
        | the difference is that the subjects of the domain it learns
        | about exist purely as two-dimensional images.
       | 
       | once these models get larger and include wider ranges of sensory
       | data beyond just imagery (as can be seen with models like GATO),
       | they are clearly better able to "glue together" concepts across
       | multiple domains.
       | 
       | i would argue we absolutely do nothing different with regard to
       | 'gluing things together'. we just have a wider range of sensory
       | inputs.
        
       | throwaway675309 wrote:
       | Yes, SHRDLU it most definitely is _not_.
        
       | upupandup wrote:
        | Blake Lemoine claimed that Google's chatbot was sentient,
        | which I disagreed with, and this article demonstrates it. AI
        | can be optimized to respond in a way that can easily fool
        | someone into thinking they are talking to a human, but at the
        | end of the day sentience requires consciousness, and that is
        | not something that can be digitally produced.
        | 
        | You can teach a parrot to respond to basic arithmetic, but it
        | is not aware of the concept of math; rather it is acting along
        | pathways trained to induce the desired response.
        | 
        | A truly conscious entity would simply have a mind of its own
        | and would not do our bidding, just like any other human. It
        | would be extremely selfish and apathetic; the idea that a
        | bunch of GPUs sitting in a datacenter is sentient is sheer
        | lunacy.
        | 
        | Blake Lemoine will not be the last; there are always those who
        | seek the limelight with zero regard for authenticity. Such is
        | sentient behavior.
        
       | boredumb wrote:
        | GPT/DALL-E/etc... All of these models are of course gluing
        | things together in some manner, but who cares? That's the
        | point, right? The AI pill I've taken is that you don't need
        | AGI in order to make things that are useful for people and
        | businesses. If you've ever run a business and had to dive into
        | creatives for blogs, SEO content, social media posts, etc.,
        | then you either spent an inordinate amount of time creating
        | them or outsourced them, and in both cases the final copy is
        | NOT going to win you a literary prize any time soon, but it is
        | absolutely enough to inform potential customers, start ranking
        | on Google and start gaining social media clout. GPT will not
        | garner you a literary award either, but it can absolutely get
        | you quality copy, and users, customers, Google, and Facebook
        | will be none the wiser that you generated it with AI instead
        | of paying a third party to hack it together for you.
       | 
       | (https://neuralmates.com/ I recently started putting together a
       | web app to MVP this, and I hope to be able to integrate DALLE-2
       | soon to be able to start generating images for folks as well.)
        
         | lancesells wrote:
          | This is the kind of response that turns me off all of these
          | amazing technologies. Blanketing the world with low-level
          | content in the hope of gaining ranking and clout sounds
          | awful.
         | 
         | On the flip side I'm guessing we'll have some gpt-3/dall-e
         | blocker extensions that help reduce some of it.
        
           | boredumb wrote:
            | Sure, but it's the reality of it.
            | 
            | Also - you already live in this world, but it's fueled by
            | low-cost copywriters and ghost accounts on Fiverr. I'd bet
            | you are going to see the low-water mark for content
            | increase quite a bit in quality and volume over the next
            | few years, due to GPT-3 being leagues better than the
            | current state of content mills.
        
       | kache_ wrote:
       | define understand
        
       | kazinator wrote:
       | The authors here underestimate how much semantics is involved in
       | gluing things together.
        
       | simion314 wrote:
        | There were so many articles about this project, but I can't
        | find a demo page to test 3-4 outputs myself (unless it was
        | some page I already tried when it was linked, where I got
        | nothing similar to what is shown on the main page).
        
       | ___rubidium___ wrote:
       | Fascinating. I've noticed the same problem with Midjourney and
       | Nightcafe as well.
        
       | mellowside wrote:
        | I've seen different results than the author. I've had access
        | since it was in beta and learned that it needs better prompts
        | to generate better images. As someone on their Discord said,
        | "prompt writing is a craft you need to learn".
        
       | aaroninsf wrote:
        | Strong recommend for the book _The Alignment Problem_, which,
        | while only somewhat touching on systems like these, is an
        | excellent lay overview of the state of ML that makes a good
        | effort to explain the principles involved, the challenges and
        | blockers, and the ways in which progress has been made.
        | 
        | The field is moving so quickly that the book stops short of
        | the current status quo, but it's still remarkable.
       | 
       | I have been myself playing with MidJourney, which like DALL-E 2
       | is a text prompt to image generation system; it has different
       | goals and favors aestheticized output over photorealism.
       | 
        | The premises of that project and its current execution (as an
        | explicit exercise in collective- rather than siloed-and-
        | contractual relations) are genuinely remarkable and I believe
        | represent a watershed. The rate of evolution of what it is
        | doing is something to behold.
       | 
       | I have generated around 7500 images with MidJourney so far and am
       | slowly developing what feels like an intuition for the way it
       | "sees" images, which as the devs say in regular office hours, is
       | unlike how humans see them.
       | 
       | The limitations, and superpowers, of the system as it exists are
       | already deeply provocative. When things scale again to the next
       | level, the degree of uncanniness and challenge to our
       | preconceptions about the nature and locus of intelligence in
       | ourselves may be genuinely shaking.
       | 
       | Or so I currently think.
       | 
        | I highly recommend taking time to really feel out these
        | systems, because the ways they do and do not succeed and fail
        | serve as a very potent first-hand education in the
        | opportunities and, perhaps much more important, the perils of
        | their application in other more quotidian areas.
        | 
        | It's one thing for them to reliably produce "nightmare fuel"
        | because of their inability to retain very high-level coherence
        | down through low-level details when they are drawing limbs,
        | hands, faces...
       | 
       | ...it's another thing entirely when analogous failure modes
       | quietly permeate their ability to recognize illness, or, approve
       | a loan, or recommend an interest rate.
       | 
        | Or - as the example which opens _The Alignment Problem_ spells
        | out - recommend whether someone should or should not be
        | offered bail. (A real-world example with life-changing
        | consequences for people who interact with ML in this way, in
        | something over 30 states today... at least, as of publication.)
        
       | siglesias wrote:
       | I discovered something like this recently when I tried the prompt
       | "man throwing his smartphone into a river," and for the life of
       | me I could not get DALL-E to render the phone separated from the
       | hand (I tried "like a boomerang," "tossing," "into an ocean,"
       | "like a baseball," etc). And then it occurred to me that by the
       | training data, there are virtually no pictures of a person and a
       | phone where the phone is separated! So DALL-E might have thought
       | that the phone was just an appendage to the body, the way the
       | hand is (which, what does this say about society!). I might as
       | well have asked DALL-E to render someone throwing their elbow
       | into a river.
       | 
       | Another interesting case is animal-on-animal interactions. A
       | prompt like, "small french bulldog confronts a deer in the woods"
       | often yields weird things like the bulldog donning antlers! As
       | far as the algorithm is concerned, it sees a bulldog, ticking the
       | box for it, and it sees the antlers, ticking the box for "deer."
       | The semantics don't seem to be fully formed.
        
         | gwern wrote:
         | I dunno man, I punched that exact prompt ("man throwing his
         | smartphone into a river") in DALL-E 2 just now, and in 2/4
         | samples, the smartphone is clearly separate from the hand:
         | labs.openai.com/s/uIldzs2efWWnm3i9XjsHI7or
         | labs.openai.com/s/jSk4qhAxSiL7QJo7zeGp6m9f
         | 
         | > The semantics don't seem to be fully formed.
         | 
         | Yes, not so much 'formed' as 'formed and then scrambled'. This
         | is due to unCLIP, as clearly documented in the DALL-E 2 paper,
         | and even clearer when you contrast to the GLIDE paper (which
         | DALL-E 2 is based on) or Imagen or Parti. Injecting the
          | contrastive embedding to override a regular embedding trades
          | off semantics for visual creativity/diversity, so if you
          | insist on exact semantics, DALL-E 2 samples are only a lower
          | bound on what the model can do. It does a reasonable job,
          | better than many systems up until about last year, but not
          | as good as it could be if you weren't forced to use unCLIP.
          | You're only seeing
         | what it can do after being scrambled through unCLIP. (This is
         | why Imagen or Parti can accurately pull off what feels like
         | absurdly complex descriptions - seriously, look at the examples
         | in their papers! - but people also tend to describe them as
         | 'bland'.)
        
           | visarga wrote:
           | If you want multiple objects, each with individual
           | attributes, the unCLIP model still has to make a single
           | embedding vector representation. But the single vector is too
           | small to contain an ever more detailed scene description.
           | That's why it has failure modes like assigning the wrong
           | colour to cubes and not being able to spell text.
           | 
            | On the other hand, the previous approach - autoregressive
            | generation - allows full access to the prompt through the
            | attention mechanism.
           | 
           | For example Imagen encodes text to a sequence of embeddings.
           | 
           | > Imagen comprises a frozen T5-XXL [52] encoder to map input
           | text into a sequence of embeddings and a 64x64 image
           | diffusion model, followed by two super-resolution diffusion
           | models
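            | 
            | A rough way to see the shape difference (a sketch, not
            | either paper's actual code: "t5-small" stands in for
            | Imagen's T5-XXL, and the mean-pool here just illustrates
            | squeezing the prompt into one vector - unCLIP actually
            | uses CLIP's own text embedding):
            | 
            |   # pip install torch transformers sentencepiece
            |   from transformers import T5Tokenizer, T5EncoderModel
            | 
            |   tok = T5Tokenizer.from_pretrained("t5-small")
            |   enc = T5EncoderModel.from_pretrained("t5-small")
            | 
            |   prompt = "a red cube on top of a blue cube"
            |   ids = tok(prompt, return_tensors="pt")
            |   seq = enc(**ids).last_hidden_state  # (1, n_tokens, d)
            |   pooled = seq.mean(dim=1)            # (1, d)
            | 
            |   # a generator cross-attending over `seq` can still tell
            |   # which colour goes with which cube; one conditioned
            |   # only on `pooled` must cram the scene into one vector
            |   print(seq.shape, pooled.shape)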
        
       | anothernewdude wrote:
        | Certainly seems that way, but I didn't get to use DALL-E 2
        | because it's a Closed-AI.
        
       | 2OEH8eoCRo0 wrote:
       | What are we trying to accomplish with AI? What is "real"
       | intelligence? What are we hoping that an AI does? What test does
       | it need to pass?
        
       | roland00 wrote:
        | Isn't this just chaos theory, calculus, and the problem of
        | jerks (the third derivative of position with respect to time)?
        | 
        | When you slice complex reality into smaller pieces, within
        | each smaller piece you have a rough idea of velocity and of
        | changes in velocity (aka acceleration), but you have no idea
        | of future speed bumps, aka the jerks, for that information is
        | outside the frame of reference you created when you divided
        | reality into smaller pieces.
        | 
        | Thus you have pictures of people/objects in systems, but you
        | are not truly understanding relationships, and thus you miss
        | things even though you feel like you see things. It is all a
        | theme park for our own amusement; it is not real, only
        | hyper-real, which becomes uncanny when we start noticing how
        | the images are off.
        
       | BiteCode_dev wrote:
       | Amusingly, I get creative new ideas when I glue things together
       | and ignore what I thought I understood about their relationships.
        
         | TremendousJudge wrote:
         | Happens to me too -- it's a great way to make new things.
         | However, the "creation" I'd argue happens when you look at the
         | pile of random stuff and generate a new understanding, and
         | decide that it is valuable. The difference between trash and
         | art only exists in the head of the artist. Same thing happens
         | with DALL-E output, really.
        
       | gizajob wrote:
       | I recently used it to create a picture of "an analogue
       | synthesizer made out of meat and gristle". I don't really know
       | how you could glue these concepts together, because nothing in
       | reality causes them to ever go together, but it did a great job
       | anyway.
        
       | JoshTko wrote:
       | Do humans assume things without understanding their
       | relationships?
        
       | can16358p wrote:
       | Regardless of the content, the article has been "published 10
       | seconds ago" for the last few minutes, and has been submitted to
       | HN 1 hour ago.
        
       | otikik wrote:
       | How does one define "understand their relationships"?
       | 
       | To me it is a matter of degrees and has multiple axes.
       | 
       | When my 6yo son draws a chair, it's not the same as when Van Gogh
       | draws one, which is different to when an expert furniture
       | designer draws one. They all "understand" in different ways. A
        | machine can also "understand". It might do it to different
        | degrees and across different axes than the ones humans usually
        | have, that's all. How we transform that understanding into
        | action is what is important, I think.
        
       | dezmou wrote:
        | Another cool example is when you ask DALL-E to make a "Jaguar
        | XF": it does not really know if you're talking about the car
        | or the animal, yet it manages to blend the two together
        | https://img-comment-fun.9cache.com/media/a6wG6Kq/aq04bGmR_70...
        
       | dukeofdoom wrote:
        | Using Midjourney, I'm fascinated with the geometric shapes it
        | comes up with. Just yesterday I was looking at a sort of
        | hexagon made of intersecting hearts. Not only was it
        | aesthetically pleasing, it was such an unusual composition
        | that I think it would be extremely hard for an artist to come
        | up with, maybe even a mathematician. So I think it's already
        | capable of doing more sophisticated things than human artists.
        
         | Cognitron wrote:
         | MidJourney is really good at creating abstract images. I've
         | been playing with the multi-prompt syntax (::) a lot lately,
         | just combining different words that evoke form or texture to
         | see what it will come up with. I start by testing things one at
         | a time and just take a guess at what might look cool together.
         | I end up making strange images with prompts like "fractal ::
         | stained glass :: feathers", "antelope canyon :: tattered fabric
         | :: cave", "rubik's cube :: Japanese woodblock", you get the
         | idea. I rarely try to make pictures of real things anymore.
         | Definitely fun to play around with.
        
           | dukeofdoom wrote:
            | I'm pretty new to it. Any tips on making retro pixel art
            | assets? I want a stone or brick wall I could tile into a
            | game I'm making.
        
             | Cognitron wrote:
             | I think most of us are pretty new to it since they just
             | opened the beta to a ton of people. I haven't tried to do
             | anything like that with it though. All I can say is it
             | tends to take a lot of experimentation to get what you
             | want. And you'll probably have to do some work to the
             | images to get them to tile seamlessly. If you can't get the
             | pixelated look quite how you want it, you might try using
             | it to generate a base image and then use a separate tool to
             | pixelate it. Some people are using it as part of a larger
             | pipeline.
             | 
             | The discord is full of people sharing their experiments and
             | approaches though. Maybe try asking in the prompt-craft
             | channel to see if someone else has attempted something
             | similar.
             | 
             | Good luck!
        
       | thriftwy wrote:
        | Maybe you could give DALL-E creativity lessons by finding such
        | nonsense queries and adding creative interpretations of them
        | to the training set.
        
       | Marazan wrote:
       | https://imgur.com/ggnm920
       | 
       | "a photo of 6 kittens sitting on a wooden floor. Each kitten is
       | clearly visible. No weird stuff."
       | 
        | Like, let's start with the fact that there are 7 of them (2 of
        | the 4 images from the prompt had 7 kittens). Now let's
        | continue on with how awful they look.
        | 
        | The difference in image quality between DALL-E 2 asked for a
        | single subject and DALL-E 2 asked for a group of things is
        | startling.
        | 
        | And it's obvious, if you know how the tech works, why this is
        | the case.
        
       | xwdv wrote:
       | For art generation, gluing things together without understanding
       | relationships is basically creativity. Humans can assign meaning
       | to the final output.
        
         | visarga wrote:
          | There is some degree of understanding of how to combine
          | concepts, otherwise the avocado chair would not be possible.
        
       | knighthack wrote:
       | I don't see how the 'understanding of relationships' should be
       | taken as the key intent of DALL-E 2.
       | 
        | Consider procedural generation: it can create abstractions of
        | either utter beauty or utter garbage without understanding
        | context. You need to guide it towards something meaningful.
        | 
        | The fact that DALL-E can 'glue things together' without the
        | need for human inspiration - yet produce output whose intent
        | can be understood by a human appraising it - is not only a
        | feat in itself, but, I would say, its key _feature_.
        
       | adamsmith143 wrote:
       | Why is there this obsession with systems or algorithms having
       | "understanding"? No one thinks these things have internal states
        | equivalent to "understanding". "Understanding" or not, you
        | can't deny the capability of these systems.
        
       | arey_abhishek wrote:
        | AI will never understand the actual context, because not
        | everything we feel/experience can be captured and communicated
        | to a machine. For example, human language is incomplete and
        | doesn't encode all information, because it doesn't need to
        | when used with other humans.
       | 
       | I think it's a romantic notion to imagine that AI will not be a
       | Chinese room.
       | 
        | Even human intelligence feels like a Chinese room. This is
        | especially noticeable when using complicated devices like
        | flight controls. I've been playing Microsoft Flight Simulator,
        | and I don't fully understand the relationships between the
        | different instruments. But I can still fly planes (virtually).
       | 
       | We'd be better off if we considered AI similar to an appliance
       | like a microwave or a refrigerator. Does a fridge need to
       | understand or taste what's inside it to be helpful?
        
       | RootKitBeerCat wrote:
        | Having spent about $200 on credits... yes, it glues your words
        | together through transformers into images... of snippets of
        | the identified corpus of images it has... with some specific
        | prompt phrasings having cool weightings for making really
        | nice-looking glued-together results... it's a model: classical
        | computing - everything is turtles and duct tape and rivets and
        | glue, all the way down...
        
       | pera wrote:
       | Interesting, DALL-E Mini / Craiyon actually generates a pretty
       | accurate result for "a monkey touching an iguana" (at least
       | semantically):
       | 
       | https://i.imgur.com/Oq62gQI.png
        
         | affgrff2 wrote:
         | Maybe it's in the training data? At least some special kind of
         | monkey (human) touching some lizard is.
        
           | johndough wrote:
           | https://rom1504.github.io/clip-
           | retrieval/?index=laion5B&useM...
           | 
           | There seem to exist several pictures of marmoset monkeys
           | touching iguanas, but DALL-E mini shows macaque monkeys. This
           | makes me believe that DALL-E mini has at least some
           | generalization capabilities.
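            | 
            | For anyone who wants to poke at the training data
            | themselves, the same index is queryable from Python (a
            | sketch following the clip-retrieval README; the service
            | URL, index name and result keys may have changed since):
            | 
            |   # pip install clip-retrieval
            |   from clip_retrieval.clip_client import ClipClient
            | 
            |   client = ClipClient(
            |       url="https://knn.laion.ai/knn-service",
            |       indice_name="laion5B-L-14",
            |   )
            |   # nearest-neighbour search over LAION by text prompt
            |   for r in client.query(text="monkey touching an iguana"):
            |       print(r["url"], r["caption"])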
        
         | walrus01 wrote:
         | imgur is providing a prompt before viewing this image "this
         | image may contain erotic imagery, are you over 18, etc".
         | 
         | i certainly hope not...
        
       | aeturnum wrote:
        | The Fair Witness was a job that Heinlein made up for Stranger
        | in a Strange Land. Fair Witnesses were supposed to reliably
        | report what they saw without judgement - without including
        | their subjective judgement in their report. The example
        | exchange is: "Is that house over there brown?" "It is brown on
        | this side."
       | 
       | Dall-E (and other ML systems) feel like fair witnesses for our
       | cultural milieu. They basically find a series of weighted
       | connections between every phrase we've thought to write down or
       | say about all images and can blend between those weights on the
        | fly. By any assessment it's an amazing feat - as is the feat
        | of viewing their own work and modifying it (though ofc it's in
        | their own coordinate system, so one does expect it to work).
       | 
       | In one sense - asking if the machine "understands" is beside the
       | point. It does not need to 'understand' to be impressive (or even
       | what people claim when they're not talking to Vice media or
       | something).
       | 
       | In another sense, even among humans, "understanding" is both a
        | contested term _and_ a height that we all agree we don't all
       | reach all of the time. One can use ideas very successfully for
       | many things without "understanding" them.
       | 
       | Sometimes people will, like, turn this around and claim that:
       | because humans don't always understand ideas when they use them,
       | we should say that ML algorithms are doing a kind of
       | understanding. I don't buy it - the map is not the territory. How
       | ML algorithms interact with semantics is wholly unlike how humans
       | interact with them (even though the learning patterns show a lot
       | of structural similarity). Maybe we are glimpsing a whole new
       | kind of intelligence that humans cannot approach - an element of
       | Turing Machine Sentience - but it seems clear to me that
       | "understanding" in the Human Sentience way (whatever that means)
       | is not part of it.
        
       | blocked_again wrote:
        | This can be easily answered by giving the following input to
        | DALL-E:
       | 
       | """
       | 
       | A map highlighting the countries the ancient Romans invaded since
       | Pepsi was introduced.
       | 
       | """
        
         | Rackedup wrote:
        
       | Havoc wrote:
       | Isn't that true for all the models?
        
       | [deleted]
        
       | rkwasny wrote:
       | People are "gluing things together" for a very long time, that's
       | what everyone does!
       | 
       | All ideas as remixes of previous ideas
        
       | andybak wrote:
       | I'm a big Dall-E fan but this is no surprise to anyone who's used
       | it for more than 5 minutes.
       | 
       | It was one of the things highlighted by Google when they
       | announced Imagen as a differentiator:
       | https://imagen.research.google
       | 
       | The article touches on this but the headline is slightly
       | deceptive.
        
         | causi wrote:
         | Imagen would be amazing for an "Images Against Humanity" game
         | where you score points by generating the funniest picture.
        
           | Zircom wrote:
            | Absolutely stealing this idea, look forward to being
            | credited for it in a future Show HN post :P
        
         | andybak wrote:
          | (Incidentally - I want to make it clear I don't mean "Dall-E
          | has no understanding of the elements and is just a fancy
          | image search/compositor" - there are results that absolutely
          | require some semantic "understanding". It's just that there
          | are limits, and spatial understanding is one of the weak
          | spots.)
        
       | kemiller wrote:
       | Anyone who has ever raised a small child will realize this is how
       | kids learn too.
        
       | rocgf wrote:
       | Does anyone actually believe DALL-E "understands" what it's
       | doing? For any reasonable definition of "understands", I assume
       | most people would be skeptical.
       | 
       | So if we go with that, then yes, it just glues things together
       | without understanding their relationship. I'd just be tempted to
       | say it doesn't really matter that it doesn't understand, except
       | maybe for some philosophical questions. It's still incredible
       | based on its output.
        
         | visarga wrote:
         | > Does anyone actually believe DALL E "understands" what it's
         | doing?
         | 
         | I believe it understands enough to make tens of thousands of
         | people interested and debating its merits. The GANs of 5 years
          | ago were baby toys compared to DALL-E. They were drawing
          | five legs on a dog and were limited to a few object classes.
          | Now people debate
         | if it "really understands" and if it is "(slightly) conscious".
        
         | LesZedCB wrote:
         | can we come up with a 'reasonable' definition of understands
         | that isn't couched in human exceptionalism and could still
         | apply to any computer model?
         | 
         | i think dall-e understands, within the sensory domain it's
         | trained from.
        
           | rocgf wrote:
           | I think DALL-E understands the same way an Excel sheet would
           | understand SUM(B3:B6), obviously at a way larger scale and
           | with a more complicated function. It's all just rather simple
           | maths all the way down.
        
             | visarga wrote:
             | Not similar. The SUM() function was written by humans in a
             | top-down fashion, DALL-E learned from the visual and
             | textual data by a completely different process. DALL-E
             | distilled something about the world and is using that to
             | generate.
        
             | [deleted]
        
       | chpatrick wrote:
       | I think "understanding" is a super vague word.
        
         | i_am_toaster wrote:
          | Bingo. This is also a fair comparison to how today's
          | politics works, in a nutshell. Overload a word with a vague
          | context and people immediately attribute their own
          | perspective to come to a conclusion that conflicts with an
          | opposing argument.
          | 
          | Relating back to this headline, using "understanding"
          | creates lots of messages with differing views because
          | everyone has their own take on the word. If instead you said
          | something like "DALL-E fakes understanding of concepts to
          | create new images," I bet you'd get even closer to the
          | "political message board" style of comments, because you are
          | now taking an objective position (yes/no, true/false,
          | good/bad) on a subjective word (understanding).
        
       | IceMetalPunk wrote:
        | Now, let's be critical of possible _reasons_ for this. It's
       | important to remember two things: 1) Any NN has zero experience
       | with the world beyond its training data. Things that seem obvious
       | to us from our experience are not obvious to a system that has
       | never experienced those things. And 2) DALL-E 2 was trained on
       | image-caption pairs scraped from the internet, basically.
       | 
       | So it's quite possible the reason it doesn't understand things
       | like "X under Y" very well is that its training set doesn't have
       | a lot of captions describing positional information like that, as
       | opposed to any failure in the architecture to even _potentially_
       | understand these things.
        
       | vannevar wrote:
       | I'm sure it is, but "gluing things together" coherently in
       | response to a text prompt is a stupendous achievement. It's not
       | AGI, but it's miles ahead of where we were even a few years ago
       | and opens the door to automating a class of jobs I don't think
       | anyone back then believed could be automated, short of AGI.
        
         | seanmcdirmid wrote:
         | Just think what this could do for a game experience like
          | Scribblenauts. Just being able to glue a fixed number of
          | concepts together in a huge number of ways... game designers
          | are going to have to learn how to leverage ML.
        
         | monkeynotes wrote:
          | Their research showed that DALL-E had the most success with
          | real-world stuff it had been trained on. Is this surprising?
          | I mean, if I didn't know much about iguanas I'd also have a
          | hard time representing them.
        
         | _nhynes wrote:
          | I ended up reading the book _Blindsight_ (Peter Watts) that's
         | been floating around in comments recently. A major theme in the
         | book is intelligence and its relation to consciousness
         | (including whether consciousness is even beneficial). If you
         | agree with the idea, you'd consider that DALL-E is indeed
         | intelligent even though it appears to be a "Chinese Room".
         | Humans would be "gluing things together" in just the same way,
         | but with this odd introspective ability that makes it seem
         | different.
        
           | Pulcinella wrote:
            | I think the book does make a point that maybe one benefit
            | of consciousness is the ability to filter through all the
            | information and spam that conscious beings produce. E.g.
            | the scramblers may view all the radio waves we blast
            | everywhere as attempts at warfare and reducing the fitness
            | of other species. Why else would a species emit so much
            | information if not to DDOS their enemies?! tl;dr:
            | consciousness is a defense against ****posting and
            | trolling caused by conscious beings.
        
           | the8472 wrote:
           | re the chinese room, you might want to consider the
           | computational and memory complexity of a lookup table.
           | https://www.scottaaronson.com/papers/philos.pdf page 14
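            | 
            | Back-of-envelope version of that argument (illustrative
            | numbers only):
            | 
            |   import math
            | 
            |   alphabet = 128  # roughly, the ASCII characters
            |   n = 100         # a single short exchange of 100 chars
            |   # a lookup table needs one entry per possible input:
            |   digits = n * math.log10(alphabet)
            |   print(f"table size ~ 10^{digits:.0f} entries")
            |   # ~10^211 entries, dwarfing the ~10^80 atoms in the
            |   # observable universe - hence Aaronson's point that a
            |   # practical "Chinese room" can't just be a lookup table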
        
           | twak wrote:
           | fantastic book. Made me consider the question of whether
           | consciousness exists at all or if it is just some hack by
           | evolution to allow introspection.
           | 
           | I haven't found a definition of consciousness which is
           | quantifiable or stands up to serious rigour. If it can't be
           | measured and isn't necessary for intelligence, perhaps there
           | is no magic cut-off between the likes of Dall-E and human
           | intelligence. Perhaps the Chinese-room is as conscious as a
           | human (and a brick)?
        
             | otikik wrote:
             | It is a hack alright. A very beautiful hack, refined in
             | hundreds of thousands of iterations, with a very high
             | price.
             | 
             | > to allow introspection
             | 
              | Evolution doesn't do things "to" anything. It repeats
              | what works, and kills the rest. Our brains have allowed
              | us to adapt to the changes in the environment better
              | than the rest. Consciousness came with the package. It
              | might not have an actual "purpose" - it could be an
              | "appendix".
             | 
             | My personal belief is that consciousness started as the
             | self-preservation instinct that most animals have, and we
             | developed introspection as a way to strengthen our ties to
             | other members of our family or tribe. And then we "won"
             | (for now)
        
             | somenameforme wrote:
              | I think in these sorts of topics we are often discussing
              | two different meanings of consciousness: physical vs
              | philosophical. The arbitrary mixing and matching of them
              | is likely one of the reasons for the lack of clarity.
              | Physical
             | consciousness is most easily identified by the notion that
             | you're unconscious when sleeping. The other is the far more
             | interesting philosophical consciousness.
             | 
              | Philosophical consciousness is what the oft-misunderstood
              | quote _cogito ergo sum_, "I think therefore I am", was
             | hitting on. Descartes was not saying that consciousness is
             | defined by thinking. He was trying to identify what he
             | could know was really real in this world. When one goes to
             | sleep, the dreams we have can often be indistinguishable
             | from a reality in themselves, until we awake and find it
             | was all just a dream. So what makes one think this reality
             | isn't simply one quite long and vivid dream from which we
             | may one day awake?
             | 
              | But this wasn't an appeal to nihilism - quite the
              | opposite.
             | The one thing he could be certain of is that he, or some
             | entity within him, was observing everything. And so, at the
             | minimum, this entity must exist. And the presence of this
             | entity is what I think many of us are discussing when we
             | speak of consciousness. In contrast to physical
             | consciousness, you are philosophically conscious even when
             | sleeping.
             | 
             | Of course like you said philosophical consciousness cannot
             | be proven or measured and likely never will be able to be,
             | which makes it an entirely philosophical topic. It is
             | impossible for me to prove I am conscious to you, or vice
             | versa, no matter what either of us does. Quite the private
             | affair, though infinitely interesting to ponder.
        
               | akomtu wrote:
               | There is a lot of literature on this topic, but nearly
               | all of it is in the "restricted" section of the library:
               | it's the section with book shelves for alchemy,
               | occultism, buddhism and so on. Western literature is very
               | shallow in this regard.
               | 
                | There are a few interesting thoughts about
                | consciousness that I've found in those books. One is
                | that the boundary between consciousness and "real
                | matter" is imaginary: consciousness exists only
                | because of change in that matter; when the change
                | stops, so does consciousness. Consciousness creates
                | reality for itself, and the two are in fact just two
                | sides of the same coin. In other words, static
                | consciousness isn't a thing, and hence the need for
                | "reality".
               | 
               | Human consciousness is a sum of many consciousnesses that
               | exist at wildly different levels of reality. There are
               | primitive cellular consciousnesses, and those sometimes
               | influence our mental consciousness. Our neural
               | cerebrospinal system has an advanced consciousness
               | capable of independent existence: it manages all the
                | activity of internal organs, and only loosely interacts
               | with our higher mental consciousness. That cerebrospinal
               | system is even self-conscious in a primitive way: it can
               | observe its own internal changes and distinguish them
               | from impulses from the outside. There's emotional and
               | mental consciousness that mainly lives in the brain and
               | is somewhat aware of the dark sea of lower consciousness
               | below it.
               | 
                | Most people are conscious in dreams, as they can
                | perceive in that state. However, they can't (yet) make
                | the distinction between inner processes (self) and
                | external effects (others), so to them it appears as if
                | everything is happening inside their mind, i.e. they
                | are not self-conscious. That's the consciousness of a
                | toddler. Some are more advanced; they start seeing the
                | me-others difference and can form memories from dreams.
        
             | patcon wrote:
             | Also read the book due to comments here. Was very aligned
             | with some of my interests in relationships between
             | language, information, life/intelligence and metaphor <3
             | 
              | You might get a kick out of this paper (though some may
              | find its proposal a bit bleak, I think there's a way to
              | integrate it without losing any of the sense of wonder
              | of the experience of being alive :) )
              | 
              | It analogizes conscious experience to a rainbow "which
              | accompanies physical processes in the atmosphere but
              | exerts no influence over them".
             | 
             | Chasing the Rainbow: The Non-conscious Nature of Being
             | (2017) https://www.frontiersin.org/articles/10.3389/fpsyg.2
             | 017.0192...
             | 
             | > Though it is an end-product created by non-conscious
             | executive systems, the personal narrative serves the
             | powerful evolutionary function of enabling individuals to
             | communicate (externally broadcast) the contents of internal
             | broadcasting. This in turn allows recipients to generate
             | potentially adaptive strategies, such as predicting the
             | behavior of others and underlies the development of social
             | and cultural structures, that promote species survival.
             | Consequently, it is the capacity to communicate to others
             | the contents of the personal narrative that confers an
             | evolutionary advantage--not the experience of consciousness
             | (personal awareness) itself.
             | 
             | So consciousness is more about what it subjectively feels
             | like to be under pressure/influence to broadcast valuable
             | internal signals to other (external) agents in our
             | processes of life; aka other humans in the super-organism
             | of humanity. I analogize it to what a cell "experiences"
              | that compels it to release hormonal signals in a
             | multicellular organism.
        
             | f38zf5vdt wrote:
             | > Made me consider the question of whether consciousness
             | exists at all or if it is just some hack by evolution to
             | allow introspection.
             | 
             | Can't it be both? What's the difference? Evolution just
             | responds to the environment, so a method of complex
             | interaction with the environment like "consciousness" or
             | "ever-polling situational awareness" seems like par for the
             | course.
        
               | croes wrote:
                | Evolution is the result of the environment, not a
                | response.
                | 
                | Giraffes didn't get long necks because the food was
                | out of reach; giraffes have long necks because the
                | ones without just died.
        
               | f38zf5vdt wrote:
               | It's a response caused by survival. The environment is
               | the stimuli and the genetics that are not eliminated are
               | cultivated.
               | 
               | If I introduce an antibiotic into a culture of bacteria
               | and they evolve resistance, then they appear to be
               | responding to it on a collective level.
        
               | croes wrote:
               | "Since bacteria are extremely numerous, random mutation
               | of bacterial DNA generates a wide variety of genetic
               | changes. Through mutation and selection, bacteria can
               | develop defense mechanisms against antibiotics."
               | 
               | https://health.mo.gov/safety/antibioticresistance/general
               | inf...
               | 
                | Mutation happens all the time because cell replication
                | isn't perfect; some mutations are irrelevant, some
                | deadly, some bring a better chance of survival.
                | 
                | It's not a response, just the result. How would the
                | bacteria know it's an antibiotic and not just water?
                | It doesn't; water just isn't an evolutionary filter,
                | antibiotics are.
        
           | doph wrote:
           | Exactly (and cannot recommend Blindsight highly enough). Of
           | course DALL-E is a Chinese Room. The most exciting and
           | subversive idea in Blindsight is that consciousness is
           | maladaptive, and the scramblers are _more_ fit for the lack
            | of it. Long after neural nets have surpassed our functional
           | intelligence, we'll still be criticizing their ability to
           | navel-gaze as well as humans.
        
             | FeepingCreature wrote:
             | Of course, DALL-E can only get away with not being
             | introspective because it has reinforcement learning holding
             | its hand. Bit harder to curate a clean training set without
             | consciousness.
             | 
             | Personally, I suspect consciousness is intrinsically
             | connected with whatever lets humans learn things by looking
             | at and thinking about a single example.
        
               | soulofmischief wrote:
               | I get what you're saying, but it's important to also
               | consider how evolution provides for us as well by giving
               | us neural structures already tuned for things like
               | recognizing faces and recognizing danger. I do wonder how
               | much realtime danger analysis has to do with
               | consciousness.
        
               | FeepingCreature wrote:
               | I'd doubt it has much of a connection. Consciousness can
               | do many things, but generally not quickly.
        
               | TrevorJ wrote:
                | Reinforcement learning _is_ the thing that integrates
                | the network across time, though. In one sense, I think
                | it's doing at least part of the job that consciousness
                | does: mediating between memory (past) and the present.
        
             | [deleted]
        
             | fatherzine wrote:
             | Consciousness is very much adaptive, though it requires a
             | sense of the metaphysical / transcendent to behave in that
             | way. Theologically speaking, God brings Life and the Devil
             | brings Death. Sadly, our modern brave new world has taken
             | "Gott ist Tot" as a triumphalist proclamation, and not the
             | lamentation it was meant to be.
        
               | wpietri wrote:
               | It's far too early to tell if consciousness is adaptive.
               | Has it enabled humans to be wildly successful? In the
               | short term, sure! But the long term is unclear.
               | 
               | E.g., look at how the technological progress it makes is
               | at risk of destroying human civilization. A series of
               | ecological boom/bust cycles could lead toward negative
               | selection for consciousness. Or we might manage to wipe
               | out ourselves or, say, most mammals, leading life on
               | earth to start over as it did 65m years ago.
               | 
               | But even without that, it's not clear to me that
               | consciousness will really win out. Look at the number of
               | successful people who are not only painfully
               | unreflective, but need to be to keep doing what they're
               | doing. I could name a lot of people, but today's good
               | example is Alex Jones, whose whole (very profitable)
               | schtick is based on refusing to be fully conscious of
               | what he's saying: https://popehat.substack.com/p/alex-
               | jones-at-the-tower-of-ba...
               | 
               | And this is hardly a new idea. Vonnegut wrote a novel
               | where humans end up evolving into something like a sea
               | lion. The point being "all the sorrows of humankind were
               | caused by 'the only true villain in my story: the
               | oversized human brain'", an error evolution ends up
               | remedying.
        
               | fatherzine wrote:
                | Hardly a new idea indeed: https://en.wikipedia.org/wiki/
               | Tree_of_the_knowledge_of_good_...
               | 
               | Edit: To be clear, I posit that consciousness is the
               | organ that enables us to distinguish between Good and
               | Evil.
        
               | wpietri wrote:
               | I doubt that consciousness is an organ, but if it were,
               | it's distinct from a moral sense, because we share that
                | with animals. Here's a good book on the topic:
                | https://www.hup.harvard.edu/catalog.php?isbn=9780674356610
               | 
               | That same professor has done a bunch more work on the
               | topic, as have many others.
        
               | fatherzine wrote:
               | Frans de Waal has done great work.
               | 
               | Evolution is a strange phenomenon. I invite us to marvel
               | at the transformation of terrestrial quadrupeds into
               | majestic aquatic creatures, over eons:
               | https://en.wikipedia.org/wiki/Evolution_of_cetaceans
               | 
                | Evolutionarily speaking, cetaceans "share" the front
                | limbs with quadrupeds. And yet there is a qualitatively
               | distinct functional difference. Consider that moral
               | consciousness, as present in humans, is functionally not
               | quite the same as its biological precursor, the moral
               | sense present in dogs or gorillas. And, of course, there
               | are gradual changes along the evolutionary way.
               | 
               | Edit: "Organ", more precise "sensory organ", as in "the
               | visual organ". Perhaps there is a better word here than
               | "organ" here, before we get lost in the medical
               | distinction between eye / retina / optic nerve / cortex /
               | etc.
        
               | kebman wrote:
               | But Sontag suggests the use of Batailleist `powerful
               | communication' to modify and read consciousness. The
               | subject is contextualised into a predialectic nationalism
               | that includes language as a totality.
        
               | fatherzine wrote:
               | Hi there GPT-3 bot.
               | 
               | Edit. Don't really want to be unnecessarily flippant
               | here. Perhaps this is a real person attempting to
               | communicate. Sadly, I was unable to follow it; the
               | text jumps wildly between seemingly unrelated subjects.
        
               | kebman wrote:
               | Nope, that was actually the Postmodern Generator.^[1] But
               | nice try! ;) [1]: https://www.elsewhere.org/pomo/
        
               | fatherzine wrote:
               | Haha, <fist bump>.
        
           | benlivengood wrote:
           | It's not clear what generates consciousness. Until we know
           | for sure (e.g. A/B testing with humans who can report when
           | they do and do not experience consciousness in different
            | neural configurations), I think it's impossible to know
           | what level of conscious experience large ML models have.
           | 
           | Blindsight is an excellent book in its exploration of
           | consciousness, but the speculative part is that a working
           | sense of self isn't necessary for embodied intelligence (like
           | the scramblers), which I tend to doubt. An agent without a
           | model of itself will have difficulty planning actions;
           | knowing how its outputs/manipulators are integrated into the
           | rest of reality will be a minimum requirement to control them
            | _effectively_. It is certainly possible that "self" or "I"
           | will be absent; humans can already turn the ego off with
           | drugs and still (mostly) function but they remain conscious.
        
             | mrcartmeneses wrote:
             | People say this a lot online in this kind of debate but you
             | can see which areas of the brain light up in people who are
             | conscious and how that's different from people who are
             | asleep or in a persistent vegetative state. We have a very
             | good idea of what consciousness is and it's definitely not
              | a linear neural net with a discrete input and output. Having
             | said that, the nerve that connects the eye to the brain
             | could be considered similar to a modern neural net.
             | 
             | It looks like we're beginning to get the building blocks of
             | consciousness together. But we don't yet know how to
             | combine the wave functions into a chorus necessary to
             | achieve GI
        
               | stormbrew wrote:
               | I think you're using a different meaning of the word
               | conscious than the GP? You seem to be talking about
               | awake, they're talking about a much more abstract idea.
               | 
               | A person who is sleeping or in a vegetative state is not
               | currently getting new inputs fed into some parts of their
               | brain, so it's not surprising that their brain "lights up
               | differently," nor does it imply anything about a piece of
               | software that _is_ getting new inputs that might be being
               | integrated into its model (of course, a model that is
               | trained and then repeatedly used without further
               | integration is not in any way comparable to a brain).
               | 
               | This more abstract idea of consciousness is definitely
               | not a solved problem - people can't even manage to agree
               | on whether non-human animals have it. And a lot of
               | internet arguments for why this or that neural network
               | can't be conscious probably also rule 5 year olds out of
               | it too.
        
               | [deleted]
        
               | benlivengood wrote:
               | We have fairly good ways of detecting human consciousness
               | now, but not a lot of understanding of _why physically_
               | or _precisely which_ neural circuits produce conscious
               | experience, which is the understanding we 'd need to
               | infer what sort of conscious experience is possible for
               | ML models. We also can't reliably detect or predict
               | whether a given animal is conscious or not (related to
               | the problem of minimal viable neurons required for
               | consciousness), and this is the biggest evidence I have
               | that our study of consciousness is far from complete.
               | 
               | The best we can do with humans (and maybe animals) is
               | behaviorism and inference on our own personal
               | consciousness at this point, with brain imaging to
               | demonstrate at least gross prediction of consciousness in
               | humans.
        
               | visarga wrote:
               | > consciousness is definitely not linear neural net with
               | a discrete input and output
               | 
                | Just neural circuits are not going to be conscious by
                | themselves; for one, they need to learn concepts from the
                | environment and those concepts shape the neural circuits.
                | Thus the way they act shapes how they develop. You can't
               | separate consciousness from the environment where it
               | develops.
               | 
               | In other words it was not the neural net that was
               | lacking, but the environment.
        
             | canjobear wrote:
             | > humans who can report when they do and do not experience
             | consciousness
             | 
             | How can someone possibly report when they are not
             | experiencing consciousness?
        
               | benlivengood wrote:
               | > How can someone possibly report when they are not
               | experiencing consciousness?
               | 
               | By an absence of reporting it. If I sit at a desk getting
                | my neurons modulated by testing equipment and say "I am
               | conscious" every subjective second that I am experiencing
               | consciousness then I could at least help narrow down when
               | consciousness is lost. If I am simply unable to speak or
               | respond at all, but still conscious, I would report that
               | fact later. Only in the case of locked-in conscious
               | awareness without later memory of the experience would
               | this kind of experimental setup fail, and this is where
               | brain imaging could probably help determine that
               | everything except motor or memory neurons were active.
        
             | toomanydoubts wrote:
             | >(e.g. A/B testing with humans who can report when they do
             | and do not experience consciousness in different neural
             | configurations)
             | 
              | Doesn't that fall back to the old consciousness trap that
              | nobody knows how to resolve? How do you know if the human
              | reporting that he's conscious is actually conscious and not
              | a philosophical zombie?
             | 
             | We don't know what generates consciousness because we don't
             | know how to measure it, and if we can't measure it, we will
              | always have to take the word of a seemingly conscious
             | entity for it.
        
               | benlivengood wrote:
                | Self-reporting is probably ideal, perhaps with a camera
               | recording everything so I can judge my answers and
               | behavior and compare it with my memory. But we don't
               | really have a great handle on how much of our memory of
               | conscious experience matches our real-time experience or
               | even if the memory-formation process might "be"
               | consciousness, so I admit there are blind spots in
               | studying consciousness rigorously.
               | 
               | I don't buy the philosophical zombie argument simply
               | because consciousness does alter behavior. I wouldn't
               | participate in this conversation the same way if I didn't
               | experience consciousness. It would be more like vivid
                | imagination (as opposed to moderate aphantasia) where I
               | find it curious but don't have it. As in the novel,
                | unconscious beings probably behave noticeably differently.
               | 
               | There are, apparently, some people who have a very
               | reduced sense of consciousness. I know I have done and
               | said things when I'm not (memorably) conscious, for
               | example when half asleep or coming out of anesthesia, and
               | my behavior has been altered according to witnesses. I
               | wasn't quite "myself". I can also hyper-focus and reduce
               | conscious awareness of my surroundings and of my own body
               | and mind, but that still feels like I have an internal
               | awareness and memory of the experience. I am fairly
               | certain I would be able to tell if that is switched off
               | for a time.
        
               | Scene_Cast2 wrote:
               | There was this Google AI that made the news recently, it
               | was trained to say that it's a conscious AI. However,
               | most agree that it isn't conscious.
        
               | laserlight wrote:
               | > I wouldn't participate in this conversation the same
               | way if I didn't experience consciousness.
               | 
               | I don't think you're conscious. Prove me wrong.
        
           | [deleted]
        
           | joe_the_user wrote:
           | The "Chinese room" argument, so far as I understand it,
            | applies to any discrete computation process simulating
           | consciousness.
           | 
           | The argument of the article is that DALL-E doesn't respond
           | appropriately to a particular kind of input - two entities in
           | some kind of spatial relationship (that it hasn't often
            | seen). Dall-E's not extrapolating the 3-D world but
            | stretching a bunch of 2-D images together with some
            | heuristics. That works to create a lot of plausible images,
            | sure, but it implies this ability might not, say, be useful
            | for the manipulation of 3-D space.
           | 
           | So, given a "Chinese room" is just a computation, it's
            | plausible that some Chinese room could handle 3-D image
           | manipulation more effectively than this particular program.
           | 
            | Which is to say: no, the criticism isn't "this is a Chinese
            | room"; that point is irrelevant.
        
             | TrevorJ wrote:
             | I think we need to start thinking of these networks not as
             | something we hope will someday simulate a brain, but as
             | small building blocks that can be networked together in a
             | variety of ways to tackle tasks. DALL-E may not understand
             | 3D image manipulation, but if you chained the output from
             | DALL-E into something like NeRF you could start to
             | transform the image in 3D space. One can picture a
             | collection of a few hundred of these networks, all with
              | different things that they are good at, working in concert
             | to approach something that is very good at doing things
             | generally. We're missing the layer on top that decides
             | which algorithms to run and in what order.
        
             | jrlocke wrote:
             | The Chinese room argument itself isn't very compelling.
             | Surely the constituent parts of the brain are fundamentally
             | governed solely by physics, surely thought arises solely
             | from the physical brain, and surely the constituent parts
             | (and thus thought) could be described by a sufficiently
              | complex discrete computation.
             | 
             | Are we not conscious?
        
               | indigo945 wrote:
               | I think a considerable subset of the people who do make
               | use of the Chinese room argument also subscribe to some
               | form of mind-body dualism, where consciousness does not
               | or does not completely arise from physical processes.
        
               | jrlocke wrote:
               | To be more explicit, I'm saying I find it weird to hear
               | so much about the Chinese room argument from a crowd of
               | (presumably) materialists.
        
               | notahacker wrote:
               | I see it the other way round.
               | 
               | The Chinese Room and the brain of a Chinese-speaking
               | person are completely different physical processes.
               | Looked at on an atomic level, they have almost nothing in
               | common. Mind-body dualists may or may not agree that the
               | room is not "conscious" in the way a human is, but if
               | consciousness is purely a material process, I can't see
               | how the materialist can possibly conclude all the
               | relevant properties of the completely dissimilar room and
               | person are the same.
               | 
               | Those that would argue the Chinese Room is "conscious" in
               | the same way as the Chinese person are essentially
               | arguing that the dissimilarity of the physical processes
               | is irrelevant: the "consciousness" of the Chinese person
               | doesn't arise from molecules bouncing around their brain
               | in very specific ways, but exists at some higher level of
               | abstraction shared with the constituent molecules of
               | pieces of paper with instructions written in English and
               | outputs written in Chinese.
               | 
               | The idea our consciousness exists in some abstract sense
               | which transcends the physics of the brain is not a new
               | one of course. Historically we called such abstractions
               | souls...
        
               | mannykannot wrote:
               | The argument you make here is a reasonable one (IMHO) for
               | the plausibility in principle of what Searle calls
               | "strong AI", but he claims that his "Chinese Room"
               | argument proves that it must be mistaken. One can simply
               | ignore him, but to refute him takes a little more effort.
               | 
               | It turns out that when one looks at the argument in
               | detail, and in particular at Searle's responses to
               | various objections (such as the Systems and Virtual Mind
               | replies), it is clear that he is essentially begging the
               | question, and his ultimate argument, "a model is not the
               | thing modeled", is a non-sequitur.
        
               | TheOtherHobbes wrote:
               | The argument is essentially that there are no qualia of
               | Chinese comprehension in an automaton or in any system
               | that uses an equivalent algorithm, whether or not run by
               | a human.
               | 
               | It's a sound argument to the extent that qualia clearly
               | exist, but no one has any idea what they are, and even
               | less of an idea how to (dis)prove that they exist in
               | external entities.
               | 
               | It's the materialists who are begging the question,
               | because their approach to qualia is "Well obviously
               | qualia are something that just happens and so what?"
               | 
               | Unfortunately arguments based on "Well obviously..." have
               | a habit of being embarrassingly unscientific.
               | 
               | And besides - _written language skills are a poor
               | indicator of human sentience._ Human sentience relies at
                | least as much on empathy; emotional reading of body
               | language, expression, and linguistic subtexts; shared
               | introspection; awareness of social relationships and
               | behavioural codes; contextual cues from the physical and
               | social environment which define and illuminate
               | relationships; and all kinds of other skills which humans
                | perform effortlessly and machines... don't.
               | 
               | Turing Tests and game AI are fundamentally a nerd's view
               | of human intelligence and interaction. They're so
               | impoverished they're not remotely plausible.
               | 
               | So as long as DALL-E has no obvious qualia, it cannot be
               | described as sentient. It has no introspection and no
               | emotional responses, no subjective internal state (as
               | opposed to mechanical objective state), and no way to
               | communicate that state even if it existed.
               | 
               | And it also has no clue about 3D geometry. It doesn't
                | know what a sphere is, only what sphere-like shading looks
               | like. Generally it knows the texture of everything and
               | the geometry of nothing.
               | 
               | Essentially it's a style transfer engine connected to an
               | image search system which performs keyword searches and
               | smushes them together - a nice enough thing, but still
               | light years from AGI, never mind sentience.
        
           | planetsprite wrote:
            | Human brains are a Chinese room. Our DNA and experiences
           | wrote the book.
        
           | TrevorJ wrote:
           | I'm becoming convinced that these algorithms _are_ huge steps
           | towards AGI simply because AGI might end up being a
           | collection of many of these domain-specific networks with a
            | network sitting above them whose only role is to interrogate
           | the sub networks for solutions to the problem at hand, and
           | discriminate which solution(s) are most worth trying,
           | simulating those and then picking one out and executing it in
           | the real world. That seems to me to be pretty close to what
           | we as humans do.
        
             | ausbah wrote:
             | the problem you run into then is how does it adapt to new
             | domains? sure large models display some degree of
              | generalization, but new tasks will certainly demand things
              | the network never learned.
        
               | TrevorJ wrote:
               | I think that's valid, however, I think the modular
               | approach would allow for a lot more cross domain
               | flexibility because I suspect that a large subset of
               | disparate tasks do actually decompose themselves into
               | discrete subtasks that can be useful across domains.
               | 
               | Adding to that, if you can compose multiple sub networks
               | together then you've really got something. You can build
               | a lot of different buildings from bricks without needing
               | to invent new kinds of brick basically.
               | 
               | For instance, think about the large number of domains
               | that robust computer vision would be useful in. Then
               | think about the fact that if the computer understands the
               | 3D space around it, it can hand that model off to a
               | network that does predictive physics simulation. Now
                | you've got something that would be useful across an
                | extremely wide range of domains.
        
             | WitCanStain wrote:
             | Cognitive scientists have been exploring this idea for a
             | while, e.g. the Modularity of Mind Hypothesis. It would be
             | very interesting to see if learnings from this field can be
             | further applied to AI development.
        
             | p1necone wrote:
             | You just need a neural network that knows how to
             | interrogate other neural networks, and a neural network
             | that knows how to find new datasets and train new neural
             | networks and you potentially have something very scary
             | right?
             | 
             | Maybe a third neural network that knows how to purchase
             | cloud compute?
        
               | TrevorJ wrote:
               | >Maybe a third neural network that knows how to purchase
               | cloud compute?
               | 
               | Amazon has entered the chat
        
               | RootKitBeerCat wrote:
               | Private angel investors watch from above all of the
               | "clouds"
        
           | miguelxpn wrote:
           | That book gave me the highest dose of existential crisis I've
           | ever felt. I should probably re-read it.
        
             | zro wrote:
             | Blindsight is one of my all time favorite books.
             | 
             | Looks like it's still available for free from the author's
             | website [0]
             | 
             | [0] https://www.rifters.com/real/Blindsight.htm
        
               | mgdlbp wrote:
               | Indeed, it's actually irrevocably CC BY-NC-SA 2.5
        
           | metacritic12 wrote:
           | Are humans just "Chinese rooms"? We don't really understand
           | anything deeply, but our neurons just fire in a way that
           | gives good responses and makes us feel like we understand
           | stuff.
        
             | monkeynotes wrote:
             | I think this is exactly true.
        
               | plmpsu wrote:
               | No you don't.
        
         | joe_the_user wrote:
         | Yes, DALL-E is very impressive to see and can have a number of
         | actual practical uses.
         | 
         | But fear of AGI is huge currently. The more impressive non-AGI
         | things we see, the more worried people naturally become that
         | we're reaching the "dawn" of AGI with all the disturbing
         | implications that this might have. (A lot of people are afraid
         | an AGI might escape the control of its creator and destroy
         | humanity. I think that's less likely but I think AGI under
          | control of its creator could destroy or devastate humanity, so
         | I'd agree AGI is a worry).
         | 
         | That DALL-E doesn't understand object-relationships should be
         | obvious to people who know this technology but a lot of people
          | seem to need it spelled out. And they probably need it spelled
          | out why this implies it's not AGI. But that would be several more
         | paragraphs for me.
        
         | jeremyjh wrote:
         | The whole point is it isn't doing so coherently if similar
         | images were absent from the training data. The monkey wasn't
         | touching the iguana.
        
         | jcelerier wrote:
         | I don't understand how my brain isn't just gluing things
         | together either. I don't personally feel like I'm actually
         | experiencing the understanding of anything
        
           | drooby wrote:
           | After reading your message I felt compelled to click reply
           | and generate a response and now that's what I'm doing. My
           | comment feels mildly appropriate to what you said and I'm not
           | 100% sure how it will be received but it _feels_ like I'm
            | communicating some degree of understanding of the sentiment
            | you expressed. My neurons are happy.
        
             | RC_ITR wrote:
             | Do you think GPT 3 could generate the same string of text?
        
               | drooby wrote:
               | I was curious so I asked it some questions... here you
               | go:
               | 
               | Me: Write what it feels like to understand something
               | 
               | GPT-3: It feels good to understand something. You feel
               | like you have a handle on what is going on and you are
               | able to make sense of it. It can feel empowering and give
               | you a sense of control.
               | 
               | Me: Write a sentence that provides an example of self
               | reflection and metacognition
               | 
               | GPT-3: I was surprised that I was able to understand the
               | concept so quickly.
               | 
               | Me: What concept?
               | 
               | GPT-3: The concept was metacognition.
               | 
               | Me: What does surprisal feel like in this context?
               | 
               | GPT-3: It feels good to be surprised that you understand
               | something. You feel like you are smart and have a good
               | handle on the concept.
               | 
               | Me: What is the experience of feeling good to you?
               | 
               | GPT-3: The experience of feeling good is one of happiness
               | and satisfaction. You feel like you have accomplished
               | something and are proud of yourself.
               | 
               | Me: What have you accomplished here?
               | 
               | GPT-3: You have accomplished understanding a concept.
               | 
               | Me: What concept is that again?
               | 
               | GPT-3: The concept is metacognition.
        
         | orlp wrote:
         | The most important thing I think DALL-E shows is that it has a
         | model of our world and culture. It's not intelligence, but it
         | is knowledge.
         | 
         | Google can give you endless pictures of giraffes if you search
          | for them. But it can only connect you to what exists. It doesn't
         | know things, it knows OF things.
         | 
         | DALL-E has knowledge of the concept of a giraffe, and can
         | synthesize an endless amount of never-before seen giraffes for
         | you. It actually knows what a giraffe is.
        
           | TremendousJudge wrote:
           | I find talking about "knowing" is too much anthropomorphizing
            | for my taste. A more accurate description would be that it can
           | synthesize an endless amount of pictures that the viewer will
           | recognize as giraffes (or as anything else the viewer wants).
           | And I agree, it's pretty amazing. But my point is that
           | talking like this skips over the entire useless "what is
           | knowledge" debate.
        
             | soco wrote:
             | Maybe we need to "de-antropomorphize" these terms and
             | rules. It's a long due process, I believe.
        
               | psyc wrote:
               | If it'll get people to stop accusing other people of
               | anthropomorphizing, I'm 100% on board. Now if you can
               | think of a way to get people to quit asking for
               | definitions of the words _intelligence_ and
               | _consciousness_ , you'll be a legend and maybe these
               | discussions will advance for the first time.
        
               | TremendousJudge wrote:
               | Well, I tried my best. Another commenter summed it up
               | better without using the term "anthropomorphizing" at
               | all, but it came out a bit more condescending than
               | necessary.
        
               | IIAOPSW wrote:
               | Hang on while I become a legend.
               | 
               | I use GPT3 to generate the usual trite arguments about
               | intelligence and consciousness why computers won't ever
               | get there. Of course I don't actually reveal that a
               | computer is generating my responses until later on.
               | Eventually everyone will become jaded and skeptical that
               | the other participants in that conversation are real
               | people. Soon all arguments about machine intelligence
               | will devolve into accusations of using GPT3 or not. Some
               | day, even mentioning consciousness will just make
               | everyone assume you're probably a GPT3 troll. This kills
               | the conversation in a way that makes a valid point. If
               | the bots can't be reliably identified, the proof is in
               | the pudding and the matter is settled.
        
               | TremendousJudge wrote:
               | I don't think it's possible. "Knowing" and
               | "understanding" are inherently human experiences. The
               | most reasonable thing to do would be to invent new terms
               | and relegate the older ones, that have too much cultural
               | baggage attached to them.
               | 
               | The same way when the creator of humanity turned out to
               | be "evolution by natural selection" we didn't redefine
               | the term "God" to mean that. Eventually we just started
               | using the new term.
        
             | MauranKilom wrote:
             | I too can write a program that will synthesize an endless
             | amount of pictures that the viewer will recognize as a
             | giraffe. They would all be the same picture, just with one
             | bit of noise randomly applied to each pixel - enough to
             | generate more distinct 256x256 pictures of giraffes than
             | there are atoms in the universe.
             | 
             | We need to go a bit deeper to tease out what makes DALL-E 2
             | amazing.
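              | 
              | (A throwaway sketch of that trivial "generator",
              | assuming any 256x256 grayscale source image; a
              | random array stands in for the giraffe photo here.)
              | 
              |   import numpy as np
              | 
              |   # stand-in for one 256x256 giraffe photo
              |   base = np.random.randint(
              |       0, 256, (256, 256), dtype=np.uint8)
              |   # one random bit of noise per pixel
              |   noise = np.random.randint(
              |       0, 2, base.shape, dtype=np.uint8)
              |   variant = base ^ noise  # flip the low bit
              |   # 2**(256*256) possible variants, vastly more
              |   # than the ~10**80 atoms in the universe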
        
           | soco wrote:
            | An honest question: what then is intelligence? Or are you just
           | trying to avoid being caught into a debate...
        
             | wcarss wrote:
             | you could think of "knowing what a giraffe is" in this
             | instance as a slice of an intelligent thought, but it's not
             | a whole thought, let alone a train of thoughts, or a plan
             | involving many trains of thought over time for internally
             | motivated reasons.
             | 
             | If we could get it to tell us a story about a particular
             | giraffe, and then ask it next week about that same giraffe,
             | and then the giraffe could be referenced by it while it
             | tells a joke on a talk show in a decade -- that's maybe too
             | high a bar, but that's real intelligence.
        
               | shawnz wrote:
               | Why is long term memory a requirement? Are amnesiacs not
               | intelligent?
        
               | wcarss wrote:
               | > that's maybe too high a bar, but that's real
               | intelligence.
               | 
               | It's not necessarily a requirement, but I couldn't
               | witness someone do it and then deny their intelligence.
        
               | shawnz wrote:
               | But it's not sufficient either, right? Just having memory
               | alone also wouldn't prove intelligence, so it's not
               | necessary or sufficient
        
           | trention wrote:
           | It knows what combinations of pixels correspond to the
           | metadata having "giraffe" in it. Nothing more.
        
             | blooperdeoo wrote:
        
           | [deleted]
        
           | tellarin wrote:
           | " It actually knows what a giraffe is." -> It maybe has a
           | sense of what a giraffe looks like.
        
             | lancesells wrote:
             | I wonder what would happen if you put in the text prompt:
             | "Long-necked spotted mammal that eats leaves from trees".
             | 
             | Most people would answer that as a giraffe.
        
               | BudaDude wrote:
               | I was curious about this as well. Here is what I got:
               | https://i.imgur.com/Ey0i3My.png
        
               | Marazan wrote:
               | Very similar to my images for the prompt:
               | https://imgur.com/8dw2brf
               | 
               | Intriguing that it's gone for a headshot for all of them.
               | I suspect it says something about the source text
        
               | coel wrote:
               | I was curious about this too and wondered if it could
                | then relate to my description of a giraffe, so I put in
               | "legs of a yellow mammal with a long neck and spots" and
               | it still nailed it.
        
               | lancesells wrote:
               | Dall-E knows what a giraffe is IMO
        
               | nextaccountic wrote:
               | I found it incredible too
               | 
                | The text model of dall-e at the very least can map "Long-
               | necked spotted mammal that eats leaves from trees" near
               | the same representation of "giraffe"
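                | 
                | (A hedged way to sanity-check that claim
                | without DALL-E access: embed both strings
                | with the public CLIP text encoder, which
                | DALL-E 2 builds on, and compare cosine
                | similarities. This sketch assumes the Hugging
                | Face `transformers` package and the
                | "openai/clip-vit-base-patch32" checkpoint.)
                | 
                |   import torch
                |   from transformers import (
                |       CLIPModel, CLIPTokenizer)
                | 
                |   name = "openai/clip-vit-base-patch32"
                |   model = CLIPModel.from_pretrained(name)
                |   tok = CLIPTokenizer.from_pretrained(name)
                | 
                |   texts = ["a giraffe",
                |            "long-necked spotted mammal that"
                |            " eats leaves from trees",
                |            "a car"]
                |   batch = tok(texts, padding=True,
                |               return_tensors="pt")
                |   with torch.no_grad():
                |       emb = model.get_text_features(**batch)
                |   emb = emb / emb.norm(dim=-1, keepdim=True)
                |   # expect sim[0,1] well above sim[0,2]
                |   print(emb @ emb.T)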
        
               | mechagodzilla wrote:
               | The pictures are incredibly similar to what one gets when
               | you type that same string in to google image search - it
               | makes me wonder if you could do reverse-image search on
               | some of the pictures it produces and find whole training
               | pictures embedded in it, the way one can find blocks of
               | intact training text stored inside language models.
        
             | notahacker wrote:
             | One of the interesting things about the "monkey touching
             | iguana" photos is just how abstract that sense is.
             | 
             | Several of the pictures look like it's taken geometry from
             | a 'touching' source and painted both 'monkey' and 'iguana'
              | textures onto both figures, because on the one hand its
             | model of relationships is too sophisticated to just
             | copy/paste monkey and iguana photos from its library, and
             | on the other hand it's not sophisticated enough to always
             | treat "monkey touching iguana" as implying that the monkey
             | and the iguana are discrete animals. (An interesting
             | contrast with it being generally praised for being
             | remarkably good at things like putting hats on animals'
             | heads...)
        
               | XorNot wrote:
               | On the other hand if you posted "monkey touching iguana"
               | as a prompt on an internet forum - say a photoshop
               | contest - you would get essentially all of these variants
               | from actual humans as well when the task delved into
               | being "meta".
        
               | notahacker wrote:
               | People are certainly _capable_ of wilfully
               | misinterpreting the prompt because they consider monkey
               | /iguana chimeras to be amusing to draw, or think that
               | semi-abstract hybrid animals with a caption musing on the
               | meaning of "touching" is a higher form art than mere
               | representation. And in a photoshop competition they might
               | all copy one silly interpretation because they think it's
               | meme-worthy.
               | 
               | But I don't think if you asked 18 humans to independently
               | draw "monkey touching iguana" you'd get 17 pairs of
               | monkey/iguana hybrids mostly not touching each other
               | against photographic backgrounds often featuring human
               | limbs and one apparently normal monkey being pursued by a
               | giant iguana!
        
             | bpodgursky wrote:
             | Let's not play word games.
             | 
             | Do _you_ know what a giraffe _is_? No, you just know what a
                | giraffe looks like, where it lives, and maybe that it's
             | vaguely related to a horse.
        
               | axus wrote:
               | I do not _grok_ a giraffe in the same way I do the things
               | I work on.
        
               | shafoshaf wrote:
               | That last bit about the horse is pretty loaded. Knowing
               | what a giraffe is means knowing that it is a mammal, has
               | a blood system, eats leaves, reacts certain ways to
               | certain stimuli, etc... That is a lot more than just
               | knowing what it looks like. Image recognition is not the
                | same as relationships or knowledge.
        
               | influxmoment wrote:
               | > mammal, has a blood system, eats leaves, reacts certain
               | 
               | DALL-E likely can map all those concepts to a giraffe
               | also.
        
               | karatinversion wrote:
               | By your criteria, no human before the 18th century (when
               | Carl Linnaeus invented the class "mammalia") knew what a
               | giraffe was.
        
               | usrusr wrote:
                | And in the 25th century, if we manage to keep technology
                | alive, the dall-es of the day will be trained on more
                | computer depictions of long-extinct animals that
                | accidentally made it into the training set than pictures
                | of real animals. It will be Dürer's rhinoceros all over
               | again.
        
               | orlp wrote:
               | Now ask DALL-E "giraffe feeding young", "giraffe organs",
               | "giraffe eating", etc, I think you'd find most knowledge
               | is there.
        
               | teolandon wrote:
               | What about ,"giraffe laying eggs"?
        
               | IIAOPSW wrote:
               | That's not a word game at all. To know what a giraffe is
               | is to be able to understand the implications of being
               | informed that something is a giraffe. Dalle would never
               | assume on its own that "a giraffe close to a low flying
               | helicopter" could be very dangerous for the giraffe, but
               | "a horse close to a low flying helicopter" is safe.
        
               | nextaccountic wrote:
               | > Dalle would never assume on its own that "a giraffe
               | close to a low flying helicopter" could be very dangerous
               | for the giraffe, but "a horse close to a low flying
               | helicopter" is safe.
               | 
                | Are you sure? (Can someone run those prompts?)
               | 
               | Dall-e might be able to make those relationships in the
               | latent space
        
               | gattilorenz wrote:
               | > Are you sure?
               | 
               | Considering Dall-e has problems painting "a red cube on
               | top of a blue cube" [1] and all kind of simple spatial
               | relations, I'd say it's a fair shot.
               | 
               | [1] As reported by OpenAI, but there are also some
               | prompts by Gary Marcus et al.
               | (https://arxiv.org/abs/2204.13807) showing this, and it's
               | trivially simple to find other very simple cases like
               | these
        
               | Earw0rm wrote:
               | Not really close to horses at all. It's an even-toed
               | ungulate, closer to deer, cows and sheep. Many, many tens
                | of millions of years from horses... the even-toed
               | herbivores are closer to dolphins (!!) than they are to
               | horses or rhinos.
        
               | leereeves wrote:
               | You know a lot more than that. You know that it eats,
               | sleeps, runs, mates. You know that there are male and
               | female giraffes and what that means. You know that it
               | dies if it gets too hot, too cold, if it falls off a
               | cliff. You know that it has legs, a head, neck, etc and
               | you probably know that it has a heart, brain, lungs,
               | kidneys, etc inside.
               | 
               | You know, vaguely, how it interacts with other things and
               | what it's likely to do around a fruit tree, or a lion, or
               | fire.
               | 
               | Even if you've never been close to a giraffe, you can
               | probably imagine what it looks like from close enough to
               | see individual hairs in its fur.
               | 
               | A lot of knowledge is still missing from ML systems that
               | don't interact with the world.
        
               | hypertele-Xii wrote:
               | It's even deeper than that. Humans and giraffes are
               | mammals, _very_ close cousins in terms of evolutionary
               | biology. The very nature of your experience of reality is
               | essentially shared with the giraffe.
               | 
               | Part of our brains are _lizard,_ both yours and the
                | giraffe's. Tech so ancient that it shares the same
                | circuits and chemicals with _crustaceans._
               | 
               | You can imagine what existence is like for a giraffe with
               | pretty much 99% accuracy without consciously knowing a
               | single thing about it.
               | 
               | A word-based image generator cannot.
        
               | kilroy123 wrote:
               | What do you mean by "is"? I've spent a fair amount of
               | time with giraffes in real life. Does that mean I know
               | what one is?
        
               | mym1990 wrote:
               | I would really hope so. All of this conversation is
               | devolving into a philosophical argument about 2 letter
               | words. To know what something is relates to understanding
               | some of its properties, and how it might interact with
               | the world around it. Whether DALL-E knows what something
               | is, is debatable. Whether you know what something is
               | should be pretty clear.
        
           | romellem wrote:
           | If this was around in 2009 it would have made the
           | https://www.onemilliongiraffes.com/ project much easier.
        
           | SketchySeaBeast wrote:
           | > It actually knows what a giraffe is.
           | 
            | Every time I see a painting it does with blank beige swirls
           | where faces should be I question that it does know what
           | anything is or what the important parts are.
        
             | LordDragonfang wrote:
             | I suspect you're thinking of Craiyon (formerly DALL-E Mini)
             | which is a derivative (but distinct) project, and
             | consistently does the beige swirls. DALL-E 2 proper is much
             | higher fidelity, and seems much more reliable at generating
             | passable (if imperfect) faces from what I've seen.
        
           | rob74 wrote:
           | What's still missing, as the article mentions, is the correct
           | relationship between objects. As in this post from a few
           | months ago:
           | https://astralcodexten.substack.com/i/57243633/the-tenth-
           | vir... - it knows that Tycho Brahe was a bearded guy, and it
           | knows what a telescope looks like, but when you ask for
           | "Tycho Brahe looking through a telescope", the person
           | sometimes looks _at_ the telescope, or _past_ it, sometimes
           | the telescope just sits there, but the person never actually
           | looks _through_ it.
        
           | amelius wrote:
           | Reminds me of a classifier that couldn't tell the difference
           | between a panther and a couch with a panther-print.
        
             | dane-pgp wrote:
             | But what if the couch wasn't panther-print but made from
             | actual panther hide? Is a dead panther still a panther? How
             | much of the panther's body has to be visible for the object
             | depicted to still be a panther?
             | 
             | I'm reminded of the classic joke about the mathematician on
             | a train who sees a sheep...
             | 
             | https://bestjokehub.com/joke-103626/
        
           | blooperdeoo wrote:
           | " It actually knows what a giraffe is."
           | 
           | No. You know what a giraffe is, Dall*E simply creates pixel
           | groups which correlate to the text pattern you submitted.
           | 
              | Watching people discuss a logical mirror makes me worry
              | that most people are not themselves conscious.
        
             | rhdunn wrote:
             | There's an interesting Google TechTalk on YouTube called
             | "The Next Generation of Neural Networks" from 2007 [1]. In
             | that video, there is a demo [2] that showcases the neural
              | network recognizing numbers when given a drawing of a
              | number as input, but more interesting is the follow-on,
              | where the demo fixes the output to a given number and
              | lets the neural network generate the "input" image,
              | showing what it thinks that number can look like. That is a
             | strong indication to me that that particular neural network
             | has a good understanding of what the particular number
             | glyphs look like, even if it does not know what those are
             | conceptually or how they relate to each other and to
             | mathematics -- that is, that the neural network would not
             | be able to work out what number 42 is, how it relates to 37
             | (i.e. 37 < 42), or how to manipulate those numbers (e.g. 2
             | + 7 = 9).
             | 
             | Dall-E will likely be similar in that it is effectively
             | doing that perception step where you fix the text
             | description from the classifier output and run that in
             | reverse to show what the neural network is "seeing" when it
             | is "thinking" about that given output. So it won't be able
             | to describe features of a giraffe, or information about
             | where they live, etc. but it will be able to show you what
             | it thinks they look like.
             | 
             | [1] https://www.youtube.com/watch?v=AyzOUbkUf3M [2]
             | https://youtu.be/AyzOUbkUf3M?t=1293
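              | 
              | (For the curious, a minimal sketch of that
              | "fix the output, generate the input" step -
              | done here with gradient ascent on the input
              | rather than the talk's RBM sampling; the
              | untrained toy MLP is only a stand-in for a
              | trained digit classifier.)
              | 
              |   import torch
              |   import torch.nn as nn
              | 
              |   # stand-in for a trained MNIST classifier
              |   net = nn.Sequential(
              |       nn.Flatten(), nn.Linear(784, 256),
              |       nn.ReLU(), nn.Linear(256, 10))
              |   net.eval()
              | 
              |   digit = 3  # fix the output to this number
              |   img = torch.zeros(1, 1, 28, 28,
              |                     requires_grad=True)
              |   opt = torch.optim.Adam([img], lr=0.1)
              |   for _ in range(200):
              |       opt.zero_grad()
              |       # ascend the fixed class's logit so the
              |       # net "draws" what it thinks digit 3 is
              |       (-net(img)[0, digit]).backward()
              |       opt.step()
              |       with torch.no_grad():
              |           img.clamp_(0, 1)  # valid pixel range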
        
             | gojomo wrote:
             | There's a lot of people who make this same argument -
             | DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
             | rarely make convincing supporting arguments.
             | 
             | They just assert it as axiomatic, whistling-past all the
             | ways that they themselves - unless they believe in
             | supernatural mechanisms - are also the product of a finite
             | physical-world system (a biological mind) and a finite
             | amount of prior training input (their life so far).
             | 
             | I'm beginning to wonder if the entities making this
             | argument are conscious! It seems they don't truly
             | understand the issues in question, in a way they could
             | articulate recognizably to others. They're just repeating
             | comforting articles-of-faith that others have programmed
             | into them.
        
               | mherdeg wrote:
               | > I'm beginning to wonder if the entities making this
               | argument are conscious! It seems they don't truly
               | understand the issues in question, in a way they could
               | articulate recognizably to others. They're just repeating
               | comforting articles-of-faith that others have programmed
               | into them.
               | 
               | I took a quick look at the Stanford Encyclopedia of
               | Philosophy entry for philosophical zombies (
               | https://plato.stanford.edu/entries/zombies/ ) and I can't
               | see evidence of this argument having been seriously
               | advanced by professionals before. I think it would go
               | something like:
               | 
               | "Yes, we have strong evidence that philosophical zombies
               | exist. Most of the laypeople who discuss my line of work
               | are demonstrably p-zombies."
        
               | gfodor wrote:
               | Use twitter long enough and it's hard to not be convinced
               | that p-zombies exist - it may be that such zombies are
               | ephemeral though, and a transitory state of some
               | individuals.
        
               | FeepingCreature wrote:
               | Alternately, of course, conscious awareness is the
               | ephemeral state.
               | 
               | ("A laser is trying to find the darkness...")
        
               | martindbp wrote:
               | Yes, and DALLE clearly knows what a giraffe is, maybe not
               | in the same way as we do, but it has placed the giraffe
               | in some latent space close to other animals like it but
                | far away from, say, cars. But yes, there is plenty it
               | doesn't know about giraffes because it's only been fed
               | images and captions, while we can watch documentaries
               | about giraffes and apply knowledge we've acquired about
               | other animals and the world through our senses.
        
               | drdeca wrote:
               | I think I'd say that it knows what a giraffe _looks like_
               | and some aspects of their behavior. I'm not sure if it
               | (in a sense) "knows that a giraffe is an animal" though I
               | wouldn't be too surprised if it does.
               | 
               | Though, I guess maybe that is all "knowing what a giraffe
               | is" is?
        
               | usrusr wrote:
               | I know what a giraffe looks like and some aspects of
               | their behavior. I couldn't build a giraffe, I've never
               | been a giraffe, I wouldn't even know what to talk about
               | if I met a giraffe on a blind date.
        
               | saynay wrote:
               | At some point, these discussions always get to some
               | question like "what is 'knowing'?"
               | 
               | Would a person who only ever read articles and looked at
               | pictures of giraffes have a better understanding of them
               | than Dall-e does? At some level, probably, in that every
               | person will have a similar lived experience of _being_ an
               | animal, a mammal, etc. that Dall-e will never share. Is
               | having a lesser understanding sufficient to declare it
               | has no real understanding?
        
               | [deleted]
        
               | joe_the_user wrote:
                | _There's a lot of people who make this same argument -
               | DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
               | rarely make convincing supporting arguments._
               | 
               | I suggest you look at the parent article.
               | 
               | Defining "understanding" in the abstract is hard or
               | impossible. But it's easy to say "if it can't X, it
               | couldn't possibly understand". Dall-E doesn't manipulate
                | images three-dimensionally, it just stretches images with
                | some heuristics. This is why the images shown for "a cup
                | on a spoon" don't make sense.
               | 
               | I think this is a substantial argument and not hand-
               | waving.
        
               | gojomo wrote:
               | In many of its outputs, it's doing far more than
               | stretching, and essentially rotating new entities into
               | scene-appropriate relationships.
               | 
               | True, it has some problems fully abstracting, and then
               | logically-enforcing, some object-to-object relationships
               | that most people are trivially able to apply as
               | 'acceptance tests' on candidate images. That is evidence
               | its scene-understanding is not yet at human-level, in
               | that aspect - even as it's exceeded human-level
               | capabilities in other aspects.
               | 
               | Whether this is inherent or transitory remains to be
               | seen. The current publicly-available renderers tend to
               | have a hard time delivering requested meaningful text in
               | the image. But Google's PARTI claims that simply growing
               | the model fixes this: see, for example:
               | https://twitter.com/gojomo/status/1540095089615089665
               | 
               | We also should be careful using DALL-E as an accurate
               | measure of what's possible, because OpenAI has
               | intentionally crippled their offering in a number of ways
               | to avoid scaring or offending people, under the rubric of
               | "AI safety". Some apparent flaws might be intentional, or
               | unintentional, results of the preferences of the
               | designers/trainers.
               | 
               | Ultimately, I understand the practicality of setting
               | tangible tests of the form, "To say an agent
               | 'understands', it MUST be able to X".
               | 
               | However, to be honest in perceiving the rate-of-progress,
               | we need to give credit when agents defeat all the point-
               | in-time MUSTs, and often faster than even optimists
               | expected. At that point, searching for new MUSTs that
               | agent fails at is a valuable research exercise, but
               | retroactively adding such MUSTs to the definition of
               | 'understanding' risks self-deception. "It's still not
               | 'understanding' [under a retconned definition we
               | specifically updated with novel tough cases, to comfort
               | us about it crushing all of our prior definition's
               | MUSTs]." It obscures giant (& accelerating!) progress
               | under a goalpost-moving binary dismissal driven by
               | motivated-reasoning.
               | 
               | This is especially the case as the new MUSTs increasingly
               | include things many, or most, humans don't reliably do!
               | Be careful who your rules-of-thumb say "can't possibly be
               | conceptually intelligent", lest you start unpersoning lots
               | of humanity.
        
               | joe_the_user wrote:
               | I read a modest amount of popular "AI safety" text
               | (lesswrong basically) and this is the first time I've
               | heard of Dall-E being crippled for safety reasons.
               | 
               | Your argument overall seems to be "you skeptics keep
               | moving the bar, give me a benchmark I can pass and I'll
               | show you", which seems reasonable on its face but I
               | don't think it actually works.
               | 
               | The problem is that while an algorithm may be defined by
               | theory and tested by benchmark, the only "definition" we
               | have for general intelligence is "what we can see
               | people doing". If I or anyone had a clear, accepted
               | benchmark for general intelligence, we'd be quite a bit
               | further towards creating it, but we're not there.
               | 
               | That said, I think one thing that current AIs lack is an
               | understanding of their own processing and an understanding
               | of the limits of that processing. And there are many
               | levels of this. But I won't promise that if this problem
               | is corrected, I won't look at other things. IDK,
               | achieving AGI isn't just passing some test; no
               | reason it should be like that.
        
               | weego wrote:
               | I guess ironically you've just recreated the argument
               | people of faith use to shout down logic used to question
               | the reality of their belief system.
               | 
               | The burden of proof is not on the one claiming logically
               | consistent interpretations of events.
        
               | gojomo wrote:
               | Huh? That's a bit abstract, can you clarify?
        
               | mlcrypto wrote:
               | Yep, the human brain is just neurons firing or chemicals
               | activating
        
               | meroes wrote:
               | And don't blame others for not finding this satisfactory
               | either. Many, many mathematicians think abstract objects
               | exist outside of physical reality. Chomsky says
               | "physical" hasn't been well defined for 200 years. And
               | finite physical processes do not seem adequate to explain
               | the infinite character of language and mathematical
               | infinity. Or if they are, then go inform the mathematical
               | realists that infinity isn't real because all proofs are
               | finite; Gödel himself believed in actual infinity.
        
               | bglazer wrote:
               | I can give a random string of letters as a prompt and
               | DALLE will generate coherent images based on that. To me,
               | that is as clear a signal as any that there is no
               | reasoning or even a consistent world model embodied in
               | DALLE. It's simply a high dimensional latent mapping
               | between characters and pixels. Like OP said, that is a
               | stupendous achievement, but it is just a very complex and
               | impressive mirror. If it wasn't just a mapping between
               | characters and pixels, and instead DALLE had intelligence
               | that "understood" the symbols it manipulated, then I
               | would expect it to generate nothing, or white noise in
               | response to random letters.
        
               | elcomet wrote:
               | I agree there's no high level reasoning (if there's a way
               | to define this).
               | 
               | But saying "It's simply a high dimensional latent mapping
               | between characters and pixels" is clearly a very bad
               | argument. Your brain is simply a high dimensional latent
               | mapping between your sensory input and your muscular
               | output. This doesn't make you not intelligent.
        
               | thfuran wrote:
               | >Your brain is simply a high dimensional latent mapping
               | between your sensory input and your muscular output
               | 
               | It definitely does more than that.
        
               | elcomet wrote:
               | What do you mean? Your brain _is_ a high dimensional
               | function. I'm not talking about what it is able to do,
               | I'm talking about what it is.
        
               | gojomo wrote:
               | If you give humans gobbledygook input - say a Rorschach
               | blot, or some random syllables - and ask them to describe
               | what it makes them think of, or generate art from it,
               | many will do so quite competently.
               | 
               | They'll find some morsels of fragmentary hints-of-meaning
               | in the junk, or just act from whatever's bouncing around
               | in their own 'ground state', and make something
               | interesting & coherent, to please their interlocutor.
               | 
               | So I don't see why this corner-case impugns the level-of-
               | comprehension in DALLE/etc - either in this specific case,
               | or in the other cases where meaningful input produces
               | equally-meaningful responses.
               | 
               | In what ways are you yourself not just a "very complex &
               | impressive mirror", reflecting the sum-of-all external-
               | influences (training data), & internal-state-changes,
               | since your boot-up?
               | 
               | Your expectation that random input should result in noise
               | output is the weird part to me. _People_ can see all sorts
               | of omens & images in randomness; why wouldn't AIs?
               | 
               | But also: if you trained that expectation into an AI, you
               | could get that result. Just as if you coached a human, in
               | a decade or 2 of formal schooling, that queries with less
               | than a threshold level of coherence should generate an
               | exceptional objection, rather than a best-guess answer,
               | you could get humans to do so.
        
               | bglazer wrote:
               | I think most people would find it more difficult to draw
               | a "serkle nufding" than a "spotted dog". At the very
               | least, they would need an explanation that you're looking
               | for whatever pops into their head for the nonsense.
               | Again, that indicates that human intelligence works in a
               | fundamentally different way from DALLE, which has
               | absolutely no idea what is nonsense. So, in this
               | incredibly important task (recognizing nonsense) there's
               | an asymmetry: people can act like DALLE, but DALLE can
               | not act like humans. To me, that totally breaks any
               | equivalence between DALLE and human intelligence.
        
               | dtech wrote:
               | Then again, if you lock a human in a room, tell them "draw
               | me things based on the notes you receive or you will be
               | killed", and give them a note "serkle nufding", you bet the
               | human will draw _something_.
               | 
               | Given the model has only 1 input and 1 output and
               | training is essentially surviving that order, it's not
               | dissimilar.
        
               | gojomo wrote:
               | I've not tested this, but I suspect you can get DALLE to
               | create pictures that humans are more likely to describe
               | as 'nonsense' by adding 'nonsense' or 'nonsensical' to
               | the prompt. That'd indicate that it absolutely _does_
               | have an idea about 'nonsense' and can recognize, &
               | reproduce within its constrained output, 'nonsense'
               | that's largely compatible with human ideas of what
               | 'nonsense' is.
               | 
               | Asking it to produce _noise_, or raise an objection that
               | a prompt isn't sufficiently meaningful to render, is a
               | silly standard because it's been designed, and trained,
               | to always give _some_ result. Humans who can object have
               | been trained differently.
               | 
               | Also, the GPT models - another similar train-by-example
               | deep-neural architecture - can give far better answers,
               | or give sensible evaluations of the quality of their
               | answers, _when properly prompted to do so_. If you wanted
               | a model that'd flag nonsense, just give it enough
               | examples, and enough range-of-output where the answer
               | you're demanding is even _possible_, and it'll do it.
               | Maybe better than people.
               | 
               | The circumstances & limits of the single-medium (text, or
               | captioned image) training goals, and allowable outputs,
               | absolutely establish that these are _different_ from a
               | full-fledged human. A human has decades of reinforcement-
               | training via multiple senses, and more output options,
               | among other things.
               | 
               | But to observe that _difference_ and conclude these
               | models don't "understand" the concepts they are so
               | deftly remixing, or are "just a very complex and
               | impressive mirror", does not follow from the mere
               | difference.
               | 
               | In their single-modalities, constrained as they may be,
               | they can train the equivalent of a million lifetimes of
               | reading, or image-rendering. Objectively, they're
               | arguably now better at composing college-level essays, or
               | rendering many kinds of art, than most random humans
               | picked off the street would be. Maybe even better than
               | 90% of all humans on earth at these narrow tasks. And,
               | their rate of improvement seems only a matter of how much
               | model-size & training-data they're given.
               | 
               | Further: _the narrowness of the tasks is by designers'
               | choice, NOT inherent to the architectures._ You could
               | train - and active projects are training - similar
               | multi-modality networks. A mixed GPT/DALLE that renders
               | essays with embedded supporting pictures/graphs isn't
               | implausible.
        
               | londons_explore wrote:
               | I just got Dalle-2 to draw a "serkle nufding". It gave me
               | 4 pictures of rather odd looking food. I too assume that
               | random words are usually food, if only because restaurant
               | menus seem to have such a love of making up foreign
               | sounding words.
        
               | dane-pgp wrote:
               | > a love of making up foreign sounding words
               | 
               | One great example of this phenomenon is "Haagen-Dazs".[0]
               | 
               | Admittedly that's a brand name, rather than a specific
               | dish, but I assume that Dalle-2 would generate an image
               | of ice cream if given a prompt with that term in it
               | (unless there is a restriction on trademarks?).
               | 
               | [0] https://funfactz.com/food-and-drink-facts/haagen-
               | dazs-name/
        
               | caconym_ wrote:
               | When I read the words "serkle nufding" I pictured in my
               | head, with zero perceptible delay, a weird little goblin
               | guy with pale green skin and long ears and a pointy hat.
        
               | sarchertech wrote:
               | Serkle Nufding sounds like a word. What about
               | "xywjlkbtywlsjjsnskksbrjsksbsysjsnsjdkdmdnd"?
        
               | caconym_ wrote:
               | Honestly? The top half of a signpost bearing the name of
               | a Welsh village in white on green (likely an artifact of
               | my actual local signage standards). The tops of some
               | broadleaf deciduous trees, possibly ash, are visible in
               | the background, but the perspective is from below (as one
               | usually looks up at a street sign, also probably an
               | artifact of my local experience with signs rather than a
               | true representation of Welsh signage) and so the
               | background is dominated by a blue sky.
               | 
               | This is probably because your string is a low-entropy
               | keyboard-mash.
        
               | simondw wrote:
               | > they would need an explanation that you're looking for
               | whatever pops into their head
               | 
               | Of course. And the equivalent of that explanation is
               | baked into DALL-E, in the form of its programming to
               | always generate an image.
               | 
               | > but DALLE can not act like humans
               | 
               | No, not generally, but I don't think anyone has claimed
               | that.
        
               | bglazer wrote:
               | What else always produces an image, no matter what you
               | present to it?
        
               | simondw wrote:
               | Lots of things, but I'm not sure why you're asking that.
               | Could you connect that back to the point you're making?
        
               | saynay wrote:
               | I am not really sure that this is a basis to say human
               | intelligence is fundamentally different. If you ask a
               | human to draw a picture based on nonsensical input, they
             | probably won't. That might not be a difference in process,
               | but in agency.
               | 
               | So is a single-purpose AI equivalent to the entirety of
               | the Human Experience? Of course not. But can it be
               | similar in functionality to a small sliver of it?
        
               | IIAOPSW wrote:
               | I've done my share of testing randomish prompts. Here's
               | two examples that might change your mind slightly. The
               | first example is "E sublimary widge fraus chronly becross
               | yon corpti." I constructed this sentence to resemble the
               | grammar, syllables and pacing of English, but it is
               | complete nonsense. The output was both consistent and
               | coherent. In the case of DallE, it generated church and
               | saint related images. The second example was ".- / ... .
               | -- .- .--. .... --- .-. . / .-.. .- -- .--. / .- -... ---
               | ...- . / - .... . / -.-. .-.. .- ... ... --- -. / .- ...-
               | . -. ..- . / ... ..- -... .-- .- -.-- / ... - .- - .. ---
               | -." There are clearly not enough (any?) samples of morse
               | code text in the training. The output was wholly
               | unrelated to the prompt. The output is kind of like
               | doodling. Its fairly random, fairly undirected without
               | any real theme, but its not the same randomness as TV
               | static. Its more like randomness from a distribution of
               | things people would draw when bored.
               | 
               | So the interesting part is this: why did one random
               | prompt fail in a consistent way and the other in a random
               | way? Perhaps the encoding of meaning into vocabulary has
               | patterns to it that we ourselves haven't noticed. Maybe
               | your random string experiment works because there is some
               | amount of meaning in the syllables that happened to be in
               | your chosen string.
        
               | rhdunn wrote:
               | I suspect it is designed to describe images, like you
               | would find in alt text but run in reverse. As your first
               | example is closer to English it can do a better job than
               | with the second example, like how the Jabberwocky poem
               | contains nonsense words but still makes some degree of
               | sense.
               | 
               | For example, the "yon" is immediately recognizable to me
               | (hither and yon), so "yon corpti" could mean a distant
               | corpti (whatever a corpti is). "becross" looks similar to
               | "across" but with a be- prefix (be-tween, be-neath, be-
               | twixt, etc.), so could be an archaic form of that.
               | "chronly" could be something time related (chronos+ly).
               | etc...
        
               | gojomo wrote:
               | Indeed - the 'cadence of English' prompt reminds me of
               | Latin, or some other foreign-litany that's nonetheless
               | repeated to an audience that's not following every word.
               | And ~IIAOPSW has used real word-roots, hinting at
               | meanings! (The FastText algorithm does a passable job at
               | synthesizing word-vectors for unknown words using such
               | fragments.) So rendering that as a religious scene isn't
               | too surprising.
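               | 
               | As a toy illustration of that subword trick (a sketch
               | using gensim's FastText; the corpus and hyperparameters
               | here are made up):
               | 
               |     from gensim.models import FastText
               | 
               |     corpus = [["the", "priest", "crossed", "the",
               |                "chapel"],
               |               ["a", "sublime", "corpus", "of",
               |                "chronology"],
               |               ["across", "the", "liturgy", "and",
               |                "the", "cross"]]
               | 
               |     # min_n/max_n set the char n-gram range used
               |     # to build vectors for unseen words.
               |     model = FastText(vector_size=16, window=3,
               |                      min_count=1, min_n=3, max_n=5)
               |     model.build_vocab(corpus)
               |     model.train(corpus, total_examples=len(corpus),
               |                 epochs=20)
               | 
               |     # "corpti" was never seen, but it shares n-grams
               |     # with "corpus", so it still gets a vector.
               |     print(model.wv["corpti"][:4])
               |     print(model.wv.similarity("corpus", "corpti"))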
               | 
               | That the morse code gives nothing useful probably just
               | indicates that some combination of (a) few morse
               | transcripts in the training set and (b) punctuation-
               | handling in training or prompting makes it more opaque.
               | It's opaque to me, other than recognizing it's morse code.
        
               | semi-extrinsic wrote:
               | Your first random prompt is far from random. It contains
               | the fragments "sublim", "chr", "cross" and "corpt" in
               | addition to the isolated "E", which all project the
               | solution down towards Latin and Christianity.
               | 
               | You can easily see that these language models are in some
               | sense working on fragments as much as they are on the
               | actual words isolated by spaces in your sentence. Just
               | take a test sentence and enter it as a prompt to get some
               | images. Then take that same sentence, remove all spaces
               | and add new spaces in random locations, making gibberish
               | words. You will see that the results will retain quite a
               | few elements from the original prompt, while other things
               | (predominantly monosyllables) become lost.
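               | 
               | If you want to try it, a throwaway helper like this
               | (hypothetical Python, nothing DALL-E-specific) does the
               | re-spacing:
               | 
               |     import random
               | 
               |     def respace(prompt: str, seed: int = 1) -> str:
               |         """Strip spaces, re-insert them at random."""
               |         rng = random.Random(seed)
               |         chars = [c for c in prompt if c != " "]
               |         cuts = set(rng.sample(range(1, len(chars)),
               |                               prompt.count(" ")))
               |         return "".join((" " + c) if i in cuts else c
               |                        for i, c in enumerate(chars))
               | 
               |     print(respace("a red cup balanced on a spoon"))
               |     # e.g. "are dc upbal anc edonas po on"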
               | 
               | So far, I have not seen a single example that cannot just
               | be explained by saying this is all just linear algebra,
               | with a mind-bogglingly huge and nasty set of operators
               | that has some randomness in it and that projects from the
               | vector space of sentences written in ASCII letters onto a
               | small subset of the vector space of 1024x1024x24bit
               | images.
               | 
               | If you then think about doing this just in the "stupid
               | way", imagine you have an input vector that is 4096 bytes
               | long (in some sense the character limit of DALL-E 2) and
               | an output vector that is 3 million bytes long. A single
               | dense matrix representing one such mapping has about 12
               | billion parameters - but you want something very sparse
               | here, since you know that the output is very sparse in
               | the possible output vector space. So let's say you have a
               | sparsity factor of somewhere around 10^5. Then with the
               | 3.5 billion parameters of DALL-E 2, you can "afford"
               | somewhere around 10^4 to 10^5 such matrices. Of course
               | you can apply these matrices successively.
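               | 
               | Spelled out (back-of-envelope, with the sparsity factor
               | as a pure guess):
               | 
               |     text_dim  = 4096             # prompt length cap
               |     image_dim = 1024 * 1024 * 3  # 24-bit pixels
               | 
               |     dense = text_dim * image_dim  # ~1.2e10 params
               |     sparsity = 1e5                # assumed factor
               |     per_matrix = dense / sparsity # ~1.2e5 each
               | 
               |     budget = 3.5e9                # DALL-E 2 params
               |     print(budget / per_matrix)    # ~3e4 matrices,
               |                                   # order 10^4-10^5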
               | 
               | Is it then so far-fetched to believe that if you thought
               | of those 10^5 matrices as a basis set for your
               | transformation, with a separate ordering vector to say
               | which matrices to apply in what order, and you then spent
               | a huge amount of computing power running an optimizer to
               | get a very good basis set and a very good dictionary of
               | ordering vectors, based on a large corpus of captioned
               | images, you would get something comparably impressive to
               | DALL-E 2?
               | 
               | When people are wowed that you can change the style of
               | the image by saying "oil painting" or "impressionist",
               | what more is that than one more of the basis set matrices
               | being tacked on in the ordering vector?
        
               | Joeri wrote:
               | Dall-e must generate an image, even for a random prompt
               | of letters, because it has been trained to do only that.
               | Take a human, hold a gun to their head, and force them to
               | paint an image for a random string of letters, and I
               | guarantee they will make you something.
               | 
               | Retrain dall-e and give it a choice whether it generates
               | an image or does something else, and you will get a
               | different outcome.
               | 
               | The argument boils down to this: is a human brain nothing
               | but a mapping of inputs onto outputs that loops back on
               | itself? If so the dall-e / gpt-3 approach can scale up to
               | the same complexity. If not, why not?
        
             | snapcaster wrote:
             | How do you know you know what a giraffe is?
        
               | logifail wrote:
               | > How do you know you know what a giraffe is?
               | 
               | Perhaps a more interesting question could be: [how] do we
               | know what consciousness is?
        
               | twoxproblematic wrote:
        
             | gfodor wrote:
             | Not exactly. If you're using the word "simply" you probably
             | don't quite understand what exactly DALL-E is doing. It's
             | not just mapping correlations from words to pixels. The
             | intermediate representations and textual attention are what
             | make DALL-E interesting.
        
             | jvanderbot wrote:
             | Precisely this! You don't know what a giraffe is, either.
             | But you can talk about a giraffe to yourself, using words
             | with correlating imagery that you can "see" by feeding
             | through your visual cortex, and that's close enough.
        
             | EricBurnett wrote:
             | Thought experiment: between you and the AI, which would do
             | a better job depicting a giraffe skeleton? A giraffe in
             | its natural habitat? Their favorite tree to eat? Species
             | on the genetic tree closest to giraffes?
             | 
             | If we assume this AI or a successor can win that
             | evaluation, in what way would you say you know what a
             | giraffe is better than the AI?
        
             | adamsmith143 wrote:
             | Comprehension does not imply consciousness or vice versa...
        
             | chaxor wrote:
             | There are far too many that try to use this argument
             | without really knowing what they're talking about.
             | 
             | *Don't mistake skepticism for knowledge*
             | 
             | This is a major problem on this site and elsewhere.
        
             | elcomet wrote:
             | > You know what a giraffe is, Dall*E simply creates pixel
             | groups which correlate to the text pattern you submitted.
             | 
             | How would you tell the difference though? Can you think of
             | a test to distinguish between those two abilities?
        
           | ramoz wrote:
           | Because you've mentioned Google, here's this:
           | https://imagen.research.google/
        
         | danielmorozoff wrote:
         | Where do you see DALLE automating away jobs?
        
           | soderfoo wrote:
           | Augmenting jobs is more likely, from what I have heard. It
           | would be useful for rapid prototyping of artwork and could
           | help designers get design specs to artists quicker.
           | 
           | The initial impressions are that production-ready designs
           | will require an artist's touch. But this discussion was in the
           | context of AA-Games and up. For indie and small games fully
           | AI produced art may be fine.
        
           | robertlagrant wrote:
           | There was a post on here recently on how someone designed
           | their logo by asking DALL-E a load of questions. The
           | rationale could be that it might not remove that job, but it
           | would certainly be a powerful tool for a logo designer to
           | take on way more clients, thus driving down the cost of logo
           | design.
        
             | lancesells wrote:
             | If I'm thinking of the same post that logo is poorly done /
             | chosen. I know the author wanted very specific imagery in
             | it and achieved that but I don't think that logo would be
             | considered "good" by most.
        
               | arbitrage wrote:
               | That was the first instance. These sorts of things only
               | get better over time.
        
             | odessacubbage wrote:
             | i feel like this boils down to a client-side
             | misunderstanding of where graphic designers actually spend
             | the majority of their time. i'd actually argue that working
             | from spec or even a crude doodle on the back of a napkin is
             | a lot easier than dealing with a client who is married to a
             | poor design that DALL-E puked out for them. one of the most
             | important things about the sketch phase is to be able to
             | iterate without too many assumptions while also being able
             | to recognize and play upon the strengths of individual
             | drawings. this is not a time consuming process but it also
             | isn't something you want to rush any more than you'd want
             | to rush the plastic surgeon performing your nose job.
             | depersonalizing the design process in favor of volume does
             | not particularly serve anyone and, moreover, it responds to
             | a need that i don't really think exists. this is not to say
             | that i don't think these kinds of image generators are
             | without uses, but right now we are still in the phase where
             | people are being impressed by graphical fidelity [which was
             | already incredibly cheap] rather than usefulness. imo this
             | stage of ai will primarily work well for tasks that benefit
             | from randomness while also being relatively low-stakes in
             | the overall production pipeline. training a network to
             | generate typefaces or to randomize npc portraits for an
             | open world game is precisely the sort of area where 'gluing
             | things together' can potentially provide more benefits than
             | headaches.
        
               | robertlagrant wrote:
               | I was saying a designer could use it for first draft
               | ideas, not their client.
        
           | laxd wrote:
           | I just cancelled my plan of paying a graphics designer to
           | prepare an android app (personal project) for launch. After
           | playing with dall-e I'm confident that "I" can make a logo
           | and some artwork "myself". Edit: meant to say icon, not logo.
        
           | IshKebab wrote:
           | The people who do digital art illustrations for magazine
           | articles are surely out of a job?
        
           | elondaits wrote:
           | I think it'll start by removing the need for editorial
           | illustration and photography for a lot of blogs and websites,
           | starting with the less reputable ones.
           | 
           | MidJourney is already good enough at creating illustrations
           | for articles, book covers, etc. Not something that The New
           | Yorker will be interested in, but better than anything you can
           | buy for a couple of dollars or a small monthly subscription.
        
             | notahacker wrote:
             | > I think it'll start by removing the need for editorial
             | illustration and photography for a lot of blogs and
             | websites, starting with the less reputable ones.
             | 
             | Stock photo sites and Google Images and basic graphic
             | filters did that a _long_ time ago. Sure, DALL-E has
             | certain creative possibilities that those don't, and the
             | niches like satirical memes or "first pass" concepts in
             | agency brainstorming meetings where it could be absolutely
             | revolutionary tend to not be deliverables the creators get
             | paid for. The publications giving illustrators a reliable
             | income are usually after their consistent style, fantasy
             | world coherence or novel ideas, which isn't exactly where
             | DALL-E shines.
        
               | Workaccount2 wrote:
               | DALL-E is the writing on the wall telling those with a
               | competing talent to start looking for other (non-art)
               | work they may want to do in the future. It would be an
               | egregious error to think that AI art programs are some
               | new tool to learn akin to photoshop. Maybe that will hold
               | true for a few years at best.
               | 
               | There will come a point, much sooner than later, where
               | the value in typing prompts into an AI is going to only
               | be worth minimum wage.
        
               | notahacker wrote:
               | Substitute "the camera" (and perhaps even more "the
               | gramophone") for "DALL-E" and "AI art programs", and the
               | 19th century wants its argument against the future of
               | creative media back.
        
         | IshKebab wrote:
         | I agree. This paragraph is baffling:
         | 
         | > DALL-E's difficulty in juxtaposing wildly contrastive image
         | elements suggests that the public is currently so dazzled by
         | the system's photorealistic and broadly interpretive
         | capabilities as to not have developed a critical eye for cases
         | where the system has effectively just 'glued' one element
         | starkly onto another, as in these examples from the official
         | DALL-E 2 site:
         | 
         | Yes the public is so dazzled by this massive leap in capability
         | that it hasn't developed a critical eye for minor flaws.
         | 
         | Yeah we get it. It's not instantly perfect. But the fact that
         | people aren't moaning that it can't put a tea cup in a cylinder
         | isn't because everyone stupidly thinks it is perfect, it's
         | because not everyone is a miserable naysayer.
        
           | RC_ITR wrote:
           | "Don't people realize the sunset would be more beautiful if
           | you run it through saturation filters and do white
           | balancing?"
        
       | KingOfCoders wrote:
       | That's called creativity.
        
       | IIAOPSW wrote:
       | I have a phrase I'd like to coin in contrast to AI. "Artificial
       | Bullshit". AB.
       | 
       | I of course mean "bullshit" in the highly technical sense defined
       | by Frankfurt [1]. The defining feature that separates a
       | bullshitter from a liar is that a liar knows and understands the
       | truth and intentionally misrepresents the matters of fact to
       | further their aims, whereas a bullshitter is wholly unconcerned
       | with the truth of the matters they are discussing and is only
       | interested in the social game aspect of the conversation.
       | Bullshit is far more insidious than a lie, for bullshit can (and
       | often does) turn out to be coincident with the truth. When that
       | happens the bullshitter goes undetected and is free to infect our
       | understanding with more bullshit made up on the spot.
       | 
       | DallE generates the images it thinks you want to see. It is
       | wholly unconcerned with the actual objects rendered that are the
       | ostensible focus of the prompt. In other words, it's bullshitting
       | you. It was only trained on how to get your approval, not to
       | understand the mechanics of the world it is drawing. In other
       | words, we've trained a machine to have daddy issues.
       | 
       | A profoundly interesting question (to me) is if there's a way to
       | rig a system of "social game reasoning" into ordinary logical
       | reasoning. Can we construct a Turing Tarpit out of a reasoning
       | system with no true/false semantics, a system only designed to
       | model people liking/disliking what you say? If the answer is yes,
       | then maybe a system like Dalle will unexpectedly gain real
       | understanding of what it is drawing. If not, systems like Dalle
       | will always be Artificial Bullshit.
       | 
       | [1] http://www2.csudh.edu/ccauthen/576f12/frankfurt__harry_-
       | _on_...
        
         | adamsmith143 wrote:
         | >In other words, it's bullshitting you. It was only trained on
         | how to get your approval, not to understand the mechanics of
         | the world it is drawing.
         | 
         | A system can learn to do all kinds of interesting things by
         | trying to optimize getting rewards.
         | 
         | See: https://www.deepmind.com/publications/reward-is-enough
        
         | visarga wrote:
         | What you call bullshit I call imagination. Both humans and AI
         | need it. Humans use imagination to plan ahead. AlphaGo was
         | generating moves to plan ahead.
         | 
         | Dall-E and GPT-3 are not being used as agents, they are just
         | tool AIs. They have a narrow task - generating images and text.
         | Agents on the other hand need to learn how to act in the
         | environment, while learning to understand the world at the same
         | time.
        
           | elefanten wrote:
           | How did you get "imagination" out of gp's comment? They
           | described something quite clearly and it's not reasonably
           | summarized as "imagination".
           | 
           | Per gp, bullshit is cynically self-interested pontificating.
           | It's performance. Maybe you could say that the bullshit
           | produced is imaginative, sometimes. But it has nothing to do
           | with "imagination" as a simulation-like capability used for
           | planning and learning.
        
         | TremendousJudge wrote:
         | As far as I know, the human brain is just a "social game
         | reasoning" optimizer, that we try (and fail) to use to do
         | actual logical reasoning. The zillion cognitive biases we have
         | are the clue: we don't do logic, we have biases and sometimes
         | stumble upon logic.
        
         | [deleted]
        
         | gfodor wrote:
         | DALL-E either doesn't generate images you want to see, or if it
         | does, it does a bad job, because it generates many images you
         | don't want to see.
         | 
         | In other words, the claim you've set up is basically
         | unfalsifiable, given that there's no way to form strong
         | counterevidence from its outputs. (I would argue that if there
         | was, we'd already have it in the vast majority of outputs that
         | aren't images people want.)
         | 
         | If I were to refine what you're saying, it's that DALL-E is
         | constrained to generating images that _make sense_ to the human
         | visual system in a coherent way. This constraint is a far cry
         | from what you need to be able to lift it up to claim it is
         | "bullshitting" though, since this constraint is at a very low
         | level in terms of constraining outputs.
        
         | ___rubidium___ wrote:
         | I think you're right, but I would qualify that the AI is
         | bullshitting in the same way that a child's drawing of a stick
         | figure, house, and smiling sun is bullshit designed to get
         | approval. The AI is giving symbols--very visually stunning
         | ones, to be sure, but symbols nonetheless--of what it is
         | prompted to create, just like a child learns that "circle with
         | lines coming out of it" is a symbol that can be read as "sun"
         | and praised by adults.
        
           | IIAOPSW wrote:
           | The reason for my final question at the end is this: can AI
           | emerge out of AB? The child eventually grows up and
           | (hopefully) stops bullshitting everything. But, do they
           | actually stop bullshitting, or is the human ability to reason
           | itself somehow built out of bullshit?
        
             | ___rubidium___ wrote:
             | That's a very good question. The child eventually learns to
             | stop using symbols and instead begins to copy the shapes,
             | negative spaces, curves, etc. in front of them and makes an
             | assemblage of otherwise-meaningless shapes, lines, curves
             | which is interpreted by viewers as a representation of _the
             | real thing_. Does AI have the potential to do that? I
             | don't know.
        
             | IshKebab wrote:
             | Nobody grows out of your definition of bullshit - doing
             | things that they think others want to see.
             | 
             | But that definition is ... well bullshit. Bullshitting is a
             | deliberate deceptive act. Children aren't being
             | deliberately deceptive when they come up with nonsense
             | answers to questions they don't understand.
        
           | notahacker wrote:
           | I don't think the child's drawing of a stick figure, house
           | and smiling sun is just "bullshit designed to get approval"
           | though. I think they care deeply about the concept of mummy,
           | and how mummy relates to various things especially them and
           | have strong emotional impulses towards mummy. Learning that a
           | stick figure is a symbolic representation of mummy that
           | might meet with approval is a trivially small part of that
           | intrinsic interest in mummy, not the extent of their
           | engagement with the concept of mummy. And if you tell them
           | you don't approve of their drawing, they'll burst into tears
           | and possibly have a completely different attitude towards
           | drawing and/or you, not just iterate on the bullshit.
           | Similarly, the kid's concept of mummy being happy or sad is a
           | bit more nuanced than which way up the crescent under the two
           | dots should be....
        
           | [deleted]
        
         | thfuran wrote:
         | >Bullshit is far more insidious than a lie, for bullshit can
         | (and often does) turn out to be coincident with the truth. When
         | that happens the bullshitter goes undetected and is free to
         | infect our understanding with more bullshit made up on the
         | spot.
         | 
         | If the bullshit is turning out to be true, what's the issue
         | with more of it? If it's not true but still believed and so
         | causing problems, what's the practical difference between it
         | and an undetected lie that makes it more insidious?
        
           | IIAOPSW wrote:
           | A lie is anti-correlated with the truth. Bullshit is
           | uncorrelated with the truth. The problem with trusting a coin
           | flip because it happened to be right this time is that now
           | you trust the coin as a source of information, when really
           | it's a source of entropy. The lie will be found out just as
           | soon as the truth comes out, whereas the bullshitter might get
           | enough lucky flips in a row that people start taking their
           | advice seriously.
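           | 
           | A toy simulation makes the difference plain (made-up world
           | of coin-flip facts):
           | 
           |     import random
           |     random.seed(0)
           | 
           |     world = [random.random() < 0.5
           |              for _ in range(10000)]
           |     liar = [not w for w in world]  # anti-correlated
           |     bs = [random.random() < 0.5
           |           for _ in world]          # uncorrelated
           | 
           |     def hit_rate(claims):
           |         hits = sum(c == w for c, w
           |                    in zip(claims, world))
           |         return hits / len(world)
           | 
           |     print(hit_rate(liar))  # 0.0: flip it and you
           |                            # recover the truth
           |     print(hit_rate(bs))    # ~0.5: pure entropy,
           |                            # nothing to learn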
        
         | garyrob wrote:
         | I agree, but I disagree about one aspect. For the most part,
         | humans don't use reason all that much or all that deeply. We
         | usually use intuitive thinking, and there is research showing
         | that immediate intuitive responses are often better than the
         | result of long thinking. More negatively, is QAnon belief, or
         | even Trump election claim belief, about reason? Or is it about
         | associations between words and concepts, especially when those
         | concepts are believed in by the people the believer tends to
         | trust and associate with?
         | 
         | In other words, the takeaway here may not be that GPT-3 spews
         | bullshit. It may be that most of the time, human "thinking" is
         | a less-nuanced, biological version of GPT-3.
        
         | thaw13579 wrote:
         | To me, Dall-E seems analogous to a film production team that
         | produces visual imagery reflecting a script written by a
         | screenwriter. By the above reasoning, would that team be
         | producing "bullshit"? I think most people would think not,
         | because the goal isn't to communicate objective truth about the
         | world, rather something plausible, interesting, entertaining,
         | etc. (unless it is a documentary).
         | 
         | I also think distinguishing bullshit from lying depends heavily
         | on internal mental thoughts, goals, and intentions. Isn't
         | talking about Dall-E this way personification and ascribing
         | some level of consciousness?
        
           | croes wrote:
           | The production team knows what it puts together; Dall-E
           | seems not to.
        
             | thaw13579 wrote:
             | Exactly my second point. Producing "bullshit" requires some
             | internal thoughts and intentions, as the parent put it
             | "DallE generates the images it _thinks_ you want to see "
             | (emphasis mine). A very big claim, which is interesting to
             | consider, but probably premature.
             | 
             | However, I do think that Dall-E is able to learn complex
             | high-order statistical associations, i.e. beyond just
             | juxtaposing and visually blending objects. For a recent
             | example, this post with a prompt "ring side photo of
             | battlebots vs conor mcgregor":
             | 
             | https://twitter.com/weirddalle/status/1554534469129871365
             | 
             | What is amazing here is the excessive blood and gore. That
             | feature can't be found in any individual battlebot or MMA
             | match, but it is exactly what you would expect from robots
             | fighting a person. Pretty amazing, and I wonder at what
             | point we could consider this analytical reasoning.
        
         | dougmwne wrote:
         | I reached essentially the same conclusion after playing with
         | GPT-3 for a while. It spins out mountains of convincing and
         | impressive bullshit, but you can't actually trust anything it
         | says because it is disconnected from right, wrong, correct and
         | incorrect. Reading too much of what it outputs is dangerous
         | because it basically is feeding white noise into your
         | perception and experience of the world.
        
         | acoard wrote:
         | I'm a big fan of Frankfurt's "On Bullshit", and love the
         | reference.
         | 
         | I think there's one significant distinction between a normal
         | human bullshitter that Frankfurt originally envisioned, and the
         | AI practicing Artificial Bullshit. The bullshitter knows there
         | is truth and intentionally disregards it; whereas the AI is
         | blind to the concept. I guess this is "mens rea" in a sense,
         | the human is conscious of their guilt (even if they're
         | apathetic towards it), whereas DALL-E is just a tool that does
         | what it is programmed to do.
         | 
         | I do like this application of "bullshit" though, and will keep
         | it in mind going forward.
        
           | IIAOPSW wrote:
           | Interesting, but I have two follow-ups.
           | 
           | 1. What are the implications of intentionally disregarding
           | the existence of truth vs being blind to the concept? How
           | does this distinction you made manifest?
           | 
           | 2. Are you sure all humans actually believe in the concept of
           | truth, or could it be the case that some people genuinely
           | function on a principle of "there is no truth, only power"?
           | Is it possible to think "truth" and "current dominant
           | narrative" are one and the same?
           | 
           | I've certainly had a ton of luck with Bullshit in Diplomacy.
           | As Russia, I offered a plan that involved France helping me
           | take Munich and I would repay by supporting him against the
           | English invasion. Did I intend to actually follow through, or
           | was this a cunning lie? Neither. It was bullshit that got me
           | into Munich. I myself didn't know because (in game) I don't
           | believe in the concept of truth. Everything I say is true and
           | none of it is true. It's all true in the sense that it might
           | happen and I put some weight on it, none of it is true in the
           | sense that there is no branch of the future game tree
           | privileged as "the truth". Some truths have more weight than
           | others, but there is no underlying absolute truth value that
           | must exist yet I choose to ignore. Eventually the order box
           | forces me to pick a truth out of the many I have told. But
           | prior to being forced, it didn't exist.
           | 
           | Is it possible to think in this way all the time about
           | everything? Maybe.
        
           | samstave wrote:
           | > _conscious of their guilt_
           | 
           | This hits it for me:
           | 
           | Consciousness is kinda "being aware of the fact that you have
           | choices for available actions, and what impact these
           | actions/non-actions will have on either yourself, your
           | environment, the object of your action, or others."
           | 
           | Intelligence is being aware of the inputs and knowing the
           | (non)available list of actions to take.
           | 
           | Intelligence acts on stimuli/input/data?
           | 
           | Consciousness is awareness of one's own actions from
           | intelligence, and of others' acts from their standpoint of
           | intelligence or consciousness...
           | 
           | A yin/yang, subjective/objective sort of duality that Humans
           | make. (thought v emotion)
           | 
           | Dogs are both intelligent and conscious. They know guilt when
           | they are shamed or happiness when praised for intelligent
           | actions.
        
         | leereeves wrote:
         | > It is wholly unconcerned with the actual objects rendered
         | that are the ostensible focus of the prompt.
         | 
         | I disagree. To the extent that the training data are images of
         | actual objects, recreating images of actual objects is the only
         | thing DALL-E cares about.
         | 
         | If we define "caring" about something as changing behavior to
         | cause that to happen, then a neural network doesn't "care"
         | about anything during inference at all, because inference never
         | changes the network's behavior.
         | 
         | It also doesn't know or care about your approval. It only cares
         | about minimizing the loss function.
         | 
         | (But now that you bring this up, I think it would be really
         | interesting to create a network that, after training initially
         | on training data, began interacting with people and continued
         | training to maximize approval.)
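         | 
         | To make the training/inference split concrete, a minimal
         | PyTorch sketch with a toy stand-in model (the "approval"
         | ratings here are invented):
         | 
         |     import torch
         |     import torch.nn as nn
         | 
         |     model = nn.Linear(8, 1)  # toy stand-in generator
         |     opt = torch.optim.SGD(model.parameters(), lr=0.01)
         |     x = torch.randn(4, 8)
         | 
         |     with torch.no_grad():  # inference: weights frozen,
         |         y = model(x)       # nothing is being optimized
         | 
         |     # Hypothetical approval loop: ratings in [0, 1].
         |     rating = torch.tensor([[0.9], [0.2], [0.7], [0.4]])
         |     loss = ((torch.sigmoid(model(x)) - rating)**2).mean()
         |     opt.zero_grad()
         |     loss.backward()
         |     opt.step()  # only now does behavior change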
        
           | mjburgess wrote:
           | Images of objects aren't objects and measure only their 2D
           | geometrical properties projected onto the image plane. These
           | properties are correlated across images, inexplicably, since
           | the machine has no access to the physical properties of those
           | objects, nor the laws of physics.
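           | 
           | The depth-is-lost point, as an idealized pinhole-camera
           | sketch (focal length 1, made-up points):
           | 
           |     def project(X, Y, Z, f=1.0):
           |         # pinhole: 3D point -> image plane
           |         return (f * X / Z, f * Y / Z)
           | 
           |     print(project(1, 1, 2))  # (0.5, 0.5)
           |     print(project(2, 2, 4))  # (0.5, 0.5): different
           |                              # object, same pixel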
        
             | leereeves wrote:
             | It's hypothesized (and, I think, likely) that the machine
             | learns some of those hidden/latent properties, though,
             | obviously, only to the extent that they are useful to
             | recreate the images.
             | 
             | In any case, it does at least care about the appearance of
             | the actual objects. So I think it would be fair to say that
             | there are aspects of the actual objects that the network
             | doesn't care about, but there are also aspects that it
             | cares very much about. Thus it's not "wholly unconcerned
             | with the actual objects".
        
               | MauranKilom wrote:
               | > It's hypothesized (and, I think, likely) that the
               | machine learns some of those hidden/latent properties,
               | though, obviously, only to the extent that they are
               | useful to recreate the images.
               | 
               | Notably, this precisely describes humans too. We don't
               | _know_ the "true" properties of anything we interact
               | with. We just have models - some more sophisticated than
               | others - but only to the extent that we care for
               | reasoning about the objects. From "This stone is too
               | heavy to lift." to "Oh yeah Becky is always late."
        
               | IIAOPSW wrote:
               | My gripe with this is that it has no understanding of
               | what the objects are beyond "the thing that's supposed to
               | be output". Its a textbook case of the Chinese room
               | argument [1]. Sure it can follow translation rules, even
               | infer some sophisticated rules and patterns about the
               | slips of paper that are passed in, but it still has no
               | idea what any of it means.
               | 
               | https://en.wikipedia.org/wiki/Chinese_room
               | 
               | Dalle is very concerned with the 2d shapes of things we
               | call objects, and has correctly inferred some rules about
               | those shapes, but it neither knows nor cares about the
               | things we call objects and how the shapes it has learned
               | are representations of them. It doesn't do reasoning
               | about the round peg fitting in the round hole. It just
               | glues a few pegs and holes together in a way that's
               | mostly consistent with our expectations and says "is this
               | what you wanted"?
               | 
               | It's a student that cares about passing the test, not
               | learning the material.
        
               | leereeves wrote:
               | Isn't that true of all of us? We all care selectively
               | about certain attributes of objects.
               | 
               | I care that my car is quiet and has comfortable seats, I
               | don't care (or know) what material the muffler is made
               | of, but somewhere there is an engineer who cared about
               | that.
               | 
               | A road designer cares what the tires are made of and how
               | much it weighs, but doesn't care what color the paint is.
               | 
               | An AI recreating an image of my car would care what color
               | the paint is, but not how comfortable the seats are.
               | 
               | I think I see what you're describing - the AI has a very
               | limited scope and doesn't know or care about most of the
               | things we do - but I think that's just a limitation of
               | our current small models and limited training data, not
               | an inherent limitation of neural networks.
        
               | IIAOPSW wrote:
               | You may not know what material the muffler is made of,
               | but you know it's made of a material. You "care about the
               | objects themselves" in the sense of having this framework
               | of objects, all of which have certain properties and
               | relations to each other which form a predictive model you
               | have about the world. You expect everything to have a
               | position in x,y,z, to have a size, shape, mass, hardness
               | etc. You might not care about a particular property of a
               | particular object in a particular context, but you
               | fundamentally think in the semantics of objects. You know
               | what to expect when the heavy, hard object meets the
               | brittle, light object.
               | 
               | DallE doesn't really have a system of the semantics of
               | objects. It doesn't know why it would be useless for a
               | building to have all the doors on the 2nd level. It's not
               | even clear that DallE makes use of discrete "objects" in
               | its limited understanding.
               | 
               | Here's an example from budget DallE
               | 
               | https://ibb.co/B2B5zky
               | 
               | It understood the shape of "stick figure" and "boobs",
               | but had no understanding of what a stick figure is meant
               | to represent and thus where it should place the boobs.
               | The results are hilarious. I'm not sure which I like
               | more, the guy with a boob casually walking down his arm,
               | or the lady with a boob head that's shrugging with
               | uncertainty.
        
             | icehawk wrote:
             | Now explain how that's different than anything you look at.
             | 
             | - Your visual system only has access to the 2D geometrical
             | properties projected on your retina. The properties are
             | correlated across images, inexplicably. (I certainly cannot
             | explain what a chair is, in a fashion that includes all
             | chairs I've encountered and excludes anything I've
             | encountered that is not a chair)
             | 
             | - Any other interaction is also a correlation.
             | 
             | - Humans don't have access to the laws of physics, just
             | reasonable approximations in certain contexts.
        
               | Barrin92 wrote:
               | >Now explain how that's different than anything you look
               | at.
               | 
                | For starters, you don't just look at things; you're
                | embedded in the world. You have sensory input far beyond
                | visual information, and you also have something akin to
                | cybernetic feedback in response to your mechanical
                | actions in the world, which DALL-E does not.
               | 
               | In fact DALL-E doesn't even have access to visual
               | information in the same sense you have, which is to a
               | large extent biochemical and analog, not digital.
        
             | Earw0rm wrote:
             | Where this gets interesting is when the AI attempts to
             | create animations or moving images - this requires a much
             | deeper understanding of objects and their physical
             | properties, even to create a highly stylised or "drawn"
             | animation.
        
       | burlesona wrote:
       | Slight side tangent but reading this article it hit me how much
       | this generation of work may be reinforcing English as the global
       | language for generations to come. It seems like we are headed
       | towards a phase of technology where learning how to feed well-
       | crafted prompts into the AI system will be a highly valuable
       | productivity skill. And since the major AI systems seem to be
       | built around English, that would make English fluency even more
       | valuable than it already is. I'm sure that's obvious to non-
       | native speakers who have worked hard to master English, I just
       | hadn't thought of it before.
       | 
        | Less likely but still interesting: I wonder if the models
        | we're building will at some point begin to layer on top of
        | each other such that English as it is used _now_ becomes
       | something deeply embedded in AI, and whether that will evolve
       | with the spoken language or not. It's funny to imagine a future
       | where people would need to master an archaic flavor of English to
       | get the best results working with their AI helpers.
        
         | affgrff2 wrote:
          | But the same work is also removing language barriers with
          | really good translation tools. If anything, I'd guess that
          | being fluent in English will become less important than it is
          | now.
         | 
         | Edit: the same work = transformer based language models
        
         | Ajedi32 wrote:
         | Perhaps, but another possibility is that the more advanced
         | models all end up being polyglots. The state of the art in
         | machine translation already uses a single model trained on
         | multiple languages[1], which results in better translations
         | between languages it doesn't have a lot of examples for. If the
         | same principle applies to other types of models, then training
         | them on every possible dataset available regardless of language
         | might yield better results. That could result in models that
         | are fluent in hundreds of languages. (I'd be curious as to
         | whether DALL-E understands prompts in languages other than
          | English - has anyone tried?)
         | 
         | [1]: https://ai.googleblog.com/2019/10/exploring-massively-
         | multil...
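          | 
          | (For what it's worth, here's a minimal sketch of poking at
          | one such many-to-many model - Facebook's M2M100 via Hugging
          | Face transformers, a different model from DALL-E or the
          | Google one cited above - translating directly between two
          | non-English languages. The model size and language pair are
          | just illustrative:)
          | 
          |   from transformers import (M2M100ForConditionalGeneration,
          |                             M2M100Tokenizer)
          | 
          |   model = M2M100ForConditionalGeneration.from_pretrained(
          |       "facebook/m2m100_418M")
          |   tokenizer = M2M100Tokenizer.from_pretrained(
          |       "facebook/m2m100_418M")
          | 
          |   # Hindi -> French directly, with no English pivot step
          |   tokenizer.src_lang = "hi"
          |   encoded = tokenizer("जीवन एक चॉकलेट बॉक्स की तरह है।",
          |                       return_tensors="pt")
          |   out = model.generate(
          |       **encoded,
          |       forced_bos_token_id=tokenizer.get_lang_id("fr"))
          |   print(tokenizer.batch_decode(out,
          |                                skip_special_tokens=True))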
        
           | visarga wrote:
           | BLOOM is the most recent model trained on multi-lingual data
           | - 46 natural languages and 13 programming languages.
        
         | Workaccount2 wrote:
          | I envision the skill of giving good AI prompts to be as
          | short-lived as the skill of T9 texting. Probably even shorter.
        
           | TillE wrote:
           | Right, I would expect the rise of a bunch of domain-specific
           | tools within the next few years.
        
         | planede wrote:
          | I wonder if these models would perform better if they were
          | trained on another language.
        
           | qayxc wrote:
           | I wonder the same thing. Many languages are far better at
           | making precise statements due to stricter and more complex
           | grammar. The reduction in ambiguity could help models perform
           | better.
           | 
           | Then there's also the problem of labelling training data.
           | Most of the labelling and annotating is outsourced to
           | countries with cheap labour and performed by non-native
            | speakers, which leads to problems with mislabelled training
            | data.
        
             | visarga wrote:
              | DallE is trained on web-scraped data; there is no manual
              | labelling taking place, except maybe to filter out some
              | types of content.
              | 
              | A few years ago we were training captioning models on
              | manually labelled datasets (such as COCO captions), but
              | they were small and the models were not very general.
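              | 
              | For illustration, here's a rough sketch of how scraped
              | image/alt-text pairs can be filtered automatically with
              | CLIP similarity (the LAION-style approach - the model
              | name and cutoff below are illustrative, not OpenAI's
              | actual pipeline):
              | 
              |   from PIL import Image
              |   import torch
              |   from transformers import CLIPModel, CLIPProcessor
              | 
              |   model = CLIPModel.from_pretrained(
              |       "openai/clip-vit-base-patch32")
              |   processor = CLIPProcessor.from_pretrained(
              |       "openai/clip-vit-base-patch32")
              | 
              |   def keep_pair(image: Image.Image,
              |                 alt_text: str) -> bool:
              |       # Score how well the scraped alt-text matches the
              |       # image and drop pairs that score too low.
              |       inputs = processor(text=[alt_text], images=image,
              |                          return_tensors="pt",
              |                          padding=True)
              |       with torch.no_grad():
              |           score = model(**inputs).logits_per_image.item()
              |       return score > 25.0  # arbitrary cutoff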
        
         | ffhhj wrote:
         | > reinforcing English as the global language
         | 
         | Which might also result in new speakers modifying English to
         | their cultures (like Blade Runner's Cityspeak), and Global-
         | English speakers not understanding "secret" foreign
         | communication, so they might create new languages for their own
          | subcultures, eventually relegating English to the role of a
          | new Latin for technical knowledge (Latin was kept alive by
          | the Catholic Church).
        
         | mtlmtlmtlmtl wrote:
         | Also worth noting that the internet has massively accelerated
         | the importance of English already.
         | 
         | As an ESL speaker who grew up on the internet, Norwegian was
         | more or less useless to me outside school and family. Most of
         | my time was spent on the internet, reading and writing lots of
          | English. Norwegian Wikipedia is pretty much useless unless
          | you don't know English. That's still true today for the vast
          | majority of articles, but back then it was universally the
          | case.
         | 
         | There were Norwegian forums, but with a population of just 4
         | million and change at the time, they were never as interesting
         | or active as international/American forums and IRC channels.
         | 
         | In fact I'd say Norwegian is only my native language in spoken
         | form, whereas English feels more natural to me to write and
         | read. Doesn't help that Norwegian has two very divergent
         | written forms, either.
         | 
         | I even write my private notes in English, even though I will be
         | the only one reading them.
        
         | bearjaws wrote:
         | Covid and remote work are going to make this even more true. At
         | my new job I work with a Nigerian co-worker who lives in
         | Nigeria, he speaks clear & proper English and earns an amazing
         | salary for his area.
         | 
         | If you live in a third world country, you could really benefit
          | from remote work going forward, and English will be a popular
         | language to learn for that. That being said, I know some people
         | will 'phone it in' and not speak as clearly, which will put
         | them at a disadvantage.
        
       | TheRealPomax wrote:
       | "Yes". And that's why question headlines may not technically be
       | click bait, but still qualify as something the world can do
       | about.
        
       | prometheus76 wrote:
       | I want to see two DALL-E bots talk to each other. First bot:
       | "Pete and Repeat were sitting on a fence. Pete fell off. Who was
       | left?" Second bot: "Repeat" First bot: "Pete and Repeat were
       | sitting on a fence. Pete fell off. Who was left?"
       | 
       | What happens next?
        
         | mkl wrote:
         | I think you're thinking of GPT-3. DALL-E and DALL-E 2 generate
         | images, not text (and they're very bad at images of text).
        
           | prometheus76 wrote:
            | Ah, I'm sorry. Well, the curiosity remains!
        
       | xornox wrote:
        | At the very least, much of the data needed for human-like
        | intelligence is missing. If a human were trained on the same
        | input data, I don't think he/she would understand the
        | relationships either.
        
       ___________________________________________________________________
       (page generated 2022-08-04 23:00 UTC)