[HN Gopher] DALL-E, the Metaverse, and Zero Marginal Content
       ___________________________________________________________________
        
       DALL-E, the Metaverse, and Zero Marginal Content
        
       Author : simonebrunozzi
       Score  : 95 points
       Date   : 2022-04-13 09:53 UTC (1 days ago)
        
 (HTM) web link (stratechery.com)
 (TXT) w3m dump (stratechery.com)
        
       | axg11 wrote:
       | We are still extremely early in this space. This is the mainframe
       | era of generative machine learning.
       | 
       | I'm expecting three big leaps in the next 10 years (in order):
       | 
       | 1) Generative algorithms reach a level where the content is
       | indistinguishable from human-generated content. Compute:
       | performed on mega-clusters such as those used by DALL-E, GPT-3, PaLM.
       | 
       | 2) DALL-E/GPT-3/PaLM-level generative algorithms are able to run
       | on personal hardware (phone, laptop)
       | 
       | 3) Generative algorithms are able to fine-tune/train on personal
       | hardware
       | 
       | Right now the algorithms are moving much faster than the
       | hardware, hence why we are seeing large language models and
       | gigantic generative models such as DALL-E 2. In time the hardware
       | will catch up. For the next few years, applications of these
       | models will be restricted by the fact that they're only
       | accessible through API calls to mega-clusters run by Google,
       | OpenAI, etc.
       | 
       | In time the hardware will improve and architectures will become
       | more compute efficient to the point where we can achieve "human-
       | level" (loosely defined) on personal hardware. Real-time
       | generative content will change the way we consume content
       | entirely, especially in the context of AR. Augmenting our view of
       | the physical world with an infinite number of different filters
       | will create infinite use cases for AR.
       | 
       | On the large language model side, once LLMs exceed our ability to
       | comprehend libraries of documents, that will change the way we
       | work, the way we perform science and many other things. Imagine
       | querying your team's entire library of documents, design reviews
       | and code with prompts such as: "Why was the load balancer for X
       | service designed in this way?".
        
         | visarga wrote:
         | The advent of edge GPT-3 would be a huge change for robotics as
         | well. It has been proven that augmenting robots with a language
         | model for planning gives a leap in abilities. Maybe they could
         | even enter any kitchen and make a sandwich (sudo make me a
         | sandwich task).
         | 
         | https://arxiv.org/pdf/2201.07207.pdf
        
           | consumer451 wrote:
           | > The advent of edge GPT-3
           | 
           | I am nowhere near the industry, so this may be a dumb question,
           | but are the new analog in-memory compute modules[0] likely to
           | help with this in reality? How far away is that reality?
           | 
           | [0]
           | 
           | https://mythic.ai
           | 
           | https://www.youtube.com/watch?v=67LXWocO9HI
        
         | mcbuilder wrote:
         | I'm curious on your point 1, and I tend to disagree. The number
         | of parameters in these large language models is increasing
         | faster than Moore's law. Currently you need a server full of
         | GPUs just to run inference on a PaLM model. How do you see the
         | size shrinking so drastically? Hardware is improving on
         | important factors like power consumption, but inference
         | hardware needs to scale with the size of the models. Don't get
         | me wrong, it's likely that PaLM itself can run on a 2032 phone,
         | but the real advances will be in even more scaled up models.
         | 
         | The future of AI will be in the data center for a long time to
         | come. Maybe after some point the models cease to scale up and
         | that point will be where the model would even overfit on the
         | amount of data we can possibly give it e.g. the entire
         | internet. The PaLM authors allude to this in their conclusion
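A rough back-of-the-envelope sketch of the point above that inference hardware has to scale with model size. It assumes PaLM's published count of 540 billion parameters, 2 bytes per parameter (16-bit weights), and a hypothetical accelerator with 80 GB of memory. Activations and caches are ignored, so real requirements would be higher.

```python
import math

PALM_PARAMS = 540e9  # PaLM's published parameter count


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GB (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


def accelerators_needed(n_params: float, mem_per_device_gb: float,
                        bytes_per_param: int = 2) -> int:
    """Minimum number of devices whose combined memory fits the weights.

    mem_per_device_gb is an assumed figure, not a specific product spec.
    """
    total_gb = weight_memory_gb(n_params, bytes_per_param)
    return math.ceil(total_gb / mem_per_device_gb)


if __name__ == "__main__":
    print(weight_memory_gb(PALM_PARAMS))         # weights alone, in GB
    print(accelerators_needed(PALM_PARAMS, 80))  # devices at 80 GB each
```

Under these assumptions the weights alone take on the order of a terabyte, which is why a single consumer device is not enough today.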
        
           | stult wrote:
           | There was just an article from DeepMind on HN about this
           | topic the other day[1], but basically IIRC it argues that all
           | of the LLMs are horrendously compute inefficient, which means
           | there's a ton of room to improve them. So those models will
           | be optimized over time just as the consumer hardware will be
           | improved until eventually one day the two trends will
           | converge. It's just a question of when that will happen.
           | 
           | [1] https://news.ycombinator.com/item?id=30987885
        
       | simonw wrote:
       | "virtual worlds needs virtual content created at virtually zero
       | cost, fully customizable to the individual"
       | 
       | That seems like the polar opposite of the NFT metaverse vision,
       | where artificial scarcity encourages people to pay for
       | speculative assets.
        
         | ausbah wrote:
         | even worse, it's "artificial scarcity" of some of the worst art
         | out there
        
         | KaoruAoiShiho wrote:
         | In a post scarcity world, all value/scarcity is artificial.
        
       | airstrike wrote:
       | _> Game developers pushed the limits on text, then images, then
       | video, then 3D_
       | 
       |  _> Social media drives content creation costs to zero first on
       | text, then images, then video_
       | 
       |  _> Machine learning models can now create text and images for
       | zero marginal cost_
       | 
       | Search was also text, then images, then video _files_... and I
       | imagine the next step is searching video _content_. Before you
       | say we already have Google video search or search on YT, I'm
       | talking about indexing and categorizing something like the TikTok
       | video feed and letting users access the _content_ they want, not
       | long-form videos with 3 minutes of intros and "please smash that
       | like button and subscribe". Like searching for "what should I
       | cook today?" and finding a 30-second video of someone cooking
       | penne alla puttanesca while describing the recipe. Extra points
       | if that search is a voice command.
       | 
       | Also FTA
       | 
       |  _> That phrase, "Facebook is compelling for the content it
       | surfaces, regardless of who surfaces it", is oh-so-close to
       | describing TikTok; the error is that the latter is compelling for
       | the content it surfaces, regardless of who creates it..._
       | 
       | YT and Google video search are too focused on who creates the
       | content. Who cares about channels and likes and subscriptions?
       | "Just give me the shortest answer to my query and give it to me
       | in video _now_".
        
         | RC_ITR wrote:
         | The easier way to do this (counterintuitively) is what Tik Tok
         | has already done - don't try to parse the content of the video,
         | instead use user feedback to proactively serve the content that
         | a user wants.
         | 
         | For us, a crowd of 'do-it-yourself' type-A's this seems like an
         | incomplete solution, but the future is probably not one defined
         | by user-generated search as much as it is defined by 'the
         | computer just knows what I want.'
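A minimal sketch of serving content from user feedback alone, without parsing the videos. It assumes a hypothetical log of (user, video, fraction watched) events and a made-up 0.8 "liked" threshold. Nothing here reflects how TikTok actually works; it only illustrates ranking by the behavior of similar users.

```python
from collections import defaultdict

# Hypothetical watch log: (user_id, video_id, fraction of the video watched)
EVENTS = [
    ("ann", "pasta", 0.90), ("ann", "knots", 1.00),
    ("bob", "pasta", 0.95), ("bob", "ramen", 0.90), ("bob", "intro", 0.10),
    ("cat", "knots", 1.00), ("cat", "ramen", 0.85), ("cat", "sails", 0.90),
]


def recommend(events, user, k=3, liked_threshold=0.8):
    """Rank unseen videos by how much taste overlap their fans share with `user`."""
    liked = defaultdict(set)
    for u, video, fraction in events:
        if fraction >= liked_threshold:  # treat a near-complete watch as a "like"
            liked[u].add(video)

    mine = liked.get(user, set())
    scores = defaultdict(float)
    for other, theirs in liked.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared likes act as a similarity weight
        if overlap == 0:
            continue
        for video in theirs - mine:
            scores[video] += overlap

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


if __name__ == "__main__":
    print(recommend(EVENTS, "ann"))
```

The video titles never enter the computation; only who finished watching what does.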
        
         | axg11 wrote:
         | > YT and Google video search are too focused on who creates the
         | content
         | 
         | Correct - YouTube's feature roadmap is driven by monetization,
         | which is in turn driven by monetizing attention through
         | advertising. How often do you search directly for videos? I
         | just looked through my recent YouTube search queries; 70% of
         | the searches are for channels/content producers. Compounding
         | that, most of my time in YouTube is spent consuming their
         | recommendations rather than directly seeking content. So even
         | during the rare times when I use YouTube search, it's usually
         | not for specific video content.
         | 
         | The types of queries you're describing are better served by
         | text. Perhaps a mix of both text and video content.
        
       | bduerst wrote:
       | Zero marginal content already exists with procedural generation
       | (PG).
       | 
       | The problem with PG is that it's all the same after a while -
       | i.e. one of the chief complaints of _No Man's Sky_ is that all
       | the PG planets look the same after a short period of time.
       | 
       | The real value isn't zero-marginal content, but zero-marginal
       | narration (or story telling) that breaks new ground. Piecing
       | together PG or zero-marginal content isn't enough, any next step
       | breakthroughs will come with higher-level orchestration of said
       | content.
        
         | tialaramex wrote:
         | Here's my observation as mostly a player and (non-game)
         | programmer but sometimes creator. To make these open ended
         | games _interesting_ there need to be large numbers of distinct
         | unique interactions between things in the world, exhaustive
         | testing will therefore be impossible and you will need to
         | approach testing differently. If you're making a game like
         | this, and you give people the chance to play a demo but they
         | don't _completely astonish you_ by doing something you'd never
         | considered then your game is going to be disappointing.
         | 
         | The game world needs to make some sense (ie don't have the
         | results of an interaction be merely pseudo-random) but it's OK
         | if it's a bit _weird_. The real world is a bit weird, a little
         | bit off the trivial defaults. Mario interacts with his world in
         | a more or less predictable way... but lots of the edge cases
         | are strange. Mario can't walk off the screen... but he can be
         | pushed off by things. He bounces on some objects... but, the
         | rules for _how_ Mario bounces are hard to grasp, for a casual
         | player it doesn't matter. In Minecraft if you put two
         | bucketfuls of _water_ a space apart from each other, unlimited
         | water is the result. But if you put two bucketfuls of _lava_
         | the same distance nothing interesting happens. Inconsistent,
         | but not so haphazard as to be baffling.
         | 
         | If your world is in the resulting sweet spot, people can
         | entertain themselves more or less indefinitely just like in
         | this world. You can procedurally generate such worlds, _but_
         | you must be prepared to have your players not interact with it
         | the way you intended. When you buy a child a $$$ game and they
         | spend hours happily playing with... the cardboard box it came
         | in, they aren't doing it wrong, they're having fun, you don't
         | get to dictate what other people enjoy. Lots of video game
         | creators have that ego problem where they can't let go of how
         | _they_ thought the game is supposed to be played, and if you
         | have procedural generation that's inherently _wrong_. Write a
         | visual novel next time.
        
           | Stevvo wrote:
           | Elite Dangerous is a counterpoint to that; massive procedural
           | galaxy, but very tight and designed gameplay loops.
        
           | zhynn wrote:
           | I think you are spot on. I have been toying with a formal
           | theory about this stuff which I then turned into a rambling
           | HN post. Here's what I came up with:
           | 
           | In my theory, the reason that NMS is unsatisfying is because
           | it has great procedural generation breadth (PGB), but
           | insufficient procedural generation depth (PGD).
           | 
           | PGB is defined as just the raw number of composable pieces
           | that can be used to generate artifacts. So, how many plant
           | bits, animal bits, biome bits, etc there are available to the
           | procgen system. NMS has sufficient PGB, there are lots of
           | bits.
           | 
           | PGD is defined as the influence or interaction between PG
           | artifacts. The influence can manifest as either affecting the
           | generation itself (a high-gravity planet reduces the maximum
           | size of procgen critters) or artifact behavior (a dim star
           | cold planet alters the critter behavior to reduce movement to
           | conserve energy). NMS has neither, as far as I know. Human
           | defined categories or tag metadata for PG elements does not
           | count. Declaring that sets of procgen components are all
           | "desert" is not an interaction. NMS has very low PGD, the
           | animals, plants, and biomes do not influence each other. The
           | devs have grouped components together to make biomes look and
           | feel self-similar. But this was not done procedurally, and
           | has no depth to it.
           | 
           | A game like Rimworld on the other hand has a lower PGB than
           | NMS. By the numbers, there are far fewer variations of
           | entities in Rimworld than there are in NMS. But Rimworld makes
           | up for this by having much higher PGD. The procgen landscape
           | influences climate and biomes. The biome and latitude will
           | influence growing season which influences the carrying
           | capacity of herbivores which influences the carrying capacity
           | of carnivores. The procgen of the pawn's social and family
           | relationships influence each other's behavior. The available
           | calories on the map influence what pawns are able to eat -
           | which will interact with their food preferences... The most
           | important thing is that PGD makes the procgen entities
           | actually matter and creates surprising stories.
           | 
           | (As an aside, It occurs to me that PGD is directly related to
           | emergent behavior. It is possible that this could be formally
           | proved, as Conway's GOL emergent behavior is entirely
           | predicated on neighboring cells affecting each other. It
           | might be possible to prove that any system with sufficient
           | PGD is capable of emergent behavior: which is really what we
           | want from a PG system. We want to be surprised by the
           | unexpected, something that GOL is absolutely capable of. In
           | fact you could probably use GOL as a starting point for this
           | whole theory of PGD/PGB.... hmm....)
           | 
           | The king of PGD is Dwarf Fortress. It has very high PGB and
           | PGD, and its ability to create surprising stories is
           | legendary. My hypothesis is that this is _why_ Dwarf Fortress
           | is so good.
           | 
           | Anyway, since NMS has no PGD, it has no emergent behavior
           | (except by the players themselves), and you come away from it
           | feeling that it is a mile wide and an inch deep. No shade
           | though! It is a beautiful piece of work -- I have been
           | playing it from launch, and I think it is a towering
           | achievement (not to mention the most incredible PR story in
           | gaming history). No Man's Sky is an outstanding work of art
           | and it should be remembered for what it is, not what it could
           | be (or what many people think it should be). And the emergent
           | gameplay of the players is really fun too.
           | 
           | But when I think about what it might be like to have a higher
           | PGD in NMS... it would be really something.
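One way to make the breadth/depth distinction above concrete. This is a minimal sketch built on the two examples from the comment (high gravity caps critter size; a dim star reduces movement). All names and constants are hypothetical and do not describe any game's actual generator.

```python
import random
from dataclasses import dataclass

# Breadth (PGB): the pool of composable pieces the generator can draw from.
BODY_PLANS = ["hopper", "crawler", "glider", "grazer"]


@dataclass
class Planet:
    gravity: float          # in Earth gravities, assumed > 0
    star_brightness: float  # 0.0 (dark) to 1.0 (bright)


@dataclass
class Critter:
    body_plan: str
    size_m: float
    activity: float  # 0.0 (motionless) to 1.0 (hyperactive)


def shallow_critter(rng: random.Random, planet: Planet) -> Critter:
    """Breadth only: the planet is ignored, so nothing influences anything."""
    return Critter(rng.choice(BODY_PLANS),
                   rng.uniform(0.2, 10.0),
                   rng.uniform(0.1, 1.0))


def deep_critter(rng: random.Random, planet: Planet) -> Critter:
    """Depth (PGD): generated planet properties constrain the generated critter."""
    max_size = max(0.2, 10.0 / planet.gravity)  # heavier world, smaller ceiling
    size = rng.uniform(0.2, max_size)
    activity = rng.uniform(0.1, 1.0) * planet.star_brightness  # dim star, less movement
    return Critter(rng.choice(BODY_PLANS), size, activity)


if __name__ == "__main__":
    rng = random.Random(0)
    heavy_dim = Planet(gravity=5.0, star_brightness=0.2)
    print(shallow_critter(rng, heavy_dim))
    print(deep_critter(rng, heavy_dim))
```

Both functions draw from the same pool of pieces, so breadth is identical; only the second lets one generated artifact shape another.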
        
         | segh wrote:
         | Plenty of games get procedural generation right, from NetHack,
         | to Spelunky, to Minecraft. No Man's Sky might just be a bad
         | game.
        
       | spy888 wrote:
       | The logic sounds good in theory but in practice not much of this
       | is grounded in today's reality. No one is reading AI generated
       | text for their news or entertainment. That will be an important
       | first step. If it ever happens.
        
         | mdorazio wrote:
         | Counterpoint: I am fairly certain that a large portion of
         | financial news is machine-generated today. Machine-generated
         | blogspam is also extremely prevalent and edging into the "news"
         | space.
        
           | vimy wrote:
           | Yep.
           | 
           | > The program can dissect a financial report the moment it
           | appears and spit out an immediate news story that includes
           | the most pertinent facts and figures. And unlike business
           | reporters, who find working on that kind of thing a snooze,
           | it does so without complaint. Untiring and accurate, Cyborg
           | helps Bloomberg in its race against Reuters, its main rival
           | in the field of quick-twitch business financial journalism,
           | as well as giving it a fighting chance against a more recent
           | player in the information race, hedge funds, which use
           | artificial intelligence to serve their clients fresh facts.
           | "The financial markets are ahead of others in this," said
           | John Micklethwait, the editor in chief of Bloomberg. In
           | addition to covering company earnings for Bloomberg, robot
           | reporters have been prolific producers of articles on minor
           | league baseball for The Associated Press, high school
           | football for The Washington Post and earthquakes for The Los
           | Angeles Times.
           | https://www.nytimes.com/2019/02/05/business/media/artificial...
        
             | fshbbdssbbgdd wrote:
             | I wonder if this is really using ML at all to generate the
             | output or if it's just filling out a template based on a
             | predefined set of numbers and/or text properties. Baseball
             | games, earthquakes, and financial data can all be ingested
             | with a schema.
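The template approach described above can be sketched as follows. The schema fields, company name, and figures are made up for illustration; this is not a description of how Bloomberg's Cyborg or any other newsroom system actually works.

```python
from dataclasses import dataclass


@dataclass
class EarningsReport:
    """Hypothetical schema for an ingested earnings release."""
    company: str
    quarter: str
    revenue_musd: float        # revenue in millions of USD
    prior_revenue_musd: float  # same quarter a year earlier
    eps: float                 # earnings per share in USD


TEMPLATE = (
    "{company} reported {quarter} revenue of ${revenue:,.0f} million, "
    "{direction} {pct:.1f}% from a year earlier, "
    "with earnings of ${eps:.2f} per share."
)


def write_story(report: EarningsReport) -> str:
    """Fill the fixed template from structured fields; no ML involved."""
    change = (report.revenue_musd - report.prior_revenue_musd) \
        / report.prior_revenue_musd * 100
    direction = "up" if change >= 0 else "down"
    return TEMPLATE.format(
        company=report.company,
        quarter=report.quarter,
        revenue=report.revenue_musd,
        direction=direction,
        pct=abs(change),
        eps=report.eps,
    )


if __name__ == "__main__":
    print(write_story(EarningsReport("Example Corp", "Q1", 1250.0, 1000.0, 1.5)))
```

Any domain whose inputs arrive in a fixed schema (box scores, seismograph readings, earnings tables) could be handled the same way with a different template.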
        
       | cturner wrote:
       | If there is anyone here with access to it, please could you try
       | using it to generate some tight pixel art of simple concepts and
       | post the results? I am thinking - chair, coffee, boat, control
       | panel, wedding ring, lamp, pit. I am curious whether it could be
       | a fast way to design sprites. Imagine if it could be trained in
       | the style of particular desktop systems.
        
         | dschnurr wrote:
         | I gave this a try:
         | https://twitter.com/_dschnurr/status/1514673929209614337
        
           | KaoruAoiShiho wrote:
           | Can you do the same in a more realistic style?
           | https://www.artstation.com/artwork/E1qPv
           | 
           | I always thought the idea behind pixel art is to save on
           | artist resources, but since the AI is doing it I think indie
           | devs would rather do something more advanced.
        
         | TaupeRanger wrote:
         | The only Pixel Art example I could find so far:
         | https://twitter.com/0xCharlota/status/1511965632765607936?t=...
        
           | SparkyMcUnicorn wrote:
           | Found this one too:
           | https://labs.openai.com/s/U5A157f1tgMOj2W97m3Kff5J
        
       | ak391 wrote:
       | open source alternative to DALL-E:
       | https://huggingface.co/spaces/multimodalart/latentdiffusion
        
       | tintor wrote:
       | Looking forward to the first text adventure game built using
       | DALL-E as a frontend.
        
         | bulbosaur123 wrote:
         | AIDungeon: been there, done that. It will become a battlefield
         | of the most insane perversions and you know it.
        
           | tintor wrote:
           | AIDungeon has a text frontend. What I meant was an image-
           | based frontend generated by DALL-E.
        
           | bduerst wrote:
           | AI-Dungeon is trained on previous played sessions, IIRC. It's
           | pretty limited to generating any new outputs if players
           | haven't previously fed it in (but I could be mistaken).
        
       | zitterbewegung wrote:
       | I am trying to figure out how to integrate various deep learning
       | networks together to make a coherent game. One of the big
       | problems I have is having to use alternatives to DALL-E / GPT-3
       | because being contingent on their approval is a huge risk. I use
       | Hugging Face instead and I have many video cards. The big
       | problems right now are how to integrate the models and how to
       | get good quality. GPT-3 and other systems stop working at
       | around 500 words (tokens), and DALL-E is hard to use; it looks
       | like it takes a lot of training yourself to make it work.
       | 
       | I don't think the marginal cost is absolutely zero until we can
       | get classifiers or larger systems that can go from an image to
       | a description, and also GPT-3 or another system that works for
       | at least a few pages. Right now you have to cherry-pick the
       | output.
        
       | Barrin92 wrote:
       | Computer generated content has its place, as it already does in
       | many procedurally generated games as others have pointed out but
       | I do not think it will play a central role in the near future.
       | 
       | One reason for this is technical. AI (not AGI but the systems
       | that actually exist) can synthesize existing game data into new
       | things but you'll still be in an uncanny valley of recycled
       | information and in the worst case be in some bizarre feedback
       | loop of old content that is permanently recycled and reconfigured
       | in ways that players will notice. No Man's Sky is an example.
       | 
       | The other reason is social, and I think it's the stronger case.
       | We already have many games where computers could replace humans,
       | chess is the most obvious one. Yet computer chess competitions
       | only create marginal interest (by AI engineers) and players
       | vastly prefer human interaction for the sole reason that it is
       | human interaction. In the same vein there is computer generated
       | music that actually is good enough to fool people, but there
       | doesn't seem to be any market for synthetic artists.
       | 
       | Or take Esports. DeepMind's AI systems are actually good enough
       | to produce Starcraft tournaments solely played by machines. The
       | games even look like high level human gameplay. But there is zero
       | interest in watching a bot tournament.
        
         | swix wrote:
         | True, but couldn't the same be said about the real world to
         | some degree? Everything "new" is based on something existing,
         | right? Even in sci-fi/fantasy movies which are completely wild,
         | completely "out there", the things in them are conjured by us,
         | which is in some way shape or form based on our imagination,
         | which is based in the reality we exist within.
         | 
         | My point is, that feedback loop you speak of already exists,
         | it's just that... there is SO much variation, content,
         | possibilities... but the same thing could be true for AI
         | generated content in some sense. Feed it all our assets
         | (sounds, music, movies, images) and then let it go bananas...
         | :) - you'll have as much variation in the digital world as we
         | do in our real world.
        
       | avaer wrote:
       | Having worked on adjacent problems every day for 10 years, maybe
       | my opinion counts for something (or maybe not).
       | 
       | But I think the top game of the 2030s will be something like AI
       | Minecraft or AI Steam, where everything, including the very rules
       | of the game, is generated from a structured data set optimized
       | for the player.
       | 
       | And I think the "metaverse" (as much as I loathe the term) is
       | going to go down as the labeled training set for bootstrapping
       | this, just like the open web was the catalyzing training set for
       | the (already admittedly magical) AIs we have today.
       | 
       | Further, I think Facebook won't be the one to design this,
       | because that is not what their share price incentivizes.
        
         | kurthr wrote:
         | I think with eye/attention tracking it will be absolutely
         | amazing what 3D content can be generated and optimized by AI/ML
         | to maintain the interaction feedback loop. I'm not sure it will
         | be a good thing (at least for those who didn't grow up with
         | it)... much like FB doom scrolling is a problem for the over 50
         | set.
        
           | nhecker wrote:
           | Wow! That's a crazy thought. Eye tracking tightly coupled
           | (i.e., in a ~realtime feedback loop) with PG/ML/AI seems ...
           | powerful, or scary, or both. Something along the lines of a
           | computer controlled lucid dream, sprinkle in a handful of
           | whatever the equivalent would be of blinking banner ads,
           | product placements, or subliminal messages, etc. and my mind
           | spirals out of control imagining how that would play out.
        
         | jayd16 wrote:
         | A Calvinball AI would be a pretty interesting novelty but I
         | think good games are usually focused on simple rulesets. Not
         | sure an AI is really needed there. Maybe you just mean hyper
         | tuning drop rates and modifiers and such?
         | 
         | I'm sure AI will eventually have a huge impact on the art,
         | narrative, and engineering of games, though, so maybe you're
         | correct that will bleed into game design as well.
        
         | SquibblesRedux wrote:
         | While custom-tailored games may be interesting, it seems like
         | such a thing would be socially fragmenting and isolating.
         | People need shared experiences to relate to each other. I'm not
         | sure the world would be a better place if people gradually have
         | fewer and fewer shared experiences.
        
           | mathattack wrote:
           | Unfortunately very few large companies truly operate to make
           | the world a better place if their making money depends on the
           | other direction.
        
           | buttonpusher wrote:
           | It seems like we could have both; people can generate worlds
           | by some combination of automation and manually tweaking
           | parameters or mods, then they can share that world with their
           | friends, and visit worlds created by their friends. Some
           | people may have esoteric taste, but the internet is good for
           | finding people who share your esoteric taste, for better and
           | for worse.
        
         | Zababa wrote:
         | > But I think the top game of the 2030s will be something like
         | AI Minecraft or AI Steam, where everything, including the very
         | rules of the game, is generated from a structured data set
         | optimized for the player.
         | 
         | Steam and Minecraft are both social by nature. People very
         | often want to play with other people. It's like the joke I make
         | all the time about AI feeds on Netflix: they recommend the same
         | thing to everyone because the AI realized that everyone wants to
         | talk about the thing they saw more than they want to enjoy
         | seeing the thing. Humans are social creatures.
        
         | flycaliguy wrote:
         | I wonder if carving out individual experiences would prevent
         | users from displaying their use of said entertainment to create
         | social status. So I'm coming at it wondering if people value
         | the status that their form of entertainment provides more than
         | the experience itself. On the other hand, maybe we'll continue
         | to just observe the death of the "main stream" as we all slip
         | into our own niche communities, each with its own complex
         | system of status signifiers?
         | 
         | This stuff really leaves me pretty puzzled. I'm a culture guy,
         | English grad. Art is not supposed to behave like this!!!
        
       | KaoruAoiShiho wrote:
       | Not sure I get the social network or metaverse angle. But
       | basically 0 cost content as it applies to entertainment and maybe
       | other industries as well. It's well beyond metaverse if you
       | consider metaverse or gaming to only be 1 aspect of
       | entertainment. And let's not be mistaken, we're talking about all
       | entertainment not just mass media. Looking forward to things like
       | replacement of restaurants and tourism and even chatting with
       | friends, and this is well ahead of what we need from AGI.
        
         | wodenokoto wrote:
         | I know you're being sarcastic, but as a kid I dreamed of an AI
         | dungeon master that could play D&D-type games and draw
         | beautiful pictures of the scenery that the scenario would take
         | place in.
        
           | ALittleLight wrote:
           | This YouTube video, to me, shows the promise of things to
           | come. AI generated game worlds. Language models to generate
           | plots and dialog, transformers and GANs to create
           | illustrations. Imagine a game, a truly open world sand box,
           | Grand Theft Auto meets AI Dungeon - every NPC is a "real"
           | person with unlimited dialog options, the buildings you drive
           | by you could easily walk in and investigate, unlimited play
           | space - you could type in more general instructions and ideas
           | to the plot generator ("add in a vampire romance and murder
           | mystery angle") on the fly.
           | 
           | https://www.youtube.com/watch?v=udPY5rQVoW0
        
             | twoodfin wrote:
             | What you describe is actually all the more interesting
             | aspects of the "holodeck" as introduced and explored (some
             | would say too deeply) as a story concept on _Star Trek: The
             | Next Generation_.
             | 
             | There are more than a few scenes where the intrepid crew
             | members struggle with what we'd now recognize as prompts.
             | 
             | https://youtu.be/p7pPedBtbvk
        
       | oefnak wrote:
       | I'm really hoping that somebody puts in the dwarf fortress
       | forgotten beast descriptions...
        
       | jdrc wrote:
       | Looking forward to AI satire which i can download and run on my
       | laptop, away from the censorious ears that all but destroyed
       | comedy
        
         | exdsq wrote:
         | I used GPT-3 to generate some porn scripts of various
         | historical dictators and modern politicians that were pretty
         | funny. Putin's secret romance with Kim Jong Un was especially
         | saucy, until Osama Bin Laden found out and told the EU. I'd be
         | happy to find them and put them on a blog somewhere if there
         | was an audience ha!
        
       ___________________________________________________________________
       (page generated 2022-04-14 23:01 UTC)