[HN Gopher] DALL-E, the Metaverse, and Zero Marginal Content
___________________________________________________________________
Author : simonebrunozzi
Score  : 95 points
Date   : 2022-04-13 09:53 UTC (1 day ago)

(HTM) web link (stratechery.com)

| axg11 wrote:
| We are still extremely early in this space. This is the mainframe era of generative machine learning.
|
| I'm expecting three big leaps in the next 10 years (in order):
|
| 1) Generative algorithms reach a level where the content is indistinguishable from human-generated content. Compute: performed on mega-clusters such as those used by DALL-E, GPT-3, PaLM.
|
| 2) DALL-E/GPT-3/PaLM-level generative algorithms are able to run on personal hardware (phone, laptop)
|
| 3) Generative algorithms are able to fine-tune/train on personal hardware
|
| Right now the algorithms are moving much faster than the hardware, which is why we are seeing large language models and gigantic generative models such as DALL-E 2. In time the hardware will catch up. For the next few years, applications of these models will be restricted by the fact that they're only accessible through API calls to mega-clusters run by Google, OpenAI, etc.
|
| In time the hardware will improve and architectures will become more compute efficient to the point where we can achieve "human-level" (loosely defined) performance on personal hardware. Real-time generative content will change the way we consume content entirely, especially in the context of AR. Augmenting our view of the physical world with an infinite number of different filters will create infinite use cases for AR.
|
| On the large language model side, once LLMs exceed our ability to comprehend libraries of documents, that will change the way we work, the way we perform science and many other things.
| Imagine querying your team's entire library of documents, design reviews and code with prompts such as: "Why was the load balancer for X service designed in this way?".
| visarga wrote:
| The advent of edge GPT-3 would be a huge change for robotics as well. It has been shown that augmenting robots with a language model for planning gives a leap in abilities. Maybe they could even enter any kitchen and make a sandwich (the "sudo make me a sandwich" task).
|
| https://arxiv.org/pdf/2201.07207.pdf
| consumer451 wrote:
| > The advent of edge GPT-3
|
| I am nowhere near the industry, so this may be a dumb question, but are the new analog in-memory compute modules[0] likely to help with this in reality? How far away is that reality?
|
| [0]
|
| https://mythic.ai
|
| https://www.youtube.com/watch?v=67LXWocO9HI
| mcbuilder wrote:
| I'm curious about your point 1, and I tend to disagree. The number of parameters in these large language models is increasing faster than Moore's law. Currently you need a server full of GPUs just to run inference on a PaLM model. How do you see the size shrinking so drastically? Hardware is improving on important factors like power consumption, but inference hardware needs to scale with the size of the models. Don't get me wrong, it's likely that PaLM itself can run on a 2032 phone, but the real advances will be in even more scaled-up models.
|
| The future of AI will be in the data center for a long time to come. Maybe after some point the models cease to scale up, and that point will be where the model would even overfit on the amount of data we can possibly give it, e.g. the entire internet. The PaLM authors allude to this in their conclusion.
| stult wrote:
| There was just an article from DeepMind on HN about this topic the other day[1], but basically IIRC it argues that all of the LLMs are horrendously compute inefficient, which means there's a ton of room to improve them.
| So those models will be optimized over time, just as the consumer hardware will be improved, until eventually one day the two trends will converge. It's just a question of when that will happen.
|
| [1] https://news.ycombinator.com/item?id=30987885
| simonw wrote:
| "virtual worlds needs virtual content created at virtually zero cost, fully customizable to the individual"
|
| That seems like the polar opposite of the NFT metaverse vision, where artificial scarcity encourages people to pay for speculative assets.
| ausbah wrote:
| even worse, it's "artificial scarcity" of some of the worst art out there
| KaoruAoiShiho wrote:
| In a post scarcity world, all value/scarcity is artificial.
| airstrike wrote:
| _> Game developers pushed the limits on text, then images, then video, then 3D_
|
| _> Social media drives content creation costs to zero first on text, then images, then video_
|
| _> Machine learning models can now create text and images for zero marginal cost_
|
| Search was also text, then images, then video _files_... and I imagine the next step is searching video _content_. Before you say we already have Google video search or search on YT, I'm talking about indexing and categorizing something like the TikTok video feed and letting users access the _content_ they want, not long-form videos with 3 minutes of intros and "please smash that like button and subscribe". Like searching for "what should I cook today?" and finding a 30-second video of someone cooking penne alla puttanesca while describing the recipe. Extra points if that search is a voice command.
|
| Also FTA
|
| _> That phrase, "Facebook is compelling for the content it surfaces, regardless of who surfaces it", is oh-so-close to describing TikTok; the error is that the latter is compelling for the content it surfaces, regardless of who creates it..._
|
| YT and Google video search are too focused on who creates the content.
| Who cares about channels and likes and subscriptions? "Just give me the shortest answer to my query and give it to me in video _now_".
| RC_ITR wrote:
| The easier way to do this (counterintuitively) is what TikTok has already done - don't try to parse the content of the video, instead use user feedback to proactively serve the content that a user wants.
|
| For us, a crowd of 'do-it-yourself' type-A's, this seems like an incomplete solution, but the future is probably not one defined by user-generated search as much as it is defined by 'the computer just knows what I want.'
| axg11 wrote:
| > YT and Google video search are too focused on who creates the content
|
| Correct - YouTube's feature roadmap is driven by monetization, which is in turn driven by monetizing attention through advertising. How often do you search directly for videos? I just looked through my recent YouTube search queries; 70% of the searches are for channels/content producers. Compounding that, most of my time in YouTube is spent consuming their recommendations rather than directly seeking content. So even during the rare times when I use YouTube search, it's usually not for specific video content.
|
| The types of queries you're describing are better served by text. Perhaps a mix of both text and video content.
| bduerst wrote:
| Zero marginal content already exists with procedural generation (PG).
|
| The problem with PG is that it's all the same after a while - i.e. one of the chief complaints of _No Man's Sky_ is that all the PG planets look the same after a short period of time.
|
| The real value isn't zero-marginal content, but zero-marginal narration (or storytelling) that breaks new ground. Piecing together PG or zero-marginal content isn't enough; any next-step breakthroughs will come with higher-level orchestration of said content.
| tialaramex wrote:
| Here's my observation as mostly a player and (non-game) programmer but sometimes creator. To make these open ended games _interesting_ there need to be large numbers of distinct unique interactions between things in the world; exhaustive testing will therefore be impossible and you will need to approach testing differently. If you're making a game like this, and you give people the chance to play a demo but they don't _completely astonish you_ by doing something you'd never considered, then your game is going to be disappointing.
|
| The game world needs to make some sense (ie don't have the results of an interaction be merely pseudo-random) but it's OK if it's a bit _weird_. The real world is a bit weird, a little bit off the trivial defaults. Mario interacts with his world in a more or less predictable way... but lots of the edge cases are strange. Mario can't walk off the screen... but he can be pushed off by things. He bounces on some objects... but the rules for _how_ Mario bounces are hard to grasp; for a casual player it doesn't matter. In Minecraft if you put two bucketfuls of _water_ a space apart from each other, unlimited water is the result. But if you put two bucketfuls of _lava_ the same distance apart, nothing interesting happens. Inconsistent, but not so haphazard as to be baffling.
|
| If your world is in the resulting sweet spot, people can entertain themselves more or less indefinitely just like in this world. You can procedurally generate such worlds, _but_ you must be prepared to have your players not interact with it the way you intended. When you buy a child a $$$ game and they spend hours happily playing with... the cardboard box it came in, they aren't doing it wrong, they're having fun, you don't get to dictate what other people enjoy.
| Lots of video game creators have that ego problem where they can't let go of how _they_ thought the game was supposed to be played, and if you have procedural generation that's inherently _wrong_. Write a visual novel next time.
| Stevvo wrote:
| Elite Dangerous is a counterpoint to that; massive procedural galaxy, but very tight and designed gameplay loops.
| zhynn wrote:
| I think you are spot on. I have been toying with a formal theory about this stuff which I then turned into a rambling HN post. Here's what I came up with:
|
| In my theory, the reason that NMS is unsatisfying is because it has great procedural generation breadth (PGB), but insufficient procedural generation depth (PGD).
|
| PGB is defined as just the raw number of composable pieces that can be used to generate artifacts. So, how many plant bits, animal bits, biome bits, etc there are available to the procgen system. NMS has sufficient PGB, there are lots of bits.
|
| PGD is defined as the influence or interaction between PG artifacts. The influence can manifest as either affecting the generation itself (a high-gravity planet reduces the maximum size of procgen critters) or artifact behavior (a dim star cold planet alters the critter behavior to reduce movement to conserve energy). NMS has neither, as far as I know. Human-defined categories or tag metadata for PG elements do not count. Declaring that sets of procgen components are all "desert" is not an interaction. NMS has very low PGD, the animals, plants, and biomes do not influence each other. The devs have grouped components together to make biomes look and feel self-similar. But this was not done procedurally, and has no depth to it.
|
| A game like Rimworld on the other hand has a lower PGB than NMS. By the numbers, there are far fewer variations of entities in Rimworld than there are in NMS. But Rimworld makes up for this by having much higher PGD.
| The procgen landscape influences climate and biomes. The biome and latitude will influence growing season which influences the carrying capacity of herbivores which influences the carrying capacity of carnivores. The procgen of the pawn's social and family relationships influence each other's behavior. The available calories on the map influence what pawns are able to eat - which will interact with their food preferences... The most important thing is that PGD makes the procgen entities actually matter and creates surprising stories.
|
| (As an aside, it occurs to me that PGD is directly related to emergent behavior. It is possible that this could be formally proved, as emergent behavior in Conway's Game of Life (GOL) is entirely predicated on neighboring cells affecting each other. It might be possible to prove that any system with sufficient PGD is capable of emergent behavior: which is really what we want from a PG system. We want to be surprised by the unexpected, something that GOL is absolutely capable of. In fact you could probably use GOL as a starting point for this whole theory of PGD/PGB.... hmm....)
|
| The king of PGD is Dwarf Fortress. It has very high PGB and PGD, and its ability to create surprising stories is legendary. My hypothesis is that this is _why_ Dwarf Fortress is so good.
|
| Anyway, since NMS has no PGD, it has no emergent behavior (except by the players themselves), and you come away from it feeling that it is a mile wide and an inch deep. No shade though! It is a beautiful piece of work -- I have been playing it from launch, and I think it is a towering achievement (not to mention the most incredible PR story in gaming history). No Man's Sky is an outstanding work of art and it should be remembered for what it is, not what it could be (or what many people think it should be). And the emergent gameplay of the players is really fun too.
|
| But when I think about what it might be like to have a higher PGD in NMS... it would be really something.
| segh wrote:
| Plenty of games get procedural generation right, from NetHack, to Spelunky, to Minecraft. No Man's Sky might just be a bad game.
| spy888 wrote:
| The logic sounds good in theory but in practice not much of this is grounded in today's reality. No one is reading AI generated text for their news or entertainment. That will be an important first step. If it ever happens.
| mdorazio wrote:
| Counterpoint: I am fairly certain that a large portion of financial news is machine-generated today. Machine-generated blogspam is also extremely prevalent and edging into the "news" space.
| vimy wrote:
| Yep.
|
| > The program can dissect a financial report the moment it appears and spit out an immediate news story that includes the most pertinent facts and figures. And unlike business reporters, who find working on that kind of thing a snooze, it does so without complaint. Untiring and accurate, Cyborg helps Bloomberg in its race against Reuters, its main rival in the field of quick-twitch business financial journalism, as well as giving it a fighting chance against a more recent player in the information race, hedge funds, which use artificial intelligence to serve their clients fresh facts. "The financial markets are ahead of others in this," said John Micklethwait, the editor in chief of Bloomberg. In addition to covering company earnings for Bloomberg, robot reporters have been prolific producers of articles on minor league baseball for The Associated Press, high school football for The Washington Post and earthquakes for The Los Angeles Times. https://www.nytimes.com/2019/02/05/business/media/artificial...
| fshbbdssbbgdd wrote:
| I wonder if this is really using ML at all to generate the output or if it's just filling out a template based on a predefined set of numbers and/or text properties.
| Baseball games, earthquakes, and financial data can all be ingested with a schema.
| cturner wrote:
| If there is anyone here with access to it, please could you try using it to generate some tight pixel art of simple concepts and post the results? I am thinking - chair, coffee, boat, control panel, wedding ring, lamp, pit. I am curious whether it could be a fast way to design sprites. Imagine if it could be trained in the style of particular desktop systems.
| dschnurr wrote:
| I gave this a try:
| https://twitter.com/_dschnurr/status/1514673929209614337
| KaoruAoiShiho wrote:
| Can you do the same in a more realistic style?
| https://www.artstation.com/artwork/E1qPv
|
| I always thought the idea behind pixel art is to save on artist resources, but since the AI is doing it I think indie devs would rather do something more advanced.
| TaupeRanger wrote:
| The only Pixel Art example I could find so far:
| https://twitter.com/0xCharlota/status/1511965632765607936?t=...
| SparkyMcUnicorn wrote:
| Found this one too:
| https://labs.openai.com/s/U5A157f1tgMOj2W97m3Kff5J
| ak391 wrote:
| open source alternative to DALL-E:
| https://huggingface.co/spaces/multimodalart/latentdiffusion
| tintor wrote:
| Looking forward to the first text adventure game built using DALL-E as a frontend.
| bulbosaur123 wrote:
| AIDungeon: been there, done that. It will become a battlefield of the most insane perversions and you know it.
| tintor wrote:
| AIDungeon has a text frontend. What I meant was an image-based frontend generated by DALL-E.
| bduerst wrote:
| AIDungeon is trained on previously played sessions, IIRC. It's pretty limited in generating any new outputs if players haven't previously fed them in (but I could be mistaken).
| [deleted]
| zitterbewegung wrote:
| I am trying to figure out how to integrate various deep learning networks together to make a coherent game.
| One of the big problems I have is having to use alternatives to DALL-E / GPT-3 because being contingent on their approval is a huge risk. I use huggingface instead and I have many video cards. The big problem right now is how to integrate these models and get good quality. GPT-3 and other systems stop working at around 500 words (tokens), and DALL-E is hard to use and it looks like it takes lots of training yourself to make it work.
|
| I don't think the marginal cost is absolutely zero until we can get classifiers or larger systems that can go from an image to a description to a word, and also have GPT-3 or another system that works for at least a few pages. Right now you have to cherry pick it.
| Barrin92 wrote:
| Computer generated content has its place, as it already does in many procedurally generated games as others have pointed out, but I do not think it will play a central role in the near future.
|
| One reason for this is technical. AI (not AGI but the systems that actually exist) can synthesize existing game data into new things but you'll still be in an uncanny valley of recycled information and in the worst case be in some bizarre feedback loop of old content that is permanently recycled and reconfigured in ways that players will notice. No Man's Sky is an example.
|
| The other reason is social, and I think it's the stronger case. We already have many games where computers could replace humans; chess is the most obvious one. Yet computer chess competitions only create marginal interest (by AI engineers) and players vastly prefer human interaction for the sole reason that it is human interaction. In the same vein there is computer generated music that actually is good enough to fool people, but there doesn't seem to be any market for synthetic artists.
|
| Or take Esports. DeepMind's AI systems are actually good enough to produce Starcraft tournaments solely played by machines.
| The games even look like high level human gameplay. But there is zero interest in watching a bot tournament.
| swix wrote:
| True, but couldn't the same be said about the real world to some degree? Everything "new" is based on something existing, right? Even in sci-fi/fantasy movies which are completely wild, completely "out there", the things in them are conjured by us, which is in some way shape or form based on our imagination, which is based in the reality we exist within.
|
| My point is, that feedback loop you speak of already exists, it's just that... there is SO much variation, content, possibilities... but the same thing could be true for AI generated content in some sense. Feed it all our assets (sounds, music, movies, images) and then let it go bananas... :) - you'll have as much variation in the digital world as we do in our real world.
| avaer wrote:
| Having worked on adjacent problems every day for 10 years, maybe my opinion counts for something (or maybe not).
|
| But I think the top game of the 2030s will be something like AI Minecraft or AI Steam, where everything, including the very rules of the game, is generated from a structured data set optimized for the player.
|
| And I think the "metaverse" (as much as I loathe the term) is going to go down as the labeled training set for bootstrapping this, just like the open web was the catalyzing training set for the (already admittedly magical) AIs we have today.
|
| Further, I think Facebook won't be the one to design this, because that is not what their share price incentivizes.
| kurthr wrote:
| I think with eye/attention tracking it will be absolutely amazing what 3D content can be generated and optimized by AI/ML to maintain the interaction feedback loop. I'm not sure it will be a good thing (at least for those who didn't grow up with it)... much like FB doom scrolling is a problem for the over 50 set.
| nhecker wrote:
| Wow!
| That's a crazy thought. Eye tracking tightly coupled (i.e., in a ~realtime feedback loop) with PG/ML/AI seems ... powerful, or scary, or both. Something along the lines of a computer-controlled lucid dream; sprinkle in a handful of whatever the equivalent would be of blinking banner ads, product placements, or subliminal messages, etc. and my mind spirals out of control imagining how that would play out.
| jayd16 wrote:
| A Calvinball AI would be a pretty interesting novelty but I think good games are usually focused on simple rulesets. Not sure an AI is really needed there. Maybe you just mean hyper tuning drop rates and modifiers and such?
|
| I'm sure AI will eventually have a huge impact on the art, narrative, and engineering of games, though, so maybe you're correct that it will bleed into game design as well.
| SquibblesRedux wrote:
| While custom-tailored games may be interesting, it seems like such a thing would be socially fragmenting and isolating. People need shared experiences to relate to each other. I'm not sure the world would be a better place if people gradually have fewer and fewer shared experiences.
| mathattack wrote:
| Unfortunately very few large companies truly operate to make the world a better place if their making money depends on the other direction.
| buttonpusher wrote:
| It seems like we could have both; people can generate worlds by some combination of automation and manually tweaking parameters or mods, then they can share that world with their friends, and visit worlds created by their friends. Some people may have esoteric taste, but the internet is good for finding people who share your esoteric taste, for better and for worse.
| Zababa wrote:
| > But I think the top game of the 2030s will be something like AI Minecraft or AI Steam, where everything, including the very rules of the game, is generated from a structured data set optimized for the player.
|
| Steam and Minecraft are both social by nature. People very often want to play with other people. It's like the joke I make all the time about AI feeds on Netflix: they recommend the same thing to everyone because the AI realized that everyone wants to talk about the thing they saw more than they want to enjoy seeing the thing. Humans are social creatures.
| flycaliguy wrote:
| I wonder if carving out individual experiences would prevent users from displaying their use of said entertainment to create social status. So I'm coming at it wondering if people value the status that their form of entertainment provides more than the experience itself. On the other hand, maybe we'll continue to just observe the death of the "mainstream" as we all slip into our own niche communities, each with its own complex system of status signifiers?
|
| This stuff really leaves me pretty puzzled. I'm a culture guy, English grad. Art is not supposed to behave like this!!!
| KaoruAoiShiho wrote:
| Not sure I get the social network or metaverse angle. But basically 0 cost content as it applies to entertainment and maybe other industries as well. It's well beyond metaverse if you consider metaverse or gaming to only be 1 aspect of entertainment. And let's not be mistaken, we're talking about all entertainment, not just mass media. Looking forward to things like replacement of restaurants and tourism and even chatting with friends, and this is well ahead of what we need from AGI.
| wodenokoto wrote:
| I know you're being sarcastic, but as a kid I dreamed of an AI dungeon master that could play D&D-type games and draw beautiful pictures of the scenery that the scenario would take place in.
| ALittleLight wrote:
| This YouTube video, to me, shows the promise of things to come. AI generated game worlds. Language models to generate plots and dialog, transformers and GANs to create illustrations.
| Imagine a game, a truly open world sandbox, Grand Theft Auto meets AI Dungeon - every NPC is a "real" person with unlimited dialog options, the buildings you drive by you could easily walk in and investigate, unlimited play space - you could type in more general instructions and ideas to the plot generator ("add in a vampire romance and murder mystery angle") on the fly.
|
| https://www.youtube.com/watch?v=udPY5rQVoW0
| twoodfin wrote:
| What you describe is actually all of the more interesting aspects of the "holodeck" as introduced and explored (some would say too deeply) as a story concept on _Star Trek: The Next Generation_.
|
| There are more than a few scenes where the intrepid crew members struggle with what we'd now recognize as prompts.
|
| https://youtu.be/p7pPedBtbvk
| oefnak wrote:
| I'm really hoping that somebody puts in the dwarf fortress forgotten beast descriptions...
| jdrc wrote:
| Looking forward to AI satire which I can download and run on my laptop, away from the censorious ears that all but destroyed comedy
| exdsq wrote:
| I used GPT-3 to generate some porn scripts of various historical dictators and modern politicians that were pretty funny. Putin's secret romance with Kim Jong Un was especially saucy, until Osama Bin Laden found out and told the EU. I'd be happy to find them and put them on a blog somewhere if there was an audience ha!
___________________________________________________________________
(page generated 2022-04-14 23:01 UTC)