[HN Gopher] 15.ai
___________________________________________________________________
15.ai
Author : memorable
Score  : 221 points
Date   : 2022-06-12 03:33 UTC (19 hours ago)
(HTM) web link (15.ai)
(TXT) w3m dump (15.ai)
| hojjat12000 wrote:
| They named their TTS "Deep Throat"? Why would you?
| layer8 wrote:
| Maybe they're seeing a need for text-to-speech in the porn market?
| Bytewave81 wrote:
| They knew.
| latenightcoding wrote:
| bronnies
| xdennis wrote:
| It could be a reference to https://en.wikipedia.org/wiki/Deep_Throat_(Watergate)
| mgdlbp wrote:
| to the Deep _Foo_ pattern in deep learning naming, more likely.
| BeFlatXIII wrote:
| Why not both at once?
| droidist2 wrote:
| Which itself was a reference to the pornographic film of the same name.
|
| https://en.wikipedia.org/wiki/Deep_Throat_(film)
| userbinator wrote:
| Relatedly, a speech synth (or rather, the "output" part) that has appeared on HN before is named the Pink Trombone:
|
| https://news.ycombinator.com/item?id=18912628
| quenix wrote:
| Perhaps as a joke?
| lagrange77 wrote:
| I only get white noise after trying several inputs. Alignment Confidence > 80%
| [deleted]
| armchairhacker wrote:
| This is really cool. It's a text-to-speech and the gist seems to be that they synthesize it from only a little audio.
|
| The results are clearly synthetic and need work. However what's cool is that there are a ton of characters (from popular shows and video games) and there are useful statistics like inferred emotion (which is also in the output).
|
| Honestly it's a big problem how a lot of AIs are like "black boxes" where you really can't customize or see anything. Yeah we have DALL-E and GPT which can generate text and images but the lack of customization or fine-tuning the image afterwards severely hinders what's possible with them.
| Ultimately what you want is something interactive, where you can control how much or little the AI generates, and give it really specific criteria.
|
| But seriously: how did you get the domain `15.ai`?
| [deleted]
| jamal-kumar wrote:
| I just used it to make SpongeBob SquarePants say bad things.
| BuyMyBitcoins wrote:
| This thing synthesizes dolphin squeaks? Wow!
| Der_Einzige wrote:
| In the case of text generation, we call this "Constrained Text Generation" and it is an active field of research. Without going into too many details (I have a paper out for review about this), it's pretty trivial to get "interactive control over how much or how little the AI generates" by a combination of filters on the LM's vocabulary, and effective selection of the various hyperparameters in the decoder (top_p, top_k, temperature)...
| userbinator wrote:
| I agree this is an amazing demonstration of what AI can do, but I think that the current method of "learn and repeat" that depends on having tons of computing resources available is still too inefficient in many ways. Personally I'm more interested in what parameterisable formant-based synths can do, since they are extremely efficient and can produce a theoretically infinite variation of voices, although the output quality is still not great. Example: https://news.ycombinator.com/item?id=31604299
| teaearlgraycold wrote:
| You can fine tune GPT-3
| canjobear wrote:
| Only if you're OpenAI
| jameshart wrote:
| Fine tuning of GPT-3 models is available via their public API. Costs credits, and you need to get their permission to use it in an actual application, but it's not locked in a lab.
| sillysaurusx wrote:
| So "Only if you're OpenAI" :)
|
| If the weights were public, the community would figure out a way to fine tune it.
| jameshart wrote:
| It's not a matter of 'figuring out'. The model supports fine tuning. It's a core feature of the OpenAI API.
| Running 'fine tuned' versions of GPT-3 that are created by customers is literally their SaaS model. They have examples in the documentation. Here: https://help.openai.com/en/articles/5528730-fine-tuning-a-cl...
| Dangeranger wrote:
| GPT-3 can be quite adaptive given prompt engineering and the uploading of sample files.
|
| Have you used GPT-3 with any of the methods mentioned in the docs?
|
| I've seen that GPT-3 can produce quite starkly different results when prompted differently and when samples have been uploaded.
| [deleted]
| Deritio wrote:
| DALL-E 2 has customization.
|
| You can remove or add things etc.
|
| And for GPT you can also specify more details.
|
| Only a question of time until you can work with the AI on your art/thing.
|
| There are AI models which keep track of context and others which generate a plan of actions.
|
| AI is not a black box
| sterlind wrote:
| OpenAI itself is a black box. Until I can reproduce their models or download them myself, and have unfettered access to them, it's just gatekept magic behind an API. So much for democratizing machine learning.
| judge2020 wrote:
| > So much for democratizing machine learning.
|
| Unless this is a recent change, their mission isn't that:
|
| > OpenAI's mission is to ensure that artificial general intelligence (AGI)--by which we mean highly autonomous systems that outperform humans at most economically valuable work--benefits all of humanity.
|
| https://openai.com/about/
| marcofatica wrote:
| > But seriously: how did you get the domain `15.ai`?
|
| it's an MIT project so I'm sure that was a factor
| paulsutter wrote:
| .ai domains cost a couple hundred bucks a year so domains are very available / not widely used by domain squatters (It's the country domain for the island of Anguilla, pop 15,000)
| vehemenz wrote:
| That's a lot of SpongeBob and My Little Pony characters.
| At this point, is it fair to say the attachment to kids' cartoons is a cultural (or pathological) phenomenon for under 30s?
| eljimmy wrote:
| This is unrelated but what's with the fascination with HN users and My Little Pony? I've noticed this on a lot of posts in the past few months.
| canjobear wrote:
| A lot of people in tech circles have a sexual fixation on the show and its characters.
| BeFlatXIII wrote:
| It's a good thing they're warehoused in cities and apartments, then.
| jeroenhd wrote:
| Aside from the casual brony references, this project originally featured a lot of My Little Pony voices because it needed meticulously annotated transcriptions of the input audio to be trained well.
|
| The extremely dedicated brony subculture voluntarily put in a lot of work to get a corpus for the AI to learn from.
|
| There's also another factor at play: this AI works best with high-pitched voices, which My Little Pony is just full of. Not only did MLP provide such a generous source of training data, its results were also much more impressive than the dry dictation many other corpora would've resulted in, adding to its fame.
|
| I personally haven't seen any significant rise in MLP references, though that could be because I don't know the show so I don't catch references to it. It's also very possible that you've caught the Baader-Meinhof phenomenon.
| crooked-v wrote:
| It's basically the same as unironic appreciation of various child-targeted-but-adult-friendly 'slice of life' anime, just more incongruous-seeming because of the 'pony' thing.
| smoldesu wrote:
| I mean, 15.ai started as a 4chan project for /mlp/ users to generate voice lines from official voice actors now that Friendship is Magic is over (google Pony Preservation Project). Honestly, the _more_ impressive part is that a bunch of nobodies on an imageboard leapfrogged the rest of the industry and made a now-famous voice transformer model.
|
| In the greater sense, though? Ponies have always been this weird relic of internet absurdity and bear-baiting. Some people rep it ironically, other people are dead-serious, but the community has significant overlap with the STEM field. As a result, a lot of pony-related stuff would end up propagating into the tech world, much like this very project.
| loves_mangoes wrote:
| A lot of people in or around tech are furries, are into things like Japanese animation, or are into My Little Pony. I don't consider myself one, but people often jokingly say that furries run the Internet.
|
| And it's not really specific to HN. For instance you have well-known people in the community who do vaccine R&D, or cryptography, or contribute to the C/C++ standards at ISO, or several other STEM things that are pretty outspoken about their interests.
|
| This is made more obvious on Twitter, where people tend to blur their personal and work identities a lot.
| Der_Einzige wrote:
| My ML professor at the university I went to was also weirdly obsessed with MLP.
|
| Weeaboo/furry data scientists are always ahead of the industry - I seem to recall an effective decensoring model that was called "DeepCreamPy" and had almost 10K GitHub stars before it was nuked and rehosted.
|
| I'm convinced that learning Statistics is in a zero-sum game with social skills.
| btown wrote:
| https://en.wikipedia.org/wiki/My_Little_Pony:_Friendship_Is_... explains in detail - between 2010 and ~2015 there was a massive overlap between millennial geek culture and unironic fandom of the rebooted My Little Pony show, especially among millennial men. One dedicated fan hub averaged almost 400k page views per day over its first 3.5 years of existence. And throughout it all, programming projects abounded, such as the delightful FiM++ esoteric language (https://esolangs.org/wiki/FiM%2B%2B) styled after the show's framing device.
| For many in tech now, it was an inescapable part of internet culture of the early 2010s, and a fond memory for many.
| jonas21 wrote:
| One of my favorite examples from that era:
|
| https://pjreddie.com/static/Redmon%20Resume.pdf
|
| And in case you were wondering what this little pony did next...
|
| https://scholar.google.com/citations?user=TDk_NfkAAAAJ&hl=en
| Der_Einzige wrote:
| Wait, the guy who wrote darknet IS THE SAME GUY WHO DID THIS RESUME?
|
| AHHHHHHHHH
| [deleted]
| drblue wrote:
| Friendship is Magic was a legitimately good show. (Or at least Season 1 and 2 were).
| nope96 wrote:
| Oh god, 50 shades of SpongePants. The future is wild in ways I never imagined. Star Trek style holodecks in what, 15 years?
|
| So, creepy thought: should we be recording audio of our parents, so we can still "hear from them" once in a while after they die? People are going to want to reconstruct their lost loved ones with AI. This project seems to imply you only need an hour or so of audio.
| batch12 wrote:
| After my dad died, we found that he had recorded every phone call he had with us. I thought about doing this combined with text generation to create plausible prompts but never got the guts to go through with it. He wouldn't care if I had done it, but it wouldn't ease the guilt from years of sighs and rolling my eyes when he called at always the wrong times.
| WalterGR wrote:
| If anyone is curious, the previous submission of this was popular: https://news.ycombinator.com/item?id=25654118
| convery wrote:
| Interesting how it seems like there's little correlation between source sample-size and quality. e.g. the Portal Sentry turret at 1.5min input vs the 100+ minutes of the narrator from Stanley Parable which sounded like auto-tune had a stroke.
| jeroenhd wrote:
| The AI seems to work best on high-pitched, female voices.
| The model seems to have improved in this regard since I last tried this website, but it's still very significantly biased towards female voices it seems.
| crooked-v wrote:
| Much of it depends on refinement work on each specific model. Try the Daria voices, for example, which are easy to get results with that sound like they came straight out of the show.
| darkerside wrote:
| Unfortunately, I guess I've reached the stage of my life where there are only three choices I actually would recognize out of the entire selection
| _gabe_ wrote:
| > All code and models used for this website were written and trained as part of my research at the Massachusetts Institute of Technology (MIT). The code and models are privately owned and are not to be sold or distributed for unauthorized use.
|
| Does anybody else find the irony in this statement absolutely amazing lol.
| belter wrote:
| https://tlo.mit.edu/learn-about-intellectual-property/owners....
|
| "...MIT owns inventions made or created by MIT faculty, students, staff, and others participating in sponsored research projects or in MIT programs using significant MIT funds or facilities or those inventions developed pursuant to a written agreement with MIT..."
|
| I got RickRolled as soon as arriving to the page. :-)
| ntoskrnl wrote:
| The Chell voice from Portal is extremely accurate
| BeFlatXIII wrote:
| How does she compare to the Gordon Freeman model?
| blooalien wrote:
| 100% accurate to be precise. ;)
| deeplearner1 wrote:
| If you want more information about 15.ai, I highly suggest reading their Wikipedia article! https://en.wikipedia.org/wiki/15.ai
|
| The whole history behind the project is fascinating: 4chan had a huge role in its development, and the project's work was stolen by an NFT company that a famous voice actor endorsed not too long ago.
| julianeon wrote:
| Ah, I was wondering why they were so concerned about attribution.
|
| The truth is that, today, if I was going to use a tool to generate voices (say for YouTube), I wouldn't necessarily pick a small SaaS tool. I'd use Amazon Polly or some other GCP-style platform voice creation tool. There are already a few products in the space, and their costs are so low as to be almost negligible (example: Polly, 5 million characters free). For a commercial project, I could probably stay on a free tier for a whole year.
|
| With DALL-E, it seems like the only option, and it's such a superior option that a website could abuse it for commercial profits. But for voice synthesis, it's already dirt cheap and commercially available without limitations.
| quickthrower2 wrote:
| What is the tl;dr? Got a wall of terms of service I didn't want to agree to and clicking reject was a Rickroll.
| forrestthewoods wrote:
| The copyright laws around this are fascinating. They're adamant it must be non-commercial, they must be credited, and it can't be mixed with any other generated content. Meanwhile their content is exclusively derived from popular commercial products. Oh and they also make money via Patreon donations.
|
| I dunno. Feels a little gross to me. Eventually there is going to be a big copyright case about a model trained with copyrighted material. I have no idea how that will be resolved. Or maybe there will simply be new laws passed to make it either explicitly ok or explicitly not ok.
| deeplearner1 wrote:
| "Make money"? The creator loses several thousands of dollars a month hosting the site, and it's done for free. The Patreon donations are all voluntary and only offer a pittance to the developer.
|
| I highly suggest reading into the project first. The Wiki article I linked before (https://en.wikipedia.org/wiki/15.ai) answers all of your questions about copyright infringement.
| jason2323 wrote:
| Hah! If you click on reject on the cookies window it rickrolls[1] you!
|
| [1]https://www.urbandictionary.com/define.php?term=Rick%20Roll
| capelio wrote:
| Except that wasn't a cookies acceptance window...
| quickthrower2 wrote:
| Those 2 comments sum up the web in 2022
| layer8 wrote:
| Well that is one shitty ToS dialog.
| [deleted]
| claviska wrote:
| I appreciate the intent, and I understand that many people will do the wrong thing so this was probably an attempt to get such folks to actually read and adhere to the TOS, but the obnoxious consent dialog with a mandatory countdown turned me off. It's probably not effective, either.
|
| On desktop, maybe I'd open dev tools and remove it. On mobile, I won't be bothered. I hate that this is what the web has become and I choose to simply miss out on websites that behave this way.
| sophiebits wrote:
| Weird, I read through the text because I care about how I'm allowed to use the things people are giving me - and by the time I got to the Accept button, it was enabled.
| darkerside wrote:
| I just want you to know that it was absolutely hilarious to hear (the first half of) this read in the voice of SpongeBob SquarePants.
| s-xyz wrote:
| The DeepThroat model? Sounds familiar...
___________________________________________________________________
(page generated 2022-06-12 23:00 UTC)