[HN Gopher] 15.ai
       ___________________________________________________________________
        
       15.ai
        
       Author : memorable
       Score  : 221 points
       Date   : 2022-06-12 03:33 UTC (19 hours ago)
        
 (HTM) web link (15.ai)
        
       | hojjat12000 wrote:
       | They named their tts "Deep Throat"? Why would you?
        
         | layer8 wrote:
         | Maybe they're seeing a need for text-to-speech in the porn
         | market?
        
         | Bytewave81 wrote:
         | They knew.
        
         | latenightcoding wrote:
         | bronies
        
         | xdennis wrote:
         | It could be a reference to
         | https://en.wikipedia.org/wiki/Deep_Throat_(Watergate)
        
           | mgdlbp wrote:
           | to the Deep _Foo_ pattern in deep learning naming, more
           | likely.
        
             | BeFlatXIII wrote:
             | Why not both at once?
        
           | droidist2 wrote:
           | Which itself was a reference to the pornographic film of the
           | same name.
           | 
           | https://en.wikipedia.org/wiki/Deep_Throat_(film)
        
         | userbinator wrote:
         | Relatedly, a speech synth (or rather, the "output" part) that
         | has appeared on HN before is named the Pink Trombone:
         | 
         | https://news.ycombinator.com/item?id=18912628
        
         | quenix wrote:
         | Perhaps as a joke?
        
       | lagrange77 wrote:
       | I only get white noise after trying several inputs. Alignment
       | Confidence > 80%
        
         | [deleted]
        
       | armchairhacker wrote:
       | This is really cool. It's a text-to-speech tool, and the gist seems to
       | be that they synthesize it from only a little audio.
       | 
       | The results are clearly synthetic and need work. However what's
       | cool is that there are a ton of characters (from popular shows
       | and video games) and there are useful statistics like inferred
       | emotion (which is also in the output).
       | 
       | Honestly it's a big problem how a lot of AIs are like "black
       | boxes" where you really can't customize or see anything. Yeah we
       | have DALL-E and GPT, which can generate text and images, but the lack
       | of customization or fine-tuning the image afterwards severely
       | hinders what's possible with them. Ultimately what you want is
       | something interactive, where you can control how much or little
       | the AI generates, and give it really specific criteria.
       | 
       | But seriously: how did you get the domain `15.ai`?
        
         | [deleted]
        
         | jamal-kumar wrote:
         | I just used it to make spongebob squarepants say bad things.
        
           | BuyMyBitcoins wrote:
           | This thing synthesizes dolphin squeaks? Wow!
        
         | Der_Einzige wrote:
         | In the case of text generation, we call this "Constrained Text
         | Generation" and it is an active field of research. Without
         | going into too many details (I have a paper out for review
         | about this), it's pretty trivial to get "interactive control
         | over how much or how little the AI generates" by a combination
         | of filters on the LM's vocabulary, and effective selection of
         | the various hyperparameters in the decoder (top_p, top_k,
         | temperature)...
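
A minimal sketch of the knobs named above, assuming a toy vocabulary with made-up logits rather than a real language model. It is not taken from the paper the commenter mentions; `filter_logits`, `sample`, and the `banned` parameter are hypothetical names used only for illustration. It shows a vocabulary filter followed by temperature scaling, top-k truncation, and top-p (nucleus) truncation.

```python
import math
import random


def filter_logits(logits, banned=(), temperature=1.0, top_k=0, top_p=1.0):
    """Return a {token: probability} dict after applying the constraints.

    logits: dict mapping token -> raw score (toy stand-in for an LM's output).
    """
    # Vocabulary filter: drop banned tokens before anything else.
    kept = {tok: score for tok, score in logits.items() if tok not in banned}

    # Temperature-scaled softmax (subtract the max for numerical stability).
    top = max(kept.values())
    weights = {tok: math.exp((s - top) / temperature) for tok, s in kept.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # top-k: keep only the k most likely tokens (0 means "no limit").
    if top_k > 0:
        ranked = ranked[:top_k]

    # top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    nucleus, mass = [], 0.0
    for tok, p in ranked:
        nucleus.append((tok, p))
        mass += p
        if mass >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    norm = sum(p for _, p in nucleus)
    return {tok: p / norm for tok, p in nucleus}


def sample(dist, rng=random):
    """Draw one token from a {token: probability} dict."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


# Made-up scores for illustration only.
toy_logits = {"the": 2.0, "a": 1.0, "cat": 0.5, "damn": 3.0}
dist = filter_logits(toy_logits, banned={"damn"}, temperature=0.7,
                     top_k=2, top_p=0.9)
next_token = sample(dist)
```

Lower temperature and smaller top_k/top_p make the output more predictable; the banned set is the crudest form of a vocabulary filter.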
        
         | userbinator wrote:
         | I agree this is an amazing demonstration of what AI can do, but
         | I think that the current method of "learn and repeat" that
         | depends on having tons of computing resources available is
         | still too inefficient in many ways. Personally I'm more
         | interested in what parameterisable formant-based synths can do,
         | since they are extremely efficient and can produce a
         | theoretically infinite variation of voices, although the output
         | quality is still not great. Example:
         | https://news.ycombinator.com/item?id=31604299
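
For readers unfamiliar with the idea, here is a rough sketch of a formant synth in its simplest form: a pulse train at the voice pitch passed through a cascade of two-pole resonators, one per formant. It is not the synth from the linked thread. The formant frequencies are approximate textbook values for an "ah" vowel, the bandwidths are assumptions, and `resonator` and `synth_vowel` are hypothetical names.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator that boosts energy around freq_hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    c = -r * r
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synth_vowel(formants, pitch_hz=120.0, duration_s=0.5, sample_rate=16000):
    """Return a list of floats in [-1, 1] approximating a sustained vowel.

    formants: list of (centre frequency in Hz, bandwidth in Hz) pairs.
    """
    n = int(duration_s * sample_rate)
    period = max(1, round(sample_rate / pitch_hz))
    # Source: an impulse train standing in for glottal pulses.
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    # Filter: one resonator per formant, applied in cascade.
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    # Scale so the loudest sample has magnitude 1.
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]


# Approximate formants for "ah"; bandwidths are assumed values.
AH = [(730.0, 80.0), (1090.0, 90.0), (2440.0, 120.0)]
samples = synth_vowel(AH)
```

The whole voice is described by a pitch and a handful of (frequency, bandwidth) pairs, which is why this approach is so cheap and so easy to vary compared with a learned model.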
        
         | teaearlgraycold wrote:
         | You can fine tune GPT-3
        
           | canjobear wrote:
           | Only if you're OpenAI
        
             | jameshart wrote:
             | Fine tuning of GPT3 models is available via their public
             | API. Costs credits, and you need to get their permission to
             | use it in an actual application, but it's not locked in a
             | lab.
        
               | sillysaurusx wrote:
               | So "Only if you're OpenAI" :)
               | 
               | If the weights were public, the community would figure
               | out a way to fine tune it.
        
               | jameshart wrote:
               | It's not a matter of 'figuring out'. The model supports
               | fine tuning. It's a core feature of the openai API.
               | Running 'fine tuned' versions of GPT-3 that are created
               | by customers is literally their SaaS model. They have
               | examples in the documentation. Here:
               | https://help.openai.com/en/articles/5528730-fine-tuning-
               | a-cl...
        
             | Dangeranger wrote:
             | GPT-3 can be quite adaptive given prompt engineering and
             | the uploading of sample files.
             | 
             | Have you used GPT-3 with any of the methods mentioned in
             | the docs?
             | 
             | I've seen that GPT-3 can produce quite starkly different
             | results when prompted differently and when samples have
             | been uploaded.
        
         | [deleted]
        
         | Deritio wrote:
         | Dall-E 2 has customization.
         | 
         | You can remove or add things etc.
         | 
         | And for GPT you can also specify more details.
         | 
         | Only a question of time until you can work with the ai on your
         | art/thing.
         | 
         | There are ai models which keep track of context and others
         | which generate a plan of actions.
         | 
         | AI is not a blackbox
        
           | sterlind wrote:
           | OpenAI itself is a black box. Until I can reproduce their
           | models or download them myself, and have unfettered access to
           | them, it's just gatekept magic behind an API. So much for
           | democratizing machine learning.
        
             | judge2020 wrote:
             | > So much for democratizing machine learning.
             | 
             | Unless this is a recent change, their mission isn't that:
             | 
             | > OpenAI's mission is to ensure that artificial general
             | intelligence (AGI)--by which we mean highly autonomous
             | systems that outperform humans at most economically
             | valuable work--benefits all of humanity.
             | 
             | https://openai.com/about/
        
         | marcofatica wrote:
         | > But seriously: how did you get the domain `15.ai`?
         | 
         | it's an MIT project so I'm sure that was a factor
        
           | paulsutter wrote:
           | .ai domains cost a couple hundred bucks a year so domains are
           | very available / not widely used by domain squatters (It's the
           | country domain for the island of Anguilla, pop 15,000)
        
       | vehemenz wrote:
       | That's a lot of SpongeBob and My Little Pony characters. At this
       | point, is it fair to say the attachment to kids' cartoons is a
       | cultural (or pathological) phenomenon for under 30s?
        
       | eljimmy wrote:
       | This is unrelated but what's with the fascination with HN users
       | and My Little Pony? I've noticed this on a lot of posts in the
       | past few months.
        
         | canjobear wrote:
         | A lot of people in tech circles have a sexual fixation on the
         | show and its characters.
        
           | BeFlatXIII wrote:
           | It's a good thing they're warehoused in cities and
           | apartments, then.
        
         | jeroenhd wrote:
         | Aside from the casual brony references, this project originally
         | featured a lot of my little pony voices because it needed
         | meticulously annotated transcriptions of the input audio to be
         | trained well.
         | 
         | The extremely dedicated brony subculture voluntarily put in a
         | lot of work to get a corpus for the AI to learn from.
         | 
         | There's also another factor at play: this AI works best with
         | highly pitched voices, which my little pony is just full of.
         | Not only did MLP provide such a generous source of training
         | data, its results were also much more impressive than the dry
         | dictation many other corpora would've resulted in, adding to its
         | fame.
         | 
         | I personally haven't seen any significant rise in MLP
         | references, though that could be because I don't know the show
         | so I don't catch references to it. It's also very possible that
         | you've caught the Baader-Meinhof phenomenon.
        
         | crooked-v wrote:
         | It's basically the same as unironic appreciation of various
         | child-targeted-but-adult-friendly 'slice of life' anime, just
         | more incongruous-seeming because of the 'pony' thing.
        
         | smoldesu wrote:
         | I mean, 15.ai started as a 4chan project for /mlp/ users to
         | generate voice lines from official voice actors now that
         | Friendship is Magic is over (google Pony Preservation Project).
         | Honestly, the _more_ impressive part is that a bunch of
         | nobodies on an imageboard leapfrogged the rest of the industry
         | and made a now-famous voice transformer model.
         | 
         | In the greater sense, though? Ponies have always been this
         | weird relic of internet absurdity and bear-baiting. Some people
         | rep it ironically, other people are dead-serious, but the
         | community has significant overlap with the STEM field. As a
         | result, a lot of pony-related stuff would end up propagating
         | into the tech world, much like this very project.
        
         | loves_mangoes wrote:
         | A lot of people in or around tech are furries, are into things
         | like Japanese animation, or are into My Little Pony. I don't
         | consider myself one, but people often jokingly say that furries
         | run the Internet.
         | 
         | And it's not really specific to HN. For instance you have well-
         | known people in the community who do vaccine R&D, or
         | cryptography, or contribute to the C/C++ standards at ISO, or
         | several other STEM things, and who are pretty outspoken about their
         | interests.
         | 
         | This is made more obvious on Twitter, where people tend to blur
         | their personal and work identities a lot.
        
         | Der_Einzige wrote:
         | My ML professor at the university I went to was also weirdly
         | obsessed with MLP.
         | 
         | Weeaboo/furry data scientists are always ahead of the industry
         | - I seem to recall an effective decensoring model that was
         | called "DeepCreamPy" and had almost 10K github stars before it
         | was nuked and rehosted.
         | 
         | I'm convinced that learning Statistics is in a zero-sum game
         | with social skills.
        
         | btown wrote:
         | https://en.wikipedia.org/wiki/My_Little_Pony:_Friendship_Is_...
         | explains in detail - between 2010 and ~2015 there was a massive
         | overlap between millennial geek culture and unironic fandom of
         | the rebooted My Little Pony show, especially among millennial
         | men. One dedicated fan hub averaged almost 400k page views per
         | day over its first 3.5 years of existence. And throughout it
         | all, programming projects abounded, such as the delightful
         | FiM++ esoteric language (https://esolangs.org/wiki/FiM%2B%2B)
         | styled after the show's framing device. For many in tech now,
         | it was an inescapable part of internet culture of the early
         | 2010s, and a fond memory for many.
        
           | jonas21 wrote:
           | One of my favorite examples from that era:
           | 
           | https://pjreddie.com/static/Redmon%20Resume.pdf
           | 
           | And in case you were wondering what this little pony did
           | next...
           | 
           | https://scholar.google.com/citations?user=TDk_NfkAAAAJ&hl=en
        
             | Der_Einzige wrote:
             | Wait, the guy who wrote darknet IS THE SAME GUY WHO DID
             | THIS RESUME?
             | 
             | AHHHHHHHHH
        
         | [deleted]
        
         | drblue wrote:
         | Friendship is Magic was a legitimately good show. (Or at least
         | Season 1 and 2 were).
        
       | nope96 wrote:
       | Oh god, 50 shades of SpongePants. The future is wild in ways I
       | never imagined. Star Trek style holodecks in what, 15 years?
       | 
       | So, creepy thought: should we be recording audio of our parents,
       | so we can still "hear from them" once in a while after they die?
       | People are going to want to reconstruct their lost loved ones
       | with AI. This project seems to imply you only need an hour or so
       | of audio.
        
         | batch12 wrote:
         | After my dad died, we found that he had recorded every phone
         | call he had with us. I thought about doing this combined with
         | text generation to create plausible prompts but never got the
         | guts to go through with it. He wouldn't care if I had done it,
         | but it wouldn't ease the guilt from years of sighs and rolling
         | my eyes when he always called at the wrong times.
        
       | WalterGR wrote:
       | If anyone is curious, the previous submission of this was
       | popular: https://news.ycombinator.com/item?id=25654118
        
       | convery wrote:
       | Interesting how it seems like there's little correlation between
       | source sample-size and quality. e.g. the Portal Sentry turret at
       | 1.5min input vs the 100+ minutes of the narrator from The Stanley
       | Parable which sounded like auto-tune had a stroke.
        
         | jeroenhd wrote:
         | The AI seems to work best on high-pitched, female voices. The
         | model seems to have improved in this regard since I last tried
         | this website, but it's still very significantly biased towards
         | female voices it seems.
        
         | crooked-v wrote:
         | Much of it depends on refinement work on each specific model.
         | Try the Daria voices, for example, which are easy to get
         | results with that sound like they came straight out of the
         | show.
        
       | darkerside wrote:
       | Unfortunately, I guess I've reached the stage of my life where
       | there are only three choices I actually would recognize out of
       | the entire selection
        
       | _gabe_ wrote:
       | > All code and models used for this website were written and
       | trained as part of my research at the Massachusetts Institute of
       | Technology (MIT). The code and models are privately owned and are
       | not to be sold or distributed for unauthorized use.
       | 
       | Does anybody else find the irony in this statement absolutely
       | amazing lol.
        
         | belter wrote:
         | https://tlo.mit.edu/learn-about-intellectual-
         | property/owners....
         | 
         | "...MIT owns inventions made or created by MIT faculty,
         | students, staff, and others participating in sponsored research
         | projects or in MIT programs using significant MIT funds or
         | facilities or those inventions developed pursuant to a written
         | agreement with MIT..."
         | 
         | I got RickRolled as soon as I arrived at the page. :-)
        
       | ntoskrnl wrote:
       | The Chell voice from Portal is extremely accurate
        
         | BeFlatXIII wrote:
         | How does she compare to the Gordon Freeman model?
        
         | blooalien wrote:
         | 100% accurate to be precise. ;)
        
       | deeplearner1 wrote:
       | If you want more information about 15.ai, I highly suggest
       | reading their Wikipedia article!
       | https://en.wikipedia.org/wiki/15.ai
       | 
       | The whole history behind the project is fascinating: 4chan had a
       | huge role in its development, and the project's work was stolen
       | by an NFT company that a famous voice actor endorsed not too long
       | ago.
        
         | julianeon wrote:
         | Ah, I was wondering why they were so concerned about
         | attribution.
         | 
         | The truth is that, today, if I was going to use a tool to
         | generate voices (say for YouTube), I wouldn't necessarily pick
         | a small SaaS tool. I'd use Amazon Polly or some other GCP-style
         | platform voice creation tool. There are already a few products
         | in the space, and their costs are so low as to be almost
         | negligible (example: Polly, 5 million characters free). For a
         | commercial project, I could probably stay on a free tier for a
         | whole year.
         | 
         | With DALL-E, it seems like the only option, and it's such a
         | superior option that a website could abuse it for commercial
         | profits. But for voice synthesis, it's already dirt cheap and
         | commercially available without limitations.
        
       | quickthrower2 wrote:
       | What is the tl;dr? I got a wall of terms of service I didn't want
       | to agree to, and clicking reject was a Rickroll.
        
       | forrestthewoods wrote:
       | The copyright laws around this are fascinating. They're adamant
       | it must be non-commercial, they must be credited, and it can't be
       | mixed with any other generated content. Meanwhile their content
       | is exclusively derived from popular commercial products. Oh and
       | they also make money via Patreon donations.
       | 
       | I dunno. Feels a little gross to me. Eventually there is going to
       | be a big copyright case about a model trained with copyrighted
       | material. I have no idea how that will be resolved. Or maybe
       | there will simply be new laws passed to make it either explicitly
       | ok or explicitly not ok.
        
         | deeplearner1 wrote:
         | "Make money"? The creator loses several thousands of dollars a
         | month hosting the site, and it's done for free. The Patreon
         | donations are all voluntary and only offer a pittance to the
         | developer.
         | 
         | I highly suggest reading into the project first. The Wiki
         | article I linked before (https://en.wikipedia.org/wiki/15.ai)
         | answers all of your questions about copyright infringement.
        
       | jason2323 wrote:
       | Hah! If you click on reject on the cookies window it rickrolls[1]
       | you!
       | 
       | [1]https://www.urbandictionary.com/define.php?term=Rick%20Roll
        
         | capelio wrote:
         | Except that wasn't a cookies acceptance window...
        
           | quickthrower2 wrote:
           | Those 2 comments sum up the web in 2022
        
       | layer8 wrote:
       | Well that is one shitty ToS dialog.
        
         | [deleted]
        
       | claviska wrote:
       | I appreciate the intent, and I understand that many people will
       | do the wrong thing so this was probably an attempt to get such
       | folks to actually read and adhere to the TOS, but the obnoxious
       | consent dialog with a mandatory countdown turned me off. It's
       | probably not effective, either.
       | 
       | On desktop, maybe I'd open dev tools and remove it. On mobile, I
       | won't be bothered. I hate that this is what the web has become
       | and I choose to simply miss out on websites that behave this way.
        
         | sophiebits wrote:
         | Weird, I read through the text because I care about how I'm
         | allowed to use the things people are giving me - and by the
         | time I got to the Accept button, it was enabled.
        
         | darkerside wrote:
         | I just want you to know that it was absolutely hilarious to
         | hear (the first half of) this read in the voice of SpongeBob
         | SquarePants.
        
       | s-xyz wrote:
       | The DeepThroat model? Sounds familiar...
        
       ___________________________________________________________________
       (page generated 2022-06-12 23:00 UTC)