[HN Gopher] Show HN: Clone your voice and speak a foreign language
___________________________________________________________________

Show HN: Clone your voice and speak a foreign language

Author : _josh_meyer_
Score  : 115 points
Date   : 2022-01-03 20:17 UTC (2 hours ago)

(HTM) web link (coqui.ai)
(TXT) w3m dump (coqui.ai)

| alonmln wrote:
| Cool, it's impressive how much it can do with a short sample,
| although this seems like an easy way for end users to deep fake
| their friends / enemies saying something.

| tiborsaas wrote:
| I tested it with your comment: https://sndup.net/mghy/ :)
|
| It's also a new possibility to somewhat personalize the text to
| speech engines. The above example is not really close to my
| voice.

| Philip-J-Fry wrote:
| Maybe the solution is to have a randomly generated paragraph of
| text to read which expires in a short amount of time. So you
| can't predict it and you don't have enough time to splice
| together a fake reading from something else.

| kdavis wrote:
| Currently we're looking at possible solutions, see for example
| here[1]. If you have suggestions, feel free to chime in!
|
| In the demo we specifically disallowed bulk uploads to hinder
| such abuses.
|
| [1] https://github.com/coqui-ai/TTS/discussions/1036

| acqbu wrote:
| Gold!

| jeroenhd wrote:
| Interesting. I like the addition of music to make sure it's not
| just a raw voice sample. The output I get seems to be a mix of a
| native speaker and my voice, because my (thick) accent is being
| filtered out.
|
| I suppose that if I ever take proper English pronunciation
| classes, I now know what to strive for.

| wombatmobile wrote:
| Awesome!
|
| How do I embed this?

| bagels wrote:
| Is there a static demo that I don't have to provide my own voice
| for?

| [deleted]

| kdavis wrote:
| We did not provide such a demo in part to hinder nefarious uses
| of the technology.

| crumpled wrote:
| Honestly, how much of a hindrance is that?
| A person could just supply a recording of another person, couldn't
| they?

| reubenmorais wrote:
| The project page has a bunch of pre-rendered samples and ground
| truths: https://edresson.github.io/YourTTS/

| pcarolan wrote:
| This is incredibly impressive and does a great job of capturing
| my voice. Well done!

| akeck wrote:
| Is it supposed to translate or just read with the target accent?
| For me, it's only reading the English input text with the target
| accent.

| reubenmorais wrote:
| It doesn't translate the text; you have to put in text in the
| target language. But you can record audio speaking in any
| language you want.

| [deleted]

| sxv wrote:
| My 26 second training input perhaps wasn't enough. The result
| sounded like someone else. Is the result some kind of merger of
| my voice and a native speaker's?

| reubenmorais wrote:
| Similarity depends on many factors: recording quality, which
| language you're synthesizing in (models trained on more
| speakers do better), and diversity of prosody in your
| recording. Try recording for a bit longer and "acting out" a
| bit in your tone, that tends to give me interesting results :)

| IanCal wrote:
| Very interesting! Is the music an intentional blended track or an
| artifact of generation?

| _josh_meyer_ wrote:
| very much intentional.
|
| Background music makes misuse/abuse less likely (both
| intentional and unintentional)
|
| Read more about it in our open discussion:
| https://github.com/coqui-ai/TTS/discussions/1036

| momolo wrote:
| is the model available?

| _josh_meyer_ wrote:
| Demo: https://coqui.ai
| Code: https://github.com/coqui-ai/tts
| Blogpost: https://coqui.ai/blog/tts/yourtts-zero-shot-text-synthesis-l...
| Paper: https://arxiv.org/abs/2112.02418

| echelon wrote:
| This is so cool! Thank you!
|
| How do y'all intend to profit (succeed as a startup) if
| you're releasing so much publicly? I'd love to see you guys
| succeed.
|
| Really great to see where some of the Mozilla TTS folks wound
| up, too.

| SwiftyBug wrote:
| I speak Brazilian Portuguese natively. I chose to record my voice
| saying a specific sentence and to "translate" it to Brazilian
| Portuguese using the exact same sentence. I was very pleased to
| find out that I became a Mineiro from the countryside, one of the
| coolest accents in Brazil!

| actually_a_dog wrote:
| You spoke Portuguese into it and it just changed your accent?
| That's kinda cool.

| reubenmorais wrote:
| The Brazilian Portuguese model is a bit of an extreme showcase
| (and thus really cool!), as it was trained on a single speaker
| (entirely recorded by the main author of the paper, Edresson
| Casanova, who's Brazilian).
|
| The fact that it can do multi-lingual voice cloning at all in
| that case is already surprising. You can find more details in
| the project page [0] and paper [1]. And here's the corpus. [2]
|
| [0] https://edresson.github.io/YourTTS/
|
| [1] https://arxiv.org/abs/2112.02418
|
| [2] https://edresson.github.io/TTS-Portuguese-Corpus/

| winter_squirrel wrote:

| ceva wrote:
| it says enter your text here ..

| kdavis wrote:
| You're free to enter any input sentence you want in the text
| box.
|
| The input sentence generally should be in the language you
| selected from the dropdown. For example, if the dropdown has
| "French" selected you could enter the text "Allons enfants de
| la Patrie, Le jour de gloire est arrivé!"
|
| Clicking "Submit" then generates a TTS reading of the sentence
| you input in the language selected from the dropdown.
|
| For fun you can mix and match. In other words, select a
| language from the dropdown and enter text in the text box
| _not_ in the language selected from the dropdown. (For example,
| the dropdown could have "French" selected and the sentence
| could be "O say can you see, by the dawn's early light".
| This gives interesting results: it sounds as if a native French
| speaker is speaking English.)
___________________________________________________________________
(page generated 2022-01-03 23:00 UTC)