[HN Gopher] OpenAI quietly launched Whisper V2 in a GitHub commit
___________________________________________________________________
OpenAI quietly launched Whisper V2 in a GitHub commit

Author : fudged71
Score  : 61 points
Date   : 2022-12-06 18:24 UTC (4 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| nshm wrote:
| Looks like they plugged GPT-4 AI into speech recognition research
| and now they are going to release huge updates every month.
|   dweekly wrote:
|   What is the basis for this claim? How does their GPT-4 work
|   intersect with the work on their ASR model?
| tpmx wrote:
| So in general and from the probably uninformed outside it seems
| like OpenAI (120 employees?) is outperforming Alphabet (187k
| employees). How?
| amelius wrote:
| How can the Word Error Rate (WER) be larger than 100% for some
| languages?
|   lunixbochs wrote:
|   If the target is the words "A A A" and you produce "B B B B",
|   you have more errors than there were words in the target. 3
|   replacements and 1 insertion.
| iKlsR wrote:
| I've been using whisper to get transcripts from my local radio
| stations. I know it's out of scope for the original project but I
| hope someone can build a streaming input around it in the future.
| Currently pipe in and save 10 minute chunks that get sent off for
| processing.
|   rexreed wrote:
|   What processing / server / backend are you using to run the
|   whisper model?
| galleywest200 wrote:
| I have been using Whisper to transcribe my audio notes. I just
| save my voice memo from my phone to my NAS and my little script
| does the rest on a loop.
|   chimineycricket wrote:
|   How do you handle connectivity from outside your home network
|   (when you're at the grocery store for example)? Do you have a
|   VPN running?
|     tehf0x wrote:
|     Not OP but I use syncthing for such things.
| lunixbochs wrote:
| Nice catch. I'll run my test suite [1] on this and report back.
|
| [1] https://twitter.com/lunixbochs/status/1574848899897884672
|   amelius wrote:
|   Would be cool if you could run these speech models in tandem,
|   and compute a new error-rate for the consensus of them.
| daemoens wrote:
| Is there a reason why Spanish and Italian have a lower WER than
| English? OpenAI is based in America right? They probably focused
| on English more than anything.
|   nshm wrote:
|   Spanish and Italian are very easy to recognize due to very
|   simple phonetic structure.
___________________________________________________________________
(page generated 2022-12-06 23:00 UTC)
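
Editor's note on the WER exchange in the thread: the "A A A" versus "B B B B" example can be checked with a short sketch. It assumes the usual definition of WER, word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference. The function name word_error_rate is made up for illustration, and the code is not taken from Whisper or from any commenter's test suite.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j].
    # With no reference words consumed yet, hyp[:j] costs j insertions.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i reference words against an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# The example from the thread: 3 substitutions plus 1 insertion
# against a 3-word reference, so 4 errors / 3 words.
print(word_error_rate("A A A", "B B B B"))
```

Because insertions count as errors but the denominator is only the reference length, the ratio here is 4/3, or roughly 133%. A model that emits much more text than was actually spoken can therefore exceed 100%.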