[HN Gopher] New acoustic attack steals data from keystrokes with...
___________________________________________________________________

New acoustic attack steals data from keystrokes with 95% accuracy

Author : mikece
Score  : 141 points
Date   : 2023-08-05 16:33 UTC (6 hours ago)

(HTM) web link (www.bleepingcomputer.com)
(TXT) w3m dump (www.bleepingcomputer.com)

| elderlybanana wrote:
| In response to this post, I just open sourced a starter project
| for a variation of this idea:
| https://github.com/secretlessai/audio-mnist. I've been interested
| in applying image classification techniques like CNNs to audio
| data for a while.
|
| A couple years ago for a weekend project I made a simple "audio-
| mnist" dataset from handwritten digit audio recordings. I never
| got past a few days' worth of work, but open-sourcing it has been
| on my mind for a minute. This post kicked me into action. Getting
| some more data, basic CNN examples, etc. could provide a nice
| starting point for a lot of research and tools.
|
| There is still separate code I'd have to find and make
| intelligible to create the recordings and split the audio.
|
| Anyway, in case anyone finds part of this process interesting or
| useful.
| zaxomi wrote:
| New? The Soviets listened to typewriters in the 1970s.
| MaximilianEmel wrote:
| Now they can make wireless keyboards that don't need a battery or
| radio!
| gladiatr72 wrote:
| Death metal.
|
| Suck it.
| constantcrying wrote:
| Very interesting that this is even possible. But it seems somewhat
| dangerous, since making an audio recording is very easy.
| lispisok wrote:
| So they generated training data from one laptop and microphone
| then generated test data with the exact same laptop and
| microphone in the same setup, possibly one person pressing the
| keys too. For the Zoom model they trained a new model with data
| gathered from Zoom.
| They call it a practical side channel attack
| but they didn't do anything to see if this approach could
| generalize at all.
| jprete wrote:
| I think this limited attack surface can work without having to
| generalize one model to multiple people or keyboards. One
| advantage of a Zoom attack is that you get "plaintext" shortly
| after hearing the "ciphertext" if you can get the target to
| type into the chat window. And when you hear typing in other
| contexts it's likely to be something that matches a handful of
| grammars that an LLM can recognize already (written languages,
| programming languages, commands, calculation inputs) - and when
| it doesn't, that's probably a password.
| omgJustTest wrote:
| The answer is likely that all of the above are used.
|
| Asking "what signal is it detecting" might be better framed as
| "which signal carries the most information"... which would help
| in averting attacks.
|
| This kind of stuff could be real menacing in all sorts of
| public places like airports, coffee shops, etc.
| [deleted]
| Geee wrote:
| It's for a targeted attack. It doesn't need to be generalized.
| voytec wrote:
| Good enough for PoC.
| OtherShrezzing wrote:
| I believe that is the generalisable version of the attack.
| You're not looking to learn the sound of arbitrary keyboards
| with this attack, rather you're looking to learn the sound of
| specific targets.
|
| For example, a Twitch streamer enters responses into their
| stream-chat with a live mic. Later, the streamer enters their
| Twitch password. Someone employing this technique could
| reasonably be able to learn the audio from the first scenario,
| and apply the findings in the second scenario.
| TechBro8615 wrote:
| Finally, a real security weakness to cite when making fun of
| people for their mechanical keyboard. Time to start recording
| the audio of Zoom calls with some particularly loud typers...
| fatfingerd wrote:
| Not according to the article..
| Microphones are sensitive
| enough to mount the attack on quieter keyboards.
| thereisnospork wrote:
| What we clearly need are louder keyboards - which
| overload the mic so as to render keystrokes
| indistinguishable.
| meepmorp wrote:
| I've wanted to integrate a cap gun into a keyboard,
| basically an old-fashioned roll of paper caps and a
| solenoid to whack 'em, triggered by exclamation points.
| TheCleric wrote:
| Adding a gain knob to my keyboard, be right back.
| dvngnt_ wrote:
| for a few years I've used RTX Voice to remove keyboard typing
| and other background noise
| yowzadave wrote:
| I guess more reason to just use a password manager to
| autofill your password?
| kypro wrote:
| Or just use 2fa
| bee_rider wrote:
| If you have 2FA and one part of it is easily figured out,
| then you have one factor authentication.
|
| If you cared enough about the authentication in the first
| place to bother with 2FA, then I guess it seems like the
| reduction there is still something to be worried about,
| right?
|
| Lots of "two factor authentication" schemes seem to
| involve just getting a text or something, so, not very
| secure at all. Of course, this is bad 2FA, but it is
| popular.
| gleenn wrote:
| Perfect is the enemy of good. Text based 2FA is
| compromisable relatively easily but at least it's an
| extra hurdle.
| 3np wrote:
| It's the "or just" being the issue there, not the "use
| 2fa".
| jgtrosh wrote:
| Only if it doesn't only rely on a master password
| apendleton wrote:
| A nice thing about master passwords though is that since
| you don't have to type them in as often, they can be very
| long. 95% accuracy probably isn't good enough to reliably
| reproduce a sentence-length master password, at least if
| it's only captured once.
| belval wrote:
| 95% means that on average only 1 in 20 keystrokes will be
| wrong. Even if your password is very long (40-60
| characters) that means only 2-3 errors.
| Since most people are not machines
| their long password will be a combination of words like
| the famous "horsestaplebatterycorrect" example from xkcd.
|
| Even if you flip a few letters from something like the
| above a human attacker will easily be able to fix it
| manually.
|
| "horswstaplevatterucorrect" for example is still
| intelligible.
| moonchrome wrote:
| Seems simple to defend - use a password manager.
| WXLCKNO wrote:
| Time to inject background audio of me typing "fuck you" into my
| zoom calls.
| zgluck wrote:
| Tactical noise!
| hoosieree wrote:
| Text-to-keystroke-audio where the text comes from the LLM
| Prompt "fanfiction based on HGTV's Love It or List It starring
| an Ewok realtor and Klingon interior designer in iambic
| pentameter".
|
| The goal is to cause the eavesdropper to totally reevaluate
| their life choices, and maybe even get caught up in the story.
| [deleted]
| pengaru wrote:
| So microphones need to get muted automatically by password
| prompts, seems simple enough in principle.
| ariym wrote:
| Georgi Gerganov created one a few years ago
|
| https://github.com/ggerganov/kbd-audio
| [deleted]
| [deleted]
| whoopdedo wrote:
| If this means the end of those loud mechanical keyboards then
| good. I never liked the clicking noise.
| amelius wrote:
| No it means the beginning of people playing recordings of loud
| mechanical keyboards all day to thwart the snooping algorithms.
| exabrial wrote:
| Physical Access Owns, as usual.
| hoosieree wrote:
| Using an image classifier on spectrograms is pretty funny. Not a
| bad idea, given image classifiers are a dime a dozen, but still.
| iainctduncan wrote:
| It's actually quite common. One of the big bird recognition
| apps does just this.
| constantly wrote:
| There are multiple apps for this? Seems like PBS KIDS should
| own the authoritative one, and the licensing.
| devsda wrote:
| Some systems have a setting to disable touchpad for x
| milliseconds after a key press.
|
| Do we need something similar for microphones too?
| thedookmaster wrote:
| I don't use the qwerty layout, I use colemak. This likely
| mitigates it for me.
| insanitybit wrote:
| That's the equivalent of a shift cipher with a well known
| offset.
| dns_snek wrote:
| I'm pretty confident that statistical analysis would give away
| your layout (assuming there's enough data), so I wouldn't be so
| sure.
| bqmjjx0kac wrote:
| This is just security through obscurity. For real security, you
| need a cryptographically rolling keyboard layout.
| glitchc wrote:
| Brilliant suggestion. Have a TRNG or a CSPRNG (if too poor
| for a TRNG) choose the next layout at random for you, ideally
| with every keystroke. Good luck cracking that!
| hoosieree wrote:
| Even using Vim or Emacs would add some
| obufsCTRL[dbiobfuscation from all the spurious keystrokes.
| segfaultbuserr wrote:
| Some places use touchscreen keypads for PIN entry exactly
| for this reason: to allow randomization, e.g. for opening a
| locked door, or for authorizing a transaction.
| bee_rider wrote:
| That is interesting.
|
| I'm sure it depends on the application to some extent. I
| can type my pin in without looking at all, so I can cover
| it up while doing it. If I had to hunt and peck, it'd be
| easier for an onlooker to observe my slower motions I
| think.
|
| But if I used the same machine often enough to produce
| wear specific to me, this randomization would be really
| useful.
| zootboy wrote:
| I use a randomized PIN pad on my phone, and I've gotten
| quite used to it. I can enter my PIN almost as fast as I
| could on an unscrambled pad; it's definitely not hunting
| and pecking.
| 8note wrote:
| Do they randomize the key locations though?
|
| Otherwise, you leave behind grease where your fingers
| touched
| [deleted]
| segfaultbuserr wrote:
| Yes, the layout is randomized every time you use it.
| mdp2021 wrote:
| Could be done by using a device with a display - e.g.
| an "ereader" - to present a random keyboard layout. But, good
| luck being efficient typing on that. At that point, better
| use a different input model.
|
| Or, use techniques such as those in the article, such as
| random keypresses played during the actual ones.
| raincole wrote:
| Why not just a keyboard that produces random noise?
| bqmjjx0kac wrote:
| Because the real data stream would still be there, just
| mixed with some noise. It feels harder to analyze whether
| the noise sufficiently obscures the real keystrokes than it
| does to ensure the actual keystrokes reveal no information.
| ben_w wrote:
| Finally, a use for Buffy's Swearing Keyboard.
|
| Or possibly the exact opposite of that, I can't tell if
| it's a one-to-one mapping on mobile:
| https://www2.b3ta.com/buffyswear/
|
| (Also, I'm feeling my age now, given how many years have
| elapsed since that kind of thing passed for internet
| culture...)
| usrusr wrote:
| Whereas for practical security, having some common substring
| in all your passwords that you don't type but insert through
| some global hotkey would be just fine as a mitigation against
| eavesdrop attacks.
|
| Yes, that's also obscurity, but obscurity is actually good -
| it only got a (deservedly) bad reputation from when it gets
| used as a _substitute_ (but I fail to see how using a
| nonstandard keyboard layout would even count as obscurity in
| the context of an audio attack, as the clear text reference
| would surely go through the same layout?)
| raffraffraff wrote:
| My sister in law uses voice recognition and dictation
| software, so she doesn't even use a keyboard! Totally safe!
| schaefer wrote:
| At least it would have, until just now, when you recklessly
| disclosed your secret keyboard layout. :P
| wildrhythms wrote:
| Couldn't they just translate the detected keystrokes to colemak
| layout?
| dragonmost wrote:
| Yes but you would have to know or try all possible layouts
| bunga-bunga wrote:
| This specific attack could also be easily mitigated by
| dictating your passwords instead.
| transportgo wrote:
| I think about this attack when streamers on Twitch log into
| websites etc.
| nmeagent wrote:
| I think an attacker would find that many streamers with high
| quality audio have properly set up their mics with noise gate
| filters to remove their relatively quiet keystrokes.
| mxwsn wrote:
| The example figure shows a key hit every half second, which
| suggests a pecking style of typing at around 24 wpm. This way the
| model gets very clean waveforms. I wonder how their approach
| would work with average or fast typists. The sound profiles might
| be much harder to link to characters.
| zaxomi wrote:
| The Soviets listened successfully to typewriters back in the
| 1970s.
| mejutoco wrote:
| Impressive. To be fair, a lot of typewriters jam if you press
| more than one key at a time, plus they are very loud.
| insickness wrote:
| Zoom is good at filtering out rather loud background noises. I
| can't imagine that the sound of background typing during a
| conversation could be detected by the other party.
| frant-hartm wrote:
| What? Zoom (by default with auto mic adjustment) catches
| everything. Typing on laptop is especially bad as it is closer
| to the mic than the person speaking (unless there is external
| mic), so it's like a stampede of rhinos.
| bee_rider wrote:
| In this case the parent comment is considering Zoom as an
| ally, while you are considering it an adversary.
|
| So, in case that "what" was intended to denote some
| confusion, there is the most likely source.
| woadwarrior01 wrote:
| If you're on macOS, you can use the voice isolation mic mode.
| rjh29 wrote:
| When I type my login or wallet password, I've done it so many
| times that the sound profile is going to be quite different to
| normal typing. Does the model handle that?
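Two figures quoted in this thread, the 24 wpm typing speed and the 2-3 expected errors in a long password, are easy to check. The sketch below assumes the usual convention of five characters per word for the speed estimate, and it treats per-key errors as independent, which neither the article nor the commenters claim; the function names are illustrative.

```python
def words_per_minute(seconds_per_key: float, chars_per_word: float = 5.0) -> float:
    """Typing speed implied by a fixed interval between key presses."""
    return 60.0 / seconds_per_key / chars_per_word


def expected_errors(length: int, accuracy: float = 0.95) -> float:
    """Average number of misclassified keys in a password of this length."""
    return length * (1.0 - accuracy)


def p_all_correct(length: int, accuracy: float = 0.95) -> float:
    """Chance that every key is right, assuming independent per-key errors."""
    return accuracy ** length


# One key every half second, as in the example figure discussed above.
print(words_per_minute(0.5))

# Password lengths taken from the 40-60 character range mentioned upthread.
for n in (40, 60):
    print(n, expected_errors(n), p_all_correct(n))
```

Under these assumptions a single capture of a 40-character password comes out exactly right only about one time in eight, yet it averages just two wrong characters, which is consistent with both sides of the exchange about long master passwords.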
| tehsauce wrote:
| Would love a wireless keyboard that works using this! It wouldn't
| need any battery, charging or syncing!
| swid wrote:
| Some old TV remotes used to work this way. They were made by
| Zenith and are called Space Command remotes. Apparently they
| are the reason TV remotes are sometimes called clickers.
|
| https://www.theverge.com/23810061/zenith-space-command-remot...
| javajosh wrote:
| I find this really hard to believe. If it were really possible
| then _people_ could do it with their ears, and they would be
| doing it and showing off that they can do it. The human ear (and
| brain) are really, really good at finding patterns and getting
| signal out of noise.
| AndroTux wrote:
| Computers are better at stuff than humans? Impossible! I am the
| king of math, no machine beats me in calculating numbers!
| zaxomi wrote:
| This isn't new. The Soviets listened to typewriters back in the
| 1970s.
| trifurcate wrote:
| You're really surprised that computers can outperform humans at
| pattern recognition?
| javajosh wrote:
| Yes. Humans have fantastic audio and video processing
| abilities, particularly picking out signal from noise. Even
| now human operators listen to sonar signals on submarines.
| There's a reason for that.
| crazygringo wrote:
| Fascinating. I'm really curious what the acoustic properties are
| that it's recognizing.
|
| Is it more of a physical fingerprint of each key, such that if
| you swapped keys/springs the model would need to be updated? So
| it's produced by manufacturing inconsistencies, the way
| individual typewriters used to be forensically identified?
|
| Or is it more each key being identical, but producing a different
| resonance pattern within the keyboard/laptop due to the shape of
| all of the matter surrounding it? If you move the keyboard in the
| room, do you have to re-train the model?
|
| I also wonder how much it varies depending on how hard you press
| each key -- not at all or a great deal?
| And what about by
| keyboard -- when you compare thin MacBook keys with an external
| full-height keyboard, is one easier/harder to recognize each key
| on than the other?
| tedunangst wrote:
| But what passwords are you typing while on zoom and why aren't
| you on mute?
| constantcrying wrote:
| I can imagine many, many situations where you might do this.
| But maybe another thing to be worried about is scammers being
| able to learn the passwords of people they are calling.
| Tempest1981 wrote:
| When calling my cellular/internet/medical/financial provider,
| it might be interesting to "see" what they are typing. (Or if
| they're randomly surfing the internet.)
| tedunangst wrote:
| How long are you talking to them that you've been able to
| record samples of the sound of all their keystrokes and
| perform this analysis?
| slashdev wrote:
| Call support, get the URLs and logins for all their internal
| apps. Ouch!
| jacquesm wrote:
| Presumably all their backoffice stuff is only accessible
| via VPN. Oh, wait...
| foobiekr wrote:
| Given your username, you might find this interesting:
|
| https://en.m.wikipedia.org/wiki/Tempest_(codename)
|
| TEMPEST considered almost everything from electromagnetic
| leakage to exactly the attack described here.
| syntaxing wrote:
| Timing attacks have been an attack vector for a while? I remember
| reading about a tool on HN a couple years ago. You don't even
| need audio, the rate at which you enter the keys into the
| password field is enough.
| IshKebab wrote:
| I seriously doubt that.
| remram wrote:
| How do you get the rate?
| bqmjjx0kac wrote:
| Maybe any one of your browser tabs has JS listening to the
| accelerometer. It doesn't even require a permission, AFAIK.
| crazygringo wrote:
| By the way, some (most?) videoconferencing software removes
| keyboard sounds from the audio, because it's a particularly
| distracting problem with laptops where the microphone is _right
| next to_ the keys.
|
| I'm pretty sure Zoom does this by default as part of its noise
| cancellation (it's potentially even easier since you can use
| keydown events to help identify keystrokes, not just the audio
| stream).
|
| So as long as basic default noise cancellation is on, that would
| at least prevent this over regular videoconferencing. And because
| of this, I'm having a hard time thinking of when else this would
| be a realistic threat, where the attacker wouldn't already have
| enough physical access to either install a regular keylogger or
| else a hidden camera.
| 1123581321 wrote:
| Meetings between organizations, multi-office cafeterias, or
| coffee shops, perhaps.
| cute_boi wrote:
| i use 1password and have never ever typed a password, so i am
| probably safe.
| AndroTux wrote:
| Two words for you: Master password.
| bdcravens wrote:
| The risk isn't limited to passwords:
|
| "...passwords, discussions, messages, or other sensitive
| information..."
___________________________________________________________________
(page generated 2023-08-05 23:00 UTC)