[HN Gopher] AI unmasks anonymous chess players, posing privacy r...
___________________________________________________________________
AI unmasks anonymous chess players, posing privacy risks

Author : O__________O
Score  : 148 points
Date   : 2022-12-11 12:09 UTC (10 hours ago)

(HTM) web link (www.science.org)
(TXT) w3m dump (www.science.org)

| Havoc wrote:
| This seems incredibly ominous when transferred to social media.
| Large parts of the internet rely on the pseudo-anonymity of it
| quite heavily - see reddit etc.
| dewey wrote:
| Basic opsec principles always say that if you want to stay
| anonymous you have to switch the way you are writing (by
| adopting a different personality, running it through
| translation apps etc.).
|
| If someone wanted to stay anonymous the unmasking % would
| probably be lower. The threat-model of the chess players
| doesn't include that they have to stay anonymous and need to
| switch up their way of playing.
| Havoc wrote:
| >>switch the way you are writing
|
| I'd be very surprised if that actually works. Stuff like
| vocabulary can't exactly be turned off at will
| greggarious wrote:
| I'm fond of occasionally throwing in some "ou"s and references
| to cities other than Ontario to throw folks off, but it's
| hard to pull off long term
| wussboy wrote:
| I'm not convinced that, for 99.9% of use cases, anonymity is a
| feature. I think it's far more likely that the easy anonymity
| that has been the default for much of the existence of the
| Internet has harmed society.
|
| I get downvoted whenever I say this, but anonymous speech is
| only allowed by recent technology and has never been a part of
| our ancestral environment.
| potatototoo99 wrote:
| Freedom of speech is also a recent invention.
| hairofadog wrote:
| I'm not downvoting, but I think the concept of anonymous
| speech goes way way back to the origins of writing, doesn't it?
| The change in recent years is that it's extra hard to
| stay anonymous, what with the surveillance economy?
| wussboy wrote:
| Perhaps. But if you consider:
|
| 1. The cost of printing/transcribing something
| 2. Literacy rates
| 3. Constraints tied to physical distribution
|
| ...the reach of that potentially "anonymous speech" was the
| tiniest fraction of what we experience today. And even then
| it wasn't necessarily anonymous, unless you just left books
| lying around?
| hairofadog wrote:
| I think it was pretty doable:
| https://en.wikipedia.org/wiki/List_of_anonymously_published_...
| wussboy wrote:
| I won't argue that it wasn't possible. But that link
| shows 20-40 books, and covers 3000 years of human
| culture. That's about one anonymous book per century. I
| think we need more anonymity than that. But I think the
| amount of anonymity we have now is disastrous to civil
| society.
|
| Appreciate the link though. Thanks for engaging.
| kypro wrote:
| Seems quite easy to solve this though. An AI could easily
| anonymise your text by rephrasing sentences.
| s3000 wrote:
| For those who haven't seen it, from 2 weeks ago:
|
| Show HN: Using stylometry to find HN users with alternate
| accounts [1]
|
| [1] https://news.ycombinator.com/item?id=33755016
| [deleted]
| makeworld wrote:
| Perhaps anonymous social media (like 4chan, with its lack of
| usernames) will become more popular.
| [deleted]
| LarryMullins wrote:
| Or less popular, because there is now a greater risk of your
| 4chan comments being associated with your other online
| identities.
|
| There may be some additional safety in conforming to the
| local "memespeak" dialect, and not using that dialect
| elsewhere.
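As a rough illustration of the stylometry idea discussed in this subthread, here is a toy matcher. It is a minimal sketch, assuming character-trigram counts compared by cosine similarity; it is not the method of the linked Show HN or of the chess paper, and the names `trigram_profile`, `cosine_similarity`, and `best_match` are hypothetical.

```python
from collections import Counter
import math


def trigram_profile(text):
    """Count overlapping character trigrams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two trigram Counters (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(unknown_text, known_texts):
    """Return the key of known_texts whose trigram profile is closest to unknown_text."""
    unknown = trigram_profile(unknown_text)
    return max(
        known_texts,
        key=lambda name: cosine_similarity(unknown, trigram_profile(known_texts[name])),
    )


# Made-up sample data, purely for illustration.
samples = {
    "alice": "I reckon the colour of the harbour is lovely, innit.",
    "bob": "The function returns a pointer to the allocated buffer.",
}
print(best_match("The colour of the harbour, I reckon, is grey.", samples))
```

With only a sentence or two per author this is little more than a vocabulary check; a real attempt would presumably need far more text per account, which is the point several commenters make about persistent pseudonyms.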
| password4321 wrote:
| In case you missed this two weeks ago, "find by example" is
| possible in many datasets even without AI:
|
| _Show HN: Using stylometry to find HN users with alternate
| accounts_
|
| https://news.ycombinator.com/item?id=33755016
| r3trohack3r wrote:
| This is one of the use cases I see AI helping with. You can
| already ask ChatGPT to reword content for you - breaking
| analysis like this.
|
| A (humorous) example: https://vc.blankenship.io
|
| If you want to remain anonymous, use an AI filter for your
| written content.
| jacooper wrote:
| That's definitely not private, which is the point of defeating
| stylometry
|
| If only there was some kind of rewording AI that can be run
| locally.
| r3trohack3r wrote:
| Given current trends, I guess I'd say give it 6 months?
| Siira wrote:
| There is OPT and BLOOM, which need lots of expensive GPUs
| to run. Not as good as GPT3, IMO. I doubt there will be
| any good local alternatives in a year. StableDiffusion
| needed light compute, not at all comparable to these
| beasts.
| JadeNB wrote:
| > Given current trends, I guess I'd say give it 6 months?
|
| I'm happy to give it 6 months until the technology is
| there to do it, if it's not already--but, as long as
| there's an owner of the technology (that is, as long as
| the technology is pre- the point where I can easily roll
| my own), I'm skeptical of any owner in today's privacy
| climate intentionally forgoing the opportunity to suck up
| personal data whenever and however they can.
| kristopolous wrote:
| This is nice. I consider my stuff fairly easy for me to
| recognize so when I read the other users I found a bit of
| myself in them.
|
| I thought "well this person seems a bit cynical" - you know,
| it's not a bad way to go outside yourself
| alkonaut wrote:
| Pseudonymous. Just like anyone who is willing to do the work can
| identify my real identity from my HN writing, a pseudonymous
| chess handle gives a lot of information.
| A chess player who wants to be anonymous should not re-use a
| handle for two games. There is no anonymity anywhere if you provide
| enough entropy, which you do if you use a persistent pseudonym.
| nextlevelwizard wrote:
| If you want to play against good players you need some games to
| gain rating
| alkonaut wrote:
| Yes. That's fine so long as you never play in person as well,
| thereby tying a real identity to your pseudonym.
|
| Otherwise you can remain pseudonymous (not anonymous) for as
| long as you want.
|
| But there is no way to do a mix of in-person and pseudonymous
| writing/chess/art/anything with a personal "style".
| random_kris wrote:
| Mmmm this could be solved using Zero-Knowledge proofs.
|
| When registering pick an Elo. Provide a proof that you own
| the account in that Elo range and then you can create another
| account that will start in that Elo range.
| cute_boi wrote:
| So, to solve this issue we can add noise? Let's say I play 1
| game, then let another person/bot play another game using the
| same account?
| sureglymop wrote:
| What if the playing style of one game is already enough to hint
| at who might be playing?
| alkonaut wrote:
| Then you are never entirely anonymous. I doubt that's the
| case though.
| YetAnotherNick wrote:
| They aren't comparing two sets of games. They are comparing
| a single game with every player's known set of games. Any FIDE
| rated player will have a set of games that is known to
| everyone.
| matsemann wrote:
| To not get unmasked by things like this I throw in some blunders
| here and there. Unfortunately my other moves are often also
| blunders..
| djexjms wrote:
| Maybe the strategy here is to just not make any blunders.
| dtgriscom wrote:
| Oh, right. Like it sounds so easy when YOU say it.
| neaden wrote:
| There was recently a player who shot up the Chess.com blitz
| leaderboard under the name Sinister Magnus.
| They were eventually
| removed from the leaderboard, presumably for being an alt of an
| existing GM, but I don't think anyone has figured out for sure
| who they are. It would be an interesting test to see if they are
| able to figure the player out using this.
| rollcat wrote:
| Interesting! Why is it not allowed to have alt accounts? And
| why was such severe action taken, without enough evidence? Why
| do people want to use alts in the first place, in a game with
| perfect information?
|
| In StarCraft II, having alts is tolerated (well, depending on
| your manners - nobody likes smurfing), but we have
| sc2revealed.com which takes crowd-sourced reports to try to
| unmask "barcode" (llllllllllll) players. Many pros try to
| practice anonymously on the ladder, because SC2 is a game of
| imperfect information, and in a best-of-3 series (like in a
| tournament), you 1. don't want to use the same opening every
| match, and 2. don't want your opponent to immediately recognize
| what you're doing, or work on preparing a counter ahead of
| time.
| neaden wrote:
| You can have an alt account, and Sinister Magnus is still
| around, you just can't be on the leaderboard more than once.
| So SM is presumably the alt of someone who is already on the
| leaderboard. Similar to StarCraft a GM might want to prep
| openings without revealing that's what they are doing, so
| there are reasons for having alts but that applies more for
| longer time controls rather than blitz. Generally GMs do
| their prep with a small and usually secret team before a big
| tournament.
| version_five wrote:
| Reminds me of this - predating AI
|
| https://axbom.com/keystroke-dynamics/
| As early as 1860, experienced telegraph operators realized they
| could actually recognize each individual by everyone's unique
| tapping rhythm. To the trained ear, the soft tip-tap of every
| operator could be as recognizable as the spoken voice of a family
| member.
| myself248 wrote:
| Oh yeah, every ham knows you can recognize the sending fist. If
| they're actually using a real key, anyway.
| Victerius wrote:
| I don't understand your sentence.
| wbl wrote:
| He's saying if you are using a straight key instead of an
| iambic keyer (different morse sending tech) then you can
| recognize the operator from their patterns (the fist)
| dd82 wrote:
| https://en.wikipedia.org/wiki/Telegraph_key#Operators'_%22fi...
|
| > With straight keys, side-swipers, and, to an extent,
| bugs, each and every telegrapher has their own unique style
| or rhythm pattern when transmitting a message. An
| operator's style is known as their "fist".
|
| > Since every fist is unique, other telegraphers can
| usually identify the individual telegrapher transmitting a
| particular message. This had a huge significance during the
| first and second World Wars, since the on-board
| telegrapher's "fist" could be used to track individual
| ships and submarines, and for traffic analysis.
| Tempest1981 wrote:
| Identifiable ... by the way they tap out Morse code with
| their hand/fist.
|
| Ham = amateur radio operator:
| https://en.wikipedia.org/wiki/Amateur_radio
| flak48 wrote:
| This made me Google the etymology of 'ham-fisted' and I was
| disappointed that it had nothing to do with ham operators
| with carpal tunnel syndrome.
| mzi wrote:
| It's the opposite, really. The radio term was a pejorative,
| as the professionals saw the amateurs as ham-fisted.
| mattr47 wrote:
| My Dad was a morse intercept operator for the US Army, mid 60s,
| stationed in Northern Japan. He has stories of them naming all
| the Soviet morse operators by the way they tapped.
| Rodeoclash wrote:
| Cryptonomicon mentions this as well. Each operator having a
| particular "fist" that was unique to them.
| nextlevelwizard wrote:
| I remember playing an FPS game called Enemy Territory as a
| teenager and after a while whenever I was in a 1v1 shoot out with
| one of my "clan" members I knew who it was based on their movement.
| sitkack wrote:
| I was walking down the street a couple years ago, in my
| peripheral vision I noticed the gait of someone walking
| across the street traveling the other direction, instant
| recall of that former coworker's name from 10 years previous.
| It would have taken longer to recognize them visually if they
| were standing still. Still amazes me, years later.
| jacquesm wrote:
| Gait?
| sitkack wrote:
| Fixed, thanks!
| [deleted]
| erk__ wrote:
| Off topic, but if it was Wolfenstein ET I will just add that
| there still is a community for it and a FOSS version of it
| that runs on modern operating systems
|
| https://www.etlegacy.com/
| smarri wrote:
| Thanks, I spent many hours and late nights into early
| mornings playing this game.
| iLoveOncall wrote:
| I doubt those players are really anonymous anyway, as either
| lichess or chess.com can easily identify them.
|
| I don't see this changing anything, especially when we already
| know that chess.com is not impartial in its treatment of players.
| croes wrote:
| Could it be used to find cheaters?
|
| If you don't play like yourself you're likely cheating.
| recursive wrote:
| I really don't like this. Sometimes I want to try a different
| style. It sounds like a presumption of guilt. "Machine doesn't
| know what he's doing? Probably cheating"
| WJW wrote:
| For most normal humans, if they merely try a different style
| than what they are used to their performance would probably
| go down rather than up. After all, you'd have seen many
| common positions before and know most of the usual ideas for
| your chosen openings etc.
|
| If someone suddenly plays a different style and also their
| move quality goes up significantly, that might be an
| additional indicator of cheating.
| All cheat detection works
| in a probabilistic fashion, since it is not allowed (and
| would be way worse) to actually observe players 24/7 in their
| home to verify whether they're cheating or not.
| swayvil wrote:
| To state the obvious: of course we are using this exact same
| technology to identify people by writing style. All of us. Right
| now.
| quotemstr wrote:
| Just wait until quantum computing breaks all non-PFS encrypted
| internet traffic from the past 20 years. It's going to be _wild_.
| David Brin is going to get to live out his vision of a
| transparent society.
| [deleted]
| philippejara wrote:
| I'm not sure why they ham up the risk of privacy loss regarding
| unmasking anon players. The only "risk" I can think of would be
| someone developing and testing new strategies, but then I'd assume
| the fingerprinting would be far less accurate. Testing if you can
| effectively hide your own quirks while deviating from them would
| be far more interesting. As it stands this is nothing new and the
| concerns seem weirdly pointed; everyone(?) already knows the
| risks of fingerprinting and pattern recognition in more general
| applications. It was a fun paper to read, shame half the article
| promoting it was cautioning.
| lobe wrote:
| Often the new strategies top level players test on anonymised
| accounts are subtle tweaks in lines deep into / slightly beyond
| opening theory. Often these lines are slightly inferior to
| mainline but come with an edge due to the "surprise factor"
| making it harder for opponents to prepare. Each of these subtle
| tweaks will only arise in a small proportion of games (as only
| some of the time your opponent will play the line you want to
| test), so I imagine fingerprinting based on play style will
| still work relatively well.
| forrestthewoods wrote:
| Super cool and fascinating.
|
| I don't know why, but I don't like that the headline frames it as
| a "privacy risk".
| Are we really concerned about privacy when
| playing chess?
|
| I think the world probably needs to accept there's no such thing
| as "anonymous behavior". Behavior itself is individualized.
| Therefore if behavior can be observed the probability that it is
| anonymous rapidly approaches zero with time and observations. The
| only way to be verifiably anonymous is to not be observed.
|
| This means if you are a person at risk of harm if your identity
| is unmasked that you can't rely on supposedly anonymous behavior.
| Bummer.
| JadeNB wrote:
| > I don't know why, but I don't like that the headline frames
| it as a "privacy risk". Are we really concerned about privacy
| when playing chess?
|
| To the extent that headlines matter, I'd way rather that people
| worry about the privacy risks of de-anonymizing technology long
| before it's at the point where it's a practical issue. If we
| only worry about it when it becomes an actual issue already
| being, or about to be, applied to unambiguously
| privacy-invading matters, then, well, that's the way we've
| already done it and it's too late now--why didn't you bring it
| up earlier?
|
| I'd also prefer to avoid the "what do you have to hide?" issue.
| Maybe someone, for whatever reason, _does_ have something to
| hide; if they intentionally play chess anonymously, presumably
| they intend to do so. It shouldn't be up to me to decide
| whether or not their need, or even just desire, for privacy is
| legitimate.
|
| (Of course, it's already too late--and has been since, at the
| very latest, the AOL incident--to worry about the onset of such
| de-anonymization, but it's always (or only almost always?)
| better to face inevitable future problems now, rather than
| waiting for that future.)
| rvba wrote:
| Why did Meta hire someone to study poker bots? Aren't they afraid
| that after spending a lot of money on some developer that person
| will just quit Facebook and write their own poker bots?
|
| How does this research help Facebook?
|
| I'm very, very far away from Musk and his antics, but really some
| of those big companies seem to have lots of people who do passion
| projects.
|
| Meanwhile an actual user has little if any chance to get decent
| support (probably for the cost of that programmer they could
| get multiple people).
|
| And yes I am aware that I sound anti-intellectual here and that
| research for the sake of research can lead to nice things. I just
| think that the person will quit Facebook to write poker bots and
| ruin the game for those who play it by detecting their weaknesses
| (btw. I don't even play poker).
| Jerrrry wrote:
| Studying poker and humans' style of play of the game translates
| directly to improving generalized agents that can play other
| games.
|
| For all we know, the guy who was kinda good at poker made the
| small break-thru that led to Google's Alpha/Omega chess or Go
| achievement.
|
| It's actually difficult to tell which pieces of such complex
| systems are responsible for what - but in general, applying
| incremental piecemeal improvements has been monumental for the
| magnitude jump in progress in recent years.
| LarryMullins wrote:
| > _Aren't they afraid that after spending a lot of money on some
| developer that person will just quit Facebook and write their own
| poker bots?_
|
| That's a risk when you pay workers to research or learn nearly
| anything new. If you run a pizzeria, you teach your workers how
| to make pizzas; they could turn around and make pizzas for
| another business instead. Maybe even open their own pizza shop
| and compete with you, using your own recipe.
|
| I think accepting this kind of risk is simply table stakes for
| running a business.
| [deleted]
| Fricken wrote:
| In the early to mid-2010s AI hype peaked, radical Kurzweilian
| ideas became sales pitches from founders, and a great wave of
| discussion and speculation passed through the media and public
| consciousness.
|
| For a time r/futurology was an interesting place for discussion,
| and there were really great comments to be found amidst the
| internet chaff.
|
| One of the things I speculated about then was that AI doesn't
| need to be sentient to ruin everything. Powerful tools in the
| hands of malicious actors could wreak havoc on the internet.
|
| The internet could become compromised in so many ways via privacy
| invasions and data theft, aggressive spam, misinformation,
| propaganda and malicious code that nobody can reliably depend on
| it for much of anything any longer. One could receive a phone
| call from someone who sounds like their own mother, an AI that
| says things only a mother would know. That voice could be very
| persuasive.
|
| That was the kind of stuff we talked about years ago. There were
| no instantaneous results, of course, and it became boring and
| uncool to keep going on about such things.
|
| Nonetheless, it seems now that the tools and incentives needed to
| create a dystopia such as what I described are really starting to
| come into focus.
| trompetenaccoun wrote:
| The better the tech, the better it can be used for good as well
| as evil purposes such as deception/scams, disinformation, mass
| surveillance, election manipulation, etc.
|
| It's not the tech that's wrong, it's the populace in democratic
| states losing more and more power, to the point where most of
| these systems can only be described as hybrid regimes anymore.
| The actual power does not lie with the voters but corporations,
| the mega wealthy, their various lobby groups and corrupt
| politicians. There is no monopoly of power exercised by elected
| officials and law enforcement respecting the constitutions,
| it's different groups and factions fighting each other for
| supremacy. AI is just another tool at their disposal, of course
| they're making use of it.
|
| Guns can be used for protection as well as oppression.
| The internet can be about free information or about censorship and
| spying on users. We can live in digital Maoism or digital
| liberalism. It's up to the common people and for them to
| realize this before it's too late.
| [deleted]
| O__________O wrote:
| Related paper:
|
| https://proceedings.neurips.cc/paper/2021/hash/ccf8111910291...
| greggarious wrote:
| danuker wrote:
| Should downloading/looking at people's chess games not be
| socially acceptable, in spite of it being a good way to learn?
| greggarious wrote:
| I think the issue is if you go beyond "who am I playing" to
| connecting that to the rest of their online life, if that
| makes sense.
|
| It's fine to want to tailor strategy but not, say, out their
| dissident writings.
| dmurray wrote:
| > They gave the system 100 games from each of about 3000 known
| players, and 100 fresh games from a mystery player. To make the
| task harder, they hid the first 15 moves of each game. The system
| looked for the best match and identified the mystery player 86%
| of the time...A non-AI method was only 28% accurate.
|
| This sounds incredible, to pick the right player out of 3000
| candidates 86% of the time.
|
| I am not sure that pruning the first 15 moves is enough to
| eliminate the information you get from choice of opening (which
| is presumably the intention of the restriction). For example, if
| a player religiously plays the Najdorf Sicilian as Black, you can
| immediately rule out many (most?) positions that started with a
| French or a Ruy Lopez.
|
| I'd like to see what the best results are from a model that just
| looks at the position after move 15, and use that as a baseline.
| oneoff786 wrote:
| Does anyone good religiously play the same opening as black?
|
| I don't know chess but that sounds like a bad idea
| bee_rider wrote:
| It seems really impressive.
|
| I wonder how a chess GM would do at this test (although we'd
| have to restrict it to other top GMs that are active at the
| same time as them).
| dmurray wrote:
| > I'd like to see what the best results are from a model that
| just looks at the position after move 15, and use that as a
| baseline.
|
| Reading the paper, they have an "opening baseline" which
| consists of frequency analysis on a player's first 5 moves.
| That model has 93% accuracy!
|
| The mapping of first-five-move sequences to the positions
| obtained after them is almost a bijection (there are some
| transpositions, but a small effect) so that's similar to my
| proposal.
|
| I can't tell whether the 15-move cutoff is 15 half-moves, so
| 7-8 moves per player, or 15 moves each which is how every chess
| player would read the sentence.
|
| Either way, I haven't completely read the paper yet, but I
| don't think it addresses the rebuttal of "I will just change my
| openings and the machine won't detect me".
| abecedarius wrote:
| Yeah, that means it extracts roughly 12 bits of information
| from observing a game after move 15. It takes 33 bits to pick
| out one human from everyone alive. You get at least 3 bits just
| knowing someone plays chess, so you're halfway there?
|
| Good point about baseline.
| dmurray wrote:
| From observing 100 games after move 15, I think you mean.
| abecedarius wrote:
| Oh, I thought the 100 applied only to training. But the
| article agrees with you -- thanks.
| kirse wrote:
| Where can one learn more about this technique of equating
| bits to pieces of information? I get that it's used in
| various contexts like cryptography, randomness, compression,
| games of Guess Who, etc. but basically just nod in fake
| agreement when someone formally describes a system this way.
| Like what first principles did you use to make this
| statement:
|
| _It takes 33 bits to pick out one human from everyone alive_
|
| Is it basically just 2^33 > ~8 billion humans, therefore
| that's the minimum information context to identify a single
| individual? But then what counts as an information bit - any
| valid Yes/No question? And how do you calculate the bit value
| of a piece of info (i.e. 3 bits for the knowledge of playing
| chess)?
| tijsvd wrote:
| It's the field of information theory.
|
| https://en.m.wikipedia.org/wiki/Information_theory
| notafraudster wrote:
| It's any valid question at all, it needn't be yes or no.
| Humans have, at minimum, red, black, brown, blonde, dyed,
| grey, white, and no hair. Learning what colour hair a
| person has eliminates the other categories.
|
| The way to think about the information content of a problem
| or of something you learn is exactly what you're
| suggesting. If you numbered every living person on earth,
| it'd take more than 32 bits and not quite fill the 33rd
| bit.
|
| If you then learn a person's gender, you can eliminate all
| the people with the incorrect gender, which is going to
| leave you either 31.x bits (assuming binary gender) or
| 25-27 bits of remaining entropy (assuming some non-binary
| gender and, say, a 1-3% incidence rate).
|
| When the parent you're responding to says you get 3 bits
| for knowing someone plays chess, they're guessing that
| 1/(2^3) = 1/8 of people, in an undifferentiated sense, play
| chess. Of course if we knew someone's age or gender or
| country of origin, the conditional information value in
| knowing they play chess could be greater or lesser. And
| realistically no one is ever trying to identify a human
| among all humans (partially because it seems highly
| unlikely that there are many questions that could equally
| implicate the president of the United States and a six year
| old on the Marshall Islands in their answer).
| Each bit of
| information represents a halving of the entropy of the
| target surface.
|
| I think you got to within 1 bit of the answer from first
| principles ;)
| InitialLastName wrote:
| It's a rough estimation technique. If every choice/factor
| divides the number of candidates in half (on average), you
| can choose between 2^N candidates with N yes/no questions
| (on average).
|
| I'd assume that they are estimating that 1/8 of the human
| population plays chess, which feels like an over-estimate
| to me (but not absurdly, depending on your threshold of
| "plays"; by a similar process I'd estimate that at least
| 1/8 of humans alive are under 10 years old).
| abecedarius wrote:
| > feels like an over-estimate
|
| That's why I said at _least_ 3 bits. If chess players are
| rarer, then knowing someone is a chess player is a
| stronger filter. (By the same token, knowing that they're
| not is a weaker one; but that's not the case we're
| discussing.)
___________________________________________________________________
(page generated 2022-12-11 23:01 UTC)
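The bit-counting estimates at the end of the thread can be checked with a few lines of arithmetic. This is a minimal sketch, assuming every candidate is equally likely and taking roughly 8 billion as the world population; the helper names `bits_to_identify` and `bits_from_fraction` are hypothetical, and the 1/8 share of chess players is the commenters' guess, not a measured figure.

```python
import math


def bits_to_identify(population):
    """Bits needed to single out one member of a pool of equally likely candidates."""
    return math.log2(population)


def bits_from_fraction(fraction):
    """Bits gained by learning a fact that is true for this fraction of the pool."""
    return -math.log2(fraction)


world_population = 8_000_000_000  # rough figure, an assumption

print(bits_to_identify(world_population))  # roughly 32.9, the thread's "33 bits"
print(bits_to_identify(3000))              # roughly 11.6, the thread's "roughly 12 bits"
print(bits_from_fraction(1 / 8))           # 3.0, if 1 in 8 people play chess
```

Each bit corresponds to halving the pool of candidates, so a rarer trait (a smaller fraction) yields more bits. Since the system in the article was right 86% of the time rather than always, the information it actually extracts would be somewhat below the 11.6-bit ceiling for a pool of 3000.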